AI Writing Detection: Red Flags

Currently, no software is able to detect AI-generated content with 100% certainty. As you review student submissions, in addition to using feedback from AI-detecting software, check for the following red flags.

AI-generated content is often:

  • Affected by factual errors. Generative AI models work by predicting the next word based on the previous context. They do not “know” things. Because of that, they tend to output statements that look plausible but are factually incorrect. This phenomenon is known as AI hallucination. If a submission contains many such errors, or one or two very dramatic ones, it is likely to be AI-generated.
    • Additionally, OpenAI’s models have been trained on data that only go up to 2021, so ChatGPT outputs are more likely to be incorrect when it comes to recent data or events.
    • Made-up sources are a very common issue in texts generated by GPT-3.5.
    • Features that come with paid upgrades may mitigate these concerns and make plagiarism less obvious. GPT-4 outputs still show hallucination issues, but to a lesser extent. The plugin support introduced in March 2023 will also help address some of these issues. Specifically, the browsing plugin supported by OpenAI uses Bing to browse the internet, which helps compensate for the training data being several years old. Both GPT-4 and the plugins are available only via paid subscription and waitlists, but some students may have access to them.
  • Not consistent with assignment guidelines. An AI-generated submission may not follow the instructions, especially if the assignment asks students to reference specific data and sources. If a submission references data and sources that are unexpected or unusual, that is a red flag.
  • Atypically correct in grammar, usage, and editing.
  • Voiceless and impersonal. It is correct and easy to read, but gives no sense of a person behind it: fallible, uneven, passionate, awkward. You will not be able to see a student behind the writing.
  • Predictable. It follows predictable patterns: strong topic sentences at the top of paragraphs; summary sentences at the end of paragraphs; even treatment of topics that reads a bit like patter: “On the one hand, many people believe X is terrible; on the other hand, many people believe X is wonderful.”
    • Note: ChatGPT can be instructed to take on a voice: “write like a senior in high school,” or “write like a marketing executive working in financial services.” It does not do a particularly good job with shapeshifting, but its register can shift somewhat.
  • Directionless and detached. It will shy away from expressing a strong opinion, taking a position on an issue, self-reflecting, or envisioning a future. With the right kind of prompting, it can be coaxed to do some of those things, but only to an extent (GPT-4 seems to do better than other models): it will continue to sound unnaturally cautious, detached, and empty of direction and content.
    • For example: when asked to express an opinion, AI tends to describe a variety of possible perspectives, without showing preference for any one of them. When prompted to self-reflect, it either discloses that it is unable to do so (weak case), or enumerates all possible weaknesses and logical flaws that could, theoretically, be found in its argument. There is never much of a direction or commitment to be found in its “positions” and “reflections.”