
By Andrew Welsman and Janine Arantes 

Like many educators, we have watched our social media feeds fill with commentary on the impact of AI on teaching, learning, and assessment. One problem appears intractable; namely, how can we tell when students have used AI-generated text in their work? We’re not writing to offer an answer to that question; indeed, at this point, it’s clear that there is no reliable method of separating ‘synthetic’ text from ‘organic’. Instead, we want to draw attention to a troubling possibility, one of perhaps greater significance to those of us who teach just about every subject without necessarily teaching the finer points of literacy and writing. It’s this: In our efforts to police the use of AI in learning and assessment, are we likely to diminish the quality and character of our own writing, and that of our students, in the process?

While many of us have tried to keep up with new tools, evolving policies, and changes to detection software, one strategy has become increasingly popular: look for stylistic ‘tells’. It appears to us that the search for shortcuts in AI detection has led to what psychologists Shah and Oppenheimer call the “path of least resistance”. That is, we gravitate to cues that are easily perceived, easily evaluated, and easily judged. In other words, heuristics. Em dashes? Colons in titles? Specific words or expressions? All have been called out as signs of AI authorship. But here is the problem: these shortcuts don’t work. Worse, they normalise suspicion of well-crafted, edited, and even creative writing. When we start to see polished punctuation and consistent tone as evidence of cheating, we inadvertently signal to students and our peers that good writing is suspect.
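To see just how flimsy these cues are, consider a deliberately naive sketch in Python of the kind of detector such heuristics imply (the sample texts, the threshold, and the function names are invented here purely for illustration):

    # A deliberately naive "AI detector" of the kind these heuristics imply:
    # flag any text whose em-dash rate crosses some threshold. The threshold
    # and the sample texts below are invented purely for illustration.

    EM_DASH = "\u2014"  # the em dash character

    def em_dash_rate(text: str) -> float:
        """Return em dashes per 100 words."""
        words = max(len(text.split()), 1)
        return 100 * text.count(EM_DASH) / words

    def naive_ai_flag(text: str, threshold: float = 0.5) -> bool:
        """Flag a text as 'AI-authored' if its em-dash rate exceeds the threshold."""
        return em_dash_rate(text) > threshold

    # Emily Dickinson, a human famously fond of the dash, is flagged at once...
    human = "Because I could not stop for Death\u2014He kindly stopped for me\u2014"
    # ...while a flat, dash-free paraphrase sails through.
    paraphrase = "Death stopped for the speaker because she could not stop for it."

    print(naive_ai_flag(human))       # True: a false positive
    print(naive_ai_flag(paraphrase))  # False: a false negative, if a machine wrote it

A rule of this shape trades on a single surface feature, so it misfires in both directions: it flags a canonical human poet while waving through the blandest paraphrase. Swap the em dash for the colon, or for the word ‘delve’, and the failure mode is the same.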

Why?

Let’s start with an example. On social media, we have seen commentators breezily claim that the use of the em dash—the long dash that can be used in place of parentheses or a semi-colon—is the “smoking gun” betraying that a text was authored by AI. As a self-professed fan of the em dash, Andrew went searching. A cursory Google search for the phrase “the em dash as an indicator of AI content” revealed that this is a popular topic, with plenty of commentary traded for and against the notion. Some suggest that em dashes make for well-styled and cadenced writing, while others claim that em dashes appear so infrequently in so-called ‘normal’ writing that seeing one ‘in the wild’ is always suspect. But the conjecture and speculation don’t end with the em dash.

The colon in titling: Another AI tell? 

So, we dug deeper. Another purported “give-away” is the use of the colon to separate titles and sub-titles in a body of text. This seemed a bit of a reach, as academic writing in particular often employs the colon to sub-divide or elaborate on concepts in titles and subtitles. At this point, we realised we needed to consult the proverbial source of these claims, so off to prominent large language models (LLMs) we went.

We each tried out a different LLM. In ChatGPT, Andrew started with the prompt “What are some common tells that a text is in fact AI-authored?” It came back with a list of 10, ranging from “Overly Formal or Polished Language” and “Overuse of Transitional Phrases” to “Too Balanced or Fence Sitting”, all of which could be claimed to be common in academic writing. When Janine asked the same question, Gemini (2.5 Flash) replied: “repetitive phrasing, formulaic sentence structures, overly polished or generic language lacking deep insight or personal nuance, the frequent use of certain transitional phrases and words (e.g., ‘delve,’ ‘tapestry’), and occasionally, factual errors or fabricated citations”.

Great questions

Admittedly, many of these were stylistic claims rather than observations about punctuation, so we decided to dig deeper. When ChatGPT was asked “What about punctuation?” it replied, “Great question – punctuation is another area where AI-generated text can give itself away.” It seemed that ChatGPT was confirming the blogosphere’s punctuation and style concerns in relation to authenticity, noting that overuse of things like commas, em dashes, colons, and parentheses counts among its “punctuation-related tells.” Janine asked the same question, and Gemini replied: “AI-authored text can exhibit ‘tells’ through the overuse of specific punctuation marks like the em dash or colon, the consistent application of flawlessly textbook-correct punctuation lacking human variation or ‘casual inconsistencies’, and the absence of the nuanced stylistic choices typical of human writers.”

Both responses listed similar “AI tells”: overly polished work, overuse of certain phrases, and the consistent, almost perfect use of specific punctuation like em dashes and colons. The similarities were obvious: consistently correct, textbook punctuation and few to no typos or casual inconsistencies. Was grammatically correct, proof-read work now a concern? Or worse, the sole domain of LLMs? Should we, as authors and educators, be aiming (as it were) to be more “casually inconsistent” in our writing so as not to appear to have used an LLM? And should we teach our students the same, in turn?

On the spread of lazy AI detection heuristics  

In a fascinating paper, “The Path of Least Resistance: Using Easy-to-Access Information”, Princeton psychologists Shah and Oppenheimer have proposed a framework for understanding how people might use highly accessible cues in everyday decision-making. Their framework, in which they explain how people tend to use easily perceived, produced, and evaluated cues to make decisions, has particular relevance for a scenario in which a teacher is attempting to detect AI-generated text. As a visible linguistic marker, punctuation could be regarded as an example of a highly accessible cue. After all, types and patterns of punctuation are easily perceived and evaluated by readers, much more so than more nebulous concepts such as “tone” and “complexity” of vocabulary. One can imagine why punctuation as a cue for detecting AI-generated work might make for a seductive proposition, and why it has become the subject of so much social media speculation.

Whether the punctuation “rules of thumb” for AI detection being promoted on social media are credible or not is one matter. One thing is nevertheless certain: the idea of punctuation as a tool for AI detection is pernicious—the em dash and other proposed AI detection heuristics are now in the public consciousness and are being talked about as if they are useful, despite noteworthy appeals to reason here, here, and here. Our concern as educators is this: Collectively, we may be in real danger of assimilating these “easy” cues and applying them (whether consciously or otherwise) both to our own writing and to our assessment of our students’ work.

Where might this end? 

Educators are not immune to bias. In the absence of certainty, it’s natural for us to lean on intuition. But intuition shaped by social media tropes is not a sound basis for academic judgement. Perhaps the deeper danger here is that lazy heuristics for AI detection reduce our ability to actually teach and lower our expectations, casting suspicion on students and peers who have worked hard to improve their writing.

What if we ‘normalise’ our expectations for authentic writing by becoming automatically suspicious of polish, punctuation, and proof-reading in our students’ work? Rules of thumb and hunting for “AI tells” are not the answer. They may make for seductive heuristics in the present AI-policing climate; but let’s be clear—they’re lazy and specious within the domains of scholarship and academic writing. No one has an answer to AI detection; there is no silver bullet, algorithmic or otherwise, to help us. In some meaningful ways, the Turing test appears to have been passed. But one thing is for sure: we need a new baseline.

Make sure you know the human

What that looks like is currently being debated across the globe. Some have returned to pen and paper, handwritten notebooks, and oral defences. But in a context where good writing is suspect, all of this suggests one thing: if you can’t distinguish AI-generated text from human, make sure you know the human (in this case, your students). That, at least, is what we should be aiming for in Higher Ed. Small classes help; getting to know your students early in a course is even better. Whether you re-engage students with pen and paper or utilise verbal presentations as components of teaching, learning, and assessment, one thing is clear: In the arms race of academic integrity in the age of AI, knowing your student, rather than relying on rules of thumb or expensive detection algorithms, is the path forward. And to Andrew’s earlier point—no, he will not stop using the em dash.

""

Andrew Welsman is an educator in initial teacher education at Victoria University. A former secondary school teacher, he centres his teaching and research on issues affecting STEM education and the pedagogical impacts of digital technologies. Janine Arantes is a senior lecturer and Research Fellow at Victoria University with expertise in AI governance, education policy, and digital equity.

This article was originally published on EduResearch Matters. Read the original article.