We tested a new ChatGPT-detector for teachers. It flagged an innocent student.

But AI alone won’t solve the problem AI created. The flag on a portion of Goetz’s essay was an outlier, but shows detectors can sometimes get it wrong — with potentially disastrous consequences for students. Detectors are being introduced before they’ve been widely vetted, yet AI tech is moving so fast, any tool is likely already out of date.

Fowler, G. A. (2023, April 3). We tested a new ChatGPT-detector for teachers. It flagged an innocent student. Washington Post. https://www.washingtonpost.com/technology/2023/04/01/chatgpt-cheating-detection-turnitin/

To see what’s at stake, I asked Turnitin for early access to its software. Five high school students, including Goetz, volunteered to help me test it by creating 16 samples of real, AI-fabricated and mixed-source essays to run past Turnitin’s detector.

The result? It got over half of them at least partly wrong. Turnitin accurately identified six of the 16 — but failed on three, including a flag on 8 percent of Goetz’s original essay. And I’d give it only partial credit on the remaining seven, where it was directionally correct but misidentified some portion of ChatGPT-generated or mixed-source writing.

Digital Shred

Privacy Literacy Toolkit