Over the course of the spring semester, Hahn noticed a pattern in these false positives: Turnitin’s tool was much more likely to flag international students’ writing as AI-generated. As Hahn was noticing this trend, a group of Stanford computer scientists designed an experiment to better understand the reliability of AI detectors on writing by non-native English speakers. They published a paper last month documenting a clear bias. While they didn’t run their experiment with Turnitin, they found that seven other AI detectors flagged writing by non-native speakers as AI-generated 61 percent of the time. On about 20 percent of papers, that incorrect assessment was unanimous. Meanwhile, the detectors almost never made such mistakes when assessing the writing of native English speakers.
Mathewson, T. G. (2023, August 14). AI Detection Tools Falsely Accuse International Students of Cheating. The Markup. https://themarkup.org/machine-learning/2023/08/14/ai-detection-tools-falsely-accuse-international-students-of-cheating