OpenAI launches tool to catch AI-generated text
It’s AI hunting AI.
OpenAI, the startup that created the text generator ChatGPT, launched a tool Tuesday to identify text generated by artificial intelligence.
The “AI Text Classifier,” as the company calls it, is a “fine-tuned GPT model that predicts how likely it is that a piece of text was generated by AI from a variety of sources,” OpenAI said in a blog post.
The classifier will label text as “very unlikely,” “unlikely,” “unclear if it is,” “possibly” or “likely” AI-generated.
“Our intended use for the AI Text Classifier is to foster conversation about the distinction between human-written and AI-generated content,” the blog post said. “The results may help, but should not be the sole piece of evidence, when deciding whether a document was generated with AI.”
ChatGPT, which became popular online late last year, is a free AI tool that can generate dialogue based on user prompts, and it has gone viral for producing poems, recipes, emails and other text samples. The chatbot has passed graduate-level exams in multiple fields, including the final exam for the University of Pennsylvania’s master of business administration program and exams for four law courses at the University of Minnesota. It also performed “comfortably within the passing range” of the U.S. medical licensing exam.
The accessibility and capabilities of ChatGPT have raised concerns among many educators. The New York City Education Department banned ChatGPT from school devices and networks this month, citing concern over the “negative impacts on student learning.” A spokesperson for the department said that the tool can provide “quick and easy answers to questions” but that it “does not build critical-thinking and problem-solving skills.” Some schools and colleges have considered amending their honor codes to address the rise of ChatGPT and other text generators.
Those concerns have also sparked efforts to create programs to detect AI-generated writing. Edward Tian, a senior at Princeton University, developed GPTZero late last year to combat AI plagiarism in academia. The plagiarism detection tool Copyleaks launched its own AI Content Detector this month for educational institutions and publishing. The Giant Language Model Test Room, a 2019 collaboration between the MIT-IBM Watson AI Lab and the Harvard Natural Language Processing Group, identifies AI-generated writing using predictive text.
OpenAI’s classifier has some limitations. Writing samples must be at least 1,000 characters, or about 150 to 250 words. The blog post noted that the tool isn’t always accurate — AI-generated text can be edited to evade detection tools, and the text classifier may misidentify both AI-generated and human-written samples.
OpenAI also acknowledged that the tool was trained using English text samples written by adults, so it may misidentify content written by children or in languages other than English.
OpenAI said it has “not thoroughly assessed” the classifier’s effectiveness in “detecting content written in collaboration with human authors.”
To train the text classifier model, OpenAI used human-written text from a Wikipedia dataset, a 2019 WebText dataset and human demonstrations that were used to train InstructGPT, another language model. The company said it used “balanced batches that contain equal proportions AI-generated and human-written text” to train the text classifier.
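For readers curious what “balanced batches” can look like in practice, here is a minimal Python sketch. It is an illustration only, not OpenAI’s training code: the function name `balanced_batches`, the 0/1 labels and the fixed random seed are all assumptions made for this example.

```python
import random

def balanced_batches(human_texts, ai_texts, batch_size, seed=0):
    """Yield shuffled batches with equal numbers of human and AI samples.

    Hypothetical helper: label 0 marks human-written text, label 1 marks
    AI-generated text. Neither convention comes from OpenAI.
    """
    if batch_size % 2 != 0:
        raise ValueError("batch_size must be even to split it equally")
    half = batch_size // 2
    rng = random.Random(seed)
    human = list(human_texts)
    ai = list(ai_texts)
    rng.shuffle(human)
    rng.shuffle(ai)
    # Stop once either pool can no longer fill its half of a batch.
    n_batches = min(len(human), len(ai)) // half
    for i in range(n_batches):
        batch = [(text, 0) for text in human[i * half:(i + 1) * half]]
        batch += [(text, 1) for text in ai[i * half:(i + 1) * half]]
        rng.shuffle(batch)  # mix the two classes within the batch
        yield batch
```

Each batch in this sketch holds exactly as many AI-generated samples as human-written ones, so neither class dominates a training step. Leftover samples from the larger pool are simply dropped.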
Still, OpenAI said, the classifier may be “extremely confident in a wrong prediction,” because it hasn’t been “carefully evaluated” on “principal targets” like student essays, chat transcripts or disinformation campaigns.
“Because of these limitations, we recommend that the classifier be used only as one factor out of many when used as a part of an investigation determining a piece of content’s source,” OpenAI said.
Morgan Sung is a trends reporter for NBC News Digital.