The use of generative AI continues to soar. We’re seeing it in our workplaces, classrooms, and personal lives. As technology advances, it gets harder and harder for the human eye to tell what was generated by AI tools built on large language models (LLMs) and what was written by a person. In circumstances where AI-generated copy is prohibited or discouraged, you may need extra help determining if content is AI-generated. The best AI detectors scan text to find signs of AI use.
AI detectors are controversial, with studies showing that their results are unpredictable at best and biased against non-native English speakers. To better understand these tools and how they work, I talked to Jonathan Gillham, CEO and founder of Originality.ai; Shu Hu, Ph.D., an assistant professor at Purdue University's School of Applied and Creative Computing who has conducted extensive research on AI; and Soheil Feizi, Ph.D., founder and CEO of the startup RELAI, assistant professor of computer science and director of the Reliable AI Lab at the University of Maryland, and a prominent voice in AI detector research.
After running 60 different documents of known origins, including essays, résumés, and cover letters, through eight AI detectors and recording their accuracy, I’ve narrowed the results down to the two best AI detectors. Copyleaks is the best AI detector overall, with excellent accuracy in detecting AI across materials. GPTZero is the best free AI detector, with good accuracy that’s best for quick use if you only have a couple of things to scan. Read my thoughts on each detector, and my key takeaways from trying these testy tools, including why I think you should use them with caution, below.
How do AI detectors work?
AI detectors are models trained on a huge body of AI-generated and human-written text, according to Gillham. By comparing AI and human text, the models learn patterns and signals that indicate whether AI was used in the writing process. The detectors then give a probability, expressed as a percentage, that a text is AI-generated. They're trained on a diverse array of content, including output from numerous AI models like ChatGPT, Claude, and more, and detectors are updated as the LLMs themselves are updated.
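To make that training-and-scoring idea concrete, here's a minimal, hypothetical sketch in Python: a toy classifier learns word-usage patterns from a handful of labeled human and AI examples, then returns a probability for new text. The sample sentences, labels, and model choice are illustrative assumptions, not how Copyleaks, GPTZero, or any other detector in this guide actually works.

```python
# A toy illustration of the idea behind an AI detector: train a classifier
# on labeled human vs. AI text, then report the predicted probability that
# new text is AI-generated. Real detectors use far larger datasets and far
# more sophisticated models; everything below is illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up training set (1 = AI-generated, 0 = human-written)
texts = [
    "In conclusion, it is important to note that effective communication is essential.",
    "Honestly, I forgot my umbrella again and got soaked walking to class.",
    "Furthermore, leveraging these strategies can significantly enhance overall productivity.",
    "My grandma's chili recipe calls for way too much cumin, but we love it anyway.",
]
labels = [1, 0, 1, 0]

# Learn word-usage patterns that separate the two classes
detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(texts, labels)

# Score a new passage; the output is a probability, reported as a percentage
sample = "It is important to note that collaboration fosters innovation."
ai_probability = detector.predict_proba([sample])[0][1]
print(f"Estimated likelihood of AI generation: {ai_probability:.0%}")
```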
AI detectors are popular among teachers, professors, editors, and hiring managers who want to verify the originality of text that is meant to be human-written, such as essays and cover letters. However, AI detectors aren't perfect; results are often inaccurate and can be misleading. False positives are especially common when someone uses AI to refine or polish their own writing, according to Feizi.
How we tested AI detectors
I ran 60 documents through each AI detector to test its accuracy. My testing materials included an academic essay, a first-person essay, a résumé, a cover letter, a proof in LaTeX code, and a poem. Each piece of content included versions written 100% by AI, partially by AI, with assistance from an AI grammar check, and without any AI. I generated or edited the AI content with ChatGPT, DeepAI, and Claude. I then tracked the composition or confidence percentage for each material, analyzing 480 scans.
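For transparency into how those 480 scans could be tallied, here's a hypothetical sketch of the bookkeeping: one row per document-detector pair, then averages grouped by detector and by the document's known origin. The column names are my own, and the sample rows only echo figures mentioned elsewhere in this guide (the AI-edited essay value is approximate).

```python
# Hypothetical bookkeeping for the scan results: one row per scan, then a
# pivot of average reported AI percentage by detector and known origin.
# Column names and rows are illustrative, not the full test data.
import pandas as pd

scans = pd.DataFrame([
    {"detector": "Copyleaks", "material": "résumé", "origin": "100% AI", "reported_ai_pct": 83.3},
    {"detector": "Copyleaks", "material": "résumé", "origin": "human", "reported_ai_pct": 100.0},  # the anomaly noted below
    {"detector": "GPTZero", "material": "cover letter", "origin": "100% AI", "reported_ai_pct": 43.0},
    {"detector": "GPTZero", "material": "first-person essay", "origin": "AI-edited", "reported_ai_pct": 45.0},
])

# Average reported AI percentage for each detector, split by known origin
summary = scans.groupby(["detector", "origin"])["reported_ai_pct"].mean().unstack()
print(summary)
```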
Our top picks for AI detectors
Best overall: Copyleaks – See at Copyleaks
Best free: GPTZero – See at GPTZero
Best overall
Copyleaks AI Detector
Copyleaks is the best AI detector for businesses, editors, and teachers. It performed well in our tests, especially when scanning academic essays, first-person essays, and cover letters.
Out of the eight AI detectors I tested, Copyleaks was the most accurate at distinguishing human-written from AI-generated copy. It excelled at identifying AI content in academic essays, first-person essays, and cover letters, and it outperformed the competition on the résumé, proof, and creative writing scans, categories every other detector significantly struggled with.
While all detectors had trouble determining when a piece of writing was partially written or edited by AI, Copyleaks stood out in knowing when something definitely was or was not generated with AI. All wholly AI-generated materials I ran through it came back near 100%, with the lowest percentage being the AI-generated résumé at 83.3%. On the flip side, all original content came back as 0% AI, aside from one anomaly: the résumé I wrote myself was flagged as 100% AI.
There’s a price you’ll have to pay for that accuracy, though: Copyleaks is the second-most expensive detector we tested, behind only Undetectable AI, which costs $19 a month. The cheapest Copyleaks plan is $13.99 a month if you commit to an annual plan, or $16.99 if you pay month to month. The plans don’t give you unlimited characters; rather, they work off a credit system. The basic plan includes 100 credits a month, and each credit covers 250 words or fewer, which amounts to about 25,000 words a month. This should be plenty for most people, but it could pose a problem for schools or companies scanning a large amount of content.
Unlike with other AI detectors, I was unable to purchase more credits as needed once I spent my allotted 100. Instead, you have to either upgrade to a more expensive subscription or put your credits on auto-refill, meaning you're charged for more credits and your billing date shifts to the time of the auto-refill. I found this process confusing and unclear, and I avoided the auto-refill feature out of fear that I would scan too much and rack up charges on my credit card. If you're planning on scanning more than 25,000 words a month, prepare to spend extra for more credits.
Copyleaks saves your scans, which is handy if you need to go back and check exactly what was flagged as AI. For those concerned about security, saved scans are encrypted. It also identifies specific phrases that AI models use in their writing to help you better understand what is being flagged and why. The basic subscription comes with a plagiarism checker, too.
Overall, I’d recommend Copyleaks for repeated or business use. It is expensive and billed monthly or annually, so it’s not the most accessible for one-time use or quick checks. If you’re looking to try it out, a free trial gives you four credits, equivalent to 1,000 words. There’s also a pro subscription, costing $99.99 monthly, that should take care of the large scan needs of most businesses and teachers. Schools and businesses can take advantage of the enterprise and education plans, both of which have custom pricing.
Best free
GPTZero AI Detector
GPTZero's free version lets you scan up to 10,000 words a month, with a limit of 10,000 characters per scan. It's the most accurate free AI detector we've tried.
Free AI detectors come in handy when you have a short document that you’d like to check for AI text. They’re usually best for one-time use, since most free AI detectors have a character or word limit.
Of all the unpaid versions and trials we tried, GPTZero is the best free AI detector.
The free version of GPTZero allows you to scan up to 10,000 words a month without creating an account. You can only scan up to 10,000 characters at once, though, so you might have to break up larger pieces of text. Opting for a paid plan gets you between 150,000 and 500,000 words a month, plus extra features like plagiarism checks and citation generation.
Generally, free AI detectors are less accurate than their paid counterparts (GPTZero, for example, is less accurate than Copyleaks), but GPTZero succeeded in identifying whether a text was fully AI-written or human-written. For the most part, GPTZero scored fully AI-generated content in the 70-100% range, while human-written text registered between 0% and 5% AI. Using GPTZero can give you a good idea of whether the writer ran a prompt through ChatGPT or wrote the text themselves.
It also registered middling levels for content edited by AI, which is exactly what I was looking for in my tests. Unlike other detectors that reported high levels for AI editing, GPTZero returned results in the 40-50% range and excelled at tracking down AI edits in the first-person essay.
GPTZero isn't perfect, though: it really struggled with detecting AI content in résumés and cover letters. It flagged a ChatGPT-generated cover letter as only 43% AI, and it also did poorly at detecting partial AI writing or edits in either category. For this reason, I wouldn't recommend it to hiring teams looking to parse out AI-generated cover letters and résumés.
Overall, I recommend GPTZero to people looking for quick, one-time use. As always, be sure to take any results — especially with résumés and cover letters — with a grain of salt.
Our takeaways
No AI detector is perfect
After testing and researching AI detectors, it's abundantly clear that no AI detector is perfect. None that I tested fulfilled the Platonic ideal of an AI detector: a flawless product that correctly measures the amount of AI every time. Instead, every detector struggled in some category, at some point. When using an AI detector, you will inevitably get some false positives and false negatives.
Because of their occasional inaccuracy, results from AI detectors should never be taken alone to make serious decisions about academic honesty or a person’s job. “Use them as evidence, not verdicts, especially in high-stakes contexts,” Gillham recommends. “We explicitly advise against using a single score for disciplinary action without context, such as draft history or prior writing samples.”
Feizi recommends avoiding the use of AI detectors for three main reasons. First, the results are rarely accurate, and false positives are common. Second, he encourages professors, teachers, and editors to accept the use of AI in writing. Lastly, his 2023 study found that the humanizers (also called paraphrasers) used to bypass AI detectors are becoming stronger, reducing AI detector accuracy.
Each AI detector has different strengths and weaknesses
I found that all AI detectors struggled with code, math, and creative writing, like poetry. Additionally, AI detectors are not made to find AI in text shorter than 250 words. Otherwise, all the detectors I tested had different strengths and weaknesses. I’ve summarized them below:
- QuillBot: QuillBot had accurate results when scanning cover letters and résumés, but struggled with everything else, particularly ChatGPT; it found that 100% ChatGPT-generated content contained no AI.
- Originality.ai: Originality.ai is good at separating refined from generated content, giving the user a breakdown of whether the writing appears to have been completely written or merely edited by AI. It also gives its results as confidence percentages rather than composition percentages. It struggled with accuracy when scanning cover letters and résumés.
- Winston AI: Winston AI defaulted to a very high AI percentage, regardless of what it scanned. Most results were in the high 90% range, whether a text was written 100% by AI or merely edited by AI. That said, it accurately identified AI-generated and human-written résumés.
- Grammarly: Grammarly's AI detector is free and built into the extension, so it's very user-friendly. It also helps you cite AI if it notices AI-generated content in its scan. Despite that, it was one of the least accurate AI detectors I used.
- ZeroGPT: ZeroGPT was middlingly accurate on everything but cover letters, where it excelled at correctly identifying human and AI text.
- Undetectable AI: Undetectable AI defaults to a low percentage, with many results reading at 1%. It also checks through multiple detectors to determine the final percentage, but the accuracy is mixed.
Text refined by AI usually comes back with high AI scores
I noticed that text refined by AI (meaning it was checked for grammar errors or had its vocabulary enhanced) usually registers as 85% or more AI. I'd keep this in mind when reviewing results, and definitely be wary if you're using AI to edit or enhance your work after you've written it. Feizi said this is because AI detectors are usually looking for signatures that indicate AI has been used: think em-dashes, awkward phrasing, and overstructured sentences. If an LLM has inserted any of these AI hallmarks into your writing, it's likely to be flagged by a detector.
AI completely fabricates information, especially personal details
Regardless of the AI model I used to generate content, I found that it inserted fabricated information into my content. In my personal essay, I wrote about playing Pokémon TCG, and the version ChatGPT wrote said I learned all I know from a player named Marcus. Lovely, but completely untrue. Similarly, a Claude-written cover letter indicated I was located in New York, which I read from my home in Indiana.
Since AI-generated content fabricates information, an excellent way to check if a writer used AI is to ask follow-up questions. For example, if someone had asked me how Marcus taught me how to play, I wouldn’t have been able to answer, because those were not my words nor my experience.
ChatGPT is the least detectable AI
Interestingly, ChatGPT writing is the best at flying under the radar. It was the least likely model to be identified by an AI detector, with many actually indicating that a 100% ChatGPT-generated text contained zero AI. Claude was the second least detectable, while DeepAI was easily detected, even by the less accurate detectors.
Always disclose that you’re using an AI detector and encourage transparency
Above all, experts we spoke to stressed being transparent when you’re using AI detectors to vet someone’s work. AI detection shouldn’t be a secret to “catch” someone in the act of using AI. Instead, set a clear AI policy and expectations so writers know what to expect. Similarly, encourage writers to let you know when they’ve used AI and how they’ve used it for utmost transparency.
“Detectors sit in a broader integrity stack which starts with a clear policy on AI use for a writer,” Gillham said. Just like writers should be clear when they’ve used AI, you should be clear about the tools you’re using, too.
Feizi has a different recommendation: encourage writers to use AI to generate ideas or polish existing work, then judge a piece based on its quality, not on how much AI contributed to it.
What we looked for
- Accuracy: The most important evaluation point, by far, is accuracy. Most hesitancy around AI detectors stems from the fact that they're not always accurate. I don't expect our best overall pick to be right 100% of the time, but I do expect it to be right most of the time. Not only does it need to catch text wholly generated by AI, but it should also flag text that was partially generated by AI or written with AI assistance.
- Features: Some AI detectors come with extra features, like plagiarism detection or grammar and writing tools. I considered these extras when making my final picks.
- Ease of use: The AI detector should be intuitive, and uploading text should be easy and clear. This is a subjective evaluation point based on my experience with each tool. I also looked for design choices that make things easier, like bulk uploads or sentence-level confidence scores, and researched each AI detector to understand how it works.
- Value: This technology should be available to all, whether you're willing to pay or need a free resource. We also like to see free trials when considering value, along with how users are charged. Some AI detectors charge a monthly fee, whereas others charge by the word. I read through the terms and conditions to check for any catches. For detectors that charge by the word, I noted how much I could scan with the credits provided.
AI detector FAQs
Do AI detectors work?
It’s a complicated question, and the answer is: kind of. It depends on the content you’re scanning, the number of words, and the style of writing.
“If the detector is trained on samples from generator A, its accuracy on generator A’s outputs is very high, often exceeding 90%,” Hu said. “However, if the detector has not seen generator A’s samples during training, its accuracy drops significantly and may approach random guessing.” An AI detector’s accuracy is entirely dependent on the kind of content it was trained on, making these tools hit or miss when it comes to accuracy. “Detectors trained on domain-specific datasets, such as academic articles, may struggle when applied to other domains like social media posts or news articles,” Hu said.
Still, Hu recommends using AI detectors, so long as you take their results with a grain of salt and treat them as one data point in a broader context. Feizi, by contrast, does not encourage the use of AI detectors at all.
Why are AI detectors flagging my writing?
If you’re getting a false positive from AI detectors, it could be attributed to a couple of factors. “Detectors look for statistical patterns — that could be highly predictable phrasing, overused AI words and other model-like signals. Human work can trigger those signals, especially when the text is oddly written or formatted,” Gillham said. “Also, often people think they didn’t use AI, but they did use it for research, outlining, and editing, which is typically sufficient for an AI detector to identify the text as likely AI.” In other words, your work could be flagged as AI because of your writing style, phrasing, or use of AI in the writing process.
Interestingly, a 2023 Stanford study also found that AI detectors are biased against non-native English speakers because their writing tends to rely on a more limited vocabulary. If English isn’t your first language, AI detectors could be picking up on your writing style as potentially AI.
To reduce false positives, Gillham recommends scanning more than 100 words, avoiding strange formatting, limiting paraphrasing and AI edits, and submitting, for comparison, a draft from before any tools (like Grammarly or Editor in Microsoft Word or Google Docs) were used.
Can AI writing be undetectable?
According to Gillham, AI humanizers can reduce the percentage of AI found by an AI detector, but these tools also come with consequences. “Paraphrasers and ‘humanizers’ can sometimes reduce a detector’s accuracy, but they often degrade quality and robust detectors trained on adversarial attacks still catch many cases,” he said. “We continuously test tools that claim to bypass detection; results are mixed and typically come with clear trade-offs in clarity and correctness. The safest way to be ‘undetectable’ is simple: write it yourself.”