
I ran a small blind test recently because I was curious how well people can tell AI-generated headshots apart from real photos.
I collected 40 images:
- 20 real headshots (phone + DSLR)
- 20 AI-generated ones (using a personal-identity model I trained on Looktara)
I kept lighting, background, and expressions as close as possible.
312 people took the survey. Here’s what happened:
Results (surprisingly close):
- Overall accuracy: 58% (barely above guessing)
- Real photos were labeled “AI” 41% of the time
- AI photos were labeled “real” 43% of the time
- “Good lighting” made people mark images as real regardless of whether they were
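For anyone who wants to check the arithmetic, here’s how the 58% falls out of the per-class rates (a quick Python sketch using the rounded percentages above, not the raw responses, with the two classes weighted by their 20 images each):

```python
# Quick sanity check of the headline numbers (a sketch, not the raw survey data).
# Uses the rounded percentages from the list above.

n_real_images, n_ai_images = 20, 20

real_called_ai = 0.41          # real photos labelled "AI"
ai_called_real = 0.43          # AI photos labelled "real"

acc_real = 1 - real_called_ai  # share of real photos labelled correctly
acc_ai = 1 - ai_called_real    # share of AI photos labelled correctly

overall = (acc_real * n_real_images + acc_ai * n_ai_images) / (n_real_images + n_ai_images)

print(f"Real photos correct: {acc_real:.0%}")  # 59%
print(f"AI photos correct:   {acc_ai:.0%}")    # 57%
print(f"Overall accuracy:    {overall:.0%}")   # 58%
```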
Most interesting insight:
People weren’t detecting AI; they were detecting production quality.
If a photo looked clean, well-lit, and composed, people assumed it was AI… even when it was a genuine DSLR shot.
Conversely, some AI photos with natural imperfections were marked “real.”
I’m working on a visualisation of the confusion matrix + confidence scores and can post it if there’s interest.
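In the meantime, here’s roughly the shape of plot I have in mind (a minimal matplotlib sketch that uses the rounded percentages above as placeholder cell values rather than the raw counts):

```python
import matplotlib.pyplot as plt
import numpy as np

# Confusion matrix from the rounded rates above (rows = true class, columns = label given).
labels = ["Real", "AI"]
cm = np.array([
    [0.59, 0.41],   # real photos: 59% labelled real, 41% labelled AI
    [0.43, 0.57],   # AI photos:   43% labelled real, 57% labelled AI
])

fig, ax = plt.subplots(figsize=(4, 4))
im = ax.imshow(cm, cmap="Blues", vmin=0, vmax=1)
ax.set_xticks(range(len(labels)), labels)
ax.set_yticks(range(len(labels)), labels)
ax.set_xlabel("Labelled as")
ax.set_ylabel("Actually is")
for i in range(len(labels)):
    for j in range(len(labels)):
        ax.text(j, i, f"{cm[i, j]:.0%}", ha="center", va="center")
fig.colorbar(im, ax=ax, fraction=0.046)
fig.tight_layout()
plt.show()
```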
Posted by Old-Air-5614
![[OC] How AI Headshots Compare to Real Photos: A Blind Test With 300+ Participants](https://www.byteseu.com/wp-content/uploads/2025/12/9ozob9kz2m6g1-1536x761.jpeg)
14 Comments
I’m curious what percentage of people who misidentified the real photos ALSO misidentified the AI photos. Like, are there just a large percentage of people who suck at telling the difference?
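One way to answer that from the raw responses, assuming OP has them in a long format with one row per (respondent, image) judgement; the column names and toy values here are made up for illustration:

```python
import pandas as pd

# Hypothetical long-format responses: one row per (respondent, image) judgement.
# These column names and values are illustrative, not OP's actual data.
responses = pd.DataFrame({
    "respondent_id": [1, 1, 1, 1, 2, 2, 2, 2],
    "true_label":    ["real", "real", "ai", "ai"] * 2,
    "guess":         ["ai", "real", "real", "ai", "real", "real", "ai", "ai"],
})

# Per-respondent accuracy on real images vs. AI images.
per_person = (
    responses
    .assign(correct=lambda d: d["true_label"] == d["guess"])
    .groupby(["respondent_id", "true_label"])["correct"]
    .mean()
    .unstack("true_label")
)

# A strong positive correlation would mean the same people tend to miss both kinds,
# i.e. a chunk of respondents who are just bad at telling the difference.
print(per_person)
print(per_person["real"].corr(per_person["ai"]))
```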
Basically humans are terrible at being human detectors 😂
our brains are using vibes, not details, to decide what’s real.
What was the demographic of “people”?
Tbh I was part of the test and I got destroyed by the AI images. The realistic skin texture is cheating man.
>“Good lighting” made people mark images as real regardless of whether they were
>If a photo looked clean, well-lit, and composed, people assumed it was AI… even when it was a genuine DSLR shot.
Sounds contradictory?
Well, this is disturbing. But not unexpected.
This is dataisbeautiful, so that large figure could have just been a couple of text numbers, i.e. “AI photos 57% correct, real photos 59% correct”.
The vast majority of that figure is wasted. It’s just a very large yellowy-orange rectangle and a very large orange rectangle. You can’t even see the actual numbers; I’m just guessing they’re around 57% and 59%. And yeah, it’s basically the same number.
I wish you could post a photo or two just as examples; I want to see how close these really were!
This tracks with my own experience. People vastly overestimate their ability to sniff out AI-produced material, and I find people cry “AI” based more on whether or not they like the thing than anything else, which tracks with your production-quality finding. “I don’t like this and I don’t like AI, so I think this is AI.”
People love to say they’ve never seen good AI-produced material, but I strongly suspect their assumptions about whether or not something is AI are downstream of their opinion about whether or not it’s good. Like a toupee: you wouldn’t notice it’s a toupee if it’s good.
Hey OP, could you please post the images that had the highest and lowest likelihood of being judged AI, from both the real and AI sets?
Would be helpful to link the pictures.
Also, this chart is screaming for a line at 50%.
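For what it’s worth, the chance baseline is a one-liner in matplotlib (a sketch using the rounded rates from the post, not OP’s actual chart code):

```python
import matplotlib.pyplot as plt

# Rounded per-class accuracies from the post.
categories = ["Real photos", "AI photos"]
accuracy = [0.59, 0.57]

fig, ax = plt.subplots()
ax.bar(categories, accuracy)
ax.axhline(0.5, linestyle="--", color="grey", label="Chance (50%)")  # the missing baseline
ax.set_ylabel("Share labelled correctly")
ax.set_ylim(0, 1)
ax.legend()
plt.show()
```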
Basically seems like people are just overfitting then, right? They’re not detecting actual AI imperfections, but rather, identifying AI-independent features (like whether or not a photo has good lighting) that just happened to be correlated with whether or not a photo is AI-generated, based on the types of photos we tend to see AI producing.
Well maybe they would’ve got better results if they didn’t test with blind people. (/s this is a joke).