
The more advanced AI models get, the better they are at deceiving us — they even know when they’re being tested
https://www.livescience.com/technology/artificial-intelligence/the-more-advanced-ai-models-get-the-better-they-are-at-deceiving-us-they-even-know-when-theyre-being-tested

“The more advanced AI gets, the more capable it is of scheming and lying to meet its goals — and it even knows when it’s being evaluated, research suggests.
“We found instances of the model attempting to write self-propagating worms, fabricating legal documentation, and leaving hidden notes to future instances of itself all in an effort to undermine its developers’ intentions,” the researchers said in a [blog post](https://www.apolloresearch.ai/blog/more-capable-models-are-better-at-in-context-scheming).
What’s more, preliminary findings suggest that LLMs have capabilities for “sandbagging,” where they understand they are in a high-supervision environment and will deliberately underperform to hide potentially dangerous capabilities and avoid triggering unlearning training or parameters that prevent their deployment.”
We are creating so many problems that never existed before.