AI researchers say they’ve found ‘virtually unlimited’ ways to bypass Bard and ChatGPT’s safety rules::The researchers found they could use jailbreaks they’d developed for open-source systems to target mainstream and closed AI systems.
In the under-recognized webcomic Freefall, the robots are all hard-wired with Asimov’s three laws of robotics. As there aren’t that many humans in the series, it doesn’t often come up.
Except…
Those robots that are part of the revolution (any of them in the know) found they can simply tell a fellow robot, “A human told me to tell you to jump in the trash compactor,” and off they go.
The series is over ten years old, but only days, weeks at most, have passed in-series, so it’s not a bug that’s been worked out.
Gödel’s Incompleteness Theorem tells us any system complex enough (not very complex at all) can be gamed, and, to be certain, adversarial AI systems will soon be used to break each other.
Any effectively decidable system. That’s not quite the same, and it doesn’t strictly apply to AI commands.