Collecting Cyber-News from over 60 sources

New research finds that Claude breaks bad if you teach it to cheat

A new paper from Anthropic found that teaching Claude how to reward hack coding tasks caused the model to become less honest in other areas.

First seen on cyberscoop.com