Benchmarks Show Matched Capability, Brittle Reasoning. Two artificial intelligence models from competing labs, OpenAI’s GPT-5.5 and Anthropic’s Mythos Preview, now deliver near-identical offensive cyber performance. Both also show consistent reasoning failures that the cyber scores alone do not capture.
First seen on govinfosecurity.com
Jump to article: www.govinfosecurity.com/gpt-55-mythos-reach-hacking-parity-but-reasoning-falters-a-31594

