Effectiveness of the scanner: Microsoft said the scanner does not require retraining models or prior knowledge of backdoor behavior and operates using forward passes only, avoiding gradient calculations or backpropagation to keep computing costs low.

The company also said it works with most causal, GPT-style language models and can be used across a wide range of deployments.

Analysts say that while the approach improves visibility into language model poisoning, it is an incremental advance rather than a breakthrough, noting that several leading EDR platforms already claim the ability to detect backdoors in open-weight LLMs.

The bigger question is how long such detection advantages will last.

“While this new scanner will help counter real-world attacker techniques currently, adversaries will adapt quickly to outflank this scanner,” said Keith Prabhu, founder and CEO of Confidis. “We are seeing a repeat of the ‘virus’ wars, where hackers kept evolving viruses to evade detection by using innovative techniques like polymorphic viruses.”

That said, the scanner is essential for companies that download open-source models to use or customize in their own systems, according to Varkey.

“For them, AI models become part of the supply chain, just like software libraries,” Varkey said. “The scanner is not a complete solution, but it is an important new layer of protection as AI adoption grows.”
First seen on csoonline.com
Jump to article: www.csoonline.com/article/4127897/microsoft-develops-a-new-scanner-to-detect-hidden-backdoors-in-llms.html

