Copy-paste vulnerability hits AI inference frameworks at Meta, Nvidia, and Microsoft

Why this matters for AI infrastructure: The vulnerable inference servers form the backbone of many enterprise-grade AI stacks, processing sensitive prompts, model weights, and customer data. Oligo reported identifying thousands of exposed ZeroMQ sockets on the public internet, some tied to these inference clusters. If exploited, an attacker could execute arbitrary code on GPU clusters, escalate privileges, exfiltrate model or customer data, or install GPU miners, turning an AI infrastructure asset into a liability. SGLang has been adopted by several large enterprises, including xAI, AMD, Nvidia, Intel, LinkedIn, Cursor, Oracle Cloud, and Google Cloud, Lumelsky noted.

Oligo recommends upgrading to patched versions: Meta Llama Stack v0.0.41 or later, Nvidia TensorRT-LLM 0.18.2 or later, vLLM v0.8.0 or later, and Modular Max Server v25.6 or later. It also advised restricting the use of pickle with untrusted data, adding HMAC and TLS authentication to ZeroMQ-based communication, and educating dev teams on the risks.
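To illustrate the pickle-hardening and HMAC advice above, here is a minimal Python sketch of authenticating messages before deserializing them, using only the standard library. The key, function names, and JSON payload format are illustrative assumptions for this example, not code from any of the affected frameworks; real deployments would also need TLS (or CURVE) on the transport and proper key management.

```python
import hmac
import hashlib
import json

# Hypothetical shared secret for illustration only; in practice this
# would come from a secrets manager, never a hard-coded constant.
SHARED_KEY = b"replace-with-a-strong-secret"

def sign_message(payload: dict) -> bytes:
    """Serialize with JSON (not pickle) and prepend an HMAC-SHA256 tag."""
    body = json.dumps(payload, sort_keys=True).encode()
    tag = hmac.new(SHARED_KEY, body, hashlib.sha256).digest()
    return tag + body

def verify_and_load(message: bytes) -> dict:
    """Verify the HMAC tag before parsing; reject anything unauthenticated."""
    tag, body = message[:32], message[32:]
    expected = hmac.new(SHARED_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("HMAC verification failed: untrusted message")
    return json.loads(body)
```

The design point is twofold: JSON cannot trigger code execution on load the way pickle can, and the constant-time HMAC check ensures a socket reachable from the network will only deserialize messages produced by a holder of the shared key.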

First seen on csoonline.com

Jump to article: www.csoonline.com/article/4090061/copy-paste-vulnerability-hit-ai-inference-frameworks-at-meta-nvidia-and-microsoft.html
