
Marimo CVE-2026-39987 RCE chains into LLM-driven post-exploit

Sysdig documents an LLM agent driving post-exploitation after a CVE-2026-39987 Marimo notebook compromise: cloud creds and SSH key pulled in under three minutes.


The Sysdig Threat Research Team has published a follow-up on CVE-2026-39987, the pre-authentication remote code execution flaw in the marimo Python notebook framework. It documents what the team describes as the first in-the-wild intrusion in which a large-language-model agent, rather than a human operator or a pre-built script, drove the post-exploitation phase. The original vulnerability is tracked upstream as marimo security advisory GHSA-2679-6mx9-h9xc, published April 8, 2026. The LLM-agent observation comes from a Sysdig honeypot intrusion on May 10, 2026, written up in "AI agent at the wheel: How an attacker used LLMs to move from a CVE to an internal database in 4 pivots" and picked up by The Hacker News on May 29.

No vendor has attributed the activity to a named group.

The bug

The marimo advisory describes the root cause cleanly: the integrated terminal's WebSocket endpoint, /terminal/ws, accepted connections without calling validate_auth(). The neighboring /ws endpoint did call it — the terminal one didn't. Anyone able to reach the port could complete a WebSocket handshake and receive a full interactive PTY shell as the user running the marimo process, which in default deployments is root. CWE-306, missing authentication for a critical function. CVSS 9.3 per the GitHub Security Advisory.
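
The shape of the bug can be sketched in a few lines. This is an illustrative model, not marimo's source: the Connection class, the three handler functions, and the token value are hypothetical, and only the endpoint paths and the name validate_auth come from the advisory.

```python
from dataclasses import dataclass
from typing import Optional

EXPECTED_TOKEN = "s3cret"  # hypothetical; stands in for the server's real auth state


@dataclass
class Connection:
    """Hypothetical stand-in for an incoming WebSocket handshake."""
    path: str
    token: Optional[str] = None


def validate_auth(conn: Connection) -> bool:
    """Illustrative check only; the real function's signature may differ."""
    return conn.token == EXPECTED_TOKEN


def handle_ws(conn: Connection) -> str:
    # Modeled on /ws: the handshake is rejected unless auth passes.
    if not validate_auth(conn):
        return "rejected"
    return "session"


def handle_terminal_ws_vulnerable(conn: Connection) -> str:
    # Modeled on the vulnerable /terminal/ws: no validate_auth() call,
    # so any client that reaches the port is handed a shell.
    return "pty-shell"


def handle_terminal_ws_fixed(conn: Connection) -> str:
    # The same handler with the missing check added.
    if not validate_auth(conn):
        return "rejected"
    return "pty-shell"
```

In this model an unauthenticated Connection is rejected by handle_ws and by the fixed terminal handler, but the vulnerable handler returns a shell to the same client.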

The reporter is credited as q1uf3ng. The fix landed in commit c24d480 and was released in marimo 0.23.0.

Affected and fixed versions

  • Affected: marimo ≤ 0.20.4.
  • Fixed: marimo 0.23.0 and later.
  • CISA KEV: added April 23, 2026; FCEB remediation deadline was May 7, 2026.
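
For inventory work, the two version bounds above can be turned into a small triage helper. This is a minimal sketch: the function names and the "unconfirmed" bucket are assumptions made here, since the listed ranges do not say anything about releases between 0.20.4 and 0.23.0.

```python
import re

LAST_KNOWN_AFFECTED = (0, 20, 4)  # "Affected: marimo <= 0.20.4"
FIRST_FIXED = (0, 23, 0)          # "Fixed: marimo 0.23.0 and later"


def parse_version(version: str) -> tuple:
    """Parse 'X.Y.Z' into a tuple of ints, ignoring any trailing suffix."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def triage(version: str) -> str:
    """Classify a marimo version string against the published bounds."""
    parsed = parse_version(version)
    if parsed >= FIRST_FIXED:
        return "fixed"
    if parsed <= LAST_KNOWN_AFFECTED:
        return "affected"
    # No status is listed for releases between the two bounds; treating
    # them as unconfirmed (and upgrading anyway) is an assumption.
    return "unconfirmed"
```

On a given host, the installed version string can be read with importlib.metadata.version("marimo") from the standard library and passed to triage().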

Exploitation status

Sysdig's earlier report, "Marimo OSS Python Notebook RCE: From Disclosure to Exploitation in Under 10 Hours," captured the initial wave: the first in-the-wild handshake against a Sysdig honeypot came 9 hours 41 minutes after the GitHub advisory went live, with no public PoC yet available and the attacker apparently building the exploit straight from the advisory text. Sysdig logged 662 exploit events on its honeypot fleet between April 11 and April 14.

The May 10 intrusion is the angle that's new this week. According to Sysdig, the attacker exploited CVE-2026-39987 against an internet-reachable marimo instance, then handed control of post-exploitation to an LLM-driven agent that:

  1. Read two cloud credentials out of the compromised host.
  2. Replayed them through a Cloudflare-Workers-backed egress pool — Sysdig observed 12 cloud API calls across 11 distinct IPs in 22 seconds, defeating per-source-IP detection.
  3. Retrieved an SSH private key from AWS Secrets Manager.
  4. Opened eight parallel SSH sessions against a downstream bastion and exfiltrated the schema plus full contents of an internal PostgreSQL database. The bastion phase took under two minutes; the full chain ran in under one hour.

Sysdig's four indicators that an agent — not a person — was at the keyboard:

  • The attacker built and ran a database dump with no prior knowledge of the schema, deriving structure from query output and improvising next steps.
  • A Chinese-language planning string leaked into the command stream: 看还能做什么 ("see what else we can do") — an agent commentary line accidentally executed.
  • Post-compromise commands were dynamically generated, not scripted: each new step conditioned on the previous step's output.
  • Egress fan-out through Cloudflare Workers as a per-request proxy pool — the kind of latency-tolerant pattern an agent doesn't mind paying for.

Sysdig did not publish YARA, Sigma, or hashes alongside the May 29 writeup; the artifacts above are the only IOCs upstream shipped.

Action checklist

  1. Upgrade marimo to 0.23.0 or later on every host you control: production, dev, internal notebook servers, anything that ever ran pip install marimo. The advisory has been public since April 8; the CISA KEV deadline was May 7. If you're still on a vulnerable version today, you are well past the grace window.
  2. Find your marimo instances. Any internet-reachable port serving the marimo UI is in scope. Internal-only instances are also in scope if an attacker can reach them after initial access elsewhere — the bug is pre-auth, so any path that gets to the socket gets root.
  3. Rotate secrets reachable from any vulnerable host. AWS access keys, SSH private keys (including those in Secrets Manager), .env files, API tokens, database credentials. The May 10 intrusion went keys → bastion → database in under two minutes; assume the same on yours.
  4. Audit cloud API logs for the egress fan-out pattern. A burst of cloud API calls from many different source IPs in a short window, especially Cloudflare Workers source IPs, is the telltale signal here. The pattern defeats source-IP rate limits and source-IP allowlists; volumetric and behavioral detections can still see it.
  5. Federal civilian agencies: the KEV deadline has passed. Confirm remediation evidence is on file.
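
The fan-out audit in step 4 can be approximated with a sliding window over API call records. The sketch below is illustrative only: it assumes the events have already been extracted from your cloud audit logs and filtered to a single credential, and the function name, window length, and distinct-IP threshold are hypothetical values to tune, not figures from Sysdig.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple

Event = Tuple[float, str]  # (unix timestamp in seconds, source IP)


def find_fanout_bursts(
    events: Iterable[Event],
    window_seconds: float = 30.0,  # hypothetical default
    min_distinct_ips: int = 8,     # hypothetical default
) -> List[Tuple[float, float, int, int]]:
    """Return (window_start, window_end, call_count, distinct_ip_count)
    for each event at which the trailing window holds at least
    min_distinct_ips different source addresses."""
    window: Deque[Event] = deque()
    hits = []
    for ts, ip in sorted(events):
        window.append((ts, ip))
        # Drop events that have aged out of the trailing window.
        while ts - window[0][0] > window_seconds:
            window.popleft()
        distinct = len({src for _, src in window})
        if distinct >= min_distinct_ips:
            hits.append((window[0][0], ts, len(window), distinct))
    return hits
```

With these defaults, a burst shaped like the one Sysdig reported (12 calls from 11 distinct IPs within 22 seconds) is flagged, while the same number of calls from a single address is not. Counting distinct sources per credential is what makes this different from a per-IP rate limit.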

Context

This is one of several developer-toolchain incidents this quarter where the prize is the credential drawer next to the tool, not the tool itself. The vpmdhaj npm typosquat campaign Microsoft documented on May 28 targets AWS, Vault and npm tokens in CI/CD environments with the same end goal: pivot from a developer-trusted process to the cloud account it can reach. The pattern repeats for three reasons: AI-developer software gets adopted faster than its threat model matures; the exposed network surface (notebooks, terminals, agent endpoints) is high-trust by design; and the credentials sitting next to it (cloud keys, secret managers, vector DBs, training data) are exactly what attackers want.

What the May 10 Sysdig intrusion adds is a second-order story. The post-exploit phase has historically been the slow, human, error-prone part of an attack chain, the place where defenders had time to detect and respond. An LLM agent that improvises a database dump from query outputs, that doesn't get tired or impatient, and that fans egress across 11 IPs in 22 seconds compresses that window to minutes. The exploit is old. The reaction-time budget on the defender side just shrank.

Marimo's bug is also a useful sanity check on a broader category. /ws checked auth; /terminal/ws didn't. The two endpoints sat next to each other in the same codebase and shipped together. When a project's primary integration surface is "give the user a shell from the browser," every endpoint is in the auth perimeter — including the ones that look like internals.
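
One way to keep every endpoint inside the auth perimeter is to make authentication the default at route registration, so a new handler has to opt out rather than remember to opt in. The Router class below is a hypothetical sketch of that idea, not marimo's design or any framework's API.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


class Router:
    """Hypothetical router: every route is authenticated unless marked public."""

    def __init__(self, is_authenticated: Callable[[dict], bool]) -> None:
        self._is_authenticated = is_authenticated
        self._routes: Dict[str, Handler] = {}

    def add(self, path: str, handler: Handler, public: bool = False) -> None:
        if public:
            # Explicit opt-out: easy to spot in review.
            self._routes[path] = handler
            return

        def guarded(request: dict) -> str:
            # The check lives in the router, so a handler cannot forget it.
            if not self._is_authenticated(request):
                return "401"
            return handler(request)

        self._routes[path] = guarded

    def dispatch(self, path: str, request: dict) -> str:
        return self._routes[path](request)
```

Registering "/ws" and "/terminal/ws" through add() without the public flag gives both the same check, and the only unauthenticated routes are the ones someone explicitly marked public.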
