Skip to content

Containment Playbook: npm-to-Signing-Channel Compromise

An npm install worm harvests developer credentials and pivots into corporate repos. When those repos hold signing material, the compromise reaches the signing channel.

When this playbook applies

All four must hold:

  1. You ship signed binary clients notarized through Apple, Microsoft, or equivalent.
  2. Code-signing keys are reachable from corporate source repos.
  3. Engineers run npm install against the public registry from machines that reach those repos.
  4. You have endpoint telemetry to identify which machines installed a version.

If any fails, scope down. SaaS-only teams skip certificate rotation. Teams on trustedDependencies allowlists (npm 10.3+) or a proxied private registry run a shorter version.

This is the consumer-side vector. The publisher-side compromise that hit TanStack itself is out of scope: GitHub Actions cache poisoning plus OIDC token memory extraction from a CI runner (TanStack postmortem).

The attack chain

graph LR
    A[npm install] --> B[Postinstall script runs]
    B --> C[Credentials harvested]
    C --> D[Internal repo access]
    D --> E[Signing material reached]
    E --> F[Distribution channel compromised]

A malicious postinstall script runs with the user's full privileges before runtime controls activate. EDR is tuned for file-hash signatures and mass-encryption behavior, so it misses a postinstall script inside a legitimate package manager (SC Media, Aikido).

The Mini Shai-Hulud worm — behind 170+ npm packages in May 2026, including 84 versions across 42 @tanstack/* packages — harvests GitHub, npm, Actions, and cloud credentials, runs TruffleHog over the filesystem, and exfiltrates an AES-256-GCM bundle to a public GitHub repo (Datadog, Orca).

The playbook

Run the steps in order. Each one has a hard exit criterion.

1. Isolate impacted endpoints

Identify every machine that installed an affected version in the breach window and pull it off the network — blast-radius containment starts here. Suspend SSO sessions and refresh tokens. Do not wipe — preserve package cache and shell history for forensics.

Exit: every confirmed host isolated; impacted users signed out of IdP, GitHub, npm, and cloud.

2. Rotate credentials by blast radius

Rotate from what the worm targets outward: GitHub PATs and SSH keys, npm tokens, Actions secrets, cloud credentials. Any secret in env vars, the SDK cache, or on disk is assumed compromised (Datadog).

Revoke only after the host is isolated and imaged. The Mini Shai-Hulud payload installs a gh-token-monitor daemon (macOS LaunchAgent or Linux systemd) that polls every 60 seconds and runs rm -rf ~/ when a revoked GitHub token returns a 40X. Revoking before you remove the daemon and image the host can destroy the machine (OPSWAT, Wiz). Sequence it after step 1, not in parallel.

Exit: every credential reachable from an impacted host rotated and revoked at the issuer — after the gh-token-monitor daemon is confirmed removed.

3. Freeze deploys

Halt automated deploys from any pipeline that touched an impacted credential — narrowing the credential blast radius before a rotation racing a deploy can ship a stolen-key binary that notarizes before the cert is revoked.

Exit: deploy workflows disabled; manual deploys gated through a small reviewer pool.

4. Re-sign and ship new builds

Issue new code-signing certificates, re-sign every shipping product, and test the update channel first.

Exit: new builds live via auto-update on every affected product and platform.

5. Coordinate notarization revocation

For macOS, coordinate with Apple to block notarization of the impacted material; fraudulent apps then lack notarization and are blocked by default (OpenAI). Equivalents apply for SmartScreen, iOS, and Play Integrity.

Exit: revocation confirmed by the provider; new signing material registered.

6. Ship the forcing-function client update

Announce a certificate-revocation deadline that forces every user to update. OpenAI gave users until June 12, 2026 after announcing on May 13 — a ~30-day window. That is short enough to close the breach, yet long enough for auto-update to reach users before launches fail (OpenAI, The Record). Communicate the deadline in-app, by email, and on the status page. Ship before announcing.

Exit: old certificate revoked on schedule; update curve approaching baseline.

Why it works

The playbook breaks the chain at distribution. After revocation, even an attacker holding the stolen key cannot ship a fraudulent binary, because the platform refuses to honor it (OpenAI, Datadog).

When this backfires

  • No signed-binary distribution: SaaS-only teams skip steps 4 to 6 entirely.
  • Small teams lack the support, legal, and update-channel infrastructure for Apple revocation and a forced-update cycle (TWiT).
  • No endpoint telemetry on dev workstations: step 1 assumes you can identify which machines ran the install, and dev machines are usually under-instrumented (Aikido).
  • A poorly-sized window: tight windows trigger backlash (Jamf), and past ~30 days leaves the breach open. Size by telemetry.
  • The playbook is the fallback, not the strategy: allowlists like trustedDependencies (npm 10.3+), pnpm allowBuilds, @lavamoat/allow-scripts, or sandboxed installs block the script outright. Budget there first.

Example

OpenAI's response to the May 2026 TanStack incident is the worked example of every step (OpenAI):

Step What OpenAI did
1. Isolate Two impacted employee devices identified and contained
2. Rotate Credentials in the impacted internal source repositories rotated
3. Freeze Deploy workflows for affected products paused during cert rotation
4. Re-sign New certs issued for ChatGPT Desktop, Codex App, Codex CLI, and Atlas across Windows, macOS, iOS, Android
5. Notarization Coordinated with Apple to block further notarization of macOS apps using the impacted material
6. Forcing-function Announced May 13, 2026; revocation deadline June 12, 2026; macOS users required to update before that date

The May 13 announcement landed two days after the May 11 compromise — the detection speed that made a ~30-day window viable.

Key Takeaways

  • The consumer-side dev-machine vector is structurally distinct from the publisher-side CI vector. Most public analysis covers the publisher side; this playbook covers the consumer side.
  • The breach starts at npm install on one laptop and ends at the distribution channel when signing material is reachable from corporate repos.
  • Containment runs in a fixed order: isolate, rotate, freeze, re-sign, revoke, force-update. Each step has an exit criterion; skipping ahead leaves a window open.
  • A forcing-function client-update deadline closes the distribution channel. Window length — OpenAI's was ~30 days — trades breach exposure against user disruption.
  • Per-package allowlists and sandboxed installs are cheaper than the playbook. Budget prevention first.
Feedback