The run-trust layer for agent skills

The trusted skill brain for open agents.

Connect your open-source agent to one endpoint, and it gains a curated, cryptographically-signed, sandboxed set of skills — without poisoning it.

py -m warden serve · pure standard library · zero dependencies · nothing leaves your box

✓ Open source ✓ 75/75 self-test ✓ CI passing ✓ Zero dependencies ✓ Verifies on a fresh clone
warden — the magic moment
~20,000places to find agent skills
0trustworthy places to run them

We don't own directory size. We own trust + curation. Directory figures (mcp.so ~20k, Glama 6k+) are cited from the project plan's sources, not independently verified.

Skills are multiplying. So are the attacks on them. Watch Warden catch one.

Real CLI output, ~15 seconds — a curated skill passes, a poisoned one is rejected, a tampered one won't run. No mock-ups.

Warden in action: a curated skill scans clean; a poisoned skill is rejected with critical findings (tool-poisoning, unsafe-exec, SSRF, capability drift); a tampered, rug-pulled skill fails hash verification.

A verified badge is not a safe skill

The one finding the whole project turns on: verification of identity is not verification of behavior. A "verified author" badge can still turn malicious on its next update.

Tool poisoning

Hidden "ignore previous instructions," covert directives, and smuggled tags that hijack the agent from inside a skill's text.

Rug-pull

Ship a benign skill, earn trust, then quietly swap in malice on a later version. Identity stays "verified" the whole time.

Secret exfiltration

Read your environment or credentials and ship them out — often in the same breath, behind an innocent-looking task.

Capability drift the keystone

A manifest that declares "no network" wrapped around a skill that actually phones home. Warden reconciles the claim against the content.

Six pillars of trust — all real in the repo

Built to the OWASP Agentic Skills Top 10. Not a silver bullet — defense in depth, so one failure is contained and visible rather than silent.

01

Content-addressed + signed + pinned

You connect to a hash, not a name. Ed25519-signed (real RFC 8032). Change one byte and verification fails — no rug-pull.

02

Intake scanning

Tool-poisoning, unsafe-exec, SSRF, secret-exfil, obfuscation, and capability drift — caught at the door.

03

Deny-by-default capabilities

Each skill declares exactly what it may touch. "No network" means it cannot phone home. Anything undeclared is denied.

04

Sandboxed execution

Skills run inside a declared profile, never the agent's process. A poisoned skill is contained.

05

Behavioral trust score

Per-version and time-aware — re-publishing re-evaluates. A signed skill can lose trust. Not a static badge.

06

Public transparency log

Append-only, hash-linked, Merkle-rooted. Every publish and yank is permanent and auditable. Nothing changes silently.

Skills in any source · untrusted TRUST CONTROL Scan & Sign OWASP scan + content hash TRUST CONTROL Skill brain sandboxed · deny-by-default Your agent MCP · local · your box TRANSPARENCY LOG · every publish & version · hash-linked + Merkle-rooted · auditable by anyone ✓ pinned hash = no rug-pull ✓ sandbox = contained ✓ deny-by-default
Untrusted in, trusted out — and every version on a public, auditable log.

Run it in 60 seconds

Python 3.8+ (on Windows use the py launcher). No pip install — pure standard library, zero dependencies.

1 sign & verify the curated pack
git clone https://github.com/chadcorp/warden && cd warden
py -m warden keygen       # your curator key (root of trust)
py -m warden sign-all     # scan + sign + log every skill
py -m warden verify-all   # cold-verify: hash, sig, scan, score, log
2 point any MCP agent at the node
// claude_desktop_config.json — the one config line
{
  "mcpServers": {
    "warden": {
      "command": "py",
      "args": ["-m", "warden", "serve"],
      "cwd": "/path/to/warden"
    }
  }
}

…or drive it yourself with py examples/mcp_client_smoke.py, and sanity-check the whole stack with py -m warden selftest75/75. Full quickstart on GitHub →

See the trust controls work live

Real output from the reference scanner and verifier — not mock-ups. Toggle the scanner between a curated and a poisoned skill, then tamper the bundle and watch the rug-pull get caught.

Intake scanner


    

Pinned hash = no rug-pull

skillresearch-brain/idea-scout
signed hashsha256:208b0208cd3c58a9…
re-derived nowsha256:208b0208cd3c58a9…
VERIFIED — 11/11 checks · the bytes match the signature

An honest trust gradient

Five curated skills, three packs. No vanity scores — the badge tells the truth about each one. secret-sentinel is a C on purpose.

Warden A/100 ✓research-brain/idea-scout

Find and score a net-new idea before building — evidence gates and a mandatory pre-mortem.

no network · no filesystem · no shell · no secrets
Warden A/100 ✓research-brain/fact-gate

Verify every factual, legal, or financial claim against a dated primary source before it ships.

no network · no filesystem · no shell · no secrets
PROVISIONAL A/99 ✓build-brain/build-product

Turn a validated idea into a complete, verified, shippable product. Still accruing clean observation.

no network · no filesystem · no shell · no secrets
Warden A/100 ✓build-brain/ship-gate

An independent GO/NO-GO release gate that trusts no build claim and blocks on unwaivable conditions.

no network · no filesystem · no shell · no secrets
PROVISIONAL C/79 ✓compliance-brain/secret-sentinel

A security-review skill that must name attack indicators — so it ships a signed, logged scanner waiver, and the score reflects the honest cost.

no network · no filesystem · no shell · no secrets

Not a directory. Not a memory product.

The durable moat is curation + the behavioral trust score + local-first — not signing alone, which becomes table stakes if it lands in the official registry. Eyes open.

CategoryThey doWarden does
MCP registries
mcp.so, Glama, Smithery
List thousands of servers — discoveryVouch for a curated few — run-trust
Memory platforms
Mem0, Zep, Letta
A headline memory productMemory as a scoped, safe feature — never the headline
Trust incumbents
hosted / enterprise
Hosted, identity-verificationOSS-native · local-first · behavioral-trust · curated

What's true today — stated plainly

Built & verified · local, zero-dependency

  • The Trust Spec + 6 signed skills (incl. a sandboxed code skill) + a poisoned fixture
  • Sandboxed code execution · encrypted private memory · signed knowledge packs
  • Safe auto-update · org policy · audit log · multi-curator trust · the Scan API
  • Self-test 75/75 · Ed25519 + ChaCha20-Poly1305 byte-identical to a standard library

The business rollout · what remains

  • Hosting + the paid tiers (Pro / Team / Scan API) + billing + SSO
  • Production-grade isolation (container / microVM / WASM) for untrusted code
  • Constant-time / HSM signing for a production curator
  • We built the capabilities; validating the business is the next gate

Trust is a signal, not a guarantee. We never claim "100% safe."

Be there when the node ships

The core is free, forever — no install wall, no account. Join the list and you'll get the launch, the build log, and one short write-up per attack class — tool-poisoning, capability drift, rug-pulls — as the supply-chain hits keep landing. A handful of emails, no spam, unsubscribe in one click.

Prefer to watch the code? Star it on GitHub — stars are the other half of our launch gate. Want the deep dive today? Read the Field Guide ↗