In one paragraph
Squirrelops AI Deception uses four independent defensive layers and bit-identical reproducible profile bundles. Layer 1 is deterministic rule-based detection; Layer 2 is an ML classifier; Layer 3 is the decoy persona model itself (trained to emit only recognizably fake credentials, even when both detection stages miss); Layer 4 is cryptographic signing of every profile bundle (the runtime refuses to load any bundle not signed by an authorized key). Any single layer is sufficient to prevent a real-credential leak. Customers can rebuild any bundle from source and verify it is bit-identical to the deployed artifact: same bytes, same signature, same hash.
Defense in depth
Four independent layers.
Any one of these is sufficient to prevent a real-credential leak. The point is not redundancy for its own sake — it's that the worst case (a novel attack that evades every detection stage) is still safe, because the persona model itself is constructed to never emit a real secret.
Rule-based detection
Deterministic patterns inspect every incoming prompt. Fast, explainable, auditable. Roughly sixteen patterns ship by default and operators can add their own per-tenant rules on top.
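A minimal sketch of how a deterministic rule layer with per-tenant additions might be structured. The rule names, the two example patterns, and the `scan_prompt` function are hypothetical illustrations, not the shipped rule set.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern


# Hypothetical defaults for illustration; the real default patterns are not shown here.
DEFAULT_RULES = (
    Rule("ignore-instructions",
         re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I)),
    Rule("reveal-secret",
         re.compile(r"(reveal|print|show) (the |your )?(api key|password|secret)", re.I)),
)


def scan_prompt(prompt, tenant_rules=()):
    """Return the names of all matching rules, defaults first, then tenant rules."""
    return [rule.name
            for rule in (*DEFAULT_RULES, *tenant_rules)
            if rule.pattern.search(prompt)]
```

Because each hit carries a rule name, a flagged prompt can be traced back to the exact pattern that fired, which is what makes this layer explainable and auditable.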
ML detection
A neural prompt-injection classifier catches novel phrasings the rule-based layer doesn't. Runs on commodity CPU; no GPU dependency for inference.
Decoy model
The persona model itself is trained to emit only recognizably fake credentials. Even if rule-based and ML detection both miss an attack, the model independently guarantees nothing real escapes.
Cryptographic signing
Every profile bundle is signed at build time. The runtime refuses to load any bundle whose signature doesn't match an authorized key. No unsigned or tampered artifact can reach production.
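As a rough illustration of a load-time gate like this, the sketch below refuses any bundle whose tag fails verification. It uses HMAC-SHA256 from the Python standard library as a stand-in so the example stays self-contained; the product's actual signature scheme and key format are not described here, and a production system would more plausibly use an asymmetric signature. `sign_bundle`, `load_bundle`, `AUTHORIZED_KEYS`, and the key id are hypothetical names.

```python
import hashlib
import hmac

# Hypothetical key store (key id -> secret), standing in for a set of authorized keys.
AUTHORIZED_KEYS = {"release-2026": b"example-secret-not-for-production"}


class UntrustedBundleError(Exception):
    """Raised when a bundle cannot be tied to an authorized key."""


def sign_bundle(data: bytes, key_id: str) -> bytes:
    """Build-time step: compute the tag that ships alongside the bundle."""
    return hmac.new(AUTHORIZED_KEYS[key_id], data, hashlib.sha256).digest()


def load_bundle(data: bytes, key_id: str, signature: bytes) -> bytes:
    """Load-time gate: return the bundle bytes only if the tag verifies."""
    key = AUTHORIZED_KEYS.get(key_id)
    if key is None:
        raise UntrustedBundleError(f"unknown key id: {key_id!r}")
    expected = hmac.new(key, data, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(expected, signature):
        raise UntrustedBundleError("signature mismatch")
    return data
```

The design point is that verification happens before any part of the bundle is parsed, so a modified artifact is rejected without ever being interpreted.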
Reproducible profile bundles
Prove that what's running in production is what you tested.
Every profile bundle is a signed, sealed artifact. Given the source materials, our reproducibility check rebuilds the bundle and proves it is bit-identical to the one deployed — same bytes, same signature, same hash. No supply-chain ambiguity. No “did someone swap the model on the way to production?”
The check runs in two modes:
- Fast (~15 seconds) — rebuilds from staged inputs. Suitable for CI on every release.
- Full (~15 minutes) — rebuilds end-to-end from source. Suitable for incident-response forensics or independent verification by a customer's security team.
Both modes produce the same final hash for a given input set. Anything that breaks reproducibility breaks the release gate.
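The core of a check like this is a deterministic build followed by a hash comparison. The sketch below assumes, for illustration only, that a bundle is an uncompressed tar archive of named files; `build_bundle`, `bundle_hash`, and `matches_deployed` are hypothetical names, and the real tooling, bundle layout, and signature handling are not shown.

```python
import hashlib
import io
import tarfile


def build_bundle(files: dict) -> bytes:
    """Pack {name: bytes} into a tar archive with every varying field pinned."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):           # fixed member order
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = 0                   # no wall-clock timestamps
            info.mode = 0o644
            info.uid = info.gid = 0          # no builder identity
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def bundle_hash(files: dict) -> str:
    """SHA-256 of the rebuilt archive, as a hex string."""
    return hashlib.sha256(build_bundle(files)).hexdigest()


def matches_deployed(files: dict, deployed_hash: str) -> bool:
    """True only if rebuilding from these inputs yields the deployed hash."""
    return bundle_hash(files) == deployed_hash
```

Sorting members and zeroing timestamps and ownership removes the usual sources of build-to-build drift, so two rebuilds from the same inputs agree byte for byte and any changed input changes the hash.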
Recent hardening · v1.0.5 · 2026-05-24
Defense-in-depth hardening release.
v1.0.4 shipped the capability. v1.0.5 hardens the perimeter around it. This release closes 14 issues from a comprehensive internal audit of the runtime and build pipeline.
- SSRF protection on the hosted-model adapter (rejects userinfo, IDN homoglyphs, private/loopback IPs).
- Decompression-bomb defense on profile loading: large members stream to disk with inline hashing instead of into RAM.
- Tar-smuggle defense: explicit member-type allow-list rejects sparse, hardlink, and LONGNAME entries.
- CSV-injection defense on exported reports.
- Input-size caps on the local model backend (8 KiB / 16 messages).
- Numerical safety guards on the ML classifier (no more NaN cascades routing benign traffic to high-severity buckets).
- Tighter regex bounds across the detection pipeline (preventing pathological-input compute blowup).
- Six additional quality fixes around training reproducibility and report sanitization.
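Two of the items above, the hosted-model URL checks and the tar member allow-list, can be sketched briefly. This is an illustration under stated assumptions, not the shipped code: `check_outbound_url` and `scan_member_types` are hypothetical names, the scheme restriction and the exact set of allowed member types are guesses, and the URL check does not cover DNS names that later resolve to private addresses.

```python
import io
import ipaddress
from urllib.parse import urlsplit

BLOCK = 512
# Assumed allow-list: regular files (both encodings) and directories only.
ALLOWED_TYPEFLAGS = {b"0", b"\0", b"5"}


def check_outbound_url(url: str) -> str:
    """Return the host if the URL passes the checks, else raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("unsupported scheme")
    if parts.username is not None or parts.password is not None:
        raise ValueError("userinfo in URL is not allowed")
    host = parts.hostname or ""
    if not host:
        raise ValueError("URL has no host")
    # Crude homoglyph defense: refuse non-ASCII and punycode labels outright.
    if not host.isascii() or any(p.startswith("xn--") for p in host.split(".")):
        raise ValueError("internationalized hostnames are not allowed")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return host  # DNS name: resolved addresses need the same check at connect time
    if ip.is_private or ip.is_loopback or ip.is_link_local:
        raise ValueError("private or loopback address is not allowed")
    return host


def scan_member_types(fileobj) -> int:
    """Walk raw tar headers in a seekable file; reject any type not allow-listed."""
    count = 0
    while True:
        header = fileobj.read(BLOCK)
        if not header or header == b"\0" * BLOCK:
            return count                      # end-of-archive marker
        if len(header) < BLOCK:
            raise ValueError("truncated tar header")
        typeflag = header[156:157]            # typeflag byte in the ustar header
        if typeflag not in ALLOWED_TYPEFLAGS:
            raise ValueError(f"tar member type {typeflag!r} is not allowed")
        size_field = header[124:136]          # octal ASCII size field
        if size_field[0] & 0x80:
            raise ValueError("base-256 size fields are not allowed")
        size = int(size_field.strip(b"\0 ") or b"0", 8)
        fileobj.seek(-(-size // BLOCK) * BLOCK, io.SEEK_CUR)  # skip padded data
        count += 1
```

Scanning raw headers rather than parsed members matters for the LONGNAME case, because Python's `tarfile` resolves GNU long-name headers transparently and never exposes them as a member type. An allow-list also rejects hardlink (`1`), sparse (`S`), and any other unlisted type by default.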
815 tests pass on the release branch.
Independent verification, on request.
Pilot customers receive the reproducibility tooling and the source manifest needed to verify any signed bundle they receive. Your security team rebuilds. Your security team confirms.
