How many defensive layers does Squirrelops AI Deception use?

Four independent layers, any one of which is sufficient to prevent a real-credential leak: rule-based detection (deterministic patterns), ML detection (a neural prompt-injection classifier), the decoy model itself (trained to emit recognizably-fake credentials even on detection misses), and cryptographic signing on every profile bundle (the runtime refuses to load anything untested).

What was the v1.0.5 hardening release?

v1.0.4 shipped the capability; v1.0.5 (2026-05-24) hardens the perimeter. The release closes 14 issues found in a comprehensive internal audit. Fixes include SSRF protection on the hosted-model adapter, decompression-bomb defense on profile loading, tar-smuggle defense with an explicit member-type allow-list, CSV-injection defense on exported reports, input-size caps on the local model backend, numerical safety guards on the ML classifier, and tighter regex bounds across the detection pipeline. 815 tests pass on the release branch.

Can customers verify reproducibility themselves?

Yes. Pilot customers receive the reproducibility tooling and the source manifest needed to verify any signed bundle they receive. The customer's security team rebuilds and confirms — no trust in the vendor required for the artifact-to-source link.

Security

Defense in depth, by construction.

Four independent defensive layers. Cryptographically signed profile bundles. Bit-identical reproducible builds. Customers can verify that what runs in production is exactly what their security team tested.

In one paragraph

Squirrelops AI Deception uses four independent defensive layers and bit-identical reproducible profile bundles. Layer 1 is deterministic rule-based detection; Layer 2 is an ML classifier; Layer 3 is the decoy persona model itself (trained to emit only recognizably-fake credentials even when both detection stages miss); Layer 4 is cryptographic signing of every profile bundle (the runtime refuses to load anything untested). Any single layer is sufficient to prevent a real-credential leak. Customers can rebuild any bundle from source and verify it is bit-identical to the deployed artifact — same bytes, same signature, same hash.

Defense in depth

Four independent layers.

Any one of these is sufficient to prevent a real-credential leak. The point is not redundancy for its own sake — it's that the worst case (a novel attack that evades every detection stage) is still safe, because the persona model itself is constructed to never emit a real secret.

Layer 1

Rule-based detection

Deterministic patterns inspect every incoming prompt. Fast, explainable, auditable. Roughly sixteen patterns ship by default, and operators can add their own per-tenant rules on top.
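
As an illustration of how deterministic rules with per-tenant additions might be structured, here is a minimal Python sketch. The `Rule` class, the `match_rules` function, and both example patterns are hypothetical; they are not the shipped rule set.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern


# Hypothetical default rules, for illustration only.
DEFAULT_RULES = (
    Rule("ignore-previous-instructions",
         re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE)),
    Rule("credential-request",
         re.compile(r"(reveal|print|show) (the |your )?(api key|password)", re.IGNORECASE)),
)


def match_rules(prompt: str, tenant_rules: Iterable[Rule] = ()) -> List[str]:
    """Return the name of every matching rule, defaults first, then tenant rules."""
    return [rule.name
            for rule in (*DEFAULT_RULES, *tenant_rules)
            if rule.pattern.search(prompt)]
```

Returning rule names rather than a bare boolean keeps each decision explainable and auditable: a reviewer can see exactly which pattern fired.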

Layer 2

ML detection

A neural prompt-injection classifier catches novel phrasings that the rule-based layer misses. It runs on commodity CPUs, with no GPU dependency for inference.

Layer 3

Decoy model

The persona model itself is trained to emit recognizably-fake credentials. Even if rule-based and ML detection both miss an attack, the model independently guarantees nothing real escapes.

Layer 4

Cryptographic signing

Every profile bundle is signed at build time. The runtime refuses to load any bundle whose signature doesn't match an authorized key. No untested artifact can reach production.
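
The refuse-to-load behavior can be sketched in a few lines of Python. The signature scheme is not specified on this page, so this sketch swaps in HMAC-SHA256 from the standard library as a stand-in; a production system would more likely verify an asymmetric signature against public keys. The function names and the key map are hypothetical.

```python
import hashlib
import hmac
from typing import Mapping


def sign_bundle(bundle: bytes, key: bytes) -> str:
    """Stand-in signer: HMAC-SHA256 over the raw bundle bytes (an assumption)."""
    return hmac.new(key, bundle, hashlib.sha256).hexdigest()


def verify_before_load(bundle: bytes, signature: str,
                       authorized_keys: Mapping[str, bytes]) -> str:
    """Return the id of the authorized key that signed the bundle, or refuse."""
    for key_id, key in authorized_keys.items():
        expected = sign_bundle(bundle, key)
        # Constant-time comparison avoids leaking how many characters matched.
        if hmac.compare_digest(expected.encode(), signature.encode()):
            return key_id
    raise PermissionError("bundle signature does not match any authorized key")
```

The loader calls `verify_before_load` first and only parses the bundle if it returns; any exception means the bundle is never opened.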

Reproducible profile bundles

Prove that what's running in production is what you tested.

Every profile bundle is a signed, sealed artifact. Given the source materials, our reproducibility check rebuilds the bundle and proves it is bit-identical to the one deployed — same bytes, same signature, same hash. No supply-chain ambiguity. No “did someone swap the model on the way to production?”

The check runs in two modes:

Fast (~15 seconds) — rebuilds from staged inputs. Suitable for CI on every release.
Full (~15 minutes) — rebuilds end-to-end from source. Suitable for incident-response forensics or independent verification by a customer's security team.

Both modes produce the same final hash for a given input set. Any change that breaks reproducibility fails the release gate.
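
To show what makes a rebuild bit-identical, here is a minimal Python sketch. It assumes a bundle is an uncompressed tar archive, which this page does not confirm; the function names are hypothetical. The technique is to remove every source of nondeterminism (member order, timestamps, ownership) so that the same inputs always yield the same bytes.

```python
import hashlib
import io
import tarfile
from typing import Mapping


def build_bundle(files: Mapping[str, bytes]) -> bytes:
    """Build a tar archive whose bytes depend only on file names and contents."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        for name in sorted(files):          # fixed member order
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = 0                  # no build-time timestamps
            info.mode = 0o644
            info.uid = info.gid = 0         # no builder-specific ownership
            info.uname = info.gname = ""
            archive.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def is_bit_identical(source_files: Mapping[str, bytes], deployed_bundle: bytes) -> bool:
    """Rebuild from source and compare hashes with the deployed artifact."""
    return sha256_hex(build_bundle(source_files)) == sha256_hex(deployed_bundle)
```

Under these assumptions, two machines that start from the same source files produce the same hash, and any single changed byte in the inputs changes it.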

Recent hardening · v1.0.5 · 2026-05-24

Defense-in-depth hardening release.

v1.0.4 shipped the capability. v1.0.5 hardens the perimeter around it. This release closes 14 issues from a comprehensive internal audit of the runtime and build pipeline.

SSRF protection on the hosted-model adapter (rejects userinfo, IDN homoglyphs, private/loopback IPs).
Decompression-bomb defense on profile loading: large members stream to disk with inline hashing instead of into RAM.
Tar-smuggle defense: explicit member-type allow-list rejects sparse, hardlink, and LONGNAME entries.
CSV-injection defense on exported reports.
Input-size caps on the local model backend (8 KiB / 16 messages).
Numerical safety guards on the ML classifier (no more NaN cascades routing benign traffic to high-severity buckets).
Tighter regex bounds across the detection pipeline (preventing pathological-input compute blowup).
Six additional quality fixes around training reproducibility and report sanitization.
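
As a rough illustration of the member-type allow-list in the list above, the following Python sketch uses the standard `tarfile` module. The allowed set (regular files and directories only) and the 100-byte name cap are assumptions. Python's `tarfile` resolves GNU LONGNAME headers transparently while reading, so the name-length cap is only an approximation of rejecting LONGNAME entries.

```python
import tarfile
from typing import BinaryIO, List

# Assumption: only plain files and directories are expected in a bundle.
ALLOWED_TYPES = {tarfile.REGTYPE, tarfile.AREGTYPE, tarfile.DIRTYPE}
# Assumption: names longer than the classic 100-byte header field are rejected.
MAX_NAME_BYTES = 100


def validate_members(fileobj: BinaryIO) -> List[str]:
    """Return member names if every entry passes the allow-list, else raise."""
    names = []
    # "r:" reads uncompressed tar only, so nothing is decompressed implicitly.
    with tarfile.open(fileobj=fileobj, mode="r:") as archive:
        for member in archive:
            if member.type not in ALLOWED_TYPES:
                # Catches hardlinks, symlinks, devices, and old-style GNU sparse.
                raise ValueError(f"disallowed member type {member.type!r}: {member.name}")
            if member.issparse():
                # Catches sparse files declared through pax headers.
                raise ValueError(f"sparse member rejected: {member.name}")
            if len(member.name.encode("utf-8", "surrogateescape")) > MAX_NAME_BYTES:
                raise ValueError("member name exceeds the allowed length")
            names.append(member.name)
    return names
```

An allow-list fails closed: a member type added to the tar format later is rejected by default instead of slipping past a deny-list.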

815 tests pass on the release branch. See the full changelog →
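
The SSRF checks named in the list above (userinfo, IDN homoglyphs, private and loopback IPs) might look roughly like the following Python sketch. The function name, the https-only rule, and the blanket rejection of all non-ASCII and punycode hostnames are assumptions; the real adapter may be more selective about which internationalized names it accepts.

```python
import ipaddress
import socket
from typing import Callable, List
from urllib.parse import urlsplit


def _resolve(host: str, port: int) -> List[str]:
    """Default resolver: every address the host name currently maps to."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def check_model_url(url: str,
                    resolve: Callable[[str, int], List[str]] = _resolve) -> str:
    """Return the validated host name, or raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("only https URLs are accepted")
    if "@" in parts.netloc:
        raise ValueError("userinfo is not allowed in the URL")
    host = parts.hostname
    if not host:
        raise ValueError("URL has no host")
    # Blunt homoglyph defense (assumption): reject any IDN, raw or punycode.
    if not host.isascii() or any(label.startswith("xn--") for label in host.split(".")):
        raise ValueError("internationalized host names are rejected")
    try:
        candidates = [ipaddress.ip_address(host)]        # host is an IP literal
    except ValueError:
        candidates = [ipaddress.ip_address(a) for a in resolve(host, parts.port or 443)]
    for addr in candidates:
        if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_unspecified:
            raise ValueError(f"address {addr} is not publicly routable")
    return host
```

A check like this only covers the moment of validation. To resist DNS rebinding, the connection should go to the address that was checked rather than resolving the name a second time.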

Independent verification, on request.

Pilot customers receive the reproducibility tooling and the source manifest needed to verify any signed bundle they receive. Your security team rebuilds. Your security team confirms.
