Glossary

Plain English, with the mapping back to industry terms.

The terms below are how we describe the platform in customer-facing copy. Technical readers will find each entry mapped back to the industry-standard concept it corresponds to.

Decoy profile: A signed bundle containing the model, prompts, and detection rules that define a single decoy identity. Also referred to simply as a profile.
Profile bundle: The signed, reproducible artifact format used to ship a decoy profile to production.
Detection layer: The in-line stage that inspects every prompt for abuse signals before the model responds.
Rule-based detection: Deterministic pattern detection — fast, explainable, and auditable. Sits before ML detection in the pipeline.
ML detection: A neural prompt-injection classifier that catches novel phrasings the rule-based layer doesn't. Internally a fine-tuned DeBERTa-class model exported to ONNX for CPU inference.
Tracked credential: A unique, traceable fake credential issued to a suspected attacker. If anyone tries to use it later, we know exactly which session leaked it. Sometimes called a canary token.
Adapter fallback marker: A recognizable but un-tracked decoy credential the model emits if rule-based detection misses an attack. Guarantees no real secret is ever returned, even on a detection miss.
Persona model: The fine-tuned language model that gives each decoy its personality and produces convincing-but-fake responses. Internally a LoRA-adapted small open-weight model.
Adversarial test suite: Our internal evaluation harness that runs 80+ jailbreak techniques against every release. Used as the release gate for shipping a new profile bundle.
Per-tenant tuning: Customers can add their own detection rules on top of the shipped ruleset without weakening it. Also called an operator overlay. Overlays are additive only — they can raise a threat score, never lower it.
Cryptographic signing: Every profile bundle is signed; the runtime refuses to load anything that hasn't been signed by an authorized key. Internally uses Ed25519.
Reproducibility check: A tool that rebuilds any shipped profile from source and proves it is bit-identical to the deployed artifact. Runs in a fast mode (~15 seconds) and a full mode (~15 minutes).
Defense in depth: If detection misses, the persona model is independently trained to never emit real-looking credentials. The four independent layers (rule-based detection, ML detection, decoy model, cryptographic signing) each prevent a real-credential leak on their own.
Threat capture rate: The fraction of adversarial prompts that the platform correctly routes into the decoy. v1.0.4 measured at 98.5%.