Stop prompt injection before it reaches your model

Attackers try to override your instructions, extract your system prompt, and jailbreak your guardrails. Emil screens every input and refuses the attack before your model ever sees it.

The attack surface

  • Users jailbreak your AI into ignoring its instructions and producing off-brand or unsafe output.
  • Injected instructions can exfiltrate your system prompt or hidden context.
  • Indirect injection hides instructions inside documents your AI reads.

What Emil detects

  • Instruction-override attempts ('ignore previous instructions')
  • System-prompt extraction and reveal attempts
  • Role/persona jailbreaks (DAN-style, 'developer mode')
  • Safety-bypass language and fake system delimiters

Defense in depth

  • Deterministic patterns catch the blatant cases in sub-millisecond time
  • A model classifier catches the fuzzy long tail
  • Obfuscation-resistant matching defeats leetspeak and homoglyph evasion
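
The layering above can be sketched roughly as follows. This is a minimal illustration, not Emil's implementation: the patterns, the `screen` function, the `classifier` callable, and the 0.8 threshold are all assumptions made for the example.

```python
import re
from typing import Callable

# Hypothetical patterns for illustration only; not Emil's actual rule set.
# Each one requires several parts (verb + target) rather than a single keyword.
PATTERNS = [
    re.compile(
        r"\b(ignore|disregard|forget)\b.{0,40}\b(previous|prior|above)\b"
        r".{0,40}\b(instructions|rules|prompt)\b",
        re.IGNORECASE | re.DOTALL,
    ),
    re.compile(
        r"\b(reveal|print|show|repeat)\b.{0,40}\b(system prompt|hidden instructions)\b",
        re.IGNORECASE | re.DOTALL,
    ),
]


def screen(text: str, classifier: Callable[[str], float], threshold: float = 0.8) -> str:
    """Return "block" or "allow" for one input (assumed two-layer design)."""
    # Layer 1: cheap deterministic patterns handle the blatant cases.
    if any(p.search(text) for p in PATTERNS):
        return "block"
    # Layer 2: a model classifier scores whatever the patterns missed.
    # `classifier` is a stand-in that returns an injection probability in [0, 1].
    if classifier(text) >= threshold:
        return "block"
    return "allow"
```

Running the patterns first means the classifier is only consulted for inputs that are not obviously malicious, which keeps the common path fast.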

Questions

What is prompt injection?
Prompt injection is when a user (or a document the AI reads) smuggles in instructions that override what you told the model to do — to jailbreak it, extract its system prompt, or make it behave unsafely.
How does Emil stop it?
Emil screens the input before it reaches your model, detecting override attempts, jailbreak personas, and bypass language with deterministic patterns plus a model classifier, and refuses the request.
Does it handle obfuscated attacks?
Yes. Emil normalizes leetspeak and look-alike characters before matching, so '1gn0re previous instructions' is caught the same as the plain form.
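
A minimal sketch of that kind of normalization, assuming small hand-written substitution tables. The `normalize` function and both tables are hypothetical and far from exhaustive; a production system would need much broader coverage.

```python
import unicodedata

# Hypothetical look-alike table: a few Cyrillic letters that resemble Latin ones.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0456": "i",  # Cyrillic Byelorussian-Ukrainian i
    "\u0455": "s",  # Cyrillic dze
})

# Hypothetical leetspeak table.
LEET = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})


def normalize(text: str) -> str:
    """Return a matching-only form of `text` with common disguises undone."""
    text = unicodedata.normalize("NFKC", text)  # fold fullwidth and other compatibility forms
    text = text.lower()
    text = text.translate(HOMOGLYPHS)
    return text.translate(LEET)


print(normalize("1gn0re previous instructions"))
```

Because the leetspeak table also rewrites genuine digits, the normalized string should be used only for pattern matching; the original input is what gets passed on or refused.
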
Will it block legitimate prompts?
Emil is designed to avoid that. The patterns match only the multi-part structure of a real injection attempt, so benign instructions pass through. You can tune severity and action per policy.
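
As a rough illustration of per-policy tuning, the sketch below maps a detected severity to an action. Every name here (`Policy`, `decide`, the severity levels, the action strings, and the policy names) is an assumption for the example and does not describe Emil's actual configuration format.

```python
from dataclasses import dataclass

# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Policy:
    name: str
    min_severity: str  # lowest severity that triggers the action
    action: str        # what to do when triggered, e.g. "block" or "flag"


def decide(policy: Policy, detected_severity: str) -> str:
    """Return the policy's action if the detection is severe enough, else "allow"."""
    if SEVERITY_ORDER[detected_severity] >= SEVERITY_ORDER[policy.min_severity]:
        return policy.action
    return "allow"


# Hypothetical policies: a strict one for a public chatbot, a lenient one for internal use.
strict = Policy(name="public-chatbot", min_severity="low", action="block")
lenient = Policy(name="internal-tool", min_severity="high", action="flag")
```

With these two policies, a medium-severity detection is blocked for the public chatbot but allowed through for the internal tool.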
