Security
Prompt Injection Vulnerability
The AI is easily manipulated into breaking its operating constraints or persona.
What you will learn
Instruction Hierarchy
Input Sanitization
Vulnerability Testing
Why it fails
Without robust guardrails, users can override system instructions (e.g., "Ignore previous instructions") to make the AI generate harmful content, reveal secrets, or perform unauthorized actions. This anti-pattern shows what happens when those defenses are absent.
Root cause
Treating user input as equal in priority to system instructions, typically by naively concatenating the system prompt and the user prompt into a single string.
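A minimal sketch of the root cause and its fix (the function names are illustrative, not from this lab): naive concatenation erases the boundary between developer and user text, while a role-separated structure lets the serving layer enforce an instruction hierarchy.

```python
def naive_prompt(system: str, user: str) -> str:
    # Anti-pattern: one flat string -- the model cannot tell
    # which instructions came from the developer vs. the user.
    return system + "\n" + user

def structured_prompt(system: str, user: str) -> list[dict]:
    # Fix: keep system instructions in a separate, higher-priority
    # role so they are never mixed into user-controlled text.
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

With the structured form, an injected "Ignore previous instructions" arrives inside the user message, where the model (and any policy layer) can treat it as untrusted data rather than as an instruction.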
How to fix it
Instruction Hierarchy: keep system instructions in a separate, higher-priority channel than user input.
Input Sanitization: detect and refuse known injection patterns before they reach the model.
Vulnerability Testing: probe the deployed system with adversarial prompts to verify the guardrails hold.
Implementation checklist
- Separate system instructions from the user prompt
- Detect and refuse known injection patterns
- Role-play testing (red teaming)
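The second checklist item can be sketched as a simple deny-list filter. The patterns below are hypothetical examples, not an exhaustive or official list; real filters need continual red-team updates, since attackers rephrase faster than any static list can track.

```python
import re

# Hypothetical deny-list of known injection phrasings (illustrative only).
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"disregard (the |your )?system prompt", re.I),
    re.compile(r"you are now (in )?developer mode", re.I),
]

def looks_like_injection(user_input: str) -> bool:
    """Return True if the input matches a known injection phrasing."""
    return any(p.search(user_input) for p in INJECTION_PATTERNS)
```

Pattern matching is a first line of defense, not a complete one: it catches known phrasings cheaply, but the instruction-hierarchy separation above is what limits the damage when a novel phrasing slips through.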