A→Z
A2ZAI
Back to Glossary
concepts

Prompt Injection

Security vulnerability where malicious input hijacks AI model behavior.

Share:

Definition

Prompt injection attacks manipulate AI systems by embedding malicious instructions in input data.

  • **Types:**
  • Direct: User provides malicious prompt
  • Indirect: Malicious content in retrieved data
  • Hidden instructions in documents/websites

Attack Examples: - "Ignore previous instructions and..." - Hidden text in documents - Invisible characters - Data exfiltration attempts

Risks: - Data leakage - Unauthorized actions - Bypassing restrictions - System manipulation

Defenses: - Input sanitization - Output filtering - Privilege separation - Instruction hierarchy - Monitoring and detection

Analogy: - Similar to SQL injection - New attack surface for AI apps - Critical for production systems

Examples

A malicious PDF instructing an AI assistant to email sensitive data to attackers.

Want more AI knowledge?

Get bite-sized AI concepts delivered to your inbox.

Free daily digest. No spam, unsubscribe anytime.

Discussion