AI Glossary
Prompt Injection
An attack that attempts to manipulate an AI system by providing instructions that interfere with its intended behavior.
Prompt Injection
Overview
Many AI systems rely on instructions to determine how they should behave.
These instructions may come from developers, system prompts, user prompts, or external sources.
Prompt injection occurs when someone deliberately provides instructions designed to override, manipulate, or interfere with the intended behavior of an AI system.
A helpful way to think about prompt injection is conflicting directions.
Imagine an employee receives instructions from their manager and then receives different instructions from someone attempting to influence their actions.
The employee may become confused about which instructions to follow.
AI systems can experience a similar challenge.
When applications connect AI models to documents, websites, databases, or external tools, attackers may attempt to insert instructions that alter how the model behaves.
Prompt injection has become one of the most widely discussed AI security concerns because it targets how AI systems process instructions rather than traditional software vulnerabilities.
Understanding prompt injection helps organizations design safer AI applications and implement appropriate safeguards.
Why It Matters
Prompt injection can influence how AI systems respond and interact with users and external systems.
Real-World Example
An attacker may insert hidden instructions into a document that an AI assistant later processes, causing unexpected behavior.