AI Glossary

Prompt Injection

An attack that attempts to manipulate an AI system by providing instructions that interfere with its intended behavior.

Prompt Injection

Overview

Many AI systems rely on instructions to determine how they should behave.

These instructions may come from developers, system prompts, user prompts, or external sources.

Prompt injection occurs when someone deliberately provides instructions designed to override, manipulate, or interfere with the intended behavior of an AI system.

A helpful way to think about prompt injection is conflicting directions.

Imagine an employee receives instructions from their manager and then receives different instructions from someone attempting to influence their actions.

The employee may become confused about which instructions to follow.

AI systems can experience a similar challenge.

When applications connect AI models to documents, websites, databases, or external tools, attackers may attempt to insert instructions that alter how the model behaves.

Prompt injection has become one of the most widely discussed AI security concerns because it targets how AI systems process instructions rather than traditional software vulnerabilities.

Understanding prompt injection helps organizations design safer AI applications and implement appropriate safeguards.

Why It Matters

Prompt injection can influence how AI systems respond and interact with users and external systems.

Real-World Example

An attacker may insert hidden instructions into a document that an AI assistant later processes, causing unexpected behavior.

Prompt Injection

Prompt Injection

Overview

Why It Matters

Real-World Example

Related Concepts

Related Articles