AI Glossary
Overfitting
Overfitting occurs when a machine learning model learns the training data too closely, causing it to perform poorly on new, unseen data.
Overview
One of the biggest challenges in machine learning is teaching a model to learn useful patterns without memorizing every detail of the training data.
When a model memorizes instead of learning, the result is known as overfitting.
An overfitted model becomes extremely good at handling the data it was trained on but struggles when faced with new information.
A helpful way to think about it is a student preparing for an exam.
If the student memorizes every answer from previous tests without understanding the concepts, they may perform well on familiar questions but poorly on new ones.
Machine learning models can behave in much the same way.
Instead of learning broad patterns, they learn specific details, noise, and quirks found in the training dataset.
As a result, performance during training may look impressive while real-world performance suffers.
Avoiding overfitting is one of the most important goals in machine learning because successful models must perform well on data they have never seen before.
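The gap between training performance and performance on unseen data can be illustrated with a small sketch. It is not a real training pipeline: the data is synthetic, and the names make_data, memorizer, and threshold_rule are hypothetical helpers invented for this example. The "memorizer" is a 1-nearest-neighbour lookup that reproduces every training label, including the noisy ones, while the threshold rule can only learn one broad pattern.

```python
import random

def make_data(n, rng, noise=0.2):
    """Points x in [0, 1); the true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label  # label noise that a good model should NOT learn
        data.append((x, label))
    return data

def memorizer(train):
    """1-nearest-neighbour: predicts the label of the closest training point."""
    def predict(x):
        return min(train, key=lambda p: abs(p[0] - x))[1]
    return predict

def threshold_rule(train):
    """Picks the single cut-off that makes the fewest mistakes on the training data."""
    candidates = [i / 100 for i in range(101)]
    best = max(candidates, key=lambda t: sum((x > t) == y for x, y in train))
    return lambda x: x > best

def accuracy(predict, data):
    """Fraction of examples whose predicted label matches the given label."""
    return sum(predict(x) == y for x, y in data) / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable
train, unseen = make_data(200, rng), make_data(2000, rng)

memo, rule = memorizer(train), threshold_rule(train)
memo_train, memo_unseen = accuracy(memo, train), accuracy(memo, unseen)
rule_train, rule_unseen = accuracy(rule, train), accuracy(rule, unseen)

print(f"memorizer: train={memo_train:.2f} unseen={memo_unseen:.2f}")
print(f"threshold: train={rule_train:.2f} unseen={rule_unseen:.2f}")
```

The memorizer scores perfectly on the training set because every training point is its own nearest neighbour, yet it does worse on unseen data than the simpler threshold rule, whose training and unseen scores stay close together. That difference between the two scores is the usual warning sign of overfitting.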
Why It Matters
Overfitting can make machine learning systems appear more accurate than they actually are, because accuracy measured on the training data overstates how well the model will handle new data.
Understanding overfitting helps organizations build models that perform reliably in real-world environments.
Real-World Example
A fraud detection model may perform exceptionally well when tested on historical transactions but fail to detect new fraud patterns because it memorized those historical examples too closely.
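A toy version of this scenario might look like the sketch below. Everything in it is hypothetical: the transaction records are made up, and general_detector is a hand-written rule standing in for a model that learned the broader pattern, not the output of any real fraud system.

```python
# Hypothetical records: (merchant, amount, country_mismatch, is_fraud)
history = [
    ("shop_a", 912.40, True, True),
    ("shop_b", 15.00, False, False),
    ("shop_c", 640.00, True, True),
    ("shop_d", 48.99, False, False),
]
new_transactions = [
    ("shop_x", 875.10, True, True),   # same pattern, unseen merchant and amount
    ("shop_y", 22.50, False, False),
    ("shop_z", 701.00, True, True),
]

# "Memorizing" detector: flags only exact (merchant, amount) pairs seen as fraud.
known_fraud = {(m, a) for m, a, _, fraud in history if fraud}

def memorized_detector(merchant, amount, country_mismatch):
    return (merchant, amount) in known_fraud

# Stand-in for a model that learned the broader pattern (hand-written assumption).
def general_detector(merchant, amount, country_mismatch):
    return country_mismatch and amount > 500

def recall(detector, records):
    """Share of truly fraudulent records that the detector flags."""
    fraud = [r for r in records if r[3]]
    return sum(detector(*r[:3]) for r in fraud) / len(fraud)

memo_hist = recall(memorized_detector, history)
memo_new = recall(memorized_detector, new_transactions)
general_new = recall(general_detector, new_transactions)

print(memo_hist, memo_new, general_new)  # prints: 1.0 0.0 1.0
```

The memorizing detector catches every historical fraud case but none of the new ones, because the new merchants and amounts never appeared in its lookup table. The broader rule still catches them because it depends on the pattern rather than on specific past records.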
Related Concepts
- Model Evaluation
- Training Data
- Generalization
- Machine Learning
- Validation