Definition
Model collapse occurs when models trained on synthetic/AI-generated data progressively lose quality and diversity.
How It Happens:
1. A model generates content
2. That content joins the training data
3. New models are trained on this data
4. Each generation loses some information
5. Eventually, outputs become low quality and repetitive
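The loop above can be sketched with a toy simulation. This is only an illustration under stated assumptions, not how any real model is trained: the "model" is a table of token probabilities, "training" is counting frequencies in the previous generation's samples, and the vocabulary size, sample count, and Zipf-like starting distribution are all made up for the example.

```python
import random
from collections import Counter


def train(samples, vocab):
    """'Train' a toy model: estimate each token's probability by its frequency."""
    counts = Counter(samples)
    total = len(samples)
    return {tok: counts[tok] / total for tok in vocab}


def generate(model, n, rng):
    """Sample n tokens from the model. Tokens with probability 0 can never appear."""
    tokens = [t for t, p in model.items() if p > 0]
    weights = [model[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=n)


def simulate(generations=30, n_samples=200, vocab_size=20, seed=0):
    """Return the number of tokens the model can still produce at each generation."""
    rng = random.Random(seed)
    vocab = [f"tok{i}" for i in range(vocab_size)]

    # Hypothetical "human" data: a Zipf-like distribution with a long tail.
    raw = [1 / (i + 1) for i in range(vocab_size)]
    norm = sum(raw)
    model = {t: w / norm for t, w in zip(vocab, raw)}

    support = [sum(1 for p in model.values() if p > 0)]
    for _ in range(generations):
        data = generate(model, n_samples, rng)  # steps 1-2: output becomes data
        model = train(data, vocab)              # step 3: next model trains on it
        support.append(sum(1 for p in model.values() if p > 0))
    return support


if __name__ == "__main__":
    for gen, size in enumerate(simulate()):
        print(f"generation {gen}: {size} distinct tokens still possible")
```

In this toy setup a rare token that happens to be missed in one generation gets probability zero and cannot return, so the count of producible tokens can only stay flat or fall. That mirrors the loss of tail information in step 4, though real models degrade in more complicated ways.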
Research Findings:
- Irreversible quality degradation
- Loss of minority/tail information
- Convergence to limited outputs
- Affects both text and images
Implications:
- Internet increasingly AI-generated
- Future training data contaminated
- Need for data provenance
- Human data becomes more valuable
Mitigation:
- Track AI-generated content
- Maintain human data sources
- Filter training data
- Data diversity requirements
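As a rough sketch of the tracking and filtering ideas above, the snippet below assumes every record already carries a provenance label. The `Record` type, the `source` values ("human", "synthetic", "unknown"), and the `max_synthetic_fraction` parameter are hypothetical names chosen for this example; in practice, obtaining reliable labels is the hard part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    text: str
    source: str  # hypothetical provenance label: "human", "synthetic", or "unknown"


def build_training_set(records, max_synthetic_fraction=0.1):
    """Keep all human records, cap synthetic ones, and drop unlabeled ones."""
    if not 0 <= max_synthetic_fraction < 1:
        raise ValueError("max_synthetic_fraction must be in [0, 1)")

    human = [r for r in records if r.source == "human"]
    synthetic = [r for r in records if r.source == "synthetic"]

    # Largest s with s / (len(human) + s) <= max_synthetic_fraction.
    budget = int(max_synthetic_fraction * len(human) / (1 - max_synthetic_fraction))
    return human + synthetic[:budget]
```

Dropping records of unknown origin is a conservative choice made for this sketch; a real pipeline might instead route them to a detector or a manual review step.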
Examples
Image models trained on AI-generated images produce increasingly generic faces with each generation.