Definition
Perplexity measures how "surprised" a language model is by a piece of text, making it a standard metric of next-token prediction quality.
Intuition:
- Lower perplexity = better predictions
- A perplexity of 10 means the model is, on average, as uncertain as if it were choosing uniformly among 10 equally likely options at each step (see the sketch below)
- Good models have low perplexity on held-out test data
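To make the "choosing from 10 options" intuition concrete, here is a minimal sketch in plain Python (no model involved, probabilities are made up for illustration) showing that a uniform distribution over k tokens yields a perplexity of exactly k:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# A model that assigns uniform probability 1/10 to each of 10 observed tokens
uniform_probs = [1 / 10] * 10
print(perplexity(uniform_probs))  # 10.0 -- as confused as choosing from 10 options

# A confident model that assigns high probability to each observed token
confident_probs = [0.9, 0.8, 0.95, 0.85]
print(perplexity(confident_probs))  # ~1.15 -- close to the perfect score of 1
```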
Calculation:
- Exponential of the average per-token cross-entropy loss: PPL = exp(loss)
- Equivalently, the geometric mean of the inverse probabilities the model assigns to each actual token
- For a sequence of N tokens: PPL = exp(−(1/N) Σᵢ log p(wᵢ | w₁…wᵢ₋₁))
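As a quick illustration of the exp(loss) relationship, here is a minimal sketch assuming PyTorch; the tensor shapes and token ids are invented for demonstration, and any framework's cross-entropy works the same way:

```python
import torch
import torch.nn.functional as F

# Toy setup: logits for a 4-token sequence over a 50,257-token vocabulary
vocab_size = 50257
logits = torch.randn(4, vocab_size)           # model outputs, one row per position
targets = torch.tensor([42, 7, 1337, 50000])  # the tokens that actually occurred

# Cross-entropy averages the negative log-probability of each target token...
loss = F.cross_entropy(logits, targets)

# ...and perplexity is simply its exponential
ppl = torch.exp(loss)
print(f"loss = {loss.item():.3f}, perplexity = {ppl.item():.1f}")
```

With untrained (random) logits as above, the perplexity lands on the order of the vocabulary size, matching the "random guessing" value listed below.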
Typical Values:
- State-of-the-art LLMs: roughly 3-10 on standard benchmarks
- Random guessing: equal to the vocabulary size
- Perfect prediction: 1
Limitations:
- Doesn't measure coherence
- Doesn't measure factuality
- Dataset-dependent: scores are only comparable when computed on the same evaluation text
- Not comparable across models with different tokenizers, since per-token probabilities depend on how the text is segmented
Examples
GPT-2 reported a perplexity of 17.48 on the WikiText-103 benchmark; later language models have pushed reported values considerably lower.
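Benchmark perplexities like this are obtained by running the model over the evaluation text and exponentiating its average loss. Below is a minimal sketch of that procedure, assuming the Hugging Face transformers library and the publicly available GPT-2 checkpoint; real benchmark evaluation slides a window over a long corpus rather than scoring a single sentence:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model.eval()

text = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model return its own cross-entropy loss,
    # averaged over all predicted tokens
    outputs = model(**inputs, labels=inputs["input_ids"])

ppl = torch.exp(outputs.loss)
print(f"Perplexity: {ppl.item():.2f}")
```

Because the loss is averaged per token, the result depends on the tokenizer, which is why the Limitations above warn against comparing perplexities across models with different tokenizers.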