Definition
Whisper is OpenAI's automatic speech recognition (ASR) model trained on massive multilingual data.
Capabilities: - Speech-to-text transcription - Translation to English - Language detection - Timestamp generation
Model Sizes: - Tiny: 39M parameters - Base: 74M - Small: 244M - Medium: 769M - Large: 1.55B
Training Data: - 680,000 hours of audio - Multilingual (99 languages) - Diverse: YouTube, podcasts, audiobooks
Features: - Robust to noise, accents - Handles multiple languages - Open source and free - Local deployment possible
Examples
Using Whisper to transcribe podcasts with speaker timestamps.
Related Terms
Want more AI knowledge?
Get bite-sized AI concepts delivered to your inbox.
Free intelligence briefs. No spam, unsubscribe anytime.