Definition
Large Language Models (LLMs) are neural networks trained on enormous amounts of text data. They learn statistical patterns in language and can generate coherent, contextually relevant text.
Key Characteristics:
- Billions of parameters (GPT-4 is rumored to have over a trillion, though OpenAI has not disclosed the count)
- Trained on internet-scale text data
- Can perform many tasks without task-specific training (zero-shot learning)
- Built on the transformer architecture
Capabilities:
- Text generation and completion
- Question answering
- Summarization
- Translation
- Code generation
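The generation capability above works autoregressively: the model repeatedly predicts a probability distribution over the next token and samples from it. The sketch below shows only that sampling loop, with a hard-coded bigram table standing in for a real model; the vocabulary and probabilities are invented for illustration.

```python
import numpy as np

# Toy illustration of autoregressive generation. A real LLM predicts the next
# token with a transformer over billions of parameters; here the "model" is a
# hard-coded bigram table so the loop structure stays visible.
VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

# BIGRAM[i][j] = probability that token j follows token i (made-up numbers)
BIGRAM = np.array([
    [0.0, 0.6, 0.0, 0.0, 0.4, 0.0],  # after "the"
    [0.0, 0.0, 0.9, 0.0, 0.0, 0.1],  # after "cat"
    [0.0, 0.0, 0.0, 0.9, 0.0, 0.1],  # after "sat"
    [0.9, 0.0, 0.0, 0.0, 0.0, 0.1],  # after "on"
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # after "mat"
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # after "<eos>"
])

def generate(prompt, max_new_tokens=10, rng=np.random.default_rng(0)):
    """Sample one token at a time, feeding each choice back in -- the same
    loop an LLM runs, just with a trivial stand-in for the model."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = BIGRAM[VOCAB.index(tokens[-1])]
        next_token = VOCAB[rng.choice(len(VOCAB), p=probs)]
        if next_token == "<eos>":  # end-of-sequence token stops generation
            break
        tokens.append(next_token)
    return tokens

print(" ".join(generate(["the"])))
```

Swapping the bigram lookup for a transformer forward pass (and tokens for learned embeddings) gives the real procedure.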
Examples
GPT-4, Claude, Gemini, Llama 3, and Mistral are all LLMs.
Related Terms
Transformer: A neural network architecture built on self-attention mechanisms; the foundation of modern LLMs.
Fine-tuning: Adapting a pre-trained model to perform better on specific tasks using additional training.
GPT: OpenAI's series of large language models that power ChatGPT.
Parameters: The learnable values in a neural network that determine its behavior.
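The self-attention mechanism mentioned above can be written in a few lines: each position scores its similarity to every other position, normalizes the scores with a softmax, and takes a weighted mix of value vectors. This is a minimal NumPy sketch; the weight matrices here are random, whereas in a real model they are learned parameters.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention, the core transformer operation."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # (seq, seq) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ V, weights                     # mixed values + attention map

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
X = rng.normal(size=(seq_len, d_model))             # 4 toy token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)
print(out.shape, attn.shape)  # (4, 8) (4, 4)
```

A full transformer stacks many such layers (with multiple heads, residual connections, and feed-forward blocks), but this weighted-mixing step is what lets every token condition on the whole context at once.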