Definition
Large Language Models (LLMs) are neural networks trained on enormous amounts of text data. They learn statistical patterns in language and can generate coherent, contextually relevant text.
Key Characteristics:
- Billions of parameters (GPT-4 is rumored to have over a trillion, though OpenAI has not disclosed the count)
- Trained on internet-scale text data
- Can perform many tasks without task-specific training (zero-shot learning)
- Built on the transformer architecture
Capabilities:
- Text generation and completion
- Question answering
- Summarization
- Translation
- Code generation
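The generation capability above works autoregressively: the model repeatedly predicts a probability distribution over the next token and samples from it. The sketch below shows only that sampling loop, with a hard-coded bigram table standing in for a real model; the vocabulary and probabilities are invented for illustration.

```python
import numpy as np

# Toy illustration of autoregressive generation. A real LLM predicts the next
# token with a transformer over billions of parameters; here the "model" is a
# hard-coded bigram table so the loop structure stays visible.
VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

# BIGRAM[i][j] = probability that token j follows token i (made-up numbers)
BIGRAM = np.array([
    [0.0, 0.6, 0.0, 0.0, 0.4, 0.0],  # after "the"
    [0.0, 0.0, 0.9, 0.0, 0.0, 0.1],  # after "cat"
    [0.0, 0.0, 0.0, 0.9, 0.0, 0.1],  # after "sat"
    [0.9, 0.0, 0.0, 0.0, 0.0, 0.1],  # after "on"
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # after "mat"
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # after "<eos>"
])

def generate(prompt, max_new_tokens=10, rng=np.random.default_rng(0)):
    """Sample one token at a time, feeding each choice back in -- the same
    loop an LLM runs, just with a trivial stand-in for the model."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = BIGRAM[VOCAB.index(tokens[-1])]
        next_token = VOCAB[rng.choice(len(VOCAB), p=probs)]
        if next_token == "<eos>":  # end-of-sequence token stops generation
            break
        tokens.append(next_token)
    return tokens

print(" ".join(generate(["the"])))
```

Swapping the bigram lookup for a transformer forward pass (and tokens for learned embeddings) gives the real procedure.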
Examples
GPT-4, Claude, Gemini, Llama 3, and Mistral are all LLMs.
Related Terms
Transformer: A neural network architecture built on self-attention mechanisms; the foundation of modern LLMs.
Fine-tuning: Adapting a pre-trained model to perform better on specific tasks using additional training.
GPT: OpenAI's series of large language models that power ChatGPT.
Parameters: The learnable values in a neural network that determine its behavior.
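The self-attention mechanism mentioned above can be written in a few lines: each position scores its similarity to every other position, normalizes the scores with a softmax, and takes a weighted mix of value vectors. This is a minimal NumPy sketch; the weight matrices here are random, whereas in a real model they are learned parameters.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention, the core transformer operation."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # (seq, seq) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ V, weights                     # mixed values + attention map

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
X = rng.normal(size=(seq_len, d_model))             # 4 toy token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)
print(out.shape, attn.shape)  # (4, 8) (4, 4)
```

A full transformer stacks many such layers (with multiple heads, residual connections, and feed-forward blocks), but this weighted-mixing step is what lets every token condition on the whole context at once.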