
Gradient Descent

Optimization algorithm that iteratively adjusts parameters to minimize loss.


Definition

Gradient descent is the core optimization algorithm for training machine learning models.

How It Works:

1. Calculate the loss for the current parameters
2. Compute the gradient (the direction of steepest increase in the loss)
3. Move the parameters a small step in the opposite direction
4. Repeat until convergence
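To make the loop concrete, here is a minimal NumPy sketch on a toy linear-regression problem (the synthetic data, the 0.1 learning rate, and the 200-step budget are illustrative assumptions, not part of the algorithm):

```python
import numpy as np

# Toy problem: fit weights w so that X @ w approximates y.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)

def loss(w):
    return np.mean((X @ w - y) ** 2)          # step 1: loss for current parameters

def gradient(w):
    return 2 * X.T @ (X @ w - y) / len(y)     # step 2: analytic gradient of the MSE

w = np.zeros(3)              # initial parameters
learning_rate = 0.1          # illustrative choice
for step in range(200):
    w = w - learning_rate * gradient(w)       # step 3: move against the gradient
    if step % 50 == 0:
        print(f"step {step:3d}  loss {loss(w):.4f}")

print("learned weights:", np.round(w, 2))
```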

**Variants:**

- Batch GD: uses the entire dataset for each step
- Stochastic GD (SGD): uses a single sample per step
- Mini-batch GD: uses small batches of samples (most common in practice); see the sketch after this list

**Advanced Optimizers:**

- Adam: adaptive per-parameter learning rates combined with momentum
- AdamW: Adam with decoupled weight decay
- SGD with Momentum: accumulates past gradients to accelerate convergence
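As a sketch of how the variants relate, the mini-batch loop below differs from batch GD and SGD only in its batch size (the full dataset or a single sample, respectively), and it adds classical momentum by accumulating a velocity term. The data and hyperparameters are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = rng.normal(size=5)
y = X @ true_w + 0.05 * rng.normal(size=1000)

def mse(w):
    return np.mean((X @ w - y) ** 2)

w = np.zeros(5)
velocity = np.zeros(5)
lr, momentum, batch_size = 0.01, 0.9, 32   # batch_size=len(X) -> batch GD, 1 -> SGD

for epoch in range(20):
    perm = rng.permutation(len(X))                   # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(yb)    # MSE gradient on the mini-batch
        velocity = momentum * velocity - lr * grad   # momentum accumulates past gradients
        w = w + velocity

print(f"final training loss: {mse(w):.5f}")
```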

Challenges:

- Learning rate selection (too small converges slowly, too large overshoots or diverges; illustrated in the sketch below)
- Local minima
- Saddle points
- Slow convergence on ill-conditioned or flat regions of the loss
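A toy illustration of the learning-rate challenge: minimizing f(x) = x², whose gradient is 2x, with three assumed learning rates shows slow progress, fast convergence, and divergence respectively:

```python
# Gradient descent on f(x) = x**2; the learning rate decides whether
# the iterates crawl, converge, or blow up.
for lr in (0.01, 0.5, 1.1):          # too small, reasonable, too large (illustrative values)
    x = 5.0                          # starting point
    for _ in range(30):
        x = x - lr * 2 * x           # update: x <- x - lr * f'(x)
    print(f"lr={lr}: x after 30 steps = {x:.4g}")
```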

Examples

Adam optimizer adjusting millions of parameters to minimize prediction error.
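In practice this usually runs through a deep learning framework. Below is a sketch using PyTorch's built-in Adam optimizer; the model, dummy data, and hyperparameters are made up for illustration:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # adaptive rates + momentum
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randn(64, 128)             # dummy batch of features
targets = torch.randint(0, 10, (64,))     # dummy class labels

for step in range(100):
    optimizer.zero_grad()                 # clear gradients from the previous step
    loss = loss_fn(model(inputs), targets)
    loss.backward()                       # compute gradients via backpropagation
    optimizer.step()                      # Adam update applied to every parameter
```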
