Definition
Activation functions introduce non-linearity into neural networks, enabling them to learn complex patterns.
- **Common Activation Functions** (sketched in code after this list):
  - ReLU: max(0, x) - the most popular choice; simple and cheap to compute
  - Sigmoid: 1/(1+e^-x) - squashes output to the range (0, 1)
  - Tanh: hyperbolic tangent - squashes output to the range (-1, 1)
  - Softmax: turns a vector of scores into probabilities that sum to 1 - used for multi-class classification outputs
  - GELU: Gaussian Error Linear Unit - a smooth ReLU variant used in transformers
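As a minimal illustrative sketch, here are these functions written in NumPy; the GELU below uses the common tanh approximation, and the input values are made up:

```python
import numpy as np

def relu(x):
    # ReLU: zero out negatives, keep positives unchanged
    return np.maximum(0, x)

def sigmoid(x):
    # Sigmoid: squashes input to the range (0, 1)
    return 1 / (1 + np.exp(-x))

def tanh(x):
    # Tanh: squashes input to the range (-1, 1)
    return np.tanh(x)

def softmax(x):
    # Softmax: converts a vector of scores into probabilities that sum to 1
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

def gelu(x):
    # GELU, tanh approximation (as used in many transformer implementations)
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))     # [0.  0.  0.  0.5 2. ]
print(sigmoid(x))  # values strictly between 0 and 1
print(softmax(x))  # non-negative values that sum to 1
```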
Why Non-Linearity Matters:
- Without it, a stack of linear layers collapses into a single linear layer (see the check below)
- Non-linearity is what lets the network learn complex, non-linear patterns
- Different functions suit different use cases
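A small NumPy check of that collapse, using random weights purely for illustration: two linear layers with no activation between them compute exactly the same thing as one linear layer, while inserting a ReLU breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)

# Two linear layers with no activation in between...
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(2, 3))
two_layers = W2 @ (W1 @ x)

# ...are equivalent to a single linear layer with weights W2 @ W1
one_layer = (W2 @ W1) @ x
print(np.allclose(two_layers, one_layer))  # True

# With a ReLU in between, the equivalence no longer holds
with_relu = W2 @ np.maximum(0, W1 @ x)
print(np.allclose(with_relu, one_layer))   # False (in general)
```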
Choosing an Activation (a PyTorch sketch follows this list):
- Hidden layers: usually ReLU or GELU
- Output layer (multi-class classification): Softmax
- Output layer (binary classification): Sigmoid
- Output layer (regression): none (linear output)
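A sketch of these choices in PyTorch, assuming a 10-class classifier with arbitrary placeholder layer sizes (the same structure applies in other frameworks):

```python
import torch.nn as nn

# Hidden layers use ReLU (nn.GELU() would also work); the final layer
# outputs raw logits for 10 classes with no activation attached.
model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Linear(128, 128),
    nn.ReLU(),
    nn.Linear(128, 10),   # logits, no activation here
)

# The softmax is usually folded into the loss:
# nn.CrossEntropyLoss expects raw logits and applies log-softmax internally.
loss_fn = nn.CrossEntropyLoss()

# For binary outputs, a single logit with nn.BCEWithLogitsLoss (sigmoid inside)
# is common; for regression, keep the final layer linear and use nn.MSELoss.
```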
Examples
ReLU turns negative values to zero while keeping positive values unchanged: for example, ReLU(-2) = 0 and ReLU(3) = 3.
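As a concrete walk-through, here is a single neuron with made-up weights, bias, and inputs; its weighted sum comes out negative, so ReLU clips the output to zero:

```python
# A single neuron: weighted sum of inputs plus bias, then ReLU.
# All numbers below are illustrative placeholders.
inputs  = [1.0, -2.0, 0.5]
weights = [0.4,  0.3, -0.2]
bias    = -0.1

pre_activation = sum(w * i for w, i in zip(weights, inputs)) + bias
# 0.4*1.0 + 0.3*(-2.0) + (-0.2)*0.5 + (-0.1)  ≈ -0.4  (negative)

output = max(0.0, pre_activation)  # ReLU clips the negative sum to 0.0
print(round(pre_activation, 2), output)  # -0.4 0.0
```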