What is a learning rate in Machine Learning?

Question

Accepted Answer

In machine learning, the learning rate is a hyperparameter that determines the size of the steps an optimizer takes when it updates a model's parameters to reduce the error measured by the loss function.
Background: When training a model (e.g., a neural network), the aim is to minimize the loss function. This is typically done using gradient descent or one of its variants. The gradient points in the direction in which the loss increases fastest, so the parameters are moved in the opposite direction (the negative gradient). The learning rate determines how far each step goes in that direction.
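To make the update rule concrete, here is a minimal sketch of plain gradient descent on a single parameter. The toy loss (w - 3)^2, the starting value, and the function names are assumptions chosen for illustration; they are not part of the original answer.

```python
def loss(w):
    """Toy loss with its minimum at w = 3 (hypothetical example)."""
    return (w - 3.0) ** 2


def grad(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


def gradient_descent(w0, learning_rate, steps):
    """Repeatedly step against the gradient, scaled by the learning rate."""
    w = w0
    for _ in range(steps):
        # Core update: new parameter = old parameter - learning_rate * gradient
        w = w - learning_rate * grad(w)
    return w


w_final = gradient_descent(w0=0.0, learning_rate=0.1, steps=50)
print(w_final, loss(w_final))  # w_final ends up very close to 3
```

The only place the learning rate appears is the update line: it scales the gradient before it is subtracted from the parameter.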
Trade-offs of a learning rate that is too high or too low:

High learning rate: The model makes large jumps.
- Advantage: Learning is faster.
- Risk: The optimizer can overshoot the minimum, and the loss can oscillate or diverge.
Low learning rate: The model takes small, cautious steps.
- Advantage: More stable learning, more accurate approximation of the minimum.
- Disadvantage: Training converges very slowly and can get “stuck” in local minima.
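The trade-off can be seen on a toy problem. The sketch below runs the same number of gradient descent steps on the hypothetical loss w^2 with three learning rates; the specific values 0.01, 0.1, and 1.1 are assumptions picked to show each regime, not recommendations.

```python
def final_loss(learning_rate, steps=20, w0=1.0):
    """Run gradient descent on the toy loss w**2 and return the final loss."""
    w = w0
    for _ in range(steps):
        gradient = 2.0 * w             # derivative of w**2
        w -= learning_rate * gradient  # step against the gradient
    return w * w


loss_low = final_loss(0.01)       # small steps: still far from the minimum
loss_moderate = final_loss(0.1)   # reasonable steps: close to the minimum
loss_high = final_loss(1.1)       # steps too large: the loss grows each step

for name, value in [("low", loss_low), ("moderate", loss_moderate), ("high", loss_high)]:
    print(f"{name:>8} learning rate -> final loss {value:.6f}")
```

With the low rate the loss has barely decreased after 20 steps, with the moderate rate it is close to zero, and with the high rate each step overshoots the minimum by more than the last, so the loss ends up larger than where it started.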
