# Optimisers
Consider a [simple linear regression](../models/basics.md). We create some dummy data, calculate a loss, and backpropagate to calculate gradients for the parameters `W` and `b`.
```julia
using Flux

W = rand(2, 5)
b = rand(2)

predict(x) = (W * x) .+ b
loss(x, y) = sum((predict(x) .- y).^2)

x, y = rand(5), rand(2) # Dummy data
l = loss(x, y) # ~ 3

θ = params(W, b) # collect the trainable parameters
grads = gradient(() -> loss(x, y), θ) # gradients with respect to W and b
```
We want to update each parameter, using the gradient, in order to improve (reduce) the loss. Here's one way to do that:
```julia
η = 0.1 # Learning Rate
for p in (W, b)
  p .-= η * grads[p] # step each parameter against its gradient
end
```
Running this will alter the parameters `W` and `b` and our loss should go down. Flux provides a more general way to do optimiser updates like this.
```julia
using Flux: update!

opt = Descent(0.1) # Gradient descent with learning rate 0.1

for p in (W, b)
  update!(opt, p, grads[p])
end
```
An optimiser `update!` accepts a parameter and a gradient, and updates the parameter according to the chosen rule. We can also pass `opt` to our [training loop](training.md), which will update all parameters of the model in a loop. However, we can now easily replace `Descent` with a more advanced optimiser such as `ADAM`.
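For example, only the line that constructs the optimiser needs to change; the update loop stays the same. A minimal sketch, reusing `W`, `b` and `grads` from above (the learning rate `0.001` is just an illustrative choice):

```julia
opt = ADAM(0.001) # adaptive optimiser; 0.001 is an illustrative learning rate

for p in (W, b)
  update!(opt, p, grads[p])
end
```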
## Optimiser Reference
All optimisers return an object that, when passed to `train!`, will update the parameters passed to it.
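As a rough sketch of that pattern, reusing `loss`, `θ`, `x` and `y` from above with a single dummy batch (any optimiser from the list below could stand in for `Descent`):

```julia
data = [(x, y)] # one (input, target) pair serves as the dataset

Flux.train!(loss, θ, data, Descent(0.1)) # one pass over data, updating W and b
```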
```@docs
Descent
Momentum
Nesterov
RMSProp
ADAM
AdaMax
ADAGrad
ADADelta
AMSGrad
NADAM
ADAMW
```