# Optimisers
Consider a [simple linear regression](../models/basics.md). We create some dummy data, calculate a loss, and backpropagate to compute gradients for the parameters `W` and `b`.

```julia
using Flux.Tracker

W = param(rand(2, 5)) # Tracked parameters: operations on them are recorded
b = param(rand(2))

predict(x) = W*x .+ b
loss(x, y) = sum((predict(x) .- y).^2)

x, y = rand(5), rand(2) # Dummy data
l = loss(x, y) # ~ 3

params = Params([W, b])
grads = Tracker.gradient(() -> loss(x, y), params) # Gradients of the loss w.r.t. W and b
```

We want to update each parameter, using the gradient, in order to improve (reduce) the loss. Here's one way to do that:

```julia
using Flux.Tracker: update!

η = 0.1 # Learning rate

for p in (W, b)
  update!(p, -η * grads[p]) # Step each parameter a small amount against its gradient
end
```

Running this will alter the parameters `W` and `b`, and our loss should go down. Flux provides a more general way to do optimiser updates like this:

```julia
opt = Descent(0.1) # Gradient descent with learning rate 0.1

for p in (W, b)
  update!(opt, p, grads[p]) # The optimiser applies the learning rate (and the sign) itself
end
```

An optimiser's `update!` accepts a parameter and a gradient, and updates the parameter according to the optimiser's rule. We can also pass `opt` to our [training loop](training.md), which will update all parameters of the model in a loop. Either way, we can now easily replace `Descent` with a more advanced optimiser such as `ADAM`.
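
For example, here is a rough sketch of the same manual update loop with `ADAM` swapped in; the `0.001` learning rate is just an illustrative choice:

```julia
opt = ADAM(0.001) # Adaptive optimiser; 0.001 is an illustrative step size

for p in (W, b)
  update!(opt, p, grads[p]) # Same call as before, only the update rule changes
end
```
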
## Optimiser Reference
Each optimiser constructor returns an object that, when passed to `train!`, will update the parameters passed to it.
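
As a rough sketch of that usage, assuming the `loss`, `params`, `x` and `y` defined above, the four-argument `train!(loss, params, data, opt)` form described in the [training loop](training.md) docs, and a placeholder `dataset` standing in for a real collection of batches:

```julia
using Flux

dataset = [(x, y)] # Placeholder: a real dataset would be an iterator of (input, target) batches
opt = Descent(0.1)

# train! evaluates loss(x, y) on each batch and lets opt update every parameter in params
Flux.train!(loss, params, dataset, opt)
```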
2017-10-18 11:22:45 +00:00
```@docs
Descent
Momentum
Nesterov
ADAM
```