more consistent terminology

Mike J Innes 2018-02-13 17:08:13 +00:00
parent 5ea0ef6764
commit c1ed3e477e

To actually train a model we need three things:

* An *objective function* that evaluates how well a model is doing given some input data.
* A collection of data points that will be provided to the objective function.
* An [optimiser](optimisers.md) that will update the model parameters appropriately.
With these we can call `Flux.train!`:

```julia
Flux.train!(objective, data, opt)
```
There are plenty of examples in the [model zoo](https://github.com/FluxML/model-zoo).
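Putting the three pieces together, here is a minimal end-to-end sketch. The model, objective, data, and optimiser below are illustrative assumptions (the sizes are arbitrary, and the optimiser setup follows the API described in [optimisers](optimisers.md)):

```julia
using Flux

# A toy model: 4 features in, 2 classes out.
m = Chain(Dense(4, 8, σ), Dense(8, 2), softmax)

# The objective compares predictions m(x) against targets y.
objective(x, y) = Flux.crossentropy(m(x), y)

# A tiny dataset of (input, target) pairs.
xs = [rand(4) for _ in 1:3]
ys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
data = zip(xs, ys)

# An optimiser over the model's parameters.
opt = SGD(params(m), 0.1)

Flux.train!(objective, data, opt)
```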
## Loss Functions

The objective function must return a number representing how far the model is from its target: the *loss* of the model. The `loss` function that we defined in [basics](../models/basics.md) will work as an objective. We can also define an objective in terms of some model:
```julia
m = Chain(
  Dense(784, 32, σ),
  Dense(32, 10), softmax)

loss(x, y) = Flux.mse(m(x), y)

# later

Flux.train!(loss, data, opt)
```
The objective will almost always be defined in terms of some *cost function* that measures the distance of the prediction `m(x)` from the target `y`. Flux has several of these built in, like `mse` for mean squared error or `crossentropy` for cross entropy loss, but you can calculate it however you want.
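Since the objective is an ordinary Julia function, built-in and hand-rolled cost functions are interchangeable. A sketch, assuming the model `m` defined above:

```julia
# Using a built-in cost function:
loss(x, y) = Flux.mse(m(x), y)

# A hand-rolled equivalent; any function that returns a
# number measuring prediction error works as an objective.
myloss(x, y) = sum((m(x) .- y) .^ 2) / length(y)
```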
## Datasets
The `data` argument provides the collection of data points to train on, typically inputs and targets zipped into pairs, as in the sketch below.
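For instance, a minimal sketch matching the 784-input, 10-output model defined earlier; the random arrays are purely illustrative:

```julia
# Three input/target pairs for the 784-in, 10-out model above.
xs = [rand(784), rand(784), rand(784)]
ys = [rand(10), rand(10), rand(10)]
data = zip(xs, ys)  # iterates (x, y) tuples
```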
## Callbacks

`train!` takes an additional argument, `cb`, that's used for callbacks so that you can observe the training process. For example:

```julia
train!(objective, data, opt, cb = () -> println("training"))
```

Callbacks are called for every batch of training data. You can slow this down using `Flux.throttle(f, timeout)` which prevents `f` from being called more than once every `timeout` seconds.
A more typical callback might look like this:

```julia
test_x, test_y = # ... create single batch of test data ...
evalcb() = @show(loss(test_x, test_y))
Flux.train!(objective, data, opt,
            cb = throttle(evalcb, 5))
```
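Note that `train!` makes a single pass over `data`. A common pattern, shown here as an assumption rather than anything this page prescribes, is to repeat a batch to get multiple passes, with the throttled callback reporting progress along the way:

```julia
# 100 passes over the same (x, y) batch.
data = Iterators.repeated((x, y), 100)
Flux.train!(objective, data, opt,
            cb = Flux.throttle(evalcb, 5))
```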