Transition to doctests (#616)
* basics.md: Convert an initial example to a doctest. Related to https://github.com/FluxML/Flux.jl/issues/561
* make.jl: Allow doctests to run
* Fix comments so that the doctests pass
* basics.md: Add doctests to the examples
This commit is contained in:
parent b0a5844afb
commit 08b87e0bce
docs/make.jl
@@ -1,7 +1,7 @@
 using Documenter, Flux, NNlib
 
 makedocs(modules=[Flux, NNlib],
-         doctest = false,
+         doctest = true,
          analytics = "UA-36890222-9",
          sitename = "Flux",
          # Uncomment below for local build
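Note: with `doctest = true`, Documenter executes every `jldoctest` block it finds in the manual and docstrings during the docs build, and fails the build whenever the actual printed result differs from the expected output written beneath the `julia>` prompt. A minimal sketch of the format (made-up input and output, not taken from the diff):

```jldoctest
julia> 2 + 3  # evaluated during the docs build
5
```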
docs/src/models/basics.md
@@ -4,45 +4,53 @@
 
 Flux's core feature is taking gradients of Julia code. The `gradient` function takes another Julia function `f` and a set of arguments, and returns the gradient with respect to each argument. (It's a good idea to try pasting these examples in the Julia terminal.)
 
-```julia
-using Flux.Tracker
+```jldoctest basics
+julia> using Flux.Tracker
 
-f(x) = 3x^2 + 2x + 1
+julia> f(x) = 3x^2 + 2x + 1;
 
-# df/dx = 6x + 2
-df(x) = Tracker.gradient(f, x; nest = true)[1]
+julia> df(x) = Tracker.gradient(f, x; nest = true)[1]; # df/dx = 6x + 2
 
-df(2) # 14.0 (tracked)
+julia> df(2)
+14.0 (tracked)
 
-# d²f/dx² = 6
-d2f(x) = Tracker.gradient(df, x; nest = true)[1]
+julia> d2f(x) = Tracker.gradient(df, x; nest = true)[1]; # d²f/dx² = 6
 
-d2f(2) # 6.0 (tracked)
+julia> d2f(2)
+6.0 (tracked)
 ```
 
 (We'll learn more about why these numbers show up as `(tracked)` below.)
 
 When a function has many parameters, we can pass them all in explicitly:
 
-```julia
-f(W, b, x) = W * x + b
+```jldoctest basics
+julia> f(W, b, x) = W * x + b;
 
-Tracker.gradient(f, 2, 3, 4)
-# (4.0 (tracked), 1.0 (tracked), 2.0 (tracked))
+julia> Tracker.gradient(f, 2, 3, 4)
+(4.0 (tracked), 1.0 (tracked), 2.0 (tracked))
 ```
 
 But machine learning models can have *hundreds* of parameters! Flux offers a nice way to handle this. We can tell Flux to treat something as a parameter via `param`. Then we can collect these together and tell `gradient` to collect the gradients of all `params` at once.
 
-```julia
-W = param(2) # 2.0 (tracked)
-b = param(3) # 3.0 (tracked)
-
-f(x) = W * x + b
-
-grads = Tracker.gradient(() -> f(4), params(W, b))
-
-grads[W] # 4.0
-grads[b] # 1.0
+```jldoctest basics
+julia> using Flux
+
+julia> W = param(2)
+2.0 (tracked)
+
+julia> b = param(3)
+3.0 (tracked)
+
+julia> f(x) = W * x + b;
+
+julia> grads = Tracker.gradient(() -> f(4), params(W, b));
+
+julia> grads[W]
+4.0
+
+julia> grads[b]
+1.0
 ```
 
 There are a few things to notice here. Firstly, `W` and `b` now show up as *tracked*. Tracked things behave like normal numbers or arrays, but keep records of everything you do with them, allowing Flux to calculate their gradients. `gradient` takes a zero-argument function; no arguments are necessary because the `params` tell it what to differentiate.
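Two notes on the new blocks. The shared `basics` label makes Documenter evaluate all three blocks in a single session, so `using Flux.Tracker` from the first block and the successive redefinitions of `f` carry over to the later ones. And since make.jl now sets `doctest = true`, one plausible way to check these doctests locally (an assumed workflow, not something the commit specifies) is simply to run the docs build script:

```julia
# Hypothetical local check, run from the root of a Flux.jl checkout;
# assumes Documenter and NNlib are available in the active environment.
include("docs/make.jl")  # makedocs runs every jldoctest and errors on a mismatch
```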