Transition to doctests (#616)
* Initial doctest to an example
Related to
* make.jl: Allow doctest to run
* Fix comments in order to pass doctests
* Add doctests to examples
@@ -1,7 +1,7 @@
using Documenter, Flux, NNlib
makedocs(modules=[Flux, NNlib],
doctest = false,
doctest = true,
analytics = "UA-36890222-9",
sitename = "Flux",
# Uncomment below for local build
@@ -4,45 +4,53 @@
Flux's core feature is taking gradients of Julia code. The `gradient` function takes another Julia function `f` and a set of arguments, and returns the gradient with respect to each argument. (It's a good idea to try pasting these examples in the Julia terminal.)
using Flux.Tracker
```jldoctest basics
julia> using Flux.Tracker
f(x) = 3x^2 + 2x + 1
julia> f(x) = 3x^2 + 2x + 1;
# df/dx = 6x + 2
df(x) = Tracker.gradient(f, x; nest = true)[1]
julia> df(x) = Tracker.gradient(f, x; nest = true)[1]; # df/dx = 6x + 2
df(2) # 14.0 (tracked)
julia> df(2)
14.0 (tracked)
# d²f/dx² = 6
d2f(x) = Tracker.gradient(df, x; nest = true)[1]
julia> d2f(x) = Tracker.gradient(df, x; nest = true)[1]; # d²f/dx² = 6
d2f(2) # 6.0 (tracked)
julia> d2f(2)
6.0 (tracked)
```
(We'll learn more about why these numbers show up as `(tracked)` below.)
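As a quick sanity check on the values `14.0` and `6.0`, the derivatives can be approximated with finite differences in plain Julia, without Flux. This is only a sketch: the names `poly`, `central_diff`, and `second_diff` and the step size `h` are illustrative assumptions, not part of Flux or of this commit.

```julia
# Same polynomial as `f` above, under a different name to avoid clashing with it.
poly(x) = 3x^2 + 2x + 1

# Central difference approximation of the first derivative.
# `h` is a hypothetical step size chosen for illustration.
central_diff(f, x; h = 1e-3) = (f(x + h) - f(x - h)) / (2 * h)

# Three-point approximation of the second derivative.
second_diff(f, x; h = 1e-3) = (f(x + h) - 2 * f(x) + f(x - h)) / h^2

first_estimate = central_diff(poly, 2.0)   # close to 6x + 2 at x = 2
second_estimate = second_diff(poly, 2.0)   # close to the constant second derivative
```

Because `poly` is quadratic, both estimates should agree with the analytic derivatives up to floating-point rounding.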
When a function has many parameters, we can pass them all in explicitly:
f(W, b, x) = W * x + b
```jldoctest basics
julia> f(W, b, x) = W * x + b;
Tracker.gradient(f, 2, 3, 4)
# (4.0 (tracked), 1.0 (tracked), 2.0 (tracked))
julia> Tracker.gradient(f, 2, 3, 4)
(4.0 (tracked), 1.0 (tracked), 2.0 (tracked))
```
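The three values in the tuple can be checked by hand. For `f(W, b, x) = W * x + b`, the partial derivatives evaluated at `W = 2`, `b = 3`, `x = 4` are:

```math
\frac{\partial f}{\partial W} = x = 4, \qquad
\frac{\partial f}{\partial b} = 1, \qquad
\frac{\partial f}{\partial x} = W = 2
```

These match the order of the arguments passed to `gradient`.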
But machine learning models can have *hundreds* of parameters! Flux offers a nice way to handle this. We can tell Flux to treat something as a parameter via `param`. Then we can collect these together and tell `gradient` to collect the gradients of all `params` at once.
W = param(2) # 2.0 (tracked)
b = param(3) # 3.0 (tracked)
```jldoctest basics
julia> using Flux
f(x) = W * x + b
julia> W = param(2)
2.0 (tracked)
grads = Tracker.gradient(() -> f(4), params(W, b))
julia> b = param(3)
3.0 (tracked)
grads[W] # 4.0
grads[b] # 1.0
julia> f(x) = W * x + b;
julia> grads = Tracker.gradient(() -> f(4), params(W, b));
julia> grads[W]
4.0
julia> grads[b]
1.0
```
There are a few things to notice here. Firstly, `W` and `b` now show up as *tracked*. Tracked things behave like normal numbers or arrays, but keep records of everything you do with them, allowing Flux to calculate their gradients. `gradient` takes a zero-argument function; no arguments are necessary because the `params` tell it what to differentiate.
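The idea of differentiating with respect to a collection of named parameters, rather than explicit arguments, can be sketched in plain Julia with finite differences. Nothing below is Flux API: `params0`, `loss`, and `fd_grad` are hypothetical names used for illustration, and plain numbers stand in for the tracked `W` and `b`.

```julia
# Plain-number stand-ins for the tracked parameters above (no Flux needed).
params0 = Dict(:W => 2.0, :b => 3.0)

# The loss reads its parameters from the table instead of taking them
# as separate arguments; the input 4 is fixed, as in `f(4)` above.
loss(p) = p[:W] * 4 + p[:b]

# Hypothetical helper: central finite difference with respect to one named parameter.
function fd_grad(loss, p, key; h = 1e-3)
    up = copy(p)
    up[key] += h
    down = copy(p)
    down[key] -= h
    return (loss(up) - loss(down)) / (2 * h)
end

# One gradient entry per parameter, keyed the same way as the parameter table.
grads_fd = Dict(k => fd_grad(loss, params0, k) for k in keys(params0))
```

Looking up `grads_fd[:W]` and `grads_fd[:b]` should give values close to the `grads[W]` and `grads[b]` results above, since the loss is linear in both parameters.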