Transition to doctests (#616)

* basics.md: Initial doctest to an example

Related to https://github.com/FluxML/Flux.jl/issues/561

* make.jl: Allow doctest to run

* Fix comments in order to pass doctests

* basic.md: Add doctests to examples
Ayan Banerjee 2019-02-14 23:59:27 +05:30 committed by Mike J Innes
parent b0a5844afb
commit 08b87e0bce
2 changed files with 29 additions and 21 deletions

@@ -1,7 +1,7 @@
 using Documenter, Flux, NNlib
 makedocs(modules=[Flux, NNlib],
-doctest = false,
+doctest = true,
 analytics = "UA-36890222-9",
 sitename = "Flux",
 # Uncomment below for local build

@@ -4,45 +4,53 @@
 Flux's core feature is taking gradients of Julia code. The `gradient` function takes another Julia function `f` and a set of arguments, and returns the gradient with respect to each argument. (It's a good idea to try pasting these examples in the Julia terminal.)
-```julia
-using Flux.Tracker
+```jldoctest basics
+julia> using Flux.Tracker
-f(x) = 3x^2 + 2x + 1
+julia> f(x) = 3x^2 + 2x + 1;
-# df/dx = 6x + 2
-df(x) = Tracker.gradient(f, x; nest = true)[1]
+julia> df(x) = Tracker.gradient(f, x; nest = true)[1]; # df/dx = 6x + 2
-df(2) # 14.0 (tracked)
+julia> df(2)
+14.0 (tracked)
-# d²f/dx² = 6
-d2f(x) = Tracker.gradient(df, x; nest = true)[1]
+julia> d2f(x) = Tracker.gradient(df, x; nest = true)[1]; # d²f/dx² = 6
-d2f(2) # 6.0 (tracked)
+julia> d2f(2)
+6.0 (tracked)
 ```
 (We'll learn more about why these numbers show up as `(tracked)` below.)
 When a function has many parameters, we can pass them all in explicitly:
-```julia
-f(W, b, x) = W * x + b
+```jldoctest basics
+julia> f(W, b, x) = W * x + b;
-Tracker.gradient(f, 2, 3, 4)
-# (4.0 (tracked), 1.0 (tracked), 2.0 (tracked))
+julia> Tracker.gradient(f, 2, 3, 4)
+(4.0 (tracked), 1.0 (tracked), 2.0 (tracked))
 ```
 But machine learning models can have *hundreds* of parameters! Flux offers a nice way to handle this. We can tell Flux to treat something as a parameter via `param`. Then we can collect these together and tell `gradient` to collect the gradients of all `params` at once.
-```julia
-W = param(2) # 2.0 (tracked)
-b = param(3) # 3.0 (tracked)
+```jldoctest basics
+julia> using Flux
-f(x) = W * x + b
+julia> W = param(2)
+2.0 (tracked)
-grads = Tracker.gradient(() -> f(4), params(W, b))
+julia> b = param(3)
+3.0 (tracked)
-grads[W] # 4.0
-grads[b] # 1.0
+julia> f(x) = W * x + b;
+julia> grads = Tracker.gradient(() -> f(4), params(W, b));
+julia> grads[W]
+4.0
+julia> grads[b]
+1.0
 ```
 There are a few things to notice here. Firstly, `W` and `b` now show up as *tracked*. Tracked things behave like normal numbers or arrays, but keep records of everything you do with them, allowing Flux to calculate their gradients. `gradient` takes a zero-argument function; no arguments are necessary because the `params` tell it what to differentiate.
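
The expected outputs that the doctests assert can be checked by hand. The derivation below is a sketch that uses only the functions and evaluation points appearing in the examples; it is written as a Documenter-style `math` block, which is an assumption about how such a note would be rendered and is not part of the commit.

```math
\begin{aligned}
f(x) &= 3x^2 + 2x + 1, &\quad \frac{df}{dx} &= 6x + 2, &\quad \left.\frac{df}{dx}\right|_{x=2} &= 14, &\quad \frac{d^2f}{dx^2} &= 6 \\
f(W, b, x) &= Wx + b, &\quad \frac{\partial f}{\partial W} &= x, &\quad \frac{\partial f}{\partial b} &= 1, &\quad \frac{\partial f}{\partial x} &= W
\end{aligned}
```

At `(W, b, x) = (2, 3, 4)` the three partial derivatives evaluate to `(4, 1, 2)`, matching the tuple in the second doctest. In the `params` example the closure computes `f(4) = 4W + b`, so the gradients with respect to `W` and `b` are again `4` and `1`, matching `grads[W]` and `grads[b]`.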