6</code></pre><p>When a function has many parameters, we can get gradients of each one at the same time:</p><pre><code class="language-julia-repl">julia> f(x, y) = sum((x .- y).^2);

julia> gradient(f, [2, 1], [2, 0])
([0, 2], [0, -2])</code></pre><p>But machine learning models can have <em>hundreds</em> of parameters! To handle this, Flux lets you work with collections of parameters, via <code>params</code>. You can get the gradient of all parameters used in a program without explicitly passing them in.</p><pre><code class="language-julia-repl">julia> using Flux

julia> x = [2, 1];

julia> y = [2, 0];

julia> gs = gradient(params(x, y)) do
         f(x, y)
       end
Grads(...)

julia> gs[x]
2-element Array{Int64,1}:
 0
 2

julia> gs[y]
2-element Array{Int64,1}:
  0
 -2</code></pre><p>Here, <code>gradient</code> takes a zero-argument function; no arguments are necessary because the <code>params</code> tell it what to differentiate.</p><p>This will come in really handy when dealing with big, complicated models. For now, though, let's start with something simple.</p><h2 id="Simple-Models-1"><a class="docs-heading-anchor" href="#Simple-Models-1">Simple Models</a><a class="docs-heading-anchor-permalink" href="#Simple-Models-1" title="Permalink"></a></h2><p>Consider a simple linear regression, which tries to predict an output array <code>y</code> from an input <code>x</code>.</p><pre><code class="language-julia">W = rand(2, 5)
b = rand(2)

predict(x) = W*x .+ b

function loss(x, y)
  ŷ = predict(x)
  sum((y .- ŷ).^2)
end

x, y = rand(5), rand(2) # Dummy data
loss(x, y) # ~ 3</code></pre><p>To improve the prediction we can take the gradient of the loss with respect to <code>W</code> and <code>b</code> and perform gradient descent.</p><pre><code class="language-julia">using Flux
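
# gradient takes a zero-argument closure and a Params collection, and
# returns a Grads object mapping each parameter to its gradient.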

gs = gradient(() -> loss(x, y), params(W, b))</code></pre><p>Now that we have gradients, we can pull them out and update <code>W</code> to train the model.</p><pre><code class="language-julia">W̄ = gs[W]

W .-= 0.1 .* W̄

loss(x, y) # ~ 2.5</code></pre><p>The loss has decreased a little, meaning that our prediction for <code>x</code> is closer to the target <code>y</code>. If we have some data we can already try <a href="../../training/training/">training the model</a>.</p><p>All deep learning in Flux, however complex, is a simple generalisation of this example. Of course, models can <em>look</em> very different – they might have millions of parameters or complex control flow. Let's see how Flux handles more complex models.</p><h2 id="Building-Layers-1"><a class="docs-heading-anchor" href="#Building-Layers-1">Building Layers</a><a class="docs-heading-anchor-permalink" href="#Building-Layers-1" title="Permalink"></a></h2><p>It's common to create more complex models than the linear regression above. For example, we might want to have two linear layers with a nonlinearity like <a href="https://en.wikipedia.org/wiki/Sigmoid_function">sigmoid</a> (<code>σ</code>) in between them. In the above style we could write this as:</p><pre><code class="language-julia">using Flux

W1 = rand(3, 5)
b1 = rand(3)
layer1(x) = W1 * x .+ b1

W2 = rand(2, 3)
b2 = rand(2)
layer2(x) = W2 * x .+ b2

model(x) = layer2(σ.(layer1(x)))

model(rand(5)) # => 2-element vector</code></pre><p>This works but is fairly unwieldy, with a lot of repetition – especially as we add more layers. One way to factor this out is to create a function that returns linear layers.</p><pre><code class="language-julia">function linear(in, out)
  W = randn(out, in)
  b = randn(out)
  x -> W * x .+ b
end

linear1 = linear(5, 3) # we can access linear1.W etc
linear2 = linear(3, 2)

model(x) = linear2(σ.(linear1(x)))

model(rand(5)) # => 2-element vector</code></pre><p>Another (equivalent) way is to create a struct that explicitly represents the affine layer.</p><pre><code class="language-julia">struct Affine
  W
  b
end

Affine(in::Integer, out::Integer) =
  Affine(randn(out, in), randn(out))

# Overload call, so the object can be used as a function
(m::Affine)(x) = m.W * x .+ m.b

a = Affine(10, 5)

a(rand(10)) # => 5-element vector</code></pre><p>Congratulations! You just built the <code>Dense</code> layer that comes with Flux. Flux has many interesting layers available, but they're all things you could have built yourself very easily.</p><p>(There is one small difference with <code>Dense</code> – for convenience it also takes an activation function, like <code>Dense(10, 5, σ)</code>.)</p><h2 id="Stacking-It-Up-1"><a class="docs-heading-anchor" href="#Stacking-It-Up-1">Stacking It Up</a><a class="docs-heading-anchor-permalink" href="#Stacking-It-Up-1" title="Permalink"></a></h2><p>It's pretty common to write models that look something like:</p><pre><code class="language-julia">layer1 = Dense(10, 5, σ)
# ...
model(x) = layer3(layer2(layer1(x)))</code></pre><p>For long chains, it might be a bit more intuitive to have a list of layers, like this:</p><pre><code class="language-julia">using Flux

layers = [Dense(10, 5, σ), Dense(5, 2), softmax]

model(x) = foldl((x, m) -> m(x), layers, init = x)

model(rand(10)) # => 2-element vector</code></pre><p>Handily, this is also provided for in Flux:</p><pre><code class="language-julia">model2 = Chain(
  Dense(10, 5, σ),
  Dense(5, 2),
  softmax)
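
# Chain calls each layer in sequence, so model2(x) is equivalent to
# softmax(Dense(5, 2)(Dense(10, 5, σ)(x))).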
model2(rand(10)) # => 2-element vector</code></pre><p>This quickly starts to look like a high-level deep learning library; yet you can see how it falls out of simple abstractions, and we lose none of the power of Julia code.</p><p>A nice property of this approach is that because "models" are just functions (possibly with trainable parameters), you can also see this as simple function composition.</p><pre><code class="language-julia">m = Dense(5, 2) ∘ Dense(10, 5, σ)

m(rand(10))</code></pre><p>Likewise, <code>Chain</code> will happily work with any Julia function.</p><pre><code class="language-julia">m = Chain(x -> x^2, x -> x+1)

m(5) # => 26</code></pre><h2 id="Layer-helpers-1"><a class="docs-heading-anchor" href="#Layer-helpers-1">Layer helpers</a><a class="docs-heading-anchor-permalink" href="#Layer-helpers-1" title="Permalink"></a></h2><p>Flux provides a set of helpers for custom layers, which you can enable by calling</p><pre><code class="language-julia">Flux.@functor Affine</code></pre><p>This enables a useful extra set of functionality for our <code>Affine</code> layer, such as <a href="../../training/optimisers/">collecting its parameters</a> or <a href="../../gpu/">moving it to the GPU</a>.</p><p>For some more helpful tricks, including parameter freezing, please check out the <a href="../advanced/">advanced usage guide</a>.</p><h2 id="Utility-functions-1"><a class="docs-heading-anchor" href="#Utility-functions-1">Utility functions</a><a class="docs-heading-anchor-permalink" href="#Utility-functions-1" title="Permalink"></a></h2><p>Flux provides some utility functions to help you generate models in an automated fashion.</p><p><code>outdims</code> enables you to calculate the spatial output dimensions of layers like <code>Conv</code> when applied to input images of a given size. Currently limited to the following layers:</p><ul><li><code>Chain</code></li><li><code>Dense</code></li><li><code>Conv</code></li><li><code>Diagonal</code></li><li><code>Maxout</code></li><li><code>ConvTranspose</code></li><li><code>DepthwiseConv</code></li><li><code>CrossCor</code></li><li><code>MaxPool</code></li><li><code>MeanPool</code></li></ul><article class="docstring"><header><a class="docstring-binding" id="Flux.outdims" href="#Flux.outdims"><code>Flux.outdims</code></a> — <span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">outdims(c::Chain, isize)</code></pre><p>Calculate the output dimensions given the input dimensions, <code>isize</code>.</p><pre><code class="language-julia">m = Chain(Conv((3, 3), 3 => 16), Conv((3, 3), 16 => 32))
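# Each unpadded 3×3 convolution shrinks every spatial dimension by 2,
# so a 10×10 input becomes 8×8, then 6×6: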
outdims(m, (10, 10)) == (6, 6)</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/7a32a703f0f2842dda73d4454aff5990ade365d5/src/layers/basic.jl#L50-L59">source</a></section><section><div><pre><code class="language-none">outdims(l::Dense, isize)</code></pre><p>Calculate the output dimensions given the input dimensions, <code>isize</code>.</p><pre><code class="language-julia">m = Dense(10, 5)
outdims(m, (5, 2)) == (5,)
outdims(m, (10,)) == (5,)</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/7a32a703f0f2842dda73d4454aff5990ade365d5/src/layers/basic.jl#L139-L149">source</a></section><section><div><pre><code class="language-none">outdims(l::Conv, isize::Tuple)</code></pre><p>Calculate the output dimensions given the input dimensions <code>isize</code>. Batch size and channel size are ignored as per <a href="https://github.com/FluxML/NNlib.jl">NNlib.jl</a>.</p><pre><code class="language-julia">m = Conv((3, 3), 3 => 16)
outdims(m, (10, 10)) == (8, 8)
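# Trailing channel and batch dimensions are ignored, so this gives the
# same spatial result: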
outdims(m, (10, 10, 1, 3)) == (8, 8)</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/7a32a703f0f2842dda73d4454aff5990ade365d5/src/layers/conv.jl#L77-L88">source</a></section></article></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../../">« Home</a><a class="docs-footer-nextpage" href="../recurrence/">Recurrence »</a></nav></div><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> on <span class="colophon-date" title="Monday 6 April 2020 14:20">Monday 6 April 2020</span>. Using Julia version 1.4.0.</p></div></body></html>