build based on fe6793f
This commit is contained in:
parent
5f6a978ba9
commit
9e7d4bddc5
File diff suppressed because one or more lines are too long
@ -221,7 +221,7 @@ var documenterSearchIndex = {"docs": [
|
||||
"page": "Model Reference",
|
||||
"title": "Flux.LSTM",
|
||||
"category": "function",
|
||||
"text": "LSTM(in::Integer, out::Integer, σ = tanh)\n\nLong Short Term Memory recurrent layer. Behaves like an RNN but generally exhibits a longer memory span over sequences.\n\nSee this article for a good overview of the internals.\n\n\n\n\n\n"
|
||||
"text": "LSTM(in::Integer, out::Integer)\n\nLong Short Term Memory recurrent layer. Behaves like an RNN but generally exhibits a longer memory span over sequences.\n\nSee this article for a good overview of the internals.\n\n\n\n\n\n"
|
||||
},
|
||||
|
||||
{
|
||||
@ -229,7 +229,7 @@ var documenterSearchIndex = {"docs": [
|
||||
"page": "Model Reference",
|
||||
"title": "Flux.GRU",
|
||||
"category": "function",
|
||||
"text": "GRU(in::Integer, out::Integer, σ = tanh)\n\nGated Recurrent Unit layer. Behaves like an RNN but generally exhibits a longer memory span over sequences.\n\nSee this article for a good overview of the internals.\n\n\n\n\n\n"
|
||||
"text": "GRU(in::Integer, out::Integer)\n\nGated Recurrent Unit layer. Behaves like an RNN but generally exhibits a longer memory span over sequences.\n\nSee this article for a good overview of the internals.\n\n\n\n\n\n"
|
||||
},
|
||||
|
||||
{
|
||||
|
@ -29,4 +29,4 @@ end</code></pre><p>If we call <code>sgd</code>, the parameters <code>W</code> an
|
||||
Dense(10, 5, σ),
|
||||
Dense(5, 2), softmax)</code></pre><p>Instead of having to write <code>[m[1].W, m[1].b, ...]</code>, Flux provides a params function <code>params(m)</code> that returns a list of all parameters in the model for you.</p><p>For the update step, there's nothing whatsoever wrong with writing the loop above – it'll work just fine – but Flux provides various <em>optimisers</em> that make it more convenient.</p><pre><code class="language-julia">opt = SGD([W, b], 0.1) # Gradient descent with learning rate 0.1
|
||||
|
||||
opt() # Carry out the update, modifying `W` and `b`.</code></pre><p>An optimiser takes a parameter list and returns a function that does the same thing as <code>update</code> above. We can pass either <code>opt</code> or <code>update</code> to our <a href="training.html">training loop</a>, which will then run the optimiser after every mini-batch of data.</p><h2><a class="nav-anchor" id="Optimiser-Reference-1" href="#Optimiser-Reference-1">Optimiser Reference</a></h2><p>All optimisers return a function that, when called, will update the parameters passed to it.</p><section class="docstring"><div class="docstring-header"><a class="docstring-binding" id="Flux.Optimise.SGD" href="#Flux.Optimise.SGD"><code>Flux.Optimise.SGD</code></a> — <span class="docstring-category">Function</span>.</div><div><div><pre><code class="language-none">SGD(params, η = 0.1; decay = 0)</code></pre><p>Classic gradient descent optimiser with learning rate <code>η</code>. For each parameter <code>p</code> and its gradient <code>δp</code>, this runs <code>p -= η*δp</code>.</p><p>Supports inverse decaying learning rate if the <code>decay</code> argument is provided.</p></div></div><a class="source-link" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/3a7b77d104f314736b3ad8bcc97ab3d72adcc341/src/optimise/interface.jl#L14-L21">source</a></section><section class="docstring"><div class="docstring-header"><a class="docstring-binding" id="Flux.Optimise.Momentum" href="#Flux.Optimise.Momentum"><code>Flux.Optimise.Momentum</code></a> — <span class="docstring-category">Function</span>.</div><div><div><pre><code class="language-none">Momentum(params, η = 0.01; ρ = 0.9, decay = 0)</code></pre><p>SGD with learning rate <code>η</code>, momentum <code>ρ</code> and optional learning rate inverse decay.</p></div></div><a class="source-link" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/3a7b77d104f314736b3ad8bcc97ab3d72adcc341/src/optimise/interface.jl#L25-L29">source</a></section><section class="docstring"><div class="docstring-header"><a class="docstring-binding" id="Flux.Optimise.Nesterov" href="#Flux.Optimise.Nesterov"><code>Flux.Optimise.Nesterov</code></a> — <span class="docstring-category">Function</span>.</div><div><div><pre><code class="language-none">Nesterov(params, η = 0.01; ρ = 0.9, decay = 0)</code></pre><p>SGD with learning rate <code>η</code>, Nesterov momentum <code>ρ</code> and optional learning rate inverse decay.</p></div></div><a class="source-link" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/3a7b77d104f314736b3ad8bcc97ab3d72adcc341/src/optimise/interface.jl#L33-L37">source</a></section><section class="docstring"><div class="docstring-header"><a class="docstring-binding" id="Flux.Optimise.ADAM" href="#Flux.Optimise.ADAM"><code>Flux.Optimise.ADAM</code></a> — <span class="docstring-category">Function</span>.</div><div><div><pre><code class="language-none">ADAM(params, η = 0.001; β1 = 0.9, β2 = 0.999, ϵ = 1e-08, decay = 0)</code></pre><p><a href="https://arxiv.org/abs/1412.6980v8">ADAM</a> optimiser.</p></div></div><a class="source-link" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/3a7b77d104f314736b3ad8bcc97ab3d72adcc341/src/optimise/interface.jl#L51-L55">source</a></section><footer><hr/><a class="previous" href="../models/layers.html"><span class="direction">Previous</span><span class="title">Model Reference</span></a><a class="next" href="training.html"><span class="direction">Next</span><span class="title">Training</span></a></footer></article></body></html>
|
||||
opt() # Carry out the update, modifying `W` and `b`.</code></pre><p>An optimiser takes a parameter list and returns a function that does the same thing as <code>update</code> above. We can pass either <code>opt</code> or <code>update</code> to our <a href="training.html">training loop</a>, which will then run the optimiser after every mini-batch of data.</p><h2><a class="nav-anchor" id="Optimiser-Reference-1" href="#Optimiser-Reference-1">Optimiser Reference</a></h2><p>All optimisers return a function that, when called, will update the parameters passed to it.</p><section class="docstring"><div class="docstring-header"><a class="docstring-binding" id="Flux.Optimise.SGD" href="#Flux.Optimise.SGD"><code>Flux.Optimise.SGD</code></a> — <span class="docstring-category">Function</span>.</div><div><div><pre><code class="language-none">SGD(params, η = 0.1; decay = 0)</code></pre><p>Classic gradient descent optimiser with learning rate <code>η</code>. For each parameter <code>p</code> and its gradient <code>δp</code>, this runs <code>p -= η*δp</code>.</p><p>Supports inverse decaying learning rate if the <code>decay</code> argument is provided.</p></div></div><a class="source-link" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/fe6793fde5b40430999c30d207570ce85d4d3fbc/src/optimise/interface.jl#L14-L21">source</a></section><section class="docstring"><div class="docstring-header"><a class="docstring-binding" id="Flux.Optimise.Momentum" href="#Flux.Optimise.Momentum"><code>Flux.Optimise.Momentum</code></a> — <span class="docstring-category">Function</span>.</div><div><div><pre><code class="language-none">Momentum(params, η = 0.01; ρ = 0.9, decay = 0)</code></pre><p>SGD with learning rate <code>η</code>, momentum <code>ρ</code> and optional learning rate inverse decay.</p></div></div><a class="source-link" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/fe6793fde5b40430999c30d207570ce85d4d3fbc/src/optimise/interface.jl#L25-L29">source</a></section><section class="docstring"><div class="docstring-header"><a class="docstring-binding" id="Flux.Optimise.Nesterov" href="#Flux.Optimise.Nesterov"><code>Flux.Optimise.Nesterov</code></a> — <span class="docstring-category">Function</span>.</div><div><div><pre><code class="language-none">Nesterov(params, η = 0.01; ρ = 0.9, decay = 0)</code></pre><p>SGD with learning rate <code>η</code>, Nesterov momentum <code>ρ</code> and optional learning rate inverse decay.</p></div></div><a class="source-link" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/fe6793fde5b40430999c30d207570ce85d4d3fbc/src/optimise/interface.jl#L33-L37">source</a></section><section class="docstring"><div class="docstring-header"><a class="docstring-binding" id="Flux.Optimise.ADAM" href="#Flux.Optimise.ADAM"><code>Flux.Optimise.ADAM</code></a> — <span class="docstring-category">Function</span>.</div><div><div><pre><code class="language-none">ADAM(params, η = 0.001; β1 = 0.9, β2 = 0.999, ϵ = 1e-08, decay = 0)</code></pre><p><a href="https://arxiv.org/abs/1412.6980v8">ADAM</a> optimiser.</p></div></div><a class="source-link" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/fe6793fde5b40430999c30d207570ce85d4d3fbc/src/optimise/interface.jl#L51-L55">source</a></section><footer><hr/><a class="previous" href="../models/layers.html"><span class="direction">Previous</span><span class="title">Model Reference</span></a><a class="next" href="training.html"><span class="direction">Next</span><span class="title">Training</span></a></footer></article></body></html>
|
||||
|
Loading…
Reference in New Issue
Block a user