43 lines
10 KiB
HTML
43 lines
10 KiB
HTML
<!DOCTYPE html>
|
||
<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Recurrence · Flux</title><script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
|
||
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
|
||
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
|
||
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
|
||
|
||
ga('create', 'UA-36890222-9', 'auto');
|
||
ga('send', 'pageview', {'page': location.pathname + location.search + location.hash});
|
||
</script><link href="https://fonts.googleapis.com/css?family=Lato|Roboto+Mono" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.11.1/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL="../.."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../../assets/documenter.js"></script><script src="../../siteinfo.js"></script><script src="../../../versions.js"></script><link href="../../assets/flux.css" rel="stylesheet" type="text/css"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../../assets/themes/documenter-dark.css" data-theme-name="documenter-dark"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><div class="docs-package-name"><span class="docs-autofit">Flux</span></div><form class="docs-search" action="../../search/"><input class="docs-search-query" id="documenter-search-query" name="q" type="text" placeholder="Search docs"/></form><ul class="docs-menu"><li><a class="tocitem" href="../../">Home</a></li><li><span class="tocitem">Building Models</span><ul><li><a class="tocitem" href="../basics/">Basics</a></li><li class="is-active"><a class="tocitem" href>Recurrence</a><ul class="internal"><li><a class="tocitem" href="#Recurrent-Cells-1"><span>Recurrent Cells</span></a></li><li><a class="tocitem" href="#Stateful-Models-1"><span>Stateful Models</span></a></li><li><a class="tocitem" href="#Sequences-1"><span>Sequences</span></a></li></ul></li><li><a class="tocitem" href="../regularisation/">Regularisation</a></li><li><a class="tocitem" href="../layers/">Model Reference</a></li><li><a class="tocitem" href="../advanced/">Advanced Model Building</a></li><li><a class="tocitem" href="../nnlib/">NNlib</a></li></ul></li><li><span class="tocitem">Handling Data</span><ul><li><a class="tocitem" href="../../data/onehot/">One-Hot Encoding</a></li><li><a class="tocitem" href="../../data/dataloader/">DataLoader</a></li></ul></li><li><span class="tocitem">Training Models</span><ul><li><a class="tocitem" href="../../training/optimisers/">Optimisers</a></li><li><a class="tocitem" href="../../training/training/">Training</a></li></ul></li><li><a class="tocitem" href="../../gpu/">GPU Support</a></li><li><a class="tocitem" href="../../saving/">Saving & Loading</a></li><li><a class="tocitem" href="../../ecosystem/">The Julia Ecosystem</a></li><li><a class="tocitem" href="../../utilities/">Utility Functions</a></li><li><a class="tocitem" href="../../performance/">Performance Tips</a></li><li><a class="tocitem" href="../../datasets/">Datasets</a></li><li><a class="tocitem" href="../../community/">Community</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><nav class="breadcrumb"><ul class="is-hidden-mobile"><li><a class="is-disabled">Building Models</a></li><li class="is-active"><a href>Recurrence</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>Recurrence</a></li></ul></nav><div class="docs-right"><a class="docs-edit-link" href="https://github.com/FluxML/Flux.jl/blob/master/docs/src/models/recurrence.md" title="Edit on GitHub"><span class="docs-icon fab"></span><span class="docs-label is-hidden-touch">Edit on GitHub</span></a><a class="docs-settings-button fas fa-cog" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-sidebar-button fa fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a></div></header><article class="content" id="documenter-page"><h1 id="Recurrent-Models-1"><a class="docs-heading-anchor" href="#Recurrent-Models-1">Recurrent Models</a><a class="docs-heading-anchor-permalink" href="#Recurrent-Models-1" title="Permalink"></a></h1><h2 id="Recurrent-Cells-1"><a class="docs-heading-anchor" href="#Recurrent-Cells-1">Recurrent Cells</a><a class="docs-heading-anchor-permalink" href="#Recurrent-Cells-1" title="Permalink"></a></h2><p>In the simple feedforward case, our model <code>m</code> is a simple function from various inputs <code>xᵢ</code> to predictions <code>yᵢ</code>. (For example, each <code>x</code> might be an MNIST digit and each <code>y</code> a digit label.) Each prediction is completely independent of any others, and using the same <code>x</code> will always produce the same <code>y</code>.</p><pre><code class="language-julia">y₁ = f(x₁)
|
||
y₂ = f(x₂)
|
||
y₃ = f(x₃)
|
||
# ...</code></pre><p>Recurrent networks introduce a <em>hidden state</em> that gets carried over each time we run the model. The model now takes the old <code>h</code> as an input, and produces a new <code>h</code> as output, each time we run it.</p><pre><code class="language-julia">h = # ... initial state ...
|
||
h, y₁ = f(h, x₁)
|
||
h, y₂ = f(h, x₂)
|
||
h, y₃ = f(h, x₃)
|
||
# ...</code></pre><p>Information stored in <code>h</code> is preserved for the next prediction, allowing it to function as a kind of memory. This also means that the prediction made for a given <code>x</code> depends on all the inputs previously fed into the model.</p><p>(This might be important if, for example, each <code>x</code> represents one word of a sentence; the model's interpretation of the word "bank" should change if the previous input was "river" rather than "investment".)</p><p>Flux's RNN support closely follows this mathematical perspective. The most basic RNN is as close as possible to a standard <code>Dense</code> layer, and the output is also the hidden state.</p><pre><code class="language-julia">Wxh = randn(5, 10)
|
||
Whh = randn(5, 5)
|
||
b = randn(5)
|
||
|
||
function rnn(h, x)
|
||
h = tanh.(Wxh * x .+ Whh * h .+ b)
|
||
return h, h
|
||
end
|
||
|
||
x = rand(10) # dummy data
|
||
h = rand(5) # initial hidden state
|
||
|
||
h, y = rnn(h, x)</code></pre><p>If you run the last line a few times, you'll notice the output <code>y</code> changing slightly even though the input <code>x</code> is the same.</p><p>We sometimes refer to functions like <code>rnn</code> above, which explicitly manage state, as recurrent <em>cells</em>. There are various recurrent cells available, which are documented in the <a href="../layers/">layer reference</a>. The hand-written example above can be replaced with:</p><pre><code class="language-julia">using Flux
|
||
|
||
rnn2 = Flux.RNNCell(10, 5)
|
||
|
||
x = rand(10) # dummy data
|
||
h = rand(5) # initial hidden state
|
||
|
||
h, y = rnn2(h, x)</code></pre><h2 id="Stateful-Models-1"><a class="docs-heading-anchor" href="#Stateful-Models-1">Stateful Models</a><a class="docs-heading-anchor-permalink" href="#Stateful-Models-1" title="Permalink"></a></h2><p>For the most part, we don't want to manage hidden states ourselves, but to treat our models as being stateful. Flux provides the <code>Recur</code> wrapper to do this.</p><pre><code class="language-julia">x = rand(10)
|
||
h = rand(5)
|
||
|
||
m = Flux.Recur(rnn, h)
|
||
|
||
y = m(x)</code></pre><p>The <code>Recur</code> wrapper stores the state between runs in the <code>m.state</code> field.</p><p>If you use the <code>RNN(10, 5)</code> constructor – as opposed to <code>RNNCell</code> – you'll see that it's simply a wrapped cell.</p><pre><code class="language-julia">julia> RNN(10, 5)
|
||
Recur(RNNCell(10, 5, tanh))</code></pre><h2 id="Sequences-1"><a class="docs-heading-anchor" href="#Sequences-1">Sequences</a><a class="docs-heading-anchor-permalink" href="#Sequences-1" title="Permalink"></a></h2><p>Often we want to work with sequences of inputs, rather than individual <code>x</code>s.</p><pre><code class="language-julia">seq = [rand(10) for i = 1:10]</code></pre><p>With <code>Recur</code>, applying our model to each element of a sequence is trivial:</p><pre><code class="language-julia">m.(seq) # returns a list of 5-element vectors</code></pre><p>This works even when we've chain recurrent layers into a larger model.</p><pre><code class="language-julia">m = Chain(LSTM(10, 15), Dense(15, 5))
|
||
m.(seq)</code></pre><p>Finally, we can reset the hidden state of the cell back to its initial value using <code>reset!(m)</code>.</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../basics/">« Basics</a><a class="docs-footer-nextpage" href="../regularisation/">Regularisation »</a></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> on <span class="colophon-date" title="Wednesday 27 May 2020 11:52">Wednesday 27 May 2020</span>. Using Julia version 1.3.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
|