<!DOCTYPE html>
<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Recurrence · Flux</title><script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-36890222-9', 'auto');
ga('send', 'pageview', {'page': location.pathname + location.search + location.hash});
</script><link href="https://fonts.googleapis.com/css?family=Lato|Roboto+Mono" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.11.1/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL="../.."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../../assets/documenter.js"></script><script src="../../siteinfo.js"></script><script src="../../../versions.js"></script><link href="../../assets/flux.css" rel="stylesheet" type="text/css"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../../assets/themes/documenter-dark.css" data-theme-name="documenter-dark"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><div class="docs-package-name"><span class="docs-autofit">Flux</span></div><form class="docs-search" action="../../search/"><input class="docs-search-query" id="documenter-search-query" name="q" type="text" placeholder="Search docs"/></form><ul class="docs-menu"><li><a class="tocitem" href="../../">Home</a></li><li><span class="tocitem">Building Models</span><ul><li><a class="tocitem" href="../basics/">Basics</a></li><li class="is-active"><a class="tocitem" href>Recurrence</a><ul class="internal"><li><a class="tocitem" href="#Recurrent-Cells-1"><span>Recurrent Cells</span></a></li><li><a class="tocitem" 
href="#Stateful-Models-1"><span>Stateful Models</span></a></li><li><a class="tocitem" href="#Sequences-1"><span>Sequences</span></a></li></ul></li><li><a class="tocitem" href="../regularisation/">Regularisation</a></li><li><a class="tocitem" href="../layers/">Model Reference</a></li><li><a class="tocitem" href="../advanced/">Advanced Model Building</a></li><li><a class="tocitem" href="../nnlib/">NNlib</a></li></ul></li><li><span class="tocitem">Handling Data</span><ul><li><a class="tocitem" href="../../data/onehot/">One-Hot Encoding</a></li><li><a class="tocitem" href="../../data/dataloader/">DataLoader</a></li></ul></li><li><span class="tocitem">Training Models</span><ul><li><a class="tocitem" href="../../training/optimisers/">Optimisers</a></li><li><a class="tocitem" href="../../training/training/">Training</a></li></ul></li><li><a class="tocitem" href="../../gpu/">GPU Support</a></li><li><a class="tocitem" href="../../saving/">Saving &amp; Loading</a></li><li><a class="tocitem" href="../../ecosystem/">The Julia Ecosystem</a></li><li><a class="tocitem" href="../../performance/">Performance Tips</a></li><li><a class="tocitem" href="../../community/">Community</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><nav class="breadcrumb"><ul class="is-hidden-mobile"><li><a class="is-disabled">Building Models</a></li><li class="is-active"><a href>Recurrence</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>Recurrence</a></li></ul></nav><div class="docs-right"><a class="docs-edit-link" href="https://github.com/FluxML/Flux.jl/blob/master/docs/src/models/recurrence.md" title="Edit on GitHub"><span class="docs-icon fab"></span><span 
class="docs-label is-hidden-touch">Edit on GitHub</span></a><a class="docs-settings-button fas fa-cog" id="documenter-settings-butt
y₂ = f(x₂)
y₃ = f(x₃)
# ...</code></pre><p>Recurrent networks introduce a <em>hidden state</em> that gets carried over each time we run the model. The model now takes the old <code>h</code> as an input, and produces a new <code>h</code> as output, each time we run it.</p><pre><code class="language-julia">h = # ... initial state ...
h, y₁ = f(h, x₁)
h, y₂ = f(h, x₂)
h, y₃ = f(h, x₃)
# ...</code></pre><p>Information stored in <code>h</code> is preserved for the next prediction, allowing it to function as a kind of memory. This also means that the prediction made for a given <code>x</code> depends on all the inputs previously fed into the model.</p><p>(This might be important if, for example, each <code>x</code> represents one word of a sentence; the model&#39;s interpretation of the word &quot;bank&quot; should change if the previous input was &quot;river&quot; rather than &quot;investment&quot;.)</p><p>Flux&#39;s RNN support closely follows this mathematical perspective. The most basic RNN is as close as possible to a standard <code>Dense</code> layer, and the output is also the hidden state.</p><pre><code class="language-julia">Wxh = randn(5, 10)
Whh = randn(5, 5)
b = randn(5)
function rnn(h, x)
  h = tanh.(Wxh * x .+ Whh * h .+ b)
  return h, h
end
x = rand(10) # dummy data
h = rand(5) # initial hidden state
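# Sketch added for illustration (not part of the original example): both matrix
# products land in the 5-element hidden space, which is why `rnn` can sum them
# with the bias `b` before applying `tanh`.
@assert size(Wxh * x) == size(Whh * h) == size(b) == (5,)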
h, y = rnn(h, x)</code></pre><p>If you run the last line a few times, you&#39;ll notice the output <code>y</code> changing slightly even though the input <code>x</code> is the same, because each call feeds the updated <code>h</code> back in.</p><p>We sometimes refer to functions like <code>rnn</code> above, which explicitly manage state, as recurrent <em>cells</em>. There are various recurrent cells available, which are documented in the <a href="../layers/">layer reference</a>. The hand-written example above can be replaced with:</p><pre><code class="language-julia">using Flux
rnn2 = Flux.RNNCell(10, 5)
x = rand(10) # dummy data
h = rand(5) # initial hidden state
h, y = rnn2(h, x)</code></pre><h2 id="Stateful-Models-1"><a class="docs-heading-anchor" href="#Stateful-Models-1">Stateful Models</a><a class="docs-heading-anchor-permalink" href="#Stateful-Models-1" title="Permalink"></a></h2><p>For the most part, we don&#39;t want to manage hidden states ourselves, but to treat our models as being stateful. Flux provides the <code>Recur</code> wrapper to do this.</p><pre><code class="language-julia">x = rand(10)
h = rand(5)
m = Flux.Recur(rnn, h)
y = m(x)</code></pre><p>The <code>Recur</code> wrapper stores the state between runs in the <code>m.state</code> field.</p><p>If you use the <code>RNN(10, 5)</code> constructor as opposed to <code>RNNCell</code> you&#39;ll see that it&#39;s simply a wrapped cell.</p><pre><code class="language-julia">julia&gt; RNN(10, 5)
Recur(RNNCell(10, 5, tanh))</code></pre><h2 id="Sequences-1"><a class="docs-heading-anchor" href="#Sequences-1">Sequences</a><a class="docs-heading-anchor-permalink" href="#Sequences-1" title="Permalink"></a></h2><p>Often we want to work with sequences of inputs, rather than individual <code>x</code>s.</p><pre><code class="language-julia">seq = [rand(10) for i = 1:10]</code></pre><p>With <code>Recur</code>, applying our model to each element of a sequence is trivial:</p><pre><code class="language-julia">m.(seq) # returns a list of 5-element vectors</code></pre><p>This works even when we&#39;ve chained recurrent layers into a larger model.</p><pre><code class="language-julia">m = Chain(LSTM(10, 15), Dense(15, 5))
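# Sketch added for illustration, assuming `Flux.reset!` is available in this
# Flux version: an explicit loop makes the evaluation order visible, and each
# call to `m` advances the hidden state of the LSTM.
ys = [m(x) for x in seq]  # one 5-element output per element of `seq`
Flux.reset!(m)            # restore the initial hidden state before the next pass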
m.(seq)</code></pre><p>Finally, we can reset the hidden state of the cell back to its initial value using <code>reset!(m)</code>.</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../basics/">« Basics</a><a class="docs-footer-nextpage" href="../regularisation/">Regularisation »</a></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> on <span class="colophon-date" title="Tuesday 10 March 2020 10:11">Tuesday 10 March 2020</span>. Using Julia version 1.3.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>