Flux.jl/release-0.2/models/templates.html
2017-05-04 16:25:27 +00:00

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<title>
Model Templates · Flux
</title>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-36890222-9', 'auto');
ga('send', 'pageview');
</script>
<link href="https://cdnjs.cloudflare.com/ajax/libs/normalize/4.2.0/normalize.min.css" rel="stylesheet" type="text/css"/>
<link href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.5.0/styles/default.min.css" rel="stylesheet" type="text/css"/>
<link href="https://fonts.googleapis.com/css?family=Lato|Ubuntu+Mono" rel="stylesheet" type="text/css"/>
<link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" type="text/css"/>
<link href="../assets/documenter.css" rel="stylesheet" type="text/css"/>
<script>
documenterBaseURL=".."
</script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.2.0/require.min.js" data-main="../assets/documenter.js"></script>
<script src="../../versions.js"></script>
<link href="../../flux.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<nav class="toc">
<h1>
Flux
</h1>
<form class="search" action="../search.html">
<select id="version-selector" onChange="window.location.href=this.value">
<option value="#" selected="selected" disabled="disabled">
Version
</option>
</select>
<input id="search-query" name="q" type="text" placeholder="Search docs"/>
</form>
<ul>
<li>
<a class="toctext" href="../index.html">
Home
</a>
</li>
<li>
<span class="toctext">
Building Models
</span>
<ul>
<li>
<a class="toctext" href="basics.html">
Model Building Basics
</a>
</li>
<li class="current">
<a class="toctext" href="templates.html">
Model Templates
</a>
<ul class="internal">
<li>
<a class="toctext" href="#Models-in-templates-1">
Models in templates
</a>
</li>
<li>
<a class="toctext" href="#Supported-syntax-1">
Supported syntax
</a>
</li>
</ul>
</li>
<li>
<a class="toctext" href="recurrent.html">
Recurrence
</a>
</li>
<li>
<a class="toctext" href="debugging.html">
Debugging
</a>
</li>
</ul>
</li>
<li>
<span class="toctext">
Other APIs
</span>
<ul>
<li>
<a class="toctext" href="../apis/batching.html">
Batching
</a>
</li>
<li>
<a class="toctext" href="../apis/backends.html">
Backends
</a>
</li>
<li>
<a class="toctext" href="../apis/storage.html">
Storing Models
</a>
</li>
</ul>
</li>
<li>
<span class="toctext">
In Action
</span>
<ul>
<li>
<a class="toctext" href="../examples/logreg.html">
Simple MNIST
</a>
</li>
<li>
<a class="toctext" href="../examples/char-rnn.html">
Char RNN
</a>
</li>
</ul>
</li>
<li>
<a class="toctext" href="../contributing.html">
Contributing &amp; Help
</a>
</li>
<li>
<a class="toctext" href="../internals.html">
Internals
</a>
</li>
</ul>
</nav>
<article id="docs">
<header>
<nav>
<ul>
<li>
Building Models
</li>
<li>
<a href="templates.html">
Model Templates
</a>
</li>
</ul>
</nav>
<hr/>
</header>
<h1>
<a class="nav-anchor" id="Model-Templates-1" href="#Model-Templates-1">
Model Templates
</a>
</h1>
<p>
We mentioned that we could factor out the repetition of defining affine layers with something like:
</p>
<pre><code class="language-julia">function create_affine(in, out)
  W = param(randn(out,in))
  b = param(randn(out))
  @net x -&gt; W * x + b
end</code></pre>
<p>
The
<code>@net type</code>
syntax provides a shortcut for this:
</p>
<pre><code class="language-julia">@net type MyAffine
  W
  b
  x -&gt; W * x + b
end

# Convenience constructor
MyAffine(in::Integer, out::Integer) =
  MyAffine(randn(out, in), randn(out))

model = Chain(MyAffine(5, 5), MyAffine(5, 5))

model(x1) # [-1.54458,0.492025,0.88687,1.93834,-4.70062]</code></pre>
<p>
This is almost exactly how
<code>Affine</code>
is defined in Flux itself. Using
<code>@net type</code>
gives us some extra conveniences:
</p>
<ul>
<li>
<p>
It creates a default constructor
<code>MyAffine(::AbstractArray, ::AbstractArray)</code>
which initialises the
<code>param</code>s
for us;
</p>
</li>
<li>
<p>
It subtypes
<code>Flux.Model</code>
to explicitly mark this as a model;
</p>
</li>
<li>
<p>
We can easily define custom constructors or instantiate
<code>MyAffine</code>
with arbitrary weights of our choosing;
</p>
</li>
<li>
<p>
We can dispatch on the
<code>MyAffine</code>
type, for example to override how it gets converted to MXNet, or to hook into shape inference.
</p>
</li>
</ul>
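<p>
As a rough mental model, the conveniences above resemble what one might write by hand in plain Julia. The sketch below is not the actual expansion of
<code>@net type</code>
and does not use Flux at all;
<code>SketchAffine</code>
and
<code>describe</code>
are hypothetical names, and it assumes the
<code>struct</code>
keyword of current Julia rather than the older
<code>type</code>
keyword used on this page.
</p>
<pre><code class="language-julia"># Hand-written stand-in for a template type (assumption: plain Julia, no Flux).
struct SketchAffine
    W::Matrix{Float64}
    b::Vector{Float64}
end

# Forward pass: make instances callable like a function.
(a::SketchAffine)(x) = a.W * x .+ a.b

# Convenience constructor taking layer sizes instead of arrays.
SketchAffine(in::Integer, out::Integer) =
    SketchAffine(randn(out, in), randn(out))

# Because the layer is its own type, other functions can dispatch on it.
describe(a::SketchAffine) =
    "affine layer: $(size(a.W, 2)) inputs, $(size(a.W, 1)) outputs"

layer = SketchAffine(5, 3)   # random weights
y = layer(rand(5))           # a vector of length 3

# Arbitrary weights of our choosing via the default constructor.
fixed = SketchAffine(zeros(3, 5), ones(3))</code></pre>
<p>
The point of the sketch is only that a named type gives us something to construct and dispatch on; the template additionally handles parameter initialisation and marks the type as a model.
</p>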
<h2>
<a class="nav-anchor" id="Models-in-templates-1" href="#Models-in-templates-1">
Models in templates
</a>
</h2>
<p>
<code>@net</code>
models can contain sub-models as well as just array parameters:
</p>
<pre><code class="language-julia">@net type TLP
  first
  second
  function (x)
    l1 = σ(first(x))
    l2 = softmax(second(l1))
  end
end</code></pre>
<p>
Here the
<code>first</code>
and
<code>second</code>
parameters are not arrays but models themselves: each produces a result when called with an input array
<code>x</code>.
The
<code>Affine</code>
layer fits the bill, so we can instantiate
<code>TLP</code>
with two of them:
</p>
<pre><code class="language-julia">model = TLP(Affine(10, 20),
            Affine(20, 15))
x1 = rand(10) # the first layer expects 10 inputs
model(x1) # [0.057852,0.0409741,0.0609625,0.0575354 ...</code></pre>
<p>
You may recognise this as being equivalent to
</p>
<pre><code class="language-julia">Chain(
  Affine(10, 20), σ,
  Affine(20, 15), softmax)</code></pre>
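<p>
The equivalence holds because a chain simply feeds the output of each layer into the next. A minimal plain-Julia sketch of that idea follows; it is not how Flux implements
<code>Chain</code>, and
<code>chain_sketch</code>,
<code>double</code>
and
<code>addone</code>
are hypothetical names used only for illustration.
</p>
<pre><code class="language-julia"># Hypothetical helper, not part of Flux: build a function that
# applies each layer in order, passing the result along.
function chain_sketch(layers...)
    function run(x)
        for layer in layers
            x = layer(x)
        end
        return x
    end
    return run
end

# Two toy "layers" standing in for Affine, σ and so on.
double(x) = 2 .* x
addone(x) = x .+ 1

model = chain_sketch(double, addone)
result = model([1, 2, 3])   # same as addone(double([1, 2, 3]))</code></pre>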
<h2>
<a class="nav-anchor" id="Supported-syntax-1" href="#Supported-syntax-1">
Supported syntax
</a>
</h2>
<p>
The syntax used to define a forward pass like
<code>x -&gt; W*x + b</code>
behaves exactly like Julia code for the most part. However, it&#39;s important to remember that it defines a dataflow graph, not a general Julia expression. In practice this means that side effects such as
<code>println</code>
calls, as well as control flow, won&#39;t work as expected. In the future we&#39;ll continue to expand support for Julia syntax and features.
</p>
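<p>
One way to work within this restriction, offered here as a suggestion rather than a documented Flux pattern, is to keep the forward pass to pure array operations and perform any printing on the ordinary Julia values it returns. The sketch below uses plain Julia with no Flux;
<code>forward</code>
is a hypothetical name.
</p>
<pre><code class="language-julia"># Pure forward pass: only array operations, no printing or branching.
forward(W, b, x) = W * x .+ b

W = [1.0 0.0; 0.0 1.0]
b = [0.5, -0.5]
x = [2.0, 3.0]

y = forward(W, b, x)

# Side effects happen outside the forward pass, on ordinary Julia values.
println("output: ", y)</code></pre>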
<footer>
<hr/>
<a class="previous" href="basics.html">
<span class="direction">
Previous
</span>
<span class="title">
Model Building Basics
</span>
</a>
<a class="next" href="recurrent.html">
<span class="direction">
Next
</span>
<span class="title">
Recurrence
</span>
</a>
</footer>
</article>
</body>
</html>