<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<title>
Model Templates · Flux
</title>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-36890222-9', 'auto');
ga('send', 'pageview');
</script>
<link href="https://cdnjs.cloudflare.com/ajax/libs/normalize/4.2.0/normalize.min.css" rel="stylesheet" type="text/css"/>
<link href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.5.0/styles/default.min.css" rel="stylesheet" type="text/css"/>
<link href="https://fonts.googleapis.com/css?family=Lato|Ubuntu+Mono" rel="stylesheet" type="text/css"/>
<link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" type="text/css"/>
<link href="../assets/documenter.css" rel="stylesheet" type="text/css"/>
<script>
documenterBaseURL=".."
</script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.2.0/require.min.js" data-main="../assets/documenter.js"></script>
<script src="../../versions.js"></script>
<link href="../../flux.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<nav class="toc">
<h1>
Flux
</h1>
<form class="search" action="../search.html">
<select id="version-selector" onChange="window.location.href=this.value">
<option value="#" selected="selected" disabled="disabled">
Version
</option>
</select>
<input id="search-query" name="q" type="text" placeholder="Search docs"/>
</form>
<ul>
<li>
<a class="toctext" href="../index.html">
Home
</a>
</li>
<li>
<span class="toctext">
Building Models
</span>
<ul>
<li>
<a class="toctext" href="basics.html">
Model Building Basics
</a>
</li>
<li class="current">
<a class="toctext" href="templates.html">
Model Templates
</a>
<ul class="internal">
<li>
<a class="toctext" href="#Models-in-templates-1">
Models in templates
</a>
</li>
<li>
<a class="toctext" href="#Supported-syntax-1">
Supported syntax
</a>
</li>
</ul>
</li>
<li>
<a class="toctext" href="recurrent.html">
Recurrence
</a>
</li>
<li>
<a class="toctext" href="debugging.html">
Debugging
</a>
</li>
</ul>
</li>
<li>
<span class="toctext">
Other APIs
</span>
<ul>
<li>
<a class="toctext" href="../apis/batching.html">
Batching
</a>
</li>
<li>
<a class="toctext" href="../apis/backends.html">
Backends
</a>
</li>
<li>
<a class="toctext" href="../apis/storage.html">
Storing Models
</a>
</li>
</ul>
</li>
<li>
<span class="toctext">
In Action
</span>
<ul>
<li>
<a class="toctext" href="../examples/logreg.html">
Simple MNIST
</a>
</li>
<li>
<a class="toctext" href="../examples/char-rnn.html">
Char RNN
</a>
</li>
</ul>
</li>
<li>
<a class="toctext" href="../contributing.html">
Contributing &amp; Help
</a>
</li>
<li>
<a class="toctext" href="../internals.html">
Internals
</a>
</li>
</ul>
</nav>
<article id="docs">
<header>
<nav>
<ul>
<li>
Building Models
</li>
<li>
<a href="templates.html">
Model Templates
</a>
</li>
</ul>
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/0e325e0425606161ded20064ba0c5d929f497fad/docs/src/models/templates.md">
<span class="fa">
</span>
Edit on GitHub
</a>
</nav>
<hr/>
</header>
<h1>
<a class="nav-anchor" id="Model-Templates-1" href="#Model-Templates-1">
Model Templates
</a>
</h1>
<p>
We mentioned that we could factor out the repetition of defining affine layers with something like:
</p>
<pre><code class="language-julia">function create_affine(in, out)
  W = param(randn(out,in))
  b = param(randn(out))
  @net x -&gt; W * x + b
end</code></pre>
<p>
The
<code>@net type</code>
syntax provides a shortcut for this:
</p>
<pre><code class="language-julia">@net type MyAffine
  W
  b
  x -&gt; W * x + b
end

# Convenience constructor
MyAffine(in::Integer, out::Integer) =
  MyAffine(randn(out, in), randn(out))

model = Chain(MyAffine(5, 5), MyAffine(5, 5))

x1 = rand(5) # example input matching the first layer's input size
model(x1) # e.g. [-1.54458,0.492025,0.88687,1.93834,-4.70062]</code></pre>
<p>
This is almost exactly how
<code>Affine</code>
is defined in Flux itself. Using
<code>@net type</code>
gives us some extra conveniences:
</p>
<ul>
<li>
<p>
It creates a default constructor
<code>MyAffine(::AbstractArray, ::AbstractArray)</code>
which initialises
<code>param</code>
s for us;
</p>
</li>
<li>
<p>
It subtypes
<code>Flux.Model</code>
to explicitly mark this as a model;
</p>
</li>
<li>
<p>
We can easily define custom constructors or instantiate
<code>Affine</code>
with arbitrary weights of our choosing;
</p>
</li>
<li>
<p>
We can dispatch on the
<code>Affine</code>
type, for example to override how it gets converted to MXNet, or to hook into shape inference.
</p>
</li>
</ul>
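<p>
As a rough analogy only, the conveniences listed above resemble what a hand-written callable type gives you in plain Julia. The sketch below assumes Julia 1.x and does not use Flux or show what
<code>@net type</code>
actually generates; the names
<code>ToyAffine</code>
and
<code>describe</code>
are hypothetical.
</p>
<pre><code class="language-julia"># Plain-Julia stand-in for a layer type (hypothetical, not Flux's Affine).
struct ToyAffine
  W::Matrix{Float64}
  b::Vector{Float64}
end

# Forward pass: instances are callable like functions.
(a::ToyAffine)(x) = a.W * x .+ a.b

# Custom constructor that takes sizes instead of arrays.
ToyAffine(nin::Integer, nout::Integer) = ToyAffine(randn(nout, nin), randn(nout))

# Dispatching on the type lets us attach behaviour specific to it.
describe(a::ToyAffine) = "ToyAffine(in=$(size(a.W, 2)), out=$(size(a.W, 1)))"

# Instantiate with weights of our choosing.
layer = ToyAffine([1.0 0.0; 0.0 2.0], [1.0, 1.0])
layer([3.0, 4.0])         # [4.0, 9.0]
describe(ToyAffine(3, 2)) # "ToyAffine(in=3, out=2)"</code></pre>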
<h2>
<a class="nav-anchor" id="Models-in-templates-1" href="#Models-in-templates-1">
Models in templates
</a>
</h2>
<p>
<code>@net</code>
models can contain sub-models as well as just array parameters:
</p>
<pre><code class="language-julia">@net type TLP
  first
  second
  function (x)
    l1 = σ(first(x))
    l2 = softmax(second(l1))
  end
end</code></pre>
<p>
Clearly, the
<code>first</code>
and
<code>second</code>
parameters are not arrays here, but should be models themselves, and produce a result when called with an input array
<code>x</code>
. The
<code>Affine</code>
layer fits the bill, so we can instantiate
<code>TLP</code>
with two of them:
</p>
<pre><code class="language-julia">model = TLP(Affine(10, 20),
            Affine(20, 15))
x1 = rand(10) # input length matches the first layer's input size
model(x1) # [0.057852,0.0409741,0.0609625,0.0575354 ...</code></pre>
<p>
You may recognise this as being equivalent to
</p>
<pre><code class="language-julia">Chain(
  Affine(10, 20), σ,
  Affine(20, 15), softmax)</code></pre>
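<p>
To see why the two forms agree, it may help to view both as plain function composition. The following sketch assumes Julia 1.x and uses no Flux code: the
<code>chain</code>
helper, the fixed-weight layers, and the hand-written
<code>σ</code>
and
<code>softmax</code>
are illustrative stand-ins, not Flux's definitions.
</p>
<pre><code class="language-julia"># Hand-written stand-ins for the activation functions.
σ(x) = 1 ./ (1 .+ exp.(-x))
softmax(x) = exp.(x) ./ sum(exp.(x))

# Hypothetical helper: apply each layer in turn, left to right.
chain(layers...) = x -&gt; foldl((acc, f) -&gt; f(acc), layers; init = x)

# Two fixed linear layers standing in for the sub-models.
first_layer(x) = [1.0 0.0; 0.0 1.0; 1.0 1.0] * x
second_layer(x) = [1.0 2.0 3.0; 0.0 1.0 0.0] * x

# Nested calls, as in the TLP forward pass ...
tlp(x) = softmax(second_layer(σ(first_layer(x))))
# ... and the same computation written as a chain.
chained = chain(first_layer, σ, second_layer, softmax)

x = [0.5, -0.5]
tlp(x) == chained(x) # true</code></pre>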
<h2>
<a class="nav-anchor" id="Supported-syntax-1" href="#Supported-syntax-1">
Supported syntax
</a>
</h2>
<p>
The syntax used to define a forward pass like
<code>x -&gt; W * x + b</code>
behaves like ordinary Julia code for the most part. However, it&#39;s important to remember that it&#39;s defining a dataflow graph, not a general Julia expression. In practice this means that anything side-effectful, or things like control flow and
<code>println</code>
s, won&#39;t work as expected. In future we&#39;ll continue to expand support for Julia syntax and features.
</p>
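<p>
The toy tracer below is one way to picture why side effects behave differently in a dataflow graph. It assumes Julia 1.x and is not how Flux builds its graphs; the
<code>Sym</code>
type,
<code>forward</code>
,
<code>evaluate</code>
and the
<code>traces</code>
counter are all hypothetical. The function body runs once to record the graph, so its side effect happens once, however often the graph is evaluated afterwards.
</p>
<pre><code class="language-julia"># Toy symbolic value: records operations instead of computing them.
struct Sym
  expr::Union{Symbol,Expr}
end
Base.:*(a::Sym, b::Sym) = Sym(Expr(:call, :*, a.expr, b.expr))
Base.:+(a::Sym, b::Sym) = Sym(Expr(:call, :+, a.expr, b.expr))

traces = Ref(0) # counts how often the side effect runs

function forward(W, x, b)
  traces[] += 1 # side effect: runs only while the graph is being recorded
  W * x + b
end

# Build the graph once by calling the function on symbolic inputs.
graph = forward(Sym(:W), Sym(:x), Sym(:b)).expr

# Minimal interpreter for graphs made of * and + calls.
evaluate(s::Symbol, env) = env[s]
function evaluate(ex::Expr, env)
  op = ex.args[1] === :* ? (*) : (+)
  op(evaluate(ex.args[2], env), evaluate(ex.args[3], env))
end

env = Dict(:W =&gt; [1.0 2.0; 3.0 4.0], :x =&gt; [1.0, 1.0], :b =&gt; [0.5, 0.5])
evaluate(graph, env) # [3.5, 7.5]
evaluate(graph, env) # same result; the side effect is not repeated
traces[]             # 1</code></pre>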
<footer>
<hr/>
<a class="previous" href="basics.html">
<span class="direction">
Previous
</span>
<span class="title">
Model Building Basics
</span>
</a>
<a class="next" href="recurrent.html">
<span class="direction">
Next
</span>
<span class="title">
Recurrence
</span>
</a>
</footer>
</article>
</body>
</html>