2017-03-01 12:37:00 +00:00
<!DOCTYPE html>
< html lang = "en" >
< head >
< meta charset = "UTF-8" / >
< meta name = "viewport" content = "width=device-width, initial-scale=1.0" / >
< title >
Model Templates · Flux
< / title >
< script >
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-36890222-9', 'auto');
ga('send', 'pageview');
< / script >
< link href = "https://cdnjs.cloudflare.com/ajax/libs/normalize/4.2.0/normalize.min.css" rel = "stylesheet" type = "text/css" / >
< link href = "https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.5.0/styles/default.min.css" rel = "stylesheet" type = "text/css" / >
< link href = "https://fonts.googleapis.com/css?family=Lato|Ubuntu+Mono" rel = "stylesheet" type = "text/css" / >
< link href = "https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.6.3/css/font-awesome.min.css" rel = "stylesheet" type = "text/css" / >
< link href = "../assets/documenter.css" rel = "stylesheet" type = "text/css" / >
< script >
documenterBaseURL=".."
< / script >
< script src = "https://cdnjs.cloudflare.com/ajax/libs/require.js/2.2.0/require.min.js" data-main = "../assets/documenter.js" > < / script >
< script src = "../../versions.js" > < / script >
< link href = "../../flux.css" rel = "stylesheet" type = "text/css" / >
< / head >
< body >
< nav class = "toc" >
< h1 >
Flux
< / h1 >
< form class = "search" action = "../search.html" >
< select id = "version-selector" onChange = "window.location.href=this.value" >
< option value = "#" selected = "selected" disabled = "disabled" >
Version
< / option >
< / select >
< input id = "search-query" name = "q" type = "text" placeholder = "Search docs" / >
< / form >
< ul >
< li >
< a class = "toctext" href = "../index.html" >
Home
< / a >
< / li >
< li >
< span class = "toctext" >
Building Models
< / span >
< ul >
< li >
< a class = "toctext" href = "basics.html" >
Model Building Basics
< / a >
< / li >
< li class = "current" >
< a class = "toctext" href = "templates.html" >
Model Templates
< / a >
< ul class = "internal" >
< li >
< a class = "toctext" href = "#Models-in-templates-1" >
Models in templates
< / a >
< / li >
< li >
< a class = "toctext" href = "#Constructors-1" >
Constructors
< / a >
< / li >
< li >
< a class = "toctext" href = "#Supported-syntax-1" >
Supported syntax
< / a >
< / li >
< / ul >
< / li >
< li >
< a class = "toctext" href = "recurrent.html" >
Recurrence
< / a >
< / li >
< li >
< a class = "toctext" href = "debugging.html" >
Debugging
< / a >
< / li >
< / ul >
< / li >
< li >
< span class = "toctext" >
Other APIs
< / span >
< ul >
< li >
< a class = "toctext" href = "../apis/batching.html" >
Batching
< / a >
< / li >
< li >
< a class = "toctext" href = "../apis/backends.html" >
Backends
< / a >
< / li >
< li >
< a class = "toctext" href = "../apis/storage.html" >
Storing Models
< / a >
< / li >
< / ul >
< / li >
< li >
< span class = "toctext" >
In Action
< / span >
< ul >
< li >
< a class = "toctext" href = "../examples/logreg.html" >
Logistic Regression
< / a >
< / li >
< li >
< a class = "toctext" href = "../examples/char-rnn.html" >
Char RNN
< / a >
< / li >
< / ul >
< / li >
< li >
< a class = "toctext" href = "../contributing.html" >
Contributing & Help
< / a >
< / li >
< li >
< a class = "toctext" href = "../internals.html" >
Internals
< / a >
< / li >
< / ul >
< / nav >
< article id = "docs" >
< header >
< nav >
< ul >
< li >
Building Models
< / li >
< li >
< a href = "templates.html" >
Model Templates
< / a >
< / li >
< / ul >
< / nav >
< hr / >
< / header >
< h1 >
< a class = "nav-anchor" id = "Model-Templates-1" href = "#Model-Templates-1" >
Model Templates
< / a >
< / h1 >
< p >
< em >
... Calculating Tax Expenses ...
< / em >
< / p >
< p >
So how does the
< code > Affine< / code >
template work? We don't want to write out code like the following every time we need more than one affine layer:
< / p >
< pre > < code class = "language-julia" > W₁, b₁ = randn(...)
affine₁(x) = W₁*x + b₁
W₂, b₂ = randn(...)
affine₂(x) = W₂*x + b₂
model = Chain(affine₁, affine₂)< / code > < / pre >
< p >
Here's one way we could solve this: just keep the parameters in a Julia type, and define how that type acts as a function:
< / p >
< pre > < code class = "language-julia" > type MyAffine
  W
  b
end
# Use the `MyAffine` layer as a model
(l::MyAffine)(x) = l.W * x + l.b
# Convenience constructor
MyAffine(in::Integer, out::Integer) =
  MyAffine(randn(out, in), randn(out))
model = Chain(MyAffine(5, 5), MyAffine(5, 5))
x1 = rand(5) # example input of length 5
model(x1) # [-1.54458,0.492025,0.88687,1.93834,-4.70062]< / code > < / pre >
< p >
This is much better: we can now make as many affine layers as we want. This is a very common pattern, so to make it more convenient we can use the
< code > @net< / code >
macro:
< / p >
< pre > < code class = "language-julia" > @net type MyAffine
W
b
x -> x * W + b
end< / code > < / pre >
< p >
The function provided,
< code > x -> x * W + b< / code >
, will be used when
< code > MyAffine< / code >
is used as a model; it's just a shorter way of defining the
< code > (::MyAffine)(x)< / code >
method above. (You may notice that
< code > W< / code >
and
< code > x< / code >
have swapped order in the model; this is due to the way batching works, which will be covered in more detail later on.)
< / p >
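< p >
To see why the order of the operands matters, here is a minimal plain-Julia sketch that does not use Flux. It assumes a batch is stored as a matrix with one sample per row, which is a common convention but may not match Flux's internals exactly; the variable names and sizes are illustrative only.
< / p >
< pre > < code class = "language-julia" > # Plain-Julia sketch (no Flux); names and sizes are illustrative assumptions.
n_in, n_out, batch = 5, 3, 4

W = randn(n_in, n_out)   # weights: one column per output unit
b = randn(1, n_out)      # bias stored as a row so it broadcasts over samples
x = rand(batch, n_in)    # assumed layout: one sample per row

y = x * W .+ b           # (batch × n_in) * (n_in × n_out) gives batch × n_out
size(y)                  # (4, 3)< / code > < / pre >
< p >
With samples stored as rows, the input has to sit on the left of the product so that every row is mapped through the same weights.
< / p >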
< p >
However,
< code > @net< / code >
does not simply save us some keystrokes; it's the secret sauce that makes everything else in Flux go. For example, it analyses the code for the forward function so that it can differentiate it or convert it to a TensorFlow graph.
< / p >
< p >
The above code is almost exactly how
< code > Affine< / code >
is defined in Flux itself! There's no difference between "library-level" and "user-level" models, so making your code reusable doesn't involve a lot of extra complexity. Moreover, much more complex models than
< code > Affine< / code >
are equally simple to define.
< / p >
< h2 >
< a class = "nav-anchor" id = "Models-in-templates-1" href = "#Models-in-templates-1" >
Models in templates
< / a >
< / h2 >
< p >
< code > @net< / code >
models can contain sub-models as well as just array parameters:
< / p >
< pre > < code class = "language-julia" > @net type TLP
  first
  second
  function (x)
    l1 = σ(first(x))
    l2 = softmax(second(l1))
  end
end< / code > < / pre >
< p >
Just as above, this is roughly equivalent to writing:
< / p >
< pre > < code class = "language-julia" > type TLP
  first
  second
end
function (self::TLP)(x)
  l1 = σ(self.first(x))
  l2 = softmax(self.second(l1))
end< / code > < / pre >
< p >
Clearly, the
< code > first< / code >
and
< code > second< / code >
parameters are not arrays here; they should be models themselves, each producing a result when called with an input array
< code > x< / code >
. The
< code > Affine< / code >
layer fits the bill, so we can instantiate
< code > TLP< / code >
with two of them:
< / p >
< pre > < code class = "language-julia" > model = TLP(Affine(10, 20),
            Affine(20, 15))
x1 = rand(10) # the first layer expects 10 inputs
model(x1) # [0.057852,0.0409741,0.0609625,0.0575354 ...< / code > < / pre >
< p >
You may recognise this as being equivalent to
< / p >
< pre > < code class = "language-julia" > Chain(
  Affine(10, 20), σ,
  Affine(20, 15), softmax)< / code > < / pre >
< p >
given that it's just a sequence of calls. For simple networks
< code > Chain< / code >
is completely fine, although the
< code > @net< / code >
version is more powerful as we can (for example) reuse the output
< code > l1< / code >
more than once.
< / p >
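< p >
As a rough illustration of that kind of reuse, the sketch below is plain Julia (1.x syntax, where
< code > struct< / code >
replaces the
< code > type< / code >
keyword used on this page) rather than
< code > @net< / code >
. The
< code > SkipTLP< / code >
name, the
< code > dense< / code >
helper and the skip connection are hypothetical, chosen only to show an intermediate value feeding two later steps, which a linear sequence of calls cannot express directly.
< / p >
< pre > < code class = "language-julia" > # Hypothetical example in plain Julia 1.x; not part of Flux.
σ(x) = 1 ./ (1 .+ exp.(-x))            # element-wise logistic function

# Stand-in for an affine layer: a closure over its own parameters
dense(n_in, n_out) = let W = randn(n_out, n_in), b = randn(n_out)
    x -> W * x .+ b
end

struct SkipTLP
    first
    second
end

function (m::SkipTLP)(x)
    l1 = σ(m.first(x))
    l2 = σ(m.second(l1))
    l1 .+ l2                            # l1 is used a second time here
end

model = SkipTLP(dense(4, 6), dense(6, 6))
y = model(rand(4))
length(y)                               # 6< / code > < / pre >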
< h2 >
< a class = "nav-anchor" id = "Constructors-1" href = "#Constructors-1" >
Constructors
< / a >
< / h2 >
< p >
< code > Affine< / code >
has two array parameters,
< code > W< / code >
and
< code > b< / code >
. Just like any other Julia type, it's easy to instantiate an
< code > Affine< / code >
layer with parameters of our choosing:
< / p >
< pre > < code class = "language-julia" > a = Affine(rand(10, 20), rand(1, 20))< / code > < / pre >
< p >
However, for convenience and to avoid errors, we'd probably rather specify the input and output dimension instead:
< / p >
< pre > < code class = "language-julia" > a = Affine(10, 20)< / code > < / pre >
< p >
This is easy to implement using the usual Julia syntax for constructors:
< / p >
< pre > < code class = "language-julia" > Affine(in::Integer, out::Integer) =
Affine(randn(in, out), randn(1, out))< / code > < / pre >
< p >
In practice, these constructors tend to take the parameter initialisation function as an argument so that it's more easily customisable, and use
< code > Flux.initn< / code >
by default (which is equivalent to
< code > randn(...)/100< / code >
). So
< code > Affine< / code >'s constructor really looks like this:
< / p >
< pre > < code class = "language-julia" > Affine(in::Integer, out::Integer; init = initn) =
Affine(init(in, out), init(1, out))< / code > < / pre >
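< p >
The same pattern can be tried out in plain Julia without Flux. In the sketch below,
< code > MyAffine< / code >
is redefined with Julia 1.x
< code > struct< / code >
syntax, and
< code > small_randn< / code >
is a hypothetical stand-in for
< code > Flux.initn< / code >
based only on the description above; neither is Flux's actual code.
< / p >
< pre > < code class = "language-julia" > # Hypothetical stand-in for Flux.initn, per the description randn(...)/100
small_randn(dims...) = randn(dims...) / 100

struct MyAffine
    W
    b
end

# Convenience constructor: the caller may swap in any initialiser
MyAffine(n_in::Integer, n_out::Integer; init = small_randn) =
    MyAffine(init(n_in, n_out), init(1, n_out))

a = MyAffine(10, 20)                 # default: small random parameters
z = MyAffine(10, 20, init = zeros)   # custom: all-zero parameters
size(a.W), size(a.b)                 # ((10, 20), (1, 20))< / code > < / pre >
< p >
Passing the initialiser as a keyword argument keeps the common case short while still letting a caller choose, for example, zeros or a differently scaled distribution.
< / p >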
< h2 >
< a class = "nav-anchor" id = "Supported-syntax-1" href = "#Supported-syntax-1" >
Supported syntax
< / a >
< / h2 >
< p >
The syntax used to define a forward pass like
< code > x -> x*W + b< / code >
behaves exactly like Julia code for the most part. However, it's important to remember that it's defining a dataflow graph, not a general Julia expression. In practice this means that anything side-effectful, or things like control flow and
< code > println< / code >s, won't work as expected. In future we'll continue to expand support for Julia syntax and features.
< / p >
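< p >
A toy tracer makes the distinction concrete. The sketch below is plain Julia 1.x and is not how Flux is implemented;
< code > Node< / code >
,
< code > lift< / code >
,
< code > trace< / code >
and
< code > evaluate< / code >
are hypothetical names. Calling the forward function once on a symbolic input records a graph of operations, so a
< code > println< / code >
in the function body runs during that single tracing call and leaves nothing behind in the graph.
< / p >
< pre > < code class = "language-julia" > # Toy illustration only; Flux's real machinery differs.
struct Node
    op::Symbol           # :input, :const, :* or :+
    args::Vector{Any}    # child nodes, or the wrapped value for :const
end

lift(v) = v isa Node ? v : Node(:const, Any[v])

# Arithmetic on nodes records the operation instead of computing it
Base.:*(a::Node, b) = Node(:*, Any[a, lift(b)])
Base.:+(a::Node, b) = Node(:+, Any[a, lift(b)])

# Run `f` once on a symbolic input to capture its graph
trace(f) = f(Node(:input, Any[]))

# Interpret a recorded graph for a concrete input `x`
function evaluate(n::Node, x)
    if n.op == :input
        return x
    elseif n.op == :const
        return n.args[1]
    end
    l = evaluate(n.args[1], x)
    r = evaluate(n.args[2], x)
    return n.op == :* ? l * r : l + r
end

W, b = [1.0 2.0; 3.0 4.0], [10.0 20.0]

forward(x) = (println("tracing"); x * W + b)

graph = trace(forward)            # "tracing" is printed here, once
evaluate(graph, [1.0 1.0])        # [14.0 26.0]; nothing is printed< / code > < / pre >
< p >
Control flow behaves the same way in this toy: an
< code > if< / code >
on a symbolic input would be decided once at tracing time, not per evaluation.
< / p >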
< footer >
< hr / >
< a class = "previous" href = "basics.html" >
< span class = "direction" >
Previous
< / span >
< span class = "title" >
Model Building Basics
< / span >
< / a >
< a class = "next" href = "recurrent.html" >
< span class = "direction" >
Next
< / span >
< span class = "title" >
Recurrence
< / span >
< / a >
< / footer >
< / article >
< / body >
< / html >