build based on 2b25491

parent ef54392020
commit 51e0d92a59
@@ -97,7 +97,7 @@ Contributing & Help
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a146c478f3f39fd990ce94ce4a55ac0974a9f5b1/docs/src/contributing.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/2b25491e40f2d595f063497a32f52227bca84f12/docs/src/contributing.md">
 <span class="fa">

 </span>
@@ -100,7 +100,7 @@ Logistic Regression
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a146c478f3f39fd990ce94ce4a55ac0974a9f5b1/docs/src/examples/logreg.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/2b25491e40f2d595f063497a32f52227bca84f12/docs/src/examples/logreg.md">
 <span class="fa">

 </span>
@@ -97,7 +97,7 @@ Home
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a146c478f3f39fd990ce94ce4a55ac0974a9f5b1/docs/src/index.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/2b25491e40f2d595f063497a32f52227bca84f12/docs/src/index.md">
 <span class="fa">

 </span>
@@ -97,7 +97,7 @@ Internals
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a146c478f3f39fd990ce94ce4a55ac0974a9f5b1/docs/src/internals.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/2b25491e40f2d595f063497a32f52227bca84f12/docs/src/internals.md">
 <span class="fa">

 </span>
@@ -63,8 +63,18 @@ The Model
 </a>
 </li>
 <li>
-<a class="toctext" href="#An-MNIST-Example-1">
-An MNIST Example
+<a class="toctext" href="#Combining-Models-1">
+Combining Models
+</a>
+</li>
+<li>
+<a class="toctext" href="#A-Function-in-Model's-Clothing-1">
+A Function in Model's Clothing
+</a>
+</li>
+<li>
+<a class="toctext" href="#The-Template-1">
+The Template
 </a>
 </li>
 </ul>
@@ -113,7 +123,7 @@ First Steps
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a146c478f3f39fd990ce94ce4a55ac0974a9f5b1/docs/src/manual/basics.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/2b25491e40f2d595f063497a32f52227bca84f12/docs/src/manual/basics.md">
 <span class="fa">

 </span>
@@ -123,8 +133,8 @@ First Steps
 <hr/>
 </header>
 <h1>
-<a class="nav-anchor" id="Basic-Usage-1" href="#Basic-Usage-1">
-Basic Usage
+<a class="nav-anchor" id="First-Steps-1" href="#First-Steps-1">
+First Steps
 </a>
 </h1>
 <h2>
@@ -132,8 +142,14 @@ Basic Usage
 Installation
 </a>
 </h2>
+<p>
+<em>
+... Charging Ion Capacitors ...
+</em>
+</p>
 <pre><code class="language-julia">Pkg.clone("https://github.com/MikeInnes/DataFlow.jl")
-Pkg.clone("https://github.com/MikeInnes/Flux.jl")</code></pre>
+Pkg.clone("https://github.com/MikeInnes/Flux.jl")
+using Flux</code></pre>
 <h2>
 <a class="nav-anchor" id="The-Model-1" href="#The-Model-1">
 The Model
@@ -141,31 +157,146 @@ The Model
 </h2>
 <p>
 <em>
-Charging Ion Capacitors...
+... Initialising Photon Beams ...
 </em>
 </p>
 <p>
-The core concept in Flux is that of the
+The core concept in Flux is the
 <em>
 model
 </em>
-. A model is simply a function with parameters. In Julia, we might define the following function:
+. A model (or "layer") is simply a function with parameters. For example, in plain Julia code, we could define the following function to represent a logistic regression (or simple neural network):
 </p>
 <pre><code class="language-julia">W = randn(3,5)
 b = randn(3)
 affine(x) = W*x + b

-x1 = randn(5)
-affine(x1)
-> 3-element Array{Float64,1}:
- -0.0215644
- -4.07343
- 0.312591</code></pre>
+x1 = rand(5) # [0.581466,0.606507,0.981732,0.488618,0.415414]
+y1 = softmax(affine(x1)) # [0.32676,0.0974173,0.575823]</code></pre>
+<p>
+<code>affine</code>
+is simply a function which takes some vector
+<code>x1</code>
+and outputs a new one
+<code>y1</code>
+. For example,
+<code>x1</code>
+could be data from an image and
+<code>y1</code>
+could be predictions about the content of that image. However,
+<code>affine</code>
+isn't static. It has
+<em>
+parameters
+</em>
+
+<code>W</code>
+and
+<code>b</code>
+, and if we tweak those parameters we'll tweak the result – hopefully to make the predictions more accurate.
+</p>
+<p>
+This is all well and good, but we usually want to have more than one affine layer in our network; writing out the above definition to create new sets of parameters every time would quickly become tedious. For that reason, we want to use a
+<em>
+template
+</em>
+which creates these functions for us:
+</p>
+<pre><code class="language-julia">affine1 = Affine(5, 5)
+affine2 = Affine(5, 5)
+
+softmax(affine1(x1)) # [0.167952, 0.186325, 0.176683, 0.238571, 0.23047]
+softmax(affine2(x1)) # [0.125361, 0.246448, 0.21966, 0.124596, 0.283935]</code></pre>
+<p>
+We just created two separate
+<code>Affine</code>
+layers, and each contains its own version of
+<code>W</code>
+and
+<code>b</code>
+, leading to a different result when called with our data. It's easy to define templates like
+<code>Affine</code>
+ourselves (see
+<a href="basics.html#The-Template-1">
+The Template
+</a>
+), but Flux provides
+<code>Affine</code>
+out of the box.
+</p>
 <h2>
-<a class="nav-anchor" id="An-MNIST-Example-1" href="#An-MNIST-Example-1">
-An MNIST Example
+<a class="nav-anchor" id="Combining-Models-1" href="#Combining-Models-1">
+Combining Models
 </a>
 </h2>
+<p>
+<em>
+... Inflating Graviton Zeppelins ...
+</em>
+</p>
+<p>
+A more complex model usually involves many basic layers like
+<code>affine</code>
+, where we use the output of one layer as the input to the next:
+</p>
+<pre><code class="language-julia">mymodel1(x) = softmax(affine2(σ(affine1(x))))
+mymodel1(x1) # [0.187935, 0.232237, 0.169824, 0.230589, 0.179414]</code></pre>
+<p>
+This syntax is again a little unwieldy for larger networks, so Flux provides another template of sorts to create the function for us:
+</p>
+<pre><code class="language-julia">mymodel2 = Chain(affine1, σ, affine2, softmax)
+mymodel2(x1) # [0.187935, 0.232237, 0.169824, 0.230589, 0.179414]</code></pre>
+<p>
+<code>mymodel2</code>
+is exactly equivalent to
+<code>mymodel1</code>
+because it simply calls the provided functions in sequence. We don't have to predefine the affine layers and can also write this as:
+</p>
+<pre><code class="language-julia">mymodel3 = Chain(
+  Affine(5, 5), σ,
+  Affine(5, 5), softmax)</code></pre>
+<p>
+You now understand enough to take a look at the
+<a href="../examples/logreg.html">
+logistic regression
+</a>
+example, if you haven't already.
+</p>
+<h2>
+<a class="nav-anchor" id="A-Function-in-Model's-Clothing-1" href="#A-Function-in-Model's-Clothing-1">
+A Function in Model's Clothing
+</a>
+</h2>
+<p>
+<em>
+... Booting Dark Matter Transmogrifiers ...
+</em>
+</p>
+<p>
+We noted above that a "model" is just a function with some trainable parameters. This goes both ways; a normal Julia function like
+<code>exp</code>
+is really just a model with 0 parameters. Flux doesn't care, and anywhere that you use one, you can use the other. For example,
+<code>Chain</code>
+will happily work with regular functions:
+</p>
+<pre><code class="language-julia">foo = Chain(exp, sum, log)
+foo([1,2,3]) == 3.408 == log(sum(exp([1,2,3])))</code></pre>
+<p>
+This unification opens up the floor for some powerful features, which we'll discuss later in the guide.
+</p>
+<h2>
+<a class="nav-anchor" id="The-Template-1" href="#The-Template-1">
+The Template
+</a>
+</h2>
+<p>
+<em>
+... Calculating Tax Expenses ...
+</em>
+</p>
+<p>
+[WIP]
+</p>
 <footer>
 <hr/>
 <a class="previous" href="../index.html">
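The page added in this hunk rests on one claim: a model is just a function with parameters, and combining models is just function composition. That claim can be sketched in plain Julia without Flux at all. The `softmax` and `chain` helpers below are hand-rolled stand-ins for the Flux versions the diff mentions, not Flux's actual implementations:

```julia
# Hand-rolled softmax: exponentiate element-wise, then normalise to sum to 1.
softmax(xs) = exp.(xs) ./ sum(exp.(xs))

W = randn(3, 5)
b = randn(3)
affine(x) = W * x .+ b      # a "model": a function closing over parameters W, b

x1 = rand(5)
y1 = softmax(affine(x1))    # a 3-element probability vector

# Chaining layers is just applying the functions left to right.
chain(fs...) = x -> foldl((acc, f) -> f(acc), fs, init = x)
mymodel = chain(affine, softmax)
mymodel(x1) == y1           # true: the same operations in the same order
```

Tweaking `W` or `b` changes what `affine`, and hence `mymodel`, computes, which is exactly the sense in which the new text calls a model "a function with parameters".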
@@ -97,7 +97,7 @@ Debugging
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a146c478f3f39fd990ce94ce4a55ac0974a9f5b1/docs/src/manual/debugging.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/2b25491e40f2d595f063497a32f52227bca84f12/docs/src/manual/debugging.md">
 <span class="fa">

 </span>
@@ -97,7 +97,7 @@ Recurrence
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a146c478f3f39fd990ce94ce4a55ac0974a9f5b1/docs/src/manual/recurrent.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/2b25491e40f2d595f063497a32f52227bca84f12/docs/src/manual/recurrent.md">
 <span class="fa">

 </span>
@@ -25,9 +25,9 @@ var documenterSearchIndex = {"docs": [
 },

 {
-"location": "manual/basics.html#Basic-Usage-1",
+"location": "manual/basics.html#First-Steps-1",
 "page": "First Steps",
-"title": "Basic Usage",
+"title": "First Steps",
 "category": "section",
 "text": ""
 },
@@ -37,7 +37,7 @@ var documenterSearchIndex = {"docs": [
 "page": "First Steps",
 "title": "Installation",
 "category": "section",
-"text": "Pkg.clone(\"https://github.com/MikeInnes/DataFlow.jl\")\nPkg.clone(\"https://github.com/MikeInnes/Flux.jl\")"
+"text": "... Charging Ion Capacitors ...Pkg.clone(\"https://github.com/MikeInnes/DataFlow.jl\")\nPkg.clone(\"https://github.com/MikeInnes/Flux.jl\")\nusing Flux"
 },
@@ -45,15 +45,31 @@ var documenterSearchIndex = {"docs": [
 "page": "First Steps",
 "title": "The Model",
 "category": "section",
-"text": "Charging Ion Capacitors...The core concept in Flux is that of the model. A model is simply a function with parameters. In Julia, we might define the following function:W = randn(3,5)\nb = randn(3)\naffine(x) = W*x + b\n\nx1 = randn(5)\naffine(x1)\n> 3-element Array{Float64,1}:\n -0.0215644\n -4.07343 \n 0.312591"
+"text": "... Initialising Photon Beams ...The core concept in Flux is the model. A model (or \"layer\") is simply a function with parameters. For example, in plain Julia code, we could define the following function to represent a logistic regression (or simple neural network):W = randn(3,5)\nb = randn(3)\naffine(x) = W*x + b\n\nx1 = rand(5) # [0.581466,0.606507,0.981732,0.488618,0.415414]\ny1 = softmax(affine(x1)) # [0.32676,0.0974173,0.575823]affine is simply a function which takes some vector x1 and outputs a new one y1. For example, x1 could be data from an image and y1 could be predictions about the content of that image. However, affine isn't static. It has parameters W and b, and if we tweak those parameters we'll tweak the result – hopefully to make the predictions more accurate.This is all well and good, but we usually want to have more than one affine layer in our network; writing out the above definition to create new sets of parameters every time would quickly become tedious. For that reason, we want to use a template which creates these functions for us:affine1 = Affine(5, 5)\naffine2 = Affine(5, 5)\n\nsoftmax(affine1(x1)) # [0.167952, 0.186325, 0.176683, 0.238571, 0.23047]\nsoftmax(affine2(x1)) # [0.125361, 0.246448, 0.21966, 0.124596, 0.283935]We just created two separate Affine layers, and each contains its own version of W and b, leading to a different result when called with our data. It's easy to define templates like Affine ourselves (see The Template), but Flux provides Affine out of the box."
 },

 {
-"location": "manual/basics.html#An-MNIST-Example-1",
+"location": "manual/basics.html#Combining-Models-1",
 "page": "First Steps",
-"title": "An MNIST Example",
+"title": "Combining Models",
 "category": "section",
-"text": ""
+"text": "... Inflating Graviton Zeppelins ...A more complex model usually involves many basic layers like affine, where we use the output of one layer as the input to the next:mymodel1(x) = softmax(affine2(σ(affine1(x))))\nmymodel1(x1) # [0.187935, 0.232237, 0.169824, 0.230589, 0.179414]This syntax is again a little unwieldy for larger networks, so Flux provides another template of sorts to create the function for us:mymodel2 = Chain(affine1, σ, affine2, softmax)\nmymodel2(x1) # [0.187935, 0.232237, 0.169824, 0.230589, 0.179414]mymodel2 is exactly equivalent to mymodel1 because it simply calls the provided functions in sequence. We don't have to predefine the affine layers and can also write this as:mymodel3 = Chain(\n  Affine(5, 5), σ,\n  Affine(5, 5), softmax)You now understand enough to take a look at the logistic regression example, if you haven't already."
+},
+
+{
+"location": "manual/basics.html#A-Function-in-Model's-Clothing-1",
+"page": "First Steps",
+"title": "A Function in Model's Clothing",
+"category": "section",
+"text": "... Booting Dark Matter Transmogrifiers ...We noted above that a \"model\" is just a function with some trainable parameters. This goes both ways; a normal Julia function like exp is really just a model with 0 parameters. Flux doesn't care, and anywhere that you use one, you can use the other. For example, Chain will happily work with regular functions:foo = Chain(exp, sum, log)\nfoo([1,2,3]) == 3.408 == log(sum(exp([1,2,3])))This unification opens up the floor for some powerful features, which we'll discuss later in the guide."
+},
+
+{
+"location": "manual/basics.html#The-Template-1",
+"page": "First Steps",
+"title": "The Template",
+"category": "section",
+"text": "... Calculating Tax Expenses ...[WIP]"
 },

 {
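The `foo = Chain(exp, sum, log)` example this commit adds can be sanity-checked in plain Julia. This is a hand-rolled stand-in, not Flux's `Chain`; note that on current Julia the exponential must be broadcast over an array (`exp.(xs)`), which the older vectorized `exp` in the snippet did implicitly, and that the docs' `3.408` is a rounded value:

```julia
# Plain-Julia equivalent of the Chain(exp, sum, log) pipeline:
# exponentiate element-wise, sum, then take the log (a log-sum-exp).
foo(xs) = log(sum(exp.(xs)))

foo([1, 2, 3])  # ≈ 3.40760596, which the docs round to 3.408
```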