diff --git a/latest/contributing.html b/latest/contributing.html
index 74bddff2..9a9454ff 100644
--- a/latest/contributing.html
+++ b/latest/contributing.html
@@ -97,7 +97,7 @@ Contributing & Help
-
+
diff --git a/latest/examples/logreg.html b/latest/examples/logreg.html
index 9424772f..7c12f1ab 100644
--- a/latest/examples/logreg.html
+++ b/latest/examples/logreg.html
@@ -100,7 +100,7 @@ Logistic Regression
-
+
diff --git a/latest/index.html b/latest/index.html
index 786be3fa..904c655a 100644
--- a/latest/index.html
+++ b/latest/index.html
@@ -97,7 +97,7 @@ Home
-
+
diff --git a/latest/internals.html b/latest/internals.html
index dc94a649..c73eeae6 100644
--- a/latest/internals.html
+++ b/latest/internals.html
@@ -97,7 +97,7 @@ Internals
-
+
diff --git a/latest/manual/basics.html b/latest/manual/basics.html
index da09dc22..bc25f778 100644
--- a/latest/manual/basics.html
+++ b/latest/manual/basics.html
@@ -63,8 +63,18 @@ The Model
-
-An MNIST Example
+
+Combining Models
+
+
+
+
+A Function in Model's Clothing
+
+
+
+
+The Template
@@ -113,7 +123,7 @@ First Steps
-
+
@@ -123,8 +133,8 @@ First Steps
@@ -132,8 +142,14 @@ Basic Usage
Installation
+
+
+... Charging Ion Capacitors ...
+
+
Pkg.clone("https://github.com/MikeInnes/DataFlow.jl")
-Pkg.clone("https://github.com/MikeInnes/Flux.jl")
+Pkg.clone("https://github.com/MikeInnes/Flux.jl")
+using Flux
-Charging Ion Capacitors...
+... Initialising Photon Beams ...
-The core concept in Flux is that of the
+The core concept in Flux is the
model
-. A model is simply a function with parameters. In Julia, we might define the following function:
+. A model (or "layer") is simply a function with parameters. For example, in plain Julia code, we could define the following function to represent a logistic regression (or simple neural network):
W = randn(3,5)
b = randn(3)
affine(x) = W*x + b
-x1 = randn(5)
-affine(x1)
-> 3-element Array{Float64,1}:
- -0.0215644
- -4.07343
- 0.312591
+x1 = rand(5) # [0.581466,0.606507,0.981732,0.488618,0.415414]
+y1 = softmax(affine(x1)) # [0.32676,0.0974173,0.575823]
+
+affine
+ is simply a function which takes some vector
+x1
+ and outputs a new one
+y1
+. For example,
+x1
+ could be data from an image and
+y1
+ could be predictions about the content of that image. However,
+affine
+ isn't static. It has
+
+parameters
+
+
+W
+ and
+b
+, and if we tweak those parameters we'll tweak the result – hopefully to make the predictions more accurate.
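+
+As a minimal sketch of what that tweaking means in practice (the particular adjustments below are made up purely for illustration), we can poke the parameters directly and watch the output move:
+
+W[1, 1] += 0.1            # nudge one weight
+b[1]    -= 0.05           # nudge one bias
+softmax(affine(x1))       # the predictions shift accordingly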
+
+
+This is all well and good, but we usually want to have more than one affine layer in our network; writing out the above definition to create new sets of parameters every time would quickly become tedious. For that reason, we want to use a
+
+template
+
+ which creates these functions for us:
+
+affine1 = Affine(5, 5)
+affine2 = Affine(5, 5)
+
+softmax(affine1(x1)) # [0.167952, 0.186325, 0.176683, 0.238571, 0.23047]
+softmax(affine2(x1)) # [0.125361, 0.246448, 0.21966, 0.124596, 0.283935]
+
+We just created two separate
+Affine
+ layers, and each contains its own version of
+W
+ and
+b
+, leading to a different result when called with our data. It's easy to define templates like
+Affine
+ ourselves (see
+
+The Template
+
+), but Flux provides
+Affine
+ out of the box.
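+
+To get a feel for what a template is doing for us, here is a rough hand-rolled stand-in (the my_affine name and the closure approach are hypothetical, just for illustration – Flux's own Affine additionally exposes its parameters so they can be trained):
+
+# Hypothetical helper: each call returns a fresh affine function
+# with its own W and b baked in.
+function my_affine(in, out)
+  W = randn(out, in)
+  b = randn(out)
+  x -> W*x + b
+end
+
+affine3 = my_affine(5, 5)
+affine4 = my_affine(5, 5)   # independent parameters from affine3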
+
+
+
+... Inflating Graviton Zeppelins ...
+
+
+
+A more complex model usually involves many basic layers like
+affine
+, where we use the output of one layer as the input to the next:
+
+mymodel1(x) = softmax(affine2(σ(affine1(x))))
+mymodel1(x1) # [0.187935, 0.232237, 0.169824, 0.230589, 0.179414]
+
+This syntax is again a little unwieldy for larger networks, so Flux provides another template of sorts to create the function for us:
+
+mymodel2 = Chain(affine1, σ, affine2, softmax)
+mymodel2(x1) # [0.187935, 0.232237, 0.169824, 0.230589, 0.179414]
+
+mymodel2
+ is exactly equivalent to
+mymodel1
+ because it simply calls the provided functions in sequence. We don't have to predefine the affine layers and can also write this as:
+
+mymodel3 = Chain(
+ Affine(5, 5), σ,
+ Affine(5, 5), softmax)
+
+You now know enough to take a look at the
+
+logistic regression
+
+ example, if you haven't already.
+
+
+
+
+... Booting Dark Matter Transmogrifiers ...
+
+
+
+We noted above that a "model" is just a function with some trainable parameters. This goes both ways; a normal Julia function like
+exp
+ is really just a model with 0 parameters. Flux doesn't care, and anywhere that you use one, you can use the other. For example,
+Chain
+ will happily work with regular functions:
+
+foo = Chain(exp, sum, log)
+foo([1,2,3]) == 3.408 == log(sum(exp([1,2,3])))
+
+This unification opens up the floor for some powerful features, which we'll discuss later in the guide.
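+
+For example, nothing stops us from mixing layers and ordinary functions inside a single chain (the doubling step here is an arbitrary stand-in for any plain function you might want to slot in):
+
+mixed = Chain(Affine(5, 5), x -> 2 .* x, softmax)
+mixed(x1)   # runs the layer, then the plain function, then softmax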
+
+
+
+
+... Calculating Tax Expenses ...
+
+
+
+[WIP]
+
diff --git a/latest/manual/debugging.html b/latest/manual/debugging.html
index 143ea473..762d0acb 100644
--- a/latest/manual/debugging.html
+++ b/latest/manual/debugging.html
@@ -97,7 +97,7 @@ Debugging
-
+
diff --git a/latest/manual/recurrent.html b/latest/manual/recurrent.html
index 3f1e46cb..98fc8679 100644
--- a/latest/manual/recurrent.html
+++ b/latest/manual/recurrent.html
@@ -97,7 +97,7 @@ Recurrence
-
+
diff --git a/latest/search_index.js b/latest/search_index.js
index dcaf15d4..c3a38704 100644
--- a/latest/search_index.js
+++ b/latest/search_index.js
@@ -25,9 +25,9 @@ var documenterSearchIndex = {"docs": [
},
{
- "location": "manual/basics.html#Basic-Usage-1",
+ "location": "manual/basics.html#First-Steps-1",
"page": "First Steps",
- "title": "Basic Usage",
+ "title": "First Steps",
"category": "section",
"text": ""
},
@@ -37,7 +37,7 @@ var documenterSearchIndex = {"docs": [
"page": "First Steps",
"title": "Installation",
"category": "section",
- "text": "Pkg.clone(\"https://github.com/MikeInnes/DataFlow.jl\")\nPkg.clone(\"https://github.com/MikeInnes/Flux.jl\")"
+ "text": "... Charging Ion Capacitors ...Pkg.clone(\"https://github.com/MikeInnes/DataFlow.jl\")\nPkg.clone(\"https://github.com/MikeInnes/Flux.jl\")\nusing Flux"
},
{
@@ -45,15 +45,31 @@ var documenterSearchIndex = {"docs": [
"page": "First Steps",
"title": "The Model",
"category": "section",
- "text": "Charging Ion Capacitors...The core concept in Flux is that of the model. A model is simply a function with parameters. In Julia, we might define the following function:W = randn(3,5)\nb = randn(3)\naffine(x) = W*x + b\n\nx1 = randn(5)\naffine(x1)\n> 3-element Array{Float64,1}:\n -0.0215644\n -4.07343 \n 0.312591"
+ "text": "... Initialising Photon Beams ...The core concept in Flux is the model. A model (or \"layer\") is simply a function with parameters. For example, in plain Julia code, we could define the following function to represent a logistic regression (or simple neural network):W = randn(3,5)\nb = randn(3)\naffine(x) = W*x + b\n\nx1 = rand(5) # [0.581466,0.606507,0.981732,0.488618,0.415414]\ny1 = softmax(affine(x1)) # [0.32676,0.0974173,0.575823]affine is simply a function which takes some vector x1 and outputs a new one y1. For example, x1 could be data from an image and y1 could be predictions about the content of that image. However, affine isn't static. It has parameters W and b, and if we tweak those parameters we'll tweak the result – hopefully to make the predictions more accurate.This is all well and good, but we usually want to have more than one affine layer in our network; writing out the above definition to create new sets of parameters every time would quickly become tedious. For that reason, we want to use a template which creates these functions for us:affine1 = Affine(5, 5)\naffine2 = Affine(5, 5)\n\nsoftmax(affine1(x1)) # [0.167952, 0.186325, 0.176683, 0.238571, 0.23047]\nsoftmax(affine2(x1)) # [0.125361, 0.246448, 0.21966, 0.124596, 0.283935]We just created two separate Affine layers, and each contains its own version of W and b, leading to a different result when called with our data. It's easy to define templates like Affine ourselves (see The Template), but Flux provides Affine out of the box."
},
{
- "location": "manual/basics.html#An-MNIST-Example-1",
+ "location": "manual/basics.html#Combining-Models-1",
"page": "First Steps",
- "title": "An MNIST Example",
+ "title": "Combining Models",
"category": "section",
- "text": ""
+ "text": "... Inflating Graviton Zeppelins ...A more complex model usually involves many basic layers like affine, where we use the output of one layer as the input to the next:mymodel1(x) = softmax(affine2(σ(affine1(x))))\nmymodel1(x1) # [0.187935, 0.232237, 0.169824, 0.230589, 0.179414]This syntax is again a little unwieldy for larger networks, so Flux provides another template of sorts to create the function for us:mymodel2 = Chain(affine1, σ, affine2, softmax)\nmymodel2(x2) # [0.187935, 0.232237, 0.169824, 0.230589, 0.179414]mymodel2 is exactly equivalent to mymodel1 because it simply calls the provided functions in sequence. We don't have to predefine the affine layers and can also write this as:mymodel3 = Chain(\n Affine(5, 5), σ,\n Affine(5, 5), softmax)You now know understand enough to take a look at the logistic regression example, if you haven't already."
+},
+
+{
+ "location": "manual/basics.html#A-Function-in-Model's-Clothing-1",
+ "page": "First Steps",
+ "title": "A Function in Model's Clothing",
+ "category": "section",
+ "text": "... Booting Dark Matter Transmogrifiers ...We noted above that a \"model\" is just a function with some trainable parameters. This goes both ways; a normal Julia function like exp is really just a model with 0 parameters. Flux doesn't care, and anywhere that you use one, you can use the other. For example, Chain will happily work with regular functions:foo = Chain(exp, sum, log)\nfoo([1,2,3]) == 3.408 == log(sum(exp([1,2,3])))This unification opens up the floor for some powerful features, which we'll discuss later in the guide."
+},
+
+{
+ "location": "manual/basics.html#The-Template-1",
+ "page": "First Steps",
+ "title": "The Template",
+ "category": "section",
+ "text": "... Calculating Tax Expenses ...[WIP]"
},
{