diff --git a/latest/contributing.html b/latest/contributing.html
index 98840478..7f80ade0 100644
--- a/latest/contributing.html
+++ b/latest/contributing.html
@@ -104,7 +104,7 @@ Contributing & Help
diff --git a/latest/examples/logreg.html b/latest/examples/logreg.html
index 4b36d4f4..20f162b1 100644
--- a/latest/examples/logreg.html
+++ b/latest/examples/logreg.html
@@ -107,7 +107,7 @@ Logistic Regression
diff --git a/latest/index.html b/latest/index.html
index 32f2a9ab..79dd75c3 100644
--- a/latest/index.html
+++ b/latest/index.html
@@ -46,7 +46,13 @@ Version Home
+Pkg.clone("https://github.com/MikeInnes/DataFlow.jl")
+Pkg.clone("https://github.com/MikeInnes/Flux.jl")
+using Flux
+You'll also need a backend to run real training, if you don't have one already. Choose from MXNet or TensorFlow (MXNet is the recommended option if you're not sure):
+Pkg.add("MXNet") # or "TensorFlow"
diff --git a/latest/models/basics.html b/latest/models/basics.html
index 6dfc673e..6c73e805 100644
--- a/latest/models/basics.html
+++ b/latest/models/basics.html
@@ -57,11 +57,6 @@ Building Models First Steps
@@ -143,23 +138,11 @@ First Steps
-Pkg.clone("https://github.com/MikeInnes/DataFlow.jl")
-Pkg.clone("https://github.com/MikeInnes/Flux.jl")
-using Flux
W = randn(3,5)
b = randn(3)
-affine(x) = W*x + b
+affine(x) = W * x + b
x1 = rand(5) # [0.581466,0.606507,0.981732,0.488618,0.415414]
y1 = softmax(affine(x1)) # [0.32676,0.0974173,0.575823]
@@ -232,7 +215,7 @@ The Template
-), but Flux provides Affine out of the box.
+), but Flux provides Affine out of the box, so we'll use that for now.
-We noted above that a "model" is just a function with some trainable parameters. This goes both ways; a normal Julia function like exp is really just a model with 0 parameters. Flux doesn't care, and anywhere that you use one, you can use the other. For example, Chain will happily work with regular functions:
+We noted above that a "model" is a function with some number of trainable parameters. This goes both ways; a normal Julia function like exp is effectively a model with 0 parameters. Flux doesn't care, and anywhere that you use one, you can use the other. For example, Chain will happily work with regular functions:
foo = Chain(exp, sum, log)
foo([1,2,3]) == 3.408 == log(sum(exp([1,2,3])))
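Because layers and ordinary functions are interchangeable, you can also mix the two inside one Chain. The following is only an illustrative sketch (it assumes the Affine layer and the softmax function used earlier on this page):
mixed = Chain(Affine(5, 3), softmax)  # a trainable layer followed by a plain function
mixed(rand(5))                        # a 3-element vector of probabilities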
-This unification opens up the floor for some powerful features, which we'll discuss later in the guide.
-[WIP]
+So how does the Affine template work? We don't want to duplicate the code above whenever we need more than one affine layer:
+
+W₁, b₁ = randn(...)
+affine₁(x) = W₁*x + b₁
+W₂, b₂ = randn(...)
+affine₂(x) = W₂*x + b₂
+model = Chain(affine₁, affine₂)
+Here's one way we could solve this: just keep the parameters in a Julia type, and define how that type acts as a function:
+type MyAffine
+ W
+ b
+end
+
+# Use the `MyAffine` layer as a model
+(l::MyAffine)(x) = l.W * x + l.b
+
+# Convenience constructor
+MyAffine(in::Integer, out::Integer) =
+ MyAffine(randn(out, in), randn(out))
+
+model = Chain(MyAffine(5, 5), MyAffine(5, 5))
+
+model(x1) # [-1.54458,0.492025,0.88687,1.93834,-4.70062]
+
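Since MyAffine is an ordinary Julia type, its parameters are just fields, and nothing stops you from inspecting or tweaking them by hand. A small sketch using the definitions above (the shapes follow from the convenience constructor):
a = MyAffine(5, 3)        # W is 3×5, b has length 3
size(a.W), length(a.b)    # ((3, 5), 3)
a.b -= 0.1 * randn(3)     # the parameters are plain arrays, so ad-hoc updates work too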
+This is much better: we can now make as many affine layers as we want. This is a very common pattern, so to make it more convenient we can use the @net macro:
+
+@net type MyAffine
+ W
+ b
+ x -> W * x + b
+end
+
+The function provided, x -> W * x + b, will be used when MyAffine is used as a model; it's just a shorter way of defining the (::MyAffine)(x) method above.
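The forward function can be any expression of the type's fields, so variations on the template are easy to sketch. For instance, a hypothetical sigmoid-activated layer (not a layer Flux itself ships, and assuming Flux's σ activation function) would look much the same:
@net type SigmoidAffine
  W
  b
  x -> σ(W * x + b)
end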
+
+However, @net does not simply save us some keystrokes; it's the secret sauce that makes everything else in Flux go. For example, it analyses the code for the forward function so that it can differentiate it or convert it to a TensorFlow graph.
+
+The above code is almost exactly how Affine is defined in Flux itself! There's no difference between "library-level" and "user-level" models, so making your code reusable doesn't involve a lot of extra complexity. Moreover, much more complex models than Affine are equally simple to define, and equally close to the mathematical notation; read on to find out how.