diff --git a/latest/models/basics.html b/latest/models/basics.html
index b1d4a22f..45c898f2 100644
--- a/latest/models/basics.html
+++ b/latest/models/basics.html
@@ -340,7 +340,68 @@
 The above code is almost exactly how Affine is defined in Flux itself! There's no difference between "library-level" and "user-level" models, so making your code reusable doesn't involve a lot of extra complexity. Moreover, much more complex models than Affine
-are equally simple to define, and equally close to the mathematical notation; read on to find out how.
+are equally simple to define.
+
+Sub-Templates
+
+@net models can contain sub-models as well as just array parameters:
+
+@net type TLP
+  first
+  second
+  function (x)
+    l1 = σ(first(x))
+    l2 = softmax(second(l1))
+  end
+end
+
+Just as above, this is roughly equivalent to writing:
+
+type TLP
+  first
+  second
+end
+
+function (self::TLP)(x)
+  l1 = σ(self.first(x))
+  l2 = softmax(self.second(l1))
+end
+
+Clearly, the first and second parameters are not arrays here, but should be models themselves, and produce a result when called with an input array x. The Affine layer fits the bill, so we can instantiate TLP with two of them:
+
+model = TLP(Affine(10, 20),
+            Affine(20, 15))
+x1 = rand(10)  # the first layer, Affine(10, 20), expects a length-10 input
+model(x1) # [0.057852,0.0409741,0.0609625,0.0575354 ...
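+
+Assuming the Affine(in, out) convention from earlier in this section, the data flows 10 → 20 → 15, so the output above is a softmax over 15 values; a quick sanity check (illustrative, not part of the original example):
+
+length(model(rand(10))) == 15  # the second layer produces 15 outputs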
+
+You may recognise this model as being equivalent to
+
+Chain(
+  Affine(10, 20), σ,
+  Affine(20, 15), softmax)
+
+given that it's just a sequence of calls. For simple networks Chain is completely fine, although the @net version is more powerful, as we can (for example) reuse the output l1 more than once.
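+
+For example, here is a rough sketch of a model with a skip connection (the SkipTLP name and the dimensions are invented for illustration, assuming the same @net machinery): l1 is consumed twice, once by second and once by the addition, which a plain Chain cannot express.
+
+@net type SkipTLP
+  first
+  second
+  function (x)
+    l1 = σ(first(x))           # hidden activation, reused below
+    softmax(second(l1) + l1)   # l1 used twice: fed through second and added back in
+  end
+end
+
+skip = SkipTLP(Affine(10, 20), Affine(20, 20))  # second must preserve l1's length so the addition lines up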