Here's one way we could solve this: just keep the parameters in a Julia type, and define how that type acts as a function:

```julia
type MyAffine
  W
  b
end

# Use the `MyAffine` layer as a model
(l::MyAffine)(x) = l.W * x + l.b

# Convenience constructor
MyAffine(in::Integer, out::Integer) =
  MyAffine(randn(out, in), randn(out))

model = Chain(MyAffine(5, 5), MyAffine(5, 5))

model(x1) # [-1.54458, 0.492025, 0.88687, 1.93834, -4.70062]
```

This is much better: we can now make as many affine layers as we want. This is a very common pattern, so to make it more convenient we can use the `@net` macro:

```julia
@net type MyAffine
  W
  b
  x -> W * x + b
end
```

The function provided, `x -> W * x + b`, will be used when `MyAffine` is used as a model; it's just a shorter way of defining the `(::MyAffine)(x)` method above.

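A model defined with `@net` is used exactly like the handwritten version. A minimal check, assuming the default field-order constructor `MyAffine(W, b)`:

```julia
a = MyAffine(randn(3, 5), randn(3))  # W is 3×5, b has length 3

a(rand(5))  # computes W * x + b, just as the manual (::MyAffine)(x) method did
```
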
However, `@net` does not simply save us some keystrokes; it's the secret sauce that makes everything else in Flux go. For example, it analyses the code for the forward function so that it can differentiate it or convert it to a TensorFlow graph.

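As a rough sketch of what that enables (hedged: `tf` refers to the TensorFlow backend function from the Flux documentation of this era, and the exact call shown is an assumption on our part):

```julia
# Assumed API from the same era of Flux: convert a model so it runs on the
# TensorFlow backend, using the graph that `@net` extracted from the forward code.
model   = Chain(MyAffine(5, 5), MyAffine(5, 5))
tfmodel = tf(model)
```
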
The above code is almost exactly how `Affine` is defined in Flux itself! There's no difference between "library-level" and "user-level" models, so making your code reusable doesn't involve a lot of extra complexity. Moreover, much more complex models than `Affine` are equally simple to define.

`@net` models can contain sub-models as well as just array parameters:

```julia
@net type TLP
  first
  second
  function (x)
    l1 = σ(first(x))
    l2 = softmax(second(l1))
  end
end
```

Just as above, this is roughly equivalent to writing:

```julia
type TLP
  first
  second
end

function (self::TLP)(x)
  l1 = σ(self.first(x))
  l2 = softmax(self.second(l1))
end
```

Clearly, the `first` and `second` parameters are not arrays here, but should be models themselves, and produce a result when called with an input array `x`. The `Affine` layer fits the bill, so we can instantiate `TLP` with two of them:

given that it's just a sequence of calls. For simple networks `Chain` is completely fine, although the `@net` version is more powerful as we can (for example) reuse the output `l1` more than once.
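
For instance, a network that feeds `l1` into two places has no direct `Chain` equivalent. A hypothetical sketch (the `ResTLP` name and the size-preserving `second` layer are our own illustration):

```julia
@net type ResTLP
  first
  second
  function (x)
    l1 = σ(first(x))
    l2 = second(l1)
    softmax(l2 + l1)  # `l1` is reused here, so `second` must preserve its size
  end
end
```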