notes on submodels
However, `@net` does not simply save us some keystrokes; it's the secret sauce that makes everything else in Flux go. For example, it analyses the code for the forward function so that it can differentiate it or convert it to a TensorFlow graph.
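To make that concrete, here is a sketch of handing the same definition to a backend. The `tf` conversion function is an assumption about this version's API, shown purely for illustration:

```julia
# Sketch only: `tf` stands in for the TensorFlow backend conversion; the
# exact name and signature are assumptions, not confirmed by this excerpt.
m  = Affine(10, 20)  # an `@net` model defined in plain Julia
y  = m(rand(10))     # the forward pass runs as ordinary Julia code
mt = tf(m)           # the same definition, converted to a TensorFlow graph
yt = mt(rand(10))    # the forward pass now executes via TensorFlow
```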
The `@net` definition above is almost exactly how `Affine` is defined in Flux itself! There's no difference between "library-level" and "user-level" models, so making your code reusable doesn't involve a lot of extra complexity. Moreover, much more complex models than `Affine` are equally simple to define.
### Sub-Templates
`@net` models can contain sub-models as well as just array parameters:

```julia
@net type TLP
  first
  second
  function (x)
    l1 = σ(first(x))
    l2 = softmax(second(l1))
  end
end
```

Just as above, this is roughly equivalent to writing:

```julia
type TLP
  first
  second
end

function (self::TLP)(x)
  l1 = σ(self.first(x))
  l2 = softmax(self.second(l1))
end
```

Clearly, the `first` and `second` parameters are not arrays here, but should themselves be models that produce a result when called with an input array `x`. The `Affine` layer fits the bill, so we can instantiate `TLP` with two of them:

```julia
model = TLP(Affine(10, 20),
            Affine(20, 15))

x1 = rand(10) # input length matches the first Affine layer's 10 inputs
model(x1) # [0.057852,0.0409741,0.0609625,0.0575354 ...
```

You may recognise this as being equivalent to

```julia
Chain(
  Affine(10, 20), σ,
  Affine(20, 15), softmax)
```

given that it's just a sequence of calls. For simple networks, `Chain` is completely fine, although the `@net` version is more powerful, since we can (for example) reuse the output `l1` more than once.
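As a sketch of that flexibility, here is a hypothetical model (not one from the library) in which `l1` feeds both into `second` and into the final combination, something a plain `Chain` cannot express:

```julia
# Hypothetical sketch: `l1` is consumed twice, once by `second` and once
# in the final sum, which has no direct `Chain` equivalent. This assumes
# `first` and `second` both produce vectors of the same length.
@net type SkipTLP
  first
  second
  function (x)
    l1 = σ(first(x))
    l2 = second(l1)
    softmax(l1 + l2) # `l1` reused a second time here
  end
end
```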