commit 636eae24bd
parent 1f785e699e

    build based on 9b01608
@@ -104,7 +104,7 @@ Contributing & Help
</a>
</li>
</ul>
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a1e35ea2c8838cc7e4e12fa290bab19d1b49c48a/docs/src/contributing.md">
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/9b016085b05ad55c0bc4b96bf53be70d063adfa2/docs/src/contributing.md">
<span class="fa">

</span>
@@ -107,7 +107,7 @@ Logistic Regression
</a>
</li>
</ul>
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a1e35ea2c8838cc7e4e12fa290bab19d1b49c48a/docs/src/examples/logreg.md">
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/9b016085b05ad55c0bc4b96bf53be70d063adfa2/docs/src/examples/logreg.md">
<span class="fa">

</span>
@@ -110,7 +110,7 @@ Home
</a>
</li>
</ul>
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a1e35ea2c8838cc7e4e12fa290bab19d1b49c48a/docs/src/index.md">
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/9b016085b05ad55c0bc4b96bf53be70d063adfa2/docs/src/index.md">
<span class="fa">

</span>
@@ -104,7 +104,7 @@ Internals
</a>
</li>
</ul>
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a1e35ea2c8838cc7e4e12fa290bab19d1b49c48a/docs/src/internals.md">
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/9b016085b05ad55c0bc4b96bf53be70d063adfa2/docs/src/internals.md">
<span class="fa">

</span>
@@ -128,7 +128,7 @@ Model Building Basics
</a>
</li>
</ul>
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a1e35ea2c8838cc7e4e12fa290bab19d1b49c48a/docs/src/models/basics.md">
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/9b016085b05ad55c0bc4b96bf53be70d063adfa2/docs/src/models/basics.md">
<span class="fa">

</span>
@@ -368,7 +368,7 @@ Just as above, this is roughly equivalent to writing:
end

function (self::TLP)(x)
  l1 = σ(self.first)
  l1 = σ(self.first(x))
  l2 = softmax(self.second(l1))
end</code></pre>
<p>
@@ -107,7 +107,7 @@ Debugging
</a>
</li>
</ul>
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a1e35ea2c8838cc7e4e12fa290bab19d1b49c48a/docs/src/models/debugging.md">
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/9b016085b05ad55c0bc4b96bf53be70d063adfa2/docs/src/models/debugging.md">
<span class="fa">

</span>
@@ -122,7 +122,77 @@ Debugging Models
</a>
</h1>
<p>
[WIP]
Let's take our two-layer perceptron as an example again, running on MXNet:
</p>
<pre><code class="language-julia">@net type TLP
  first
  second
  function (x)
    l1 = σ(first(x))
    l2 = softmax(second(l1))
  end
end

model = TLP(Affine(10, 20), Affine(21, 15))

mxmodel = mxnet(model, (1, 20))</code></pre>
<p>
Unfortunately, this model has a (fairly obvious) typo, which means that the code above won't run. Instead we get an error message:
</p>
<pre><code class="language-julia">InferShape Error in dot5: [20:37:39] src/operator/./matrix_op-inl.h:271: Check failed: (lshape[1]) == (rshape[0]) dot shape error: (15,21) X (20,1)
 in Flux.Affine at affine.jl:8
 in TLP at test.jl:6
 in mxnet(::TLP, ::Tuple{Int64,Int64}) at model.jl:40
 in mxnet(::TLP, ::Vararg{Any,N} where N) at backend.jl:20</code></pre>
<p>
Most frameworks would only give the error message here – not so helpful if you have thousands of nodes in your computational graph. However, Flux is able to give good error reports
<em>
even when no Julia code has been run
</em>
, e.g. when running on a backend like MXNet. This enables us to pinpoint the source of the error very quickly even in a large model.
</p>
<p>
In this case, we can immediately see that the error occurred within an
<code>Affine</code>
layer. There are two such layers, but this one was called from the second line of
<code>TLP</code>
, so it must be the second
<code>Affine</code>
layer we defined. The layer expected an input of length 21 but got 20 instead.
</p>
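<p>
For illustration, a minimal sketch of the corrected constructor, assuming the fix is simply to size the second layer's input to match the first layer's 20-dimensional output:
</p>
<pre><code class="language-julia"># Corrected sketch: the second Affine layer now expects the
# 20-dimensional output produced by the first layer.
model = TLP(Affine(10, 20), Affine(20, 15))

mxmodel = mxnet(model, (1, 20))</code></pre>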
<p>
Of course, often a stack trace isn't enough to figure out the source of an error. Another option is to simply step through the execution of the model using Gallium. While handy, however, stepping isn't always the best way to get a "bird's eye view" of the code. For that, Flux provides a macro called
<code>@shapes</code>
:
</p>
<pre><code class="language-julia">julia> @shapes model(rand(5,10))

# /Users/mike/test.jl, line 18:
gull = σ(Affine(10, 20)(Input()[1]::(5,10))::(5,20))::(5,20)
# /Users/mike/.julia/v0.6/Flux/src/layers/affine.jl, line 8:
lobster = gull * _::(21,15) + _::(1,15)
# /Users/mike/test.jl, line 19:
raven = softmax(lobster)</code></pre>
<p>
This is a lot like Julia's own
<code>code_warntype</code>
; but instead of annotating expressions with types, we display their shapes. As a lowered form it has some quirks; input arguments are represented by
<code>Input()[N]</code>
and parameters by an underscore.
</p>
<p>
This makes the problem fairly obvious. We tried to multiply the output of the first layer
<code>(5, 20)</code>
by a parameter
<code>(21, 15)</code>
; the inner dimensions should have been equal.
</p>
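<p>
The same mismatch can be sketched with ordinary Julia matrices, independent of Flux (illustrative only):
</p>
<pre><code class="language-julia">a = rand(5, 20)      # stands in for the first layer's output
w = rand(21, 15)     # stands in for the mis-sized weight parameter

# a * w              # would throw DimensionMismatch: 20 columns vs. 21 rows

w2 = rand(20, 15)    # a correctly sized weight
size(a * w2)         # (5, 15): the inner dimensions now agree</code></pre>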
<p>
Notice that while the first
<code>Affine</code>
layer is displayed as-is, the second was inlined and we see a reference to where the
<code>W * x + b</code>
line was defined in Flux's source code. In this way Flux makes it easy to drill down into problem areas, without showing you the full graph of thousands of nodes at once.
</p>
<footer>
<hr/>
@@ -107,7 +107,7 @@ Recurrence
</a>
</li>
</ul>
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/a1e35ea2c8838cc7e4e12fa290bab19d1b49c48a/docs/src/models/recurrent.md">
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/9b016085b05ad55c0bc4b96bf53be70d063adfa2/docs/src/models/recurrent.md">
<span class="fa">

</span>
@@ -77,7 +77,7 @@ var documenterSearchIndex = {"docs": [
"page": "Model Building Basics",
"title": "Sub-Templates",
"category": "section",
"text": "@net models can contain sub-models as well as just array parameters:@net type TLP\n first\n second\n function (x)\n l1 = σ(first(x))\n l2 = softmax(second(l1))\n end\nendJust as above, this is roughly equivalent to writing:type TLP\n first\n second\nend\n\nfunction (self::TLP)(x)\n l1 = σ(self.first)\n l2 = softmax(self.second(l1))\nendClearly, the first and second parameters are not arrays here, but should be models themselves, and produce a result when called with an input array x. The Affine layer fits the bill so we can instantiate TLP with two of them:model = TLP(Affine(10, 20),\n Affine(20, 15))\nx1 = rand(20)\nmodel(x1) # [0.057852,0.0409741,0.0609625,0.0575354 ...You may recognise this as being equivalent toChain(\n Affine(10, 20), σ\n Affine(20, 15), softmax)given that it's just a sequence of calls. For simple networks Chain is completely fine, although the @net version is more powerful as we can (for example) reuse the output l1 more than once."
"text": "@net models can contain sub-models as well as just array parameters:@net type TLP\n first\n second\n function (x)\n l1 = σ(first(x))\n l2 = softmax(second(l1))\n end\nendJust as above, this is roughly equivalent to writing:type TLP\n first\n second\nend\n\nfunction (self::TLP)(x)\n l1 = σ(self.first(x))\n l2 = softmax(self.second(l1))\nendClearly, the first and second parameters are not arrays here, but should be models themselves, and produce a result when called with an input array x. The Affine layer fits the bill so we can instantiate TLP with two of them:model = TLP(Affine(10, 20),\n Affine(20, 15))\nx1 = rand(20)\nmodel(x1) # [0.057852,0.0409741,0.0609625,0.0575354 ...You may recognise this as being equivalent toChain(\n Affine(10, 20), σ\n Affine(20, 15), softmax)given that it's just a sequence of calls. For simple networks Chain is completely fine, although the @net version is more powerful as we can (for example) reuse the output l1 more than once."
},

{
@@ -109,7 +109,7 @@ var documenterSearchIndex = {"docs": [
"page": "Debugging",
"title": "Debugging Models",
"category": "section",
"text": "[WIP]"
"text": "Let's take our two-layer perceptron as an example again, running on MXNet:@net type TLP\n first\n second\n function (x)\n l1 = σ(first(x))\n l2 = softmax(second(l1))\n end\nend\n\nmodel = TLP(Affine(10, 20), Affine(21, 15))\n\nmxmodel = mxnet(model, (1, 20))Unfortunately, this model has a (fairly obvious) typo, which means that the code above won't run. Instead we get an error message:InferShape Error in dot5: [20:37:39] src/operator/./matrix_op-inl.h:271: Check failed: (lshape[1]) == (rshape[0]) dot shape error: (15,21) X (20,1)\n in Flux.Affine at affine.jl:8\n in TLP at test.jl:6\n in mxnet(::TLP, ::Tuple{Int64,Int64}) at model.jl:40\n in mxnet(::TLP, ::Vararg{Any,N} where N) at backend.jl:20Most frameworks would only give the error message here – not so helpful if you have thousands of nodes in your computational graph. However, Flux is able to give good error reports even when no Julia code has been run, e.g. when running on a backend like MXNet. This enables us to pinpoint the source of the error very quickly even in a large model.In this case, we can immediately see that the error occurred within an Affine layer. There are two such layers, but this one was called from the second line of TLP, so it must be the second Affine layer we defined. The layer expected an input of length 21 but got 20 instead.Of course, often a stack trace isn't enough to figure out the source of an error. Another option is to simply step through the execution of the model using Gallium. While handy, however, stepping isn't always the best way to get a \"bird's eye view\" of the code. For that, Flux provides a macro called @shapes:julia> @shapes model(rand(5,10))\n\n# /Users/mike/test.jl, line 18:\ngull = σ(Affine(10, 20)(Input()[1]::(5,10))::(5,20))::(5,20)\n# /Users/mike/.julia/v0.6/Flux/src/layers/affine.jl, line 8:\nlobster = gull * _::(21,15) + _::(1,15)\n# /Users/mike/test.jl, line 19:\nraven = softmax(lobster)This is a lot like Julia's own code_warntype; but instead of annotating expressions with types, we display their shapes. As a lowered form it has some quirks; input arguments are represented by Input()[N] and parameters by an underscore.This makes the problem fairly obvious. We tried to multiply the output of the first layer (5, 20) by a parameter (21, 15); the inner dimensions should have been equal.Notice that while the first Affine layer is displayed as-is, the second was inlined and we see a reference to where the W * x + b line was defined in Flux's source code. In this way Flux makes it easy to drill down into problem areas, without showing you the full graph of thousands of nodes at once."
},

{