Flux.jl/previews/PR1085/performance/index.html

<!DOCTYPE html>
<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Performance Tips · Flux</title><script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', 'UA-36890222-9', 'auto');
ga('send', 'pageview', {'page': location.pathname + location.search + location.hash});
</script><link href="https://fonts.googleapis.com/css?family=Lato|Roboto+Mono" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.11.1/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link href="../assets/flux.css" rel="stylesheet" type="text/css"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><div class="docs-package-name"><span class="docs-autofit">Flux</span></div><form class="docs-search" action="../search/"><input class="docs-search-query" id="documenter-search-query" name="q" type="text" placeholder="Search docs"/></form><ul class="docs-menu"><li><a class="tocitem" href="../">Home</a></li><li><span class="tocitem">Building Models</span><ul><li><a class="tocitem" href="../models/basics/">Basics</a></li><li><a class="tocitem" href="../models/recurrence/">Recurrence</a></li><li><a class="tocitem" href="../models/regularisation/">Regularisation</a></li><li><a class="tocitem" href="../models/layers/">Model Reference</a></li><li><a class="tocitem" 
href="../models/advanced/">Advanced Model Building</a></li><li><a class="tocitem" href="../models/nnlib/">NNlib</a></li></ul></li><li><span class="tocitem">Handling Data</span><ul><li><a class="tocitem" href="../data/onehot/">One-Hot Encoding</a></li><li><a class="tocitem" href="../data/dataloader/">DataLoader</a></li></ul></li><li><span class="tocitem">Training Models</span><ul><li><a class="tocitem" href="../training/optimisers/">Optimisers</a></li><li><a class="tocitem" href="../training/training/">Training</a></li></ul></li><li><a class="tocitem" href="../gpu/">GPU Support</a></li><li><a class="tocitem" href="../saving/">Saving &amp; Loading</a></li><li><a class="tocitem" href="../ecosystem/">The Julia Ecosystem</a></li><li class="is-active"><a class="tocitem" href>Performance Tips</a><ul class="internal"><li><a class="tocitem" href="#Don&#39;t-write-loss-functions-that-use-a-non-constant-globally-declared-model.-1"><span>Don&#39;t write loss functions that use a non-constant globally declared model.</span></a></li><li><a class="tocitem" href="#Don&#39;t-use-more-precision-than-you-need-1"><span>Don&#39;t use more precision than you need</span></a></li><li><a class="tocitem" href="#Preserve-inputs&#39;-types-1"><span>Preserve inputs&#39; types</span></a></li><li><a class="tocitem" href="#Evaluate-batches-as-Matrices-of-features-1"><span>Evaluate batches as Matrices of features</span></a></li></ul></li><li><a class="tocitem" href="../community/">Community</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>Performance Tips</a></li></ul><ul 
class="is-hidden-tablet"><li class="is-active"><a href>Performance Tips</a></li></ul></nav><div class="docs-right"><a class="
m = Chain(Dense(784, 32, σ), Dense(32, 10), softmax)
loss(x, y) = Flux.mse(m(x), y)

Flux.train!(loss, Flux.params(m), data, Descent(0.1))</code></pre><p>In this bad code, the model <code>m</code> is a non-constant global. It is used inside the function <code>loss</code>, which is one of the most performance-critical parts of this code. It will be slow because the compiler can&#39;t rely on <code>m</code> always having the same type: a non-constant global can be rebound to a value of any type at any time.</p><h3 id="Correct-alternatives:-1"><a class="docs-heading-anchor" href="#Correct-alternatives:-1">Correct alternatives:</a><a class="docs-heading-anchor-permalink" href="#Correct-alternatives:-1" title="Permalink"></a></h3><h4 id="Mark-the-model-const-1"><a class="docs-heading-anchor" href="#Mark-the-model-const-1">Mark the model <code>const</code></a><a class="docs-heading-anchor-permalink" href="#Mark-the-model-const-1" title="Permalink"></a></h4><pre><code class="language-julia">data = ...
const m = Chain(Dense(784, 32, σ), Dense(32, 10), softmax)
loss(x, y) = Flux.mse(m(x), y)

Flux.train!(loss, Flux.params(m), data, Descent(0.1))</code></pre><p>Similarly, any other non-constant global that is used inside functions should also be made constant.</p><h4 id="Put-everything-in-a-main-function:-1"><a class="docs-heading-anchor" href="#Put-everything-in-a-main-function:-1">Put everything in a main function:</a><a class="docs-heading-anchor-permalink" href="#Put-everything-in-a-main-function:-1" title="Permalink"></a></h4><p>Inside a function, <code>m</code> is a local variable, so the compiler can infer its type. For more flexibility, you could even make this function take <code>m</code> as an argument. It doesn&#39;t matter if <code>m</code> was originally declared as a non-const global: once it has been passed in as an argument, it is a local variable inside the function.</p><pre><code class="language-julia">function main(data)
    m = Chain(Dense(784, 32, σ), Dense(32, 10), softmax)
    loss(x, y) = Flux.mse(m(x), y)

    Flux.train!(loss, Flux.params(m), data, Descent(0.1))
end</code></pre><h4 id="Make-the-loss-function-actually-close-over-m.-1"><a class="docs-heading-anchor" href="#Make-the-loss-function-actually-close-over-m.-1">Make the loss function actually close over <code>m</code>.</a><a class="docs-heading-anchor-permalink" href="#Make-the-loss-function-actually-close-over-m.-1" title="Permalink"></a></h4><p>Closures can be very useful here: the returned anonymous function captures <code>mdl</code> as a local variable, so it no longer refers to the global <code>m</code>.</p><pre><code class="language-julia">data = ...
m = Chain(Dense(784, 32, σ), Dense(32, 10), softmax)
get_loss_function(mdl) = (x, y) -&gt; Flux.mse(mdl(x), y)

Flux.train!(get_loss_function(m), Flux.params(m), data, Descent(0.1))</code></pre><p>This pattern is particularly applicable to callbacks.</p><h2 id="Don&#39;t-use-more-precision-than-you-need-1"><a class="docs-heading-anchor" href="#Don&#39;t-use-more-precision-than-you-need-1">Don&#39;t use more precision than you need</a><a class="docs-heading-anchor-permalink" href="#Don&#39;t-use-more-precision-than-you-need-1" title="Permalink"></a></h2><p>Flux works with all kinds of number types, but you often do not need <code>Float64</code> (let alone <code>BigFloat</code>). Switching to <code>Float32</code> can give a significant speed-up, not because the operations are faster, but because the memory usage is halved, which makes allocations much faster and reduces the overall memory footprint.</p><h2 id="Preserve-inputs&#39;-types-1"><a class="docs-heading-anchor" href="#Preserve-inputs&#39;-types-1">Preserve inputs&#39; types</a><a class="docs-heading-anchor-permalink" href="#Preserve-inputs&#39;-types-1" title="Permalink"></a></h2><p>Not only should your activation and loss functions be <a href="https://docs.julialang.org/en/v1/manual/performance-tips/#Write-%22type-stable%22-functions-1">type-stable</a>, they should also preserve the type of their inputs.</p><p>A very artificial example using an activation function like</p><pre><code class="language-none">    my_tanh(x) = Float64(tanh(x))</code></pre><p>will make performance on <code>Float32</code> input orders of magnitude slower than the normal <code>tanh</code> would, because it forces the dense layers to use slow mixed-type multiplication. Similar situations can occur in the loss function during backpropagation.</p><p>This means that if you change your data from, say, <code>Float64</code> to <code>Float32</code> (which should give a speed-up, see above), you will instead see a large slow-down.</p><p>This can occur sneakily, because type promotion can be triggered by interacting with numeric literals. For example, the following runs into the same problem as above, since <code>0.01</code> is a <code>Float64</code> literal:</p><pre><code class="language-none">    leaky_tanh(x) = 0.01*x + tanh(x)</code></pre><p>While one could change the activation function (e.g. to use <code>0.01f0 * x</code>), the idiomatic (and safe) way to avoid type casts whenever the input type changes is to use <code>oftype</code>:</p><pre><code class="language-none">    leaky_tanh(x) = oftype(x/1, 0.01)*x + tanh(x)</code></pre><h2 id="Evaluate-batches-as-Matrices-of-features-1"><a class="docs-heading-anchor" href="#Evaluate-batches-as-Matrices-of-features-1">Evaluate batches as Matrices of features</a><a class="docs-heading-anchor-permalink" href="#Evaluate-batches-as-Matrices-of-features-1" title="Permalink"></a></h2><p>While it can sometimes be tempting to process your observations (feature vectors) one at a time, e.g.</p><pre><code class="language-julia">function loss_total(xs::AbstractVector{&lt;:Vector}, ys::AbstractVector{&lt;:Vector})
    sum(zip(xs, ys)) do (x, y_target)
        y_pred = model(x) #  evaluate the model
        return loss(y_pred, y_target)
    end
end</code></pre><p>It is much faster to concatenate them into a matrix, because this will hit BLAS matrix-matrix multiplication, which is much faster than the equivalent sequence of matrix-vector multiplications. The improvement is large enough that it is worthwhile allocating new memory to store the observations contiguously.</p><pre><code class="language-julia">x_batch = reduce(hcat, xs)
y_batch = reduce(hcat, ys)
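
# A minimal sketch of the resulting layout, assuming hypothetical toy data
# (`xs_demo` and `x_demo` are illustrative names, not part of the original):
# each observation becomes one column, so three observations of length 2
# are stored as a 2×3 matrix with the same element type.
xs_demo = [Float32[1, 2], Float32[3, 4], Float32[5, 6]]
x_demo = reduce(hcat, xs_demo)
@assert size(x_demo) == (2, 3)
@assert x_demo[:, 2] == xs_demo[2]  # column 2 is the second observation
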
...
function loss_total(x_batch::Matrix, y_batch::Matrix)
    y_preds = model(x_batch)
    sum(loss.(y_preds, y_batch))
end</code></pre><p>When doing this kind of concatenation, use <code>reduce(hcat, xs)</code> rather than <code>hcat(xs...)</code>. This avoids the splatting penalty and hits the optimised <code>reduce</code> method.</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../ecosystem/">« The Julia Ecosystem</a><a class="docs-footer-nextpage" href="../community/">Community »</a></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> on <span class="colophon-date" title="Saturday 14 March 2020 18:23">Saturday 14 March 2020</span>. Using Julia version 1.3.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>