# Performance Tips

All the usual [Julia performance tips](https://docs.julialang.org/en/v1/manual/performance-tips/) apply. As always, [profiling your code](https://docs.julialang.org/en/v1/manual/profile/#Profiling-1) is generally a useful way of finding bottlenecks.
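For example, a quick profiling session with Julia's built-in `Profile` standard library might look like the sketch below, where `work` is a stand-in for whatever model code you want to measure:

```julia
using Profile

work() = sum(abs2, randn(1000, 1000))   # stand-in for your forward/backward pass

Profile.clear()
@profile for _ in 1:100
    work()
end
Profile.print()   # flat report of where time was spent
```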
Below follow some Flux-specific tips and reminders.

## Don't use more precision than you need.

Flux works great with all kinds of number types. But often you do not need to be working with, say, `Float64` (let alone `BigFloat`). Switching to `Float32` can give you a significant speed-up, not because the operations themselves are faster, but because the memory usage is halved, which makes allocations much cheaper. And you use less memory.
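For example, here is a minimal sketch (sizes made up for illustration) of keeping a computation in single precision end to end by constructing arrays as `Float32` from the start:

```julia
# Mixing Float64 data with Float32 weights silently promotes everything
# back to Float64, so keep data and parameters in Float32 throughout.
W = randn(Float32, 64, 100)   # weights created as Float32
b = zeros(Float32, 64)
x = rand(Float32, 100)        # e.g. convert existing data with Float32.(raw_x)

y = tanh.(W * x .+ b)         # stays Float32 all the way through
```

## Make sure your custom activation functions preserve the type of their inputs

Not only should your activation functions be type-stable, they should also preserve the type of their inputs. A deliberately artificial example is an activation like

```julia
my_tanh(x) = Float64(tanh(x))   # forces Float32 inputs up to Float64: avoid
```

which makes performance on `Float32` input far slower than plain `tanh`, because it forces slow mixed-type multiplication in the dense layers. Promotion can also happen sneakily via numeric literals; the following runs into the same problem, since the literal `0.01` is a `Float64`:

```julia
leaky_tanh(x) = 0.01x + tanh(x)   # 0.01 promotes a Float32 x to Float64
```

One could write the literal as `0.01f0`, but the idiomatic (and safe) fix is `oftype`:

```julia
leaky_tanh(x) = oftype(x/1, 0.01) * x + tanh(x)   # literal takes x's float type
```

## Evaluate batches as Matrices of features, rather than sequences of Vector features

It can sometimes be tempting to process your observations (feature vectors) one at a time, e.g. with a per-observation loss along these lines (a sketch; `model` and `loss` stand for your own model and per-observation loss):

```julia
function loss_total(xs::AbstractVector{<:Vector}, ys::AbstractVector{<:Vector})
    sum(zip(xs, ys)) do (x, y_target)
        y_pred = model(x)              # evaluate the model on one observation
        return loss(y_pred, y_target)
    end
end
```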
It is much faster to concatenate them into a matrix, as this will hit BLAS matrix-matrix multiplication, which is much faster than the equivalent sequence of matrix-vector multiplications, even though it means allocating new memory to store them contiguously.

```julia
x_batch = reduce(hcat, xs)
y_batch = reduce(hcat, ys)
...
function loss_total(x_batch::Matrix, y_batch::Matrix)
    y_preds = model(x_batch)           # one batched forward pass through the model
    sum(loss.(y_preds, y_batch))
end
```

When doing this kind of concatenation, use `reduce(hcat, xs)` rather than `hcat(xs...)`. This avoids the splatting penalty and hits the optimised `reduce` method.
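As a quick illustration of the difference (sizes made up):

```julia
xs = [rand(Float32, 100) for _ in 1:10_000]

x_batch = reduce(hcat, xs)    # hits the optimised reduce(hcat, ...) method
# x_batch = hcat(xs...)       # splats 10_000 arguments: much slower
```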