40 lines
5.3 KiB
HTML
40 lines
5.3 KiB
HTML
<!DOCTYPE html>
|
||
<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Regularisation · Flux</title><script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
|
||
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
|
||
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
|
||
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
|
||
|
||
ga('create', 'UA-36890222-9', 'auto');
|
||
ga('send', 'pageview');
|
||
</script><link href="https://cdnjs.cloudflare.com/ajax/libs/normalize/4.2.0/normalize.min.css" rel="stylesheet" type="text/css"/><link href="https://fonts.googleapis.com/css?family=Lato|Roboto+Mono" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/styles/default.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL="../.."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.2.0/require.min.js" data-main="../../assets/documenter.js"></script><script src="../../siteinfo.js"></script><script src="../../../versions.js"></script><link href="../../assets/documenter.css" rel="stylesheet" type="text/css"/><link href="../../assets/flux.css" rel="stylesheet" type="text/css"/></head><body><nav class="toc"><h1>Flux</h1><select id="version-selector" onChange="window.location.href=this.value" style="visibility: hidden"></select><form class="search" id="search-form" action="../../search/"><input id="search-query" name="q" type="text" placeholder="Search docs"/></form><ul><li><a class="toctext" href="../../">Home</a></li><li><span class="toctext">Building Models</span><ul><li><a class="toctext" href="../basics/">Basics</a></li><li><a class="toctext" href="../recurrence/">Recurrence</a></li><li class="current"><a class="toctext" href>Regularisation</a><ul class="internal"></ul></li><li><a class="toctext" href="../layers/">Model Reference</a></li></ul></li><li><span class="toctext">Training Models</span><ul><li><a class="toctext" href="../../training/optimisers/">Optimisers</a></li><li><a class="toctext" href="../../training/training/">Training</a></li></ul></li><li><a class="toctext" href="../../data/onehot/">One-Hot Encoding</a></li><li><a class="toctext" href="../../gpu/">GPU Support</a></li><li><a class="toctext" href="../../saving/">Saving & Loading</a></li><li><a class="toctext" href="../../performance/">Performance Tips</a></li><li><a class="toctext" href="../../community/">Community</a></li></ul></nav><article id="docs"><header><nav><ul><li>Building Models</li><li><a href>Regularisation</a></li></ul><a class="edit-page" href="https://github.com/FluxML/Flux.jl/blob/master/docs/src/models/regularisation.md"><span class="fa"></span> Edit on GitHub</a></nav><hr/><div id="topbar"><span>Regularisation</span><a class="fa fa-bars" href="#"></a></div></header><h1><a class="nav-anchor" id="Regularisation-1" href="#Regularisation-1">Regularisation</a></h1><p>Applying regularisation to model parameters is straightforward. We just need to apply an appropriate regulariser, such as <code>norm</code>, to each model parameter and add the result to the overall loss.</p><p>For example, say we have a simple regression.</p><pre><code class="language-julia">using Flux: crossentropy
|
||
m = Dense(10, 5)
|
||
loss(x, y) = crossentropy(softmax(m(x)), y)</code></pre><p>We can regularise this by taking the (L2) norm of the parameters, <code>m.W</code> and <code>m.b</code>.</p><pre><code class="language-julia">using LinearAlgebra
|
||
|
||
penalty() = norm(m.W) + norm(m.b)
|
||
loss(x, y) = crossentropy(softmax(m(x)), y) + penalty()</code></pre><p>When working with layers, Flux provides the <code>params</code> function to grab all parameters at once. We can easily penalise everything with <code>sum(norm, params)</code>.</p><pre><code class="language-julia">julia> params(m)
|
||
2-element Array{Any,1}:
|
||
param([0.355408 0.533092; … 0.430459 0.171498])
|
||
param([0.0, 0.0, 0.0, 0.0, 0.0])
|
||
|
||
julia> sum(norm, params(m))
|
||
26.01749952921026 (tracked)</code></pre><p>Here's a larger example with a multi-layer perceptron.</p><pre><code class="language-julia">m = Chain(
|
||
Dense(28^2, 128, relu),
|
||
Dense(128, 32, relu),
|
||
Dense(32, 10), softmax)
|
||
|
||
loss(x, y) = crossentropy(m(x), y) + sum(norm, params(m))
|
||
|
||
loss(rand(28^2), rand(10))</code></pre><p>One can also easily add per-layer regularisation via the <code>activations</code> function:</p><pre><code class="language-julia">julia> using Flux: activations
|
||
|
||
julia> c = Chain(Dense(10,5,σ),Dense(5,2),softmax)
|
||
Chain(Dense(10, 5, σ), Dense(5, 2), softmax)
|
||
|
||
julia> activations(c, rand(10))
|
||
3-element Array{Any,1}:
|
||
Float32[0.84682214, 0.6704139, 0.42177814, 0.257832, 0.36255655]
|
||
Float32[0.1501253, 0.073269576]
|
||
Float32[0.5192045, 0.48079553]
|
||
|
||
julia> sum(norm, ans)
|
||
2.1166067f0</code></pre><footer><hr/><a class="previous" href="../recurrence/"><span class="direction">Previous</span><span class="title">Recurrence</span></a><a class="next" href="../layers/"><span class="direction">Next</span><span class="title">Model Reference</span></a></footer></article></body></html>
|