Flux.jl/previews/PR1150/models/losses/index.html
2020-05-05 14:57:23 +00:00

<!DOCTYPE html>
<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Loss Functions · Flux</title><script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-36890222-9', 'auto');
ga('send', 'pageview', {'page': location.pathname + location.search + location.hash});
</script><link href="https://fonts.googleapis.com/css?family=Lato|Roboto+Mono" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.11.1/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL="../.."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../../assets/documenter.js"></script><script src="../../siteinfo.js"></script><script src="../../../versions.js"></script><link href="../../assets/flux.css" rel="stylesheet" type="text/css"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../../assets/themes/documenter-dark.css" data-theme-name="documenter-dark"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><div class="docs-package-name"><span class="docs-autofit">Flux</span></div><form class="docs-search" action="../../search/"><input class="docs-search-query" id="documenter-search-query" name="q" type="text" placeholder="Search docs"/></form><ul class="docs-menu"><li><a class="tocitem" href="../../">Home</a></li><li><span class="tocitem">Building Models</span><ul><li><a class="tocitem" href="../basics/">Basics</a></li><li><a class="tocitem" href="../recurrence/">Recurrence</a></li><li><a class="tocitem" href="../layers/">Model Reference</a></li><li class="is-active"><a class="tocitem" href>Loss Functions</a><ul class="internal"><li><a 
class="tocitem" href="#Loss-Functions-1"><span>Loss Functions</span></a></li></ul></li><li><a class="tocitem" href="../regularisation/">Regularisation</a></li><li><a class="tocitem" href="../advanced/">Advanced Model Building</a></li><li><a class="tocitem" href="../nnlib/">NNlib</a></li></ul></li><li><span class="tocitem">Handling Data</span><ul><li><a class="tocitem" href="../../data/onehot/">One-Hot Encoding</a></li><li><a class="tocitem" href="../../data/dataloader/">DataLoader</a></li></ul></li><li><span class="tocitem">Training Models</span><ul><li><a class="tocitem" href="../../training/optimisers/">Optimisers</a></li><li><a class="tocitem" href="../../training/training/">Training</a></li></ul></li><li><a class="tocitem" href="../../gpu/">GPU Support</a></li><li><a class="tocitem" href="../../saving/">Saving &amp; Loading</a></li><li><a class="tocitem" href="../../ecosystem/">The Julia Ecosystem</a></li><li><a class="tocitem" href="../../utilities/">Utility Functions</a></li><li><a class="tocitem" href="../../performance/">Performance Tips</a></li><li><a class="tocitem" href="../../datasets/">Datasets</a></li><li><a class="tocitem" href="../../community/">Community</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><nav class="breadcrumb"><ul class="is-hidden-mobile"><li><a class="is-disabled">Building Models</a></li><li class="is-active"><a href>Loss Functions</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>Loss Functions</a></li></ul></nav><div class="docs-right"><a class="docs-edit-link" href="https://github.com/FluxML/Flux.jl/blob/master/docs/src/models/losses.md" title="Edit on GitHub"><span class="docs-icon 
fab"></span><span class="docs-label is-hidden-touch">Edit on GitHub</span></a><a class="docs-settings-button fas fa-cog" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-sidebar-button fa fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a></div></header><article class="content" id="documenter-page"><h2 id="Loss-Functions-1"><a class="docs-heading-anchor" href="#Loss-Functions-1">Loss Functions</a><a class="docs-heading-anchor-permalink" href="#Loss-Functions-1" title="Permalink"></a></h2><p>Flux provides a large number of common loss functions used for training machine learning models.</p><p>Loss functions for supervised learning typically expect a target <code>y</code> and a prediction <code>ŷ</code> as inputs. In Flux&#39;s convention, the order of the arguments is the following:</p><pre><code class="language-julia">loss(ŷ, y)</code></pre><p>Most loss functions in Flux have an optional argument <code>agg</code>, denoting the type of aggregation performed over the batch:</p><pre><code class="language-julia">loss(ŷ, y) # defaults to `mean`
loss(ŷ, y, agg=sum) # use `sum` for reduction
loss(ŷ, y, agg=x-&gt;sum(x, dims=2)) # partial reduction
loss(ŷ, y, agg=x-&gt;mean(w .* x)) # weighted mean
loss(ŷ, y, agg=identity) # no aggregation</code></pre><h3 id="Losses-Reference-1"><a class="docs-heading-anchor" href="#Losses-Reference-1">Losses Reference</a><a class="docs-heading-anchor-permalink" href="#Losses-Reference-1" title="Permalink"></a></h3><article class="docstring"><header><a class="docstring-binding" id="Flux.mae" href="#Flux.mae"><code>Flux.mae</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">mae(ŷ, y; agg=mean)</code></pre><p>Return the loss corresponding to the mean absolute error:</p><pre><code class="language-none">agg(abs.(ŷ .- y))</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L1-L7">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.mse" href="#Flux.mse"><code>Flux.mse</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">mse(ŷ, y; agg=mean)</code></pre><p>Return the loss corresponding to the mean squared error:</p><pre><code class="language-none">agg((ŷ .- y).^2)</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L10-L16">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.msle" href="#Flux.msle"><code>Flux.msle</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">msle(ŷ, y; agg=mean, ϵ=eps(eltype(ŷ)))</code></pre><p>The loss corresponding to the mean squared logarithmic error, calculated as</p><pre><code class="language-none">agg((log.(ŷ .+ ϵ) .- log.(y .+ ϵ)).^2)</code></pre><p>The <code>ϵ</code> term provides numerical stability. 
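</p><p>For example, with hypothetical scalar values (a minimal sketch; assumes <code>using Flux</code>), an under-prediction incurs a larger loss than an over-prediction of the same absolute size:</p><pre><code class="language-julia">Flux.msle([0.5], [1.0])   # under-prediction: ≈ log(0.5)^2 ≈ 0.480
Flux.msle([1.5], [1.0])   # over-prediction:  ≈ log(1.5)^2 ≈ 0.164</code></pre><p>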
It penalizes an under-predicted estimate more than an over-predicted one.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L19-L28">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.huber_loss" href="#Flux.huber_loss"><code>Flux.huber_loss</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">huber_loss(ŷ, y; δ=1, agg=mean)</code></pre><p>Return the <a href="https://en.wikipedia.org/wiki/Huber_loss">Huber loss</a> given the prediction <code>ŷ</code> and true values <code>y</code>.</p><pre><code class="language-none"> | 0.5 * |ŷ - y|^2, for |ŷ - y| &lt;= δ
Huber loss = |
| δ * (|ŷ - y| - 0.5 * δ), otherwise</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L31-L40">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.crossentropy" href="#Flux.crossentropy"><code>Flux.crossentropy</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">crossentropy(ŷ, y; weight=nothing, dims=1, ϵ=eps(eltype(ŷ)),
logits=false, agg=mean)</code></pre><p>Return the cross entropy between the given probability distributions; calculated as</p><pre><code class="language-none">agg(.-sum(weight .* y .* log.(ŷ .+ ϵ); dims=dims))</code></pre><p><code>weight</code> can be <code>nothing</code>, a number or an array. <code>weight=nothing</code> acts like <code>weight=1</code> but is faster.</p><p>If <code>logits=true</code>, the input <code>ŷ</code> is first fed to a <a href="../nnlib/#NNlib.softmax"><code>softmax</code></a> layer.</p><p>See also: <a href="#Flux.logitcrossentropy"><code>Flux.logitcrossentropy</code></a>, <a href="#Flux.binarycrossentropy"><code>Flux.binarycrossentropy</code></a>, <a href="#Flux.logitbinarycrossentropy"><code>Flux.logitbinarycrossentropy</code></a></p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L52-L67">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.logitcrossentropy" href="#Flux.logitcrossentropy"><code>Flux.logitcrossentropy</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">logitcrossentropy(ŷ, y; weight=nothing, agg=mean, dims=1)</code></pre><p>Return the crossentropy computed after a <a href="../nnlib/#NNlib.logsoftmax"><code>Flux.logsoftmax</code></a> operation; calculated as</p><pre><code class="language-none">agg(.-sum(weight .* y .* logsoftmax(ŷ; dims=dims); dims=dims))</code></pre><p><code>logitcrossentropy(ŷ, y)</code> is mathematically equivalent to <a href="#Flux.crossentropy"><code>Flux.crossentropy(softmax(ŷ), y)</code></a> but it is more numerically stable.</p><p>See also: <a href="#Flux.crossentropy"><code>Flux.crossentropy</code></a>, <a href="#Flux.binarycrossentropy"><code>Flux.binarycrossentropy</code></a>, <a 
href="#Flux.logitbinarycrossentropy"><code>Flux.logitbinarycrossentropy</code></a></p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L76-L88">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.binarycrossentropy" href="#Flux.binarycrossentropy"><code>Flux.binarycrossentropy</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">binarycrossentropy(ŷ, y; agg=mean, ϵ=epseltype(ŷ), logits=false)</code></pre><p>Return <span>$-y*\log(ŷ + ϵ) - (1-y)*\log(1-ŷ + ϵ)$</span>. The <code>ϵ</code> term provides numerical stability.</p><p>Typically, the prediction <code>ŷ</code> is given by the output of a <a href="../nnlib/#NNlib.sigmoid"><code>sigmoid</code></a> activation. If <code>logits=true</code>, the input <code>ŷ</code> is first fed to a <a href="../nnlib/#NNlib.sigmoid"><code>sigmoid</code></a> activation. 
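</p><p>For example, with hypothetical values (a minimal sketch; assumes <code>using Flux</code>):</p><pre><code class="language-julia">y = [1.0, 0.0, 1.0]    # true labels
ŷ = [0.9, 0.1, 0.8]    # sigmoid outputs
Flux.binarycrossentropy(ŷ, y)  # ≈ mean([-log(0.9), -log(0.9), -log(0.8)]) ≈ 0.145</code></pre><p>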
See also: <a href="#Flux.crossentropy"><code>Flux.crossentropy</code></a>, <a href="#Flux.logitcrossentropy"><code>Flux.logitcrossentropy</code></a>, <a href="#Flux.logitbinarycrossentropy"><code>Flux.logitbinarycrossentropy</code></a></p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L93-L101">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.logitbinarycrossentropy" href="#Flux.logitbinarycrossentropy"><code>Flux.logitbinarycrossentropy</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">logitbinarycrossentropy(ŷ, y; agg=mean)</code></pre><p><code>logitbinarycrossentropy(ŷ, y)</code> is mathematically equivalent to <a href="#Flux.binarycrossentropy"><code>Flux.binarycrossentropy(σ(ŷ), y)</code></a> but it is more numerically stable.</p><p>See also: <a href="#Flux.crossentropy"><code>Flux.crossentropy</code></a>, <a href="#Flux.logitcrossentropy"><code>Flux.logitcrossentropy</code></a>, <a href="#Flux.binarycrossentropy"><code>Flux.binarycrossentropy</code></a></p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L109-L116">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.kldivergence" href="#Flux.kldivergence"><code>Flux.kldivergence</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">kldivergence(ŷ, y; dims=1, agg=mean, ϵ=eps(eltype(ŷ)))</code></pre><p>Return the <a href="https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence">Kullback-Leibler divergence</a> between the given arrays interpreted as probability distributions.</p><p>KL divergence is a measure of how much one probability distribution is 
different from another. It is always non-negative, and is zero only when the two distributions are equal everywhere.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L121-L131">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.poisson_loss" href="#Flux.poisson_loss"><code>Flux.poisson_loss</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">poisson_loss(ŷ, y; agg=mean, ϵ=eps(eltype(ŷ)))</code></pre><p>Loss function derived from the likelihood that a Poisson random variable with mean <code>ŷ</code> takes the value <code>y</code>. It is given by</p><pre><code class="language-none">agg(ŷ .- y .* log.(ŷ .+ ϵ))</code></pre><p><a href="https://peltarion.com/knowledge-center/documentation/modeling-view/build-an-ai-model/loss-functions/poisson">More information</a>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L138-L147">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.hinge" href="#Flux.hinge"><code>Flux.hinge</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">hinge(ŷ, y; agg=mean)</code></pre><p>Return the <a href="https://en.wikipedia.org/wiki/Hinge_loss">hinge loss</a> given the prediction <code>ŷ</code> and true labels <code>y</code> (containing 1 or -1); calculated as</p><pre><code class="language-none">agg(max.(0, 1 .- ŷ .* y))</code></pre><p>See also: <a href="#Flux.squared_hinge"><code>squared_hinge</code></a></p></div><a class="docs-sourcelink" target="_blank" 
href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L150-L159">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.squared_hinge" href="#Flux.squared_hinge"><code>Flux.squared_hinge</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">squared_hinge(ŷ, y; agg=mean)</code></pre><p>Return the squared hinge loss given the prediction <code>ŷ</code> and true labels <code>y</code> (containing 1 or -1); calculated as</p><pre><code class="language-none">agg(max.(0, 1 .- ŷ .* y).^2)</code></pre><p>See also: <a href="#Flux.hinge"><code>hinge</code></a></p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L162-L171">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.dice_coeff_loss" href="#Flux.dice_coeff_loss"><code>Flux.dice_coeff_loss</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">dice_coeff_loss(ŷ, y; smooth=1, dims=size(ŷ)[1:end-1], agg=mean)</code></pre><p>Return a loss based on the Dice coefficient. Used in the <a href="https://arxiv.org/pdf/1606.04797v1.pdf">V-Net</a> architecture for image segmentation. The current implementation only works for the binary segmentation case.</p><p>The arrays <code>ŷ</code> and <code>y</code> contain the predicted and true probabilities, respectively, that the foreground is present in a given pixel. 
The loss is computed as</p><pre><code class="language-none">1 - (2*sum(ŷ .* y; dims) .+ smooth) ./ (sum(ŷ.^2 .+ y.^2; dims) .+ smooth)</code></pre><p>and then aggregated with <code>agg</code> over the batch.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L174-L189">source</a></section></article><article class="docstring"><header><a class="docstring-binding" id="Flux.tversky_loss" href="#Flux.tversky_loss"><code>Flux.tversky_loss</code></a><span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia">tversky_loss(ŷ, y; β=0.7, α=1-β, dims=size(ŷ)[1:end-1], agg=mean)</code></pre><p>Return the <a href="https://arxiv.org/pdf/1706.05721.pdf">Tversky loss</a> for binary classification. The arrays <code>ŷ</code> and <code>y</code> contain the predicted and true probabilities, respectively. It is used with imbalanced data to give more weight to false negatives. A larger <code>β</code> weighs recall higher than precision (by placing more emphasis on false negatives). It is calculated as:</p><pre><code class="language-none">num = sum(y .* ŷ, dims=dims)
den = sum(@.(ŷ*y + α*ŷ*(1-y) + β*(1-ŷ)*y), dims=dims)
tversky_loss = 1 - num/den</code></pre><p>and then aggregated with <code>agg</code> over the batch.</p><p>When <code>α+β=1</code>, it is equal to <code>1-F_β</code>, where <code>F_β</code> is an F-score.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/FluxML/Flux.jl/blob/8e9cce94e9496c0de96ef85d7da676a1a5387565/src/layers/stateless.jl#L197-L214">source</a></section></article></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../layers/">« Model Reference</a><a class="docs-footer-nextpage" href="../regularisation/">Regularisation »</a></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> on <span class="colophon-date" title="Tuesday 5 May 2020 14:57">Tuesday 5 May 2020</span>. Using Julia version 1.3.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>