Loss Functions
Flux provides a large number of common loss functions used for training machine learning models.
Most loss functions in Flux have an optional argument agg, denoting the type of aggregation performed over the batch:
loss(ŷ, y; agg=mean)
Flux.mae — Function
mae(ŷ, y; agg=mean)
Return the loss corresponding to mean absolute error:
agg(abs.(ŷ .- y))
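For example, a minimal usage sketch of the agg keyword (the numbers are purely illustrative):

using Flux

ŷ, y = [0.9, 0.1, 0.3, 0.7], [1.0, 0.0, 0.0, 1.0]
Flux.mae(ŷ, y)           # default agg=mean: (0.1 + 0.1 + 0.3 + 0.3) / 4 ≈ 0.2
Flux.mae(ŷ, y; agg=sum)  # total absolute error over the batch: ≈ 0.8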
Flux.mse — Function
mse(ŷ, y; agg=mean)
Return the loss corresponding to mean square error:
agg((ŷ .- y).^2)
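With the same illustrative values as above, the squared errors are averaged instead:

Flux.mse([0.9, 0.1, 0.3, 0.7], [1.0, 0.0, 0.0, 1.0])  # mean of 0.01, 0.01, 0.09, 0.09 ≈ 0.05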
Flux.msle — Function
msle(ŷ, y; agg=mean, ϵ=eps(eltype(ŷ)))
Return the loss corresponding to mean squared logarithmic error, calculated as
agg((log.(ŷ .+ ϵ) .- log.(y .+ ϵ)).^2)
The ϵ term provides numerical stability. This loss penalizes an under-predicted estimate more than an over-predicted one.
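A sketch of that asymmetry (illustrative values; ϵ is negligible here):

Flux.msle([0.5], [1.0])  # under-prediction by 0.5: (log(0.5) - log(1.0))^2 ≈ 0.48
Flux.msle([1.5], [1.0])  # over-prediction by 0.5:  (log(1.5) - log(1.0))^2 ≈ 0.16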
Flux.huber_loss — Function
huber_loss(ŷ, y; δ=1, agg=mean)
Return the mean of the Huber loss given the prediction ŷ and true values y.
             | 0.5 * |ŷ - y|^2,           for |ŷ - y| <= δ
Huber loss = |
             | δ * (|ŷ - y| - 0.5 * δ),   otherwise
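A sketch of the two regimes with the default δ = 1 (illustrative values):

Flux.huber_loss([0.5], [0.0])  # |error| = 0.5 <= δ, quadratic regime: 0.5 * 0.5^2 = 0.125
Flux.huber_loss([3.0], [0.0])  # |error| = 3 > δ, linear regime: 1 * (3 - 0.5 * 1) = 2.5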
Flux.crossentropy — Function
crossentropy(ŷ, y; weight=nothing, dims=1, ϵ=eps(eltype(ŷ)), agg=mean)
Return the cross entropy between the given probability distributions; calculated as
agg(.-sum(weight .* y .* log.(ŷ .+ ϵ); dims=dims))

weight can be nothing, a number, or an array. weight=nothing acts like weight=1 but is faster.
See also: Flux.logitcrossentropy, Flux.binarycrossentropy, Flux.logitbinarycrossentropy
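A minimal sketch with one-hot targets (Flux.onehotbatch and softmax are both provided by Flux; the random logits are illustrative):

using Flux

y = Flux.onehotbatch([2, 1], 1:3)  # targets as one-hot columns
ŷ = softmax(randn(3, 2))           # predictions must already be probabilities
Flux.crossentropy(ŷ, y)            # sums over classes (dims=1), then averages over the batch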
Flux.logitcrossentropy — Function
logitcrossentropy(ŷ, y; weight=nothing, agg=mean, dims=1)
Return the cross entropy computed after a Flux.logsoftmax operation; calculated as
agg(.-sum(weight .* y .* logsoftmax(ŷ; dims=dims); dims=dims))
logitcrossentropy(ŷ, y) is mathematically equivalent to Flux.crossentropy(softmax(ŷ), y), but is more numerically stable.
See also: Flux.crossentropy, Flux.binarycrossentropy, Flux.logitbinarycrossentropy
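A sketch of that equivalence, continuing the one-hot setup above (illustrative logits):

logits = randn(3, 2)
y = Flux.onehotbatch([3, 1], 1:3)
Flux.logitcrossentropy(logits, y)      # preferred: operates on raw logits
Flux.crossentropy(softmax(logits), y)  # mathematically the same, but less stable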
Flux.binarycrossentropy — Function
binarycrossentropy(ŷ, y; ϵ=eps(ŷ))
Return $-y \log(ŷ + ϵ) - (1 - y) \log(1 - ŷ + ϵ)$. The ϵ term provides numerical stability.
Typically, the prediction ŷ is given by the output of a sigmoid activation.
See also: Flux.crossentropy, Flux.logitcrossentropy, Flux.logitbinarycrossentropy
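With the scalar signature above, the loss is typically broadcast over predictions and 0/1 labels; a sketch (σ is Flux's sigmoid, values illustrative):

probs = σ.([2.0, -1.0, 0.5])             # sigmoid outputs lie in (0, 1)
labels = [1.0, 0.0, 1.0]
Flux.binarycrossentropy.(probs, labels)  # elementwise losses; aggregate with e.g. mean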
Flux.logitbinarycrossentropy — Function
logitbinarycrossentropy(ŷ, y; agg=mean)
logitbinarycrossentropy(ŷ, y) is mathematically equivalent to Flux.binarycrossentropy(σ(ŷ), y), but is more numerically stable.
See also: Flux.crossentropy, Flux.logitcrossentropy, Flux.binarycrossentropy
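A sketch on raw logits, matching the broadcast example above (illustrative values):

logits = [2.0, -1.0, 0.5]
labels = [1.0, 0.0, 1.0]
Flux.logitbinarycrossentropy(logits, labels)  # ≈ mean of binarycrossentropy.(σ.(logits), labels)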
Flux.kldivergence — Function
kldivergence(ŷ, y; dims=1, agg=mean, ϵ=eps(eltype(ŷ)))
Return the Kullback-Leibler divergence between the given arrays interpreted as probability distributions.
The KL divergence measures how much one probability distribution differs from another. It is always non-negative, and is zero only when both distributions are equal everywhere.
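A sketch (each vector is treated as a single distribution along dims=1; values illustrative):

p = [0.1, 0.2, 0.7]
Flux.kldivergence(p, p)                # ≈ 0 for identical distributions
Flux.kldivergence([0.3, 0.3, 0.4], p)  # strictly positive once they differ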
Flux.poisson_loss — Function
poisson_loss(ŷ, y; agg=mean, ϵ=eps(eltype(ŷ)))
Return how much the predicted distribution ŷ diverges from the expected Poisson distribution y; calculated as sum(ŷ .- y .* log.(ŷ)) / size(y, 2).
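A sketch with count-like targets (illustrative values; predictions must be positive):

ŷ = [0.5, 1.5, 2.5]  # predicted rates
y = [1.0, 2.0, 2.0]  # observed counts
Flux.poisson_loss(ŷ, y)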
Flux.hinge — Function
hinge(ŷ, y; agg=mean)
Return the hinge loss given the prediction ŷ and true labels y (containing 1 or -1); calculated as agg(max.(0, 1 .- ŷ .* y)). See the sketch after squared_hinge below.
See also: squared_hinge
Flux.squared_hinge — Function
squared_hinge(ŷ, y; agg=mean)
Return the squared hinge loss given the prediction ŷ and true labels y (containing 1 or -1); calculated as agg((max.(0, 1 .- ŷ .* y)).^2).
See also: hinge
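A sketch comparing the two variants (labels must be -1 or 1; values illustrative):

ŷ = [0.8, -0.4, 1.5]
y = [1.0, 1.0, -1.0]
Flux.hinge(ŷ, y)          # mean of the margin terms 0.2, 1.4, 2.5 ≈ 1.37
Flux.squared_hinge(ŷ, y)  # squares each margin term before averaging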
Flux.dice_coeff_loss — Function
dice_coeff_loss(ŷ, y; smooth=1)
Return a loss based on the Dice coefficient. Used in the V-Net image segmentation architecture. Similar to the F1 score. Calculated as: 1 - 2*sum(|ŷ .* y| + smooth) / (sum(ŷ.^2) + sum(y.^2) + smooth)
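A sketch on flattened soft masks (illustrative values):

ŷ = [0.9, 0.8, 0.1, 0.2]    # predicted segmentation mask
y = [1.0, 1.0, 0.0, 0.0]    # ground-truth mask
Flux.dice_coeff_loss(ŷ, y)  # close to 0 when the masks overlap well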
Flux.tversky_loss — Function
tversky_loss(ŷ, y; β=0.7)
Return the Tversky loss. Used with imbalanced data to give more weight to false negatives. Larger β weighs recall higher than precision (by placing more emphasis on false negatives). Calculated as: 1 - sum(|y .* ŷ| + 1) / (sum(y .* ŷ + β*(1 .- y) .* ŷ + (1 - β)*y .* (1 .- ŷ)) + 1)
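A sketch with similar illustrative masks (β > 0.5 penalizes false negatives more heavily):

ŷ = [0.9, 0.3, 0.1, 0.2]  # the 0.3 is a probable false negative
y = [1.0, 1.0, 0.0, 0.0]
Flux.tversky_loss(ŷ, y)         # default β = 0.7 favours recall
Flux.tversky_loss(ŷ, y; β=0.3)  # β < 0.5 would favour precision instead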