```julia
celu(x, α=1) =
    (x ≥ 0 ? x : α * (exp(x/α) - 1))
```

Continuously Differentiable Exponential Linear Units. See [Continuously Differentiable Exponential Linear Units](https://arxiv.org/pdf/1704.07483.pdf).
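As an illustrative sketch (not part of the original docstring): the activation functions on this page are scalar functions, so they are applied to arrays by broadcasting, and the second positional argument sets `α`:

```julia
using NNlib

x = [-2.0, -0.5, 0.0, 1.5]
celu.(x)        # default α = 1; negative inputs saturate towards -α
celu.(x, 0.5)   # smaller α saturates sooner, towards -0.5
```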
### `NNlib.elu` — Function

```julia
elu(x, α=1) =
    x > 0 ? x : α * (exp(x) - 1)
```

Exponential Linear Unit activation function. See [Fast and Accurate Deep Network Learning by Exponential Linear Units](https://arxiv.org/abs/1511.07289). You can also specify the coefficient explicitly, e.g. `elu(x, 1)`.

### `NNlib.gelu` — Function

```julia
gelu(x) = 0.5x * (1 + tanh(√(2/π) * (x + 0.044715x^3)))
```

[Gaussian Error Linear Unit](https://arxiv.org/pdf/1606.08415.pdf) activation function.

### `NNlib.hardsigmoid` — Function

```julia
hardσ(x, a=0.2) = max(0, min(1.0, a * x + 0.5))
```

Segment-wise linear approximation of sigmoid. See [BinaryConnect: Training Deep Neural Networks with binary weights during propagations](https://arxiv.org/pdf/1511.00363.pdf).

### `NNlib.hardtanh` — Function

```julia
hardtanh(x) = max(-1, min(1, x))
```

Segment-wise linear approximation of tanh; a cheaper and more computationally efficient version of tanh. See <http://ronan.collobert.org/pub/matos/2004_phdthesis_lip6.pdf>.

### `NNlib.leakyrelu` — Function

```julia
leakyrelu(x, a=0.01) = max(a*x, x)
```

Leaky [Rectified Linear Unit](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)) activation function. You can also specify the coefficient explicitly, e.g. `leakyrelu(x, 0.01)`.

### `NNlib.lisht` — Function

```julia
lisht(x) = x * tanh(x)
```

Non-Parametric Linearly Scaled Hyperbolic Tangent activation function. See [LiSHT](https://arxiv.org/abs/1901.05894).

### `NNlib.logcosh` — Function

```julia
logcosh(x)
```

Return `log(cosh(x))`, computed in a numerically stable way.

### `NNlib.logsigmoid` — Function

```julia
logσ(x)
```

Return `log(σ(x))`, computed in a numerically stable way.

```julia-repl
julia> logσ(0)
-0.6931471805599453

julia> logσ.([-100, -10, 100])
3-element Array{Float64,1}:
 -100.0
  -10.000045398899218
   -3.720075976020836e-44
```
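As a small sketch of why the stable computation matters (not part of the original docstring; assumes `NNlib` exports both `σ` and `logσ`):

```julia
using NNlib

x = -1000.0
log(σ(x))   # -Inf: σ(-1000) underflows to 0.0, so the naive log blows up
logσ(x)     # ≈ -1000.0, computed without forming σ(x) first
```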
### `NNlib.mish` — Function

```julia
mish(x) = x * tanh(softplus(x))
```

Self Regularized Non-Monotonic Neural Activation Function. See [Mish: A Self Regularized Non-Monotonic Neural Activation Function](https://arxiv.org/abs/1908.08681).

### `NNlib.relu` — Function

```julia
relu(x) = max(0, x)
```

[Rectified Linear Unit](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)) activation function.

### `NNlib.relu6` — Function

```julia
relu6(x) = min(max(0, x), 6)
```

[Rectified Linear Unit](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)) activation function capped at 6. See [Convolutional Deep Belief Networks on CIFAR-10](http://www.cs.utoronto.ca/%7Ekriz/conv-cifar10-aug2010.pdf).
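For illustration (the values in the comments follow directly from the definitions above), the cap is the only difference between the two:

```julia
using NNlib

x = [-3.0, 2.0, 7.5]
relu.(x)    # [0.0, 2.0, 7.5]
relu6.(x)   # [0.0, 2.0, 6.0]  (values above 6 are clipped)
```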
### `NNlib.rrelu` — Function

```julia
rrelu(x, l=1/8, u=1/3) = max(a*x, x)

a = randomly sampled from uniform distribution U(l, u)
```

Randomized Leaky [Rectified Linear Unit](https://arxiv.org/pdf/1505.00853.pdf) activation function. You can also specify the bounds explicitly, e.g. `rrelu(x, 0.0, 1.0)`.

### `NNlib.selu` — Function

```julia
selu(x) = λ * (x ≥ 0 ? x : α * (exp(x) - 1))

λ ≈ 1.0507
α ≈ 1.6733
```

Scaled Exponential Linear Unit activation function. See [Self-Normalizing Neural Networks](https://arxiv.org/pdf/1706.02515.pdf).
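As an illustrative sketch of the self-normalizing property that motivates the values of λ and α above (not part of the original docstring):

```julia
using NNlib, Statistics

# For roughly zero-mean, unit-variance inputs, selu keeps the output
# distribution close to zero mean and unit variance.
x = randn(100_000)
y = selu.(x)
mean(y), var(y)   # both should come out close to 0 and 1
```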
### `NNlib.softshrink` — Function

```julia
softshrink(x, λ=0.5) =
    (x ≥ λ ? x - λ : (-λ ≥ x ? x + λ : 0))
```

See [Softshrink Activation Function](https://www.gabormelli.com/RKB/Softshrink_Activation_Function).

### `NNlib.softsign` — Function

```julia
softsign(x) = x / (1 + |x|)
```

See [Quadratic Polynomials Learn Better Image Features](http://www.iro.umontreal.ca/~lisa/publications2/index.php/attachments/single/205).

### `NNlib.swish` — Function

```julia
swish(x) = x * σ(x)
```

Self-gated activation function. See [Swish: a Self-Gated Activation Function](https://arxiv.org/pdf/1710.05941.pdf).

### `NNlib.tanhshrink` — Function

```julia
tanhshrink(x) = x - tanh(x)
```

See [Tanhshrink Activation Function](https://www.gabormelli.com/RKB/Tanhshrink_Activation_Function).

### `NNlib.trelu` — Function

```julia
trelu(x, theta = 1.0) = x > theta ? x : 0
```

Threshold Gated Rectified Linear activation function. See [ThresholdRelu](https://arxiv.org/pdf/1402.3337.pdf).

## Softmax

### `NNlib.softmax` — Function

```julia
softmax(x; dims=1)
```

[Softmax](https://en.wikipedia.org/wiki/Softmax_function) turns input array `x` into probability distributions that sum to 1 along the dimensions specified by `dims`. It is semantically equivalent to the following:

```
softmax(x; dims=1) = exp.(x) ./ sum(exp.(x), dims=dims)
```

with additional manipulations enhancing numerical stability.

For a matrix input `x` it will by default (`dims=1`) treat it as a batch of vectors, with each column independent. Keyword `dims=2` will instead treat rows independently, and so on.

```julia-repl
julia> softmax([1, 2, 3])
3-element Array{Float64,1}:
 0.0900306
 0.244728
 0.665241
```

See also [`logsoftmax`](#NNlib.logsoftmax).
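To make the `dims` behaviour concrete, a small sketch (not part of the original docstring):

```julia
using NNlib

X = [1.0 2.0; 3.0 4.0]

sum(softmax(X), dims=1)          # ≈ [1.0 1.0], each column sums to 1 (default dims=1)
sum(softmax(X, dims=2), dims=2)  # ≈ [1.0; 1.0], each row sums to 1
```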
### `NNlib.logsoftmax` — Function

```julia
logsoftmax(x; dims=1)
```

Computes the log of softmax in a more numerically stable way than directly taking `log.(softmax(xs))`. Commonly used in computing cross entropy loss.

It is semantically equivalent to the following:

```
logsoftmax(x; dims=1) = x .- log.(sum(exp.(x), dims=dims))
```

See also [`softmax`](#NNlib.softmax).

## Pooling

### `NNlib.maxpool` — Function

```julia
maxpool(x, k::NTuple; pad=0, stride=k)
```

Perform max pool operation with window size `k` on input tensor `x`.

### `NNlib.meanpool` — Function

```julia
meanpool(x, k::NTuple; pad=0, stride=k)
```

Perform mean pool operation with window size `k` on input tensor `x`.

## Convolution

### `NNlib.conv` — Function

```julia
conv(x, w; stride=1, pad=0, dilation=1, flipped=false)
```

Apply convolution filter `w` to input `x`. `x` and `w` are 3d/4d/5d tensors in 1d/2d/3d convolutions respectively.

### `NNlib.depthwiseconv` — Function

```julia
depthwiseconv(x, w; stride=1, pad=0, dilation=1, flipped=false)
```

Depthwise convolution operation with filter `w` on input `x`. `x` and `w` are 3d/4d/5d tensors in 1d/2d/3d convolutions respectively.
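For illustration, a minimal sketch of the 2d case; the `(width, height, channels, batch)` layout and the sizes in the comments are my reading of the signatures above, not text from the docstrings:

```julia
using NNlib

x = rand(Float32, 28, 28, 3, 5)   # batch of 5 images, 3 channels, 28×28 each
w = rand(Float32, 3, 3, 3, 16)    # 16 filters of size 3×3 over 3 input channels
y = conv(x, w)                    # size (26, 26, 16, 5) with pad=0, stride=1
p = maxpool(y, (2, 2))            # size (13, 13, 16, 5)
```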
## Batched Operations

### `NNlib.batched_mul` — Function

```julia
batched_mul(A, B) -> C
```

Batched matrix multiplication. The result has `C[:,:,k] == A[:,:,k] * B[:,:,k]` for all `k`.

### `NNlib.batched_mul!` — Function

```julia
batched_mul!(C, A, B) -> C
```

In-place batched matrix multiplication, equivalent to `mul!(C[:,:,k], A[:,:,k], B[:,:,k])` for all `k`.
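A short usage sketch (not from the original docstrings) showing the slice-wise semantics:

```julia
using NNlib

A = rand(2, 3, 4)          # four 2×3 matrices
B = rand(3, 5, 4)          # four 3×5 matrices
C = batched_mul(A, B)      # size (2, 5, 4)
C[:, :, 1] ≈ A[:, :, 1] * B[:, :, 1]   # true: each slice is an ordinary matmul
```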
### `NNlib.batched_adjoint` — Function

```julia
batched_transpose(A::AbstractArray{T,3})
batched_adjoint(A)
```

Equivalent to applying `transpose` or `adjoint` to each matrix `A[:,:,k]`.

These exist to control how `batched_mul` behaves, as it operates on such matrix slices of an array with `ndims(A)==3`.

```
BatchedTranspose{T, N, S} <: AbstractBatchedMatrix{T, N}
BatchedAdjoint{T, N, S}
```

Lazy wrappers analogous to `Transpose` and `Adjoint`, returned by `batched_transpose`.
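For illustration (not from the original docstrings), the lazy wrapper composes with `batched_mul` without materialising a permuted copy of the array:

```julia
using NNlib

A = rand(3, 2, 4)
B = rand(3, 5, 4)
# Multiply the adjoint of each slice of A with the matching slice of B.
C = batched_mul(batched_adjoint(A), B)
size(C)   # (2, 5, 4)
```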
### `NNlib.batched_transpose` — Function

```julia
batched_transpose(A::AbstractArray{T,3})
batched_adjoint(A)
```

Equivalent to applying `transpose` or `adjoint` to each matrix `A[:,:,k]`.

These exist to control how `batched_mul` behaves, as it operates on such matrix slices of an array with `ndims(A)==3`.

```
BatchedTranspose{T, N, S} <: AbstractBatchedMatrix{T, N}
BatchedAdjoint{T, N, S}
```

Lazy wrappers analogous to `Transpose` and `Adjoint`, returned by `batched_transpose`.