Merge pull request #767 from oxinabox/patch-6
Some cleanup on performance tips docs
commit bab618d168
@@ -14,8 +14,8 @@ Which means allocations occur much faster.
 And you use less memory.
 
 
-## Make sure your custom activation functions preserve the type of their inputs
-Not only should your activation functions be [type-stable](https://docs.julialang.org/en/v1/manual/performance-tips/#Write-%22type-stable%22-functions-1),
+## Make sure your activation and loss functions preserve the type of their inputs
+Not only should your activation and loss functions be [type-stable](https://docs.julialang.org/en/v1/manual/performance-tips/#Write-%22type-stable%22-functions-1),
 they should also preserve the type of their inputs.
 
 A very artificial example using an activation function like
@@ -26,6 +26,7 @@ A very artificial example using an activation function like
 
 will result in performance on `Float32` input orders of magnitude slower than the normal `tanh` would,
 because it results in having to use slow mixed-type multiplication in the dense layers.
+Similar situations can occur in the loss function during backpropagation.
 
 Which means if you change your data, say from `Float64` to `Float32` (which should give a speedup: see above),
 you will see a large slow-down
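To make the change above concrete, here is a minimal sketch of the kind of type-breaking functions the updated text warns about. The names (`bad_tanh`, `bad_loss`, and friends) and the numbers are illustrative, not from the docs or this diff:

```julia
# Illustrative only: hypothetical names, not part of the Flux docs.
# A non-preserving activation: always returns Float64, whatever the input type.
bad_tanh(x) = Float64(tanh(x))

# A non-preserving loss: the Float64 literal `2.0` promotes a Float32 result.
bad_loss(ŷ, y) = sum((ŷ .- y) .^ 2) / 2.0

# Type-preserving variants keep the element type of their inputs.
good_tanh(x) = tanh(x)                     # tanh(::Float32) isa Float32
good_loss(ŷ, y) = sum((ŷ .- y) .^ 2) / 2   # integer literals don't promote floats

x = rand(Float32, 100)
eltype(bad_tanh.(x))   # Float64 — forces mixed-type multiplication downstream
eltype(good_tanh.(x))  # Float32
```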
@@ -60,7 +61,7 @@ end
 
 It is much faster to concatenate them into a matrix,
 as this will hit BLAS matrix-matrix multiplication, which is much faster than the equivalent sequence of matrix-vector multiplications.
-Even though this means allocating new memory to store them contiguously.
+The improvement is enough that it is worthwhile allocating new memory to store them contiguously.
 
 ```julia
 x_batch = reduce(hcat, xs)
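For context, a sketch of why the batched version wins. The sizes are made up, and a plain weight matrix `W` stands in for a dense layer's weights; it is not code from the docs:

```julia
# Illustrative only: hypothetical sizes; W stands in for a dense layer.
W  = rand(Float32, 2, 10)
xs = [rand(Float32, 10) for _ in 1:1000]   # many individual input vectors

# Slow: one matrix-vector multiplication (BLAS gemv) per sample.
ys = [W * x for x in xs]

# Fast: one matrix-matrix multiplication (BLAS gemm) over the whole batch.
x_batch = reduce(hcat, xs)   # 10×1000 matrix; column i is xs[i]
y_batch = W * x_batch        # 2×1000 matrix; column i equals W * xs[i]
```

Note that `reduce(hcat, xs)` has a specialized method in Base that allocates the result matrix once, rather than folding `hcat` pairwise and copying repeatedly.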