898: Fix problem in crossentropy breaking GPU compilation r=MikeInnes a=kshyatt
Trying to run this simple example:
```julia
using Flux, CuArrays
using Flux: crossentropy

model = Chain(
    Dense(728, 128, σ),
    LSTM(128, 256),
    LSTM(256, 128),
    Dense(128, 10),
    softmax) |> gpu

data = [rand(728) for i in 1:100];
out = [rand(10) for i in 1:100];

loss(x, y) = crossentropy(model(x), y);

Flux.train!(loss, params(model), zip(gpu.(data), gpu.(out)), ADAM())
```
Old version of `crossentropy`:
```
ERROR: GPU compilation of #23(CuArrays.CuKernelState, CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(*),Tuple{Base.Broadcast.Extruded{Array{Float32,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(conj),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}}}}}}) failed
KernelError: passing and using non-bitstype argument
Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(*),Tuple{Base.Broadcast.Extruded{Array{Float32,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(conj),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}}}}}}.
That type is not isbits, and such arguments are only allowed when they are unused by the kernel. .args is of type Tuple{Base.Broadcast.Extruded{Array{Float32,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(conj),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}}}}} which is not isbits.
.1 is of type Base.Broadcast.Extruded{Array{Float32,1},Tuple{Bool},Tuple{Int64}} which is not isbits.
.x is of type Array{Float32,1} which is not isbits.
Stacktrace:
[1] check_invocation(::CUDAnative.CompilerJob, ::LLVM.Function) at /mnt/home/khyatt/.julia/dev/CUDAnative/src/compiler/validation.jl:70
[2] macro expansion at /mnt/home/khyatt/.julia/dev/CUDAnative/src/compiler/driver.jl:187 [inlined]
[3] macro expansion at /mnt/home/khyatt/.julia/packages/TimerOutputs/7zSea/src/TimerOutput.jl:216 [inlined]
[4] #codegen#136(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(CUDAnative.codegen), ::Symbol, ::CUDAnative.CompilerJob) at /mnt/home/khyatt/.julia/dev/CUDAnative/src/compiler/driver.jl:186
[5] #codegen at ./none:0 [inlined]
[6] #compile#135(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(CUDAnative.compile), ::Symbol, ::CUDAnative.CompilerJob) at /mnt/home/khyatt/.julia/dev/CUDAnative/src/compiler/driver.jl:47
[7] #compile#134 at ./none:0 [inlined]
[8] #compile at ./none:0 [inlined] (repeats 2 times)
[9] macro expansion at /mnt/home/khyatt/.julia/dev/CUDAnative/src/execution.jl:389 [inlined]
[10] #cufunction#176(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CUDAnative.cufunction), ::GPUArrays.var"#23#24", ::Type{Tuple{CuArrays.CuKernelState,CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(*),Tuple{Base.Broadcast.Extruded{Array{Float32,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(conj),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}}}}}}}}) at /mnt/home/khyatt/.julia/dev/CUDAnative/src/execution.jl:357
[11] cufunction(::Function, ::Type) at /mnt/home/khyatt/.julia/dev/CUDAnative/src/execution.jl:357
[12] macro expansion at /mnt/home/khyatt/.julia/dev/CUDAnative/src/execution.jl:174 [inlined]
[13] macro expansion at ./gcutils.jl:91 [inlined]
[14] macro expansion at /mnt/home/khyatt/.julia/dev/CUDAnative/src/execution.jl:171 [inlined]
[15] _gpu_call(::CuArrays.CuArrayBackend, ::Function, ::CuArray{Float32,1}, ::Tuple{CuArray{Float32,1},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(*),Tuple{Base.Broadcast.Extruded{Array{Float32,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(conj),Tuple{Base.Broadcast.Extruded{CuArray{Float32,1},Tuple{Bool},Tuple{Int64}}}}}}}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /mnt/home/khyatt/.julia/dev/CuArrays/src/gpuarray_interface.jl:60
[16] gpu_call at /mnt/home/khyatt/.julia/dev/GPUArrays/src/abstract_gpu_interface.jl:151 [inlined]
[17] gpu_call at /mnt/home/khyatt/.julia/dev/GPUArrays/src/abstract_gpu_interface.jl:128 [inlined]
[18] copyto! at /mnt/home/khyatt/.julia/dev/GPUArrays/src/broadcast.jl:48 [inlined]
[19] copyto! at ./broadcast.jl:863 [inlined]
[20] copy at ./broadcast.jl:839 [inlined]
[21] materialize at ./broadcast.jl:819 [inlined]
[22] (::Zygote.var"#1310#1311"{CuArray{Float32,1},CuArray{Float32,1}})(::Array{Float32,1}) at /mnt/home/khyatt/.julia/dev/Zygote/src/lib/broadcast.jl:68
```
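The failing kernel argument in the trace is a `Broadcasted` that captures a plain `Array{Float32,1}` alongside `CuDeviceArray`s (it comes out of Zygote's broadcast adjoint, frame [22]). As a stand-alone illustration of that class of failure, not the `crossentropy` code itself:
```julia
# Illustration only: mixing a host Array into a CuArray broadcast asks CUDAnative
# to compile a kernel whose argument captures the Array, which is not isbits.
using CuArrays

xs = cu(rand(Float32, 10))   # CuArray, lives on the GPU
ws = rand(Float32, 10)       # ordinary Array, lives on the host

ws .* xs       # hits the same "passing and using non-bitstype argument" error
cu(ws) .* xs   # moving both operands to the GPU avoids it
```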
New version:
```
julia> Flux.train!(loss, params(model), zip(gpu.(data), gpu.(out)), ADAM())
julia> # everyone finished happily and went on with their lives
```
Co-authored-by: Katharine Hyatt <khyatt@flatironinstitute.org>
904: Documenting Optimiser Interface r=MikeInnes a=MikeInnes
I needed to add one extra commit to #875 before merging.
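For reference, a minimal sketch of the kind of interface #875 documents, assuming the `Flux.Optimise.apply!(opt, x, Δ)` convention; the `MyDescent` optimiser below is purely illustrative:
```julia
# Sketch of a custom optimiser under the documented interface: `apply!` rewrites
# the gradient in place, and `update!`/`train!` then subtract it from the parameters.
using Flux

struct MyDescent
  eta::Float64
end

function Flux.Optimise.apply!(o::MyDescent, x, Δ)
  Δ .*= o.eta   # scale the raw gradient by the learning rate
  return Δ
end

# Usage: opt = MyDescent(0.01); Flux.train!(loss, params(model), data, opt)
```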
Co-authored-by: Dhairya Gandhi <dhairya@juliacopmuting.com>
Co-authored-by: Dhairya Gandhi <dhairya@juliacomputing.com>
Co-authored-by: Mike Innes <mike.j.innes@gmail.com>
882: Check if CUDA availability changed during init. r=MikeInnes a=maleadt
With this PR, Flux uses CUDAapi during initialization to check whether CUDA is available, and forces recompilation if that disagrees with what was decided during precompilation. This avoids the scenario where Flux was precompiled without GPU support and then never allows use of the GPU, even after the user fixes their CUDA/GPU set-up, because fixing the set-up alone does not trigger recompilation (and we can't add precompilation dependencies on things that don't exist).
However, we can't do the same for the case where CUDA and a GPU are present but CuArrays fails to import (checking whether it imports during `__init__` would be far too expensive, if it is possible at all), so this PR drops support for having CUDA and a GPU while CuArrays is broken. That's a little risky now that Flux depends on CuArrays, but the package is fairly mature and I haven't seen many recent bug reports about it failing to load.
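A rough sketch of that check, with illustrative names (the real Flux code may differ): record the CUDA decision at precompilation time and compare it against the machine's state in `__init__`, regenerating the cache if they disagree.
```julia
module MyPkg  # stand-in for a package like Flux; everything here is illustrative

using CUDAapi

# Runs while the package is being precompiled, so the result is baked into the image.
const precompiled_with_cuda = has_cuda()

function __init__()
  # At load time, compare the machine's actual CUDA state with the precompiled one.
  if has_cuda() != precompiled_with_cuda
    @warn "CUDA availability changed since precompilation; forcing recompilation."
    Base.compilecache(Base.PkgId(MyPkg))
  end
end

end # module
```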
Fixes https://github.com/FluxML/Flux.jl/pull/852#issuecomment-538028314
cc @MikeInnes @xukai92
Co-authored-by: Tim Besard <tim.besard@gmail.com>
865: Functor r=MikeInnes a=MikeInnes
This refactors our current `@treelike` infrastructure. It somewhat formalises what we're doing around the idea of a Flux model as a functor, i.e. something that can be mapped over.
This is much more flexible than what we had before, and avoids some issues. It allows layers to have state that isn't mappable; it allows for dispatch when walking the tree, which means layers like `BatchNorm` can have non-trainable parameters; and it also allows for zipped mapping like `fmap(+, xs, ys)`, which isn't implemented yet but will be useful for the new optimisers work.
The main downside is that the term `functor` has previously been used in the Julia community as a malapropism for "thing that behaves like a function", but hopefully this can start to reduce that usage.
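For illustration, a minimal sketch of the functor approach; `@functor` and `fmap` are the names this refactor introduces, while the `Affine` layer is made up for the example:
```julia
using Flux

struct Affine{W,B}
  W::W
  b::B
end

Affine(in::Integer, out::Integer) = Affine(randn(Float32, out, in), zeros(Float32, out))
(a::Affine)(x) = a.W * x .+ a.b

# Mark the type as a functor so its fields are walked by fmap, gpu, params, etc.
Flux.@functor Affine

m = Affine(10, 5)
m64 = Flux.fmap(x -> Float64.(x), m)  # map a function over every array in the model
```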
Co-authored-by: Mike Innes <mike.j.innes@gmail.com>
Added missing export
Corrected channel placement
Dimension 4 cannot be assumed to always be the Channel dimension
Deprecation of `treelike`
The code now uses the `@treelike` macro instead of the deprecated `treelike` function (it worked on my end because I'm on Julia 0.7, whereas Julia 1.0 removed what 0.7 only deprecated)
Update basic.jl
Renaming to SkipConnection
* Update Flux.jl
* Update basic.jl
Updated `SkipConnection` with a `connection` field
I'm pretty sure I broke something now, but this PR should follow along these lines: `cat` needs special treatment (the user can declare their own `concatenate` connection, but I foresee it will be used often, so we can simply define the special treatment ourselves)
Forgot to remove some rebasing text
Forgot to remove some more rebasing text
Removed local copy and default cat method from the function calls
Adjusted some more types for inference, could improve on this as well
Re-placed some left-over spaces
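To make the result of these commits concrete, a small usage sketch of `SkipConnection`, assuming the `connection(layers(x), x)` convention the commits describe; the layer sizes and the `catconn` helper are illustrative:
```julia
using Flux

# Residual-style skip: add the block's output back onto its input.
residual = SkipConnection(Dense(10, 10, relu), +)
x = rand(Float32, 10, 4)
residual(x)                    # size (10, 4)

# A concatenating connection is also possible; the dimension to concatenate along
# must be chosen explicitly rather than assumed (see the "Dimension 4" commit above).
catconn(mx, x) = cat(mx, x; dims = 1)
wide = SkipConnection(Dense(10, 6), catconn)
size(wide(x))                  # (16, 4)
```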
563: noise shape for dropout r=MikeInnes a=chengchingwen
This adds a noise shape for dropout, similar to the `noise_shape` argument of [`tf.nn.dropout`](https://www.tensorflow.org/api_docs/python/tf/nn/dropout).
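For illustration, a minimal sketch of what a shared noise shape means, independent of the exact argument Flux ends up exposing:
```julia
# The Bernoulli mask is drawn with a reduced shape and broadcast over the remaining
# dimensions, so entire slices are kept or dropped together.
p = 0.5
x = rand(Float32, 10, 32)                      # features × batch
mask = (rand(Float32, 10, 1) .> p) ./ (1 - p)  # one mask entry per feature, shared across the batch
y = x .* mask                                  # each feature row is zeroed for the whole batch or scaled by 1/(1-p)
```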
Co-authored-by: chengchingwen <adgjl5645@hotmail.com>
Co-authored-by: Peter <adgjl5645@hotmail.com>