2261 Commits

Author SHA1 Message Date
Troels Arnfred Bojesen af96a197c1 Fix Glorot initialization
Should fix #442
2019-11-20 13:20:42 +09:00
Troels Arnfred Bojesen 2b80573248 Fix Glorot initialization, add He initialization
Should fix #442 .
Adds He weight initialization as a bonus :-)
2019-11-19 18:16:29 +09:00
Troels Arnfred Bojesen 4530ac65c7 Fix Glorot initialization, add He initialization
Should fix the issue reported at https://github.com/FluxML/Flux.jl/issues/442 .
Adds He weight initialization as a bonus :-)
2019-11-19 16:50:40 +09:00
Mike J Innes 967cc1c175
Merge pull request #927 from heliosdrm/patch-1
Extend docs about `train!`
2019-11-18 12:22:16 +00:00
Helios De Rosario a0e3729679
Update docs/src/training/training.md
Co-Authored-By: Mike J Innes <mike.j.innes@gmail.com>
2019-11-15 21:17:45 +01:00
bors[bot] 7eb6a0c98c
Merge #932
932: Travis: test on 1.0 r=MikeInnes a=MikeInnes



Co-authored-by: Mike J Innes <mike.j.innes@gmail.com>
Co-authored-by: Mike Innes <mike.j.innes@gmail.com>
2019-11-15 16:21:30 +00:00
Mike Innes e24215ca98 guard test on 1.0 2019-11-15 15:59:42 +00:00
Mike J Innes 665e441919 pkg up 2019-11-15 12:12:28 +00:00
Mike J Innes 9d6f6fdaa3
Merge pull request #926 from janEbert/bc-cuda-fix
Fix binarycrossentropy on CuArrays
2019-11-15 13:05:52 +01:00
Mike J Innes 2471596cdb test on 1.0 2019-11-15 11:50:13 +00:00
Helios De Rosario ba4e3be0d3
explanations about params in `train!` 2019-11-14 16:22:31 +01:00
Helios De Rosario 074eb47246
Update training.md 2019-11-12 23:29:38 +01:00
Helios De Rosario 7e1ffd6507
Extend docs about `train!`
Related to #921: explain why the model does not need to be passed as an argument.
2019-11-08 21:39:00 +01:00
janEbert a00d8d94ec Add test for CUDA binarycrossentropy 2019-11-08 17:28:54 +01:00
janEbert 3dceef427f Fix binarycrossentropy on CuArrays 2019-11-08 16:48:11 +01:00
Tim Besard 9d05afaccc
Merge pull request #922 from FluxML/tb/backward
Restore Julia 1.0 compatibility.
2019-11-06 20:15:31 +01:00
Tim Besard 8a0745faab Restore Julia 1.0 compatibility. 2019-11-06 18:53:45 +01:00
bors[bot] 84d4ab083d
Merge #920
920: use release versions of packages r=MikeInnes a=MikeInnes

bors r+

Co-authored-by: Mike J Innes <mike.j.innes@gmail.com>
2019-11-06 12:23:44 +00:00
Mike J Innes 61078f3ef0 use release versions of packages 2019-11-06 12:23:12 +00:00
Tim Besard 08804a06d2
Merge pull request #916 from FluxML/tb/runtime_use_cuda
Check for CUDA availability at run time.
2019-11-06 09:46:39 +01:00
Tim Besard c9f369de86 Update packages. 2019-11-06 07:53:20 +01:00
Tim Besard 6e8f8c1f46 Use latest GPU CI templates. 2019-11-04 16:41:57 +01:00
Tim Besard 916d3dabbd Bump Julia version. 2019-11-04 15:51:33 +01:00
Tim Besard 33d276cdb7 Fix GPU-less tests. 2019-11-04 15:51:33 +01:00
Tim Besard dbcdf4d1bd Bump GPU packages. 2019-11-04 15:51:33 +01:00
Tim Besard a82b76cf24 Conditionally include the CUDNN glue code. 2019-11-04 15:27:11 +01:00
Tim Besard 39ab740fb7 Check for CUDA availability at run time. 2019-11-02 11:18:06 +01:00
bors[bot] 7104fd9332
Merge #907
907: Change `gate` function to `view` instead of copy r=MikeInnes a=janEbert

This speeds up code with large inputs by quite a lot. I only added it to the function accepting an `AbstractVector` as input, as copying matrices may be faster than viewing them due to caching (they are sliced per row, so the data will not necessarily have a low stride).

Co-authored-by: janEbert <janpublicebert@posteo.net>
2019-10-24 11:06:41 +00:00
janEbert 7b41bc4ab5 Change `gate` function to `view` instead of copy
Only for vector input as copying a matrix may be more efficient due to
caching. A matrix is sliced per row, meaning the view will not be
aligned.
2019-10-24 12:45:22 +02:00
bors[bot] 645aa04464
Merge #898
898: Fix problem in crossentropy breaking GPU compilation r=MikeInnes a=kshyatt

Trying to run this simple example
```
using Flux, CuArrays
using Flux: crossentropy
model = Chain(
        Dense(728, 128, σ),
        LSTM(128, 256),
        LSTM(256, 128),
        Dense(128, 10),
        softmax) |> gpu
data = [rand(728) for i in 1:100];
out  = [rand(10) for i in 1:100];
loss(x, y) = crossentropy(model(x), y);
Flux.train!(loss, params(model), zip(gpu.(data), gpu.(out)), ADAM())
```
Old version of `crossentropy`:
```
ERROR: GPU compilation of #23(CuArrays.CuKernelState, CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(*),Tuple{Base.Broadcast.Extruded{Array{Float32,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(conj),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}}}}}}) failed
KernelError: passing and using non-bitstype argument

Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(*),Tuple{Base.Broadcast.Extruded{Array{Float32,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(conj),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}}}}}}.
That type is not isbits, and such arguments are only allowed when they are unused by the kernel.  .args is of type Tuple{Base.Broadcast.Extruded{Array{Float32,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(conj),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}}}}} which is not isbits.
    .1 is of type Base.Broadcast.Extruded{Array{Float32,1},Tuple{Bool},Tuple{Int64}} which is not isbits.
      .x is of type Array{Float32,1} which is not isbits.


Stacktrace:
 [1] check_invocation(::CUDAnative.CompilerJob, ::LLVM.Function) at /mnt/home/khyatt/.julia/dev/CUDAnative/src/compiler/validation.jl:70
 [2] macro expansion at /mnt/home/khyatt/.julia/dev/CUDAnative/src/compiler/driver.jl:187 [inlined]
 [3] macro expansion at /mnt/home/khyatt/.julia/packages/TimerOutputs/7zSea/src/TimerOutput.jl:216 [inlined]
 [4] #codegen#136(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(CUDAnative.codegen), ::Symbol, ::CUDAnative.CompilerJob) at /mnt/home/khyatt/.julia/dev/CUDAnative/src/compiler/driver.jl:186
 [5] #codegen at ./none:0 [inlined]
 [6] #compile#135(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(CUDAnative.compile), ::Symbol, ::CUDAnative.CompilerJob) at /mnt/home/khyatt/.julia/dev/CUDAnative/src/compiler/driver.jl:47
 [7] #compile#134 at ./none:0 [inlined]
 [8] #compile at ./none:0 [inlined] (repeats 2 times)
 [9] macro expansion at /mnt/home/khyatt/.julia/dev/CUDAnative/src/execution.jl:389 [inlined]
 [10] #cufunction#176(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CUDAnative.cufunction), ::GPUArrays.var"#23#24", ::Type{Tuple{CuArrays.CuKernelState,CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(*),Tuple{Base.Broadcast.Extruded{Array{Float32,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(conj),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}}}}}}}}) at /mnt/home/khyatt/.julia/dev/CUDAnative/src/execution.jl:357
 [11] cufunction(::Function, ::Type) at /mnt/home/khyatt/.julia/dev/CUDAnative/src/execution.jl:357
 [12] macro expansion at /mnt/home/khyatt/.julia/dev/CUDAnative/src/execution.jl:174 [inlined]
 [13] macro expansion at ./gcutils.jl:91 [inlined]
 [14] macro expansion at /mnt/home/khyatt/.julia/dev/CUDAnative/src/execution.jl:171 [inlined]
 [15] _gpu_call(::CuArrays.CuArrayBackend, ::Function, ::CuArray{Float32,1}, ::Tuple{CuArray{Float32,1},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(*),Tuple{Base.Broadcast.Extruded{Array{Float32,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(conj),Tuple{Base.Broadcast.Extruded{CuArray{Float32,1},Tuple{Bool},Tuple{Int64}}}}}}}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /mnt/home/khyatt/.julia/dev/CuArrays/src/gpuarray_interface.jl:60
 [16] gpu_call at /mnt/home/khyatt/.julia/dev/GPUArrays/src/abstract_gpu_interface.jl:151 [inlined]
 [17] gpu_call at /mnt/home/khyatt/.julia/dev/GPUArrays/src/abstract_gpu_interface.jl:128 [inlined]
 [18] copyto! at /mnt/home/khyatt/.julia/dev/GPUArrays/src/broadcast.jl:48 [inlined]
 [19] copyto! at ./broadcast.jl:863 [inlined]
 [20] copy at ./broadcast.jl:839 [inlined]
 [21] materialize at ./broadcast.jl:819 [inlined]
 [22] (::Zygote.var"#1310#1311"{CuArray{Float32,1},CuArray{Float32,1}})(::Array{Float32,1}) at /mnt/home/khyatt/.julia/dev/Zygote/src/lib/broadcast.jl:68
```
New version:
```
julia> Flux.train!(loss, params(model), zip(gpu.(data), gpu.(out)), ADAM())

julia> # everyone finished happily and went on with their lives
```

Co-authored-by: Katharine Hyatt <khyatt@flatironinstitute.org>
2019-10-23 14:31:53 +00:00
Katharine Hyatt 8913c9c741 Make the vector of weights test pass on GPU 2019-10-23 09:53:09 -04:00
Katharine Hyatt f7ce717aaa Add tests 2019-10-23 09:22:22 -04:00
Katharine Hyatt e0c1c0e057 Fix problem in crossentropy breaking GPU compilation 2019-10-22 14:00:57 -04:00
bors[bot] fa5737fb5c
Merge #904
904: Documenting Optimiser Interface r=MikeInnes a=MikeInnes

I needed to add one extra commit to #875 before merging.

Co-authored-by: Dhairya Gandhi <dhairya@juliacopmuting.com>
Co-authored-by: Dhairya Gandhi <dhairya@juliacomputing.com>
Co-authored-by: Mike Innes <mike.j.innes@gmail.com>
2019-10-22 12:38:19 +00:00
Mike Innes 7ead2d6c7b typo 2019-10-22 13:36:39 +01:00
Dhairya Gandhi a9955fec8a correct train! syntax 2019-10-22 16:25:55 +05:30
bors[bot] b03f34dcb6
Merge #902
902: Backticks and examples for normalise r=MikeInnes a=kshyatt



Co-authored-by: Katharine Hyatt <khyatt@flatironinstitute.org>
2019-10-21 14:35:45 +00:00
Katharine Hyatt b8b4bc48b9 Backticks and examples for normalise 2019-10-21 10:31:44 -04:00
Dhairya Gandhi 776023ddad fixes 2019-10-10 20:35:28 +05:30
Dhairya Gandhi 4477dd8d54 reviews 2019-10-10 20:27:11 +05:30
Dhairya Gandhi a55878453c
typo
Co-Authored-By: Mike J Innes <mike.j.innes@gmail.com>
2019-10-10 20:16:29 +05:30
Dhairya Gandhi 623ee2c29c
typo
Co-Authored-By: Mike J Innes <mike.j.innes@gmail.com>
2019-10-10 20:16:00 +05:30
Dhairya Gandhi f19066ee29 more docstrings 2019-10-10 16:48:12 +05:30
Dhairya Gandhi fe52689cfe in depth docstrings 2019-10-09 16:16:11 +05:30
bors[bot] af0dcb2c63
Merge #882
882: Check if CUDA availability changed during init. r=MikeInnes a=maleadt

With this PR, Flux checks using CUDAapi if CUDA is available during initialization, and forces recompilation if that does not agree with what was decided during precompilation. This avoids the scenario where Flux was precompiled without GPU support and consequently does not allow use of the GPU even after the user has fixed their CUDA/GPU set-up, because fixing the set-up does not force recompilation (and we can't add precompilation dependencies on stuff that doesn't exist).

However, we can't do the same for the case where we have a GPU/CUDA but CuArrays fails to import (checking if it imports during `__init__` would be much too expensive, if even possible), so this PR removes support for having CUDA/a GPU but CuArrays being broken. That's a little risky now that Flux depends on CuArrays, but the package is pretty mature and I haven't seen many bug reports about it failing to load recently.

Fixes https://github.com/FluxML/Flux.jl/pull/852#issuecomment-538028314

cc @MikeInnes @xukai92

Co-authored-by: Tim Besard <tim.besard@gmail.com>
2019-10-08 13:24:49 +00:00
Dhairya Gandhi b503741651 expanded docstrings 2019-10-04 14:46:03 +05:30
Tim Besard 8aea15e6e0 Demote to const variables. 2019-10-03 21:28:55 +02:00
Tim Besard 2369b2b3fd Add an environment variable to disable CUDA usage. 2019-10-03 21:27:54 +02:00
Tim Besard 63d196aa37 Check if CUDA availability changed during init. 2019-10-03 20:05:32 +02:00
bors[bot] 0d3aa8fa5e
Merge #877
877: Fix functor's `params!` to work with complex numbers r=MikeInnes a=PhilipVinc

I believe you forgot to define `params!` for complex-valued arrays.

If I'm wrong, feel free to close this.

Co-authored-by: Filippo Vicentini <filippovicentini@gmail.com>
2019-10-01 15:11:55 +00:00