1141: Speedup matmul of CuMatrix and OneHotMatrix r=CarloLucibello a=AStupidBear

This solves #189.

```julia
julia> using Flux


julia> using Flux: CuArrays

julia> A = zeros(300, 10000) |> gpu;

julia> B = Flux.onehotbatch(rand(1:10000, 256), 1:10000) |> gpu;

julia> A * B; CuArrays.@time A * B;
┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with `allowscalar(false)`
└ @ GPUArrays ~/shared/.julia/packages/GPUArrays/OXvxB/src/host/indexing.jl:43
  0.002824 seconds (951 CPU allocations: 38.156 KiB) (2 GPU allocations: 301.000 KiB, 2.32% gc time of which 46.42% spent allocating)

julia> import Base: *

julia> A::AbstractMatrix * B::Flux.OneHotMatrix = @inbounds A[:, map(x->x.ix, B.data)]
* (generic function with 522 methods)

julia> A * B; CuArrays.@time A * B;
  0.000343 seconds (169 CPU allocations: 5.000 KiB) (2 GPU allocations: 301.000 KiB, 15.53% gc time of which 65.97% spent allocating)
```

Co-authored-by: Yao Lu <luyaocns@gmail.com>
This commit is contained in:
bors[bot] 2020-06-06 17:00:01 +00:00 committed by GitHub
commit 9ebbe8cb4c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 2 additions and 1 deletions

View File

@ -27,7 +27,8 @@ Base.getindex(xs::OneHotMatrix, ::Colon, ::Colon) = OneHotMatrix(xs.height, copy
Base.getindex(xs::OneHotMatrix, i::Integer, ::Colon) = map(x -> x[i], xs.data)
A::AbstractMatrix * B::OneHotMatrix = A[:, map(x->x.ix, B.data)]
# remove workaround when https://github.com/JuliaGPU/CuArrays.jl/issues/676 is fixed
A::AbstractMatrix * B::OneHotMatrix = A[:, cpu(map(x->x.ix, B.data))]
Base.hcat(x::OneHotVector, xs::OneHotVector...) = OneHotMatrix(length(x), [x, xs...])