Merge #1141
1141: Speedup matmul of CuMatrix and OneHotMatrix r=CarloLucibello a=AStupidBear This solves #189. ```julia julia> using Flux julia> using Flux: CuArrays julia> A = zeros(300, 10000) |> gpu; julia> B = Flux.onehotbatch(rand(1:10000, 256), 1:10000) |> gpu; julia> A * B; CuArrays.@time A * B; ┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with `allowscalar(false)` └ @ GPUArrays ~/shared/.julia/packages/GPUArrays/OXvxB/src/host/indexing.jl:43 0.002824 seconds (951 CPU allocations: 38.156 KiB) (2 GPU allocations: 301.000 KiB, 2.32% gc time of which 46.42% spent allocating) julia> import Base: * julia> A::AbstractMatrix * B::Flux.OneHotMatrix = @inbounds A[:, map(x->x.ix, B.data)] * (generic function with 522 methods) julia> A * B; CuArrays.@time A * B; 0.000343 seconds (169 CPU allocations: 5.000 KiB) (2 GPU allocations: 301.000 KiB, 15.53% gc time of which 65.97% spent allocating) ``` Co-authored-by: Yao Lu <luyaocns@gmail.com>
This commit is contained in:
commit
9ebbe8cb4c
|
@ -27,7 +27,8 @@ Base.getindex(xs::OneHotMatrix, ::Colon, ::Colon) = OneHotMatrix(xs.height, copy
|
|||
|
||||
Base.getindex(xs::OneHotMatrix, i::Integer, ::Colon) = map(x -> x[i], xs.data)
|
||||
|
||||
A::AbstractMatrix * B::OneHotMatrix = A[:, map(x->x.ix, B.data)]
|
||||
# remove workaround when https://github.com/JuliaGPU/CuArrays.jl/issues/676 is fixed
|
||||
A::AbstractMatrix * B::OneHotMatrix = A[:, cpu(map(x->x.ix, B.data))]
|
||||
|
||||
Base.hcat(x::OneHotVector, xs::OneHotVector...) = OneHotMatrix(length(x), [x, xs...])
|
||||
|
||||
|
|
Loading…
Reference in New Issue