diff --git a/latest/gpu.html b/latest/gpu.html
index 1c71d7d4..2f416ffc 100644
--- a/latest/gpu.html
+++ b/latest/gpu.html
@@ -15,11 +15,11 @@
predict(x) = W*x .+ b
loss(x, y) = sum((predict(x) .- y).^2)
x, y = cu(rand(5)), cu(rand(2)) # Dummy data
loss(x, y) # ~ 3
Note that we convert both the parameters (W, b) and the data set (x, y) to CUDA arrays. Taking derivatives and training works exactly as before.
-If you define a structured model, like a Dense layer or Chain, you just need to convert the internal parameters. Flux provides mapparams, which allows you to alter all parameters of a model at once.
+If you define a structured model, like a Dense layer or Chain, you just need to convert the internal parameters. Flux provides fmap, which allows you to alter all parameters of a model at once.
d = Dense(10, 5, σ)
-d = mapparams(cu, d)
+d = fmap(cu, d)
d.W # Tracked CuArray
d(cu(rand(10))) # CuArray output
m = Chain(Dense(10, 5, σ), Dense(5, 2), softmax)
-m = mapparams(cu, m)
+m = fmap(cu, m)
m(cu(rand(10)))
The mnist example contains the code needed to run the model on the GPU; just uncomment the lines after using CuArrays.
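To make "taking derivatives and training works exactly as before" concrete, here is a minimal sketch of the tracked-parameter workflow from the basics, run on the GPU. It assumes this docs version's Tracker-style API (param, back!, and the .grad field); treat it as illustrative rather than canonical.

```julia
using Flux, CuArrays

# Track the parameters so gradients are recorded, then move them to the GPU.
W = param(cu(rand(2, 5)))
b = param(cu(rand(2)))

predict(x) = W*x .+ b
loss(x, y) = sum((predict(x) .- y).^2)

x, y = cu(rand(5)), cu(rand(2))  # dummy data, already on the GPU

l = loss(x, y)
Flux.back!(l)  # backpropagate through the tracked operations
W.grad         # gradient of the loss w.r.t. W, itself a CuArray
```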
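Likewise, since fmap(cu, m) leaves a model whose parameters are tracked CuArrays, a plain gradient-descent step looks the same as on the CPU. A sketch under the same assumptions (params, .data, and .grad being this version's Tracker API):

```julia
using Flux, CuArrays

m = Chain(Dense(10, 5, σ), Dense(5, 2), softmax)
m = fmap(cu, m)                   # all parameters now live on the GPU

x, y = cu(rand(10)), cu(rand(2))  # dummy input/target on the GPU
loss(x, y) = sum((m(x) .- y).^2)

Flux.back!(loss(x, y))            # accumulate gradients into each parameter
for p in params(m)
  p.data .-= 0.1 .* p.grad        # simple SGD update, entirely on the GPU
  p.grad .= 0                     # reset gradients for the next step
end
```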