diff --git a/latest/apis/backends.html b/latest/apis/backends.html
index 3e536d21..b0f43da5 100644
--- a/latest/apis/backends.html
+++ b/latest/apis/backends.html
@@ -150,7 +150,7 @@ Backends
-
+
diff --git a/latest/apis/batching.html b/latest/apis/batching.html
index 82657715..eb013471 100644
--- a/latest/apis/batching.html
+++ b/latest/apis/batching.html
@@ -155,7 +155,7 @@ Batching
-
+
@@ -197,7 +197,7 @@ Batches are represented the way we think
- about them; as an list of data points. We can do all the usual array operations with them, including getting the first with
+ about them; as a list of data points. We can do all the usual array operations with them, including getting the first with
 xs[1], iterating over them and so on. The trick is that under the hood, the data is batched into a single array:

diff --git a/latest/apis/storage.html b/latest/apis/storage.html
index 1f8b65ee..18b26fdc 100644
--- a/latest/apis/storage.html
+++ b/latest/apis/storage.html
@@ -139,7 +139,7 @@ Storing Models
-
+
diff --git a/latest/contributing.html b/latest/contributing.html
index 0cb166d6..f5dd1392 100644
--- a/latest/contributing.html
+++ b/latest/contributing.html
@@ -136,7 +136,7 @@ Contributing & Help
-
+
diff --git a/latest/examples/char-rnn.html b/latest/examples/char-rnn.html
index a7bfe4bb..a38de58c 100644
--- a/latest/examples/char-rnn.html
+++ b/latest/examples/char-rnn.html
@@ -139,7 +139,7 @@ Char RNN
-
+
@@ -174,16 +174,21 @@ Firstly, we define up front how many steps we want to unroll the RNN, and the nu
nunroll = 50
 nbatch = 50
 
-getseqs(chars, alphabet) = sequences((onehot(Float32, char, alphabet) for char in chars), nunroll)
-getbatches(chars, alphabet) = batches((getseqs(part, alphabet) for part in chunk(chars, nbatch))...)
+getseqs(chars, alphabet) =
+  sequences((onehot(Float32, char, alphabet) for char in chars), nunroll)
+getbatches(chars, alphabet) =
+  batches((getseqs(part, alphabet) for part in chunk(chars, nbatch))...)

Because we want the RNN to predict the next letter at each iteration, our target data is simply our input data offset by one. For example, if the input is "The quick brown fox", the target will be "he quick brown fox ". Each letter is one-hot encoded and sequences are batched together to create the training data.
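To make the offset concrete, here is a tiny illustration in plain Julia (the toy string and variable names are hypothetical, not part of the example):

toy     = "The quick brown fox"
inputs  = collect(toy)          # 'T', 'h', 'e', ' ', 'q', …
targets = collect(toy[2:end])   # 'h', 'e', ' ', 'q', 'u', …
# At step i the model sees inputs[i] and should predict targets[i]:
collect(zip(inputs, targets))[1:3]  # ('T', 'h'), ('h', 'e'), ('e', ' ')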

-input = readstring("shakespeare_input.txt")
+input = readstring("shakespeare_input.txt");
 alphabet = unique(input)
 N = length(alphabet)
 
-Xs, Ys = getbatches(input, alphabet), getbatches(input[2:end], alphabet)
+# An iterator of (input, output) pairs
+train = zip(getbatches(input, alphabet), getbatches(input[2:end], alphabet))
+# We will evaluate the loss on a particular batch to monitor the training.
+eval = tobatch.(first(drop(train, 5)))

Creating the model and training it is straightforward:

@@ -196,7 +201,11 @@ Creating the model and training it is straightforward:
 m = tf(unroll(model, nunroll))
 
-@time Flux.train!(m, Xs, Ys, η = 0.1, epoch = 1)
+# Call this to see how the model is doing
+evalcb = () -> @show logloss(m(eval[1]), eval[2])
+
+@time Flux.train!(m, train, η = 0.1, loss = logloss, cb = [evalcb])
+
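For context, the model being unrolled above is defined earlier on the page and is unchanged by this diff, so it does not appear here. In this version of Flux the definition looks roughly like the following sketch (layer sizes are illustrative and may differ from the actual page):

model = Chain(
  Input(N),
  LSTM(N, 256),
  LSTM(256, 256),
  Affine(256, N),
  softmax)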

Finally, we can sample the model. For sampling we remove the softmax
@@ -204,9 +213,9 @@ Finally, we can sample the model. For sampling we remove the

function sample(model, n, temp = 1)
   s = [rand(alphabet)]
-  m = tf(unroll(model, 1))
-  for i = 1:n
-    push!(s, wsample(alphabet, softmax(m(Seq((onehot(Float32, s[end], alphabet),)))[1]./temp)))
+  m = unroll1(model)
+  for i = 1:n-1
+    push!(s, wsample(alphabet, softmax(m(unsqueeze(onehot(s[end], alphabet)))./temp)[1,:]))
   end
   return string(s...)
 end
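A hypothetical way to call the sampler once the model is trained (not part of the page excerpt above): generate a long string, optionally lowering the temperature to make the choices less random.

println(sample(model, 1000))
println(sample(model, 1000, 0.5))  # lower temperature, more conservative output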
diff --git a/latest/examples/logreg.html b/latest/examples/logreg.html
index d9b1ceaf..5fd4efc3 100644
--- a/latest/examples/logreg.html
+++ b/latest/examples/logreg.html
@@ -139,7 +139,7 @@ Simple MNIST
               
             
           
-          
+          
             
@@ -160,6 +160,7 @@ This walkthrough example will take you through writing a multi-layer perceptron
 First, we load the data using the MNIST package:
       

using Flux, MNIST
+using Flux: accuracy
 
 data = [(trainfeatures(i), onehot(trainlabel(i), 0:9)) for i = 1:60_000]
 train = data[1:50_000]
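As an aside, onehot above turns each label into a length-10 indicator vector. A plain-Julia sketch of the idea (myonehot is a hypothetical helper, not Flux's implementation):

myonehot(label, labels) = [label == l ? 1.0 : 0.0 for l in labels]
myonehot(5, 0:9)  # => [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]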
@@ -190,7 +191,7 @@ Otherwise, the format of the data is simple enough, it's just a list of tupl
       

Now we define our model, which will simply be a function from one to the other.

-m = Chain(
+m = @Chain(
   Input(784),
   Affine(128), relu,
   Affine( 64), relu,
@@ -200,7 +201,7 @@ model = mxnet(m) # Convert to MXNet

We can try this out on our data already:

-julia> model(data[1][1])
+julia> model(tobatch(data[1][1]))
 10-element Array{Float64,1}:
  0.10614  
  0.0850447
@@ -209,7 +210,8 @@ We can try this out on our data already:
       

The model gives a probability of about 0.1 to each class – which is a way of saying, "I have no idea". This isn't too surprising as we haven't shown it any data yet. This is easy to fix:

-Flux.train!(model, train, test, η = 1e-4)
+Flux.train!(model, train, η = 1e-3,
+            cb = [()->@show accuracy(m, test)])
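Roughly what the accuracy callback reports, as a sketch (it assumes test is a list of (features, one-hot label) tuples like the training data; myaccuracy is hypothetical, not Flux's accuracy):

myaccuracy(m, data) = mean(indmax(m(tobatch(x))) == indmax(y) for (x, y) in data)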

The training step takes about 5 minutes (to make it faster we can do smarter things like batching). If you run this code in Juno, you'll see a progress meter, which you can hover over to see the remaining computation time.
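A rough sketch of the batching idea mentioned above, in plain Julia (not the Flux batching API): group the examples into minibatches so that each update touches many examples at once.

batchsize   = 100
minibatches = [train[i:min(i + batchsize - 1, end)] for i in 1:batchsize:length(train)]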

@@ -231,7 +233,7 @@ Notice the class at 93%, suggesting our model is very confident about this image
julia> onecold(data[1][2], 0:9)
 5
 
-julia> onecold(model(data[1][1]), 0:9)
+julia> onecold(model(tobatch(data[1][1])), 0:9)
 5

Success!
diff --git a/latest/index.html b/latest/index.html
index ea560455..709b7ce7 100644
--- a/latest/index.html
+++ b/latest/index.html
@@ -147,7 +147,7 @@ Home
-
+
@@ -169,6 +169,16 @@ Flux aims to be an intuitive and powerful notation, close to the mathematics, th

So what's the catch? Flux is at an early "working prototype" stage; many things work but the API is still in a state of... well, it might change. If you're interested to find out what works, read on!
+

+

+
+Note: If you're using Julia v0.5 please see this version of the docs instead.

@@ -233,6 +243,12 @@ TensorFlow

Pkg.add("MXNet") # or "TensorFlow"
 Pkg.test("Flux") # Make sure everything installed properly
+

+
+Note: TensorFlow integration may not work properly on Julia v0.6 yet.
+