diff --git a/latest/apis/backends.html b/latest/apis/backends.html
index 7ea56e8c..25f340f4 100644
--- a/latest/apis/backends.html
+++ b/latest/apis/backends.html
@@ -150,7 +150,7 @@ Backends
-
+
diff --git a/latest/apis/batching.html b/latest/apis/batching.html
index b2bc08d4..e0a2ac7a 100644
--- a/latest/apis/batching.html
+++ b/latest/apis/batching.html
@@ -155,7 +155,7 @@ Batching
-
+
diff --git a/latest/apis/storage.html b/latest/apis/storage.html
index a3c7eb07..ee31e629 100644
--- a/latest/apis/storage.html
+++ b/latest/apis/storage.html
@@ -139,7 +139,7 @@ Storing Models
-
+
diff --git a/latest/contributing.html b/latest/contributing.html
index 2c29de1e..af1520f3 100644
--- a/latest/contributing.html
+++ b/latest/contributing.html
@@ -136,7 +136,7 @@ Contributing & Help
-
+
diff --git a/latest/examples/char-rnn.html b/latest/examples/char-rnn.html
index cb916329..2b1f0091 100644
--- a/latest/examples/char-rnn.html
+++ b/latest/examples/char-rnn.html
@@ -139,7 +139,7 @@ Char RNN
-
+
diff --git a/latest/examples/logreg.html b/latest/examples/logreg.html
index ac592e99..3207cf28 100644
--- a/latest/examples/logreg.html
+++ b/latest/examples/logreg.html
@@ -139,7 +139,7 @@ Logistic Regression
-
+
@@ -196,7 +196,7 @@ Now we define our model, which will simply be a function from one to the other.
 Affine( 64), relu,
 Affine( 10), softmax)
-model = tf(model)
+model = tf(m)
 
 We can try this out on our data already:
 
diff --git a/latest/index.html b/latest/index.html
index fa3071b0..fb0a2343 100644
--- a/latest/index.html
+++ b/latest/index.html
@@ -147,7 +147,7 @@ Home
-
+
diff --git a/latest/internals.html b/latest/internals.html
index 4919500a..2c9a05a6 100644
--- a/latest/internals.html
+++ b/latest/internals.html
@@ -136,7 +136,7 @@ Internals
-
+
diff --git a/latest/models/basics.html b/latest/models/basics.html
index 1659fb60..a387e967 100644
--- a/latest/models/basics.html
+++ b/latest/models/basics.html
@@ -155,7 +155,7 @@ Model Building Basics
-
+
diff --git a/latest/models/debugging.html b/latest/models/debugging.html
index 519791ee..ebd20b2e 100644
--- a/latest/models/debugging.html
+++ b/latest/models/debugging.html
@@ -139,7 +139,7 @@ Debugging
-
+
diff --git a/latest/models/recurrent.html b/latest/models/recurrent.html
index 5f97b317..777d0702 100644
--- a/latest/models/recurrent.html
+++ b/latest/models/recurrent.html
@@ -139,7 +139,7 @@ Recurrence
-
+
diff --git a/latest/models/templates.html b/latest/models/templates.html
index b4ad9532..b6703b8a 100644
--- a/latest/models/templates.html
+++ b/latest/models/templates.html
@@ -155,7 +155,7 @@ Model Templates
-
+
diff --git a/latest/search_index.js b/latest/search_index.js
index 5d499632..0410ef16 100644
--- a/latest/search_index.js
+++ b/latest/search_index.js
@@ -245,7 +245,7 @@ var documenterSearchIndex = {"docs": [
 "page": "Logistic Regression",
 "title": "Logistic Regression with MNIST",
 "category": "section",
-"text": "This walkthrough example will take you through writing a multi-layer perceptron that classifies MNIST digits with high accuracy.First, we load the data using the MNIST package:using Flux, MNIST\n\ndata = [(trainfeatures(i), onehot(trainlabel(i), 0:9)) for i = 1:60_000]\ntrain = data[1:50_000]\ntest = data[50_001:60_000]The only Flux-specific function here is onehot, which takes a class label and turns it into a one-hot-encoded vector that we can use for training. For example:julia> onehot(:b, [:a, :b, :c])\n3-element Array{Int64,1}:\n 0\n 1\n 0Otherwise, the format of the data is simple enough, it's just a list of tuples from input to output. For example:julia> data[1]\n([0.0,0.0,0.0, … 0.0,0.0,0.0],[0,0,0,0,0,1,0,0,0,0])data[1][1] is a 28*28 == 784 length vector (mostly zeros due to the black background) and data[1][2] is its classification.Now we define our model, which will simply be a function from one to the other.m = Chain(\n Input(784),\n Affine(128), relu,\n Affine( 64), relu,\n Affine( 10), softmax)\n\nmodel = tf(model)We can try this out on our data already:julia> model(data[1][1])\n10-element Array{Float64,1}:\n 0.10614 \n 0.0850447\n 0.101474\n ...The model gives a probability of about 0.1 to each class – which is a way of saying, \"I have no idea\". This isn't too surprising as we haven't shown it any data yet. This is easy to fix:Flux.train!(model, train, test, η = 1e-4)The training step takes about 5 minutes (to make it faster we can do smarter things like batching). If you run this code in Juno, you'll see a progress meter, which you can hover over to see the remaining computation time.Towards the end of the training process, Flux will have reported that the accuracy of the model is now about 90%. We can try it on our data again:10-element Array{Float32,1}:\n ...\n 5.11423f-7\n 0.9354 \n 3.1033f-5 \n 0.000127077\n ...Notice the class at 93%, suggesting our model is very confident about this image. We can use onecold to compare the true and predicted classes:julia> onecold(data[1][2], 0:9)\n5\n\njulia> onecold(model(data[1][1]), 0:9)\n5Success!"
+"text": "This walkthrough example will take you through writing a multi-layer perceptron that classifies MNIST digits with high accuracy.First, we load the data using the MNIST package:using Flux, MNIST\n\ndata = [(trainfeatures(i), onehot(trainlabel(i), 0:9)) for i = 1:60_000]\ntrain = data[1:50_000]\ntest = data[50_001:60_000]The only Flux-specific function here is onehot, which takes a class label and turns it into a one-hot-encoded vector that we can use for training. For example:julia> onehot(:b, [:a, :b, :c])\n3-element Array{Int64,1}:\n 0\n 1\n 0Otherwise, the format of the data is simple enough, it's just a list of tuples from input to output. For example:julia> data[1]\n([0.0,0.0,0.0, … 0.0,0.0,0.0],[0,0,0,0,0,1,0,0,0,0])data[1][1] is a 28*28 == 784 length vector (mostly zeros due to the black background) and data[1][2] is its classification.Now we define our model, which will simply be a function from one to the other.m = Chain(\n Input(784),\n Affine(128), relu,\n Affine( 64), relu,\n Affine( 10), softmax)\n\nmodel = tf(m)We can try this out on our data already:julia> model(data[1][1])\n10-element Array{Float64,1}:\n 0.10614 \n 0.0850447\n 0.101474\n ...The model gives a probability of about 0.1 to each class – which is a way of saying, \"I have no idea\". This isn't too surprising as we haven't shown it any data yet. This is easy to fix:Flux.train!(model, train, test, η = 1e-4)The training step takes about 5 minutes (to make it faster we can do smarter things like batching). If you run this code in Juno, you'll see a progress meter, which you can hover over to see the remaining computation time.Towards the end of the training process, Flux will have reported that the accuracy of the model is now about 90%. We can try it on our data again:10-element Array{Float32,1}:\n ...\n 5.11423f-7\n 0.9354 \n 3.1033f-5 \n 0.000127077\n ...Notice the class at 93%, suggesting our model is very confident about this image. We can use onecold to compare the true and predicted classes:julia> onecold(data[1][2], 0:9)\n5\n\njulia> onecold(model(data[1][1]), 0:9)\n5Success!"
 },
 
 {