From 05cca818ec80b46301730bbebd4d2928ea373105 Mon Sep 17 00:00:00 2001
From: autodocs
Date: Tue, 28 Feb 2017 17:06:34 +0000
Subject: [PATCH] build based on 015c3b0

---
 latest/apis/backends.html     |  2 +-
 latest/apis/batching.html     |  2 +-
 latest/apis/storage.html      |  2 +-
 latest/contributing.html      |  2 +-
 latest/examples/char-rnn.html | 53 ++++++++++++++++++++++++++---------
 latest/examples/logreg.html   |  2 +-
 latest/index.html             |  2 +-
 latest/internals.html         |  2 +-
 latest/models/basics.html     |  2 +-
 latest/models/debugging.html  |  2 +-
 latest/models/recurrent.html  |  2 +-
 latest/models/templates.html  |  2 +-
 latest/search_index.js        |  2 +-
 13 files changed, 52 insertions(+), 25 deletions(-)

diff --git a/latest/apis/backends.html b/latest/apis/backends.html
index 0785ba35..ca2e5c36 100644
--- a/latest/apis/backends.html
+++ b/latest/apis/backends.html
@@ -150,7 +150,7 @@ Backends
-
+
diff --git a/latest/apis/batching.html b/latest/apis/batching.html
index dc10ab59..ea72e19a 100644
--- a/latest/apis/batching.html
+++ b/latest/apis/batching.html
@@ -155,7 +155,7 @@ Batching
-
+
diff --git a/latest/apis/storage.html b/latest/apis/storage.html
index 65c0e2bf..2284e6e9 100644
--- a/latest/apis/storage.html
+++ b/latest/apis/storage.html
@@ -139,7 +139,7 @@ Storing Models
-
+
diff --git a/latest/contributing.html b/latest/contributing.html
index 48f0f758..0b3fcfae 100644
--- a/latest/contributing.html
+++ b/latest/contributing.html
@@ -136,7 +136,7 @@ Contributing & Help
-
+
diff --git a/latest/examples/char-rnn.html b/latest/examples/char-rnn.html
index c5a891df..736eec01 100644
--- a/latest/examples/char-rnn.html
+++ b/latest/examples/char-rnn.html
@@ -139,7 +139,7 @@ Char RNN
-
+
@@ -153,22 +153,41 @@ Char RNN
 Char RNN
+

+This walkthrough will take you through a model like that used in Karpathy's 2015 blog post, "The Unreasonable Effectiveness of Recurrent Neural Networks" (http://karpathy.github.io/2015/05/21/rnn-effectiveness/), which can learn to generate text in the style of Shakespeare (or whatever else you may use as input). shakespeare_input.txt is available at https://cs.stanford.edu/people/karpathy/char-rnn/shakespeare_input.txt.

using Flux
-import StatsBase: wsample
-
-nunroll = 50
+import StatsBase: wsample
+

+Firstly, we define up front how many steps we want to unroll the RNN, and the number of data points to batch together. Then we create some functions to prepare our data, using Flux's built-in utilities. +

+
nunroll = 50
 nbatch = 50
 
 getseqs(chars, alphabet) = sequences((onehot(Float32, char, alphabet) for char in chars), nunroll)
-getbatches(chars, alphabet) = batches((getseqs(part, alphabet) for part in chunk(chars, nbatch))...)
-
-input = readstring("$(homedir())/Downloads/shakespeare_input.txt")
+getbatches(chars, alphabet) = batches((getseqs(part, alphabet) for part in chunk(chars, nbatch))...)
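+
+As a sanity check on the encoding: onehot maps a character to a vector over the alphabet with a single nonzero entry. A toy sketch (the printed result is illustrative, not verified output of this Flux version):
+
+onehot(Float32, 'b', ['a', 'b', 'c'])  # ≈ Float32[0.0, 1.0, 0.0]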
+

+Because we want the RNN to predict the next letter at each iteration, our target data is simply our input data offset by one. For example, if the input is "The quick brown fox", the target will be "he quick brown fox ". Each letter is one-hot encoded and sequences are batched together to create the training data. +

+
input = readstring("shakespeare_input.txt")
 alphabet = unique(input)
 N = length(alphabet)
 
-Xs, Ys = getbatches(input, alphabet), getbatches(input[2:end], alphabet)
-
-model = Chain(
+Xs, Ys = getbatches(input, alphabet), getbatches(input[2:end], alphabet)
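+
+To see the offset-by-one pairing concretely, here is a toy sketch in plain Julia (illustrative only, not part of the original example):
+
+s = "The quick brown fox"
+collect(zip(s, s[2:end]))[1:3]  # => [('T','h'), ('h','e'), ('e',' ')] – each character paired with its successor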
+

+Creating the model and training it is straightforward: +

+
model = Chain(
   Input(N),
   LSTM(N, 256),
   LSTM(256, 256),
@@ -177,9 +196,13 @@ model = Chain(
 
 m = tf(unroll(model, nunroll))
 
-@time Flux.train!(m, Xs, Ys, η = 0.1, epoch = 1)
-
-function sample(model, n, temp = 1)
+@time Flux.train!(m, Xs, Ys, η = 0.1, epoch = 1)
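+
+A single epoch is enough to check that the pipeline works; as noted at the end of this walkthrough, longer runs give better samples. The same call extends naturally (illustrative):
+
+@time Flux.train!(m, Xs, Ys, η = 0.1, epoch = 30)  # ~30 epochs for good samples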
+

+Finally, we can sample the model. For sampling we remove the softmax from the end of the chain so that we can "sharpen" the resulting probabilities.

+
function sample(model, n, temp = 1)
   s = [rand(alphabet)]
   m = tf(unroll(model, 1))
   for i = 1:n
@@ -189,6 +212,10 @@ function sample(model, n, temp = 1)
 end
 
 sample(model[1:end-1], 100)
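+
+The temp argument rescales the model's raw scores before they are normalised into probabilities; temperatures below 1 sharpen the distribution. A minimal plain-Julia sketch of the idea (softmax_temp is a hypothetical helper, not a Flux function):
+
+# Illustrative: temp < 1 sharpens the distribution, temp > 1 flattens it
+softmax_temp(xs, temp) = exp.(xs ./ temp) ./ sum(exp.(xs ./ temp))
+softmax_temp([1.0, 2.0, 3.0], 0.5)  # puts most of the mass on the largest score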
+

+sample then produces a string of Shakespeare-like text. This won't produce great results after only a single epoch (though they will be recognisably different from the untrained model). Going for 30 epochs or so produces good results.
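+
+For example, a longer, lower-temperature sample (illustrative usage):
+
+println(sample(model[1:end-1], 500, 0.8))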