build based on 015c3b0
This commit is contained in:
parent 30d2c127c0
commit 05cca818ec
@@ -150,7 +150,7 @@ Backends
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/4d4979b401d04a212436021033c00f4c985b9222/docs/src/apis/backends.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/015c3b0856fcfcb7949c9188fc9c3ab393eed950/docs/src/apis/backends.md">
 <span class="fa">
 
 </span>
@@ -155,7 +155,7 @@ Batching
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/4d4979b401d04a212436021033c00f4c985b9222/docs/src/apis/batching.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/015c3b0856fcfcb7949c9188fc9c3ab393eed950/docs/src/apis/batching.md">
 <span class="fa">
 
 </span>
@@ -139,7 +139,7 @@ Storing Models
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/4d4979b401d04a212436021033c00f4c985b9222/docs/src/apis/storage.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/015c3b0856fcfcb7949c9188fc9c3ab393eed950/docs/src/apis/storage.md">
 <span class="fa">
 
 </span>
@@ -136,7 +136,7 @@ Contributing & Help
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/4d4979b401d04a212436021033c00f4c985b9222/docs/src/contributing.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/015c3b0856fcfcb7949c9188fc9c3ab393eed950/docs/src/contributing.md">
 <span class="fa">
 
 </span>
@@ -139,7 +139,7 @@ Char RNN
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/4d4979b401d04a212436021033c00f4c985b9222/docs/src/examples/char-rnn.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/015c3b0856fcfcb7949c9188fc9c3ab393eed950/docs/src/examples/char-rnn.md">
 <span class="fa">
 
 </span>
@@ -153,22 +153,41 @@ Char RNN
 Char RNN
 </a>
 </h1>
+<p>
+This walkthrough will take you through a model like that used in
+<a href="http://karpathy.github.io/2015/05/21/rnn-effectiveness/">
+Karpathy's 2015 blog post
+</a>
+, which can learn to generate text in the style of Shakespeare (or whatever else you may use as input).
+<code>shakespeare_input.txt</code>
+is
+<a href="http://cs.stanford.edu/people/karpathy/char-rnn/shakespeare_input.txt">
+here
+</a>
+.
+</p>
 <pre><code class="language-julia">using Flux
-import StatsBase: wsample
-
-nunroll = 50
+import StatsBase: wsample</code></pre>
+<p>
+Firstly, we define up front how many steps we want to unroll the RNN, and the number of data points to batch together. Then we create some functions to prepare our data, using Flux's built-in utilities.
+</p>
+<pre><code class="language-julia">nunroll = 50
 nbatch = 50
 
 getseqs(chars, alphabet) = sequences((onehot(Float32, char, alphabet) for char in chars), nunroll)
-getbatches(chars, alphabet) = batches((getseqs(part, alphabet) for part in chunk(chars, nbatch))...)
-
-input = readstring("$(homedir())/Downloads/shakespeare_input.txt")
+getbatches(chars, alphabet) = batches((getseqs(part, alphabet) for part in chunk(chars, nbatch))...)</code></pre>
+<p>
+Because we want the RNN to predict the next letter at each iteration, our target data is simply our input data offset by one. For example, if the input is "The quick brown fox", the target will be "he quick brown fox ". Each letter is one-hot encoded and sequences are batched together to create the training data.
+</p>
+<pre><code class="language-julia">input = readstring("shakespeare_input.txt")
 alphabet = unique(input)
 N = length(alphabet)
 
-Xs, Ys = getbatches(input, alphabet), getbatches(input[2:end], alphabet)
-
-model = Chain(
+Xs, Ys = getbatches(input, alphabet), getbatches(input[2:end], alphabet)</code></pre>
+<p>
+Creating the model and training it is straightforward:
+</p>
+<pre><code class="language-julia">model = Chain(
   Input(N),
   LSTM(N, 256),
   LSTM(256, 256),
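
The offset-target idea in the added prose above is easy to check outside of Flux. Below is a minimal plain-Julia sketch; onehot_vec is an illustrative stand-in for Flux's onehot, and the toy string is only for demonstration:

# Hedged sketch, plain Julia only: build one-hot input/target pairs where
# the target is the input shifted one character ahead.
text = "The quick brown fox"
alphabet = unique(text)

# Encode a character as a Float32 one-hot vector over the alphabet.
onehot_vec(c, alphabet) = Float32.(alphabet .== c)

xs = [onehot_vec(c, alphabet) for c in text[1:end-1]]  # "The quick brown fo"
ys = [onehot_vec(c, alphabet) for c in text[2:end]]    # "he quick brown fox"
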
@@ -177,9 +196,13 @@ model = Chain(
 
 m = tf(unroll(model, nunroll))
 
-@time Flux.train!(m, Xs, Ys, η = 0.1, epoch = 1)
-
-function sample(model, n, temp = 1)
+@time Flux.train!(m, Xs, Ys, η = 0.1, epoch = 1)</code></pre>
+<p>
+Finally, we can sample the model. For sampling we remove the
+<code>softmax</code>
+from the end of the chain so that we can "sharpen" the resulting probabilities.
+</p>
+<pre><code class="language-julia">function sample(model, n, temp = 1)
   s = [rand(alphabet)]
   m = tf(unroll(model, 1))
   for i = 1:n
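
To see concretely how dividing the raw scores by a temperature "sharpens" the resulting probabilities, here is a small self-contained sketch with a generic softmax (not Flux's own):

# Hedged sketch: temperature scaling of raw scores before softmax.
mysoftmax(xs) = exp.(xs) ./ sum(exp.(xs))

scores = [2.0, 1.0, 0.5]
mysoftmax(scores ./ 0.5)  # temp < 1 makes the distribution peakier
mysoftmax(scores ./ 2.0)  # temp > 1 flattens it towards uniform
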
@@ -189,6 +212,10 @@ function sample(model, n, temp = 1)
 end
 
 sample(model[1:end-1], 100)</code></pre>
+<p>
+<code>sample</code>
+then produces a string of Shakespeare-like text. This won't produce great results after only a single epoch (though they will be recognisably different from the untrained model). Going for 30 epochs or so produces good results.
+</p>
 <footer>
 <hr/>
 <a class="previous" href="logreg.html">
@@ -139,7 +139,7 @@ Logistic Regression
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/4d4979b401d04a212436021033c00f4c985b9222/docs/src/examples/logreg.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/015c3b0856fcfcb7949c9188fc9c3ab393eed950/docs/src/examples/logreg.md">
 <span class="fa">
 
 </span>
@@ -147,7 +147,7 @@ Home
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/4d4979b401d04a212436021033c00f4c985b9222/docs/src/index.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/015c3b0856fcfcb7949c9188fc9c3ab393eed950/docs/src/index.md">
 <span class="fa">
 
 </span>
@@ -136,7 +136,7 @@ Internals
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/4d4979b401d04a212436021033c00f4c985b9222/docs/src/internals.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/015c3b0856fcfcb7949c9188fc9c3ab393eed950/docs/src/internals.md">
 <span class="fa">
 
 </span>
@@ -155,7 +155,7 @@ Model Building Basics
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/4d4979b401d04a212436021033c00f4c985b9222/docs/src/models/basics.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/015c3b0856fcfcb7949c9188fc9c3ab393eed950/docs/src/models/basics.md">
 <span class="fa">
 
 </span>
@@ -139,7 +139,7 @@ Debugging
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/4d4979b401d04a212436021033c00f4c985b9222/docs/src/models/debugging.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/015c3b0856fcfcb7949c9188fc9c3ab393eed950/docs/src/models/debugging.md">
 <span class="fa">
 
 </span>
@@ -139,7 +139,7 @@ Recurrence
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/4d4979b401d04a212436021033c00f4c985b9222/docs/src/models/recurrent.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/015c3b0856fcfcb7949c9188fc9c3ab393eed950/docs/src/models/recurrent.md">
 <span class="fa">
 
 </span>
@@ -155,7 +155,7 @@ Model Templates
 </a>
 </li>
 </ul>
-<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/4d4979b401d04a212436021033c00f4c985b9222/docs/src/models/templates.md">
+<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/015c3b0856fcfcb7949c9188fc9c3ab393eed950/docs/src/models/templates.md">
 <span class="fa">
 
 </span>
@@ -261,7 +261,7 @@ var documenterSearchIndex = {"docs": [
 "page": "Char RNN",
 "title": "Char RNN",
 "category": "section",
-"text": "using Flux\nimport StatsBase: wsample\n\nnunroll = 50\nnbatch = 50\n\ngetseqs(chars, alphabet) = sequences((onehot(Float32, char, alphabet) for char in chars), nunroll)\ngetbatches(chars, alphabet) = batches((getseqs(part, alphabet) for part in chunk(chars, nbatch))...)\n\ninput = readstring(\"$(homedir())/Downloads/shakespeare_input.txt\")\nalphabet = unique(input)\nN = length(alphabet)\n\nXs, Ys = getbatches(input, alphabet), getbatches(input[2:end], alphabet)\n\nmodel = Chain(\n  Input(N),\n  LSTM(N, 256),\n  LSTM(256, 256),\n  Affine(256, N),\n  softmax)\n\nm = tf(unroll(model, nunroll))\n\n@time Flux.train!(m, Xs, Ys, η = 0.1, epoch = 1)\n\nfunction sample(model, n, temp = 1)\n  s = [rand(alphabet)]\n  m = tf(unroll(model, 1))\n  for i = 1:n\n    push!(s, wsample(alphabet, softmax(m(Seq((onehot(Float32, s[end], alphabet),)))[1]./temp)))\n  end\n  return string(s...)\nend\n\nsample(model[1:end-1], 100)"
+"text": "This walkthrough will take you through a model like that used in Karpathy's 2015 blog post, which can learn to generate text in the style of Shakespeare (or whatever else you may use as input). shakespeare_input.txt is here.using Flux\nimport StatsBase: wsampleFirstly, we define up front how many steps we want to unroll the RNN, and the number of data points to batch together. Then we create some functions to prepare our data, using Flux's built-in utilities.nunroll = 50\nnbatch = 50\n\ngetseqs(chars, alphabet) = sequences((onehot(Float32, char, alphabet) for char in chars), nunroll)\ngetbatches(chars, alphabet) = batches((getseqs(part, alphabet) for part in chunk(chars, nbatch))...)Because we want the RNN to predict the next letter at each iteration, our target data is simply our input data offset by one. For example, if the input is \"The quick brown fox\", the target will be \"he quick brown fox \". Each letter is one-hot encoded and sequences are batched together to create the training data.input = readstring(\"shakespeare_input.txt\")\nalphabet = unique(input)\nN = length(alphabet)\n\nXs, Ys = getbatches(input, alphabet), getbatches(input[2:end], alphabet)Creating the model and training it is straightforward:model = Chain(\n  Input(N),\n  LSTM(N, 256),\n  LSTM(256, 256),\n  Affine(256, N),\n  softmax)\n\nm = tf(unroll(model, nunroll))\n\n@time Flux.train!(m, Xs, Ys, η = 0.1, epoch = 1)Finally, we can sample the model. For sampling we remove the softmax from the end of the chain so that we can \"sharpen\" the resulting probabilities.function sample(model, n, temp = 1)\n  s = [rand(alphabet)]\n  m = tf(unroll(model, 1))\n  for i = 1:n\n    push!(s, wsample(alphabet, softmax(m(Seq((onehot(Float32, s[end], alphabet),)))[1]./temp)))\n  end\n  return string(s...)\nend\n\nsample(model[1:end-1], 100)sample then produces a string of Shakespeare-like text. This won't produce great results after only a single epoch (though they will be recognisably different from the untrained model). Going for 30 epochs or so produces good results."
 },
 
 {