<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<title>Batching · Flux</title>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', 'UA-36890222-9', 'auto');
ga('send', 'pageview');

</script>
<link href="https://cdnjs.cloudflare.com/ajax/libs/normalize/4.2.0/normalize.min.css" rel="stylesheet" type="text/css"/>
<link href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.5.0/styles/default.min.css" rel="stylesheet" type="text/css"/>
<link href="https://fonts.googleapis.com/css?family=Lato|Ubuntu+Mono" rel="stylesheet" type="text/css"/>
<link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" type="text/css"/>
<link href="../assets/documenter.css" rel="stylesheet" type="text/css"/>
<script>
documenterBaseURL=".."
</script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.2.0/require.min.js" data-main="../assets/documenter.js"></script>
<script src="../../versions.js"></script>
<link href="../../flux.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<nav class="toc">
<h1>Flux</h1>
<form class="search" action="../search.html">
<select id="version-selector" onChange="window.location.href=this.value">
<option value="#" selected="selected" disabled="disabled">Version</option>
</select>
<input id="search-query" name="q" type="text" placeholder="Search docs"/>
</form>
<ul>
<li><a class="toctext" href="../index.html">Home</a></li>
<li><span class="toctext">Building Models</span>
<ul>
<li><a class="toctext" href="../models/basics.html">Model Building Basics</a></li>
<li><a class="toctext" href="../models/templates.html">Model Templates</a></li>
<li><a class="toctext" href="../models/recurrent.html">Recurrence</a></li>
<li><a class="toctext" href="../models/debugging.html">Debugging</a></li>
</ul>
</li>
<li><span class="toctext">Other APIs</span>
<ul>
<li class="current"><a class="toctext" href="batching.html">Batching</a>
<ul class="internal">
<li><a class="toctext" href="#Basics-1">Basics</a></li>
<li><a class="toctext" href="#Sequences-and-Nesting-1">Sequences and Nesting</a></li>
<li><a class="toctext" href="#Future-Work-1">Future Work</a></li>
</ul>
</li>
<li><a class="toctext" href="backends.html">Backends</a></li>
</ul>
</li>
<li><span class="toctext">In Action</span>
<ul>
<li><a class="toctext" href="../examples/logreg.html">Logistic Regression</a></li>
</ul>
</li>
<li><a class="toctext" href="../contributing.html">Contributing &amp; Help</a></li>
<li><a class="toctext" href="../internals.html">Internals</a></li>
</ul>
</nav>
<article id="docs">
<header>
<nav>
<ul>
<li>Other APIs</li>
<li><a href="batching.html">Batching</a></li>
</ul>
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/08b67d9b76d93cf4c7ae971a4e5cf9ba07a7df69/docs/src/apis/batching.md"><span class="fa"></span> Edit on GitHub</a>
</nav>
<hr/>
</header>
<h1><a class="nav-anchor" id="Batching-1" href="#Batching-1">Batching</a></h1>
<h2><a class="nav-anchor" id="Basics-1" href="#Basics-1">Basics</a></h2>
<p>
Existing machine learning frameworks and libraries represent batching, and other properties of data, only implicitly. Your machine learning data is a large <code>N</code>-dimensional array, which may have a shape like:
</p>
<pre><code class="language-julia">100 × 50 × 256 × 256</code></pre>
<p>
Typically, this might represent that you have (say) a batch of 100 samples, where each sample is a 50-long sequence of 256×256 images. This is great for performance, but array operations often become much more cumbersome as a result. Especially if you manipulate dimensions at runtime as an optimisation, debugging models can become extremely fiddly, with a proliferation of <code>X × Y × Z</code> arrays and no information about where they came from.
</p>
<p>
Flux introduces a new approach where the batch dimension is represented explicitly as part of the data. For example:
</p>
<pre><code class="language-julia">julia> xs = Batch([[1,2,3], [4,5,6]])
2-element Batch of Vector{Int64}:
[1,2,3]
[4,5,6]</code></pre>
<p>
Batches are represented the way we <em>think</em> about them: as a list of data points. We can do all the usual array operations with them, including getting the first with <code>xs[1]</code>, iterating over them and so on. The trick is that under the hood, the data is batched into a single array:
</p>
<pre><code class="language-julia">julia> rawbatch(xs)
2×3 Array{Int64,2}:
1 2 3
4 5 6</code></pre>
<p>
When we put a <code>Batch</code> object into a model, the model is ultimately working with a single array, which means there's no performance overhead and we get the full benefit of standard batching.
</p>
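<p>
To make the stacking concrete, here is a minimal sketch in plain Julia (1.x syntax) that does not use Flux at all. The helper <code>stack_leftmost</code> is a hypothetical name introduced for illustration; it reflects an assumption about what the stacking amounts to, not how <code>Batch</code> is actually implemented:
</p>
<pre><code class="language-julia"># Hypothetical helper (not part of Flux): stack equally sized arrays
# along a new left-most dimension.
function stack_leftmost(items)
    n = ndims(first(items))
    stacked = cat(items...; dims = n + 1)    # new trailing dimension indexes the items
    permutedims(stacked, (n + 1, 1:n...))    # move that dimension to the front
end

raw = stack_leftmost([[1, 2, 3], [4, 5, 6]])
size(raw)    # (2, 3): one row per data point
raw[1, :]    # the first data point, [1, 2, 3]</code></pre>
<p>
The same helper accepts matrices or higher-dimensional arrays, since it only relies on <code>cat</code> and <code>permutedims</code> from Base.
</p>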
<p>
Turning a set of vectors into a matrix is fairly easy anyway, so what's the big deal? Well, it gets more interesting as we start working with more complex data. Say we were working with 2×2 images:
</p>
<pre><code class="language-julia">julia> xs = Batch([[1 2; 3 4], [5 6; 7 8]])
2-element Flux.Batch of Array{Int64,2}:
[1 2; 3 4]
[5 6; 7 8]</code></pre>
<p>
The raw batch array is much messier, and harder to recognise:
</p>
<pre><code class="language-julia">julia> rawbatch(xs)
2×2×2 Array{Int64,3}:
[:, :, 1] =
1 3
5 7

[:, :, 2] =
2 4
6 8</code></pre>
<p>
Furthermore, because the batch acts like a list of arrays, we can use simple and familiar operations on it:
</p>
<pre><code class="language-julia">julia> map(flatten, xs)
2-element Array{Array{Int64,1},1}:
[1,3,2,4]
[5,7,6,8]</code></pre>
<p>
<code>flatten</code> is simple enough over a single data point, but flattening a batched data set is more complex and you end up needing arcane array operations like <code>mapslices</code>. A <code>Batch</code> can just handle this for you for free, and more importantly it ensures that your operations are <em>correct</em> – that you haven't mixed up your batch and data dimensions, or used the wrong array op, and so on.
</p>
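<p>
As a rough illustration of the difference, the sketch below flattens the two 2×2 samples both ways in plain Julia (1.x syntax), without Flux. It assumes that <code>flatten</code> behaves like Base's <code>vec</code>, which matches the output shown above, and that the raw array is laid out as <code>raw[b, i, j]</code> with the batch index first:
</p>
<pre><code class="language-julia"># Two 2×2 samples, as in the example above.
samples = [[1 2; 3 4], [5 6; 7 8]]

# Assumed raw layout: stack along a new left-most dimension, raw[b, i, j].
raw = permutedims(cat(samples...; dims = 3), (3, 1, 2))

# Per-sample view: flattening one data point is just `vec`.
per_sample = map(vec, samples)

# Batched view: keep the batch dimension and collapse all the others.
flat = reshape(raw, size(raw, 1), :)

flat[1, :] == per_sample[1]    # true
flat[2, :] == per_sample[2]    # true</code></pre>
<p>
The batched version only gives the right answer because the batch dimension happens to come first; with a different layout the same <code>reshape</code> would silently mix data points, which is the kind of mistake the explicit representation is meant to rule out.
</p>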
<h2><a class="nav-anchor" id="Sequences-and-Nesting-1" href="#Sequences-and-Nesting-1">Sequences and Nesting</a></h2>
<p>
As well as <code>Batch</code>, there's a structure called <code>Seq</code> which behaves very similarly. Let's say we have two one-hot encoded DNA sequences:
</p>
<pre><code class="language-julia">julia> x1 = Seq([[0,1,0,0], [1,0,0,0], [0,0,0,1]]) # [A, T, C, G]
julia> x2 = Seq([[0,0,1,0], [0,0,0,1], [0,0,1,0]])

julia> rawbatch(x1)
3×4 Array{Int64,2}:
0 1 0 0
1 0 0 0
0 0 0 1</code></pre>
<p>
This is identical to <code>Batch</code> so far, but where it gets interesting is that you can actually nest these types:
</p>
<pre><code class="language-julia">julia> xs = Batch([x1, x2])
2-element Batch of Seq of Vector{Int64}:
[[0,1,0,0],[1,0,0,0],[0,0,0,1]]
[[0,0,1,0],[0,0,0,1],[0,0,1,0]]</code></pre>
<p>
Again, this represents itself intuitively as a list-of-lists-of-lists, but <code>rawbatch</code> shows that the real underlying value is an <code>Array{Int64,3}</code> of shape <code>2×3×4</code>.
</p>
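<p>
The correspondence between the nested list and the raw array can be spelled out with an ordinary comprehension. This is a sketch in plain Julia, without Flux, using nested vectors in place of <code>Seq</code> and <code>Batch</code>; the index order (batch, then sequence position, then one-hot component) is an assumption taken from the shape quoted above:
</p>
<pre><code class="language-julia"># The two one-hot sequences from the example above, as plain nested vectors.
x1 = [[0,1,0,0], [1,0,0,0], [0,0,0,1]]
x2 = [[0,0,1,0], [0,0,0,1], [0,0,1,0]]
xs = [x1, x2]

# b indexes the batch, t the position in the sequence, k the one-hot component.
raw = [xs[b][t][k] for b in 1:2, t in 1:3, k in 1:4]

size(raw)       # (2, 3, 4)
raw[1, :, :]    # the 3×4 matrix for x1, one row per sequence element</code></pre>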
<h2><a class="nav-anchor" id="Future-Work-1" href="#Future-Work-1">Future Work</a></h2>
<p>
The design of batching is still a fairly early work in progress, though it's used in a few places in the system. For example, all Flux models expect to be given <code>Batch</code> objects which are unwrapped into raw arrays for the computation. Models will convert their arguments if necessary, so it's convenient to call a model with a single data point like <code>f([1,2,3])</code>.
</p>
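<p>
One way to picture that conversion is a small wrapper that promotes a single vector to a batch of one before calling the model. The sketch below is plain Julia (1.x syntax); <code>as_batch</code> and the stand-in <code>model</code> are hypothetical names used only for illustration, and Flux's own conversion may work differently:
</p>
<pre><code class="language-julia"># Hypothetical helper (not Flux API): promote a single vector to a batch of one.
as_batch(x::AbstractVector) = reshape(x, 1, :)
as_batch(x::AbstractMatrix) = x    # assumed to be one row per data point already

# Stand-in "model" that works on raw batches: it sums each row.
model(raw) = sum(raw; dims = 2)

f(x) = model(as_batch(x))

f([1, 2, 3])         # 1×1 result for the single data point
f([1 2 3; 4 5 6])    # 2×1 result, one entry per data point</code></pre>
<p>
Note that dispatching on the array rank is ambiguous in general, since a matrix could also be a single image-like data point; an explicit <code>Batch</code> wrapper avoids having to guess.
</p>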
<p>
Right now, the <code>Batch</code> and <code>Seq</code> types always stack along the left-most dimension. In future, this will be customisable, and Flux will provide implementations of common functions that are generic across the batch dimension. This brings the following benefits:
</p>
<ul>
<li>
<p>
Code can be written in a batch-agnostic way or be generic across batching setups. Code works with a single data point, and batching happens independently.
</p>
</li>
<li>
<p>
Automatic batching can be done with correctness assured, reducing programmer errors when manipulating dimensions.
</p>
</li>
<li>
<p>
Optimisations, like switching batch dimensions, can be expressed by the programmer with compiler support; fewer code changes are required and optimisations are guaranteed not to break the model.
</p>
</li>
<li>
<p>
This also opens the door for more automatic optimisations, e.g. having the compiler explore the search space of possible batching combinations.
</p>
</li>
</ul>
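<p>
The first benefit, writing code for a single data point and batching it separately, can be sketched in plain Julia (1.x syntax) as follows. The names <code>normalise</code> and <code>lift</code> are hypothetical and the sketch handles only vector-valued data points stacked as rows; it is not the planned Flux design:
</p>
<pre><code class="language-julia"># Written for one data point only: scale a vector so its entries sum to one.
normalise(x) = x ./ sum(x)

# Hypothetical lifting helper (not Flux API): apply `f` to each row of a raw
# batch and stack the results back up as rows.
lift(f, raw) = permutedims(reduce(hcat, [f(raw[b, :]) for b in 1:size(raw, 1)]))

raw = [1.0 1.0 2.0; 2.0 2.0 4.0]
out = lift(normalise, raw)

size(out)    # (2, 3): same batch layout as the input
out[1, :]    # [0.25, 0.25, 0.5]</code></pre>
<p>
Because <code>normalise</code> never mentions the batch dimension, changing which dimension holds the batch would only require a different <code>lift</code>.
</p>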
<footer>
<hr/>
<a class="previous" href="../models/debugging.html"><span class="direction">Previous</span><span class="title">Debugging</span></a>
<a class="next" href="backends.html"><span class="direction">Next</span><span class="title">Backends</span></a>
</footer>
</article>
</body>
</html>