<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<title>
Batching · Flux
</title>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-36890222-9', 'auto');
ga('send', 'pageview');
</script>
<link href="https://cdnjs.cloudflare.com/ajax/libs/normalize/4.2.0/normalize.min.css" rel="stylesheet" type="text/css"/>
<link href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.5.0/styles/default.min.css" rel="stylesheet" type="text/css"/>
<link href="https://fonts.googleapis.com/css?family=Lato|Ubuntu+Mono" rel="stylesheet" type="text/css"/>
<link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" type="text/css"/>
<link href="../assets/documenter.css" rel="stylesheet" type="text/css"/>
<script>
documenterBaseURL=".."
</script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.2.0/require.min.js" data-main="../assets/documenter.js"></script>
<script src="../../versions.js"></script>
<link href="../../flux.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<nav class="toc">
<h1>
Flux
</h1>
<form class="search" action="../search.html">
<select id="version-selector" onChange="window.location.href=this.value">
<option value="#" selected="selected" disabled="disabled">
Version
</option>
</select>
<input id="search-query" name="q" type="text" placeholder="Search docs"/>
</form>
<ul>
<li>
<a class="toctext" href="../index.html">
Home
</a>
</li>
<li>
<span class="toctext">
Building Models
</span>
<ul>
<li>
<a class="toctext" href="../models/basics.html">
Model Building Basics
</a>
</li>
<li>
<a class="toctext" href="../models/templates.html">
Model Templates
</a>
</li>
<li>
<a class="toctext" href="../models/recurrent.html">
Recurrence
</a>
</li>
<li>
<a class="toctext" href="../models/debugging.html">
Debugging
</a>
</li>
</ul>
</li>
<li>
<span class="toctext">
Other APIs
</span>
<ul>
<li class="current">
<a class="toctext" href="batching.html">
Batching
</a>
<ul class="internal">
<li>
<a class="toctext" href="#Basics-1">
Basics
</a>
</li>
<li>
<a class="toctext" href="#Sequences-and-Nesting-1">
Sequences and Nesting
</a>
</li>
<li>
<a class="toctext" href="#Future-Work-1">
Future Work
</a>
</li>
</ul>
</li>
<li>
<a class="toctext" href="backends.html">
Backends
</a>
</li>
</ul>
</li>
<li>
<span class="toctext">
In Action
</span>
<ul>
<li>
<a class="toctext" href="../examples/logreg.html">
Logistic Regression
</a>
</li>
</ul>
</li>
<li>
<a class="toctext" href="../contributing.html">
Contributing &amp; Help
</a>
</li>
<li>
<a class="toctext" href="../internals.html">
Internals
</a>
</li>
</ul>
</nav>
<article id="docs">
<header>
<nav>
<ul>
<li>
Other APIs
</li>
<li>
<a href="batching.html">
Batching
</a>
</li>
</ul>
<a class="edit-page" href="https://github.com/MikeInnes/Flux.jl/tree/b0a097316cba8f52245c755e4898a0311db734d2/docs/src/apis/batching.md">
<span class="fa">
</span>
Edit on GitHub
</a>
</nav>
<hr/>
</header>
<h1>
<a class="nav-anchor" id="Batching-1" href="#Batching-1">
Batching
</a>
</h1>
<h2>
<a class="nav-anchor" id="Basics-1" href="#Basics-1">
Basics
</a>
</h2>
<p>
Existing machine learning frameworks and libraries represent batching, and other properties of data, only implicitly. Your machine learning data is a large
<code>N</code>
-dimensional array, which may have a shape like:
</p>
<pre><code class="language-julia">100 × 50 × 256 × 256</code></pre>
<p>
Typically, this might represent that you have (say) a batch of 100 samples, where each sample is a 50-long sequence of 256×256 images. This is great for performance, but array operations often become much more cumbersome as a result. Especially if you manipulate dimensions at runtime as an optimisation, debugging models can become extremely fiddly, with a proliferation of
<code>X × Y × Z</code>
arrays and no information about where they came from.
</p>
<p>
Flux introduces a new approach where the batch dimension is represented explicitly as part of the data. For example:
</p>
<pre><code class="language-julia">julia> xs = Batch([[1,2,3], [4,5,6]])
2-element Batch of Vector{Int64}:
[1,2,3]
[4,5,6]</code></pre>
<p>
Batches are represented the way we
<em>
think
</em>
about them: as a list of data points. We can do all the usual array operations with them, including getting the first with
<code>xs[1]</code>
, iterating over them and so on. The trick is that under the hood, the data is batched into a single array:
</p>
<pre><code class="language-julia">julia> rawbatch(xs)
2×3 Array{Int64,2}:
1 2 3
4 5 6</code></pre>
<p>
When we put a
<code>Batch</code>
object into a model, the model is ultimately working with a single array, which means there's no performance overhead and we get the full benefit of standard batching.
</p>
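<p>
As a brief sketch of the list-like interface described above (using only the operations this page introduces):
</p>
<pre><code class="language-julia">xs = Batch([[1,2,3], [4,5,6]])

xs[1]       # first sample: [1,2,3]
length(xs)  # number of samples: 2

# Iteration visits samples, not raw rows of the underlying array.
for x in xs
    println(x)
end</code></pre>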
<p>
Turning a set of vectors into a matrix is fairly easy anyway, so what's the big deal? Well, it gets more interesting as we start working with more complex data. Say we were working with 2×2 images:
</p>
<pre><code class="language-julia">julia> xs = Batch([[1 2; 3 4], [5 6; 7 8]])
2-element Flux.Batch of Array{Int64,2}:
[1 2; 3 4]
[5 6; 7 8]</code></pre>
<p>
The raw batch array is much messier, and harder to recognise:
</p>
<pre><code class="language-julia">julia> rawbatch(xs)
2×2×2 Array{Int64,3}:
[:, :, 1] =
1 3
5 7
[:, :, 2] =
2 4
6 8</code></pre>
<p>
Furthermore, because the batch acts like a list of arrays, we can use simple and familiar operations on it:
</p>
<pre><code class="language-julia">julia> map(flatten, xs)
2-element Array{Array{Int64,1},1}:
[1,3,2,4]
[5,7,6,8]</code></pre>
<p>
<code>flatten</code>
is simple enough over a single data point, but flattening a batched data set is more complex and you end up needing arcane array operations like
<code>mapslices</code>
. A
<code>Batch</code>
can just handle this for you for free, and more importantly it ensures that your operations are
<em>
correct
</em>
– that you haven't mixed up your batch and data dimensions, or used the wrong array op, and so on.
</p>
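<p>
For comparison, a rough sketch of what the raw-array version of <code>map(flatten, xs)</code> might look like by hand; the dimension numbers here are an assumption, and are exactly the kind of detail that is easy to get wrong:
</p>
<pre><code class="language-julia"># Flatten each sample's slice of the raw 2×2×2 array manually.
raw  = rawbatch(xs)
flat = mapslices(vec, raw, [2,3])      # apply vec over the data dimensions...
flat = reshape(flat, size(raw, 1), :)  # ...then repair the shape by hand</code></pre>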
<h2>
<a class="nav-anchor" id="Sequences-and-Nesting-1" href="#Sequences-and-Nesting-1">
Sequences and Nesting
</a>
</h2>
<p>
As well as
<code>Batch</code>
, there's a structure called
<code>Seq</code>
which behaves very similarly. Let's say we have two one-hot encoded DNA sequences:
</p>
<pre><code class="language-julia">julia> x1 = Seq([[0,1,0,0], [1,0,0,0], [0,0,0,1]]) # [A, T, C, G]
julia> x2 = Seq([[0,0,1,0], [0,0,0,1], [0,0,1,0]])
julia> rawbatch(x1)
3×4 Array{Int64,2}:
0 1 0 0
1 0 0 0
0 0 0 1</code></pre>
<p>
This is identical to
<code>Batch</code>
so far; but where it gets interesting is that you can actually nest these types:
</p>
<pre><code class="language-julia">julia> xs = Batch([x1, x2])
2-element Batch of Seq of Vector{Int64}:
[[0,1,0,0],[1,0,0,0],[0,0,0,1]]
[[0,0,1,0],[0,0,0,1],[0,0,1,0]]</code></pre>
<p>
Again, this represents itself intuitively as a list-of-lists-of-lists, but
<code>rawbatch</code>
shows that the real underlying value is an
<code>Array{Int64,3}</code>
of shape
<code>2×3×4</code>
.
</p>
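<p>
A short sketch of the nesting above, sticking to operations already shown on this page:
</p>
<pre><code class="language-julia">xs = Batch([x1, x2])   # a batch of two sequences

xs[1]                  # the first Seq, as written
size(rawbatch(xs))     # (2, 3, 4): batch × timestep × one-hot feature</code></pre>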
<h2>
<a class="nav-anchor" id="Future-Work-1" href="#Future-Work-1">
Future Work
</a>
</h2>
<p>
The design of batching is still a fairly early work in progress, though it's used in a few places in the system. For example, all Flux models expect to be given
<code>Batch</code>
objects which are unwrapped into raw arrays for the computation. Models will convert their arguments if necessary, so it's convenient to call a model with a single data point like
<code>f([1,2,3])</code>
.
</p>
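<p>
In other words, with <code>f</code> standing in for any Flux model, both of the following calls are intended to run the same underlying computation:
</p>
<pre><code class="language-julia">f(Batch([[1,2,3], [4,5,6]]))  # an explicit two-sample batch
f([1,2,3])                    # a single data point, converted for convenience</code></pre>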
<p>
Right now, the
<code>Batch</code>
or
<code>Seq</code>
types always stack along the left-most dimension. In future, this will be customisable, and Flux will provide implementations of common functions that are generic across the batch dimension. This brings the following benefits:
</p>
<ul>
<li>
<p>
Code can be written in a batch-agnostic way, i.e. as if working with a single data point, with batching happening independently.
</p>
</li>
<li>
<p>
Automatic batching can be done with correctness assured, reducing programmer errors when manipulating dimensions.
</p>
</li>
<li>
<p>
Optimisations, like switching batch dimensions, can be expressed by the programmer with compiler support; fewer code changes are required and optimisations are guaranteed not to break the model.
</p>
</li>
<li>
<p>
This also opens the door for more automatic optimisations, e.g. having the compiler explore the search space of possible batching combinations.
</p>
</li>
</ul>
<footer>
<hr/>
<a class="previous" href="../models/debugging.html">
<span class="direction">
Previous
</span>
<span class="title">
Debugging
</span>
</a>
<a class="next" href="backends.html">
<span class="direction">
Next
</span>
<span class="title">
Backends
</span>
</a>
</footer>
</article>
</body>
</html>