Overview
Many data loading, training, and prediction tasks can be thought of as chaining a sequence of operations. For example, to train a model to segment cells in a large 2D+t movie, one might want to:
Read the data (the movie) and meta data (spatial resolution, data type, ground-truth annotations).
Pick a sample to train on from a random frame, at a random location.
Augment the sample (rotate, mirror, flip, elastically deform, change intensity, etc.).
Stack several samples together into a batch.
Perform a training iteration on the batch
In gunpowder
, a sequence of operations like the example above is assembled
in the form of a pipeline of linked nodes. Once the pipeline is built,
requests can be made at the end of the pipeline (e.g, “give me a batch with
ten images of size (100, 100)
”). This request will then be passed upstream
from node to node. Nodes will update the request to ask for additional data
they need (e.g., a node performing a rotation of 45° will require an image of
size (142, 142)
to satisfy the requested size (100, 100)
). Once the
request hits a source node, a batch with the requested data will be
created. This batch is then sent downstream the pipeline, once again visiting
each node along the path to perform the actual operation (e.g., rotate by 45°
and crop to (100, 100)
).
The Simple Pipeline demonstrates how such
a pipeline in assembled in gunpowder
(and also how to use a similar
pipeline for prediction).