
Importing Tracks

This document describes the code architecture for importing tracking data from various formats into funtracks using the Builder Pattern.

TracksBuilder Overview

The builder pattern provides a unified interface for importing tracks from different formats while sharing common validation and construction logic.

```mermaid
---
title: TracksBuilder
---
graph LR
    subgraph Prepare["prepare(source)"]
        direction TB
        ReadHeader["read_header<br/><small>get available properties</small>"]
        InferMap["infer_node_name_map<br/><small>auto-map to standard keys</small>"]
        ReadHeader --> InferMap
    end

    subgraph Build["build(source, seg, ...)"]
        direction TB
        ValidateMap["validate_name_map<br/><small>check required mappings</small>"]
        LoadSource["load_source<br/><small>load into InMemoryGeff</small>"]
        Validate["validate<br/><small>check graph structure</small>"]
        ConstructGraph["construct_graph<br/><small>build NetworkX graph</small>"]
        HandleSeg["handle_segmentation<br/><small>load & relabel if needed</small>"]
        CreateTracks[Create SolutionTracks]
        EnableFeatures["enable_features<br/><small>register & compute</small>"]

        ValidateMap --> LoadSource
        LoadSource --> Validate
        Validate --> ConstructGraph
        ConstructGraph --> HandleSeg
        HandleSeg --> CreateTracks
        CreateTracks --> EnableFeatures
    end

    Prepare --> Build

    subgraph Legend["Legend"]
        direction TB
        L1[Format-specific]
        L2[Common]

        L1 ~~~ L2
    end

    Build ~~~ Legend

    style ReadHeader fill:#b58a2f,color:#fff,stroke:#8a6823
    style LoadSource fill:#b58a2f,color:#fff,stroke:#8a6823
    style InferMap fill:#8a2fb5,color:#fff,stroke:#6a2390
    style ValidateMap fill:#8a2fb5,color:#fff,stroke:#6a2390
    style Validate fill:#8a2fb5,color:#fff,stroke:#6a2390
    style ConstructGraph fill:#8a2fb5,color:#fff,stroke:#6a2390
    style HandleSeg fill:#8a2fb5,color:#fff,stroke:#6a2390
    style CreateTracks fill:#8a2fb5,color:#fff,stroke:#6a2390
    style EnableFeatures fill:#8a2fb5,color:#fff,stroke:#6a2390
    style L1 fill:#b58a2f,color:#fff,stroke:#8a6823
    style L2 fill:#8a2fb5,color:#fff,stroke:#6a2390
```
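
The split between format-specific and common steps can be sketched as a small class hierarchy. This is a minimal illustration with hypothetical names, not the actual funtracks classes: subclasses override the I/O hooks, while the base class owns the shared pipeline.

```python
from abc import ABC, abstractmethod

class BuilderSketch(ABC):
    @abstractmethod
    def read_header(self, source):
        """Format-specific: discover importable properties."""

    @abstractmethod
    def load_source(self, source):
        """Format-specific: load data into a common in-memory form."""

    def prepare(self, source):
        # Common: read headers, then infer a name map from them
        # (here simplified to an identity mapping).
        self.read_header(source)
        self.node_name_map = {p: p for p in self.importable_node_props}

    def build(self, source):
        # Common orchestration: load, then construct the result.
        self.load_source(source)
        return dict(self.node_name_map)

class CSVSketch(BuilderSketch):
    def read_header(self, source):
        self.importable_node_props = ["t", "y", "x"]

    def load_source(self, source):
        self.data = []  # stand-in for real CSV parsing

builder = CSVSketch()
builder.prepare("tracks.csv")
print(builder.build("tracks.csv"))  # {'t': 't', 'y': 'y', 'x': 'x'}
```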

Process Steps

Preparation Phase

read_header(source) Format-specific


Read metadata/headers from source without loading data.

Should populate self.importable_node_props and self.importable_edge_props with property/column names.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `source` | `Path \| DataFrame` | Path to data source (zarr store, CSV file, etc.) or DataFrame | required |

infer_node_name_map() Common


Infer node_name_map by matching source properties to standard keys.

The node_name_map maps standard funtracks keys to source property names

{standard_key: source_property_name}

For example: {"time": "t", "pos": ["y", "x"], "seg_id": "label"}

- "time", "pos", "seg_id" are standard funtracks keys
- "t", "y", "x", "label" are property names from the source data

Uses difflib fuzzy matching with the following priority:

1. Exact matches to standard keys (time, seg_id, etc.)
2. Fuzzy matches to standard keys (case-insensitive, 40% similarity cutoff)
3. Exact matches to feature display names/value_names (including position z/y/x)
4. Fuzzy matches to feature display names (case-insensitive, 40% cutoff)
5. Remaining properties map to themselves (custom properties)

Position attributes (z, y, x) are matched via Position feature's value_names, resulting in a composite mapping like {"pos": ["z", "y", "x"]}.

Returns:

| Type | Description |
|------|-------------|
| `dict[str, str \| list[str]]` | Inferred node_name_map mapping standard keys to source property names |

Raises:

| Type | Description |
|------|-------------|
| `ValueError` | If required features cannot be inferred |
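
The exact-then-fuzzy matching described above can be illustrated with difflib directly. This is a sketch of steps 1-2 only (the standard keys and source column names here are assumptions for illustration, not the funtracks implementation):

```python
import difflib

STANDARD_KEYS = ["time", "seg_id", "track_id"]  # assumed subset of standard keys
source_props = ["T", "seg-id", "area"]          # example source columns

lowered = {p.lower(): p for p in source_props}
name_map = {}
for key in STANDARD_KEYS:
    if key in lowered:  # exact (case-insensitive) match
        name_map[key] = lowered[key]
        continue
    # fuzzy match at the 40% similarity cutoff
    close = difflib.get_close_matches(key, lowered, n=1, cutoff=0.4)
    if close:
        name_map[key] = lowered[close[0]]

print(name_map)  # {'time': 'T', 'seg_id': 'seg-id'}
```

Note that "track_id" finds no match above the cutoff; in the real priority list such leftover source properties simply map to themselves as custom properties.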

prepare(source, segmentation) Common


Prepare for building by reading headers and inferring name maps.

This method reads the data source headers/metadata and automatically infers both node_name_map and edge_name_map. After calling this, you can inspect and modify self.node_name_map and self.edge_name_map before calling build().

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `source` | `Path \| DataFrame` | Path to data source or DataFrame | required |
| `segmentation` | `Path \| ArrayLike \| None` | Optional path to segmentation or array to infer ndim | `None` |
Example

```python
builder = CSVTracksBuilder()
builder.prepare("data.csv")

# Optionally modify the inferred mappings
builder.node_name_map["circularity"] = "circ"
builder.edge_name_map["iou"] = "overlap"
tracks = builder.build("data.csv", segmentation="seg.tif")
```

Build Phase

validate_name_map() Common


Validate that node_name_map and edge_name_map contain valid mappings.

Checks for nodes:

- No None values in required mappings
- All required_features are mapped
- Position ("pos") is mapped to coordinate columns (unless segmentation provided)
- All mapped properties exist in importable_node_props
- Features with spatial_dims=True have correct number of list elements

Checks for edges:

- All mapped edge properties exist in importable_edge_props

Checks for both:

- No feature key collisions between node and edge features

Note: Array shapes for spatial_dims features are validated after loading via validate_spatial_dims().

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `has_segmentation` | `bool` | If True, position can be computed from segmentation and is not required in name_map | `False` |

Raises:

| Type | Description |
|------|-------------|
| `ValueError` | If validation fails |
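
A toy helper mirroring the node checks listed above (the function name and the minimal required set are assumptions for illustration):

```python
def check_name_map(node_name_map, importable_node_props, has_segmentation=False):
    required = ["time"]  # assumed minimal required feature
    for key in required:
        if node_name_map.get(key) is None:
            raise ValueError(f"required mapping {key!r} is missing or None")
    if not has_segmentation and node_name_map.get("pos") is None:
        raise ValueError("'pos' must be mapped when no segmentation is provided")
    for key, mapped in node_name_map.items():
        # composite mappings (e.g. pos -> ["y", "x"]) are lists of columns
        cols = mapped if isinstance(mapped, list) else [mapped]
        missing = [c for c in cols if c not in importable_node_props]
        if missing:
            raise ValueError(f"{key!r} maps to unknown properties: {missing}")

# Valid: every mapped column exists in the source header
check_name_map({"time": "t", "pos": ["y", "x"]}, ["t", "x", "y"])

# Invalid: "pos" unmapped without a segmentation -> ValueError
try:
    check_name_map({"time": "t"}, ["t"])
except ValueError as err:
    print(err)
```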

load_source(source, node_name_map, node_features) Format-specific

Loads data from source file and converts to InMemoryGeff format. Implemented differently for each format.


Load data from source file and convert to InMemoryGeff format.

Should populate self.in_memory_geff with all properties using standard keys.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `source` | `Path \| DataFrame` | Path to data source (zarr store, CSV file, etc.) or DataFrame | required |
| `node_name_map` | `dict[str, str \| list[str]]` | Maps standard keys to source property names | required |
| `node_features` | `dict[str, bool] \| None` | Optional features dict for backward compatibility | `None` |
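
The conversion this step performs can be sketched as re-keying columnar source data through the name map into arrays under standard keys. The column names and data are illustrative; this is not the real InMemoryGeff layout:

```python
import numpy as np

# Columnar source data as it might come out of a CSV or zarr store.
columns = {
    "t": np.array([0, 1, 1]),
    "label": np.array([1, 2, 3]),
    "y": np.array([5.0, 4.0, 6.0]),
    "x": np.array([5.0, 6.0, 4.0]),
}
node_name_map = {"time": "t", "seg_id": "label", "pos": ["y", "x"]}

# Composite mappings (lists of columns) are stacked into 2D arrays;
# single columns stay 1D.
node_props = {
    key: np.stack([columns[c] for c in cols], axis=1)
    if isinstance(cols, list)
    else columns[cols]
    for key, cols in node_name_map.items()
}

print(node_props["pos"].shape)  # (3, 2)
```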

validate() Common


Validate the loaded InMemoryGeff data.

Common validation logic shared across all formats. Validates:

- Graph structure (unique nodes, valid edges, etc.)
- Spatial_dims features have correct array shapes
- Optional properties (lineage_id, track_id) - removed with warning if invalid

Raises:

| Type | Description |
|------|-------------|
| `ValueError` | If required validation fails |

construct_graph() Common


Construct NetworkX graph from validated InMemoryGeff data.

Common logic shared across all formats.

Returns:

| Type | Description |
|------|-------------|
| `DiGraph` | NetworkX DiGraph with standard keys |

Raises:

| Type | Description |
|------|-------------|
| `ValueError` | If data not loaded or validated |
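
In spirit, this step turns the columnar node/edge data into a NetworkX DiGraph with per-node attributes. A minimal sketch with made-up data (not the funtracks implementation):

```python
import networkx as nx
import numpy as np

node_ids = np.array([1, 2, 3])
node_props = {
    "time": np.array([0, 1, 1]),
    "pos": np.array([[5.0, 5.0], [4.0, 6.0], [6.0, 4.0]]),
}
edges = np.array([[1, 2], [1, 3]])  # node 1 divides into nodes 2 and 3

graph = nx.DiGraph()
for i, node_id in enumerate(node_ids):
    # attach each property's i-th entry as a node attribute
    graph.add_node(int(node_id), **{k: v[i] for k, v in node_props.items()})
graph.add_edges_from((int(s), int(t)) for s, t in edges)

print(graph.number_of_nodes(), graph.number_of_edges())  # 3 2
```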

handle_segmentation(graph, segmentation, scale) Common


Load, validate, and optionally relabel segmentation.

Common logic shared across all formats.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `graph` | `DiGraph` | Constructed NetworkX graph for validation | required |
| `segmentation` | `Path \| ndarray \| None` | Path to segmentation data or pre-loaded segmentation array | required |
| `scale` | `list[float] \| None` | Spatial scale for coordinate transformation | required |

Returns:

| Type | Description |
|------|-------------|
| `tuple[ndarray \| None, list[float] \| None]` | Tuple of (segmentation array, scale) or (None, scale) |

Raises:

| Type | Description |
|------|-------------|
| `ValueError` | If segmentation validation fails |
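
One kind of check this step implies can be sketched as verifying that each node's seg_id actually appears as a label in its time frame. The array layout and node dicts below are illustrative assumptions:

```python
import numpy as np

# A tiny (t, y, x) label image: label 1 at t=0, label 2 at t=1.
segmentation = np.zeros((2, 4, 4), dtype=int)
segmentation[0, 0:2, 0:2] = 1
segmentation[1, 2:4, 2:4] = 2

nodes = [{"time": 0, "seg_id": 1}, {"time": 1, "seg_id": 2}]
for node in nodes:
    labels = np.unique(segmentation[node["time"]])
    if node["seg_id"] not in labels:
        raise ValueError(f"label {node['seg_id']} missing at t={node['time']}")
print("segmentation labels match graph nodes")
```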

enable_features(tracks, features, feature_type) Common


Enable and register features on tracks object.

Common logic shared across all formats for both node and edge features.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `tracks` | `SolutionTracks` | SolutionTracks object to add features to | required |
| `features` | `dict[str, bool] \| None` | Dict mapping feature names to recompute flags | required |
| `feature_type` | `Literal['node', 'edge']` | Type of features ("node" or "edge") | `'node'` |
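
The recompute-flag semantics can be sketched as follows: True means the feature is computed now, False means the imported values are registered as-is. The names and stand-in compute function below are assumptions for illustration:

```python
imported = {"area": [10.0, 12.0, 11.0]}  # values read from the source file

def compute(name):
    return [0.0, 0.0, 0.0]  # stand-in for a real per-node measurement

registered = {}
for name, recompute in {"area": False, "iou": True}.items():
    registered[name] = compute(name) if recompute else imported.get(name)

print(registered)  # {'area': [10.0, 12.0, 11.0], 'iou': [0.0, 0.0, 0.0]}
```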

build(source, segmentation, ...) Common


Orchestrate the full construction process.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `source` | `Path \| DataFrame` | Path to data source or DataFrame | required |
| `segmentation` | `Path \| ndarray \| None` | Optional path to segmentation or pre-loaded segmentation array | `None` |
| `scale` | `list[float] \| None` | Optional spatial scale | `None` |
| `node_features` | `dict[str, bool] \| None` | Optional node features to enable/load | `None` |
| `edge_features` | `dict[str, bool] \| None` | Optional edge features to enable/load | `None` |

Returns:

| Type | Description |
|------|-------------|
| `SolutionTracks` | Fully constructed SolutionTracks object |

Raises:

| Type | Description |
|------|-------------|
| `ValueError` | If self.node_name_map is not set or validation fails |

Example

```python
# Using prepare() to auto-infer node_name_map
builder = CSVTracksBuilder()
builder.prepare("data.csv")
tracks = builder.build("data.csv")

# Or set node_name_map manually
builder = CSVTracksBuilder()
builder.read_header("data.csv")
builder.node_name_map = {"time": "t", "x": "x", "y": "y", "id": "id"}
tracks = builder.build("data.csv")
```

Format-Specific Builders

| Builder | Required Properties | Edge Properties | API Reference |
|---------|---------------------|-----------------|---------------|
| CSVTracksBuilder | time, id, parent_id, [z], y, x | No | API |
| GeffTracksBuilder | time, [z], y, x | Yes | API |
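
As a concrete illustration of the CSVTracksBuilder row format, a minimal 2D table might look like the following (a parent_id of -1 marking a root node is an assumption for illustration, not a documented convention):

```python
import csv
import io

text = """t,id,parent_id,y,x
0,1,-1,5.0,5.0
1,2,1,4.0,6.0
1,3,1,6.0,4.0
"""
# Parse the CSV the way any reader would see it: one row per detection,
# with parent_id linking rows 2 and 3 back to row 1 (a division).
rows = list(csv.DictReader(io.StringIO(text)))
print(len(rows), rows[0]["parent_id"])  # 3 -1
```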

Usage: Wrapper Functions vs Builder Pattern

Most users should use the wrapper functions which provide a simpler API:

GEFF Import (auto-infers name_map):

```python
from pathlib import Path

from funtracks.import_export import import_from_geff

tracks = import_from_geff(
    directory=Path("data.zarr"),
    name_map=None,  # Auto-infer column mappings
    segmentation_path=Path("seg.tif"),
    scale=[1.0, 1.0, 1.0],
    node_features={"area": True},
)
```

CSV/DataFrame Import:

```python
import pandas as pd
from funtracks.import_export import tracks_from_df

# Read CSV into DataFrame
df = pd.read_csv("tracks.csv")

# Load segmentation array
seg = ... # Load your segmentation array (e.g., from tiff, zarr)

# Import tracks
tracks = tracks_from_df(
    df=df,
    segmentation=seg,  # Pre-loaded segmentation array
    scale=[1.0, 1.0, 1.0],
    features={"Area": "area"}  # Load area from 'area' column
)
```

Advanced: Using Builder Pattern Directly

For advanced use cases where you need to inspect/modify the inferred name_map:

GEFF Builder:

```python
from pathlib import Path

from funtracks.import_export.geff._import import GeffTracksBuilder

builder = GeffTracksBuilder()
builder.prepare(Path("data.zarr"))  # Auto-infer node_name_map

# Inspect and optionally modify inferred mappings
print(builder.node_name_map)
builder.node_name_map["circularity"] = "circ"  # Override a mapping

tracks = builder.build(
    source=Path("data.zarr"),
    segmentation=Path("seg.tif"),
    scale=[1.0, 1.0, 1.0],
    node_features={"area": True},
)
```

CSV Builder:

```python
from funtracks.import_export.csv._import import CSVTracksBuilder

builder = CSVTracksBuilder()
builder.prepare("data.csv")  # Auto-infer node_name_map

# Inspect and modify
print(builder.node_name_map)
builder.node_name_map["time"] = "frame_number"

tracks = builder.build(
    source="data.csv",
    segmentation="seg.tif",
)
```