Recompute
This example demonstrates how Metaxy automatically detects when upstream features change and recomputes the downstream features that depend on them.
Warning
This example is a work in progress.
How It Works
When the code_version of one of a feature's fields changes, Metaxy:
- Detects the change in the feature definition
- Identifies all downstream features that depend on it
- Automatically recomputes those features with the new upstream data
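The three steps above can be pictured with a small toy model. This is a sketch under stated assumptions, not Metaxy's implementation: the `FEATURES` registry, `feature_version`, and `downstream_of` are hypothetical names invented for illustration, and the real library may derive versions differently. The idea shown is that a feature's effective version is a hash of its own code_version plus the versions of everything upstream, so a parent change alters the child's version too.

```python
"""Toy model of version-driven recomputation (not Metaxy's internals)."""
import hashlib

# Hypothetical registry: feature name -> its code_version and upstream features.
FEATURES = {
    "examples/parent": {"code_version": "1", "deps": []},
    "examples/child": {"code_version": "1", "deps": ["examples/parent"]},
}


def feature_version(name, features):
    """Hash a feature's own code_version together with its upstream versions."""
    spec = features[name]
    h = hashlib.sha256(spec["code_version"].encode())
    for dep in sorted(spec["deps"]):
        h.update(feature_version(dep, features).encode())
    return h.hexdigest()[:12]


def downstream_of(name, features):
    """Return every feature that depends on `name`, directly or transitively."""
    found = set()
    frontier = [name]
    while frontier:
        current = frontier.pop()
        for candidate, spec in features.items():
            if current in spec["deps"] and candidate not in found:
                found.add(candidate)
                frontier.append(candidate)
    return found


before = {name: feature_version(name, FEATURES) for name in FEATURES}

# Simulate the code change: bump the parent's code_version from "1" to "2".
changed = {name: dict(spec) for name, spec in FEATURES.items()}
changed["examples/parent"]["code_version"] = "2"
after = {name: feature_version(name, changed) for name in changed}

# Features whose effective version moved need to be recomputed.
stale = {name for name in changed if before[name] != after[name]}
print(sorted(stale))
print(sorted(downstream_of("examples/parent", changed)))
```

In this model the child becomes stale even though its own code_version never changed, which mirrors the behavior the list describes.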
Plan
1. Set up upstream data
Create initial upstream data for the pipeline
- Generate raw data samples
2. Initial pipeline run
First execution with version 1 features
- Run pipeline with initial feature definitions
3. Idempotent rerun
Second execution should detect no changes
- Rerun pipeline without any code changes
4. Code evolution
Apply a patch that changes the parent feature's code_version from 1 to 2, which triggers recomputation of the child
- Change the code_version of the parent's embeddings field from 1 to 2
- Run pipeline with updated parent feature
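The four steps of the plan can be simulated in a few lines of plain Python. This is only an illustrative sketch: the in-memory `store` and the `run_pipeline` function are hypothetical stand-ins, not the example's actual pipeline script or any Metaxy API. Each run recomputes only the rows whose recorded version is missing or outdated.

```python
"""Toy walk-through of the four plan steps (not the real Metaxy pipeline)."""

# Hypothetical in-memory store: (feature, sample_uid) -> version it was computed with.
store = {}


def run_pipeline(samples, parent_version, child_version):
    """Compute only the rows whose recorded version is missing or outdated."""
    # The child's effective version includes the parent's version,
    # so a parent change also invalidates the child's rows.
    targets = {
        "parent": parent_version,
        "child": f"{child_version}+parent:{parent_version}",
    }
    recomputed = {"parent": 0, "child": 0}
    for feature, version in targets.items():
        for uid in samples:
            if store.get((feature, uid)) != version:
                store[(feature, uid)] = version  # stand-in for the real computation
                recomputed[feature] += 1
    return recomputed


samples = ["s1", "s2", "s3"]              # step 1: set up upstream data
first = run_pipeline(samples, "1", "1")   # step 2: initial run
second = run_pipeline(samples, "1", "1")  # step 3: idempotent rerun
third = run_pipeline(samples, "2", "1")   # step 4: parent code_version 1 -> 2

print(first)   # {'parent': 3, 'child': 3}
print(second)  # {'parent': 0, 'child': 0}
print(third)   # {'parent': 3, 'child': 3}
```

The second run does no work because nothing changed, while the third recomputes the child for every sample even though only the parent's version was bumped.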
Feature Definitions
Initial Code
The parent feature starts with code_version="1":
src/example_recompute/features.py
"""Feature definitions for recompute example."""

from metaxy import (
    Feature,
    FeatureDep,
    FeatureKey,
    FeatureSpec,
    FieldDep,
    FieldKey,
    FieldSpec,
)


class ParentFeature(
    Feature,
    spec=FeatureSpec(
        key=FeatureKey(["examples", "parent"]),
        fields=[
            FieldSpec(
                key=FieldKey(["embeddings"]),
                code_version="1",
            ),
        ],
        id_columns=("sample_uid",),
    ),
):
    """Parent feature that generates embeddings from raw data."""

    pass


class ChildFeature(
    Feature,
    spec=FeatureSpec(
        key=FeatureKey(["examples", "child"]),
        deps=[FeatureDep(feature=ParentFeature.spec().key)],
        fields=[
            FieldSpec(
                key=FieldKey(["predictions"]),
                code_version="1",
                deps=[
                    FieldDep(
                        feature=ParentFeature.spec().key,
                        fields=[FieldKey(["embeddings"])],
                    )
                ],
            ),
        ],
        id_columns=("sample_uid",),
    ),
):
    """Child feature that uses parent embeddings to generate predictions."""

    pass
The Change
Let's change the code_version:
patches/01_update_parent_algorithm.patch
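The contents of the patch file are not reproduced on this page. As a rough illustration of the one-line change the plan describes, the sketch below uses Python's standard `difflib` module to build a unified diff. It assumes the patch touches only the parent's embeddings field; the `a/` and `b/` path prefixes are conventional placeholders, and the real patch may differ.

```python
import difflib

# Assumption: the patch changes only the code_version of the parent's embeddings field.
old = [
    "            FieldSpec(\n",
    '                key=FieldKey(["embeddings"]),\n',
    '                code_version="1",\n',
    "            ),\n",
]
new = [line.replace('code_version="1"', 'code_version="2"') for line in old]

# Build a unified diff similar in shape to what a .patch file would contain.
diff = list(
    difflib.unified_diff(
        old,
        new,
        fromfile="a/src/example_recompute/features.py",
        tofile="b/src/example_recompute/features.py",
    )
)
print("".join(diff), end="")
```

The printed diff contains one removed line with `code_version="1"` and one added line with `code_version="2"`, surrounded by unchanged context lines.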
Updated Code
Per the plan above, the patch changes the code_version of ParentFeature's embeddings field from "1" to "2".
Key Takeaway
Metaxy keeps every feature consistent with its dependencies. When the code_version of ParentFeature's embeddings field changes from "1" to "2", ChildFeature is recomputed automatically, with no manual tracking required.