Ibis API Reference

metaxy.metadata_store.ibis.IbisMetadataStore

IbisMetadataStore(versioning_engine: VersioningEngineOptions = 'auto', connection_string: str | None = None, *, backend: str | None = None, connection_params: dict[str, Any] | None = None, table_prefix: str | None = None, **kwargs: Any)

Bases: MetadataStore, ABC

Generic SQL metadata store using Ibis.

Works with any Ibis backend that supports struct types, such as DuckDB, PostgreSQL, and ClickHouse.

Warning

Backends without native struct support (e.g., SQLite) are NOT supported.

Storage layout:

  • Each feature gets its own table: {feature}__{key}
  • System tables: metaxy__system__feature_versions, metaxy__system__migrations
  • Uses Ibis for cross-database compatibility

Note: Uses MD5 hash by default for cross-database compatibility. DuckDBMetadataStore overrides this with dynamic algorithm detection. For other backends, override the calculator instance variable with backend-specific implementations.

Example
# ClickHouse
store = IbisMetadataStore("clickhouse://user:pass@host:9000/db")

# PostgreSQL
store = IbisMetadataStore("postgresql://user:pass@host:5432/db")

# DuckDB (use DuckDBMetadataStore instead for better hash support)
store = IbisMetadataStore("duckdb:///metadata.db")

with store:
    store.write_metadata(MyFeature, df)

Parameters:

  • versioning_engine (VersioningEngineOptions, default: 'auto' ) –

    Which versioning engine to use.

      • "auto": Prefer the store's native engine, fall back to Polars if needed
      • "native": Always use the store's native engine, raise VersioningEngineMismatchError if provided dataframes are incompatible
      • "polars": Always use the Polars engine

  • connection_string (str | None, default: None ) –

    Ibis connection string (e.g., "clickhouse://host:9000/db"). If provided, backend and connection_params are ignored.

  • backend (str | None, default: None ) –

    Ibis backend name (e.g., "clickhouse", "postgres", "duckdb"). Used with connection_params for more control.

  • connection_params (dict[str, Any] | None, default: None ) –

    Backend-specific connection parameters, e.g., {"host": "localhost", "port": 9000, "database": "default"}

  • table_prefix (str | None, default: None ) –

    Optional prefix applied to all feature and system table names. Useful for logically separating environments (e.g., "prod_"). Must form a valid SQL identifier when combined with the generated table name.

  • **kwargs (Any, default: {} ) –

    Passed to MetadataStore.__init__ (e.g., fallback_stores, hash_algorithm)
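
The three versioning_engine options can be read as a small decision rule. The sketch below is an illustration only, not metaxy's implementation: `pick_engine` and the `frames_are_native` flag are hypothetical names, and the exception class is a local stand-in for the VersioningEngineMismatchError named above.

```python
class VersioningEngineMismatchError(Exception):
    """Local stand-in for the error named in the docs (not imported from metaxy)."""


def pick_engine(option: str, frames_are_native: bool) -> str:
    """Hypothetical helper: return which engine would handle the input frames."""
    if option == "polars":
        # Always Polars, regardless of what the frames are backed by.
        return "polars"
    if option == "native":
        # Native only; incompatible frames are an error, not a fallback.
        if not frames_are_native:
            raise VersioningEngineMismatchError(
                "native engine requested but input frames are not native"
            )
        return "native"
    if option == "auto":
        # Prefer native, fall back to Polars when the frames do not match.
        return "native" if frames_are_native else "polars"
    raise ValueError(f"unknown versioning_engine: {option!r}")
```

Under these assumptions, `"auto"` never raises on a mismatch, while `"native"` trades that convenience for a guarantee that the work stays in the database.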

Raises:

  • ValueError

    If neither connection_string nor backend is provided

  • ImportError

    If Ibis or required backend driver not installed
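
The precedence between connection_string and backend + connection_params can be sketched as a plain function. This is a minimal illustration of the documented rules, assuming nothing about where metaxy performs the check: `resolve_connection` and its return shape are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_connection(
    connection_string: Optional[str] = None,
    backend: Optional[str] = None,
    connection_params: Optional[Dict[str, Any]] = None,
) -> Tuple[str, Any]:
    """Hypothetical helper mirroring the documented precedence rules."""
    if connection_string:
        # A connection string wins; backend and connection_params are ignored.
        return ("url", connection_string)
    if backend is not None:
        # Otherwise the backend name is paired with its keyword parameters.
        return (backend, dict(connection_params or {}))
    # Neither form was supplied.
    raise ValueError("Provide either connection_string or backend")
```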

Example
# Using connection string
store = IbisMetadataStore("clickhouse://user:pass@host:9000/db")

# Using backend + params
store = IbisMetadataStore(
    backend="clickhouse",
    connection_params={"host": "localhost", "port": 9000},
)
Source code in src/metaxy/metadata_store/ibis.py
def __init__(
    self,
    versioning_engine: VersioningEngineOptions = "auto",
    connection_string: str | None = None,
    *,
    backend: str | None = None,
    connection_params: dict[str, Any] | None = None,
    table_prefix: str | None = None,
    **kwargs: Any,
):
    """
    Initialize Ibis metadata store.

    Args:
        versioning_engine: Which versioning engine to use.
            - "auto": Prefer the store's native engine, fall back to Polars if needed
            - "native": Always use the store's native engine, raise `VersioningEngineMismatchError`
                if provided dataframes are incompatible
            - "polars": Always use the Polars engine
        connection_string: Ibis connection string (e.g., "clickhouse://host:9000/db")
            If provided, backend and connection_params are ignored.
        backend: Ibis backend name (e.g., "clickhouse", "postgres", "duckdb")
            Used with connection_params for more control.
        connection_params: Backend-specific connection parameters
            e.g., {"host": "localhost", "port": 9000, "database": "default"}
        table_prefix: Optional prefix applied to all feature and system table names.
            Useful for logically separating environments (e.g., "prod_"). Must form a valid SQL
            identifier when combined with the generated table name.
        **kwargs: Passed to MetadataStore.__init__ (e.g., fallback_stores, hash_algorithm)

    Raises:
        ValueError: If neither connection_string nor backend is provided
        ImportError: If Ibis or required backend driver not installed

    Example:
        ```py
        # Using connection string
        store = IbisMetadataStore("clickhouse://user:pass@host:9000/db")

        # Using backend + params
        store = IbisMetadataStore(
            backend="clickhouse",
            connection_params={"host": "localhost", "port": 9000}
            )
        ```
    """
    import ibis

    self.connection_string = connection_string
    self.backend = backend
    self.connection_params = connection_params or {}
    self._conn: ibis.BaseBackend | None = None
    self._table_prefix = table_prefix or ""

    super().__init__(
        **kwargs,
        versioning_engine=versioning_engine,
        versioning_engine_cls=IbisVersioningEngine,
    )

Attributes

metaxy.metadata_store.ibis.IbisMetadataStore.ibis_conn property

ibis_conn: BaseBackend

Get Ibis backend connection.

Returns:

  • BaseBackend

    Active Ibis backend connection

Raises:

metaxy.metadata_store.ibis.IbisMetadataStore.conn property

conn: BaseBackend

Get connection (alias for ibis_conn for consistency).

Returns:

  • BaseBackend

    Active Ibis backend connection

Raises:

metaxy.metadata_store.ibis.IbisMetadataStore.sqlalchemy_url property

sqlalchemy_url: str

Get SQLAlchemy-compatible connection URL for tools like Alembic.

Returns the connection string if available. If the store was initialized with backend + connection_params instead of a connection string, raises an error since constructing a proper URL is backend-specific.

Returns:

  • str

    SQLAlchemy-compatible URL string

Raises:

  • ValueError

    If connection_string is not available

Example
store = IbisMetadataStore("postgresql://user:pass@host:5432/db")
print(store.sqlalchemy_url)  # postgresql://user:pass@host:5432/db
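
The failure case is worth seeing next to the success case. The sketch below models only the URL logic described above; `UrlHolder` is a hypothetical stand-in, not the real store class, and the error message wording is an assumption.

```python
from typing import Optional


class UrlHolder:
    """Hypothetical stand-in for a store; only the URL logic is modelled."""

    def __init__(
        self,
        connection_string: Optional[str] = None,
        backend: Optional[str] = None,
    ) -> None:
        self.connection_string = connection_string
        self.backend = backend

    @property
    def sqlalchemy_url(self) -> str:
        if not self.connection_string:
            # Building a URL from backend + params is backend-specific,
            # so this sketch refuses rather than guesses.
            raise ValueError(
                "sqlalchemy_url needs a connection string; "
                f"store was configured with backend={self.backend!r}"
            )
        return self.connection_string
```

A store meant to be used with Alembic would therefore be constructed from a connection string rather than from backend + connection_params.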

Functions

metaxy.metadata_store.ibis.IbisMetadataStore.get_table_name

get_table_name(key: FeatureKey) -> str

Generate the storage table name for a feature or system table.

Applies the configured table_prefix (if any) to the feature key's table name. Subclasses can override this method to implement custom naming logic.

Parameters:

  • key (FeatureKey) –

    Feature key to convert to storage table name.

Returns:

  • str

    Storage table name with optional prefix applied.

Source code in src/metaxy/metadata_store/ibis.py
def get_table_name(
    self,
    key: FeatureKey,
) -> str:
    """Generate the storage table name for a feature or system table.

    Applies the configured table_prefix (if any) to the feature key's table name.
    Subclasses can override this method to implement custom naming logic.

    Args:
        key: Feature key to convert to storage table name.

    Returns:
        Storage table name with optional prefix applied.
    """
    base_name = key.table_name

    return f"{self._table_prefix}{base_name}" if self._table_prefix else base_name
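
The prefixing rule can be tried in isolation. In this sketch `FakeKey` is a hypothetical stand-in for FeatureKey that models only the `table_name` attribute, and the table name used in the usage note is made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeKey:
    """Hypothetical stand-in for FeatureKey; only `table_name` is modelled."""

    table_name: str


def prefixed_table_name(key: FakeKey, table_prefix: str = "") -> str:
    # Same expression shape as get_table_name: prepend only when a prefix is set.
    return f"{table_prefix}{key.table_name}" if table_prefix else key.table_name
```

For example, `prefixed_table_name(FakeKey("video__embeddings"), "prod_")` returns `"prod_video__embeddings"`, and an empty prefix leaves the name unchanged.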

metaxy.metadata_store.ibis.IbisMetadataStore.open

open(mode: AccessMode = 'read') -> Iterator[Self]

Open connection to database via Ibis.

Subclasses should override this to add backend-specific initialization (e.g., loading extensions) and must call this method via super().open(mode).

Parameters:

  • mode (AccessMode, default: 'read' ) –

    Access mode. Subclasses may use this to set backend-specific connection parameters (e.g., read_only for DuckDB).

Yields:

  • Self ( Self ) –

    The store instance with connection open

Source code in src/metaxy/metadata_store/ibis.py
@contextmanager
def open(self, mode: AccessMode = "read") -> Iterator[Self]:
    """Open connection to database via Ibis.

    Subclasses should override this to add backend-specific initialization
    (e.g., loading extensions) and must call this method via super().open(mode).

    Args:
        mode: Access mode. Subclasses may use this to set backend-specific connection
            parameters (e.g., `read_only` for DuckDB).

    Yields:
        Self: The store instance with connection open
    """
    import ibis

    # Increment context depth to support nested contexts
    self._context_depth += 1

    try:
        # Only perform actual open on first entry
        if self._context_depth == 1:
            # Setup: Connect to database
            if self.connection_string:
                # Use connection string
                self._conn = ibis.connect(self.connection_string)
            else:
                # Use backend + params
                # Get backend-specific connect function
                assert self.backend is not None, (
                    "backend must be set if connection_string is None"
                )
                backend_module = getattr(ibis, self.backend)
                self._conn = backend_module.connect(**self.connection_params)

            # Mark store as open and validate
            self._is_open = True
            self._validate_after_open()

        yield self
    finally:
        # Decrement context depth
        self._context_depth -= 1

        # Only perform actual close on last exit
        if self._context_depth == 0:
            # Teardown: Close connection
            if self._conn is not None:
                # Ibis connections may not have explicit close method
                # but setting to None releases resources
                self._conn = None
            self._is_open = False
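
The depth counter is what makes nested `with` blocks safe. The sketch below isolates that pattern with a hypothetical `DepthCountedResource` class that records events instead of opening a database connection.

```python
from contextlib import contextmanager
from typing import Iterator, List


class DepthCountedResource:
    """Hypothetical resource that connects once, however deeply it is nested."""

    def __init__(self) -> None:
        self._context_depth = 0
        self.events: List[str] = []

    @contextmanager
    def open(self) -> Iterator["DepthCountedResource"]:
        self._context_depth += 1
        try:
            if self._context_depth == 1:
                self.events.append("connect")  # real setup happens only here
            yield self
        finally:
            self._context_depth -= 1
            if self._context_depth == 0:
                self.events.append("close")  # real teardown happens only here


res = DepthCountedResource()
with res.open():
    with res.open():  # nested entry reuses the existing connection
        pass
```

After both blocks exit, `res.events` is `["connect", "close"]`: the inner entry and exit leave the connection untouched.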

metaxy.metadata_store.ibis.IbisMetadataStore.write_metadata_to_store

write_metadata_to_store(feature_key: FeatureKey, df: Frame, **kwargs: Any) -> None

Internal write implementation using Ibis.

Parameters:

  • feature_key (FeatureKey) –

    Feature key to write to

  • df (Frame) –

    DataFrame with metadata (already validated)

  • **kwargs (Any, default: {} ) –

    Backend-specific parameters (currently unused)

Raises:

  • TableNotFoundError

    If table doesn't exist and auto_create_tables is False

Source code in src/metaxy/metadata_store/ibis.py
def write_metadata_to_store(
    self,
    feature_key: FeatureKey,
    df: Frame,
    **kwargs: Any,
) -> None:
    """
    Internal write implementation using Ibis.

    Args:
        feature_key: Feature key to write to
        df: DataFrame with metadata (already validated)
        **kwargs: Backend-specific parameters (currently unused)

    Raises:
        TableNotFoundError: If table doesn't exist and auto_create_tables is False
    """
    if df.implementation == nw.Implementation.IBIS:
        df_to_insert = df.to_native()  # Ibis expression
    else:
        from metaxy._utils import collect_to_polars

        df_to_insert = collect_to_polars(df)  # Polars DataFrame

    table_name = self.get_table_name(feature_key)

    try:
        self.conn.insert(table_name, obj=df_to_insert)  # type: ignore[attr-defined]  # pyright: ignore[reportAttributeAccessIssue]
    except Exception as e:
        import ibis.common.exceptions

        if not isinstance(e, ibis.common.exceptions.TableNotFound):
            raise
        if self.auto_create_tables:
            # Warn about auto-create (first time only)
            if self._should_warn_auto_create_tables:
                import warnings

                warnings.warn(
                    f"AUTO_CREATE_TABLES is enabled - automatically creating table '{table_name}'. "
                    "Do not use in production! "
                    "Use proper database migration tools like Alembic for production deployments.",
                    UserWarning,
                    stacklevel=4,
                )

            # Note: create_table(table_name, obj=df) both creates the table AND inserts the data
            # No separate insert needed - the data from df is already written
            self.conn.create_table(table_name, obj=df_to_insert)
        else:
            raise TableNotFoundError(
                f"Table '{table_name}' does not exist for feature {feature_key.to_string()}. "
                f"Enable auto_create_tables=True to automatically create tables, "
                f"or use proper database migration tools like Alembic to create the table first."
            ) from e
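
The write path follows an insert-first, create-on-miss pattern. The sketch below reproduces that control flow against a hypothetical in-memory `FakeBackend`; `TableNotFound` here is a local stand-in exception, and unlike the real method the sketch re-raises it directly instead of wrapping it in TableNotFoundError.

```python
import warnings
from typing import Dict, List


class TableNotFound(Exception):
    """Local stand-in for the backend's missing-table error."""


class FakeBackend:
    """Hypothetical in-memory backend: table name -> list of rows."""

    def __init__(self) -> None:
        self.tables: Dict[str, List[dict]] = {}

    def insert(self, name: str, rows: List[dict]) -> None:
        if name not in self.tables:
            raise TableNotFound(name)
        self.tables[name].extend(rows)

    def create_table(self, name: str, rows: List[dict]) -> None:
        # Like create_table(name, obj=...): creates the table and stores the rows.
        self.tables[name] = list(rows)


def write_rows(conn: FakeBackend, name: str, rows: List[dict], auto_create: bool) -> None:
    try:
        conn.insert(name, rows)  # optimistic path: the table already exists
    except TableNotFound:
        if not auto_create:
            raise
        warnings.warn(f"auto-creating table {name!r}", UserWarning, stacklevel=2)
        conn.create_table(name, rows)  # no second insert needed
```

Trying the insert first keeps the common case to a single round trip; the existence check happens only implicitly, through the error.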

metaxy.metadata_store.ibis.IbisMetadataStore.read_metadata_in_store

read_metadata_in_store(feature: CoercibleToFeatureKey, *, feature_version: str | None = None, filters: Sequence[Expr] | None = None, columns: Sequence[str] | None = None, **kwargs: Any) -> LazyFrame[Any] | None

Read metadata from this store only (no fallback).

Parameters:

  • feature (CoercibleToFeatureKey) –

    Feature to read

  • feature_version (str | None, default: None ) –

    Filter by specific feature_version (applied as SQL WHERE clause)

  • filters (Sequence[Expr] | None, default: None ) –

    List of Narwhals filter expressions (converted to SQL WHERE clauses)

  • columns (Sequence[str] | None, default: None ) –

    Optional list of columns to select

  • **kwargs (Any, default: {} ) –

    Backend-specific parameters (currently unused)

Returns:

  • LazyFrame[Any] | None

    Narwhals LazyFrame with metadata, or None if not found

Source code in src/metaxy/metadata_store/ibis.py
def read_metadata_in_store(
    self,
    feature: CoercibleToFeatureKey,
    *,
    feature_version: str | None = None,
    filters: Sequence[nw.Expr] | None = None,
    columns: Sequence[str] | None = None,
    **kwargs: Any,
) -> nw.LazyFrame[Any] | None:
    """
    Read metadata from this store only (no fallback).

    Args:
        feature: Feature to read
        feature_version: Filter by specific feature_version (applied as SQL WHERE clause)
        filters: List of Narwhals filter expressions (converted to SQL WHERE clauses)
        columns: Optional list of columns to select
        **kwargs: Backend-specific parameters (currently unused)

    Returns:
        Narwhals LazyFrame with metadata, or None if not found
    """
    feature_key = self._resolve_feature_key(feature)
    table_name = self.get_table_name(feature_key)

    # Check if table exists
    existing_tables = self.conn.list_tables()
    if table_name not in existing_tables:
        return None

    # Get Ibis table reference
    table = self.conn.table(table_name)

    # Wrap Ibis table with Narwhals (stays lazy in SQL)
    nw_lazy: nw.LazyFrame[Any] = nw.from_native(table, eager_only=False)

    # Apply feature_version filter (stays in SQL via Narwhals)
    if feature_version is not None:
        nw_lazy = nw_lazy.filter(
            nw.col("metaxy_feature_version") == feature_version
        )

    # Apply generic Narwhals filters (stays in SQL)
    if filters is not None:
        for filter_expr in filters:
            nw_lazy = nw_lazy.filter(filter_expr)

    # Select columns (stays in SQL)
    if columns is not None:
        nw_lazy = nw_lazy.select(columns)

    # Return Narwhals LazyFrame wrapping Ibis table (stays lazy in SQL)
    return nw_lazy
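
The order of operations in the read path (missing table gives None, then the version filter, then each extra filter, then column selection) can be sketched without a database. This is a loose analogy only: `read_rows` is hypothetical, plain callables stand in for Narwhals expressions, and Python iterators stand in for the lazy frame.

```python
from typing import Callable, Dict, Iterator, List, Optional, Sequence

Row = Dict[str, object]


def read_rows(
    tables: Dict[str, List[Row]],
    table_name: str,
    feature_version: Optional[str] = None,
    filters: Optional[Sequence[Callable[[Row], bool]]] = None,
    columns: Optional[Sequence[str]] = None,
) -> Optional[Iterator[Row]]:
    """Hypothetical stand-in: iterators play the role of the lazy frame."""
    if table_name not in tables:
        return None  # a missing table is reported as None, not as an error
    rows: Iterator[Row] = iter(tables[table_name])
    if feature_version is not None:
        rows = (r for r in rows if r["metaxy_feature_version"] == feature_version)
    for pred in filters or ():
        # filter() binds pred immediately, so each predicate is applied in turn.
        rows = filter(pred, rows)
    if columns is not None:
        # Projection comes last, so filters may use columns that are not selected.
        rows = ({c: r[c] for c in columns} for r in rows)
    return rows
```

Nothing is evaluated until the caller iterates over the result, which mirrors how the real method defers execution until the lazy frame is collected.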

metaxy.metadata_store.ibis.IbisMetadataStore.display

display() -> str

Display string for this store.

Source code in src/metaxy/metadata_store/ibis.py
def display(self) -> str:
    """Display string for this store."""
    from metaxy.metadata_store.utils import sanitize_uri

    backend_info = self.connection_string or f"{self.backend}"
    # Sanitize connection strings that may contain credentials
    sanitized_info = sanitize_uri(backend_info)
    return f"{self.__class__.__name__}(backend={sanitized_info})"
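
Credential masking of the kind display() relies on can be sketched with the standard library. `mask_credentials` below is a hypothetical helper written for illustration; metaxy's own sanitize_uri may mask differently.

```python
from urllib.parse import urlsplit, urlunsplit


def mask_credentials(uri: str) -> str:
    """Hypothetical masker; metaxy's sanitize_uri may behave differently."""
    parts = urlsplit(uri)
    if "@" not in parts.netloc:
        return uri  # nothing that looks like credentials
    # Keep only the host[:port] part after the last "@".
    host = parts.netloc.rsplit("@", 1)[1]
    return urlunsplit(parts._replace(netloc=f"***@{host}"))
```

With this sketch, `mask_credentials("postgresql://user:pass@host:5432/db")` returns `"postgresql://***@host:5432/db"`, while a bare backend name such as `"duckdb"` passes through unchanged.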

metaxy.metadata_store.ibis.IbisMetadataStore.config_model classmethod

config_model() -> type[IbisMetadataStoreConfig]

Return the configuration model class for this store type.

Subclasses must override this to return their specific config class.

Returns:

  • type[IbisMetadataStoreConfig]

    The configuration model class for this store type.

Note

Subclasses override this with a more specific return type. Type checkers may show a warning about incompatible override, but this is intentional - each store returns its own config type.

Source code in src/metaxy/metadata_store/ibis.py
@classmethod
def config_model(cls) -> type[IbisMetadataStoreConfig]:  # pyright: ignore[reportIncompatibleMethodOverride]
    return IbisMetadataStoreConfig