BigQuery
Featureform supports BigQuery as an Offline Store.

Implementation

Primary Sources

Tables

Table sources are used directly via a view. Featureform will never write to a primary source.

Transformation Sources

SQL transformations are used to create a view. By default, those views are materialized and updated according to the schedule parameter. Deprecated transformations are converted to un-materialized views to save storage space.

Offline to Inference Store Materialization

When a feature is registered, Featureform creates an internal transformation to get the newest value of every feature and its associated entity. A Kubernetes job is then kicked off to sync this up with the Inference store.

Training Set Generation

Every registered feature and label is associated with a view table. That view contains three columns, the entity, value, and timestamp. When a training set is registered, it is created as a materialized view via a JOIN on the corresponding label and feature views.

Configuration

First we have to add a declarative BigQuery configuration in Python.
bigquery_config.py
import featureform as ff
bigquery = ff.register_bigquery(
name="bigquery-quickstart",
description="A BigQuery deployment we created for the Featureform quickstart",
project_id="<your-gcp-project-id>",
dataset_id="<your-big-query-dataset>",
credentials_path="<path-to-bigquery-credentials-file>"
)
Once our config file is complete, we can apply it to our Featureform deployment
featureform apply bigquery_config.py --host $FEATUREFORM_HOST
We can re-verify that the provider is created by checking the Providers tab of the Feature Registry.
Copy link
Outline
Implementation
Primary Sources
Transformation Sources
Offline to Inference Store Materialization
Training Set Generation
Configuration