ClickHouse
Featureform supports ClickHouse as an Offline Store.
Implementation
Primary Sources
Tables
Table sources are used directly via a view. Featureform will never write to a primary source.
Transformation Sources
SQL transformations are used to create a view. By default, those views are materialized and updated according to the schedule parameter. Deprecated transformations are converted to un-materialized views to save storage space.
Offline to Inference Store Materialization
When a feature is registered, Featureform creates an internal transformation to get the newest value of every feature and its associated entity using a ClickHouse ASOF JOIN. A Kubernetes job is then kicked off to sync this up with the Inference store.
Training Set Generation
Every registered feature and label is associated with a view table. That view contains three columns, the entity, value, and timestamp. When a training set is registered, it is created as a materialized view via a JOIN on the corresponding label and feature views.
Configuration
First we have to add a declarative ClickHouse configuration in Python.
Note: SSL connections typically use port 9440.
Once our config file is complete, we can apply it to our Featureform deployment
We can re-verify that the provider is created by checking the Providers tab of the Feature Registry.
Mutable Configuration Fields
-
description
-
username
-
password
-
port
-
ssl