Snowflake
Featureform supports Snowflake as an Offline Store.
Implementation
Primary Sources
Tables
Table sources are used directly via a view. Featureform will never write to a primary source.
Files
Files are copied into a Snowflake table by a Kubernetes Job that our coordinator starts. If a schedule is set, the table is atomically replaced with a fresh copy on each scheduled run.
Transformation Sources
SQL transformations are used to create a view. By default, those views are materialized and updated according to the schedule parameter. Deprecated transformations are converted to un-materialized views to save storage space.
Offline to Inference Store Materialization
When a feature is registered, Featureform creates an internal transformation that selects the newest value of the feature for each entity. A Kubernetes Job is then kicked off to sync the result to the Inference Store.
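As a rough illustration of the "newest value per entity" step, the sketch below runs an equivalent query against an in-memory SQLite database. The table name `feature_rows`, the sample data, and the SQL itself are assumptions made for this example; the query Featureform actually generates for Snowflake may differ.

```python
import sqlite3

# In-memory stand-in for an offline store table (hypothetical name and data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE feature_rows (entity TEXT, value REAL, ts INTEGER);
INSERT INTO feature_rows VALUES
    ('user_1', 10.0, 1),
    ('user_1', 12.5, 3),
    ('user_2', 7.0, 2);
""")

# Keep only the row with the greatest timestamp for each entity.
LATEST_VALUE_SQL = """
SELECT f.entity, f.value, f.ts
FROM feature_rows AS f
JOIN (
    SELECT entity, MAX(ts) AS ts
    FROM feature_rows
    GROUP BY entity
) AS newest
  ON f.entity = newest.entity AND f.ts = newest.ts
ORDER BY f.entity
"""

latest = conn.execute(LATEST_VALUE_SQL).fetchall()
print(latest)  # [('user_1', 12.5, 3), ('user_2', 7.0, 2)]
```

Rows of this shape (one per entity) are what a sync job would then write to the Inference Store.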
Training Set Generation
Every registered feature and label is associated with a view. That view contains three columns: entity, value, and timestamp. When a training set is registered, it is created as a materialized view via a JOIN on the corresponding label and feature views.
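The sketch below shows one way such a JOIN could look, again using SQLite as a stand-in. The view names and data are hypothetical, and the rule used here (for each label row, take the most recent feature value at or before the label's timestamp) is an assumption for illustration; the text above only states that label and feature views are joined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Each view has the three columns described above: entity, value, timestamp.
CREATE TABLE feature_view (entity TEXT, value REAL, ts INTEGER);
CREATE TABLE label_view (entity TEXT, value INTEGER, ts INTEGER);
INSERT INTO feature_view VALUES
    ('user_1', 10.0, 1), ('user_1', 12.5, 3), ('user_2', 7.0, 2);
INSERT INTO label_view VALUES
    ('user_1', 1, 2), ('user_1', 0, 4), ('user_2', 1, 5);
""")

# For every label row, look up the newest feature value whose timestamp
# does not exceed the label's timestamp (assumed point-in-time rule).
TRAINING_SET_SQL = """
SELECT l.entity,
       (SELECT f.value
        FROM feature_view AS f
        WHERE f.entity = l.entity AND f.ts <= l.ts
        ORDER BY f.ts DESC
        LIMIT 1) AS feature_value,
       l.value AS label
FROM label_view AS l
ORDER BY l.entity, l.ts
"""

training_rows = conn.execute(TRAINING_SET_SQL).fetchall()
print(training_rows)
# [('user_1', 10.0, 1), ('user_1', 12.5, 0), ('user_2', 7.0, 1)]
```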
Configuration
First we have to add a declarative Snowflake configuration in Python.
Credentials with Account and Organization
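A minimal sketch of such a configuration follows. It assumes the Featureform Python client exposes `register_snowflake` with the keyword arguments shown; check them against your installed version. The environment variable names, the provider name, and the two helper functions are hypothetical and exist only for this example.

```python
import os


def snowflake_kwargs(env):
    """Build keyword arguments for ff.register_snowflake from a settings mapping."""
    return {
        "name": "snowflake_docs",  # illustrative provider name
        "description": "Example offline store",
        "team": "Featureform",
        # Hypothetical environment variable names; use whatever your setup defines.
        "username": env["SNOWFLAKE_USERNAME"],
        "password": env["SNOWFLAKE_PASSWORD"],
        "account": env["SNOWFLAKE_ACCOUNT"],
        "organization": env["SNOWFLAKE_ORG"],
        "database": env["SNOWFLAKE_DATABASE"],
        "schema": env.get("SNOWFLAKE_SCHEMA", "PUBLIC"),
    }


def register_snowflake_provider(ff, env=os.environ):
    """Register the provider; `ff` is expected to be the imported featureform module."""
    return ff.register_snowflake(**snowflake_kwargs(env))
```

In an actual config file you would `import featureform as ff` and call `register_snowflake_provider(ff)` at module level. Reading credentials from the environment keeps the password out of the file itself.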
Legacy Credentials
Older Snowflake accounts may have credentials that use an Account Locator rather than an account and organization to connect. Featureform provides a separate registration function to support these credentials.
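A comparable sketch for legacy credentials is shown below. It assumes the separate function is `register_snowflake_legacy` and that it takes an `account_locator` keyword in place of `account` and `organization`; verify both against your client version. The environment variable names and helper functions are hypothetical.

```python
import os


def snowflake_legacy_kwargs(env):
    """Build keyword arguments for ff.register_snowflake_legacy from a settings mapping."""
    return {
        "name": "snowflake_docs",  # illustrative provider name
        "description": "Example offline store",
        "team": "Featureform",
        # Hypothetical environment variable names.
        "username": env["SNOWFLAKE_USERNAME"],
        "password": env["SNOWFLAKE_PASSWORD"],
        # A single Account Locator replaces the account/organization pair.
        "account_locator": env["SNOWFLAKE_ACCOUNT_LOCATOR"],
        "database": env["SNOWFLAKE_DATABASE"],
        "schema": env.get("SNOWFLAKE_SCHEMA", "PUBLIC"),
    }


def register_legacy_snowflake_provider(ff, env=os.environ):
    """Register the provider; `ff` is expected to be the imported featureform module."""
    return ff.register_snowflake_legacy(**snowflake_legacy_kwargs(env))
```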
Once our config file is complete, we can apply it to our Featureform deployment.
We can then verify that the provider was created by checking the Providers tab of the Feature Registry.
Mutable Configuration Fields
- description
- username
- password
- role