Exploring the Feature Registry
Once everything is registered (e.g. features, training sets, providers), we can view information about each resource in the Feature Registry.
Homepage
Homepage of the Feature Registry
The homepage contains links to:
- Sources: Both primary sources and transformed sources, from streams to files to tables.
- Features: Inputs to models, transformed from raw data. To be served for inference.
  - e.g. Raw data about all user transactions can be transformed into features such as avg_transaction_amt and user_account_age.
- Entities: Higher-level groupings of features, dependent on where a set of features originates from.
  - e.g. user_account_age and avg_transaction_amt are both features under the user entity.
- Labels: Features that indicate the “correct answer” of a model prediction, or what the model aims to predict.
  - e.g. The is_fraud label is true if the transaction is fraudulent, and false if the transaction is not fraudulent.
- Training Sets: Sets of features matched with the respective labels. To be served for training.
  - e.g. The is_fraud training set contains a set of features (amt_spent, avg_transaction_amt, number_of_fraud, etc.) and labels.
- Models: Programs that can make predictions on a previously unseen dataset, after being trained with training sets.
  - e.g. The user_fraud_random_forest model is a classifier, predicting whether a user committed fraud.
- Providers: Data infrastructure used for storage or computation.
- Users: Individual data scientists who create, share, or reuse features and models.
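To make the relationship between sources, features, entities, labels, and training sets more concrete, here is a minimal sketch in plain Python. It does not use the Featureform API; the sample rows and the helper names (`avg_transaction_amt`, `build_training_set`) are hypothetical and only illustrate the concepts listed above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw source: one row per user transaction.
transactions = [
    {"user": "alice", "amt": 20.0, "is_fraud": False},
    {"user": "alice", "amt": 40.0, "is_fraud": False},
    {"user": "bob", "amt": 900.0, "is_fraud": True},
]


def avg_transaction_amt(rows):
    """Feature: average transaction amount, keyed by the user entity."""
    amounts = defaultdict(list)
    for row in rows:
        amounts[row["user"]].append(row["amt"])
    return {user: mean(vals) for user, vals in amounts.items()}


def build_training_set(rows, feature):
    """Training set: each label paired with the feature value of its entity."""
    return [(feature[row["user"]], row["is_fraud"]) for row in rows]


feature = avg_transaction_amt(transactions)
training_set = build_training_set(transactions, feature)
```

Here the feature is computed once per entity (user), while the training set has one row per label, each joined to the feature value for that label's entity.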
Resource Pages: Features, Sources, Labels, Training Sets
Resource pages generally share the same format. Each displays a list of resources of that type, along with their descriptions.
Sources page of the Feature Registry
The features page has two additional columns: “type” and “default variant”.
Feature page of the Feature Registry
Click on the arrow next to a resource name to see a list of variants of that resource.
Dropdown view of variants on the user transaction count, 30d and 7d
Next, click on a variant (or on the resource name for the default variant) to pull up more details, including the description, owner, provider, data type, status, source, entity, and columns. Some fields link to more information. To switch variants, use the small dropdown menu at the top right.
Detailed view of the user transaction count feature
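The variant behavior described above (clicking the resource name opens the default variant, while the dropdown selects a specific one) can be sketched as a simple lookup with a fallback. This is an illustrative model only, not Featureform's implementation: the `Resource` class, the resource name, and the choice of `30d` as the default variant are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical registry entry: a named resource with several variants."""

    name: str
    default_variant: str
    variants: dict = field(default_factory=dict)  # variant name -> description

    def lookup(self, variant=None):
        """Return (variant, description); fall back to the default variant."""
        chosen = variant or self.default_variant
        return chosen, self.variants[chosen]


# Assumed example data; which variant is the default is not stated in the docs.
txn_count = Resource(
    name="user_transaction_count",
    default_variant="30d",
    variants={
        "30d": "Transactions in the last 30 days",
        "7d": "Transactions in the last 7 days",
    },
)

default_view = txn_count.lookup()      # like clicking the resource name
seven_day_view = txn_count.lookup("7d")  # like picking a variant in the dropdown
```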
For features and training sets, the page also displays metrics for the selected variant: throughput, latency, and errors.
Metrics view of a feature variant
Entities
The entities page is similar to the other resource pages, except that each entity has three tabs (“Features”, “Labels”, “Training Sets”). The features, labels, and training sets that belong to the entity are listed under these tabs, where you can select variants and open their detailed views.
Entities page of the Feature Registry
Providers
The providers page shows all providers, with the corresponding name, description, type, and software.
Providers page of the Feature Registry
Click on a provider to pull up the sources, features, labels, and training sets originating from that provider.
Detailed view of a single provider
Models
The models page lists all model names, which the user can optionally provide at serving time (see example). Featureform tracks which features and training sets are associated with each model name, but it does not store the models themselves.
Models page of the Feature Registry
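The idea of tracking model names without storing models can be sketched as follows. This is a hypothetical illustration in plain Python, not the Featureform serving API: the `ModelUsageTracker` class and its method names are assumptions made for this example. It only records which resources were requested under which model name.

```python
from collections import defaultdict


class ModelUsageTracker:
    """Hypothetical tracker: maps model names to the resources they requested."""

    def __init__(self):
        self.features = defaultdict(set)
        self.training_sets = defaultdict(set)

    def serve_features(self, names, model=None):
        # The model name is optional; requests without one are not recorded.
        if model is not None:
            self.features[model].update(names)
        return list(names)

    def serve_training_set(self, name, model=None):
        if model is not None:
            self.training_sets[model].add(name)
        return name


tracker = ModelUsageTracker()
tracker.serve_training_set("is_fraud", model="user_fraud_random_forest")
tracker.serve_features(
    ["avg_transaction_amt", "user_account_age"],
    model="user_fraud_random_forest",
)
tracker.serve_features(["avg_transaction_amt"])  # no model name given
```

Only names and associations are kept; no model object is ever passed in or stored, which mirrors the behavior the page describes.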
Users
The users page shows all users’ names. Click on a user to view the features, labels, training sets, and sources associated with that user.
User page of the Feature Registry
Detailed view of a user