Defining a Label

With a Timestamp

Some labels can change over time. For example, you might have a label indicating a user’s most listened-to song in the last month. You define it much as you would a feature, by specifying the dataset and the columns for the entity, the value, and the timestamp. You can optionally set the variant and specify the type as one of: ff.Int, ff.Int32, ff.Int64, ff.Float32, ff.Float64, ff.Timestamp, ff.String, ff.Bool. Unlike a feature, a label should not specify an inference store, since labels are never served for inference.

import featureform as ff

@ff.entity
class User:
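  # Columns are listed as entity, value, timestamp; "timestamp" is an assumed column name.
  top_song = ff.Label(dataset[["user", "top_song", "timestamp"]], variant="30days", type=ff.String)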

Without a Timestamp

Some labels are set once per entity and remain static. For example, you might have a label indicating whether a user is a bot. You define it much as you would a feature, by specifying the dataset and the columns for the entity and the value. You can optionally set the variant and specify the type as one of: ff.Int, ff.Int32, ff.Int64, ff.Float32, ff.Float64, ff.Timestamp, ff.String, ff.Bool. Unlike a feature, a label should not specify an inference store, since labels are never served for inference.

import featureform as ff

@ff.entity
class User:
  is_bot = ff.Label(dataset[["user", "is_bot"]], variant="simple", type=ff.Bool)

Defining a Training Set

A training set comprises a label combined with a set of features. It can be defined as follows:

ff.register_training_set(name, variant, label=(name, variant), features=[(name, variant)])

When both the label and the features have timestamps, we will automatically generate a point-in-time correct training set for you.
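
To make "point-in-time correct" concrete, here is a minimal sketch in plain Python. It is not Featureform's implementation. The function name point_in_time_join, the tuple layout, and the sample data are all hypothetical. It assumes that each label row is paired with the most recent feature value recorded at or before the label's timestamp, and that a label with no earlier feature value gets None.

```python
from bisect import bisect_right


def point_in_time_join(label_rows, feature_rows):
    """Pair each label with the latest feature value seen at or before it.

    label_rows:   iterable of (entity, label_value, timestamp)
    feature_rows: iterable of (entity, feature_value, timestamp)
    Returns a list of (entity, feature_value, label_value) tuples.
    """
    # Build a per-entity history of feature values, ordered by timestamp.
    history = {}
    for entity, value, ts in sorted(feature_rows, key=lambda row: row[2]):
        times, values = history.setdefault(entity, ([], []))
        times.append(ts)
        values.append(value)

    training_rows = []
    for entity, label, ts in label_rows:
        times, values = history.get(entity, ([], []))
        # Count the feature observations with timestamp <= the label's timestamp.
        idx = bisect_right(times, ts)
        # Use the last such observation; None if the feature did not exist yet.
        feature = values[idx - 1] if idx else None
        training_rows.append((entity, feature, label))
    return training_rows


# Hypothetical data: integer timestamps stand in for real datetimes.
features = [("u1", 10.0, 1), ("u1", 25.0, 5), ("u2", 7.0, 3)]
labels = [("u1", "song_a", 4), ("u1", "song_b", 6), ("u2", "song_c", 2)]

rows = point_in_time_join(labels, features)
for row in rows:
    print(row)
```

The point of joining this way is that no training row sees a feature value from after its label was recorded, which would leak future information into training. How Featureform handles labels that have no earlier feature value is not covered here. The None above is only a placeholder choice for this sketch.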

You can then serve it as a dataframe or via a streaming method:

client.training_set(name, variant).dataframe()