A quick start guide for Featureform on AWS EKS.
This quickstart will walk through creating a few simple features, labels, and a training set using Postgres and Redis. We will use a transaction fraud training set.

Step 1: Install Featureform client


Requirements:
  • Python 3.7+
Install the Featureform SDK via pip.
pip install featureform

Step 2: Deploy EKS

You can also follow our Minikube or Kubernetes deployment guides. This section walks through a simple AWS deployment of Featureform using our quick start Helm chart, which contains Postgres and Redis.
Install the AWS CLI and eksctl, then run the following command to create an EKS cluster.
eksctl create cluster \
--name featureform \
--version 1.21 \
--region us-east-1 \
--nodegroup-name linux-nodes \
--nodes 1 \
--nodes-min 1 \
--nodes-max 4 \
--with-oidc

Step 3: Install Helm charts

We'll be installing three Helm Charts: Featureform, the Quickstart Demo, and Certificate Manager.
First we need to add the Helm repositories.
helm repo add featureform
helm repo add jetstack
helm repo update
Now we can install the Helm charts.
helm install certmgr jetstack/cert-manager \
--set installCRDs=true \
--version v1.8.0 \
--namespace cert-manager \
--create-namespace
helm install featureform featureform/featureform \
--set global.publicCert=true \
--set global.localCert=false \
--set global.hostname=$FEATUREFORM_HOST
helm install quickstart featureform/quickstart

Step 4: Create TLS certificate

We can create a self-signed TLS certificate for connecting directly to the load balancer.
Wait until the load balancer has been created. It can be checked using:
kubectl get ingress
When the ingresses have a valid address, we can update the deployment to create the public certificate.
export FEATUREFORM_HOST=$(kubectl get ingress | grep "grpc-ingress" | awk {'print $4'} | column -t)
helm upgrade featureform featureform/featureform --set global.hostname=$FEATUREFORM_HOST
We can save and export our self-signed certificate.
kubectl get secret featureform-ca-secret -o=custom-columns=':.data.tls\.crt' | base64 -d > tls.crt
export FEATUREFORM_CERT=$(pwd)/tls.crt
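Before pointing the client at the exported file, it can help to confirm that it decoded cleanly. The sketch below is an optional check, not part of the Featureform tooling. It assumes the saved certificate is PEM-encoded, as Kubernetes TLS secrets typically are. The helper names `is_pem_certificate` and `check_certificate_file` are hypothetical. The check only verifies the PEM wrapper and base64 body, not the certificate's contents.

```python
import ssl

def is_pem_certificate(text):
    """Return True if text has a PEM certificate header, footer, and base64 body."""
    try:
        ssl.PEM_cert_to_DER_cert(text.strip())
        return True
    except ValueError:
        return False

def check_certificate_file(path):
    """Read a file (for example tls.crt) and report whether it looks like PEM."""
    with open(path, "r", encoding="ascii", errors="replace") as handle:
        return is_pem_certificate(handle.read())

# Hypothetical stand-in for the file contents; a real certificate body is much longer.
example = "-----BEGIN CERTIFICATE-----\nQUJD\n-----END CERTIFICATE-----\n"
print(is_pem_certificate(example))
```

Calling `check_certificate_file("tls.crt")` in the directory where the certificate was saved applies the same check to the real file.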
The dashboard is now viewable at your ingress address.

Step 5: Register providers

The Quickstart helm chart creates a Postgres instance with preloaded data, as well as an empty Redis standalone instance. Now that they are deployed, we can write a config file in Python.
import featureform as ff
redis = ff.register_redis(
    name="redis-quickstart",
    host="quickstart-redis",  # The internal DNS name for Redis
    description="A Redis deployment we created for the Featureform quickstart",
)
postgres = ff.register_postgres(
    name="postgres-quickstart",
    host="quickstart-postgres",  # The internal DNS name for Postgres
    description="A Postgres deployment we created for the Featureform quickstart",
)
Once we create our config file, we can apply it to our Featureform deployment.
featureform apply

Step 6: Define our resources

We will create a user profile for ourselves and set it as the default owner for all the following resource definitions.
Now we'll register our user fraud dataset in Featureform.
transactions = postgres.register_table(
    name="transactions",
    variant="kaggle",
    description="Fraud Dataset From Kaggle",
    table="Transactions",  # This is the table's name in Postgres
)
Next, we'll define a SQL transformation on our dataset.
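# Assumed registration decorator; the variant is chosen to match the rest of this guide
@postgres.sql_transformation(variant="quickstart")
def average_user_transaction():
    """The average transaction amount for a user."""
    return "SELECT CustomerID as user_id, avg(TransactionAmount) " \
           "as avg_transaction_amt from {{transactions.kaggle}} GROUP BY user_id"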
Next, we'll register a user entity to associate with a feature and label.
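user = ff.register_entity("user")
# Register a column from our transformation as a feature
# (entity_column values are assumed from the column names used in this guide)
average_user_transaction.register_resources(
    entity=user, entity_column="user_id", inference_store=redis,
    features=[{"name": "avg_transactions", "variant": "quickstart", "column": "avg_transaction_amt", "type": "float32"}])
# Register label from our base Transactions table
transactions.register_resources(
    entity=user, entity_column="customerid",
    labels=[{"name": "fraudulent", "variant": "quickstart", "column": "isfraud", "type": "bool"}])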
Finally, we'll join the feature and label together into a training set.
ff.register_training_set(
    "fraud_training", "quickstart",
    label=("fraudulent", "quickstart"),
    features=[("avg_transactions", "quickstart")],
)
Now that our definitions are complete, we can apply them to our Featureform instance.
featureform apply
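Conceptually, the training set defined in this step pairs each label with the feature values of the same entity. A minimal sketch of that join in plain Python follows. The dictionaries and the `build_training_rows` helper are hypothetical illustrations rather than Featureform's implementation, and timestamps are ignored.

```python
def build_training_rows(features, labels):
    """Pair each label with the feature value for the same entity key.

    features: dict mapping entity ID -> feature value
    labels:   list of (entity ID, label) pairs
    Labels whose entity has no feature value are skipped in this sketch.
    """
    rows = []
    for entity_id, label in labels:
        if entity_id in features:
            rows.append(([features[entity_id]], label))
    return rows

# Hypothetical values for illustration only.
avg_transactions = {"C1": 20.0, "C2": 5.0}
fraudulent = [("C1", False), ("C2", True), ("C3", False)]

for feature_values, label in build_training_rows(avg_transactions, fraudulent):
    print(feature_values, label)
```

Each resulting row has the same shape as the training loop consumes later: a list of feature values and one label.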

Step 7: Serve features for training and inference

Once we have our training set and features registered, we can train our model.
import featureform as ff
client = ff.ServingClient()
dataset = client.training_set("fraud_training", "quickstart")
training_dataset = dataset.repeat(10).shuffle(1000).batch(8)
for row in training_dataset:
    print(row.features(), row.label())
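The chained `repeat`, `shuffle`, and `batch` calls follow the familiar dataset-pipeline pattern. The generators below are a rough sketch of those semantics in plain Python, not the SDK's code. They assume that `repeat(n)` means n total passes over the data and that `shuffle(k)` draws randomly from a buffer of up to k rows. The SDK's exact behavior may differ on both points.

```python
import random

def repeat(rows, times):
    """Yield the whole sequence `times` times in a row."""
    for _ in range(times):
        yield from rows

def shuffle(rows, buffer_size, seed=None):
    """Buffered shuffle: hold up to buffer_size rows and emit a random one."""
    rng = random.Random(seed)
    buffer = []
    for row in rows:
        buffer.append(row)
        if len(buffer) >= buffer_size:
            yield buffer.pop(rng.randrange(len(buffer)))
    # Drain whatever is left once the input is exhausted.
    while buffer:
        yield buffer.pop(rng.randrange(len(buffer)))

def batch(rows, size):
    """Group rows into lists of `size`; the last batch may be shorter."""
    current = []
    for row in rows:
        current.append(row)
        if len(current) == size:
            yield current
            current = []
    if current:
        yield current

rows = list(range(5))  # stand-in for five training rows
batches = list(batch(shuffle(repeat(rows, 10), 1000, seed=0), 8))
print(len(batches), len(batches[-1]))
```

With five rows repeated ten times there are fifty rows in total, so a batch size of 8 produces six full batches and one final batch of two.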
Once our trained model is deployed, we can also serve features in production.
import featureform as ff
client = ff.ServingClient()
fpf = client.features([("avg_transactions", "quickstart")], {"user": "C1410926"})
# Run features through model