Learn to build a modular real-time feature pipeline, so you avoid Offline-Online Feature Skew, and your deployed ML models work as expected.
A real-time feature pipeline is a program that constantly transforms raw data into features
and saves these features into a Feature Store.
Real-time feature pipelines are used for real-time ML problems like fraud detection, or cutting-edge recommender systems.
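As a rough illustration of that loop, here is a minimal sketch in plain Python. Everything in it is a hypothetical stand-in: the event fields, the `to_features` function and the `InMemoryFeatureStore` class are invented for the example, and a real pipeline would read from a live stream and write to an actual Feature Store.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class InMemoryFeatureStore:
    """Hypothetical stand-in for a real Feature Store client."""
    rows: List[Dict] = field(default_factory=list)

    def save(self, features: Dict) -> None:
        self.rows.append(features)


def to_features(event: Dict) -> Dict:
    """Hypothetical transformation: one raw event in, one feature row out."""
    return {
        "user_id": event["user_id"],
        "amount": event["amount"],
        "is_large_amount": event["amount"] > 1000,
    }


def run_pipeline(events: Iterable[Dict], store: InMemoryFeatureStore) -> None:
    # In production `events` would be an unbounded stream; here it is a list.
    for event in events:
        store.save(to_features(event))


sample_events = [
    {"user_id": "a", "amount": 50.0},
    {"user_id": "b", "amount": 2500.0},
]
store = InMemoryFeatureStore()
run_pipeline(sample_events, store)
```

The pipeline itself only knows how to pull an event, transform it, and save the result, which is what makes it easy to point at different inputs and outputs.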
Once the features are in the store, you can fetch them to
To ensure your deployed model performance matches the test metrics you get at training time, you need to generate features IN THE EXACT SAME WAY.
This is especially tricky for real-time feature pipelines, where
We would like to re-use as much code as possible, and only re-write pre-processing and post-processing logic, depending on
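One way to picture this kind of re-use is sketched below, under assumptions: a single shared transformation (`compute_features`, a made-up example) is wrapped by a small `build_pipeline` helper that accepts whatever source and sink a given mode needs. None of these names come from this repository.

```python
from typing import Callable, Dict, Iterable, List


def compute_features(event: Dict) -> Dict:
    """Shared transformation, used unchanged in every mode (hypothetical)."""
    return {"id": event["id"], "value_squared": event["value"] ** 2}


def build_pipeline(
    source: Iterable[Dict],
    sink: Callable[[Dict], None],
) -> Callable[[], None]:
    """Wire a mode-specific source and sink around the shared transformation."""
    def run() -> None:
        for event in source:
            sink(compute_features(event))
    return run


# Backfill-style wiring: historical events in, offline storage out.
historical_events = [{"id": 1, "value": 2}, {"id": 2, "value": 3}]
offline_rows: List[Dict] = []
build_pipeline(historical_events, offline_rows.append)()

# Production-style wiring: live events in, online storage out.
live_events = iter([{"id": 2, "value": 3}])
online_rows: List[Dict] = []
build_pipeline(live_events, online_rows.append)()
```

Because both wirings call the same `compute_features`, the same raw event yields the same feature row whether it arrives live or from history.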
Python alone is not a language designed for speed 🐢, which makes it unsuitable for real-time processing. Because of this, real-time feature pipelines were usually written with JVM-based tools like Apache Spark or Apache Flink.
However, things are changing fast with the emergence of Rust 🦀 and libraries like Bytewax 🐝 that expose a pure Python API on top of a highly-efficient language like Rust.
So you get the best of both worlds: you can develop highly performant and scalable real-time pipelines while leveraging top-notch Python libraries.
Create a Python virtual environment and install the project dependencies with
$ make init
Set your Hopsworks API key and project name variables in set_environment_variables_template.sh, rename the file to set_environment_variables.sh, and run it (sign up for free at hopsworks.ai to get these 2 values)
$ . ./set_environment_variables.sh
To run the feature pipeline in production mode, run
$ make run
To run the feature pipeline in backfill mode, set your PREFECT_API_KEY in set_environment_variables_template.sh, run the file, and then run
$ from_day=2023-08-01 make backfill
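How this repository consumes `from_day` is not shown here. As a hedged sketch only, a Python script launched this way could read the variable from its environment and expand it into the list of days to backfill. The `days_to_backfill` helper and the fixed `today` value are assumptions made for illustration.

```python
import os
from datetime import date, timedelta
from typing import List, Optional


def days_to_backfill(from_day: str, today: Optional[date] = None) -> List[date]:
    """Return every day from `from_day` up to `today`, both inclusive."""
    start = date.fromisoformat(from_day)
    end = today or date.today()
    if start > end:
        raise ValueError(f"from_day {from_day} is after {end.isoformat()}")
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


# Environment variables set on the `make` command line are passed on to the
# commands it runs; the default below is only for illustration.
from_day = os.environ.get("from_day", "2023-08-01")
days = days_to_backfill(from_day, today=date(2023, 8, 3))
```

Fixing `today` keeps the example deterministic; leaving it out falls back to the current date.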
To run the feature pipeline in debug mode, run
$ make debug
I am preparing a new hands-on tutorial where you will learn to build a complete real-time ML system, from A to Z.
➡️ Subscribe to The Real-World ML Newsletter to access exclusive discounts.