End-to-End Feature Platform for Machine Learning

Chronon is an open-source feature platform that enables ML teams to rapidly build consistent ML features across training and serving environments at scale.

Proven in Production at Industry-Leading ML Teams

Point-in-Time Correctness

Eliminate data leakage with guaranteed point-in-time correctness. Generate accurate training datasets using state-of-the-art aggregation algorithms—no more expensive log-and-wait pipelines or costly retraining cycles.

Production-Grade Performance

Serve features with sub-10ms p99 latency using battle-tested Vert.x infrastructure. Deploy in embedded mode for minimal overhead or standalone for independent scaling—your choice, zero code changes.

Guaranteed Consistency

Write features once. The same declarative definitions automatically power both batch training datasets and real-time serving endpoints—eliminating training-serving skew and the bugs that come with maintaining duplicate implementations.
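As a rough illustration of why a single definition removes training-serving skew, here is a minimal sketch in plain Python. It is not Chronon's implementation: `SumAggregation`, `batch_value`, and `OnlineState` are hypothetical names made up for this example. One aggregation definition is folded over the full history by a batch path and applied event by event by a serving path, and both arrive at the same value because they share the same update logic.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class SumAggregation:
    """Hypothetical single definition of a feature: a running sum of one column."""
    column: str

    def initial(self) -> float:
        return 0.0

    def update(self, state: float, event: dict) -> float:
        return state + event[self.column]


def batch_value(agg, events):
    """Batch path: fold the full event history in one pass."""
    return reduce(agg.update, events, agg.initial())


class OnlineState:
    """Serving path: keep per-key state and update it one event at a time."""

    def __init__(self, agg):
        self.agg = agg
        self.state = {}

    def ingest(self, key, event):
        current = self.state.get(key, self.agg.initial())
        self.state[key] = self.agg.update(current, event)

    def fetch(self, key):
        return self.state.get(key, self.agg.initial())


events = [{"price": 5.0}, {"price": 7.5}, {"price": 2.5}]
agg = SumAggregation(column="price")

online = OnlineState(agg)
for event in events:
    online.ingest("user_1", event)

batch_result = batch_value(agg, events)
online_result = online.fetch("user_1")
```

Because both paths call the same `update` method, changing the feature logic changes training and serving together, which is the property the paragraph above describes.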

Developer-Friendly API

Express complex temporal aggregations in simple, declarative Python. One unified API works across batch (Spark), streaming (Flink), and serving contexts—no need to learn multiple frameworks or translate logic between execution engines.

Enterprise-Ready Technology Stack

Apache Spark

Scalable batch processing for historical features

Apache Flink

Streaming aggregations for real-time feature updates

Flexible Storage

Bring your own KV store (Redis, DynamoDB, etc.)

Vert.x Serving Layer

High-throughput serving with flexible deployment

Features Defined in Code, Not Configuration

Connect any data source—event streams, database tables, or external APIs—with a simple, type-safe configuration.

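# Import paths as in the open-source Chronon Python API; adjust for your version
from ai.chronon.api.ttypes import Source, EventSource
from ai.chronon.query import Query, select

source = Source(
    events=EventSource(
        table="data.purchases",    # batch table with historical purchase events
        topic="events/purchases",  # streaming topic carrying the same events
        query=Query(
            # Name the key column user_id so it matches the GroupBy keys
            selects=select(user_id="user_id", price="purchase_price"),
            time_column="ts"       # event-time column
        )
    )
)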
            
Define time-windowed aggregations that both compute training data and serve real-time results, with schema inference included.

from ai.chronon.group_by import GroupBy, Aggregation, Operation, Window, TimeUnit

feature_group = GroupBy(
    sources=[source],
    keys=["user_id"],
    aggregations=[
        Aggregation(
            input_column="price",
            operation=Operation.SUM,
            windows=[Window(3, TimeUnit.DAYS), Window(14, TimeUnit.DAYS)]
        ),
        Aggregation(
            input_column="price",
            operation=Operation.LAST_K(10),
        )
    ],
    online=True
)
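To show what these aggregations compute for a single key, here is a small sketch in plain Python. It does not use Chronon and makes no claim about how Chronon evaluates windows internally. The helpers `windowed_sum` and `last_k` are hypothetical, and the window boundaries (start inclusive, query time exclusive) and the oldest-first ordering of the last-k result are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def windowed_sum(events, as_of, window):
    """Sum values of events with as_of - window <= ts < as_of (assumed bounds)."""
    start = as_of - window
    return sum(value for ts, value in events if start <= ts < as_of)


def last_k(events, as_of, k):
    """Return the k most recent values strictly before as_of, oldest first."""
    earlier = sorted((ts, value) for ts, value in events if ts < as_of)
    return [value for _, value in earlier[-k:]]


# (event time, price) pairs for a single user
purchases = [
    (datetime(2023, 12, 25), 10.0),
    (datetime(2024, 1, 10), 20.0),
    (datetime(2024, 1, 13), 5.0),
    (datetime(2024, 1, 14), 7.0),
]
now = datetime(2024, 1, 15)

sum_3d = windowed_sum(purchases, now, timedelta(days=3))    # Jan 13 and Jan 14
sum_14d = windowed_sum(purchases, now, timedelta(days=14))  # Jan 10, 13, and 14
recent = last_k(purchases, now, k=2)                        # two latest prices
```

The 3-day window picks up only the two most recent purchases, while the 14-day window also includes the January 10 purchase and still excludes the one from December.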
            
Join multiple feature groups into training sets with guaranteed point-in-time correctness, preventing data leakage automatically.

from ai.chronon.join import Join, JoinPart

training_set = Join(
    left=EventSource(table="data.checkouts", ...),
    right_parts=[JoinPart(group_by=feature_group)],  # the GroupBy defined above
    online=True
)
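The idea behind a point-in-time join can be sketched in a few lines of plain Python. This is an illustration only, not Chronon's algorithm: `point_in_time_join` and the output column name `price_sum_3d` are hypothetical, and returning 0 for a key with no events is an assumption made for simplicity. For each row on the left side, the feature is computed only from right-side events that happened before that row's timestamp, so no future information leaks into the training set.

```python
from datetime import datetime, timedelta


def point_in_time_join(left_rows, events_by_key, window):
    """For each left row, sum right-side events for the same key that fall
    within `window` strictly before the left row's timestamp."""
    out = []
    for key, ts in left_rows:
        start = ts - window
        total = sum(
            value
            for event_ts, value in events_by_key.get(key, [])
            if start <= event_ts < ts
        )
        out.append({"user_id": key, "ts": ts, "price_sum_3d": total})
    return out


# Right side: purchase events per user as (event time, price)
purchases = {
    "alice": [
        (datetime(2024, 1, 1, 9), 10.0),
        (datetime(2024, 1, 2, 9), 20.0),
        (datetime(2024, 1, 3, 9), 40.0),
    ],
}
# Left side: checkout events as (user, checkout time)
checkouts = [
    ("alice", datetime(2024, 1, 2, 12)),
    ("alice", datetime(2024, 1, 3, 12)),
    ("bob", datetime(2024, 1, 3, 12)),
]

training_rows = point_in_time_join(checkouts, purchases, timedelta(days=3))
```

The first checkout for `alice` sees only the two purchases made before it, even though a third purchase exists in the data. A naive join over the full purchase table would have included that later purchase and leaked it into the training row.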
            

Community & Resources

Blog Post

Chronon — A Declarative Feature Engineering Framework

Airbnb Tech Blog

Blog Post

Shepherd: Stripe's next-generation machine learning feature engineering platform

Stripe Engineering

Video

Chronon: Airbnb's Open-Source Data Platform

MLOps summit - 2025

Video

Building Generative Recommenders with Chronon

Feature Store summit - 2025