4 docs tagged with "real-time"

Low-Latency Feature Serving

Redis, Cassandra, and in-memory stores for feature retrieval at millisecond to sub-millisecond latency.

The fundamental split between pre-computed offline and real-time online features.

Overview of real-time feature engineering for low-latency ML systems.

Streaming LLM output in Python: server-sent events, async generators, FastAPI streaming endpoints, and building real-time chat UIs.