Low-Latency Feature Serving
Redis, Cassandra, and in-memory stores for millisecond to sub-millisecond feature retrieval.
The fundamental split between pre-computed offline and real-time online features.
Overview of real-time feature engineering for low-latency ML systems.
Streaming LLM output in Python - server-sent events, async generators, FastAPI streaming endpoints, and building real-time chat UIs.