Publications
Research
April 2026 · memory, retrieval
Your RAG Cache Is Serving Wrong Answers
Semantic caching for RAG systems introduces a correctness hazard that is hard to see and easy to ship. This post documents the failure mode, shows how often it occurs across model families, and proposes a lightweight retrieval-aware validation layer that cuts stale-hit errors by over 90% with minimal latency overhead.
April 2026 · memory, retrieval
Access Patterns for Caching in AI Agent and RAG Systems
A taxonomy of nine distinct access patterns across enterprise RAG and agentic deployments — from single-hop factual lookup to parallel agent fleets — and what each requires from a cache. Includes a gap analysis of existing solutions.