Publications

Research

April 2026 · memory · retrieval

Your RAG Cache Is Serving Wrong Answers

Semantic caching for RAG systems introduces a correctness hazard that is hard to see and easy to ship. This post documents the failure mode, shows how often it occurs across model families, and proposes a lightweight retrieval-aware validation layer that cuts stale-hit errors by over 90% with minimal latency overhead.

April 2026 · memory · retrieval

Access Patterns for Caching in AI Agent and RAG Systems

A taxonomy of nine distinct access patterns across enterprise RAG and agentic deployments — from single-hop factual lookup to parallel agent fleets — and what each requires from a cache. Includes a gap analysis of existing solutions.
