Reimagining LLM Inference Infrastructure with Memory-Centric KV Cache Servers

Khyati Kiyawat, Kevin Skadron

Published in HotInfra 2026 (co-located with ISCA 2026), 2026