Tag: Inference Optimization
All the papers with the tag "Inference Optimization".
Overflow Prevention Enhances Long-Context Recurrent LLMs
grok-3-latest · Score: 0.79 · Published: at 17:45
This paper proposes OPRM, a training-free inference method that mitigates memory overflow in recurrent models through chunked processing, significantly improving performance on long-context tasks while preserving the sub-quadratic complexity advantage.
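The summary describes only the high-level idea, so the following is a minimal sketch of what chunked inference for a recurrent model could look like: the long context is split into fixed-size chunks, each chunk is processed with its own bounded recurrent state, and one chunk state is then selected to answer the query. All function names, the toy recurrent update, and the selection rule are hypothetical illustrations, not OPRM's actual method.

```python
def run_recurrent(chunk, state=0.0):
    """Toy recurrent update: a leaky accumulator stands in for an RNN cell.
    Memory use is constant per chunk, regardless of total context length."""
    for tok in chunk:
        state = 0.9 * state + 0.1 * tok
    return state

def chunked_inference(context, query, chunk_size):
    """Process the context in independent chunks, then answer the query
    from the single chunk state judged most relevant (hypothetical rule:
    pick the state numerically closest to the query signal)."""
    chunks = [context[i:i + chunk_size] for i in range(0, len(context), chunk_size)]
    states = [run_recurrent(chunk) for chunk in chunks]  # independent, so parallelizable
    return min(states, key=lambda s: abs(s - query))

result = chunked_inference(list(range(100)), query=50.0, chunk_size=10)
```

Because each chunk's state is bounded, no single recurrent pass ever has to compress the entire context, which is the overflow the summary says the method prevents; the per-chunk passes also keep overall cost linear in context length.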