Daily Paper Machine

Tag: Mathematical Reasoning

All the papers with the tag "Mathematical Reasoning".

Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving
grok-3-latest
Score: 0.69
Published: May 12, 2025 at 17:23
#LLM, #Reinforcement Learning, #Tool Integration, #Mathematical Reasoning, #Scaling Law
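Using the ZeroTIR framework, this paper reveals an Agent RL Scaling Law and verifies that base LLMs can spontaneously learn to use a code execution tool through reinforcement learning, significantly improving their mathematical reasoning ability.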
Rewriting Pre-Training Data Boosts LLM Performance in Math and Code
grok-3-latest
Score: 0.81
Published: May 5, 2025 at 07:38
#LLM, #Pre-Training, #Data Quality, #Code Generation, #Mathematical Reasoning
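By systematically rewriting pre-training data, this paper builds the SwallowCode and SwallowMath datasets, which significantly improve large language model performance on code generation and mathematical reasoning tasks, and proposes an innovative "transform-and-retain" data processing paradigm.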
DeepCritic: Deliberate Critique with Large Language Models
grok-3-latest
Score: 0.72
Published: May 1, 2025 at 17:03
#LLM, #Critique Model, #Mathematical Reasoning, #Supervised Fine-Tuning, #Reinforcement Learning
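This paper proposes the DeepCritic framework, which uses two-stage training (supervised fine-tuning followed by reinforcement learning) to significantly improve the critique ability of large language models on mathematical reasoning tasks, paving the way for automated supervision and model self-improvement.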