ReasoningModel

Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search

Paper overview. Title: Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search. Institution: AI Lab. Base models: Llama-3.1-8B-Instruct, DeepSeek-Math-7B-Instruct. Paper: https://arxiv.org/abs/2501.01478 ...

s1: Simple test-time scaling

Affiliation: Stanford. Code: https://github.com/simplescaling/s1 Base model: Qwen2.5-32B-Instruct. Paper: https://arxiv.org/abs/2501.19393 ...

Sky-T1: Train your own O1 preview model within $450

Original blog post: https://novasky-ai.github.io/posts/sky-t1/ Code: https://github.com/NovaSky-AI/SkyThought ...

LIMO: Less Is More for Reasoning

Affiliation: SJTU. Code: https://github.com/GAIR-NLP/LIMO Base model: Qwen2.5-32B-Instruct. Paper: https://arxiv.org/pdf/2502.03387 ...