Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search

Paper overview. Paper title: Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search. Institution: AI Lab. Base models: Llama-3.1-8B-Instruct, DeepSeek-Math-7B-Instruct. Paper URL: https://arxiv.org/abs/2501.01478 ...

February 20, 2025 · 6 min · 2612 words · ZhaoYang