Zhipeng Chen, Yingqian Min, Beichen Zhang, Jie Chen, Jinhao Jiang, Daixuan Cheng, Wayne Xin Zhao, Zheng Liu, Xu Miao, Yang Lu, Lei Fang, Zhongyuan Wang, Ji-Rong Wen
An Empirical Study on Eliciting and Improving R1-like Reasoning Models
https://arxiv.org/abs/2503.04548
