Sharing a brief introductory note on RLHF algorithms that I made for our lab's reading group presentation. The GitHub repo holds the slides as well as a list of interesting papers: https://github.com/yihedeng9/rlhf-summary-notes
