Summary of the 2023 article "A General Theoretical Paradigm to Understand Learning from Human Preferences" (AKA the IPO paper) by Azar et al.
Summary of the 2022 article "Training language models to follow instructions with human feedback" (AKA the InstructGPT article) by Ouyang et al.