Standardization at Scale is a Superpower
14 Apr 2025 Structuring a codebase so that elements, components, files, etc. follow a standard convention is not just a matter of aesthetic taste; it is crucial for enabling work and maintenance on large codebases.
paper-summary reinforcement-learning language-modeling
Paper Summary: DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
06 Apr 2025 Summary of the 2024 article "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" by Shao et al.
paper-summary reinforcement-learning language-modeling
Paper Summary: Proximal Policy Optimization Algorithms
06 Apr 2025 Summary of the 2017 article "Proximal Policy Optimization Algorithms" by Schulman et al.
Experience Means Finding General Guidelines that Mitigate Risk and Increase Efficiency
18 Oct 2024 General organizational guidelines and lessons learned in tech companies.
paper-summary alignment instruction-tuning
Paper Summary: Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
06 Oct 2024 Summary of the 2024 article "Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study" by Xu et al.
paper-summary language-modeling
Paper Summary: The Science of Detecting LLM-Generated Texts
28 Jul 2024 Summary of the 2023 article "The Science of Detecting LLM-Generated Texts" by Tang et al.
Paper Summary: Few-shot Fine-Tuning vs In-context Learning: a Fair Comparison and Evaluation
22 Jul 2024 Summary of the 2023 article "Few-shot Fine-Tuning vs In-context Learning: a Fair Comparison and Evaluation" by Mosbach et al.
paper-summary instruction-tuning language-modeling
Paper Summary: Multitask Prompted Training Enables Zero-Shot Task Generalization
31 Mar 2024 Summary of the 2021 article "Multitask Prompted Training Enables Zero-Shot Task Generalization" by Sanh et al., also known as the T0 (T-zero) article.
Paper Summary: Learning to summarize from human feedback
09 Mar 2024 Summary of the 2020 article "Learning to summarize from human feedback" by Stiennon et al.
paper-summary instruction-tuning
Paper Summary: Zephyr: Direct Distillation of LM Alignment
02 Jan 2024 Summary of the 2023 article "Zephyr: Direct Distillation of LM Alignment" by Tunstall et al.