queirozf.com

software-architecture

Standardization at Scale is a Superpower

14 Apr 2025   Structuring a codebase so that its elements, components, files, etc. follow a standard convention is not just a matter of aesthetic taste; it is crucial for enabling work and maintenance on large codebases.

paper-summary reinforcement-learning language-modeling

Paper Summary: DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

06 Apr 2025   Summary of the 2024 article "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" by Shao et al.

paper-summary reinforcement-learning language-modeling

Paper Summary: Proximal Policy Optimization Algorithms

06 Apr 2025   Summary of the 2017 article "Proximal Policy Optimization Algorithms" by Schulman et al.

Experience Means Finding General Guidelines that Mitigate Risk and Increase Efficiency

18 Oct 2024   General organizational guidelines and lessons learned in tech companies.

paper-summary alignment instruction-tuning

Paper Summary: Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

06 Oct 2024   Summary of the 2024 article "Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study" by Xu et al.

paper-summary language-modeling

Paper Summary: The Science of Detecting LLM-Generated Texts

28 Jul 2024   Summary of the 2023 article "The Science of Detecting LLM-Generated Texts" by Tang et al.

paper-summary

Paper Summary: Few-shot Fine-Tuning vs In-context Learning: a Fair Comparison and Evaluation

22 Jul 2024   Summary of the 2023 article "Few-shot Fine-Tuning vs In-context Learning: a Fair Comparison and Evaluation" by Mosbach et al.

paper-summary instruction-tuning language-modeling

Paper Summary: Multitask Prompted Training Enables Zero-Shot Task Generalization

31 Mar 2024   Summary of the 2021 article "Multitask Prompted Training Enables Zero-Shot Task Generalization" by Sanh et al., also known as the T0 (T-zero) article.

paper-summary

Paper Summary: Learning to summarize from human feedback

09 Mar 2024   Summary of the 2020 article "Learning to summarize from human feedback" by Stiennon et al.

paper-summary instruction-tuning

Paper Summary: Zephyr: Direct Distillation of LM Alignment

02 Jan 2024   Summary of the 2023 article "Zephyr: Direct Distillation of LM Alignment" by Tunstall et al.
