queirozf.com

paper-summary language-modeling reasoning

Paper Summary: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

15 Jun 2025   Summary of the 2022 article "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" by Wei et al.

paper-summary sequence-learning recurrent-neural-networks

Paper Summary: Learning to Forget: Continual Prediction with LSTM

31 May 2025   Summary of the 1999 article "Learning to Forget: Continual Prediction with LSTM" by Gers et al.

testing software-engineering software-architecture

Why Test Software? TL;DR Summary

27 Apr 2025   Minimum Viable Post to send to your colleague who isn't sure whether tests are really needed.

paper-summary language-modeling reinforcement-learning

Paper Summary: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

19 Apr 2025   Summary of the 2025 article "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning" by DeepSeek AI.

apache-pinot

Manipulating Datetime Values in Apache Pinot: Reference and Examples

17 Apr 2025   Examples of how to manipulate and format datetime and datetime-like values in Apache Pinot.

software-architecture

Standardization at Scale is a Superpower

14 Apr 2025   Structuring the codebase so that elements, components, files, etc. follow a standard convention is not just a matter of aesthetic taste; it is crucial for enabling work and maintenance on large codebases.

paper-summary reinforcement-learning language-modeling

Paper Summary: DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

06 Apr 2025   Summary of the 2024 article "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" by Shao et al.

paper-summary reinforcement-learning language-modeling

Paper Summary: Proximal Policy Optimization Algorithms

06 Apr 2025   Summary of the 2017 article "Proximal Policy Optimization Algorithms" by Schulman et al.

Experience Means Finding General Guidelines that Mitigate Risk and Increase Efficiency

18 Oct 2024   General organizational guidelines and lessons learned in tech companies.

paper-summary alignment instruction-tuning

Paper Summary: Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

06 Oct 2024   Summary of the 2024 article "Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study" by Xu et al.
