All Posts
AI coding tools have separated into two classes. One compensates for missing operator skill; one amplifies present operator skill. Senior engineers systematically prefer the second. The split is a market signal that operator skill is now the bottleneck, and a snapshot, not a destination.
April 2026
When the cost of the next step approaches zero, the question of whether to take it stops getting asked. Slow isn't a virtue. It's the protocol for keeping fast aligned with what you actually wanted.
March 2026
Every component eval passed. The agent pipeline still failed. What changes when you move from evaluating single calls to evaluating trajectories.
February 2026
LLMs pass the bar exam but can't count letters. The failures aren't random. They're architectural fingerprints, and understanding them changes how you use these systems.
February 2026
The hardest part of reinforcement learning isn't the algorithm. It's knowing what you actually want, and whether that can even be formalized.
January 2026
Watching AI tools solve the same problem in radically different ways reveals something about their cognitive architecture, and about the nature of problem-solving itself.
December 2025
My job quietly changed underneath me. I used to write code. Now I write instructions for systems that write code. Tracing a 50-year abstraction trend to understand what engineering is becoming.
December 2025
What it's like to build AI systems that automate your own skills, and the quiet fear that doesn't fit neatly into hype or doom.
December 2025
AI is collapsing the cost of building software, and with it, the value of technical advantages. If building is cheap, distribution, the ability to reach and retain people, becomes the only defensible position. An engineer wrestles with what that means.
December 2025
Most of building AI agents is debugging JSON. But sometimes you remember what you're actually building, and the gap between those two realities is where the real questions live.
November 2025
LLMs don't remember anything, yet agents built on them seem to learn and retain information. The gap between these facts reveals something surprising about what memory actually is.
November 2025
As AI handles more code generation, the human skill shifts from creation to curation. What is engineering taste, why can't AI have it, and what does that mean for us?
November 2025
Children forget everything yet learn faster than any AI. What are they exhibiting about learning that we've failed to capture in our models?
October 2025
Most AI engineers build evals like unit tests. That's why they fail. What changes when you treat evals as hypotheses about what matters instead of tests of model quality.
September 2025
As the agentic ecosystem matures, tools are no longer scarce. They're everywhere. The hard part now isn't wiring up tools; it's helping models discover which ones to use.
August 2025
In the past year, agent architectures have gone from niche experiments to front-page product strategies. But one area remains dramatically under-discussed: context engineering.
July 2025
Evaluation has quietly become the backbone of modern AI products. It's what separates a system that 'looks cool in demos' from one that actually works.
July 2025
Over the next 10 years, the GenAI landscape won't be shaped by prompt hacks or viral demos. It will be defined by who builds the infrastructure, systems, safety nets, and experiences that actually ship and scale.
June 2025
A deep dive into how companies are actually using large language models in production, from GitHub Copilot writing 46% of code to enterprises struggling with hallucination rates of 27%.
July 2023
An in-depth analysis of the LLM ecosystem in May 2023, from Geoffrey Hinton's dramatic Google exit to the $50 billion funding frenzy reshaping Silicon Valley's power structure.
May 2023
This is the first of a multi-part series exploring exciting new developments in AI. A deep dive into the models that power ChatGPT.
April 2023