Annas Bin Adil

Engineer at heart • Exploring AI, robotics & the human experience

All Posts

Evaluating Agents Is a Different Problem

Every component eval passed. The agent pipeline still failed. What changes when you move from evaluating single calls to evaluating trajectories.

February 2026

Why LLMs Are Brilliantly Stupid

LLMs pass the bar exam but can't count letters. The failures aren't random. They're architectural fingerprints, and understanding them changes how you use these systems.

February 2026

Debugging as a Window into How AI Thinks

Watching AI tools solve the same problem in radically different ways reveals something about their cognitive architecture, and about the nature of problem-solving itself.

December 2025

Building the Thing That Replaces You

What it's like to build AI systems that automate your own skills, and the quiet fear that doesn't fit neatly into hype or doom.

December 2025

Distribution Is the Only Moat: An Engineer's Reluctant Reckoning

AI is collapsing the cost of building software, and with it, the value of technical advantages. If building is cheap, distribution, the ability to reach and retain people, becomes the only defensible position. An engineer wrestles with what that means.

December 2025

What It Means to Learn

Children forget everything yet learn faster than any AI. What do they show us about learning that we've failed to capture in our models?

October 2025

Evals Are Hypotheses, Not Tests

Most AI engineers build evals like unit tests. That's why they fail. What changes when you treat evals as hypotheses about what matters instead of tests of model quality.

September 2025

MCP Discoverability: The Hidden Cost of Scale

As the agentic ecosystem matures, tools are no longer scarce. They're everywhere. The hard part now isn't wiring up tools — it's helping models discover which ones to use.

August 2025

What are Large Language Models?

This is the first part of a multi-part series exploring exciting new developments in AI: a deep dive into the models that power ChatGPT.

April 2023