There's a feeling I don't hear people talk about much in the AI community.
It's not the excitement. I have that too, and it's real. Watching an agent I built solve a problem I would have spent an afternoon on, there's genuine satisfaction in that. It's not the alignment worry either. Those conversations matter, but they're not what keeps me up.
The feeling I mean is quieter. It's the one where I'm sitting at my desk, teaching an AI agent to debug a system, and somewhere in the back of my mind a voice says: you're teaching it to do the thing you're good at. You are building, with your own hands, the thing that makes you slightly more replaceable.
I don't say this out loud much. In the AI community, the approved emotional registers are enthusiasm and existential concern. There's not a lot of space for the middle thing, for the engineer who genuinely likes building these systems and also feels the ground shifting under his own feet, slowly, one capability jump at a time. The feeling isn't dramatic enough for a conference talk. It's too specific for an op-ed. It just sits there, quiet, while you work.
This essay is about that middle thing. Not AI hype. Not AI doom. The honest in-between.
It's not like waking up one morning and finding your job has vanished. That's not how it works, at least not for me, at least not yet.
It's more like watching a cliff face lose ground to the ocean. You don't see the cliff collapse. You see the waterline inch higher. Each new model release, each capability improvement, the water rises a little. The cliff is still there. You're still standing on it. But you can feel the edge getting closer, and you know which direction it's moving.
I want to be specific about what this looks like in my actual work, because abstract anxiety is easy to dismiss.
Here's what's been automated, or substantially automated, in my own day-to-day over the past year:
Boilerplate generation. Completely automated. I don't write setup code anymore. Scaffolding a new service, wiring up config files, initializing project structures. I describe what I want and the code appears. This used to take me an afternoon. Now it takes ten minutes of prompting and review.
Bug pattern recognition. Mostly automated. AI catches roughly 80% of the bugs I used to catch manually during code review. Common anti-patterns, off-by-one errors, null reference risks, race conditions in obvious places. The model sees them before I do.
API integration. Largely automated. Connect to a new service? The AI reads the documentation, generates the client, handles the authentication flow. I used to spend hours reading API docs and writing client code. Now I spend minutes reviewing what the agent produced.
Test scaffolding. Mostly automated. Generate the test structure, set up mocks, create the fixtures. I fill in the actual assertions and edge cases, but the framework comes for free.
Documentation. Partially automated. First drafts are AI-generated. I rewrite for accuracy and voice, but the blank page problem is gone.
Code review for style and patterns. Partially automated. The AI catches consistency issues, naming convention violations, formatting problems. The mechanical parts of code review happen automatically.
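To make the "bug pattern recognition" item concrete, here's a toy sketch of two of the bug classes I mean, the off-by-one slice and the unchecked null, the kind of mechanical slip an AI reviewer now flags before I do. The function names and data shapes are invented for illustration, not from any real codebase:

```python
def last_n_items(items, n):
    # The buggy version many of us have written:
    #   return items[len(items) - n - 1:]   # off by one
    # Correct slice, guarded so n larger than the list doesn't wrap around:
    if n >= len(items):
        return list(items)
    return items[len(items) - n:]

def total_price(order):
    # Null-reference risk: assuming "discount" exists.
    #   return order["price"] - order["discount"]   # KeyError when absent
    # Guarded version an automated reviewer would suggest:
    discount = order.get("discount") or 0
    return order["price"] - discount
```

Nothing here is hard. That's the point: it's exactly the mechanical, pattern-shaped checking that has moved from my eyes to the model's.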
Routine cognitive work is now automated. This is not a prediction. It's already happened. The question isn't whether this continues but how fast the boundary moves.
Now here's what's not automated, or not yet. The common thread is judgment:

Deciding what to build. Which problem matters most, not which solution is cleverest.

Making architectural bets with long-term consequences. I wrote about this in "The Art of Saying No" as engineering taste, the implicit model built through consequences and failure.

Debugging truly novel problems. The AI loops through known fixes while a human eventually gets frustrated enough to try something genuinely different.

Making trade-offs between competing priorities. Situations where the "right" answer depends on context the AI doesn't have.
The AI suggests architectures that look right in isolation. Whether they're right for this team, this product, this moment, that's still a human call. The pattern that emerges: judgment under uncertainty isn't automatable. Not yet.
I used to just assert that judgment was hard and leave it there. But recently I've been trying to understand what specifically makes these decisions resistant to automation. Not "judgment is special because humans are special," but what structural properties make it a different kind of problem.
Here's what I've come to, though I hold this loosely.
Judgment requires modeling consequences across time horizons that weren't in training data. An AI can pattern-match to "similar past decisions," but architectural choices play out over months or years in ways that are deeply entangled with a specific organization's codebase, team dynamics, hiring plans, and product roadmap. The state space of possible consequences is enormous, and most of the relevant information is private, never appearing in any training corpus. When I decide against microservices for a small team, I'm simulating eighteen months of operational burden based on knowing this specific team's on-call tolerance and hiring timeline. That simulation draws on private context that no model has access to.
Judgment involves risk assessment calibrated by living with consequences. When I made a bad architectural decision three years ago, I spent fourteen months maintaining the mess. That pain recalibrated my intuition in ways I can still feel. RLHF provides feedback on outputs, but not on the downstream consequences of decisions months later. There's no training signal for "this looked right in code review but made the team miserable for a year." The feedback loop that builds human judgment operates on a timescale and specificity that current training approaches don't capture.
Judgment often means knowing what NOT to optimize for. An AI will optimize whatever metric you give it. But knowing which metric to give it, knowing what to leave unmeasured, knowing when to override the data because you have context the data doesn't capture: this is a meta-cognitive skill. It requires a model of the problem that goes beyond the problem's own representation. I've seen AI-generated architectures that were beautifully optimized for latency in systems where reliability mattered more. The AI didn't know to ask "what does this team actually care about?" because that question sits outside the frame.
I'm genuinely curious whether these are permanent structural barriers or just current limitations that will be engineered around. I suspect the answer is different for each one.
I need to sit with that "not yet" for a moment, because it's doing a lot of work.
On good days, I read "not yet" as a moat. The things that remain hard to automate are the things I've spent a career developing, and they seem structurally resistant for the reasons I described above.
On bad days, I wonder if I'm doing motivated reasoning. I used to think writing couldn't be automated. Then GPT-3 happened. I used to think code generation was a decade away. Then Copilot happened. Every time I say "this requires something human," I'm aware that people have said that about every capability that later got automated. Arithmetic. Chess. Translation. Legal research. Medical diagnosis. The history of "this is uniquely human" claims is not encouraging.
The strongest version of the counter-argument goes like this: judgment is just pattern-matching on a larger dataset, and AI will get enough data eventually. This has been true for every previous "uniquely human" skill. Chess intuition turned out to be pattern-matching. Medical diagnosis turned out to be pattern-matching. Why should engineering judgment be different?
I think there might be a qualitative difference, and it comes down to feedback loops. Chess has a clear signal: you win or you lose. Medical diagnosis has a relatively fast signal: the patient gets better or doesn't. But architectural decisions have delayed, ambiguous, context-dependent feedback. The outcome is visible months later, entangled with dozens of other decisions, team changes, and market shifts. If judgment is pattern-matching but the feedback loop is fundamentally incompatible with current training approaches, the bottleneck isn't about data quantity. It's structural.
I honestly don't know if that structural argument holds. I'd put roughly 30% odds that judgment gets meaningfully automated within 10 years, and 70% that the feedback-loop problem keeps it hard for longer. But I notice I want the odds to be lower than 30%, and that wanting is itself a signal I should be careful about.
Here's how I'd test my beliefs. If AI systems can make context-dependent architectural trade-offs that experienced engineers rate as "good judgment" more than 70% of the time within 3 years, the "judgment can't be automated" thesis is probably wrong. I'd currently bet against this at 60/40 odds, but I'd reassess with every major model release.
A second prediction: if judgment is primarily pattern-matching, we should see AI performance on architectural decisions improve steadily as training data and compute scale. If it plateaus despite scaling (the way common-sense reasoning plateaued for a while before chain-of-thought unlocked it), that suggests something beyond simple pattern-matching is involved. I predict we'll see such a plateau by 2028.
A third: within 5 years, I predict that more than 50% of software engineers will describe their primary value-add as "judgment and context" rather than "code writing." If fewer than 30% describe it that way, then either the automation is slower than I think, or judgment itself turned out to be more automatable than I expected.
I believe judgment is hard to automate. I also know I might be wrong. And I can't fully tell the difference between genuine analysis and motivated reasoning when the thing I'm analyzing is also the thing I need to believe in to feel okay about my career.
This is the question that actually keeps me up.
How do you stay motivated to sharpen skills that might depreciate?
I sit down to study distributed systems, or work through a new paper on agent architectures, or practice debugging a complex concurrency issue, and two voices argue at once: "Why bother, if AI handles this in two years?" and "You can't evaluate what you don't understand. You can't say no to the wrong architecture if you don't know what the right one looks like." Both feel true simultaneously. That's the paradox. Investing in skills that might depreciate feels irrational. Stopping investment feels suicidal.
I've landed somewhere uncomfortable. You have to invest in skills while knowing they might not hold their value, because the alternative is definitely worse. The person who stops developing because "AI will handle it" becomes a rubber stamp, and rubber stamps are the easiest things to automate. Deep skill is what lets you evaluate what AI produces, catch the confident-but-incorrect answer, say no to the wrong architecture. Without it, you're not collaborating with AI. You're just approving its output.
So I keep investing. Not with confidence. With something closer to Pascal's Wager applied to professional development.
What scares me isn't the destination. I can imagine a future where my role looks different, and I can see how that future might be fine, even good. What scares me is the pace.
I've adapted before. From embedded systems to web development, from web to ML, from ML to agent architectures. Each transition required months of feeling incompetent and building up from scratch. I can do that. I've proven it. But each cycle is faster than the last, and that's the part that's new. The time between "this technology is emerging" and "this technology is reshaping roles" used to be measured in decades. Mainframes took twenty-odd years. Web development, fifteen. Mobile, ten. Cloud, seven or eight. AI agents? The field barely existed two years ago, and the work I was doing six months ago already looks noticeably different from the work I'm doing now. At some point, does the cycle compress below the minimum human adaptation time? Is there a speed limit on how fast a person can retool? I don't have an answer. I notice I want one, and the wanting makes me anxious. I'm trying to sit with the anxiety instead of resolving it prematurely with a reassuring story.
Someone (I wish I could remember who) told me the race metaphor is wrong. You're not racing against AI. You're surfing. The wave is bigger than you, more powerful than you, and completely indifferent to your existence. Your job isn't to outrun it but to position yourself well, read the conditions, and ride it. For me, that means spending more time on architecture and design, less on implementation. Investing in the kind of deep, local, tacit knowledge about specific teams and products that isn't in any training data. Figuring out not just how to use AI tools but where they should go, which is the gap between "AI can do X" and "X is useful for this business." But I want to be honest about the limits of this reframe. Not everyone has the background, the resources, or the career runway to be positioned well when the wave arrives. The metaphor helps me. I'm not sure it's universal.
The only thing I'm genuinely confident about: the only definitely losing strategy is to stop learning. Everything else, every reassurance about human judgment, every claim about what AI can't do, every reframe about surfing instead of racing, those are stories I tell myself to stay functional. Some of them might be true. I'm betting my career on the hope that they are.
But I'm building the wave while I'm trying to surf it. And I can feel the water rising.