29 January 2026
Thoughts on Coding Agents and Architecture
Cross-posted on: Substack
Why agentic coding demands new tools for architectural sensemaking, not less human judgment
The software engineering industry is at a crossroads. The widespread adoption of agentic coding brings an emerging tension and a growing risk of failure. This article argues that software engineering culture has eliminated the opportunities for new developers to learn systems thinking, so we have a generation of coders who can deliver working code but can’t reason about it in context. This makes sense given the incentive structures. What’s commonly measured is velocity and “impact”. Education focuses on CS theory and frameworks but gives little operational exposure. Career ladders reward shipping. Clarity, architecture, governance, and stewardship are quietly devalued. The resulting gap in systems thinking is the predictable outcome of rational behavior under existing incentives. But that gap carries risks we must address. Code written without sufficient understanding will surface as critical failures, or as organization-wide slowdowns as unaddressed technical debt introduces drag. This article outlines a framework for thinking about this risk and approaches to addressing it within your organization.
As abstraction increases, systems thinking doesn’t disappear—it must be re-externalized into tools, or it collapses under speed.
Let’s talk about systems thinking. Traditionally this has involved three levels:
- Implementation (memory, threads, IO, algorithms)
- Application (architecture, data flow, boundaries)
- Operation (latency, failure modes, load, degradation)
Implementation-level thinking has been absorbed by tools, and operational thinking has been formalized into SRE practice. What’s left is application-level systems thinking: architecture, boundaries, and flow. This layer is still essential, but no longer well-supported by training or tooling. It lives in the gaps between IDEs and dashboards, owned by no system and learned only through painful failure. That’s the layer we need to make visible again.
Without tools that make causality legible, augmentation reduces agency instead of increasing it.
In practice, guardrails can be built into agentic coding workflows to address many of the application-level architectural weaknesses in generated code. One common pattern is to treat the agent’s first output as a spike, not a finished solution. The goal isn’t to ship the code immediately, but to pause and interrogate it: does this approach actually fit the broader context? Are we introducing unnecessary complexity or dependencies? Are we accidentally encoding outdated patterns, security risks, or architectural shortcuts?
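To make that concrete, here is a minimal sketch of one way such a guardrail could be wired into a workflow. Everything in it is an assumption for illustration: the SPIKE_REVIEW.md filename, the checklist questions, and the idea of running it as a CI step or pre-push hook are hypothetical choices, not an established tool.

```python
#!/usr/bin/env python3
"""Hypothetical pre-merge gate: block agent-generated changes until a human
has answered the architectural review questions in SPIKE_REVIEW.md (assumed file)."""

import pathlib
import re
import sys

# Illustrative checklist; a real team would define its own questions.
REQUIRED_QUESTIONS = [
    "Does this approach fit the broader architectural context?",
    "Does it introduce unnecessary complexity or new dependencies?",
    "Does it encode outdated patterns, security risks, or shortcuts?",
]


def main() -> int:
    review = pathlib.Path("SPIKE_REVIEW.md")
    if not review.exists():
        print("Missing SPIKE_REVIEW.md: treat the agent's output as a spike and review it first.")
        return 1

    text = review.read_text(encoding="utf-8")
    # A question counts as answered only if it appears with a checked box: "- [x] ..."
    unanswered = [
        q for q in REQUIRED_QUESTIONS
        if not re.search(r"- \[x\]\s*" + re.escape(q), text, re.IGNORECASE)
    ]
    if unanswered:
        print("Unreviewed questions:")
        for q in unanswered:
            print(f"  - {q}")
        return 1

    print("Spike review complete; change may proceed to normal code review.")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Run as a CI step or a pre-push hook, a gate like this doesn’t judge the code itself; it just forces the pause-and-interrogate step into the workflow instead of leaving it to discipline.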
A sophisticated coding agent scaffold can ask these questions by itself, perform a review, make informed decisions, and generate higher quality code as a result. But while that may improve the architectural quality of the generated code, it does not close the growing systems-thinking gap among developers. In fact, the risk is that increasingly capable agents further obscure architectural causality, widening the comprehension gap even as surface-level quality improves.
What if we had tools that helped us to think architecturally? Like a self-updating architecture explorer? With visualizations of existing architecture and where the new changes fit into it, what changes they make, what the resultant system looks like?
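As a rough illustration of the “self-updating” part, here is a small sketch that regenerates a module-level dependency view from the source tree on every run. The package layout (src/myapp) and the Graphviz DOT output are assumptions; a real explorer would also diff against the previous snapshot and show where a proposed change sits.

```python
"""Sketch: regenerate a module-level dependency graph straight from the code.
Assumes a first-party package under ./src/myapp (hypothetical); emits Graphviz DOT."""

import ast
import pathlib

PACKAGE_ROOT = pathlib.Path("src/myapp")   # assumed project layout
PACKAGE_NAME = "myapp"                     # assumed package name


def module_name(path: pathlib.Path) -> str:
    """Turn src/myapp/foo/bar.py into the dotted module name myapp.foo.bar."""
    rel = path.relative_to(PACKAGE_ROOT.parent).with_suffix("")
    return ".".join(rel.parts)


def first_party_imports(tree: ast.AST) -> set[str]:
    """Collect imported module names that belong to our own package."""
    targets: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            targets.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            targets.add(node.module)
    return {t for t in targets if t.split(".")[0] == PACKAGE_NAME}


def main() -> None:
    edges = set()
    for path in PACKAGE_ROOT.rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        src = module_name(path)
        for dst in first_party_imports(tree):
            if dst != src:
                edges.add((src, dst))

    print("digraph architecture {")
    for src, dst in sorted(edges):
        print(f'  "{src}" -> "{dst}";')
    print("}")


if __name__ == "__main__":
    main()
```

Piping the output into Graphviz gives a picture that stays honest because it is derived from the code on every run; the interesting product work is layering change impact and history on top of that.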
What if we had an integrated performance analysis tool that was able to measure performance at the architectural unit level? Tracking drift. Giving visual feedback on performance impact of new code. Helping devs to dive down into the code from there to understand exactly why something is performing poorly.
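Sketching that idea under some simplifying assumptions: suppose each request already reports a timing tagged with the architectural unit that handled it. The baseline.json and latest_run.json files, the unit names, and the 20% threshold below are all made up; the point is only the shape of the check, comparing current per-unit latency against a stored baseline and flagging drift.

```python
"""Sketch: flag per-unit latency drift against a stored baseline.
The file formats, unit names, and 20% threshold are illustrative assumptions."""

import json
import pathlib
import statistics
from collections import defaultdict

DRIFT_THRESHOLD = 0.20  # flag units whose mean latency grew by more than 20%


def summarize(samples: list[dict]) -> dict[str, float]:
    """Collapse raw timing samples into mean latency (ms) per architectural unit."""
    by_unit: dict[str, list[float]] = defaultdict(list)
    for sample in samples:
        by_unit[sample["unit"]].append(sample["latency_ms"])
    return {unit: statistics.mean(vals) for unit, vals in by_unit.items()}


def drift_report(baseline: dict[str, float], current: dict[str, float]) -> list[str]:
    """Compare current per-unit means to the baseline and report drift or new units."""
    lines = []
    for unit, now in sorted(current.items()):
        before = baseline.get(unit)
        if before is None:
            lines.append(f"NEW   {unit}: {now:.1f} ms (no baseline)")
        elif now > before * (1 + DRIFT_THRESHOLD):
            lines.append(f"DRIFT {unit}: {before:.1f} ms -> {now:.1f} ms")
    return lines


if __name__ == "__main__":
    baseline = json.loads(pathlib.Path("baseline.json").read_text())
    samples = json.loads(pathlib.Path("latest_run.json").read_text())
    for line in drift_report(baseline, summarize(samples)):
        print(line)
```

From a flagged unit, a developer could then drill into traces or profiles for the code behind it, which is the “dive down from there” step described above.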
Essentially, I’m asking: what if we try to build our way out of this, bringing enhanced visibility at one level to compensate for the loss of visibility at another? As we move faster and faster, we need sensemaking tools that offload the cognitive burdens that don’t scale, while preserving and improving human decision-making and oversight capacity.
This isn’t unprecedented. Fighter pilots have faced the same problem for decades: the environment moves faster than raw human perception can handle. The solution wasn’t to expect pilots to “think harder,” but to give them systems that fuse signals, surface what matters, and preserve human judgment for decisions that actually require it. Modern avionics compress complexity into intelligible representations so pilots can stay inside the decision loop.
Does that mean pilots don’t learn the foundations? No, quite the contrary. They have to know the systems they’re relying on (propulsion, flight controls, sensors, weapons), their failure modes and degradation states, what happens when automation lies, lags, or drops out. The human in the loop is non-negotiable. Augmentation is fallible, not authoritative.
There’s also an analogy in medicine. Doctors still learn anatomy and physiology in depth, even though they have advanced imaging, lab tests, and algorithmic decision support. Diagnostics don’t replace understanding and judgment; they compress perception.
Software development is crossing that same threshold. As codebases, dependencies, and agentic workflows accelerate beyond individual comprehension, we don’t need less human thinking, we need better cognitive augmentation: tools that fuse architectural reality, performance signals, and change impact into a coherent picture, so humans can still reason, decide, and intervene at the right level.
For pilots, this discipline is formalized in the OODA loop: Observe, Orient, Decide, Act. Compressing the loop lets you act faster than the opponent. That means speeding up Observe and Orient, preventing cognitive saturation, and staying alert at the critical levels of awareness. In the software industry, speed is just as critical.
One takeaway is that we need to shift training from abstract CS + tool use to full system understanding: “Here’s the fused picture. Here’s how it’s produced. Here’s when to trust it, question it, or override it.” We need systems thinking, but at a higher layer. When we increase abstraction and automation without a corresponding shift in training and sensemaking tools, we’re adding risk to the system.
Pilots spend enormous time in simulators where sensors fail, data is contradictory, the system behaves “plausibly wrong”, and cognitive load is intentionally pushed past comfort. The goal isn’t memorization, it’s sensemaking under stress. This sounds a lot like debugging emergent behavior, understanding performance regressions, reasoning about distributed failures. These things don’t go away in the new world of agentic coding.
I would argue that this shift to a higher layer of systems thinking is happening, in an ad-hoc way, across the industry now. It’s messy and unformalized, because it’s new and emerging. But throw a new grad into the deep end and I believe they will learn sensemaking and systems thinking, and maybe even be able to adapt more quickly to it if given the right feedback. The key is to augment sensemaking at the right levels. Shorten the OODA loop.
We already have a massive set of tools dedicated to this. DevOps as a discipline is dedicated to enabling speed and agility: continuous integration and delivery practices, security scanning, and performance profiling are safeguards that allow teams to ship code at velocities unimaginable a couple of decades ago. We also have a growing number of tools that help us Act quickly: LLMs as copilots, autonomous agents, and so on.
What this article is highlighting is an opportunity to develop better tools for the Observe and Orient steps, so that we can support both junior and senior developers in that critical stage of sensemaking. We need to work towards human-in-the-loop cognitive augmentation under extreme time compression. We need to bias away from tools that slow the Decide and Act steps by obscuring causality. Automation should be evaluated on whether it preserves agency, not just efficiency.
As Gene Kim and Steve Yegge point out in the excellent book Vibe Coding, this should be FUN. We don’t have to keep everything in our head. We have increased agency. We can do things we’ve never been able to do because of time constraints or unfamiliarity. Agentic coding and the tooling around it are in their infancy, so it’ll take some time to get where we want to be. And change is always hard. But at each step, with every iteration of the increasingly rapid change cycle, we can take a step back, look around, and ask “what’s needed in order to thrive in this environment?”
What do you think? What tools do we need to write next? How can we augment our Observe and Orient capacities so we can Decide and Act more quickly? Today’s unicorns are built on speed as much as on experience and novel innovation.