Henry Lucco | Posts

Enterprise Productivity Increases With Agentic Development

March 23, 2026

I want to preface these observations with the following: I do not claim to be an expert in the practical applications of large language models or in using them to generate software. I am just a programmer, exploring and learning about their practical applications at the same rate as everyone else; much slower than the majority, if Twitter and Reddit are to be believed. That said, because of the experiments I have run and my early adoption of the various tools, I have been thrust into a position where the people I work with regard me as exactly such an expert, and I have attempted to adapt to that responsibility.

First, some relevant context about where I work. I am currently a senior software engineer at a very late stage (series F) software-as-a-service startup of about a thousand people. Of these thousand, about two hundred and fifty are engineers, managed by engineering managers, directors, and a vice president of engineering. Per directive from the CEO and the CPO (Chief Product Officer), it is the responsibility of the VP and the directors to harness the latent potential of these new generative tools and obtain the fabled "LLM Productivity Boost" they keep hearing about at conferences, from peers, and on social media.

LLM Productivity Boost

I define the "LLM Productivity Boost" as a hype cycle in which various teams claim to complete herculean tasks by organizing themselves around agentic development, then relying heavily on the agent to generate vast amounts of complex code incredibly quickly. Some examples include the Claude C Compiler, the Eight Person Bedrock Rewrite, and Stripe's Minions. I say claim because the Claude C Compiler cannot compile "hello world" and the eight person Bedrock rewrite has been debunked by numerous AWS employees. Stripe's Minions do appear effective, at least judging by the internal tools we have built to replicate their described workflow, but they are limited to small tasks.

To achieve these productivity gains, the VP and their directors are dutifully studying the tools and appointing individual contributors (deemed "experts in generative AI") to carry out the task of equipping the other engineers with the environment needed to facilitate fully agentic development. The nature of the tasks needed to accomplish this goal is anyone's guess, but the effort has at least started with a series of debates on how productivity itself should be measured.

Hypotheses of Agentic Development

There are a multitude of proposed paradigms circulating for how to efficiently use LLMs to build software, but the majority can be distilled into the following two hypotheses:

  1. Autonomous LLM agents will increase the rate at which I can ship features.
  2. The bottlenecks I previously faced in my engineering org can now be automated away.

Alone, each of these statements appears reasonable, especially given the overwhelming marketing campaigns run by the model providers (OpenAI, Anthropic, etc.) themselves. It is evident from marketing example A and example B that billions of bubble dollars are being allocated toward driving the adoption that will, in turn, increase the bubble's size.

When building a video game, teams employ the "player fantasy" to inform design decisions. The player fantasy refers to the "core emotional experience or power trip a player wants to feel when playing a game". Generative and agentic tools are marketed through the same mechanism, but instead of game mechanics, agentic tools promise the hypotheses above: I can ship faster, and my bottlenecks will be automated away. Worded differently: I am no longer dependent on people with skills I do not possess to accomplish my goals.

The reality is, predictably, much messier.

Let's take a new settings menu as an example. The current agentic tools are geared toward generating code, but a functioning enterprise organization has multiple stages before and after implementation that also take time. Product must write a spec for which settings will be available and what the knobs will change; design must create the flows for the UI; multiple system design reviews are conducted to ensure the implementation of the knobs will not break the rest of the application; a security review must take place to ensure customers can't change anything dangerous. Once all of this is finally complete, a single developer with an agentic tool might finish the implementation in a day, but two months have already gone by since the idea's conception.

This all takes place under the generous assumption that the org's procedures were properly followed.

Human nature is much more dynamic, and while some organizations have a strong culture and process, the majority do not. Rather than waiting for the design reviews, the developer implements without a plan. The UI is bolted on later once design is finished, and product decides the designs don't match their specification when everything is revealed at an unrecorded ad-hoc meeting. Engineers from another team on which the feature depends block various pull requests, then go on vacation. Nobody will remember the security team's existence until there is a P0 security incident months in the future.

The culmination of this dysfunctional process is a feature released months after its conception, even though the actual programming work (especially with the agentic tools) only took a few hours.

So then how do we actually speed things up?

It is undeniable that, properly prompted, a model can take days' worth of engineering work and spit out average-quality code that accomplishes the described intent perhaps eighty percent of the time. Additionally, the models shine within the realms of interpolation and repetition. By narrowing the scope and providing the model with a closed and complete system, teams can automate large amounts of work in ways that were previously impossible.
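To make "a closed and complete system" concrete, here is a minimal, entirely hypothetical sketch of the shape such automation can take: agent-generated code is executed in an isolated namespace and accepted only if it passes a fixed set of checks. The names (`run_candidate`, `CANDIDATE`) are my own inventions for illustration, not part of any real tool.

```python
def run_candidate(source: str, checks: list) -> bool:
    """Execute agent-generated source in an isolated namespace and
    accept it only if every check passes."""
    namespace = {}
    try:
        # The "closed system": the candidate can only define names
        # inside this dict, and is judged only by the checks below.
        exec(source, namespace)
    except Exception:
        return False
    return all(check(namespace) for check in checks)

# An agent's proposed implementation of a small, menial task:
CANDIDATE = """
def slugify(title):
    return "-".join(title.lower().split())
"""

# The acceptance gate is defined up front, independent of the agent.
checks = [
    lambda ns: ns["slugify"]("Hello World") == "hello-world",
    lambda ns: ns["slugify"]("  Agentic  Tools ") == "agentic-tools",
]

print(run_candidate(CANDIDATE, checks))  # prints: True
```

The point of the sketch is the boundary, not the slugifier: the agent's output never touches anything outside the namespace, and "done" is defined by checks the team wrote before the agent ran.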

LLMs and agentic tools are multipliers, not creators. Given a well-written codebase with minimal debt, most models can duplicate the patterns the code defines, and teams can and should automate small, menial tasks within these closed systems.

Due to their nature as multipliers, these models lift up the rock that covers most engineering orgs. If the org has inefficient processes, bottlenecks, or other issues that hinder productivity, these will not be magically fixed by the multiplying agent; they will be exponentially worse than they were before.

LLMs and agentic tools hold up a mirror to seemingly functional orgs and highlight imperfections that were previously invisible.

An organization that is purely blocked by implementation speed will feel the effects of agentic development instantly. For everyone else, things are going to get a lot slower before they get any better.

Further reading: Every layer of review makes you 10x slower