The art of not shipping everything
Building is cheap
For many this is common knowledge, but I still believe it's hard to grasp what has happened over the last 1.5 to 2 years. Back then I was building features almost entirely by hand, because AI wasn't yet at a point where it could generate flawless code end to end.
This has changed.
Two years ago I wrote 95% of my code by hand. Cursor and GitHub Copilot were useful for auto-suggestions. One year ago, I was still doing 70% of the work manually. AI had gotten significantly better, but I needed to tweak almost every output. Now I barely do anything manually. I review code, but I often don't even have to touch the files themselves. AI has gotten incredibly good at writing code.
While the number of features I commit has increased, the number of lines of code I write myself has decreased. At the same time, the amount of code I review, not only my own but also code from non-technical folks, has increased drastically.
The pattern is clear: Many more people can contribute to the success of products. PMs, designers, marketing, and sales all, for the first time, have the tools to actually execute what they have in mind.
The more, the better?
Many of the companies I've seen measure their efficiency by lines of code changed, average number of PRs, or net new features shipped per week. Everyone is adding more, and more, and more.
This is a problem.
If everyone is just adding things, who is the one staying in control? Who is the one making sure the design aligns with other features and doesn't look and feel like slop? Who is even testing these new features and making sure users actually need them?
There are hundreds more questions like these, but I think the point is clear: software companies no longer need to worry about having enough employees to ship something. The cost of creating software is heading toward zero. Instead, the challenge will be: if resources are unlimited, how do we even know what to ship?
The hard part isn't building anymore
There's a strong need to automate reviews, but most of the tooling we have today only checks whether things work. That's the easy part. The testing I'd expect goes further:
- Test features against actual needs
- Open PRs in a preview environment
- Look at the changes and everything they affect
- Estimate how much a new feature is really needed compared with refactoring something that already exists
None of this is about writing more code. It's about deciding whether the code should exist at all.
Knowing what to ship
With more code, more reviews, and more tokens come more bugs. That is the wrong direction. Instead, agents should know a codebase well enough to iterate on what already exists, rather than creating new UX challenges that aren't necessary at all.
Shipping a lot of features is not the benchmark. Knowing what to ship is.