We're not the rhino anymore
The SDLC was built for humans. Agents need something else.
There’s a phenomenon in nature where a small bird sits on a rhino, or climbs into a crocodile’s mouth, and eats the dirt and parasites off it. Why is it there? Because a massive body moving through the world has problems it’s too big to solve on its own. Parasites, grime, blind spots. And the small creature? It’s exactly the right size to handle them.
But does the bird decide where the animal goes?
No. There’s symbiosis, but it’s not symmetrical. The rhino is not the bird’s tool. The bird doesn’t “manage” it either. It lives alongside it, benefits from it, and helps keep it healthy. Nobody looking at that picture gets confused about who’s carrying whom.
The more I read about agentic SDLC and dark factories, the more I feel this is the most accurate image for what’s happening right now to my profession.
Programming.
The Earth spins at 1,600 km/h. We don’t feel it, because large movements don’t always register as movement. Sometimes they just feel like the natural state of the world. And that’s exactly what’s confusing the AI debate right now. We assume it will remain our tool. That we’ll be the ones holding it, telling it what to do, pressing Enter every now and then. But what’s becoming clear to me is that our role is shifting. Less and less writing, more and more maintenance. We’ll stop being the ones who write the code, and even the ones who verify the code, and become the ones who plow the snow so the machine can get through.
In this essay, I’ll try to explain why the entire software profession is beginning to reorganize itself around what agents need in order to work — not humans.
Fair Disclosure
If you’re skeptical, that’s a healthy response. You use these tools. You know the slop. You know an agent can hallucinate an API that doesn’t exist, miss an obvious constraint, or solve the wrong problem at impressive speed. And from that experience it’s easy to reach the comforting conclusion: sure, it’s a powerful tool, but at the end of the day, we’re still the drivers.
And if I wanted to cheat, I’d skip the next data point, because it’s the most uncomfortable one for my argument:
In a controlled study by METR, experienced open-source developers completed tasks slower when using AI tools than when working without them. Not by a little. About 20% slower. And the real sting was that they themselves thought they were faster. In other words, not just a slowdown — a slowdown with the illusion of acceleration.
Case closed, apparently.
Except the experiment broke in another interesting place. Between 30% and 50% of participants who were assigned tasks without AI started avoiding them. They didn’t want to go back to working that way. The researchers had to change the experimental design because developers simply refused to measure themselves in the old world.
Put these two data points together:
With AI, many developers get slower.
Without AI, many developers already refuse to work.
What a bizarre paradox, right? What can you even conclude from that?
Sometimes when you encounter two facts that are both true yet seem to contradict each other, you need to check the categories. So what could explain both at the same time? It’s not that AI is good or bad. It’s something else — maybe the measurement itself is outdated?
Most people measure AI inside a development process built for a world where the human writes, the human tests, the human syncs, and the human carries the entire chain on their back. Inside that process, AI really can add noise, debugging overhead, mistrust, and slowdowns. But that doesn’t prove the tool is weak. It proves the system around it hasn’t caught up yet.
The debate isn’t “does the agent write good code.”
The debate is whether the SDLC we know still fits a world where agents write a growing share of the code.
Where Are You on the Scale
Dan Shapiro, CEO of Glowforge, described this year five levels that map the distance between daily AI use and what he calls the “dark factory.”
At the lower levels, 1–2, you’re still the driver. The tool completes lines, writes functions, maybe even a feature spanning several files — but you still read almost every line. You’re the gatekeeper. The filter. The bottleneck.
At the middle levels, 3–4, the relationship starts to flip. You’re no longer the one typing most of the work — you’re the one defining a task, approving a result, and deciding whether to continue or stop.
And at Level 5 — the dark factory — no human writes and no human reviews. A spec goes in, software comes out.
Shapiro estimates that most developers who think they’re at the frontier are actually at Level 2: using tools every day, but still treating them like an unreliable junior.
Anthropic themselves admit that in practice, engineers use AI in about 60% of their work, but can fully delegate a complete task only 0% to 20% of the time.
That number sounds like a win for the skeptics. But pay attention to what it actually says: not that the model isn’t capable — but that in the current work structure, there’s no room to delegate. No specs sharp enough. No automated verification mechanisms. No environment where an agent can run on its own. People use AI all day, but still inside a process built for them, not for it.
And that’s exactly why most of the debate is judged from a skewed angle. It reminds me of the parable about the fish that doesn’t know what water is, precisely because it lives in it. The developer living at Level 2 struggles to see that their experience isn’t the future — it’s their current aquarium. From there, Level 5 looks like science fiction. But in practice, the gap between them isn’t a leap to another world — it’s a gradual transition where the human is pushed, step by step, from the line of code to the maintenance layer.
Some Are Already Working Differently
So to understand where this is heading, you need to look not at the average user at Levels 1–4, but at those who are already working at 5 and have defined an entirely different workflow with AI.
OpenAI published in February a description that sounds almost absurd on its face: over five months, a small team built a real internal product with zero hand-written lines of code. Application, documentation, CI, tests, dev tools, observability — everything. About a million lines of code. Over 1,500 PRs. Estimated development time: one-tenth of what it would have taken to write by hand.
But the numbers are beside the point. The story is what broke along the way. It turns out that progress at the start was slower than expected — just like the study we discussed — not because the agent was weak, but because the environment was undefined. The agent lacked tools, abstractions, and internal structure to make progress.
And then they figured out what the work needed to look like.
They described it like this: “Humans steer. Agents execute.”
When something failed, the question wasn’t “how do we write this ourselves.” The question was: “what was missing from the system in hindsight for the agent to succeed on its own?” OpenAI gave it a name: harness engineering. Not “how to make AI try harder,” but what’s missing from the environment so the agent can succeed without us.
And the agent succeeded.
And once writing stopped being the bottleneck, it immediately moved to the next one: who reviews all this code? Agents produce dozens of PRs a day, and if a human has to sit and verify every change works, they become the constraint. So they connected the agents to the browser itself, gave them the ability to open the application, take screenshots, navigate, and debug on their own.
Is this still programming? It’s an entirely different way of working. And it produces an entirely different profession. And according to a Forbes piece, even within Cursor the realization emerged that models are approaching a point where developers no longer need to go over every line of output. That’s significant precisely because Cursor was built on the assumption that the human and the AI would sit together inside a code editor and work diff by diff.
StrongDM went even further.
Three people. Two rules:
Code is not written by humans.
Code is not reviewed by humans.
Instead of human review, they built external evaluation scenarios that the agent doesn’t see during development. Not tests living inside the codebase that AI can cheat or adapt to, but blind exams. They also built an entire universe of digital twins for their external services, so agents could develop, test, and break things without touching the real production environment.
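The "blind exam" idea can be sketched in a few lines. This is a minimal illustration under my own assumptions — the names and structure are hypothetical, not StrongDM's actual framework: held-out scenarios live outside the repo the agent works in, so the agent can never adapt its code to them during development.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Scenario:
    """One held-out check that probes the finished build from outside."""
    name: str
    check: Callable[[Callable[[str], str]], bool]

def run_blind_exam(candidate, scenarios):
    """Run scenarios the agent never saw against the finished build.

    Because these checks aren't part of the codebase, the agent cannot
    "cheat" by tailoring its implementation to them — they work like a
    final exam rather than an in-repo test suite.
    """
    results = {s.name: s.check(candidate) for s in scenarios}
    return all(results.values()), results

# Example: suppose the agent was asked to build a slugify function.
def agent_built_slugify(title: str) -> str:
    return "-".join(title.lower().split())

exam = [
    Scenario("lowercases", lambda f: f("Hello World") == "hello-world"),
    Scenario("collapses spaces", lambda f: f("a   b") == "a-b"),
]

ok, report = run_blind_exam(agent_built_slugify, exam)
print(ok, report)  # True {'lowercases': True, 'collapses spaces': True}
```

The essential property is organizational, not technical: the exam and the implementation are owned by different parties, so passing it actually means something.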
This framework is called AIDLC — an agentic version of the SDLC we all know. Not “how humans write, test, and deploy software,” but a complete development framework built to contain and manage non-deterministic systems that work on their own.
In fact, an entire industry has already been built around this logic. Quite a few companies, including many Israeli ones, were founded on the premise that if review stays human, it will immediately become the bottleneck of the entire production line. So they’re building agents that review agents: scanning diffs, running scenarios, hunting bugs, and trying to create an automated trust layer around code written by models.
But even this, in the end, is a smarter and faster version of the same old idea. You still write first, test after, and hope you built a strong enough suite to catch what needs catching.
There are already those trying to skip even that step.
Meet Logical Intelligence, who are building formal verification agents — systems that move from the logic of testing to the logic of proof. Systems that try to do to code what a mathematician does to a theorem: not guess that it’s correct, but prove it.
When an engineer writes regular code, they write tests that check specific cases. What happens if the input is a number instead of a string? What if the input is empty? But there are always cases you didn’t think of. Formal verification is something else entirely. Instead of checking examples, you produce a mathematical proof — like a proof in algebra — that a certain property always holds, for every possible input, in every execution path. Not “we tested a million cases and everything worked.” But “mathematically, it’s impossible for this to break.”
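To make the testing-versus-proof distinction concrete, here is a toy example in generic Lean 4 — my own illustration, not Logical Intelligence's system or notation. A test checks one input; a theorem certifies a property for every input:

```lean
-- A toy property: doubling any natural number yields an even result.
def double (n : Nat) : Nat := n + n

-- An example-based test checks a single case:
#eval double 3   -- 6

-- A proof certifies the property for *every* possible input:
theorem double_is_even (n : Nat) : double n % 2 = 0 := by
  unfold double
  omega
```

If the theorem compiles, there is no input anywhere that violates it — that is the sense in which "it's mathematically impossible for this to break."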
The problem? Until now this was manual work that takes months. Specialized engineers write proofs in formal languages, and it’s insanely expensive and slow. So formal verification was reserved only for the most critical systems — nuclear, aviation, finance.
Aleph, the agent by Logical Intelligence, makes this automatic. It takes code and generates those proofs without a human writing them. And in a pilot they’re already running, it goes a step further: writing new code that comes with a built-in mathematical proof that it cannot behave dangerously. The code is born with a stamp of “tested and certified.”
Which suggests that even the control layer — the one we assume will remain human — is already starting to shift from the person to the system.
Brownfield vs. Greenfield
That said, most of the software world doesn’t look like a fresh repo at OpenAI or some side project in vibe coding. It looks like a fifteen-year-old system with partial tests, configuration written under pressure, and knowledge that lives in the heads of three tired people who’ve been at the company for six years.
But this still isn’t a comforting argument. Because it describes the starting point, not the direction of travel. Every company with a legacy system eventually starts new projects. Every project that starts today is born into a world of agents and shaped around them. Like demographics — the older population may be larger right now, but the children are the ones who will determine what the world looks like. And as time passes, the greenfield standard becomes the industry standard.
For legacy companies, the dark factory won’t arrive tomorrow morning. But that’s not because brownfield is protected in principle. It’s because the transition will come from the side, not through the front gate. First they won’t replace the entire system; first they’ll use AI to map dependencies, generate documentation from undocumented code, write regression tests, and extract the tribal knowledge that lives in the heads of three tired people. And that is already exactly what companies like Anthropic are selling under the banner of code modernization.
Anthropic has already shown it’s possible to tackle even COBOL modernization — one of the oldest strongholds of the consulting industry. The market itself reacted with fear when IBM’s stock took a sharp hit after the announcement. Not because everything disappears overnight, but because suddenly something that was considered “ours alone” stopped feeling immune.
And the pressure is no longer just technical. It’s business. A Pega study found that 68% of decision-makers say legacy systems prevent them from adopting AI, and 57% say those systems are probably already hurting their ability to retain customers. In other words, brownfield is no longer just “hard to change.” It’s becoming a blocker that prevents the company from moving in time.
And when such a blocker becomes visible, capital starts pushing too. OpenAI has already enlisted McKinsey, BCG, Accenture, and Capgemini to embed AI co-workers inside real enterprise systems, and funds like Blackstone are already speaking openly about AI transformation as an engine for value creation in portfolio companies. So brownfield doesn’t block the dark factory. It only ensures that the penetration will be gradual, expensive, and more painful. But precisely because so much money sits in it, it’s also one of the strongest targets for automation pressure.
That’s why brownfield is an argument about pace. Not about direction.
The Keyboard Is No Longer at the Center
Michael Truell, CEO of Cursor, described the transition we’re experiencing now. He divided it into three periods: the first was autocomplete — you type, and the tool completes. The second, where many are today, is synchronous agents — you tell them what to do, but manage them step by step inside the code. The third period is cloud agents: models that work alone on their own virtual machines and return logs, video, and artifacts ready for review.
And it’s not just Cursor anymore. Anthropic acquired Vercept so agents could operate computers through vision. Claude Code gained remote control over environments. CoWork and Codex now receive scheduled tasks on sandboxes. In other words, the agent no longer just writes text in a code editor — it runs, tests, navigates, and operates. The old boundary between “writing code” and “making sure it works” has simply been erased.
And if you think this is just a distant vision, look at the adoption. In March 2025, most developers were still in the first period: for every person who ran an agent to write for them, there were 2.5 people who just pressed Tab for code completion. One year later, in February 2026, that ratio completely reversed. For every person pressing Tab, there are two people letting the agent write. That’s 15x growth in one year of people who simply stopped writing lines of code themselves.
And Cursor isn’t just talking about the third period — they’re building it. In their research on “self-driving codebases,” they first tried letting agents organize themselves without hierarchy. It failed immediately. The agents avoided taking responsibility, held locks, stepped on each other. Twenty agents dropped to the throughput of two or three.
What worked in the end? An architecture that looks like a human software team: a chief planner, sub-planners, and edge workers. Each with clear responsibility, each on their own copy of the code, with no need to coordinate directly. Cursor themselves wrote that it resembles the structure of their own software teams — just without the people.
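The shape of that hierarchy can be sketched in miniature. Everything here is hypothetical and vastly simplified relative to Cursor's actual research: a planner only splits work, sub-planners split again, and edge workers execute leaf tasks — each on its own private copy of the state, so none of them ever coordinate directly.

```python
from copy import deepcopy

def worker(task, state):
    # An edge worker executes exactly one leaf task on its private copy.
    state["done"].append(task)
    return state

def plan(tasks, state, fan_out=2):
    """A (sub-)planner: splits work and delegates; never edits code itself."""
    if len(tasks) <= 1:
        # Leaf level: hand each task to a worker with an isolated state copy,
        # so workers can't hold locks on or step over each other's changes.
        return [worker(t, deepcopy(state)) for t in tasks]
    chunk = max(1, len(tasks) // fan_out)
    results = []
    for i in range(0, len(tasks), chunk):
        results += plan(tasks[i:i + chunk], state, fan_out)
    return results

copies = plan(["fix auth", "add logging", "write docs"], {"done": []})
print([c["done"] for c in copies])
# [['fix auth'], ['add logging'], ['write docs']]
```

The point of the sketch is the isolation: responsibility flows down a tree, and no two workers ever touch the same copy, which is what lets throughput scale with the number of agents.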
And another interesting insight: when they demanded 100% correctness from every agent before every commit, the system froze. Agents started fixing things that weren’t theirs, colliding with each other, afraid to move. Much like stage fright — they avoided taking on big tasks. And precisely when they were allowed to make small mistakes and other agents were given the job of fixing them after the fact, everything started flowing.
So maybe the question “does the agent make mistakes?” as a metric for “can AI replace me?” is less important than we thought. Maybe the real question is “does the system around it know how to fix the mistake?”
According to Cursor’s project, it seems so.
The Rhino in the Room
There’s a rhino in the room and it’s time to talk about it.
Every software company is ultimately a machine built so that humans can write code together. (Emphasis on together.) Standups, sprints, retros, boards, code reviews, handoffs between teams, a manager who syncs, QA doing a pass, a product manager closing corners. None of these are laws of nature. They’re mechanisms built around human limitations: we forget, get confused, can’t hold everything in our heads, and can’t work on the same change at high speed without coordination.
In other words, the familiar SDLC is a human operating system.
And the moment the human is no longer writing most of the code, that operating system stops being infrastructure and becomes friction.
An organization like this resembles a city built for horses. Even if you bring in cars, the traffic won’t clear if the streets, traffic lights, and laws still assume carriages.
Run a simple thought experiment. If an agent can build a feature in two hours, why plan it in a two-week sprint? If a diff is generated automatically, tested against external scenarios, and comes back with logs, video, and evaluation mechanisms — why does a senior need to sit for an hour on every line? If you have a production line generating code continuously, why do you need a manager whose main job is to sync between people who are no longer doing most of the implementation?
At StrongDM you can see this almost clinically: no standups, no sprints, no board. There’s a spec. There’s a run. There’s a result. So is this Sisyphean coordination layer even still relevant?
But wait — does this necessarily mean our value disappears?
No. It means it moves.
This is Amdahl’s Law. When you speed up one part of a process, the parts that weren’t sped up immediately become the bottleneck. Writing code was for years the expensive part. Now that it’s getting cheaper, the bottleneck moves to clarity of thought, system understanding, and decision-making.
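The arithmetic of Amdahl's Law makes the shift tangible. The numbers below are illustrative assumptions of mine, not figures from any study:

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the work is sped up by factor s."""
    return 1.0 / ((1.0 - p) + p / s)

# Suppose writing code was 60% of the delivery cycle and agents make
# that part 10x faster. The whole cycle speeds up by only ~2.2x:
print(round(amdahl_speedup(0.6, 10), 2))   # 2.17
# Even an infinitely fast coder caps out at 2.5x, because the remaining
# 40% — specs, review, decisions — is now the bottleneck:
print(round(amdahl_speedup(0.6, 1e9), 2))  # 2.5
```

That ceiling is the whole argument in one number: once writing is nearly free, every further gain has to come from the parts that were never about typing.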
A tech lead who used to sit with the team and make sure the feature was progressing now needs to formulate a spec sharp enough for an agent to build it alone. And this repeats across every role, from product management to QA. The question is no longer “what do you do?” but “what do you define?”
Anthropic already describes a world where people outside engineering build solutions for themselves. A lawyer with no coding background built internal triage tools there. A legal team cut marketing review from two to three days down to 24 hours. That doesn’t mean everyone becomes an engineer. It means the old moat of “only those who can code can build” is eroding, and therefore what becomes more expensive is not the ability to touch a keyboard, but the ability to correctly define a problem.
That’s why companies like EPAM are already talking about how to build a development process that doesn’t assume humans are the main implementation layer.
And the moment the logic shifts from code, configuration, and human hands to specs and evaluation mechanisms, the organizational structure is forced to move too.
So How Much Are You Worth to the Company?
It’s true that as the cost of producing software drops, the total market for software grows. Organizations that were once too small for a development team suddenly become serviceable. But it doesn’t follow that you need the same number of people per unit of software. More software, fewer programmers per unit.
Most of you already know the next story. Jack Dorsey announced at the end of February layoffs of more than 4,000 people at Block, explicitly claiming that AI tools and small, flat teams are fundamentally changing how you build and run a company. Wall Street wasn’t scared. It was thrilled. The stock jumped more than 20% after the announcement. In other words, the market rewards this behavior.
In a world where the average SaaS company generates a few hundred thousand dollars in revenue per employee, Cursor is already operating at roughly $3.3 million in revenue per employee. Midjourney, with fewer than a hundred employees and half a billion dollars in revenue, reaches nearly $5 million. And more AI-native companies are pushing in the same direction: much more revenue, with far fewer people.
That’s not because there’s no work. It’s because the shape of work has changed.
Professions don’t die on the day there’s no longer a need for their output. They die on the day the market stops paying a premium for the skill that was at their center.
So if writing is getting cheap, those who mainly sell writing get hit first.
That’s Why Juniors Get Hit First
Once, the very ability to take a ticket and turn it into working code was an expensive skill. That’s what juniors sold: hands on keyboard, X lines a day, PRs per sprint.
Now the bottleneck has moved. What remains expensive is not writing — it’s judgment. And judgment is exactly what juniors haven’t had time to learn yet.
Anthropic uses the word “taste.” It’s a slightly annoying word, but it’s hard to find a better one. Not taste in the sense of style, but taste in the sense of knowing what’s right even when there’s no document telling you. Knowing what the user needs, what must not break, where the software will crash, which trade-off is dangerous, and which vague spec will lead an agent to an elegant catastrophe.
The professional ladder of software always worked like an apprenticeship model in organizational disguise. A junior enters, fixes small bugs, builds simple features, absorbs the codebase, gets reviews, becomes mid-level, becomes senior.
AI doesn’t break this model from the top. It breaks it from the bottom, eating away the lower rungs — exactly the place juniors were supposed to climb.
The data already accumulating in the market points to a sharp decline in junior positions: in the US, a drop of roughly 67% to 73% by various measures, and in the UK a sharp fall in graduate roles. This isn’t an anecdote about a hiring crisis. It’s the first sign that the profession is starting to hollow out from the inside.
And here the junior paradox enters again, through a different door.
Because if we understand that the important skill for a junior is judgment.
And judgment requires experience.
And experience is built through writing.
And writing is exactly what agents took from them.
So where will tomorrow’s senior come from?
Chicken and egg.
Anthropic themselves tested this in a controlled experiment: junior engineers who learned a new library with AI assistance scored 17% lower on comprehension tests than those who learned without. The biggest gap was in debugging — the skill that most demands deep understanding. And Amodei himself said explicitly that they’re seeing code skill erosion in line with usage patterns. In other words, not only are the lower rungs of the ladder disappearing — the tool that’s supposed to “help” may also be eroding some of the capabilities we haven’t yet built a replacement for.
Egyptian hieroglyphics didn’t disappear overnight. The writing system held for millennia, until the chain of instruction broke. Fewer and fewer people knew how to read, fewer and fewer people knew how to teach, and a circle closed. The same goes for software. Languages die when one generation stops teaching the next. The real danger isn’t that one day the need for code will vanish, but that the training pipeline that produced developers with deep intuition will break. And for years that intuition was built through writing, debugging, reviewing, and making mistakes. If AI takes more and more of those stations, fewer people will develop the judgment layer that used to grow from within them.
This doesn’t mean nobody will ever understand systems deeply again. It means that capability might stop being the foundation of a broad profession and become a narrower layer of expertise. As in other fields that were pushed upward, the knowledge doesn’t disappear — the mass demand for it erodes. Experts remain, but far fewer of them, and far fewer places where they grow into such.
But this doesn’t mean juniors are going to disappear. Some are actually thriving.
There’s a new generation that grew up with AI the way our generation grew up with smartphones. They’re not afraid of it, not trying to prove they know better than it, and not insisting on writing everything themselves. They think through it. They know how to break a problem into questions a model can answer, they recognize when it’s hallucinating, and they understand when to stop trusting it. While veteran seniors are still debating whether it’s even worth using AI, these juniors have already built three projects with it.
And so not all developers are going to disappear, but the bar for those who stay is rising faster than most people are willing to admit. Next year’s junior won’t be judged on how fast they write CRUD — the agent will do that. They’ll be judged on what we previously expected only from a more experienced developer: system understanding, product thinking, ability to formulate a spec, identify gaps, and exercise judgment over output they didn’t produce. So whoever is building their career on the assumption that one day they’ll get to write code with their fingers may find that day is no longer worth what it once was.
Will Software Engineering Disappear?
I don’t see this as a question of whether, but of how long. How long can we hold the position of coder, of reviewer, of AI operator — a year? Five? Our parents had thirty-year careers in the same profession. It’s hard to imagine thirty years of “supervisor of supervisor of supervisor of agents.”
Dario Amodei, perhaps the most careful person in the world when it comes to talking about radical technology, described it in a podcast like a tsunami that everyone can see, and most people are still explaining to themselves that it’s “just a trick of the light.” He’s not claiming software engineering vanishes entirely tomorrow. But he did say that coding disappears first, and the broader mission of software engineering will follow after.
Sam Altman also said that AI already does “probably above 50%” of code work in many companies, and that first every engineer will do much more, and then “at some point, yes, maybe we’ll need fewer software engineers.”
Andrej Karpathy, who coined the term “vibe coding,” says it’s already becoming outdated and that what’s happening now is more like “agentic engineering.”
Boris Cherny, who leads Claude Code, said with almost brutal simplicity in a recent interview that coding is “basically solved at large” and that the title “software engineer” will start to disappear.
Jerry Murdock, co-founder of Insight, talked about a “tsunami” of autonomous agents. Not as a productivity improvement tool, but actual autonomous agents. He claimed portfolio companies are already telling him that Cursor itself is “outdated” because they’ve moved to a fully agentic approach.
You don’t have to accept their timelines. But it’s hard to keep pretending that the people building the tools think this is just a personal productivity upgrade for developers.
And on That Pessimistic Note
Don Knuth, author of The Art of Computer Programming and Turing Award laureate, published a document at the end of February that opens with “Shock! Shock!” after Claude Opus 4.6 solved an open problem in combinatorics that he himself had worked on for weeks. He described what he saw as a dramatic leap in automated reasoning, and admitted it changed his mind about the capabilities of generative AI.
If Don Knuth is willing to change his mind, maybe we too can admit what’s already hard to deny: the activity around which our profession was built is beginning to separate from the person who performed it. Not everyone will disappear. Not everything will happen in the next two years. But the direction is no longer an open question.
First the line of code disappears. Then the review. Then the ladder that trained the next person. And only at the end does everyone agree to call it by name.
So go back for a moment to the image of the small bird sitting on the rhino. It’s still there. It’s still useful. It’s still part of the picture. But it’s not the creature carrying the weight, and not the creature deciding where to go. It’s there because a massive body moving through the world has small needs that it knows how to fill. And that, more and more, is the role that remains for us.
By Adir Duchan, Senior AI Engineer at Elementor


