The Pyramid Isn't Dead, You're Just Not Using It Right

This is the opener for a series called “The Test Pyramid — Reimagined.” Each layer gets its own follow-up post. Links will land here as they go live.

1. Unit Tests – Your Code’s Loudest Critic

Every six months, somebody publishes another eulogy for the test pyramid. The trophy buried it in 2018. The honeycomb buried it in 2020. The diamond buried it again in 2022. By my count, the pyramid has been declared dead at least three times in five years, which is impressive for a shape.

I read the eulogies. I read the responses to the eulogies. I read the responses to those. And every time I close the tab, walk to a whiteboard, and draw another pyramid — because the pyramid never actually died. What died is a specific, simplified, three-layer drawing that a generation of teams used as a shape instead of a model, and which, fair enough, doesn’t survive contact with modern web architecture. The shape was always a teaching aid. The idea underneath it — push tests as low as you can get them, because lower is faster, cheaper, and more honest — has not aged a day.

Quick credit before I draw, because all of this stands on other people’s work. The pyramid as a concept comes from Mike Cohn’s 2009 book Succeeding with Agile: a fast, large base of unit tests, a smaller layer of service tests, a thin layer of UI tests at the top. The canonical modern reference is Ham Vocke’s The Practical Test Pyramid on Martin Fowler’s blog (2018). If you only have time for one full-length read on this topic, that’s the one. I’m not trying to replace it. I’m trying to add the one line that the diagram in that article, like most diagrams since, doesn’t draw.

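Here’s the version I put on whiteboards now. Four layers, with a line cutting through the middle. From the base up: unit tests, integration tests, the deployment line, API tests, and system tests at the apex.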

Let me walk you through it.

Layer 1 — Unit Tests (the base)

Mock your own code. One test, one thing.

A unit test that fails should tell you what broke, not just that something broke. If the failure message is “something in the checkout flow blew up,” congratulations — you wrote an integration test wearing a unit test’s clothes. The point of the unit layer is precision. The test should be small enough that the failure is the diagnosis.
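To make “the failure is the diagnosis” concrete, here is a minimal sketch in Python using only the standard library. CheckoutService and its tax calculator are hypothetical names I made up for illustration. The collaborator is our own code, so at this layer it gets mocked, and each test checks exactly one thing.

```python
import unittest
from unittest.mock import Mock


class CheckoutService:
    """Hypothetical class under test: totals a cart using a tax calculator."""

    def __init__(self, tax_calculator):
        self.tax_calculator = tax_calculator

    def total(self, subtotal_cents: int) -> int:
        if subtotal_cents < 0:
            raise ValueError("subtotal cannot be negative")
        return subtotal_cents + self.tax_calculator.tax_for(subtotal_cents)


class CheckoutServiceTotalTest(unittest.TestCase):
    def test_total_adds_tax_to_subtotal(self):
        # The tax calculator is our own code, so the unit layer mocks it.
        tax_calculator = Mock()
        tax_calculator.tax_for.return_value = 80

        total = CheckoutService(tax_calculator).total(1000)

        self.assertEqual(total, 1080)
        tax_calculator.tax_for.assert_called_once_with(1000)

    def test_total_rejects_negative_subtotal(self):
        # One test, one thing: only the guard clause is exercised here.
        with self.assertRaises(ValueError):
            CheckoutService(Mock()).total(-1)
```

Run it with python -m unittest. If the first test fails, the test name alone tells you the addition in total is wrong; nobody has to go digging through a checkout flow.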

The thing most teams miss about unit tests in 2026 is that “unit test” no longer means “a test of a pure function with no dependencies.” Modern UI testing libraries — React Testing Library, Vue Test Utils, Angular’s TestBed — let you exercise real component behavior at the unit layer. Render the component. Click the button. Assert the state. No browser. No backend. Milliseconds.

A huge chunk of what most teams currently push into Selenium or Playwright suites belongs at the unit layer instead. It stays in the browser because the muscle memory says “UI behavior = browser test,” and that muscle memory is a decade out of date.

There’s a bonus property here that nobody talks about: unit tests are a code-cleanliness smell test. If you can’t write a unit test for a method without mocking ten things, the method is doing too much. The test isn’t bad. The code is bad. The test is the canary. We’ll come back to this in the unit-tests deep dive.
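One way to act on that canary, sketched in Python: when a method needs a wall of mocks, pull the decision it makes into a pure function and test that directly. The Order type, the discount rule, and the WELCOME5 coupon are all invented for this example, not taken from any real codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Order:
    """Hypothetical order data, already loaded by whatever fetched it."""
    subtotal_cents: int
    is_first_order: bool
    coupon_code: Optional[str] = None


def discount_cents(order: Order) -> int:
    """Pure pricing rule extracted from a larger method: no collaborators to mock."""
    discount = 0
    if order.is_first_order:
        discount += order.subtotal_cents // 10  # 10% off a first order
    if order.coupon_code == "WELCOME5":
        discount += 500  # flat 5.00 off with the (made-up) coupon
    # A discount can never exceed what the customer owes.
    return min(discount, order.subtotal_cents)
```

The tests for this are plain assertions on inputs and outputs. The method that used to do everything shrinks to fetching the order, calling discount_cents, and saving the result, and that wiring is a job for the next layer up.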

Layer 2 — Integration Tests

Do not mock your own code. Do mock external dependencies.

This is the layer most teams under-invest in, usually because they’ve never agreed on what it is. An integration test, in my model, is one that runs your code — multiple modules of your code, talking to each other for real — but stubs out anything you don’t own. The third-party payment processor. The email provider. The analytics service. The flaky vendor SDK that pages you at 3 AM.

What you’re testing here is the system integrated with itself. Does the controller call the service the way the service expects? Does the data model round-trip cleanly? Does the new code path interact correctly with the older code path that nobody on the team remembers writing?
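Here is a rough sketch of that shape in Python, standard library only. Every class name is hypothetical, and the in-memory SQLite database is my assumption standing in for whatever local store you use. The service and repository are real and talk to each other for real. The only fake is the payment gateway, because that is the part we don’t own.

```python
import sqlite3
import unittest


class FakePaymentGateway:
    """Stand-in for a third-party processor we do not own."""

    def __init__(self):
        self.charges = []

    def charge(self, customer_id: str, amount_cents: int) -> str:
        self.charges.append((customer_id, amount_cents))
        return f"fake-charge-{len(self.charges)}"


class OrderRepository:
    """Our own persistence code, exercised for real against in-memory SQLite."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders ("
            "id INTEGER PRIMARY KEY, customer_id TEXT, "
            "amount_cents INTEGER, charge_id TEXT)"
        )

    def save(self, customer_id: str, amount_cents: int, charge_id: str) -> int:
        cursor = self.conn.execute(
            "INSERT INTO orders (customer_id, amount_cents, charge_id) "
            "VALUES (?, ?, ?)",
            (customer_id, amount_cents, charge_id),
        )
        self.conn.commit()
        return cursor.lastrowid

    def find(self, order_id: int):
        return self.conn.execute(
            "SELECT customer_id, amount_cents, charge_id FROM orders WHERE id = ?",
            (order_id,),
        ).fetchone()


class OrderService:
    """Our own business logic: charge first, then record the order."""

    def __init__(self, repository: OrderRepository, gateway):
        self.repository = repository
        self.gateway = gateway

    def place_order(self, customer_id: str, amount_cents: int) -> int:
        charge_id = self.gateway.charge(customer_id, amount_cents)
        return self.repository.save(customer_id, amount_cents, charge_id)


class PlaceOrderIntegrationTest(unittest.TestCase):
    def test_order_is_charged_and_round_trips_through_storage(self):
        gateway = FakePaymentGateway()
        repository = OrderRepository(sqlite3.connect(":memory:"))
        service = OrderService(repository, gateway)

        order_id = service.place_order("cust-42", 1999)

        # Real service, real repository, real SQL; only the vendor is fake.
        self.assertEqual(
            repository.find(order_id), ("cust-42", 1999, "fake-charge-1")
        )
        self.assertEqual(gateway.charges, [("cust-42", 1999)])
```

One test covers the service-to-repository call, the SQL, and the data round trip, and it still runs in milliseconds on a laptop.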

Integration tests give you maximum coverage for minimum test code. They are, dollar for dollar, the best-leveraged tests in your suite. They are also the layer that gets shrugged off most often, because the unit zealots think they’re too coarse and the e2e zealots think they’re too fake. They are exactly the right size, and we will spend a whole post on them.

The deployment line

Here’s the bit you don’t see on most modern test diagrams.

Everything above the line runs against a real, deployed environment. Everything below the line runs pre-merge, pre-deploy, on a developer’s laptop or a CI runner with no infrastructure to speak of. The line is not metaphorical. It is the literal moment your code stops being a thing in a pull request and starts being a thing on the internet.

This single distinction does most of the heavy lifting in the model. Push as much of your testing below the line as you possibly can. Below-the-line tests run on every pull request, gate every merge, and cost almost nothing. Above-the-line tests run after you’ve already shipped to an environment, run more slowly, run less reliably, and only run as often as you’ve built infrastructure to run them. Every test you can move below the line is a test that protects you earlier, faster, and cheaper.
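If you want the line to exist in code and not just on the whiteboard, one simple option is to tag above-the-line tests so they skip themselves unless a deployed target is configured. This is a sketch using Python’s built-in unittest. The DEPLOYED_BASE_URL variable name is my own convention, not a standard one; your post-deploy pipeline would set it.

```python
import os
import unittest

# Hypothetical variable that only the post-deploy pipeline sets.
DEPLOYED_BASE_URL = os.environ.get("DEPLOYED_BASE_URL")

# Marker for tests that live above the line: skipped when nothing is deployed.
above_the_line = unittest.skipUnless(
    DEPLOYED_BASE_URL,
    "needs a deployed environment (set DEPLOYED_BASE_URL)",
)


class BelowTheLine(unittest.TestCase):
    def test_runs_on_every_pull_request(self):
        # No infrastructure needed, so this gates every merge.
        self.assertEqual(sum([1, 2, 3]), 6)


@above_the_line
class AboveTheLine(unittest.TestCase):
    def test_runs_only_after_deploy(self):
        # Placeholder check; a real test would call the deployed environment.
        self.assertTrue(DEPLOYED_BASE_URL.startswith("http"))
```

On a pull request the variable is unset, so only the below-the-line class runs. After a deploy, the pipeline exports the URL and the same command picks up the rest. The marker also doubles as an inventory: count how many tests carry it, then ask of each one whether it truly needs a deployed environment.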

Most teams, if they’re honest, have inverted this. They have a thin layer of unit tests, almost no integration tests, and a hulking pile of post-deploy UI tests doing work that should have been caught at PR time. The pyramid isn’t dead in those orgs. It’s been flipped upside down and stood on its tip.

Layer 3 — API Tests

Cheap. Fast. Post-deploy.

API tests are the first thing above the line, and they’re useful — when they’re actually API tests. They shine for downstream-service edge cases the integration layer can’t reach, for contract validation against a deployed backend, and for situations where the backend is the genuine source of business logic and you need to prove it works in its real environment.

But here’s the dirty secret of API testing: most “API tests” in the wild are integration tests in costume. The team wrote them post-deploy because that’s where the QA team had tooling, not because the test actually requires a deployed environment. If you can mock the backend’s external dependencies, run the backend against a local DB, and assert the same behavior pre-merge — you should. That’s an integration test. Move it below the line.

Genuine API tests — the ones that need to live above the line — are the ones that exercise something only the deployed environment provides. Real auth tokens from your real identity provider. Real downstream services in your staging cluster. Real network paths that only exist after deployment. Those are worth keeping. Everything else is rent you don’t need to pay.
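For a sense of what a genuine one can look like, here is a small contract check in Python, standard library only. The endpoint path, the ORDER_CONTRACT fields, and the DEPLOYED_BASE_URL variable are all assumptions I invented for the sketch; substitute whatever your API actually exposes. The test only makes sense against a deployed backend, so it skips itself everywhere else.

```python
import json
import os
import unittest
from urllib.request import urlopen

# Assumed contract for a hypothetical order resource: field -> expected JSON type.
ORDER_CONTRACT = {"id": int, "status": str, "amount_cents": int}


def contract_violations(payload: dict, contract: dict) -> list:
    """Return readable problems; an empty list means the payload conforms."""
    problems = []
    for field, expected_type in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            actual = type(payload[field]).__name__
            problems.append(
                f"{field}: expected {expected_type.__name__}, got {actual}"
            )
    return problems


# Hypothetical variable that only the post-deploy pipeline sets.
BASE_URL = os.environ.get("DEPLOYED_BASE_URL")


@unittest.skipUnless(BASE_URL, "needs a deployed backend (set DEPLOYED_BASE_URL)")
class OrderContractApiTest(unittest.TestCase):
    def test_order_resource_matches_contract(self):
        # Hypothetical endpoint; the real path depends on your API.
        with urlopen(f"{BASE_URL}/api/orders/1", timeout=10) as response:
            payload = json.load(response)
        self.assertEqual(contract_violations(payload, ORDER_CONTRACT), [])
```

Notice that contract_violations is a pure function, so its own edge cases belong in unit tests below the line. The only part that has earned a post-deploy slot is the single call to the real backend.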

Layer 4 — System Tests (the apex)

The full deployed stack. Real auth. Real CSS. Real third-party JavaScript. Real CDN. Real DNS. Real everything.

System tests are expensive. They’re sometimes flaky. They’re slower than every other layer combined. They are also the only place certain bugs can be caught, because certain bugs do not exist until the code is deployed. The misconfigured ECS task definition. The CSP header that blocks your analytics in production but not staging. The third-party widget that renders fine in dev and breaks in prod because the prod CDN serves a different version. The auth flow that works locally because you’re using a stub but fails in real life because the identity provider redirects somewhere unexpected.

You need system tests. You need some system tests. The trick is the word some.

If your suite is mostly system tests, one of two things is true: either you have no pre-deploy tests at all (which is bad), or you have pre-deploy tests but you don’t trust them (which is almost as bad). Both are tells that the pyramid isn’t a pyramid in your org — it’s a column with a tiny base. We’ll spend a whole post on what system tests should and shouldn’t be carrying.

Who the pyramid is for

There’s a thing nobody says out loud about the testing-model wars, and I’m going to say it: every popular model in this conversation was written by a developer, for developers. The pyramid, the trophy, the honeycomb, the diamond — every one of them is aimed at an application developer writing tests for their own code. None of them really account for the fact that, in most orgs of any meaningful size, there is a separate testing function alongside the developers, with its own perspective, its own incentives, and its own scars.

Here’s what that creates in practice. A tester reads the pyramid. A tester reads the post saying “trust your devs’ unit tests, push as much down as possible.” A tester then watches a bug ship to production that the unit tests didn’t catch. Their next move is not, “the dev should have written a better unit test.” It’s, “clearly I can’t trust the unit tests, so I’d better cover this myself in Selenium.”

I have made this exact move. I have coached people into making this move. It is the most natural reaction in the world, and it is also the engine that builds the inverted pyramid — bloated above the line, hollow below, every dev-written test treated as suspicious, every gap papered over with one more UI test. The pyramid wasn’t built wrong in that org. It was defended into uselessness by a tester who didn’t trust the layer underneath them.

So I’ll say what most pyramid-defenders won’t: the four-layer, deployment-lined version is, on purpose, a model both audiences can see themselves in. Devs own most of what’s below the line — unit and integration tests live closest to their code, run in their PRs, gate their merges. Testers own most of what’s above the line — system tests and the cross-cutting stuff that only makes sense once the code is real. Both sides are expected to play across the line: SDETs reviewing unit tests and pushing back on bad ones, devs writing system tests when the feature warrants it.

The line isn’t a wall. It’s a collaboration seam. The model has to make both audiences feel like they belong in the picture — otherwise testers will keep solving their trust problem by rebuilding the entire suite at the highest, slowest, most expensive layer.

That’s the version I’m trying to build. Not a dev-only model. Not a tester-only model. A shared one.

More posts are coming in this series — one per layer, plus the deployment line itself. Each one will take a swing at a specific bit of received wisdom in our field, and I’d love your pushback in the comments.

Subscribe, bookmark, RSS — whatever your preferred mechanism is. The pyramid isn’t dead. We just owe it a better drawing.
