API Tests - The Rent You Don't Need to Pay

Part 5 of “The Test Pyramid — Reimagined.” Start with the opener if you missed it.

Most of what your team calls “API tests” are integration tests in a hat, paying rent above the deployment line that they don’t owe.

I don’t mean that as a roast. I mean it as a diagnosis, and the diagnosis explains why so many post-deploy API suites feel slow, brittle, and weirdly disconnected from the work the developers are doing twenty feet away. The tests live above the line because that’s where the team’s API testing tooling has always lived: a folder of Postman collections, a Karate project, a Rest Assured suite that some senior SDET set up in 2019 and that everyone has been adding to ever since. None of those tools forces the question “should this test be running pre-deploy instead?” So nobody asks it. The suite grows. The CI pipeline gets slower. The first time anybody notices is when somebody complains that the post-deploy run takes forty minutes.

The fix isn’t to delete the suite. The fix is to be ruthless about which tests have earned their slot above the line and which ones are sitting there out of habit.

This post is about how to tell the difference.

What I mean by “API test”

In the opener I gave you the one-line version: cheap, fast, post-deploy, and only worth keeping if the test needs something the deployed environment provides. Let me unpack that.

An API test, in my model:

Runs above the deployment line. It hits a deployed backend at a real URL: staging, preview, production canary, whichever environment you’ve stood up for the purpose. The thing under test is not your code in a CI container. It’s your code running in the same kind of environment where users (or the next environment downstream) will eventually hit it.
Talks to the service over HTTP (or gRPC, or your protocol of choice). No in-process calls. No spinning up the app in the test runner. The test is a client. The service is a server. The network is between them.
Has no UI. No browser. No DOM. Same rule as integration tests on this one. If your assertion runs in pixels, you’re at the system layer, not the API layer.
Exercises something the integration suite literally cannot. This is the rule that does most of the work, and the rule most “API tests” in the wild fail. If you can get the same coverage by mocking the backend’s externals and running pre-deploy, the test belongs at the integration layer. The API layer is for everything else.

That last bullet is the costume detector. Hold on to it.

The costume problem

Here’s the pattern I see in nearly every API suite I look at.

A developer needs to verify that the POST /orders endpoint persists an order, charges the payment provider, and returns a 201. The integration test that would cover this — Express plus Supertest plus a stubbed payment HTTP call, exactly the kind of test I showed in the integration post — doesn’t exist, because the team never built integration tests. So the developer files a ticket with the SDET team, the SDET writes a Karate scenario that hits the deployed staging API, the test passes, the team feels covered, and the PR ships.

The test that just got written:

Runs post-deploy, so the bug it would have caught now ships to staging before being caught.
Depends on the staging environment being up, which it sometimes isn’t.
Depends on the test data the team seeded six months ago, which sometimes isn’t there anymore.
Takes a couple of seconds against the network instead of milliseconds against an in-process app.
Costs the team a deploy cycle for every iteration when the test starts failing.

And it provides no coverage the integration test wouldn’t have provided. Same code paths, same assertions, same boundary stubbed at the same place — just on the wrong side of the line. The team got the same confidence for several multiples of the cost, and got it later.

That’s the costume. The test is wearing the API-test hat because that’s where the team’s API tooling sits, not because the test’s job requires a deployed environment.

The first move on any API suite audit is to walk down the test list and ask, for each one: if I had the integration layer, could I do this pre-deploy? When the answer is yes — which it will be, most of the time — the test gets demoted. The integration suite grows. The API suite shrinks. The CI pipeline catches more bugs earlier, and the post-deploy run gets faster every quarter, not slower.
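
For contrast, here’s a rough sketch of what the demoted version of that POST /orders check could look like. It swaps the Express-plus-Supertest setup for a plain function with injected dependencies, purely to keep the example self-contained. Every name in it (`createOrder`, `payments.charge`, `orders.save`, the 402 on a declined charge) is hypothetical and not taken from any real codebase. The assertions are the same as the staging scenario’s, but nothing here needs a deployed environment.

```javascript
// Hypothetical order logic with its externals injected, so a test can stub them.
async function createOrder({ payments, orders }, request) {
  const charge = await payments.charge({ sku: request.sku, quantity: request.quantity });
  if (!charge.ok) {
    // Assumed behavior for a declined charge; the real service may differ.
    return { statusCode: 402, body: { error: "payment_declined" } };
  }
  const saved = await orders.save({ ...request, status: "CONFIRMED" });
  return { statusCode: 201, body: saved };
}

// Stubs standing in for the payment provider and the database.
function stubDependencies() {
  const savedOrders = [];
  return {
    savedOrders,
    payments: { charge: async () => ({ ok: true }) },
    orders: {
      save: async (order) => {
        const record = { id: `order-${savedOrders.length + 1}`, ...order };
        savedOrders.push(record);
        return record;
      },
    },
  };
}

// The pre-deploy check: same three assertions, no network, no staging.
async function checkOrderIsPersistedAndConfirmed() {
  const deps = stubDependencies();
  const response = await createOrder(deps, { sku: "smoke-sku", quantity: 1 });
  return {
    created: response.statusCode === 201,
    confirmed: response.body.status === "CONFIRMED",
    persisted: deps.savedOrders.length === 1,
  };
}
```

In a real suite the three booleans would be assertions in your test runner of choice; they’re returned here only so the sketch stays runner-agnostic.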

What an API test should actually be doing

A test belongs at the API layer if and only if the thing it’s testing requires a deployed environment to exist. The list isn’t long, but it isn’t empty either, and the tests on it are genuinely useful.

Real auth against a real identity provider. Your integration tests stub the auth boundary, because what they’re testing isn’t the auth; it’s your business logic on the other side of the auth. But at some point, you need to know that an actual JWT from your actual identity provider, with your actual production scopes, actually authorizes a request to your actual deployed API. That’s an API test. It can’t run pre-deploy, because the thing it verifies is the deployed wiring itself: the issuer and audience the running service is configured to trust, the signing keys it fetches from the IdP, and whatever gateway sits in front of it. None of that exists until the code is deployed.
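
When a real-auth test fails, the first question is usually whether the IdP issued the wrong token or the API rejected a good one. A small claims pre-check can help separate the two. This is a minimal sketch assuming Node (for `Buffer`) and assuming your IdP puts scopes in a space-delimited `scope` claim; some providers use a different claim, so treat that part as a placeholder. It reads claims only. It does not verify the signature, which is the deployed API’s job and the thing the real test is there to prove.

```javascript
// Decode the payload segment of a JWT. This does NOT verify the signature;
// it only reads the claims so a failing test can report something useful.
function decodeJwtClaims(token) {
  const segments = token.split(".");
  if (segments.length !== 3) {
    throw new Error("not a three-part JWT");
  }
  return JSON.parse(Buffer.from(segments[1], "base64url").toString("utf8"));
}

// Returns a list of human-readable problems. An empty list means the token
// at least looks like what the test expects to be sending.
function claimProblems(claims, { issuer, audience, requiredScopes, nowSeconds }) {
  const problems = [];
  if (claims.iss !== issuer) {
    problems.push(`iss is ${claims.iss}, expected ${issuer}`);
  }
  // `aud` may be a single string or an array of strings.
  const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  if (!audiences.includes(audience)) {
    problems.push(`aud does not include ${audience}`);
  }
  if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) {
    problems.push("token is expired or has no exp");
  }
  // Assumption: scopes arrive as a space-delimited string in `scope`.
  const granted = typeof claims.scope === "string" ? claims.scope.split(" ") : [];
  for (const scope of requiredScopes) {
    if (!granted.includes(scope)) {
      problems.push(`missing scope ${scope}`);
    }
  }
  return problems;
}
```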

Your integration code against real downstreams — not the downstreams themselves. If Service A calls Service B calls Service C and C is broken, that’s C’s tests’ job to catch. You should not be testing C’s functionality from inside A’s suite — that’s how A’s API suite turns into an apology for missing tests in B and C. What’s legitimate here is the seam: are you calling the right URL, sending the right auth, parsing the response shape you think you’re parsing, handling the error codes the downstream actually returns? Integration tests stub those calls with the responses you believe the downstreams return; the downstreams change and your stubs don’t. An API test against the deployed downstream catches that drift — but the unit under test is your client code, not theirs. And most of that drift should already be caught pre-deploy by contract testing (more on that below), so this bucket should be a thin backstop, not the bulk of your suite.
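
To make “parsing the response shape you think you’re parsing” concrete, here’s a minimal sketch of a drift check. The inventory fields are invented for illustration. The idea is to list only the fields your client code actually reads, then compare them against what the deployed downstream really returns.

```javascript
// Fields our (hypothetical) client code reads from the downstream's response,
// mapped to the typeof result each one is expected to have.
const EXPECTED_INVENTORY_SHAPE = {
  sku: "string",
  available: "number",
  warehouse: "string",
};

// Compare a parsed response body against the expected shape and report drift.
// Extra fields in the body are ignored: the client does not read them.
function shapeDrift(body, expectedShape) {
  if (body === null || typeof body !== "object") {
    return ["body is not an object"];
  }
  const drift = [];
  for (const [field, expectedType] of Object.entries(expectedShape)) {
    if (!(field in body)) {
      drift.push(`${field}: missing`);
    } else if (typeof body[field] !== expectedType) {
      drift.push(`${field}: expected ${expectedType}, got ${typeof body[field]}`);
    }
  }
  return drift;
}
```

In the post-deploy test, `body` would be the parsed JSON from a real call to the downstream, and the assertion would be that `shapeDrift` returns an empty list.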

Behavior the integration stub can’t fake honestly. Separate from shape drift: some downstream behavior is hard to stub at all. Rate limits with real backoff timing. Eventual-consistency windows that depend on real replication lag. Retry and circuit-breaker logic that only fires under real latency. When the integration-layer stub can’t reproduce the condition, the test of your handling of that condition belongs against the real downstream — and that puts it above the line.
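
As an illustration of the kind of client logic this bucket covers, here’s a minimal backoff sketch. The function name, the 200 ms base, and the 10 second cap are all assumptions for the example. The arithmetic itself is easy to unit test. What a stub can’t tell you is whether the real downstream sends a `Retry-After` header at all, or how its limits behave under real timing.

```javascript
// Decide how long to wait before the next retry.
// `attempt` is 1 for the first retry. `retryAfterHeader` is the raw header
// value, or undefined when the downstream did not send one.
function nextDelayMs(attempt, retryAfterHeader, { baseMs = 200, capMs = 10000 } = {}) {
  // Retry-After may be a number of seconds or an HTTP date. This sketch only
  // honors the seconds form and falls back to backoff for anything else.
  if (typeof retryAfterHeader === "string" && /^\d+$/.test(retryAfterHeader.trim())) {
    return Math.min(Number(retryAfterHeader.trim()) * 1000, capMs);
  }
  // Plain exponential backoff without jitter: base, 2x base, 4x base, ...
  return Math.min(baseMs * 2 ** (attempt - 1), capMs);
}
```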

Smoke tests after deploy. A focused, fast set of tests that runs immediately after every deploy and answers exactly one question: did the deploy break anything obvious? These are deliberately shallow — half a dozen endpoints, basic auth, basic happy path, basic round-trip. They exist to catch the deploys where the artifact built but the config is wrong, the env var didn’t get injected, the new code can’t reach the database. They’re the cheapest possible post-deploy safety net.

Canary checks in production. A small subset of your smoke tests, running on a schedule against production. Not to catch every bug — that’s not what they’re for — but to catch the class of failures that only show up when real traffic hits real production: a DNS misconfiguration, an IAM role that got rotated, a downstream service that’s degraded but still answering. These are operational tests as much as functional ones. Tie them to your on-call alerting.
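
One way to wire a canary into alerting without paging on every blip is to require a run of consecutive failures first. That threshold is my assumption, not a rule from this series, and the function below is a hypothetical sketch rather than any alerting tool’s API.

```javascript
// Decide whether a scheduled canary should page someone.
// `results` is the run history, oldest first, as booleans (true = passed).
function shouldAlert(results, consecutiveFailuresToAlert = 3) {
  if (results.length < consecutiveFailuresToAlert) {
    return false; // not enough history yet to call it an outage
  }
  return results
    .slice(-consecutiveFailuresToAlert)
    .every((passed) => passed === false);
}
```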

That’s the list. If a test you’re looking at doesn’t fit one of those buckets, it’s almost certainly an integration test in a hat.
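
If it helps to make the audit mechanical, the buckets can be treated as tags. This is a hypothetical sketch: the tag names and the `needs` field are inventions for the example, and in practice the tagging is a judgment call made in review, not something a script can decide for you.

```javascript
// The buckets from the list above, as tags a test can declare.
const DEPLOYED_ONLY_NEEDS = new Set([
  "real-auth",
  "real-downstream-seam",
  "unstubbable-behavior",
  "post-deploy-smoke",
  "production-canary",
]);

// Split a suite into tests that earn a post-deploy slot and tests to demote.
// A test with no declared needs, or only needs outside the buckets, is demoted.
function auditSuite(tests) {
  const keep = [];
  const demote = [];
  for (const test of tests) {
    const needs = test.needs ?? [];
    const earnsSlot = needs.some((need) => DEPLOYED_ONLY_NEEDS.has(need));
    (earnsSlot ? keep : demote).push(test.name);
  }
  return { keep, demote };
}
```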

What a clean API test looks like

Same shape, two languages.

Java + Rest Assured — a post-deploy smoke test that hits the deployed /orders endpoint with a real bearer token from the configured identity provider, and asserts the deploy is wired up end to end:

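import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.equalTo;

import io.restassured.http.ContentType;
import java.util.Map;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;

// IdentityProvider is assumed to be the project's own test helper (not shown here).
class OrdersApiSmokeTest {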

    private static final String BASE_URL =
        System.getenv("API_BASE_URL"); // e.g. https://api.staging.example.com

    private static String bearerToken;

    @BeforeAll
    static void fetchToken() {
        bearerToken = IdentityProvider.tokenFor(
            System.getenv("SMOKE_TEST_USER"),
            System.getenv("SMOKE_TEST_PASSWORD")
        );
    }

    @Test
    void placesAnOrderAgainstTheDeployedService() {
        String orderId =
            given()
                .baseUri(BASE_URL)
                .header("Authorization", "Bearer " + bearerToken)
                .contentType(ContentType.JSON)
                .body(Map.of("sku", "smoke-sku", "quantity", 1))
            .when()
                .post("/orders")
            .then()
                .statusCode(201)
                .body("status", equalTo("CONFIRMED"))
                .extract().path("id");

        given()
            .baseUri(BASE_URL)
            .header("Authorization", "Bearer " + bearerToken)
        .when()
            .get("/orders/" + orderId)
        .then()
            .statusCode(200)
            .body("status", equalTo("CONFIRMED"));
    }
}

The test is short on purpose. It’s not validating business logic — the integration suite already did that pre-merge. It’s validating that the deployed artifact, with the real config, behind the real load balancer, talking to the real downstreams, holding the real auth tokens, actually works. If the deploy broke the env var injection, this test fails. If the IAM role can’t reach the DB, this test fails. If the new auth middleware rejects valid tokens, this test fails. None of those failures could have been caught pre-deploy.

JavaScript + Playwright’s APIRequestContext. Same idea, different toolchain. The request context that Playwright’s `request.newContext()` returns is a perfectly good HTTP client, and it lets you reuse one tool across your API and system layers:

import { test, expect, request } from "@playwright/test";
import { tokenFor } from "../helpers/identity-provider";

const BASE_URL = process.env.API_BASE_URL;

test("places an order against the deployed service", async () => {
  const token = await tokenFor(
    process.env.SMOKE_TEST_USER,
    process.env.SMOKE_TEST_PASSWORD
  );

  const api = await request.newContext({
    baseURL: BASE_URL,
    extraHTTPHeaders: { Authorization: `Bearer ${token}` },
  });

  const create = await api.post("/orders", {
    data: { sku: "smoke-sku", quantity: 1 },
  });
  expect(create.status()).toBe(201);
  const { id, status } = await create.json();
  expect(status).toBe("CONFIRMED");

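  // Named `fetched` to avoid shadowing the global fetch().
  const fetched = await api.get(`/orders/${id}`);
  expect(fetched.status()).toBe(200);
  expect((await fetched.json()).status).toBe("CONFIRMED");

  // Release the request context created above.
  await api.dispose();
});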

Same job. Same surface area. Different stack. The thing that makes either of these an API test is not the tool — it’s the fact that the token, the URL, the load balancer, the DB connection, and the deploy are all real, and the test would not pass against code that hasn’t been deployed.

A note on contract testing

There’s a whole adjacent discipline worth gesturing at: contract testing. Pact, Spring Cloud Contract, and a small forest of other tools all try to solve the same problem — how do we make sure the consumer and producer of an API agree on the shape of the requests and responses, without running a full end-to-end test for every interaction?

The short version: someone writes a contract describing the request/response shape. Both sides verify against it in their own CI. Neither side has to stand up the other’s full stack.
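
Stripped of tooling, the idea looks roughly like this. It’s a hand-rolled toy and not how Pact or Spring Cloud Contract represent contracts. The contract format, `satisfiesContract`, and the handler are all made up for illustration. What it shows is that both sides check themselves against the same artifact without ever calling each other.

```javascript
// A hypothetical contract for one interaction, written as plain data.
const getOrderContract = {
  request: { method: "GET", path: "/orders/42" },
  response: { status: 200, bodyFields: { id: "string", status: "string" } },
};

// True when a response has the contract's status and every contracted field
// with the contracted type. Extra fields are allowed.
function satisfiesContract(contract, response) {
  if (response.status !== contract.response.status) {
    return false;
  }
  return Object.entries(contract.response.bodyFields).every(
    ([field, type]) => typeof response.body[field] === type
  );
}

// Consumer side: the stub the consumer's tests run against must match the contract.
const consumerStub = { status: 200, body: { id: "42", status: "CONFIRMED" } };

// Producer side: the producer's real handler (a stand-in here) must match it too.
function producerHandler(request) {
  const id = request.path.split("/").pop();
  return { status: 200, body: { id, status: "CONFIRMED", total: 1999 } };
}

// Each check runs in that side's own CI; neither needs the other deployed.
const consumerOk = satisfiesContract(getOrderContract, consumerStub);
const producerOk = satisfiesContract(getOrderContract, producerHandler(getOrderContract.request));
```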

Worth being precise about where this runs, because the natural assumption — that anything verifying behavior between two deployed services must live above the line — is wrong. Contract testing is almost entirely a pre-deploy discipline.

Pact is consumer-driven. The consumer’s CI generates a pact and publishes it to a broker. The producer’s CI verifies, against its own code in CI, that it still satisfies every published pact. Before either side deploys, it calls can-i-deploy against the broker, which checks the verification matrix and exits non-zero if the version you’re about to ship would break a partner. That’s an explicit pre-deploy gate, run from your own pipeline against a broker you control. The deploy itself is recorded after it succeeds (via record-deployment), which is how the broker knows what’s actually in each environment.
Spring Cloud Contract, in its typical setup, is producer-driven and broker-less. Contracts live in the producer’s repo; the producer’s build generates JUnit tests from them and runs them against its own code. If those pass, it publishes a stubs JAR (WireMock mappings) to Maven/Artifactory. Consumers depend on a pinned version of that stubs JAR and run their tests against it locally. There’s no can-i-deploy equivalent; the gating is structural: a producer can’t publish a stubs JAR that fails its own generated tests, and a consumer whose code no longer matches the pinned JAR’s stubs fails in CI before it can deploy.

Both workflows live below the deployment line. The only thing that crosses the line is the bookkeeping — Pact recording a deployment, or a consumer bumping its pinned stubs version — and that’s not a test.
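
To see why the bookkeeping matters, here’s a toy model of the gate. It is emphatically not the Pact Broker’s implementation or API. The matrix rows, app names, and version labels are invented, and the real thing is a CLI call against your broker. The sketch only shows the shape of the decision: a version may ship if it has a successful verification against whatever its partners currently have recorded in that environment.

```javascript
// Toy verification matrix: one row per (consumer version, provider version) pair.
const matrix = [
  { consumer: "web", consumerVersion: "w1", provider: "orders", providerVersion: "o1", success: true },
  { consumer: "web", consumerVersion: "w1", provider: "orders", providerVersion: "o2", success: false },
  { consumer: "web", consumerVersion: "w2", provider: "orders", providerVersion: "o1", success: true },
  { consumer: "web", consumerVersion: "w2", provider: "orders", providerVersion: "o2", success: true },
];

// What is currently recorded as deployed in each environment.
const deployed = { staging: { web: "w1", orders: "o1" } };

function recordDeployment(app, version, environment) {
  deployed[environment] = { ...(deployed[environment] ?? {}), [app]: version };
}

function canIDeploy(app, version, environment) {
  const current = deployed[environment] ?? {};
  // Partners are the apps that appear opposite `app` in any matrix row.
  const partners = new Set();
  for (const row of matrix) {
    if (row.consumer === app) partners.add(row.provider);
    if (row.provider === app) partners.add(row.consumer);
  }
  for (const partner of partners) {
    const partnerVersion = current[partner];
    if (partnerVersion === undefined) continue; // partner is not in this environment
    const verified = matrix.some(
      (row) =>
        row.success &&
        ((row.consumer === app && row.consumerVersion === version &&
          row.provider === partner && row.providerVersion === partnerVersion) ||
          (row.provider === app && row.providerVersion === version &&
            row.consumer === partner && row.consumerVersion === partnerVersion))
    );
    if (!verified) return false; // missing or failed verification blocks the deploy
  }
  return true;
}
```

With this data, `orders` o2 is blocked while `web` w1 is in staging, `web` w2 is allowed, and once that deployment is recorded `orders` o2 becomes deployable. Nothing in that sequence touched a deployed service.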

So contract testing doesn’t really belong in the API layer of this model. It belongs below the line, with the rest of your pre-deploy work. I’m mentioning it here because the problem it solves, “my service still honors the agreement my partners depend on,” is the same problem the second bucket above (your client code against real downstreams) is the backstop for. Contract tests catch shape drift cheaply, before you ship; the post-deploy version is a small safety net for anything the contracts didn’t model (real auth, real data, real network, real environment).

If you’re operating a service with multiple consumers and you’re not doing contract testing, you have a category of bugs your suite is structurally blind to. Worth fixing — but fix it below the deployment line, not above.

How big should the API layer actually be?

Smaller than you think, larger than nothing.

The post-deploy API suite for a healthy service should look something like:

A handful of smoke tests per service — one happy-path call per major endpoint, run immediately after every deploy.
A small, focused set of real-auth and real-downstream tests for the boundaries your integration layer couldn’t honestly stub.
A canary subset running against production on a schedule, tied to your alerting.

That’s typically tens of tests per service, not hundreds. If your API suite is larger than that and you can’t account for every test as fitting one of those buckets, the suite has accumulated costumes and it’s time for a closet purge.

The healthiest API suites I’ve worked with treat new test additions adversarially: prove this test can’t be done at the integration layer. Most proposed tests can be, and get demoted in PR review. The ones that survive really do require the deployed environment, and they pay their rent.

What’s next

One post left in the series: UI/system tests. The apex. The layer you should hate needing and will need anyway. The place where flake lives, where real production-only bugs get caught, and where the inverted pyramid is most likely to have set up camp.

Subscribe, RSS, bookmark — whatever your preferred mechanism is. One floor to go.
