What Versioning Actually Means
Versioning is not a release number on a URL. It is a contract management system: which behaviors consumers can depend on and for how long.
A version is a snapshot of your API's contract: the shape of requests it accepts, responses it returns, error codes it emits, and side effects it produces. When you "version" an API, you are drawing a line and saying "everything on this side keeps working as documented, even after we build something new."
Not all changes need a new version. Adding an optional query parameter is non-breaking. Returning an additional response field is non-breaking if consumers follow the robustness principle (ignore what you don't understand). But removing a field, renaming a key, changing a data type, or altering a value's meaning? Those demand a versioning strategy.
Versioning exists for the consumer, not the producer. That rule shapes the rest of the design.
Where to Put the Version Identifier
No universally correct choice. URI wins on simplicity. Headers win on cleanliness. Most teams at scale use a hybrid: major versions in the URI path, minor or date-based versions in headers. For most B2B APIs I'd start with URI path and only move to headers if the path-based approach creates routing problems.
| Strategy | Example | Why teams choose it | Tradeoff |
|---|---|---|---|
| URI Path | /v1/users | Most widely adopted. Debuggable, cacheable, simple to route. | Implies the whole resource changed and clutters URLs. |
| Query Parameter | ?version=1 | Less intrusive and keeps paths clean. | Easy to forget and can play poorly with caches that strip query params. |
| Header | X-API-Version | URL stays clean and control is fine-grained. | Harder to spot during incident response because the version is invisible in plain URLs. |
| Content Negotiation | Accept header media type | Aligned with HTTP semantics. | Hardest to understand quickly and cumbersome with simple tooling. |
Date-Based Versioning: The Stripe Model
When a developer creates a Stripe account, it gets pinned to the API version active that day, say 2025-09-15. Months later, Stripe releases breaking changes. The developer's integration does not notice. Their account still gets the pinned behavior.
Stripe's engineers write code only against the latest version internally. A compatibility layer transforms responses backward through a chain of version-specific transformations to match whatever version the caller's account is pinned to. Stripe has maintained backward compatibility with every API version since 2011. Over a decade of versions, all simultaneously supported.
The cost: that transformation chain is engineering debt compounding with every release. Testing must cover every active version. We looked into this approach for a payments API with about 200 active integrations. After mapping out what the compatibility layer would look like for our schema (which was messier than Stripe's), we estimated it would take a dedicated engineer roughly 40% of their time just to maintain the transformation chain. We decided it wasn't worth it for our scale and went with expand-migrate-contract instead.
For smaller teams, the lesson still applies: pin consumers to a known-good state by default, let them upgrade on their own schedule. You can implement a lighter version with any versioning scheme by defaulting unversioned requests to the last stable version rather than the latest.
The Expand-Migrate-Contract Pattern
The practical method works at any scale. Martin Fowler popularized it as "parallel change." Build the new lane before you close the old one.
Expand. Add the new field or behavior alongside the existing one. Nothing removed. Both coexist. If changing user_name to username, return both fields simultaneously.
Migrate. Communicate the change, update documentation, give consumers a timeline. Instrument the old behavior. Measure how many consumers still use it. You can't close a lane until you know it's empty, and knowing means telemetry, not hope.
Contract. Once telemetry confirms no active consumers depend on the old behavior (or the deprecation deadline has passed), remove it.
This works for REST, GraphQL, gRPC, and anything with a schema. It's the backbone of non-breaking API evolution. Teams that skip the migrate step (expanding and immediately contracting) are the ones generating 3 a.m. pages. The migration window is the point of the pattern.
Expand
Introduce the new behavior beside the old one so consumers are not broken immediately.
Measure and migrate
Instrument usage, publish timelines, and move clients deliberately instead of guessing who still depends on the old path.
Contract
Remove only when telemetry says the lane is empty or the sunset deadline has truly arrived.
GraphQL's Approach
GraphQL says: one schema, it only gets wider. Clients request exactly the fields they need, so new fields are non-breaking. When a field needs retiring, GraphQL offers first-class deprecation as a schema concept: mark it deprecated with a reason, and tooling (IDE plugins, schema explorers) surfaces that information automatically.
The deeper advantage: per-field usage tracking. Because every query specifies its fields, you can see exactly which clients use which fields, down to the individual query. The migrate step of expand-migrate-contract becomes nearly trivial compared to REST, where you often can't tell if a client is actually reading a given response field.
The catch: schema sprawl over years of additive changes and deprecated-but-not-removed fields. We had a GraphQL schema that accumulated 340 fields over two years. About 90 of them were deprecated. Nobody wanted to remove them because we couldn't be sure external clients weren't still querying them, and our analytics only covered internal usage. When we finally instrumented external query logging, we found 60 of those 90 deprecated fields had zero queries in the preceding six months. We removed them in one batch and got exactly one complaint.
For incompatible structural changes, even GraphQL teams reach for something resembling versioning. The "versionless" label has limits.
Why GraphQL helps
Field-level requests and field-level deprecation make it much easier to understand exactly what clients still use.
Why GraphQL does not eliminate the problem
Structural schema changes still require migration discipline, even if additive evolution covers many routine changes.
Deprecation Done Right
Two HTTP headers formalize this. The Deprecation header (broadly adopted IETF draft) signals that an endpoint has been or will be deprecated. The Sunset header (RFC 8594) specifies the exact date it stops responding.
Together: machine-readable deprecation timeline. A consumer's HTTP client can detect these, log warnings, and trigger migration workflows without anyone reading a changelog.
Norms that have converged: six months notice minimum for standard endpoints, twelve for mission-critical ones. Include a Link header pointing to migration docs or the replacement endpoint. After sunset, return 410 Gone, not 404. A 410 tells the client "this was deliberately removed," which is easier to debug than a generic "not found."
The companies that do this well treat deprecation as a product launch in reverse. Stripe publishes detailed upgrade guides for every version. Twilio sends email campaigns months in advance. Zalando's open-source API guidelines specify deprecation as a required lifecycle phase.
I've seen this go wrong repeatedly. We once sent three emails, posted in the developer forum, and added banner warnings to the API dashboard. Six months later, 23% of consumers were still on the deprecated version. What finally worked was adding a Warning response header that their monitoring tools picked up automatically. Non-technical communication (emails, blog posts) supplements technical communication (headers, SDK warnings), but neither alone is sufficient.
The worst deprecation strategy is silence: removing an endpoint and hoping nobody notices.
Governance
Versioning is a governance problem, not a technical one. The tooling is easy. The hard part: deciding when a breaking change is worth the cost, who bears that cost, and how long the old version stays alive.
A documented versioning policy accessible to producers and consumers, answering: what constitutes a breaking change? How long will deprecated versions be supported? Who approves a new major version? What is the minimum migration window?
Separate the decision to deprecate from the act of deprecating. An API review board or architecture team decides a breaking change is necessary. A product and engineering team then executes the expand-migrate-contract cycle, with clear communication at each step. Collapsing these roles leads to either reckless deprecation (engineering moves fast and breaks things) or permanent stagnation (nobody wants to own the migration cost).
Invest in observability as a governance tool. Per-consumer call volume on deprecated endpoints, trend lines, active versus abandoned integrations. You cannot make informed deprecation decisions without data. Instrumentation transforms version lifecycle management from guesswork into evidence-based decision-making.
Treat backward compatibility as a design constraint, not a cleanup task. Use extensible field naming. Avoid enums that cannot grow. Favor optional fields over required ones. Think about extensibility during initial design, not during the first breaking-change crisis. The cheapest API evolution is the one planned into the schema early.
Every live version is a maintenance surface: security patches, infrastructure costs, monitoring, on-call burden. The compounding cost of "just keep it running" is how teams end up supporting eight versions with no budget for version nine.
Policy
Teams need an explicit versioning policy that defines breaking changes, support windows, and approval responsibilities.
Role separation
Deciding a breaking change is necessary and executing a safe migration are related but distinct jobs.
Telemetry
Per-consumer deprecated usage data turns deprecation from politics and guesswork into evidence-based execution.
What a Real Migration Looks Like
A health-tech company whose Patient Records API served 340 external integrations (hospital EHR systems, insurance claim processors, telehealth platforms, analytics dashboards) needed to move from v1 to v2. Four years of inconsistent naming, a flat resource model, and authentication patterns that no longer met HIPAA requirements. They knew 340 integrations, many built by third-party vendors with their own release cycles, couldn't all migrate simultaneously.
They used expand-migrate-contract. Phase one stood up v2 endpoints alongside v1, both backed by the same data layer. No v1 endpoints removed or altered. Every v1 response got Deprecation headers with a date twelve months out and Link headers pointing to the v2 migration guide.
Phase two segmented consumers by call volume. The top 20 integrations (89% of traffic) received direct outreach: dedicated Slack channels, co-development sessions, and sandbox environments pre-loaded with their production data shapes in v2 format. The remaining 320 got email campaigns, documentation updates, and monthly "migration health" reports showing their v1 usage alongside v2 equivalents.
They built a daily-updated dashboard tracking v1 call volume per consumer. This was their lane-closure telemetry.
After nine months, 94% of traffic had migrated. The remaining 6% came from 47 integrations, 31 of which turned out to be abandoned (no activity beyond automated health checks from forgotten cron jobs). They contacted the remaining 16 active consumers, extended their deadline by three months, and offered engineering support.
At month fourteen, they set v1 to return 410 Gone with the v2 equivalent URL in the response body. Two integrations broke. Both were fixed within hours because the 410 told them exactly what happened and where to go. The post-mortem identified consumer-segmented outreach and daily telemetry as the two decisions that made the project survivable.
Parallel launch
v2 came online beside v1 with no immediate contract breakage and with deprecation headers attached to legacy traffic.
Segmented migration
High-traffic integrations got direct engineering support while the long tail received structured reporting and repeated communication.
Telemetry-led cutover
Daily per-consumer v1 usage made retirement a controlled, evidence-based decision rather than a leap of faith.
Common Misconceptions
"Versioning means maintaining multiple codebases." Not necessarily. Stripe maintains one codebase with a compatibility transformation layer. Expand-migrate-contract avoids parallel codebases entirely. Multiple live versions don't require multiple code branches if your architecture separates contract compatibility from business logic.
"SemVer solves API versioning." SemVer communicates the nature of a change (major, minor, patch). It tells you nothing about how to run both versions simultaneously, migrate consumers, or set deprecation timelines. It is a label, not a plan.
"GraphQL does not need versioning." It reduces breaking change frequency. It does not eliminate it. Schema-level structural changes (altering nullability, splitting types, changing union membership) can still break clients.
"If we document the change, consumers will migrate." Documentation is necessary but nowhere near sufficient. Without headers, sunset deadlines, direct outreach, and telemetry showing who has not migrated, a significant fraction will never read the changelog.
"We can run the old version forever." Every live version is a maintenance surface: security patches, infrastructure, monitoring, on-call burden. Set deprecation timelines early and enforce them.
Labels are not plans
SemVer and good docs help, but they do not replace compatibility strategy, consumer migration, or retirement execution.
One codebase is still possible
Compatibility layers and expand-migrate-contract can preserve one evolving system without parallel code branches.
Forever support is not free
Every live version adds security, monitoring, infrastructure, and operational burden that compounds over time.
Takeaways
- Every versioning decision should be evaluated through the consumer's lens, not the producer's convenience.
- Additive changes are free; subtractive changes are expensive. Design your initial API for growth.
- Expand-migrate-contract is the universal pattern: add, move consumers over, remove. In that order.
- Deprecation is a communication problem, not a technical one. Pair technical signals with human outreach to high-traffic consumers.
- You cannot deprecate what you cannot measure. Instrument deprecated endpoints and make decisions on data, not hope.
- Default to stability: unversioned requests should resolve to the last known stable version, not the latest release. Surprising consumers with new behavior they didn't opt into is a breaking change in disguise.
- Every live version is a liability. Set deprecation timelines early, communicate them, and enforce them.


