Learn how to generate an OpenAPI spec from behavior notes using Claude Code, then compare it to your API implementation and create simple client and server validation examples.

An OpenAPI contract is a shared description of your API: what endpoints exist, what you send, what you get back, and what errors look like. It’s the agreement between the server and anything that calls it (a web app, mobile app, or another service).
The problem is drift. The running API changes, but the spec doesn’t. Or the spec gets “cleaned up” to look nicer than reality, while the implementation keeps returning odd fields, missing status codes, or inconsistent error shapes. Over time, people stop trusting the OpenAPI file, and it becomes just another doc everyone ignores.
Drift usually comes from normal pressures: a quick fix ships without updating the spec, a new optional field is added “temporarily,” pagination evolves, or teams update different “sources of truth” (backend code, a Postman collection, and an OpenAPI file).
Keeping it honest means the spec matches real behavior. If the API sometimes returns 409 for a conflict, that belongs in the contract. If a field is nullable, say so. If auth is required, don’t leave it vague.
A good workflow leaves you with:
- an OpenAPI file that describes intended behavior, not a flattering snapshot
- a record of where the running implementation differs and which side you chose
- small, repeatable client and server validation checks that enforce the contract
That last point matters because a contract only helps when it’s enforced. An honest spec plus repeatable checks turns “API documentation” into something teams can rely on.
If you start by reading code or copying routes, your OpenAPI will describe what exists today, including quirks you may not want to promise. Instead, describe what the API should do for a caller, then use the spec to verify the implementation matches.
Before writing YAML or JSON, collect a small set of facts per endpoint:
- what it’s for (one sentence)
- whether auth is required
- what the request looks like (query parameters, headers, body)
- what success and failure responses look like (status codes plus realistic JSON bodies)
- edge cases and data types in plain words
Then write behavior as examples. Examples force you to be specific and make it easier to draft a consistent contract.
For a Tasks API, a happy path example might be: “Create a task with title and get back id, title, status, and createdAt.” Add common failures: “Missing title returns 400 with {"error":"title is required"}” and “No auth returns 401.” If you already know edge cases, include them: whether duplicate titles are allowed, and what happens when a task ID doesn’t exist.
Capture rules as simple sentences that don’t depend on code details:
- “title is required and 1-120 characters.”
- “The default page size applies unless limit is set (max 200).”
- “dueDate is an ISO 8601 date-time.”

Finally, decide your v1 scope. If you’re unsure, keep v1 small and clear (create, read, list, update status). Save search, bulk updates, and complex filters for later so the contract stays believable.
Before you ask Claude Code to write a spec, write behavior notes in a small, repeatable format. The goal is to make it hard to accidentally “fill gaps” with guesses.
A good template is short enough that you’ll actually use it, but consistent enough that two people would describe the same endpoint similarly. Keep it focused on what the API does, not how it’s implemented.
Use one block per endpoint:
METHOD + PATH:
Purpose (1 sentence):
Auth:
Request:
- Query:
- Headers:
- Body example (JSON):
Responses:
- 200 OK example (JSON):
- 4xx example (status + JSON):
Edge cases:
Data types (human terms):
Write at least one concrete request and two responses. Include status codes and realistic JSON bodies with actual field names. If a field is optional, show one example where it’s missing.
Call out edge cases explicitly. These are the spots where specs quietly become untrue later because everyone assumed something different: empty results, invalid IDs (400 vs 404), duplicates (409 vs idempotent behavior), validation failures, and pagination limits.
Also note data types in plain words before you think about schemas: strings vs numbers, date-time formats, booleans, and enums (list allowed values). This prevents a “pretty” schema that doesn’t match real payloads.
Claude Code works best when you treat it like a careful scribe. Give it your behavior notes and strict rules for how the OpenAPI should be shaped. If you only say “write an OpenAPI spec,” you’ll usually get guesses, inconsistent naming, and missing error cases.
Paste your behavior notes first, then add a tight instruction block. A practical prompt looks like this:
You are generating an OpenAPI 3.1 YAML spec.
Source of truth: the behavior notes below. Do not invent endpoints or fields.
If anything is unclear, list it under ASSUMPTIONS and leave TODO markers in the spec.
Requirements:
- Include: info, servers (placeholder), tags, paths, components/schemas, components/securitySchemes.
- For each operation: operationId, tags, summary, description, parameters, requestBody (when needed), responses.
- Model errors consistently with a reusable Error schema and reference it in 4xx/5xx responses.
- Keep naming consistent: PascalCase schema names, lowerCamelCase fields, stable operationId pattern.
Behavior notes:
[PASTE YOUR NOTES HERE]
Output only the OpenAPI YAML, then a short ASSUMPTIONS list.
After you get the draft, scan the ASSUMPTIONS first. That’s where honesty is won or lost. Approve what’s correct, fix what’s wrong, and rerun with updated notes.
To keep naming consistent, state conventions up front and stick to them. For example: a stable operationId pattern, noun-only tag names, singular schema names, one shared Error schema, and one auth scheme name used everywhere.
If you work in a vibe-coding workspace like Koder.ai, it helps to save the YAML as a real file early and iterate in small diffs. You can see which changes came from approved behavior decisions versus details the model guessed.
Before you compare anything to production, make sure the OpenAPI file is internally consistent. This is the fastest place to catch wishful thinking and vague wording.
Read each endpoint like you’re the client developer. Focus on what a caller must send and what they can rely on receiving.
A practical review pass:
Error responses deserve extra care. Pick one shared shape and reuse it everywhere. Some teams keep it very simple ({ error: string }), others use an object ({ error: { code, message, details } }). Either can work, but don’t mix them across endpoints and examples. If you do, client code will accumulate special cases.
A quick sanity scenario helps. If POST /tasks requires title, the schema should mark it required, the failure response should show what error body you actually return, and the operation should clearly state whether auth is required.
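If you want this review pass to be repeatable, a small script can flag the most common gaps before a human reads the file. Below is a minimal sketch in Python; the openapi.yaml filename and the specific checks are illustrative, not a standard tool (a full linter such as Spectral does much more):

import yaml  # pip install pyyaml

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}

with open("openapi.yaml") as f:          # adjust the path to your spec file
    spec = yaml.safe_load(f)

problems = []
for path, item in spec.get("paths", {}).items():
    for method, op in item.items():
        if method not in HTTP_METHODS:
            continue
        where = f"{method.upper()} {path}"
        responses = op.get("responses", {})
        if "operationId" not in op:
            problems.append(f"{where}: missing operationId")
        if not any(str(code).startswith("2") for code in responses):
            problems.append(f"{where}: no success response documented")
        if not any(str(code).startswith("4") for code in responses):
            problems.append(f"{where}: no 4xx response documented")

print("\n".join(problems) or "No obvious gaps found")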
Once the spec reads like your intended behavior, treat the running API as the truth for what clients experience today. The goal isn’t to “win” between spec and code. It’s to surface differences early and make a clear decision on each one.
For a first pass, real request/response samples are usually the simplest option. Logs and automated tests also work if they’re reliable.
Watch for common mismatches: endpoints present in one place but not the other, field name or shape differences, status code differences (200 vs 201, 400 vs 422), undocumented behaviors (pagination, sorting, filtering), and auth differences (spec says public, code requires a token).
Example: your OpenAPI says POST /tasks returns 201 with {id,title}. You call the running API and get 200 plus {id,title,createdAt}. That’s not “close enough” if you generate client SDKs from the spec.
Before you edit anything, decide how you resolve conflicts:
- If the intended behavior is correct, change the implementation to match the spec.
- If clients already rely on today’s behavior, update the spec (and the behavior notes) to match reality.
Keep every change small and reviewable: one endpoint, one response, one schema tweak. It’s easier to review and easier to retest.
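For a single endpoint, that first pass can be as simple as sending one real request and comparing the answer to what the spec documents. A hedged sketch in Python: the openapi.yaml path, the http://localhost:3000 base URL, and the hard-coded Task fields are placeholders for your own setup (a fuller script would read the fields from components/schemas):

import requests  # pip install requests
import yaml      # pip install pyyaml

with open("openapi.yaml") as f:
    spec = yaml.safe_load(f)

# Status codes the spec documents for POST /tasks, e.g. {"201", "400"}
documented = {str(code) for code in spec["paths"]["/tasks"]["post"]["responses"]}

resp = requests.post("http://localhost:3000/tasks", json={"title": "Buy milk"})
if str(resp.status_code) not in documented:
    print(f"MISMATCH: live API returned {resp.status_code}, spec documents {sorted(documented)}")

# Fields the Task schema promises for this walkthrough
promised = {"id", "title", "done"}
body = resp.json()
print("Undocumented extra fields:", sorted(set(body) - promised))
print("Promised fields missing from the live response:", sorted(promised - set(body)))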
Once you have a spec you trust, turn it into small validation examples. This is what keeps drift from creeping back in.
On the server, validation means failing fast when a request doesn’t match the contract, and returning a clear error. That protects your data and makes bugs easier to spot.
A simple way to express server validation examples is to write them as cases with three parts: input, expected output, and expected error (an error code or message pattern, not exact text).
Example (contract says title is required and must be 1 to 120 chars):
{
"name": "Create task without title returns 400",
"request": {"method": "POST", "path": "/tasks", "body": {"title": ""}},
"expect": {"status": 400, "body": {"error": {"code": "VALIDATION_ERROR"}}}
}
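To make cases like this executable, a small runner can read them and replay each one against a running server. A sketch in Python, assuming the cases are collected into a JSON array in a file called validation-cases.json and the API runs on localhost:3000 (both names are assumptions):

import json
import requests  # pip install requests

BASE_URL = "http://localhost:3000"

with open("validation-cases.json") as f:
    cases = json.load(f)

for case in cases:
    req = case["request"]
    resp = requests.request(req["method"], BASE_URL + req["path"], json=req.get("body"))
    expect = case["expect"]

    ok = resp.status_code == expect["status"]
    # Compare only the error code, not exact message text, so wording changes don't fail the run.
    expected_error = expect["body"].get("error")
    if ok and isinstance(expected_error, dict) and "code" in expected_error:
        actual_error = resp.json().get("error")
        ok = isinstance(actual_error, dict) and actual_error.get("code") == expected_error["code"]

    status = "PASS" if ok else "FAIL"
    print(f"{status}: {case['name']} (got {resp.status_code}, expected {expect['status']})")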
On the client, validation is about detecting drift before users do. If the server starts returning a different shape, or a required field disappears, your tests should flag it.
Keep client checks focused on what you truly rely on, like “a task has id, title, status.” Avoid asserting every optional field or exact ordering. You want failures on breaking changes, not harmless additions.
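In practice that can be a handful of tests like the sketch below (Python with requests, runnable under pytest). The base URL, the 201 status, and the id/title/status fields come from the contract discussed here; swap in whatever your spec actually promises:

import requests  # pip install requests; run with pytest

BASE_URL = "http://localhost:3000"  # placeholder for your environment

def test_created_task_has_fields_we_rely_on():
    resp = requests.post(f"{BASE_URL}/tasks", json={"title": "Buy milk"})
    assert resp.status_code == 201          # the status code the contract promises

    task = resp.json()
    # Only the fields the client actually reads; renames or type changes fail here.
    assert isinstance(task["id"], str)
    assert isinstance(task["title"], str)
    assert isinstance(task["status"], str)
    # Deliberately no assertions on optional fields or key ordering,
    # so harmless additions don't break the build.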
A few guidelines that keep tests readable:
If you build with Koder.ai, you can generate and keep these example cases next to your OpenAPI file, then update them as part of the same review when behavior changes.
Imagine a small API with three endpoints: POST /tasks creates a task, GET /tasks lists tasks, and GET /tasks/{id} returns one task.
Start by writing a few concrete examples for one endpoint, as if you were explaining it to a tester.
For POST /tasks, intended behavior could be:
{ \"title\": \"Buy milk\" } and get 201 with a new task object, including an id, the title, and done:false.{} and get 400 with an error like { \"error\": \"title is required\" }.{ \"title\": \"x\" } (too short) and get 422 with { \"error\": \"title must be at least 3 characters\" }.When Claude Code drafts the OpenAPI, the snippet for this endpoint should capture the schema, status codes, and realistic examples:
paths:
  /tasks:
    post:
      summary: Create a task
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateTaskRequest'
            examples:
              ok:
                value: { "title": "Buy milk" }
      responses:
        '201':
          description: Created
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Task'
              examples:
                created:
                  value: { "id": "t_123", "title": "Buy milk", "done": false }
        '400':
          description: Bad Request
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
              examples:
                missingTitle:
                  value: { "error": "title is required" }
        '422':
          description: Unprocessable Entity
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
              examples:
                tooShort:
                  value: { "error": "title must be at least 3 characters" }
A common mismatch is subtle: the running API returns 200 instead of 201, or it returns { "taskId": 123 } instead of { "id": "t_123" }. That’s the kind of “almost the same” difference that breaks generated clients.
Fix it by choosing one truth source. If intended behavior is correct, change the implementation to return 201 and the agreed Task shape. If production behavior is already relied on, update the spec (and the behavior notes) to match reality, then add the missing validation and error responses so clients aren’t surprised.
A contract becomes dishonest when it stops describing rules and starts describing whatever your API returned on one good day. A simple test: could a new implementation pass this spec without copying today’s quirks?
One trap is overfitting. You capture one response and turn it into law. Example: your API currently returns dueDate: null for every task, so the spec says the field is always nullable. But the real rule might be “required when status is scheduled.” The contract should express the rule, not just the current dataset.
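A quick way to tell the rule and the snapshot apart is to write the rule down as a check. A sketch in Python for the hypothetical dueDate example above (the field names and the "scheduled" status come from this example, not from any particular API):

def due_date_errors(task: dict) -> list:
    # The rule: dueDate is required when status is "scheduled"; otherwise it may be null.
    if task.get("status") == "scheduled" and task.get("dueDate") is None:
        return ["dueDate is required when status is 'scheduled'"]
    return []

# A spec that only says "dueDate is always nullable" would never produce this failure,
# even though scheduled tasks without a due date are invalid under the real rule.
print(due_date_errors({"status": "scheduled", "dueDate": None}))  # -> one error
print(due_date_errors({"status": "open", "dueDate": None}))       # -> no errors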
Errors are where honesty often breaks. It’s tempting to spec only success responses because they look clean. But clients need the basics: 401 when the token is missing, 403 for forbidden access, 404 for unknown IDs, and a consistent validation error (400 or 422).
Other patterns that cause trouble:
- Inconsistent naming or types across endpoints (taskId in one route but id elsewhere, or priority as a string in one response and a number in another).
- Schemas loosened until they promise nothing (every field becomes a string, everything becomes optional).

A good contract is testable. If you can’t write a failing test from the spec, it’s not honest yet.
Before you hand an OpenAPI file to another team (or paste it into docs), do a fast pass for “can someone use this without reading your mind?”
Start with examples. A spec can be valid and still useless if every request and response is abstract. For each operation, include at least one realistic request example and one success response example. For errors, one example per common failure (auth, validation) is usually enough.
Then check consistency. If one endpoint returns { "error": "..." } and another returns { "message": "..." }, clients end up with branching logic everywhere. Pick a single error shape and reuse it, along with predictable status codes.
A short checklist:
A practical trick: pick one endpoint, pretend you’ve never seen the API, and answer “What do I send, what do I get back, and what breaks?” If the OpenAPI can’t answer that clearly, it’s not ready.
This workflow pays off when it runs regularly, not only during a release scramble. Pick a simple rule and stick to it: run it whenever an endpoint changes, and run it again before you publish an updated spec.
Keep ownership simple. The person who changes an endpoint updates the behavior notes and spec draft. A second person reviews the “spec vs implementation” diff like a code review. QA or support teammates often make great reviewers because they spot unclear responses and edge cases quickly.
Treat contract edits like code edits. If you’re using a chat-driven builder like Koder.ai, taking a snapshot before risky edits and using rollback when needed keeps iteration safe. Koder.ai also supports exporting source code, which makes it easier to keep the spec and implementation side by side in your repo.
A routine that usually works without slowing teams down:
Next action: pick one endpoint that already exists. Write 5-10 lines of behavior notes (inputs, outputs, error cases), generate a draft OpenAPI from those notes, validate it, then compare it to the running implementation. Fix one mismatch, retest, and repeat. After one endpoint, the habit tends to stick.
OpenAPI drift is when the API you actually run no longer matches the OpenAPI file people share. The spec might be missing new fields, status codes, or auth rules, or it might describe “ideal” behavior that the server doesn’t follow.
It matters because clients (apps, other services, generated SDKs, tests) make decisions based on the contract, not on what your server “usually” does.
Client breakages become random and hard to debug: a mobile app expects 201 but gets 200, an SDK can’t deserialize a response because a field was renamed, or error handling fails because error shapes differ.
Even when nothing crashes, teams lose trust and stop using the spec, which removes your early warning system.
Because code reflects current behavior, including accidental quirks you may not want to promise long-term.
A better default is: write intended behavior first (inputs, outputs, errors), then verify the implementation matches. That gives you a contract you can enforce, instead of a snapshot of today’s routes.
For each endpoint, capture:
- its purpose in one sentence
- whether auth is required
- the request (query parameters, headers, a body example)
- success and failure responses with status codes and realistic JSON bodies
- edge cases and data types in plain words
Pick one error body shape and reuse it everywhere.
A simple default is either:
{ "error": "message" }, or{ "error": { "code": "...", "message": "...", "details": ... } }Then make it consistent across endpoints and examples. Consistency matters more than sophistication because clients will hard-code this shape.
Give Claude Code your behavior notes and strict rules, and tell it not to invent fields. A practical instruction set:
- “Use only the behavior notes as the source of truth; do not invent endpoints or fields.”
- “If anything is unclear, leave a TODO in the spec and list it under ASSUMPTIONS.”
- “Model errors with reusable schemas (like Error) and reference them.”

After generation, review the ASSUMPTIONS first. That’s where drift starts if you accept guesses.
Validate the spec itself first:
- every operation lists the responses it can actually return, including errors
- required fields, nullability, and auth are stated explicitly
- success status codes match what you intend (200 vs 201)

This catches “wishful” OpenAPI files before you even look at production behavior.
Treat the running API as what users experience today, and decide mismatch-by-mismatch:
- If the intended behavior is right, fix the implementation.
- If clients already depend on the current behavior, update the spec (and the behavior notes) to match reality.
Keep changes small (one endpoint or one response at a time) so you can retest quickly.
Server-side validation should reject requests that violate the contract and return a clear, consistent error (status + error code/shape).
Client-side validation should detect breaking response changes early by asserting only what you truly rely on:
- the fields you actually read (for example, id, title, status)
- the status codes you branch on
- the error shape you handle
Avoid asserting every optional field so tests fail on real breakages, not harmless additions.
A practical routine is:
If you’re building in Koder.ai, you can keep the OpenAPI file alongside the code, use snapshots before risky edits, and rollback if a spec/code change gets messy.
If you can write a concrete request and two responses, you’re usually far enough to draft a truthful spec.