Learn how to write Claude Code style guide prompts that enforce naming, layering, error handling, and logging, and spot violations early with simple checks.

Style guide violations rarely show up as one big mistake. They start as tiny “close enough” choices that look harmless in a pull request, then pile up until the codebase feels uneven and harder to read.
Style drift often looks like this: one file uses userID, another uses userId, a third uses uid. One handler returns { error: "..." }, another throws, another logs and returns null. Each change is small, but together they create a repo where patterns stop being predictable.
Fast iteration and multiple contributors make it worse. People copy what they see, especially under time pressure. If the most recent code in the repo used a shortcut, that shortcut becomes the template for the next change. After a few weeks, the “default style” isn’t your written guide. It’s whatever happened last.
That’s why the goal has to be consistent conventions, not personal preferences. The question isn’t “Do I like this name?” It’s “Does this match the rules we agreed on so the next person can follow it without thinking?”
Catching violations early means stopping bad patterns before they become copy-paste fuel. Focus on new and edited code, fix the first appearance of a new inconsistency, and block merges that introduce new drift. When you do flag an issue, add a short preferred example people can mimic next time.
One realistic example: a developer adds a new API endpoint and logs raw request bodies “just for debugging.” If that lands, the next endpoint copies it, and soon sensitive data shows up in logs. Catching it in the first PR is cheap. Catching it after it spreads is painful and risky.
A style guide only works in reviews if it reads like a checklist, not a set of preferences. Rewrite each guideline as a rule that can be checked on a diff.
Organize rules into four buckets so they’re hard to miss: naming, layering, error handling, and logging. For each bucket, write two things: what must be true and what is forbidden.
Decide the strength of each rule up front: mark each one as mandatory (a violation blocks the merge) or optional (a suggestion the author can accept or decline).
Set scope so reviews don’t turn into endless refactors. A simple rule works well: “new and modified code must comply; existing untouched code isn’t rewritten unless it blocks the fix.” If you want cleanup, timebox it as a separate task.
Also define the output you want from a review so it’s easy to act on: a pass/fail verdict, a violations list with file and line references, suggested fixes written as concrete edits, and a brief risk note when something could cause bugs or leaks.
Example: if a PR logs raw user tokens, the review should fail under “logging: never log secrets” and suggest logging a request ID instead.
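A minimal before-and-after sketch of that fix in Go, assuming a structured logger such as the standard library's log/slog; the handler and header names are illustrative:

```go
package main

import (
	"log/slog"
	"net/http"
)

// loginHandler is a hypothetical endpoint used only to contrast the two patterns.
func loginHandler(w http.ResponseWriter, r *http.Request) {
	requestID := r.Header.Get("X-Request-ID")

	// FAIL under "logging: never log secrets": the raw token would end up in log storage.
	// slog.Info("login attempt", "token", r.Header.Get("Authorization"))

	// PASS: a request ID is enough to correlate the attempt without exposing the secret.
	slog.Info("login attempt", "request_id", requestID)

	w.WriteHeader(http.StatusNoContent)
}

func main() {
	http.HandleFunc("/login", loginHandler)
	_ = http.ListenAndServe(":8080", nil)
}
```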
Style prompts fail when they sound like preferences. A good review prompt reads like a contract: clear non-negotiables, clearly named exceptions, and a predictable output.
Start with two short blocks: what must be true and what can be bent. Then add a decision rule: “If unclear, mark as Needs Clarification. Do not guess.”
Force evidence. When the tool flags a violation, require it to quote the exact identifier and file path, not a vague description. That single constraint removes a lot of back-and-forth.
Keep the scope tight: comment on changed lines and directly impacted code paths only. If you allow unrelated refactors, style enforcement turns into “rewrite the file,” and people stop trusting the feedback.
Here’s a structure you can reuse:
Role: strict style guide reviewer.
Input: diff (or files changed) + style guide rules.
Non-negotiables: [list].
Allowed exceptions: [list].
Scope: ONLY comment on changed lines and directly impacted code paths. No unrelated refactors.
Evidence: Every finding MUST include (a) file path, (b) exact identifier(s), (c) short quote.
Output: structured compliance report with pass/fail per category + minimal fixes.
Require the report to keep the same sections every time, even if some are “No issues found”: Naming, Layering, Error handling, Logging.
If the report says “service layer leaking DB details,” it must cite something like internal/orders/service/order_service.go and the exact call (for example db.QueryContext) so you can fix the leak without debating what was meant.
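For illustration, a sketch of what that finding and its minimal fix could look like in Go; only the leaking call pattern comes from the example above, the types and method names are hypothetical:

```go
package service

import "context"

type Order struct {
	ID    string
	Total int64
}

// Before (the leak): the service held *sql.DB and called it directly, e.g.
//
//	rows, err := s.db.QueryContext(ctx, "SELECT id, total FROM orders WHERE id = $1", id)
//
// which puts SQL and driver types inside the business layer.

// Minimal fix: the service depends on a narrow repository interface,
// and the SQL moves behind it.
type OrderRepository interface {
	GetOrder(ctx context.Context, id string) (Order, error)
}

type OrderService struct {
	repo OrderRepository
}

func (s *OrderService) GetOrder(ctx context.Context, id string) (Order, error) {
	return s.repo.GetOrder(ctx, id)
}
```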
A style guide sticks when the process is repeatable. The goal is to make the model check rules, not argue taste, and to do it the same way every time.
Use a simple two-pass workflow: pass 1 asks for a report of violations only, with no code changes; pass 2 asks for the smallest edits that bring the change into compliance.
Example: a PR adds a new endpoint. Pass 1 flags that the handler talks to PostgreSQL directly (layering), uses mixed naming for request structs (naming), and logs full emails (logging). Pass 2 makes minimal fixes: move the DB call into a service or repository, rename the struct, and mask the email in logs. Nothing else changes.
Naming issues feel minor, but they create real cost: people misread intent, search gets harder, and “almost the same” names multiply.
State the naming rules the reviewer must enforce across the whole change: file names, exported types, functions, variables, constants, and tests. Be explicit about casing (camelCase, PascalCase, snake_case) and choose one acronym rule (for example APIClient vs ApiClient). Then require it everywhere.
Also standardize shared vocabulary: error types, log fields, and config keys. If logs use request_id, don’t allow reqId in one file and requestId in another.
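A small Go sketch of what those rules look like when applied consistently; the identifiers are made up for illustration:

```go
package billing

// One acronym rule, applied everywhere: APIClient, never ApiClient or apiClient.
type APIClient struct {
	BaseURL string
}

// Functions are verbs; booleans read as predicates (is/has/can).
func CreateInvoice(c *APIClient, customerID string) error { return nil }

func isOverdue(dueDateUnix, nowUnix int64) bool { return nowUnix > dueDateUnix }

// Shared vocabulary lives in one place so log fields and config keys never drift:
// request_id everywhere, never reqId in one file and requestId in another.
const (
	LogFieldRequestID = "request_id"
	LogFieldUserID    = "user_id"
)
```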
A reviewer instruction that stays practical:
Check every new or renamed identifier. Enforce casing + acronym rules.
Flag vague names (data, info, handler), near-duplicates (userId vs userID), and names that contradict behavior.
Prefer domain language: business terms over generic tech words.
Ask for a short report: the three most confusing names, any near-duplicates and which one to keep, plus any log/config/error names that don’t match the standard.
Layering rules work best in plain language: handlers talk HTTP, services hold business rules, repositories talk to the database.
Lock dependency direction. Handlers may call services. Services may call repositories. Repositories should not call services or handlers. Handlers should not import database code, SQL helpers, or ORM models. If you use shared packages (config, time, IDs), keep them free of app logic.
Pin cross-cutting work to a home. Validation usually belongs at the boundary for request shape and in the service for business rules. Authorization often starts in the handler (identity, scopes), but the service should enforce the final decision. Mapping belongs at layer edges: handler maps HTTP to domain input, repository maps DB rows to domain types.
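Here is a compact sketch of that dependency direction in Go, using hypothetical invoice types; the point is the arrows (handler -> service -> repository) and the mapping at each edge, not the specific names:

```go
package orders

import (
	"context"
	"encoding/json"
	"net/http"
)

// Repository talks to the database and returns domain types, never raw rows.
type InvoiceRepository interface {
	MarkSent(ctx context.Context, invoiceID string) error
}

// Service holds the business rule; it knows nothing about HTTP or SQL syntax.
type InvoiceService struct {
	repo InvoiceRepository
}

func (s *InvoiceService) SendInvoice(ctx context.Context, invoiceID string) error {
	return s.repo.MarkSent(ctx, invoiceID)
}

// Handler talks HTTP: it maps the request to a domain call and the result back.
type InvoiceHandler struct {
	svc *InvoiceService
}

func (h *InvoiceHandler) Send(w http.ResponseWriter, r *http.Request) {
	invoiceID := r.PathValue("id") // Go 1.22+ path parameter
	if err := h.svc.SendInvoice(r.Context(), invoiceID); err != nil {
		http.Error(w, "could not send invoice", http.StatusInternalServerError)
		return
	}
	_ = json.NewEncoder(w).Encode(map[string]string{"status": "sent"})
}
```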
Drop this into a prompt to keep reviews concrete:
Check layering: handler -> service -> repository only.
Report any leaks:
- DB types/queries in handlers or services
- HTTP request/response types inside services or repositories
- repository returning DB models instead of domain objects
- auth/validation mixed into repository
For each leak, propose the smallest fix: move function, add interface, or rename package.
Make the report explicit: name the file, the layer it belongs to, the import or call that breaks the rule, and the minimal change that prevents the pattern from spreading.
Most style debates get loud when something breaks in production. A clear error handling policy keeps fixes calm because everyone knows what “good” looks like.
Write down the philosophy and enforce it. For example: “Wrap errors to add context; create a new error only when changing meaning or mapping to a user message. Return raw errors only at the system boundary.” That one sentence prevents random patterns from spreading.
Separate user-facing text from internal detail. User messages should be short and safe. Internal errors can include the operation name and key identifiers, but not secrets.
In reviews, check for a few recurring failures: swallowed errors (logged but not returned), ambiguous returns (nil value with nil error after a failure), and user-facing messages that leak stack traces, query text, tokens, or PII. If you support retries or timeouts, require them to be explicit.
Example: a checkout call times out. The user sees “Payment service is taking too long.” Internally, wrap the timeout and include op=checkout.charge and the order ID so it’s searchable and actionable.
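A sketch of that split in Go, assuming the timeout surfaces as context.DeadlineExceeded; the gateway interface and messages are illustrative:

```go
package checkout

import (
	"context"
	"errors"
	"fmt"
	"net/http"
)

// PaymentGateway is a hypothetical dependency; only the wrapping pattern matters.
type PaymentGateway interface {
	Charge(ctx context.Context, orderID string) error
}

type Service struct {
	payments PaymentGateway
}

// Charge wraps failures with the operation name and key identifier, never secrets.
func (s *Service) Charge(ctx context.Context, orderID string) error {
	if err := s.payments.Charge(ctx, orderID); err != nil {
		return fmt.Errorf("op=checkout.charge order_id=%s: %w", orderID, err)
	}
	return nil
}

// The boundary maps internal errors to short, safe user-facing messages.
func writeChargeError(w http.ResponseWriter, err error) {
	if errors.Is(err, context.DeadlineExceeded) {
		http.Error(w, "Payment service is taking too long.", http.StatusGatewayTimeout)
		return
	}
	http.Error(w, "Payment failed. Please try again.", http.StatusBadGateway)
}
```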
Logs only help when everyone writes them the same way. If each developer picks their own wording, level, and fields, search turns into guesswork.
Make log levels non-negotiable: debug for development detail, info for normal milestones, warn for unexpected but handled situations, and error when the user-facing action fails or needs attention. Keep “fatal” or “panic” rare and tied to a clear crash policy.
Structured logs matter more than perfect sentences. Require stable key names so dashboards and alerts don’t break. Decide a small core set (for example event, component, action, status, duration_ms) and keep it consistent.
Treat sensitive data as a hard stop. Be explicit about what must never be logged: passwords, auth tokens, full credit cards, secrets, and raw personal data. Call out things that look harmless but aren’t, like password reset links, session IDs, and full request bodies.
Correlation IDs make debugging possible across layers. Require a request_id in every log line within a request. If you log user_id, define when it’s allowed and how to represent missing or anonymous users.
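Putting the level, stable keys, correlation ID, and masking rules together, a sketch in Go using log/slog; the field names beyond the core set are examples, not a required schema:

```go
package api

import (
	"log/slog"
	"time"
)

// maskEmail keeps a hint for debugging without logging the raw address.
func maskEmail(email string) string {
	if len(email) < 3 {
		return "***"
	}
	return email[:1] + "***"
}

// logInvoiceSent shows the core keys; info level because this is a normal milestone.
func logInvoiceSent(logger *slog.Logger, requestID, userEmail string, took time.Duration) {
	logger.Info("invoice sent",
		"event", "invoice.sent", // stable keys so dashboards and alerts keep working
		"status", "ok",
		"duration_ms", took.Milliseconds(),
		"request_id", requestID, // correlation across layers
		"user_email", maskEmail(userEmail), // never the raw personal data
	)
}
```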
A prompt block you can reuse:
Review the changes for logging conventions:
- Check level usage (debug/info/warn/error). Flag any level that does not match impact.
- Verify structured fields: require stable keys and avoid free-form context in the message.
- Confirm correlation identifiers: request_id on all request-bound logs; user_id only when allowed.
- Flag any sensitive data risk (tokens, secrets, personal data, request/response bodies).
- Identify noisy logs (in loops, per-item logs, repeated success messages) and missing context.
Return: (1) violations with file/line, (2) suggested rewrite examples, (3) what to add or remove.
Before merging, do a fast “safety pass” and check for: any new log missing request_id for request work; any new keys that rename existing ones (userId vs user_id); any error log missing what failed (operation, resource, status); any high-volume log that will fire on every request; and any chance of secrets or personal data appearing in fields or messages.
Treat style drift like a build break, not a suggestion. Add a strict gate that runs before merge and returns a clear pass or fail. If a mandatory rule is broken (naming, layer boundaries, logging safety, error handling), it fails and points to exact files and lines.
Keep the gate short. One practical trick is to require a YES/NO checklist per rule, and refuse approval if any item is NO.
A PR-sized checklist that catches most problems covers the four buckets: names match the guide, no layer-boundary leaks, errors are wrapped with context and never swallowed, and logs are structured, carry a request_id, and contain no secrets or personal data.
When the tool suggests fixes, require a small compliant snippet for each rule it touches. That prevents vague feedback like “rename for clarity.”
The fastest way for a style guide to fail is leaving room for interpretation. If two reviewers can read the same rule and reach different conclusions, the tool will enforce taste, not standards.
Naming is a common example. “Use clear names” isn’t testable. Tighten it into something you can verify: “functions are verbs (e.g., createInvoice), booleans start with is/has/can, exported types are PascalCase.”
Another trap is asking for everything at once. When one prompt tries to cover naming, layering, errors, logging, tests, and performance, the feedback gets shallow. Split reviews into focused passes when you need depth, or keep the gate prompt limited to mandatory rules.
The problems that most often cause enforcement to drift are the two above: rules that leave room for interpretation and prompts that try to check everything in one pass.
If you treat prompts like tests, you get predictable enforcement. If you treat them like advice, violations sneak in and multiply.
Run a fast pass over the diff (not the whole repo) and confirm the mandatory rules hold for naming, layering, error handling, and logging.
Keep a tiny prompt template and paste it with each change:
Review ONLY the changed code against our rules for naming, layering, errors, and logging.
List mandatory violations first (with file + line if available). Then list optional suggestions.
End with either: “no mandatory violations found” or “mandatory violations found”.
Example: a new function procUsr() in a handler that writes to PostgreSQL directly should fail naming and layering even if the feature works. Catching it here prevents copy-paste from spreading the mistake.
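Roughly what that failing pattern looks like; the package, query, and database handle are placeholders:

```go
package handlers

import (
	"database/sql"
	"net/http"
)

var db *sql.DB // the package-level handle the shortcut leans on

// procUsr fails naming (not a clear verb phrase from the guide) and layering
// (SQL inside a handler), even though the feature works.
func procUsr(w http.ResponseWriter, r *http.Request) {
	row := db.QueryRow("SELECT name FROM users WHERE id = $1", r.URL.Query().Get("id"))

	var name string
	if err := row.Scan(&name); err != nil {
		http.Error(w, "lookup failed", http.StatusInternalServerError)
		return
	}
	_, _ = w.Write([]byte(name))
}
```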
A teammate adds a new endpoint: POST /v1/invoices/{id}/send. It touches a handler, a service, and storage.
On the first pass, you want a report, not a rewrite:
Pass 1 (report only)
You are a strict style checker. Read the patch.
Rules: naming must match our guide, handlers call services only, services call storage only, no SQL in handlers,
errors must be wrapped with context, logs must be structured and not leak PII.
Output: a numbered list of violations with file:line, rule name, and one-sentence impact. Do not propose fixes.
If a rule might be intentionally broken, ask one clarification question.
Typical findings: SendInvoiceNow() vs SendInvoice naming mismatch, the handler calling db.QueryRow directly, returning raw err with no context, and noisy logs like log.Printf("sending invoice %v", invoice) that dump full objects.
Second pass asks for minimal, safe changes:
Pass 2 (minimal fix suggestions)
Using the violations list, propose the smallest code edits to comply.
Constraints: keep behavior the same, no refactors beyond what is needed, show suggested function names and where code should move.
For each fix, include 1-2 lines of example code.
If breaking a rule is allowed, say it up front: “Exceptions are permitted only if you add a short comment explaining why, and you add a follow-up task to remove the exception.”
After the fix, the handler becomes a thin adapter, the service owns the workflow, storage owns the query, errors become fmt.Errorf("send invoice: %w", err), and logs become one clean line with safe fields (invoice ID, not the full invoice).
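A sketch of the service after that fix, with hypothetical names; the error wrapping and the single structured log line are the parts the gate cares about:

```go
package invoices

import (
	"context"
	"fmt"
	"log/slog"
)

// InvoiceStore owns the query; the service never sees SQL.
type InvoiceStore interface {
	MarkSent(ctx context.Context, invoiceID string) error
}

type InvoiceService struct {
	store InvoiceStore
	log   *slog.Logger
}

// SendInvoice owns the workflow: wrap the error with context, log one clean line.
func (s *InvoiceService) SendInvoice(ctx context.Context, invoiceID string) error {
	if err := s.store.MarkSent(ctx, invoiceID); err != nil {
		return fmt.Errorf("send invoice: %w", err)
	}
	// Safe fields only: the invoice ID, not the full invoice object.
	s.log.InfoContext(ctx, "invoice sent", "invoice_id", invoiceID)
	return nil
}
```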
Pick one team-approved prompt and treat it like a shared tool. Start with what keeps hurting you in reviews (naming drift, leaking layers, inconsistent errors, unsafe logs). Update the prompt only when you see a real violation in real code.
Keep a small rules block at the top of the prompt and paste it into every review unchanged. If everyone edits the rules each time, you don’t have a standard. You have a debate.
A simple cadence helps: one person collects the top few style misses from the week, and you add exactly one clearer rule or one better example.
If you work in a chat-driven build flow like Koder.ai (koder.ai), it’s worth running the same gate checks during changes, not just at the end. Features like planning, snapshots, and rollback can help you keep style fixes small and reversible before you export source code.