Go + Postgres performance tuning playbook for AI-generated APIs: pool connections, read query plans, index smartly, paginate safely, and shape JSON fast.

AI-generated APIs can feel fast in early testing. You hit an endpoint a few times, the dataset is small, and requests come in one at a time. Then real traffic shows up: mixed endpoints, bursty load, colder caches, and more rows than you expected. The same code can start to feel randomly slow even though nothing actually broke.
Slow usually shows up in a few ways: latency spikes (most requests are fine, some take 5x to 50x longer), timeouts (a small percent fail), or CPU running hot (Postgres CPU from query work, or Go CPU from JSON, goroutines, logging, and retries).
A common scenario is a list endpoint with a flexible search filter that returns a large JSON response. In a test database, it scans a few thousand rows and finishes quickly. In production, it scans a few million rows, sorts them, and only then applies a LIMIT. The API still "works", but p95 latency explodes and a few requests time out during bursts.
To separate database slowness from app slowness, keep the mental model simple.
If the database is slow, your Go handler spends most of its time waiting for the query. You may also see many requests stuck "in flight" while Go CPU looks normal.
If the app is slow, the query finishes quickly, but time is lost after the query: building large response objects, marshaling JSON, running extra queries per row, or doing too much work per request. Go CPU rises, memory rises, and latency grows with response size.
"Good enough" performance before launch is not perfection. For many CRUD endpoints, aim for stable p95 latency (not just the average), predictable behavior under bursts, and no timeouts at your expected peak. The goal is straightforward: no surprise slow requests when data and traffic grow, and clear signals when something drifts.
Before you tune anything, decide what "good" means for your API. Without a baseline, it's easy to spend hours changing settings and still not know whether you improved things or just moved the bottleneck.
Three numbers usually tell you most of the story: average (or median) latency, p95 latency, and error rate.
p95 is the "bad day" metric. If p95 is high but average is fine, a small set of requests is doing too much work, getting blocked on locks, or triggering slow plans.
Make slow queries visible early. In Postgres, enable slow query logging with a low threshold for pre-launch testing (for example, 100-200 ms), and log the full statement so you can copy it into a SQL client. Keep this temporary. Logging every slow query in production gets noisy fast.
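A minimal way to turn that on for a test database looks like this (ALTER SYSTEM needs superuser rights; managed Postgres services usually expose the same setting in their console):
-- Log any statement slower than 150 ms, including the full SQL text.
ALTER SYSTEM SET log_min_duration_statement = '150ms';
SELECT pg_reload_conf();
-- Later, raise the threshold or set it to -1 to disable the logging again.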
Next, test with real-looking requests, not just a single "hello world" route. A small set is enough if it matches what users will actually do: a list call with filters and sorting, a detail page with a couple of joins, a create or update with validation, and a search-style query with partial matches.
If you're generating endpoints from a spec (for example, with a vibe-coding tool like Koder.ai), run the same handful of requests repeatedly with consistent inputs. That makes changes like indexes, pagination tweaks, and query rewrites easier to measure.
Finally, pick a target you can say out loud. Example: "Most requests stay under 200 ms p95 at 50 concurrent users, and errors stay under 0.5%." The exact numbers depend on your product, but a clear target prevents endless tinkering.
A connection pool keeps a limited number of open database connections and reuses them. Without a pool, each request may open a new connection, and Postgres wastes time and memory managing sessions instead of running queries.
The goal is to keep Postgres busy doing useful work, not context-switching between too many connections. This is often the first meaningful win, especially for AI-generated APIs that can quietly turn into chatty endpoints.
In Go, you usually tune max open connections, max idle connections, and connection lifetime. A safe starting point for many small APIs is a small multiple of your CPU cores (often 5 to 20 total connections), with a similar number kept idle, and recycling connections periodically (for example, every 30 to 60 minutes).
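With database/sql, that tuning is a few explicit calls. A minimal sketch, with illustrative numbers rather than recommendations for your workload:
import (
    "database/sql"
    "time"
    _ "github.com/jackc/pgx/v5/stdlib" // pgx driver behind database/sql; swap in your own driver
)
// configurePool applies conservative pool limits for one API instance.
func configurePool(dsn string) (*sql.DB, error) {
    db, err := sql.Open("pgx", dsn)
    if err != nil {
        return nil, err
    }
    db.SetMaxOpenConns(20)                  // hard cap on connections from this instance
    db.SetMaxIdleConns(20)                  // keep warm connections between bursts
    db.SetConnMaxLifetime(45 * time.Minute) // recycle connections periodically
    db.SetConnMaxIdleTime(5 * time.Minute)  // drop connections that sit idle too long
    return db, nil
}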
If you're running multiple API instances, remember the pool multiplies. A pool of 20 connections across 10 instances is 200 connections hitting Postgres, which is how teams unexpectedly run into connection limits.
Pool problems feel different from slow SQL.
If the pool is too small, requests wait before they even reach Postgres. Latency spikes, but database CPU and query times may look fine.
If the pool is too big, Postgres looks overloaded: lots of active sessions, memory pressure, and uneven latency across endpoints.
A fast way to separate the two is to time your DB calls in two parts: time spent waiting for a connection vs time spent executing the query. If most time is "waiting", the pool is the bottleneck. If most time is "in query", focus on SQL and indexes.
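A minimal sketch of that split using pgxpool (the query and the fetchOrders name are placeholders):
import (
    "context"
    "log"
    "time"
    "github.com/jackc/pgx/v5/pgxpool"
)
// fetchOrders times connection acquisition separately from query execution,
// so logs show whether the pool or the SQL is the bottleneck.
func fetchOrders(ctx context.Context, pool *pgxpool.Pool, userID int64) error {
    start := time.Now()
    conn, err := pool.Acquire(ctx) // time spent here is pool wait
    if err != nil {
        return err
    }
    defer conn.Release()
    waited := time.Since(start)
    queryStart := time.Now()
    rows, err := conn.Query(ctx, "SELECT id, status FROM orders WHERE user_id = $1 LIMIT 50", userID)
    if err != nil {
        return err
    }
    defer rows.Close()
    for rows.Next() {
        // scan into your structs here
    }
    log.Printf("pool wait=%s query+scan=%s", waited, time.Since(queryStart))
    return rows.Err()
}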
Useful quick checks: compare total connections across all API instances against Postgres's max_connections, and watch how often requests wait to acquire a connection from the pool.
If you use pgxpool, you get a Postgres-first pool with clear stats and good defaults for Postgres behavior. If you use database/sql, you get a standard interface that works across databases, but you need to be explicit about pool settings and driver behavior.
A practical rule: if you're all-in on Postgres and want direct control, pgxpool is often simpler. If you rely on libraries that expect database/sql, stick with it, set the pool explicitly, and measure waits.
Example: an endpoint that lists orders might run in 20 ms, but under 100 concurrent users it jumps to 2 s. If logs show 1.9 s waiting for a connection, query tuning won't help until the pool and total Postgres connections are sized correctly.
When an endpoint feels slow, check what Postgres is actually doing. A quick read of EXPLAIN often points to the fix in minutes.
Run this on the exact SQL your API sends:
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, status, created_at
FROM orders
WHERE user_id = $1 AND status = $2
ORDER BY created_at DESC
LIMIT 50;
A few lines matter most. Look at the top node (what Postgres chose) and the totals at the bottom (how long it took). Then compare estimated vs actual rows. Big gaps usually mean the planner guessed wrong.
If you see Index Scan or Index Only Scan, Postgres is using an index, which is usually good. Bitmap Heap Scan can be fine for medium-sized matches. Seq Scan means it read the whole table, which is only OK when the table is small or nearly every row matches.
Common red flags: a Seq Scan on a large table when you expected an index, a big Sort node feeding your ORDER BY, and estimated row counts that are far from the actual ones.
Slow plans usually come from a handful of patterns: no index matching the WHERE + ORDER BY pattern (for example, (user_id, status, created_at)), or functions wrapped around columns in the WHERE (for example, WHERE lower(email) = $1), which can force scans unless you add a matching expression index.
If the plan looks odd and estimates are way off, stats are often stale. Run ANALYZE (or let autovacuum catch up) so Postgres learns current row counts and value distributions. This matters after large imports or when new endpoints start writing a lot of data quickly.
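Assuming the orders and users tables from the examples above, the fixes might look like this (index names are hypothetical; CONCURRENTLY avoids blocking writes but cannot run inside a transaction):
-- Refresh planner statistics after a large import or heavy writes.
ANALYZE orders;
-- Expression index so WHERE lower(email) = $1 can use an index instead of scanning.
CREATE INDEX CONCURRENTLY IF NOT EXISTS users_lower_email_idx
  ON users (lower(email));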
Indexes help only when they match how your API queries data. If you build them from guesses, you get slower writes, larger storage, and little to no speedup.
A practical way to think about it: an index is a shortcut for a specific question. If your API asks a different question, Postgres ignores the shortcut.
If an endpoint filters by account_id and sorts by created_at DESC, a single composite index often beats two separate indexes. It helps Postgres find the right rows and return them in the right order with less work.
A few rules of thumb usually hold up: match the index to the endpoint's real WHERE + ORDER BY, put equality filters first and the sort column last, prefer one composite index over several single-column indexes, and don't add an index you can't tie to a query you've measured.
Example: if your API has GET /orders?status=paid and it always shows newest first, an index like (status, created_at DESC) is a good fit. If most queries also filter by customer, (customer_id, status, created_at) can be better, but only if that's how the endpoint actually runs in production.
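As a sketch, that first index could be created like this (the index name is hypothetical):
-- Matches WHERE status = $1 ORDER BY created_at DESC LIMIT n without an extra sort.
CREATE INDEX CONCURRENTLY IF NOT EXISTS orders_status_created_idx
  ON orders (status, created_at DESC);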
If most traffic hits a narrow slice of rows, a partial index can be cheaper and faster. For example, if your app mostly reads active records, indexing only WHERE active = true keeps the index smaller and more likely to stay in memory.
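A sketch of that partial index, assuming a hypothetical records table; the index's WHERE clause must match the filter the query actually uses:
CREATE INDEX CONCURRENTLY IF NOT EXISTS records_active_created_idx
  ON records (created_at DESC)
  WHERE active = true;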
To confirm an index helps, do two quick checks: run EXPLAIN (or EXPLAIN ANALYZE in a safe environment) and look for an index scan that matches your query, then re-run the same requests and compare latency before and after.
Remove unused indexes carefully. Check usage stats (for example, whether an index has been scanned). Drop one at a time during low-risk windows, and keep a rollback plan. Unused indexes aren't harmless. They slow inserts and updates on every write.
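One way to see those usage stats before dropping anything (idx_scan counts how often each index has been scanned since statistics were last reset):
SELECT relname AS table_name,
       indexrelname AS index_name,
       idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;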
Pagination is often where a fast API starts feeling slow, even when the database is healthy. Treat pagination as a query design problem, not a UI detail.
LIMIT/OFFSET looks simple, but deeper pages usually cost more. Postgres still has to walk past (and often sort) the rows you're skipping. Page 1 might touch a few dozen rows. Page 500 might force the database to scan and discard tens of thousands just to return 20 results.
It can also create unstable results when rows are inserted or deleted between requests. Users may see duplicates or miss items because the meaning of "row 10,000" changes as the table changes.
Keyset pagination asks a different question: "Give me the next 20 rows after the last row I saw." That keeps the database working on a small, consistent slice.
A simple version uses an increasing id:
SELECT id, created_at, title
FROM posts
WHERE id > $1
ORDER BY id
LIMIT 20;
Your API returns a next_cursor equal to the last id in the page. The next request uses that value as $1.
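A minimal sketch of that flow in Go with pgxpool (the Post struct and listPosts name are placeholders):
import (
    "context"
    "time"
    "github.com/jackc/pgx/v5/pgxpool"
)
type Post struct {
    ID        int64
    CreatedAt time.Time
    Title     string
}
// listPosts returns one page plus the cursor for the next request; afterID is 0 on the first page.
func listPosts(ctx context.Context, pool *pgxpool.Pool, afterID int64) ([]Post, int64, error) {
    rows, err := pool.Query(ctx,
        "SELECT id, created_at, title FROM posts WHERE id > $1 ORDER BY id LIMIT 20", afterID)
    if err != nil {
        return nil, 0, err
    }
    defer rows.Close()
    var posts []Post
    var nextCursor int64
    for rows.Next() {
        var p Post
        if err := rows.Scan(&p.ID, &p.CreatedAt, &p.Title); err != nil {
            return nil, 0, err
        }
        posts = append(posts, p)
        nextCursor = p.ID // the last id in the page becomes next_cursor
    }
    return posts, nextCursor, rows.Err()
}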
For time-based sorting, use a stable order and break ties. created_at alone is not enough if two rows share the same timestamp. Use a compound cursor:
WHERE (created_at, id) < ($1, $2)
ORDER BY created_at DESC, id DESC
LIMIT 20;
A few rules prevent duplicates and missing rows: keep a unique column as the final tie-breaker in the ORDER BY (usually id), make the cursor carry every column you sort by (created_at and id together), and keep the sort order identical from request to request.
A surprisingly common reason an API feels slow isn't the database. It's the response. Large JSON takes longer to build, longer to send, and longer for clients to parse. The fastest win is often returning less.
Start with your SELECT. If an endpoint needs only id, name, and status, ask for those columns and nothing else. SELECT * quietly gets heavier over time as tables gain long text, JSON blobs, and audit columns.
Another frequent slowdown is N+1 response building: you fetch a list of 50 items, then run 50 more queries to attach related data. It might pass tests, then collapses under real traffic. Prefer a single query that returns what you need (careful joins), or two queries where the second batches by IDs.
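For example, the second query can batch by the IDs returned from the first one instead of running once per row (a sketch, assuming a hypothetical order_items table):
-- One batched query instead of one query per order in the page.
SELECT order_id, product_id, quantity
FROM order_items
WHERE order_id = ANY($1);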
A few ways to keep payloads smaller without breaking clients: return only the fields the screen needs, keep nested objects shallow (an id and a display name often beat a full embedded record), and add an include= flag (or a fields= mask) so list responses stay lean and detail responses opt into extras.
Should you shape JSON in Postgres or in Go? Both can be fast. Choose based on what you're optimizing for.
Postgres JSON functions (jsonb_build_object, json_agg) are useful when you want fewer round trips and predictable shapes from one query. Shaping in Go is useful when you need conditional logic, reuse structs, or keep SQL easier to maintain. If your JSON-building SQL becomes hard to read, it becomes hard to tune.
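A sketch of the Postgres-side approach, assuming a users table with a name column alongside the orders examples above:
-- One row per user with their orders pre-shaped as JSON.
SELECT u.id,
       u.name,
       json_agg(
         jsonb_build_object('id', o.id, 'status', o.status, 'created_at', o.created_at)
         ORDER BY o.created_at DESC
       ) AS recent_orders
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.id = ANY($1)
GROUP BY u.id, u.name;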
A good rule is: let Postgres filter, sort, and aggregate. Then let Go handle final presentation.
If you're generating APIs quickly (for example with Koder.ai), adding include flags early helps avoid endpoints that bloat over time. It also gives you a safe way to add fields without making every response heavier.
You don't need a huge test lab to catch most performance issues. A short, repeatable pass surfaces the problems that turn into outages once traffic shows up, especially when the starting point is generated code that you plan to ship.
Before changing anything, write down a small baseline: p95 latency per key endpoint, error rate, and your top slow queries by total time at your expected concurrency.
Start small, change one thing at a time, and re-test after each change.
Run a 10 to 15 minute load test that looks like real usage. Hit the same endpoints your first users will hit (login, list pages, search, create). Then sort routes by p95 latency and total time spent.
Check for connection pressure before tuning SQL. A pool that's too large overwhelms Postgres. A pool that's too small creates long waits. Look for rising wait time to acquire a connection and connection counts that spike during bursts. Adjust pool and idle limits first, then re-run the same load.
EXPLAIN the top slow queries and fix the biggest red flag. The usual culprits are full table scans on large tables, sorts on large result sets, and joins that explode row counts. Pick the single worst query and make it boring.
Add or adjust one index, then re-test. Indexes help when they match your WHERE and ORDER BY. Don't add five at once. If your slow endpoint is "list orders by user_id ordered by created_at", a composite index on (user_id, created_at) can be the difference between instant and painful.
Tighten responses and pagination, then re-test again. If an endpoint returns 50 rows with large JSON blobs, your database, network, and client all pay. Return only the fields the UI needs, and prefer pagination that doesn't slow down as tables grow.
Keep a simple change log: what changed, why, and what moved in p95. If a change doesn't improve your baseline, revert it and move on.
Most performance problems in Go APIs on Postgres are self-inflicted. The good news is that a few checks catch many of them before real traffic arrives.
One classic trap is treating pool size like a speed knob. Setting it "as high as possible" often makes everything slower. Postgres spends more time juggling sessions, memory, and locks, and your app starts timing out in waves. A smaller, stable pool with predictable concurrency usually wins.
Another common mistake is "index everything." Extra indexes can help reads, but they also slow writes and can change query plans in surprising ways. If your API inserts or updates frequently, every extra index adds work. Measure before and after, and re-check plans after adding an index.
Pagination debt sneaks in quietly. Offset pagination looks fine early, then p95 rises over time because the database has to walk past more and more rows.
JSON payload size is another hidden tax. Compression helps bandwidth, but it doesn't remove the cost of building, allocating, and parsing large objects. Trim fields, avoid deep nesting, and return only what the screen needs.
If you only watch average response time, you'll miss where real user pain starts. p95 (and sometimes p99) is where pool saturation, lock waits, and slow plans show up first.
A quick pre-launch self-check: size the pool deliberately instead of maxing it out, add indexes only for queries you've measured, prefer cursor pagination for tables that will grow, trim response payloads to what the screen needs, watch p95 rather than the average, and re-check EXPLAIN after adding indexes or changing filters.
Before real users arrive, you want evidence that your API stays predictable under stress. The goal isn't perfect numbers. It's catching the few issues that cause timeouts, spikes, or a database that stops accepting new work.
Run checks in a staging environment that resembles production (similar DB size range, same indexes, same pool settings): measure p95 latency per key endpoint under load, capture your top slow queries by total time, watch pool wait time, EXPLAIN (ANALYZE, BUFFERS) the worst query to confirm it's using the index you expect, and sanity-check payload sizes on your busiest routes.
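For the "top slow queries by total time" part, pg_stat_statements is the usual source if the extension is enabled; note the timing columns are total_exec_time / mean_exec_time on Postgres 13+ (total_time / mean_time on older versions):
SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;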
Then do one worst-case run that mimics how products break: request a deep page, apply the broadest filter, and try it with a cold start (restart the API and hit the same request first). If deep pagination gets slower every page, switch to cursor-based pagination before launch.
Write down your defaults so the team makes consistent choices later: pool limits and timeouts, pagination rules (max page size, whether offset is allowed, cursor format), query rules (select only needed columns, avoid SELECT *, cap expensive filters), and logging rules (slow query threshold, how long to keep samples, how to label endpoints).
If you build and export Go + Postgres services with Koder.ai, doing a short planning pass before deployment helps keep filters, pagination, and response shapes intentional. Once you start tuning indexes and query shapes, snapshots and rollback make it easier to undo a "fix" that helps one endpoint but hurts others. If you want a single place to iterate on that workflow, Koder.ai on koder.ai is designed around generating and refining those services through chat, then exporting the source when you're ready.
Start by separating DB wait time from app work time.
Add simple timing around “wait for connection” and “query execution” to see which side dominates.
Use a small baseline you can repeat: pick a clear target like “p95 under 200 ms at 50 concurrent users, errors under 0.5%,” then only change one thing at a time and re-test the same request mix.
Enable slow query logging with a low threshold in pre-launch testing (for example 100–200 ms) and log the full statement so you can copy it into a SQL client.
Keep it temporary: once you’ve found the worst offenders, switch to sampling or raise the threshold.
A practical default is a small multiple of CPU cores per API instance, often 5–20 max open connections, with similar max idle connections, and recycle connections every 30–60 minutes.
Two common failure modes: a pool that’s too small makes requests queue before they ever reach Postgres, while a pool that’s too big overloads Postgres with active sessions and memory pressure.
Remember pools multiply across instances (20 connections × 10 instances = 200 connections).
Time DB calls in two parts: time spent waiting to acquire a connection from the pool, and time spent executing the query.
If most time is pool wait, adjust pool sizing, timeouts, and instance counts. If most time is query execution, focus on EXPLAIN and indexes.
Also confirm you always close rows promptly so connections return to the pool.
Run EXPLAIN (ANALYZE, BUFFERS) on the exact SQL your API sends and look for: sequential scans on large tables, estimated row counts far from the actuals, and large sorts feeding your ORDER BY.
Indexes should match what the endpoint actually does: filters + sort order.
Good default approach: one composite index that matches the endpoint’s real WHERE + ORDER BY pattern. Use a partial index when most traffic hits a predictable subset of rows.
Example pattern: most reads only touch rows where active = true. A partial index like ... WHERE active = true stays smaller, is more likely to fit in memory, and reduces write overhead versus indexing everything.
Confirm with EXPLAIN that Postgres actually uses it for your high-traffic queries.
LIMIT/OFFSET gets slower on deep pages because Postgres still has to walk past (and often sort) the skipped rows. Page 500 can be dramatically more expensive than page 1.
Prefer keyset (cursor) pagination: ask for the rows after the last row you saw, keep the ORDER BY stable, and include a unique tie-breaker like id.
Trimming responses is usually worth it for list endpoints. The fastest response is the one you don’t send. Practical wins: select only the columns the UI needs (avoid SELECT *), and add include= or fields= flags so clients opt into heavy fields. You’ll often reduce Go CPU, memory pressure, and tail latency just by shrinking payloads.
For indexes, match the endpoint’s real WHERE + ORDER BY and fix the biggest red flag first; don’t tune everything at once. Example: if you filter by user_id and sort by newest, an index like (user_id, created_at DESC) is often the difference between stable p95 and spikes. Re-check EXPLAIN after adding or changing an index.
For keyset pagination, include a unique tie-breaker (usually id), keep the ORDER BY identical across requests, and encode (created_at, id) or similar into a cursor. This keeps each page cost roughly constant as tables grow.