How Jeffrey Ullman’s core ideas power modern databases: relational algebra, optimization rules, joins, and compiler-style planning that helps systems scale.

Most people who write SQL, build dashboards, or tune a slow query have benefited from Jeffrey Ullman’s work—even if they’ve never heard his name. Ullman is a computer scientist and educator whose research and textbooks helped define how databases describe data, reason about queries, and execute them efficiently.
When a database engine turns your SQL into something it can run fast, it’s relying on ideas that must be both precise and adaptable. Ullman helped formalize the meaning of queries (so the system can rewrite them safely), and he helped connect database thinking with compiler thinking (so a query can be parsed, optimized, and translated into executable steps).
That influence is quiet because it doesn’t show up as a button in your BI tool or a visible feature in your cloud console. It shows up in the machinery underneath: how a JOIN gets planned, how filters get rewritten, and which execution strategy the engine chooses.

This post uses Ullman’s core ideas as a guided tour of the database internals that matter most in practice: how relational algebra sits underneath SQL, how query rewrites preserve meaning, why cost-based optimizers make the choices they do, and how join algorithms often decide whether a job finishes in seconds or hours.
We’ll also pull in a few compiler-like concepts—parsing, rewriting, and planning—because database engines behave more like sophisticated compilers than many people realize.
A quick promise: we’ll keep the discussion accurate, but avoid math-heavy proofs. The goal is to give you mental models you can apply at work the next time performance, scaling, or confusing query behavior shows up.
If you’ve ever written a SQL query and expected it to “just mean one thing,” you’re relying on ideas that Jeffrey Ullman helped popularize and formalize: a clean model for data, plus precise ways to describe what a query asks for.
At its core, the relational model treats data as tables (relations). Each table has rows (tuples) and columns (attributes). That sounds obvious now, but the important part is the discipline it creates: clear keys to identify rows, explicit relationships between tables, and a precise meaning for every column.
This framing makes it possible to reason about correctness and performance without hand-waving. When you know what a table represents and how rows are identified, you can predict what joins should do, what duplicates mean, and why certain filters change results.
Ullman’s teaching often uses relational algebra as a kind of query calculator: a small set of operations (select, project, join, union, difference) that you can combine to express what you want.
Why it matters to working with SQL: databases translate SQL into an algebraic form and then rewrite it into another equivalent form. Two queries that look different can be algebraically the same—which is how optimizers can reorder joins, push down filters, or remove redundant work while keeping the meaning intact.
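For example, these two queries (written against the customers and orders tables used in the examples later in this post) look different but are algebraically the same, so an optimizer is free to plan them identically:

-- Version 1: filter inside a derived table, then join.
SELECT c.name
FROM customers c
JOIN (SELECT customer_id FROM orders WHERE total > 100) o
  ON o.customer_id = c.id;

-- Version 2: join first, filter in the outer WHERE clause.
SELECT c.name
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.total > 100;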
SQL is largely “what,” but engines often optimize using algebraic “how.”
SQL dialects vary (Postgres vs. Snowflake vs. MySQL), but the fundamentals don’t. Understanding keys, relationships, and algebraic equivalence helps you spot when a query is logically wrong, when it’s merely slow, and which changes preserve meaning across platforms.
Relational algebra is the “math underneath” SQL: a small set of operators that describe the result you want. Jeffrey Ullman’s work helped make this operator view crisp and teachable—and it’s still the mental model most optimizers use.
A database query can be expressed as a pipeline of a few building blocks:
selection (the WHERE idea)
projection (the SELECT col1, col2 idea)
join (JOIN ... ON ...)
union (UNION)
difference (EXCEPT in many SQL dialects)

Because the set is small, it becomes easier to reason about correctness: if two algebra expressions are equivalent, they return the same table for any valid database state.
Take a familiar query:
SELECT c.name
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.total > 100;
Conceptually, this is:
start with a join of customers and orders: customers ⋈ orders
select only orders above 100: σ(o.total > 100)(...)
project the one column you want: π(c.name)(...)
That’s not the exact internal notation used by every engine, but it’s the right idea: SQL becomes an operator tree.
Many different trees can mean the same result. For example, filters can often be pushed earlier (apply σ before a big join), and projections can often drop unused columns sooner (apply π earlier).
Those equivalence rules are what let a database rewrite your query into a cheaper plan without changing the meaning. Once you see queries as algebra, “optimization” stops being magic and becomes safe, rule-guided reshaping.
When you write SQL, the database doesn’t execute it “as written.” It translates your statement into a query plan: a structured representation of the work to be done.
A good mental model is a tree of operators. Leaves read tables or indexes; internal nodes transform and combine rows. Common operators include scan, filter (selection), project (choose columns), join, group/aggregate, and sort.
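To see that operator tree for your own queries, most engines can show it on request. A minimal sketch, assuming PostgreSQL syntax (other engines have their own EXPLAIN variants):

-- Ask the planner for the operator tree it would use, without executing the query.
EXPLAIN
SELECT c.name
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.total > 100;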
Databases typically separate planning into two layers: a logical plan that describes what operations are needed (filter, join, aggregate), and a physical plan that decides how to run them (index scan vs. full scan, hash join vs. nested loop, sort strategies).
Ullman’s influence shows up in the emphasis on meaning-preserving transformations: rearrange the logical plan in many ways without changing the answer, then pick an efficient physical strategy.
Before choosing the final execution approach, optimizers apply algebraic “cleanup” rules. These rewrites don’t change results; they reduce unnecessary work.
Common examples include pushing filters closer to the tables they apply to, dropping columns that aren’t needed later, and removing redundant work that can’t affect the final result.
Suppose you want orders for users in one country:
SELECT o.order_id, o.total
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.country = 'CA';
A naïve interpretation might join all users to all orders and then filter to Canada. A meaning-preserving rewrite pushes the filter down so the join touches fewer rows:
filter users to country = 'CA' before the join
project only order_id and total at the end

In plan terms, the optimizer tries to turn:
Join(Users, Orders) → Filter(country='CA') → Project(order_id,total)
into something closer to:
Filter(country='CA') on Users → Join(with Orders) → Project(order_id,total)
Same answer. Less work.
These rewrites are easy to overlook because you never type them—yet they’re a major reason the same SQL can run fast on one database and slow on another.
When you run a SQL query, the database considers multiple valid ways to get the same answer, then chooses the one it expects to be cheapest. That decision process is called cost-based optimization—and it’s one of the most practical places where Ullman-style theory shows up in everyday performance.
A cost model is a scoring system the optimizer uses to compare alternative plans. Most engines estimate cost using a few core resources: how many rows are processed, how much I/O is required, how much CPU the operators consume, and how much memory is available (including whether a hash or sort will spill to disk).
The model doesn’t need to be perfect; it needs to be directionally right often enough to pick good plans.
Before it can score plans, the optimizer asks a question at every step: how many rows will this produce? That’s cardinality estimation.
If you filter WHERE country = 'CA', the engine estimates what fraction of the table matches. If you join customers to orders, it estimates how many pairs will match on the join key. These row-count guesses determine whether it prefers an index scan over a full scan, a hash join over a nested loop, or whether a sort will be small or enormous.
The optimizer’s guesses are driven by statistics: counts, value distributions, null rates, and sometimes correlations between columns.
When stats are stale or missing, the engine can misjudge row counts by orders of magnitude. A plan that looks cheap on paper can become expensive in reality—classic symptoms include sudden slowdowns after data growth, “random” plan changes, or joins that unexpectedly spill to disk.
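If stale statistics are a suspect, most engines let you refresh them directly. A minimal sketch, assuming PostgreSQL (MySQL’s ANALYZE TABLE and SQL Server’s UPDATE STATISTICS play a similar role):

-- Recompute planner statistics for one table so row-count estimates reflect current data.
ANALYZE orders;

-- Or refresh statistics across the whole database.
ANALYZE;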
Better estimates often require more work: more detailed stats, sampling, or exploring more candidate plans. But planning itself costs time, especially for complex queries.
So optimizers balance two goals: finding a plan that’s good enough, and not spending more time planning than that plan would save at execution.
Understanding that trade-off helps you interpret EXPLAIN output: the optimizer isn’t trying to be clever—it’s trying to be predictably correct under limited information.
Ullman’s work helped popularize a simple but powerful idea: SQL isn’t “run” so much as translated into an execution plan. Nowhere is that more obvious than joins. Two queries that return the same rows can differ wildly in runtime depending on which join algorithm the engine chooses—and in what order it joins tables.
Nested loop join is conceptually straightforward: for each row on the left, find matching rows on the right. It can be fast when the left side is small and the right side has a useful index.
Hash join builds a hash table from one input (often the smaller) and probes it with the other. It shines for large, unsorted inputs with equality conditions (e.g., A.id = B.id), but needs memory; spill-to-disk can erase the advantage.
Merge join walks two inputs in sorted order. It’s a great fit when both sides are already ordered (or cheaply sortable), such as when indexes can deliver rows in join-key order.
With three or more tables, the number of possible join orders explodes. Joining two large tables first might create a huge intermediate result that slows everything else. A better order often starts with the most selective filter (fewest rows) and joins outward, keeping intermediates small.
Indexes don’t just speed up lookups—they make certain join strategies viable. An index on the join key can turn an expensive nested loop into a quick “seek per row” pattern. Conversely, missing or unusable indexes may push the engine toward hash joins or large sorts for merge joins.
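A small sketch of the cheapest experiment, assuming the users/orders example from earlier and PostgreSQL syntax (the index name is arbitrary): add an index on the join key, then re-check which join strategy the planner picks.

-- An index on the join key can make an index-driven nested loop viable.
CREATE INDEX idx_orders_user_id ON orders (user_id);

-- Re-check the chosen join strategy and join order.
EXPLAIN
SELECT o.order_id, o.total
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.country = 'CA';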
Databases don’t just “run SQL.” They compile it. Ullman’s influence spans both database theory and compiler thinking, and that connection explains why query engines behave like programming language toolchains: they translate, rewrite, and optimize before doing any work.
When you send a query, the first step looks like a compiler’s front end. The engine tokenizes keywords and identifiers, checks grammar, and builds a parse tree (often simplified into an abstract syntax tree). This is where basic errors are caught: missing commas, ambiguous column names, invalid grouping rules.
A helpful mental model: SQL is a programming language whose “program” happens to describe data relationships instead of loops.
Compilers convert syntax into an intermediate representation (IR). Databases do something similar: they translate SQL syntax into logical operators such as:
scan (read a table)
filter (selection)
project (choose columns)
join
aggregate (GROUP BY)

That logical form is closer to relational algebra than SQL text, which makes it easier to reason about meaning and equivalence.
Compiler optimizations keep program results identical while making execution cheaper. Database optimizers do the same, using rule systems that push filters and projections earlier, reorder joins, and eliminate operators whose output is never used.
This is the database version of “dead code elimination”: not identical techniques, but the same philosophy—preserve semantics, reduce cost.
If your query is slow, don’t stare at SQL alone. Look at the query plan the way you’d inspect compiler output. A plan tells you what the engine actually chose: join order, index usage, and where time is spent.
Practical takeaway: learn to read EXPLAIN output as a performance “assembly listing.” It turns tuning from guesswork into evidence-based debugging. For more on turning that into a habit, see /blog/practical-query-optimization-habits.
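A minimal version of that habit, assuming PostgreSQL (most engines have an equivalent): run the statement with actual execution statistics and find the step where estimated and actual row counts diverge.

-- Execute the query and report actual row counts, timings, and buffer usage alongside the plan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.name
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.total > 100;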
Good query performance often starts before you write SQL. Ullman’s schema design theory (especially normalization) is about structuring data so the database can keep it correct, predictable, and efficient as it grows.
Normalization aims to store each fact once, avoid update anomalies, and keep data consistent as it grows and changes.
Those correctness wins translate into performance wins later: fewer duplicated fields, smaller indexes, and fewer expensive updates.
You don’t need to memorize proofs to use the ideas: clear primary keys, one fact stored in one place, and no multi-valued columns cover most day-to-day modeling decisions.
Denormalization can be a smart choice when read-heavy or analytical workloads repeat the same joins and the cost of keeping duplicated data in sync is worth the faster reads.
The key is to denormalize deliberately, with a process to keep duplicates in sync.
Schema design shapes what the optimizer can do. Clear keys and foreign keys enable better join strategies, safer rewrites, and more accurate row-count estimates. Meanwhile, excessive duplication can bloat indexes and slow writes, and multi-valued columns block efficient predicates. As data volume grows, these early modeling decisions often matter more than micro-optimizing a single query.
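A minimal sketch of what clear keys and relationships look like in DDL, reusing the column names from earlier examples (the data types are assumptions):

-- Explicit primary and foreign keys give the optimizer reliable facts about uniqueness and relationships.
CREATE TABLE users (
  id      BIGINT PRIMARY KEY,
  country TEXT NOT NULL
);

CREATE TABLE orders (
  order_id BIGINT PRIMARY KEY,
  user_id  BIGINT NOT NULL REFERENCES users (id),
  total    NUMERIC NOT NULL
);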
When a system “scales,” it’s rarely just about adding bigger machines. More often, the hard part is that the same query meaning must be preserved while the engine chooses a very different physical strategy to keep runtimes predictable. Ullman’s emphasis on formal equivalences is exactly what allows those strategy changes without changing results.
At small sizes, many plans “work.” At scale, the difference between scanning a table, using an index, or using a precomputed result can be the difference between seconds and hours. The theory side matters because the optimizer needs a safe set of rewrite rules (e.g., pushing filters earlier, reordering joins) that don’t alter the answer—even if they radically alter the work performed.
Partitioning (by date, customer, region, etc.) turns one logical table into many physical pieces. That affects planning: filters on the partition key let the engine skip entire partitions, and the best join or aggregation strategy can change depending on which pieces actually need to be read.
The SQL text may be unchanged, but the best plan now depends on where the rows live.
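A sketch of declarative range partitioning, assuming PostgreSQL syntax and an order_date column that the earlier examples didn’t show:

-- One logical table, stored as physical pieces split by date range.
CREATE TABLE orders_partitioned (
  order_id   BIGINT NOT NULL,
  user_id    BIGINT NOT NULL,
  total      NUMERIC NOT NULL,
  order_date DATE NOT NULL
) PARTITION BY RANGE (order_date);

CREATE TABLE orders_2024 PARTITION OF orders_partitioned
  FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

-- A filter on the partition key lets the planner skip partitions that cannot match.
SELECT count(*) FROM orders_partitioned WHERE order_date >= DATE '2024-06-01';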
Materialized views are essentially “saved subexpressions.” If the engine can prove that your query matches (or can be rewritten to match) a stored result, it can replace expensive work—like repeated joins and aggregations—with a fast lookup. This is relational algebra thinking in practice: recognize equivalence, then reuse.
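A small sketch of that reuse, assuming PostgreSQL syntax (the view name and aggregation are illustrative):

-- Precompute a repeated join and aggregation once; reads become a cheap lookup.
CREATE MATERIALIZED VIEW order_totals_by_country AS
SELECT u.country, sum(o.total) AS total_spend
FROM users u
JOIN orders o ON o.user_id = u.id
GROUP BY u.country;

-- Redo the expensive work on a schedule you control, not on every query.
REFRESH MATERIALIZED VIEW order_totals_by_country;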
Caching can speed up repeated reads, but it won’t rescue a query that must scan too much data, shuffle huge intermediate results, or compute a giant join. When scale issues appear, the fix is frequently: reduce the amount of data touched (layout/partitioning), reduce repeated computation (materialized views), or change the plan—not just “add cache.”
Ullman’s influence shows up in a simple mindset: treat a slow query as a statement of intent that the database is free to rewrite, then verify what it actually decided to do. You don’t need to be a theorist to benefit—you just need a repeatable routine.
Start with the parts that usually dominate runtime: full scans of large tables, joins that produce huge intermediate results, and sorts or aggregations big enough to spill to disk.
If you only do one thing, identify the first operator where the row count explodes. That’s usually the root cause.
These are easy to write and surprisingly costly:
Functions wrapped around indexed columns: WHERE LOWER(email) = ... can prevent index usage (use a normalized column or a functional index if supported; a sketch follows below).
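If your engine supports expression (functional) indexes, PostgreSQL does, for example, you can index the transformed value instead. A sketch with an assumed users.email column and an arbitrary index name:

-- Index the expression itself so equality tests on LOWER(email) can use the index.
CREATE INDEX idx_users_email_lower ON users (LOWER(email));

-- This predicate can now be matched against the expression index.
SELECT id FROM users WHERE LOWER(email) = 'someone@example.com';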
Relational algebra encourages two practical moves: push WHERE conditions before joins whenever possible to shrink inputs, and drop columns you don’t need as early as possible so intermediate results stay small.

A good hypothesis sounds like: “This join is expensive because we’re joining too many rows; if we filter orders to the last 30 days first, the join input drops.”
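A sketch of testing that hypothesis by making the filter explicit, assuming an orders.order_date column (not shown in earlier examples) and PostgreSQL date arithmetic:

-- Shrink the join input first; same meaning, smaller intermediate result.
SELECT u.id, o.order_id, o.total
FROM users u
JOIN (
  SELECT order_id, user_id, total
  FROM orders
  WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
) o ON o.user_id = u.id;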
Use a simple decision rule:
Rewrite the query when EXPLAIN shows avoidable work (unnecessary joins, late filtering, non-sargable predicates); otherwise look at statistics, indexes, or physical layout.

The goal isn’t “clever SQL.” It’s predictable, smaller intermediate results—exactly the kind of equivalence-preserving improvement Ullman’s ideas make easier to spot.
These concepts aren’t just for database administrators. If you’re shipping an application, you’re making database and query-planning decisions whether you realize it or not: schema shape, key choices, query patterns, and the data access layer all influence what the optimizer can do.
If you’re using a vibe-coding workflow (for example, generating a React + Go + PostgreSQL app from a chat interface in Koder.ai), Ullman-style mental models are a practical safety net: you can review the generated schema for clean keys and relationships, inspect the queries your app relies on, and validate performance with EXPLAIN before problems show up in production. The faster you can iterate on “query intent → plan → fix,” the more value you get from accelerated development.
You don’t need to “study theory” as a separate hobby. The fastest way to benefit from Ullman-style fundamentals is to learn just enough to read query plans confidently—and then practice on your own database.
Search for these books and lecture topics (no affiliation—just widely cited starting points): A First Course in Database Systems (Ullman and Widom), Database Systems: The Complete Book (Garcia-Molina, Ullman, and Widom), and, for the compiler side, Compilers: Principles, Techniques, and Tools (Aho, Lam, Sethi, and Ullman).
Start small and keep each step tied to something you can observe:
Pick 2–3 real queries and iterate:
Try small, meaning-preserving rewrites: switch IN to EXISTS, push predicates earlier, remove unnecessary columns, and compare results (see the sketch below).
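For example, these two forms ask for the same rows when user_id is never NULL; the useful part is comparing their plans, not memorizing which spelling is “better”:

-- Form 1: membership test with IN.
SELECT u.id
FROM users u
WHERE u.id IN (SELECT o.user_id FROM orders o);

-- Form 2: existence test with a correlated subquery.
SELECT u.id
FROM users u
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);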
Use clear, plan-based language when you discuss performance with your team. That’s the practical payoff of Ullman’s foundations: you get a shared vocabulary to explain performance—without guessing.
Jeffrey Ullman helped formalize how databases represent query meaning and how they can safely transform queries into faster equivalents. That foundation shows up every time an engine rewrites a query, reorders joins, or picks a different execution plan while guaranteeing the same result set.
Relational algebra is a small set of operators (select, project, join, union, difference) that precisely describe query results. Engines commonly translate SQL into an algebra-like operator tree so they can apply equivalence rules (like pushing filters earlier) before choosing an execution strategy.
Because optimization depends on proving that a rewritten query returns the same results. Equivalence rules let the optimizer do things like:
apply WHERE filters before a join
reorder joins
remove redundant work

These changes can cut work dramatically without changing meaning.
A logical plan describes what operations are needed (filter, join, aggregate) independent of storage details. A physical plan chooses how to run them (index scan vs. full scan, hash join vs. nested loop, parallelism, sort strategies). Most performance differences come from physical choices, enabled by logical rewrites.
Cost-based optimization evaluates multiple valid plans and chooses the one with the lowest estimated cost. Costs are usually driven by practical factors like rows processed, I/O, CPU, and memory (including whether a hash or sort spills to disk).
Cardinality estimation is the optimizer’s guess of “how many rows will come out of this step?” Those estimates determine join order, join type, and whether an index scan is worthwhile. When estimates are wrong (often due to stale/missing statistics), you can get sudden slowdowns, big spills, or surprising plan changes.
Focus on a few high-signal clues: the first operator where row counts explode, the join order and join algorithms chosen, whether indexes were actually used, and any sorts or hashes that spill to disk.
Treat the plan like compiled output: it shows what the engine actually decided to do.
Normalization reduces duplicated facts and update anomalies, which often means smaller tables and indexes and more reliable joins. Denormalization can still be right for analytics or repeated read-heavy patterns, but it should be deliberate (clear refresh rules, known redundancy) so correctness doesn’t degrade over time.
Scale often requires changing physical strategy while keeping query meaning identical. Common tools include partitioning (so queries touch only the relevant pieces), materialized views (so repeated joins and aggregations are precomputed), and indexes or layout choices that match real access patterns.
Caching helps repeated reads, but it won’t fix a query that must touch too much data or produces huge intermediate joins.