-
Notifications
You must be signed in to change notification settings - Fork 1
Expand file tree
/
Copy path06-indexing.html
More file actions
523 lines (485 loc) · 49 KB
/
Copy path06-indexing.html
File metadata and controls
523 lines (485 loc) · 49 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8"><meta name="viewport" content="width=device-width, initial-scale=1.0">
<script>try{var t=localStorage.getItem("dbms-theme")||"dark";document.documentElement.setAttribute("data-theme",t)}catch(e){document.documentElement.setAttribute("data-theme","dark")}</script>
<title>Indexing — DBMS Illustrated</title>
<link rel="stylesheet" href="../css/style.css">
<style>:root{--topic-color:#06B6D4}</style>
</head>
<body>
<div class="reading-progress" id="reading-progress"></div>
<nav class="navbar">
<div class="navbar-inner">
<a class="navbar-brand" href="../index.html"><span class="brand-icon" style="color:var(--primary-soft)">◆</span>DBMS Illustrated</a>
<div class="navbar-links"><a href="../index.html#topics">All Topics</a></div>
<div class="navbar-actions"><button class="btn-icon" id="theme-toggle" title="Toggle theme">☼</button></div>
</div>
</nav>
<div class="container">
<div class="breadcrumb">
<a href="../index.html">Home</a><span class="breadcrumb-sep">›</span>
<a href="../index.html#topics">Course</a><span class="breadcrumb-sep">›</span>
<span>Indexing</span>
</div>
<div class="topic-header">
<div class="topic-badge">Topic 06</div>
<h1>Database <span class="accent">Indexing</span></h1>
<p class="subtitle">Without an index, a database query scans every single row in a table — like searching a phone book by reading every entry. An index is like the alphabetical tabs in that phone book: it lets the database jump directly to the right place. This topic explains how indexes work, when to use them, and why using the wrong one can make things worse.</p>
<div class="company-badges">
<span class="badge">PostgreSQL GiST/GIN/BRIN</span><span class="badge">MySQL InnoDB B+tree</span>
<span class="badge">Redis Sorted Set</span><span class="badge">Elasticsearch</span><span class="badge">MongoDB</span>
</div>
</div>
<div class="at-a-glance reveal">
<h3>At a Glance</h3>
<ul>
<li><strong>B+tree index</strong> — the default type. Think of it as a sorted alphabetical list that stays balanced no matter how many entries are added. Looking up one value, a range, or a sort order are all fast. Used for almost all everyday queries.</li>
<li><strong>Clustered index</strong> — the physical rows on disk are arranged in the same order as this index. In MySQL/InnoDB, the primary key is always the clustered index. One per table. Range scans by primary key are extremely fast because the rows are stored together.</li>
<li><strong>Covering index</strong> — an index that already contains every column your query needs. The database never has to touch the actual table rows — it gets everything from the index. This is like answering a question from the index at the back of a book without having to flip back to the chapter.</li>
<li><strong>Composite index</strong> — an index on multiple columns together. The order matters: (dept, hire_date) helps you filter by dept (or by dept AND hire_date), but not by hire_date alone. Always put the column you filter on most often first.</li>
<li><strong>Indexes make reads faster but writes slower.</strong> Every insert, update, or delete must update every index on the table. A table with 10 indexes pays 10x the write cost. Add indexes purposefully after measuring, not speculatively.</li>
</ul>
</div>
<!-- ── Index Types Overview ────────────────────────────────── -->
<div class="section-label">Index Types</div>
<h2 class="section-title">Choosing the Right Index</h2>
<p class="section-desc">Not all indexes are B+trees. Choosing the right structure for the access pattern is the difference between a query that uses the index and one that silently falls back to a full scan.</p>
<div class="mini-cards">
<div class="mini-card">
<div class="mini-card-icon"></div>
<h4>B+Tree (default)</h4>
<p>The most common index type. Imagine a very tall, well-organized filing cabinet that always stays balanced — no matter how many files you add or remove. Looking up a specific value is fast because you just navigate the tree top-to-bottom (typically 3–4 hops for millions of rows). Range queries (like "all orders between January and March") are also fast because the leaf level is sorted and linked together. Works for almost everything: equality checks, ranges, sorting, and prefix text searches like LIKE 'john%'.</p>
</div>
<div class="mini-card">
<div class="mini-card-icon"></div>
<h4>Hash</h4>
<p>A hash index works like a locker room: you hand over a key (your value), it instantly gives you the locker number (the row). Single lookups are very fast — but only for exact equality matches. You cannot ask "which lockers are between 50 and 100" — there is no ordering. Use hash indexes only when you need exact equality lookups on things like tokens or UUIDs and never need to search a range.</p>
</div>
<div class="mini-card">
<div class="mini-card-icon"></div>
<h4>GIN (Inverted)</h4>
<p>GIN (Generalized Inverted Index) is built for searching inside complex data like JSON, arrays, or text documents. Think of it like the index at the back of a book — it maps individual words or elements to which pages (rows) they appear in. If you store product metadata as JSON and want to find all products where color is "red," a GIN index can scan that in milliseconds. It takes longer to build and update than a B+tree, but it handles things a B+tree simply cannot.</p>
</div>
<div class="mini-card">
<div class="mini-card-icon"></div>
<h4>BRIN (Block Range)</h4>
<p>BRIN (Block Range Index) is an extremely small index — just a few kilobytes even for tables with billions of rows. It works by recording the minimum and maximum value for each physical section of the table on disk. If you query for a date range, it can quickly rule out sections of the table that fall outside your range without reading them. This only works if the column is naturally ordered on disk — like an auto-incrementing ID or timestamps in an append-only log table. It is useless for randomly ordered data like UUIDs.</p>
</div>
</div>
<!-- ── B+Tree Deep Dive ───────────────────────────────────── -->
<div class="section-label">B+Tree Internals</div>
<h2 class="section-title">B+Tree Mechanics</h2>
<p class="section-desc">The B+tree is the workhorse of database indexing. Unlike a binary search tree, each node in a B+tree can have dozens to hundreds of children (the "fan-out"), which keeps the height short even for billion-row tables. A typical B+tree on a modern database has a height of 3–4 for 100 million rows.</p>
<div class="steps">
<div class="step-item">
<div class="step-num">1</div>
<div>
<h4>Node structure and fan-out</h4>
<p>Think of the B+tree as a building directory. Each floor (level) has multiple offices (nodes), and each office has multiple doors pointing to the next floor down. The number of doors per office is called the fan-out — in a database, it is typically 100 to 500. Because each level branches that many ways, the tree stays shallow even for enormous tables: a tree with a fan-out of 500 can index 125 million rows in just three levels. This is why looking up a row takes only 3–4 disk reads regardless of table size.</p>
</div>
</div>
<div class="step-item">
<div class="step-num">2</div>
<div>
<h4>Point lookup (exact match)</h4>
<p>To find a specific value — like looking up a user by ID — the database starts at the top of the tree and follows pointers down toward the matching leaf. At each level, it checks which direction to go (left for smaller, right for larger). By the time it reaches the bottom (the leaf level), it has the exact row location. The whole journey is only 3–4 steps for most tables. The top levels of the tree are usually cached in memory, so often only the final step reads from disk.</p>
</div>
</div>
<div class="step-item">
<div class="step-num">3</div>
<div>
<h4>Range scan</h4>
<p>For a range query — like "all orders placed in January" — the database first finds the leaf where January starts (a normal point lookup), then walks forward through the sorted leaf level reading the next entry, and the next, until the range is exhausted. The leaf nodes are linked together like a chain, so this is very efficient. The amount of work is proportional to how many rows match, not the total size of the table. This is why range queries are fast with a B+tree but impossible with a hash index.</p>
</div>
</div>
<div class="step-item">
<div class="step-num">4</div>
<div>
<h4>Insert and rebalancing</h4>
<p>When a new row is inserted, the database finds the right leaf position in the tree and adds the new key there. If the leaf is full, it splits in two: the middle key moves up to the parent level, and the entries are divided between the two new leaves. If the parent is also full, the split ripples upward. In practice, splits are uncommon because each node has reserved some empty space (typically 10%) to absorb new entries without immediately splitting. Deletes leave small gaps, and the tree is not immediately reorganized.</p>
</div>
</div>
<div class="step-item">
<div class="step-num">5</div>
<div>
<h4>Index fill factor and bloat</h4>
<p>Over time, lots of deletes can leave index pages half-empty — a problem called index bloat. A half-empty page still takes the same amount of time to read as a full one, so bloat means you are doing extra work for no benefit. To clean this up, you can rebuild the index from scratch (REINDEX in PostgreSQL, OPTIMIZE TABLE in MySQL). On a busy production table this matters — a table with high delete rates (like a job queue or audit log) should have its index bloat monitored and periodically cleaned up.</p>
</div>
</div>
</div>
<!-- ── Clustered vs Non-Clustered ────────────────────────── -->
<div class="section-label">Clustered Index</div>
<h2 class="section-title">Clustered vs. Non-Clustered</h2>
<p class="section-desc">The single most important structural distinction in indexing: whether the index IS the table, or whether it is a separate data structure that points to the table.</p>
<div class="table-wrap">
<table class="compare-table">
<thead>
<tr><th>Property</th><th>Clustered Index</th><th>Non-Clustered Index</th></tr>
</thead>
<tbody>
<tr><td>Row storage</td><td>Leaf pages ARE the data pages</td><td>Leaf pages store (key, PK pointer)</td></tr>
<tr><td>Count per table</td><td>Exactly one</td><td>Unlimited (within reason)</td></tr>
<tr><td>Range scans on index key</td><td>Extremely fast — sequential pages</td><td>Fast on index, then random for heap fetch</td></tr>
<tr><td>Secondary lookup cost</td><td>None — data is at the leaf</td><td>Extra hop to clustered index (key lookup)</td></tr>
<tr><td>Insert order</td><td>Must maintain sorted order — random PKs cause page splits</td><td>Index maintains order; heap is unordered</td></tr>
<tr><td>MySQL InnoDB</td><td>Always the PRIMARY KEY</td><td>All other indexes (store PK, not row pointer)</td></tr>
<tr><td>PostgreSQL</td><td>Heap is unordered; CLUSTER command physically reorders once</td><td>All indexes point to heap tuple (ctid)</td></tr>
</tbody>
</table>
</div>
<p style="font-size:.85rem;color:var(--text-muted);margin-bottom:28px;">PostgreSQL note: PostgreSQL does not have "clustered indexes" in the InnoDB sense — all indexes point to the heap. The <code>CLUSTER</code> command physically reorders heap rows by an index at a single point in time, but new writes are not clustered. InnoDB's clustered index is a fundamental design choice that affects all secondary index lookups.</p>
<!-- ── Composite Indexes ───────────────────────────────────── -->
<div class="section-label">Composite Indexes</div>
<h2 class="section-title">Column Order in Composite Indexes</h2>
<p class="section-desc">A composite index on (A, B, C) sorts first by A, then B within each A value, then C within each (A,B) group. This structure has direct implications for which query predicates the index can satisfy.</p>
<div class="steps">
<div class="step-item">
<div class="step-num">1</div>
<div>
<h4>Leftmost prefix rule</h4>
<p>An index on (dept, hire_date, salary) is sorted first by department, then by hire date within each department, then by salary within each hire date. You can only use it efficiently if you are filtering by the columns from the left. So filtering by dept works. Filtering by dept AND hire_date works. But filtering by hire_date alone does not — the index is not sorted by hire_date globally, only within each department. Think of it like a phone book sorted by last name, then first name: you can find all "Smiths" quickly, but you cannot find all "Johns" without reading the whole thing.</p>
</div>
</div>
<div class="step-item">
<div class="step-num">2</div>
<div>
<h4>Equality columns before range columns</h4>
<p>When designing a composite index, put columns used in exact-match filters before columns used in range filters. For example: if you always filter by dept = 'Engineering' AND hire_date between two dates, put dept first. The exact match on dept narrows the search to one section of the index, then the range scan on hire_date within that section is small. If you reversed it (hire_date first), the range would span all hire dates across all departments — a much bigger search.</p>
</div>
</div>
<div class="step-item">
<div class="step-num">3</div>
<div>
<h4>ORDER BY and GROUP BY with indexes</h4>
<p>If your query sorts by the same columns that are in an index (in the same order), the database can read the index in sorted order and skip the sorting step entirely. For example, if you have an index on (dept, hire_date) and your query says ORDER BY dept, hire_date — the results come out already sorted as the database traverses the index. This can be a big win because sorting is expensive when it requires loading a lot of rows into memory.</p>
</div>
</div>
<div class="step-item">
<div class="step-num">4</div>
<div>
<h4>Index skip scan</h4>
<p>Modern databases have a trick: if the first column of a composite index only has a few distinct values (like a status column with "pending," "active," "closed"), they can scan the index multiple times — once per value — to answer a query that only filters on the second column. This "skip scan" avoids a full table scan when a full composite match is not available. It works best when the first column has very few distinct values; if there are thousands of distinct values, the trick becomes more expensive than just scanning the whole table.</p>
</div>
</div>
</div>
<!-- ── Covering Index ──────────────────────────────────────── -->
<div class="section-label">Covering Index</div>
<h2 class="section-title">Covering Indexes and INCLUDE Columns</h2>
<p class="section-desc">A covering index is one that contains all the columns a query needs — the query can be answered entirely from the index without touching the table. This eliminates the most expensive part of many queries: the random I/O lookup back to the heap.</p>
<div class="steps">
<div class="step-item">
<div class="step-num">1</div>
<div>
<h4>Traditional covering vs. INCLUDE</h4>
<p>To make a covering index, you add extra columns to the index so the query never has to touch the actual table. One way is to put those columns directly in the index key — but that changes the sort order and can add overhead to every write. PostgreSQL offers a cleaner option: INCLUDE. Columns in INCLUDE are stored in the leaf pages (so a query can read them without going back to the table) but they do not affect the sort order of the index. Think of them as passengers in the index — they are there when needed but do not drive.</p>
</div>
</div>
<div class="step-item">
<div class="step-num">2</div>
<div>
<h4>Index-only scan</h4>
<p>When a covering index exists, the database can answer the entire query from the index without ever reading the original table rows. In EXPLAIN output, you will see "Index Only Scan" — this is the fastest possible outcome. The database does need to do a quick side check to confirm that all the rows in the index are visible to your transaction (because of MVCC), but running VACUUM regularly keeps that check cheap. An index-only scan is often 10x faster than even a normal index scan.</p>
</div>
</div>
<div class="step-item">
<div class="step-num">3</div>
<div>
<h4>When to use INCLUDE</h4>
<p>Use INCLUDE when a query always filters on certain columns AND always reads a few extra columns. For example, if your dashboard always queries orders by user and date, and always needs to display the status and total, add status and total with INCLUDE. This makes the index covering for that specific query. The index key stays clean (just user_id and created_at), but the leaf nodes carry status and total along so no table lookup is needed.</p>
</div>
</div>
</div>
<!-- ── Partial Indexes ────────────────────────────────────── -->
<div class="section-label">Partial & Expression Indexes</div>
<h2 class="section-title">Targeted Index Strategies</h2>
<div class="steps">
<div class="step-item">
<div class="step-num">1</div>
<div>
<h4>Partial indexes</h4>
<p>A partial index is an index on a subset of rows — only rows where a condition is true. For example, if a jobs table has millions of completed jobs but only a few thousand pending ones, you can create an index that only covers the pending ones. This index is tiny, always selective, and very fast to scan. Any query that filters on pending jobs can use it. The key insight: you index only what you actually query frequently, not the whole table.</p>
</div>
</div>
<div class="step-item">
<div class="step-num">2</div>
<div>
<h4>Expression (functional) indexes</h4>
<p>Sometimes you need to search by a transformed version of a column — like searching email addresses case-insensitively. If you store emails in mixed case and always query with LOWER(email) = '...', a regular index on email will not help — the index stores the original case, but your search uses the lowercase version. An expression index solves this: instead of indexing the raw column, you index the result of LOWER(email). Then queries that use LOWER(email) in their filter can use that index.</p>
</div>
</div>
<div class="step-item">
<div class="step-num">3</div>
<div>
<h4>Unique partial indexes</h4>
<p>You can combine partial indexes with a uniqueness constraint. For example, if you soft-delete users (keeping old records with a deleted_at timestamp), you want to enforce that no two active users share the same email — but it is fine if a deleted user's email is reused later. A unique partial index on email WHERE deleted_at IS NULL enforces uniqueness only among active rows. A regular UNIQUE constraint would reject the reuse entirely, even for deleted accounts.</p>
</div>
</div>
</div>
<!-- ── EXPLAIN Output ────────────────────────────────────── -->
<div class="section-label">Reading EXPLAIN</div>
<h2 class="section-title">Understanding EXPLAIN ANALYZE</h2>
<p class="section-desc">EXPLAIN shows the planner's estimated execution plan. EXPLAIN ANALYZE actually runs the query and shows real row counts and times. The gap between estimated and actual rows reveals stale statistics or planning mistakes.</p>
<div class="code-block">
<div class="cb-header"><span class="cb-lang">SQL — Index Creation and EXPLAIN</span></div>
<pre><span class="cmt">-- Composite index: equality column first, range column second</span>
<span class="kw">CREATE INDEX</span> idx_emp_dept_hire <span class="kw">ON</span> employees (dept, hire_date);
<span class="cmt">-- Covering index with INCLUDE (PG 11+)</span>
<span class="cmt">-- For: SELECT salary WHERE dept='Eng' ORDER BY hire_date</span>
<span class="kw">CREATE INDEX</span> idx_emp_covering <span class="kw">ON</span> employees (dept, hire_date) <span class="kw">INCLUDE</span> (salary);
<span class="cmt">-- Partial index: only index active, high-priority jobs</span>
<span class="kw">CREATE INDEX</span> idx_pending_jobs <span class="kw">ON</span> jobs (priority <span class="kw">DESC</span>, created_at)
<span class="kw">WHERE</span> status = <span class="str">'pending'</span>;
<span class="cmt">-- Expression index for case-insensitive email lookup</span>
<span class="kw">CREATE INDEX</span> idx_email_lower <span class="kw">ON</span> users (<span class="kw">LOWER</span>(email));
<span class="cmt">-- Hash index: equality-only, smaller than B+tree</span>
<span class="kw">CREATE INDEX</span> idx_token_hash <span class="kw">ON</span> sessions <span class="kw">USING HASH</span> (token);
<span class="cmt">-- BRIN index: tiny, good for time-series tables</span>
<span class="kw">CREATE INDEX</span> idx_events_brin <span class="kw">ON</span> events <span class="kw">USING BRIN</span> (created_at);
<span class="cmt">-- GIN for JSONB containment</span>
<span class="kw">CREATE INDEX</span> idx_meta_gin <span class="kw">ON</span> products <span class="kw">USING GIN</span> (metadata);
<span class="cmt">-- Query: SELECT * FROM products WHERE metadata @> '{"color":"red"}'</span>
<span class="cmt">-- EXPLAIN ANALYZE to verify index usage</span>
<span class="kw">EXPLAIN</span> (FORMAT TEXT, ANALYZE, BUFFERS)
<span class="kw">SELECT</span> salary <span class="kw">FROM</span> employees
<span class="kw">WHERE</span> dept = <span class="str">'Engineering'</span>
<span class="kw">AND</span> hire_date > <span class="str">'2022-01-01'</span>
<span class="kw">ORDER BY</span> hire_date;
<span class="cmt">-- Good output: "Index Only Scan using idx_emp_covering"
-- Bad output: "Seq Scan on employees" (index not used)
-- Also check: rows (estimated) vs. rows (actual) — if far apart, run ANALYZE</span>
<span class="cmt">-- Find unused indexes (waste write performance)</span>
<span class="kw">SELECT</span> schemaname, tablename, indexname, idx_scan
<span class="kw">FROM</span> pg_stat_user_indexes
<span class="kw">WHERE</span> idx_scan = <span class="num">0</span>
<span class="kw">ORDER BY</span> pg_relation_size(indexrelid) <span class="kw">DESC</span>;</pre>
</div>
<!-- ── Interactive Demo ───────────────────────────────────── -->
<div class="section-label">Interactive Demo</div>
<h2 class="section-title">B+Tree Traversal</h2>
<p class="section-desc">Enter a search key and watch the B+tree traversal animate — root to internal node to leaf. Toggle the index off to see the full table scan comparison.</p>
<div class="diagram-box">
<svg viewBox="0 0 700 230" role="img" aria-label="B+Tree structure: root, internal nodes, leaf nodes with sibling pointers">
<!-- Root -->
<rect x="265" y="10" width="170" height="42" rx="8" class="svg-box-p"/>
<text x="350" y="28" text-anchor="middle" class="svg-label">Root</text>
<text x="350" y="45" text-anchor="middle" class="svg-mono">[ 40 | 80 ]</text>
<!-- Internal left -->
<rect x="80" y="90" width="150" height="42" rx="8" class="svg-box-c"/>
<text x="155" y="108" text-anchor="middle" class="svg-label">Internal</text>
<text x="155" y="125" text-anchor="middle" class="svg-mono">[ 20 | 30 ]</text>
<!-- Internal mid -->
<rect x="275" y="90" width="150" height="42" rx="8" class="svg-box-c"/>
<text x="350" y="108" text-anchor="middle" class="svg-label">Internal</text>
<text x="350" y="125" text-anchor="middle" class="svg-mono">[ 50 | 65 ]</text>
<!-- Internal right -->
<rect x="470" y="90" width="150" height="42" rx="8" class="svg-box-c"/>
<text x="545" y="108" text-anchor="middle" class="svg-label">Internal</text>
<text x="545" y="125" text-anchor="middle" class="svg-mono">[ 85 | 95 ]</text>
<!-- Root -> internal edges -->
<line x1="310" y1="52" x2="210" y2="90" class="svg-line"/>
<line x1="350" y1="52" x2="350" y2="90" class="svg-line"/>
<line x1="390" y1="52" x2="490" y2="90" class="svg-line"/>
<!-- Leaf row -->
<rect x="15" y="173" width="110" height="42" rx="8" class="svg-box-g"/>
<text x="70" y="191" text-anchor="middle" class="svg-label">Leaf</text>
<text x="70" y="208" text-anchor="middle" class="svg-mono">[10|15|18]</text>
<rect x="143" y="173" width="110" height="42" rx="8" class="svg-box-g"/>
<text x="198" y="191" text-anchor="middle" class="svg-label">Leaf</text>
<text x="198" y="208" text-anchor="middle" class="svg-mono">[20|25|30]</text>
<rect x="271" y="173" width="110" height="42" rx="8" class="svg-box-g"/>
<text x="326" y="191" text-anchor="middle" class="svg-label">Leaf</text>
<text x="326" y="208" text-anchor="middle" class="svg-mono">[40|50|60]</text>
<rect x="399" y="173" width="110" height="42" rx="8" class="svg-box-g"/>
<text x="454" y="191" text-anchor="middle" class="svg-label">Leaf</text>
<text x="454" y="208" text-anchor="middle" class="svg-mono">[65|70|75]</text>
<rect x="527" y="173" width="150" height="42" rx="8" class="svg-box-g"/>
<text x="602" y="191" text-anchor="middle" class="svg-label">Leaf</text>
<text x="602" y="208" text-anchor="middle" class="svg-mono">[80|85|95|99]</text>
<!-- Internal -> leaf edges -->
<line x1="120" y1="132" x2="70" y2="173" class="svg-line"/>
<line x1="155" y1="132" x2="198" y2="173" class="svg-line"/>
<line x1="315" y1="132" x2="326" y2="173" class="svg-line"/>
<line x1="350" y1="132" x2="454" y2="173" class="svg-line"/>
<line x1="510" y1="132" x2="602" y2="173" class="svg-line"/>
<!-- Sibling pointers (leaf linked list) -->
<line x1="125" y1="194" x2="143" y2="194" class="svg-line-p svg-dash"/>
<line x1="253" y1="194" x2="271" y2="194" class="svg-line-p svg-dash"/>
<line x1="381" y1="194" x2="399" y2="194" class="svg-line-p svg-dash"/>
<line x1="509" y1="194" x2="527" y2="194" class="svg-line-p svg-dash"/>
<!-- Packets descending from root -->
<circle class="pkt" cx="350" cy="52" r="5" fill="#8B5CF6"/>
<circle class="pkt" cx="155" cy="132" r="5" fill="#0EA5E9"/>
<circle class="pkt" cx="70" cy="173" r="5" fill="#10B981"/>
</svg>
<p class="diagram-caption">B+tree search is O(log N) — at most 3-4 page reads for a billion-row table. All values live in leaves, linked in sorted order for efficient range scans. Internal nodes only store routing keys.</p>
</div>
<div class="demo-section" id="demo-index">
<div class="demo-header">
<h3>B+Tree Index Search</h3>
<div class="demo-controls">
<input class="demo-input" type="number" data-action="index-search" value="42" min="1" max="99" style="width:80px;min-width:0;">
<button class="demo-btn" data-action="search-btn" style="background:var(--topic-color);color:#fff;border-color:var(--topic-color)">Search</button>
<button class="demo-btn" data-action="range-query">Range: 10–48</button>
<button class="demo-btn danger" data-action="toggle-index">Disable Index</button>
</div>
</div>
<div class="demo-canvas-wrap"></div>
<div class="demo-hint">Purple = traversal path. Green border = leaf nodes. Linked leaves enable range scan without re-traversing the tree.</div>
</div>
<!-- ── Index Type Comparison ──────────────────────────────── -->
<div class="section-label">Comparison</div>
<h2 class="section-title">Index Type Reference</h2>
<div class="table-wrap">
<table class="compare-table">
<thead>
<tr><th>Type</th><th>Equality</th><th>Range</th><th>ORDER BY</th><th>Prefix LIKE</th><th>Array / JSONB</th><th>Size</th></tr>
</thead>
<tbody>
<tr><td>B+Tree</td><td style="color:#10B981">Yes</td><td style="color:#10B981">Yes</td><td style="color:#10B981">Yes</td><td style="color:#10B981">Yes</td><td style="color:#EF4444">No</td><td>Medium</td></tr>
<tr><td>Hash</td><td style="color:#10B981">Yes (O(1))</td><td style="color:#EF4444">No</td><td style="color:#EF4444">No</td><td style="color:#EF4444">No</td><td style="color:#EF4444">No</td><td>Small</td></tr>
<tr><td>GIN</td><td style="color:#10B981">Yes</td><td style="color:#EF4444">No</td><td style="color:#EF4444">No</td><td style="color:#EF4444">No</td><td style="color:#10B981">Yes (containment)</td><td>Large</td></tr>
<tr><td>GiST</td><td style="color:#10B981">Yes</td><td style="color:#10B981">Spatial</td><td style="color:#EF4444">No</td><td style="color:#EF4444">No</td><td style="color:#10B981">Geometric</td><td>Medium</td></tr>
<tr><td>BRIN</td><td style="color:#10B981">Approx</td><td style="color:#10B981">Approx</td><td style="color:#EF4444">No</td><td style="color:#EF4444">No</td><td style="color:#EF4444">No</td><td>Tiny</td></tr>
<tr><td>Full-text (tsvector)</td><td style="color:#10B981">Yes</td><td style="color:#EF4444">No</td><td style="color:#10B981">By rank</td><td style="color:#10B981">Yes</td><td style="color:#EF4444">No</td><td>Large</td></tr>
</tbody>
</table>
</div>
<!-- ── Anti-patterns ──────────────────────────────────────── -->
<div class="section-label">Anti-patterns</div>
<div class="antipatterns">
<div class="antipattern">
<h4>Indexing low-selectivity columns</h4>
<p>An index on a boolean column with 50/50 distribution is rarely used — the planner will choose a full scan when more than ~5–10% of rows match. Partial indexes on the rare value are effective: <code>CREATE INDEX WHERE is_deleted = true</code> indexes only the small minority of deleted rows, giving high selectivity for queries that filter on deleted items.</p>
</div>
<div class="antipattern">
<h4>Functions on indexed columns in WHERE</h4>
<p><code>WHERE LOWER(email) = 'alice@example.com'</code> — the B+tree index on email is NOT used because the index stores the raw value, not LOWER(email). Fix: create an expression index <code>CREATE INDEX ON users (LOWER(email))</code>, or store the pre-lowercased value in a column and index that.</p>
</div>
<div class="antipattern">
<h4>Over-indexing write-heavy tables</h4>
<p>Each index on a table requires a B+tree update on every INSERT, UPDATE (to indexed columns), and DELETE. A table with 15 indexes can have 5–10x slower write throughput. Profile writes under load before adding indexes. Remove unused indexes using <code>pg_stat_user_indexes.idx_scan = 0</code>.</p>
</div>
<div class="antipattern">
<h4>Wrong composite column order</h4>
<p>Index (status, created_at) serves queries filtering by status. Index (created_at, status) serves range queries on created_at. Reversing the order for your actual query pattern means the optimizer falls back to a full scan. Rule: equality predicates come first, range predicates last, ORDER BY columns last if they align.</p>
</div>
<div class="antipattern">
<h4>Using LIKE '%prefix' (leading wildcard)</h4>
<p><code>WHERE name LIKE '%smith'</code> cannot use a B+tree index because the leading wildcard means the search key is not bounded from the left — the index sorts by the leftmost character. Only <code>LIKE 'smith%'</code> (trailing wildcard) is B+tree-compatible. Full-text or trigram indexes (pg_trgm in PostgreSQL) handle leading wildcard searches efficiently.</p>
</div>
</div>
<!-- ── Quiz ──────────────────────────────────────────────── -->
<div class="section-label">Quiz</div>
<div class="quiz-section">
<div class="quiz-question">
<div class="quiz-q-num">Question 1 of 5</div>
<div class="quiz-q-text">In a B+tree index, where are actual data pointers (row locations) stored?</div>
<div class="quiz-options">
<div class="quiz-option" data-correct="false" data-explanation="Root nodes in a B+tree only hold navigation keys to route searches downward."><span class="opt-letter">A</span>Root node only</div>
<div class="quiz-option" data-correct="false" data-explanation="Internal nodes hold navigation keys, not data pointers."><span class="opt-letter">B</span>Internal nodes</div>
<div class="quiz-option" data-correct="true" data-explanation="B+trees store all data pointers in leaf nodes. Internal nodes hold only keys for navigation. Leaf nodes are linked in a doubly-linked list, enabling efficient range scans without returning to internal nodes."><span class="opt-letter">C</span>Leaf nodes</div>
<div class="quiz-option" data-correct="false" data-explanation="There is no 'root leaf' in a standard B+tree."><span class="opt-letter">D</span>Root and leaf nodes equally</div>
</div>
<div class="quiz-feedback"></div>
</div>
<div class="quiz-question">
<div class="quiz-q-num">Question 2 of 5</div>
<div class="quiz-q-text">A covering index means:</div>
<div class="quiz-options">
<div class="quiz-option" data-correct="false" data-explanation="A covering index is specific to particular query columns, not all tables or all queries."><span class="opt-letter">A</span>The index covers all tables in the schema</div>
<div class="quiz-option" data-correct="true" data-explanation="A covering index includes all columns the query needs in the index itself, so the database can answer the query entirely from the index without looking up the table row. This eliminates random I/O to the heap, which is the expensive part."><span class="opt-letter">B</span>All columns needed by a query exist in the index — no table lookup required</div>
<div class="quiz-option" data-correct="false" data-explanation="Fill factor controls how full each index page is on creation, not whether the index covers a query."><span class="opt-letter">C</span>The index has 100% fill factor</div>
<div class="quiz-option" data-correct="false" data-explanation="A covering index can include non-PK columns — in fact, that is its whole purpose."><span class="opt-letter">D</span>The index only covers primary key columns</div>
</div>
<div class="quiz-feedback"></div>
</div>
<div class="quiz-question">
<div class="quiz-q-num">Question 3 of 5</div>
<div class="quiz-q-text">An index on (dept, salary). Which query will use this index efficiently?</div>
<div class="quiz-options">
<div class="quiz-option" data-correct="false" data-explanation="Filtering on salary alone cannot use this index because the index is sorted first by dept. Without constraining dept, the index offers no help."><span class="opt-letter">A</span>WHERE salary > 80000</div>
<div class="quiz-option" data-correct="true" data-explanation="Filtering on dept = 'Engineering' (equality on the leading column) uses the index to narrow to one dept, then a range scan on salary within that dept. This is the leftmost prefix rule in action."><span class="opt-letter">B</span>WHERE dept = 'Engineering' AND salary > 80000</div>
<div class="quiz-option" data-correct="false" data-explanation="Filtering on salary with LIKE does not benefit from the composite index since salary is the second column."><span class="opt-letter">C</span>WHERE salary LIKE '9%'</div>
<div class="quiz-option" data-correct="false" data-explanation="Sorting by salary alone cannot use this index for an efficient sort — the index sorts by dept first."><span class="opt-letter">D</span>ORDER BY salary DESC</div>
</div>
<div class="quiz-feedback"></div>
</div>
<div class="quiz-question">
<div class="quiz-q-num">Question 4 of 5</div>
<div class="quiz-q-text">Which index type is best for a JSONB containment query like WHERE metadata @> '{"color":"red"}'?</div>
<div class="quiz-options">
<div class="quiz-option" data-correct="false" data-explanation="B+tree indexes sort scalar values; they cannot efficiently search inside JSONB structures."><span class="opt-letter">A</span>B+tree</div>
<div class="quiz-option" data-correct="false" data-explanation="Hash indexes only support equality on a whole value — not containment within a JSON document."><span class="opt-letter">B</span>Hash</div>
<div class="quiz-option" data-correct="true" data-explanation="GIN (Generalized Inverted Index) indexes each key-value pair inside the JSONB document separately, enabling fast containment queries (@>). It is the standard index type for JSONB in PostgreSQL."><span class="opt-letter">C</span>GIN</div>
<div class="quiz-option" data-correct="false" data-explanation="BRIN stores min/max per block range — completely unsuitable for JSONB containment."><span class="opt-letter">D</span>BRIN</div>
</div>
<div class="quiz-feedback"></div>
</div>
<div class="quiz-question">
<div class="quiz-q-num">Question 5 of 5</div>
<div class="quiz-q-text">Why does WHERE LOWER(email) = 'alice@example.com' NOT use a B+tree index on the email column?</div>
<div class="quiz-options">
<div class="quiz-option" data-correct="false" data-explanation="Email is a valid column type for indexing."><span class="opt-letter">A</span>Email columns cannot be indexed</div>
<div class="quiz-option" data-correct="true" data-explanation="The B+tree index stores the raw email values. Applying LOWER() to each row at query time transforms the search key differently from what is stored in the index. The index on email cannot be used for LOWER(email) predicates. Solution: create an expression index on LOWER(email)."><span class="opt-letter">B</span>The index stores raw values; LOWER() transforms the search key so the index key ordering no longer matches</div>
<div class="quiz-option" data-correct="false" data-explanation="LOWER() is a perfectly valid SQL function; the issue is that the index does not know about it."><span class="opt-letter">C</span>LOWER() is not supported in SQL WHERE clauses</div>
<div class="quiz-option" data-correct="false" data-explanation="Case sensitivity affects how values compare, but the fundamental issue is that the index does not store LOWER(email)."><span class="opt-letter">D</span>B+tree indexes are case-sensitive and cannot handle lowercase comparisons</div>
</div>
<div class="quiz-feedback"></div>
</div>
</div>
<!-- ── Interview Q&A ──────────────────────────────────────── -->
<div class="section-label">Interview Q&A</div>
<div class="qa-section">
<div class="qa-item">
<div class="qa-q">What is the difference between a clustered and non-clustered index?<span class="qa-chevron">▾</span></div>
<div class="qa-a"><div class="qa-a-inner">A clustered index determines the physical order of rows on disk — the B+tree leaf pages ARE the data pages. In InnoDB, the primary key is always the clustered index. There can be only one per table. Accessing a row by PK requires no extra lookup — the data is right at the leaf. A non-clustered index is a separate B+tree whose leaf pages store (indexed column, PK) pairs. To get row data, you first look up the index entry (getting the PK), then do a second B+tree traversal of the clustered index to get the row. This second lookup is a "key lookup" or "bookmark lookup." Covering indexes avoid this second hop by including all needed columns in the index. PostgreSQL does not have clustered indexes in this sense — all indexes are heap pointers (ctids).</div></div>
</div>
<div class="qa-item">
<div class="qa-q">When should you use a composite index versus separate single-column indexes?<span class="qa-chevron">▾</span></div>
<div class="qa-a"><div class="qa-a-inner">Use a composite index when queries frequently filter on multiple columns together (e.g., always filtering by dept AND hire_date). The composite index supports the combined predicate in a single B+tree traversal and can sort the result. Separate indexes on each column require the optimizer to use "index merge" — scanning both indexes and intersecting the result sets — which is often slower than a full table scan. Use separate indexes when: columns are queried independently in different queries with no consistent pairing; or cardinality of one column is so high that a single-column index gives excellent selectivity alone. The key question: do your queries always combine these columns, or sometimes use them independently?</div></div>
</div>
<div class="qa-item">
<div class="qa-q">How does the query planner decide whether to use an index or do a full table scan?<span class="qa-chevron">▾</span></div>
<div class="qa-a"><div class="qa-a-inner">The planner uses a cost model: it estimates the cost of each candidate plan and picks the cheapest. Cost is in abstract units combining CPU cost and I/O cost. For an index scan: cost = (index height traversals) + (number of matching rows × random_page_cost). For a seq scan: cost = (total table pages × seq_page_cost). Key factors: (1) Selectivity — if the condition matches 40% of rows, random I/O for each row is more expensive than a sequential scan. Typical crossover is 5–15% of rows. (2) Correlation — if a column is physically ordered by the index (e.g., auto-increment ID), random I/O cost is lower because pages are read sequentially. PostgreSQL tracks correlation in <code>pg_stats.correlation</code>. (3) Available memory — if the result fits in work_mem, a sort + scan may beat an index. Use EXPLAIN to verify: always look at estimated vs. actual rows — if they diverge, statistics are stale (run ANALYZE).</div></div>
</div>
<div class="qa-item">
<div class="qa-q">Explain partial indexes with a practical example.<span class="qa-chevron">▾</span></div>
<div class="qa-a"><div class="qa-a-inner">A partial index stores only rows matching a WHERE predicate. Example: an orders table with 100M rows, where 99% have status = 'completed' and 1% have status = 'pending'. An index on status has terrible selectivity (two values, 50/50 for pending/completed). A partial index: <code>CREATE INDEX idx_pending ON orders (created_at) WHERE status = 'pending'</code> — now the index is tiny (1M rows), highly selective, and fast for the most time-critical query (finding pending orders). The query must include <code>WHERE status = 'pending'</code> for the planner to consider this index. Other partial index patterns: index only non-null values, index only undeleted rows, index only the hot partition of a time-series table.</div></div>
</div>
<div class="qa-item">
<div class="qa-q">What is index bloat and how do you fix it?<span class="qa-chevron">▾</span></div>
<div class="qa-a"><div class="qa-a-inner">Index bloat occurs when deleted or updated rows leave "dead" entries in the B+tree that are never compacted. A page with 50% dead entries still costs a full I/O to read. PostgreSQL's VACUUM marks dead tuples and makes space reusable, but does not compact pages or remove empty pages from the B+tree (VACUUM only re-links free space within existing pages). To actually shrink the index: run <code>REINDEX TABLE tablename</code> (locks the table) or <code>REINDEX CONCURRENTLY INDEX indexname</code> (no lock, PG 12+). MySQL's <code>OPTIMIZE TABLE</code> rebuilds clustered and secondary indexes. Detect bloat: in PostgreSQL use <code>pgstatindex()</code> — if <code>avg_leaf_density</code> is below 50%, bloat is significant. Monitor tables with high DELETE rates (log tables, soft-delete tables, job queues) especially.</div></div>
</div>
<div class="qa-item">
<div class="qa-q">What are GiST and GIN indexes? When would you choose each?<span class="qa-chevron">▾</span></div>
<div class="qa-a"><div class="qa-a-inner">GiST (Generalized Search Tree) is a framework for building balanced indexes for complex data types. It supports geometric operations (PostGIS point/polygon intersection), range types (int4range, tsrange overlap), full-text search (tsvector). GiST supports ordering operators and nearest-neighbor queries (<code>ORDER BY point <-> target LIMIT 10</code>). GIN (Generalized Inverted Index) builds an inverted index — maps element values to the set of rows containing them. Ideal for JSONB containment (<code>@></code>), array overlap (<code>&&</code>), and full-text search. GIN queries faster than GiST for containment. GIN is larger and slower to update (writes are deferred to a "pending list" and merged in background). Choose GIN for text search and JSONB; GiST for geometric/spatial queries, range types, and nearest-neighbor.</div></div>
</div>
<div class="qa-item">
<div class="qa-q">How do you find and remove unused indexes?<span class="qa-chevron">▾</span></div>
<div class="qa-a"><div class="qa-a-inner">PostgreSQL tracks index usage in <code>pg_stat_user_indexes.idx_scan</code> — the cumulative count of times this index was used for a scan since last statistics reset. Indexes with idx_scan = 0 after representative workload are candidates for removal. Also check: <code>pg_relation_size(indexrelid)</code> to know how much storage unused indexes consume; and <code>pg_stat_user_tables.n_tup_ins/upd/del</code> to understand the write overhead each unused index is adding. Caution before dropping: (1) statistics reset after PostgreSQL restart, so a short observation window may miss infrequent queries; (2) some indexes enforce uniqueness and are vital even with idx_scan = 0; (3) partial indexes for maintenance jobs may scan rarely but are still needed. Safe removal: rename the index first (<code>ALTER INDEX name RENAME TO name_to_drop</code>), wait 1–2 weeks, then drop if no complaints.</div></div>
</div>
<div class="qa-item">
<div class="qa-q">What is a BRIN index and when is it appropriate?<span class="qa-chevron">▾</span></div>
<div class="qa-a"><div class="qa-a-inner">BRIN (Block Range Index) stores the minimum and maximum value of a column for each range of disk blocks (default 128 pages per range). The index is tiny — just a few kilobytes even for billion-row tables — because it only stores summaries. For a range query, BRIN eliminates block ranges whose min/max cannot contain the search value, then scans the remaining blocks. This is only useful when column values are correlated with physical disk order — e.g., auto-increment IDs (always increasing, so old rows are on old blocks), or append-only timestamp columns in a time-series table. For randomly distributed values (UUID PKs, hashed values), BRIN provides no filtering benefit — every block range's min/max will overlap every query range. Do not use BRIN where a B+tree would be used; use it for large append-only tables where you need very fast writes and can tolerate sequential-scan-like reads bounded by block ranges.</div></div>
</div>
</div>
<div class="section-label">Further Reading</div>
<div class="reading-links">
<a class="reading-link" href="https://use-the-index-luke.com/sql/anatomy" target="_blank">B-Tree Anatomy (Use The Index, Luke)</a>
<a class="reading-link" href="https://www.postgresql.org/docs/current/indexes.html" target="_blank">PostgreSQL Index Types Documentation</a>
<a class="reading-link" href="https://www.postgresql.org/docs/current/pgstatindex.html" target="_blank">pgstatindex — Detecting Index Bloat</a>
<a class="reading-link" href="https://use-the-index-luke.com/sql/where-clause/functions/case-insensitive-search" target="_blank">Case-Insensitive Indexes (UTILU)</a>
</div>
<div class="topic-nav">
<a href="05-transactions.html" class="topic-nav-link"><div class="topic-nav-arrow">←</div><div><div class="topic-nav-label">Previous</div><div class="topic-nav-title">Transactions & ACID</div></div></a>
<a href="07-query-processing.html" class="topic-nav-link next"><div class="topic-nav-arrow">→</div><div><div class="topic-nav-label">Next</div><div class="topic-nav-title">Query Processing</div></div></a>
</div>
</div>
<footer class="site-footer">
<p class="footer-sub"><a href="../index.html">Back to Course</a> — DBMS Illustrated</p>
</footer>
<script src="../js/main.js"></script>
<script src="../js/demos.js"></script>
</body>
</html>