The art of defining the right problem: a fifteen-minute phone call beats a week of code.

Why this book

Before being a book, these were columns in Communications of the ACM, written by a Bell Labs researcher in the 1980s. The 2nd edition (2000) refreshed the machines and added three columns. Steve McConnell called it "a celebration of design in the small": fifteen short essays about the moment before you code.

These notes are based on the book's official companion site, which carries the flagship columns in full (the phone call, the envelope, the strings), the sketches, both epilogs and the appendices. The famous parts are exactly the parts I could read whole.

The ideas that stick

1. The fifteen-minute phone call (and the bitmap)

A programmer calls Bentley: how do I sort a disk file? Bentley starts sketching a disk-based merge sort, about 200 lines and a week of work. Then he asks questions. The file: at most 10 million toll-free phone numbers, 7-digit integers, no duplicates, no attached data, and "about a megabyte" of free memory. Suddenly the problem is not "sort a file"; it is "sort 10 million distinct small integers in 1 MB".

The answer: represent the data as 10 million bits: one array allocated in RAM in a single block, where slot i is 1 if phone number i is in the file. Nothing ever gets sorted: position in the array plays the role of value. Ten million bits is 1.25 MB, and the book shows how to fit them into the available megabyte. Read the input, set the bits, then walk the bitmap once and write the numbers back out, sorted for free. A few dozen lines, a few hours of work, about ten seconds of run time. "The programmer told me about his problem in a phone call; it took us about fifteen minutes to get to the real problem and find the bitmap solution" (Column 1). And the moral, in one line: "defining the problem was about ninety percent of this battle". Saint-Exupéry gets the last word: perfection is reached when there is nothing left to take away.

Ten million numbers, one megabyte, ten seconds: the most famous reframing in programming literature.
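To make the bitmap concrete, here is a minimal Python sketch of the idea, not the book's C. The function name bitmap_sort and the max_value parameter are my own, and it assumes distinct non-negative integers, exactly the constraints the phone call uncovered.

```python
def bitmap_sort(numbers, max_value):
    """Sort distinct non-negative integers below max_value, one bit per possible value."""
    bits = bytearray((max_value + 7) // 8)    # every bit starts at 0
    for n in numbers:
        bits[n >> 3] |= 1 << (n & 7)          # set bit n: "n is in the file"
    # Walk the bitmap in index order: the position IS the value.
    return [i for i in range(max_value) if bits[i >> 3] & (1 << (i & 7))]


print(bitmap_sort([7, 2, 9, 0, 4], 10))   # → [0, 2, 4, 7, 9]
```

With max_value = 10_000_000 the bytearray holds 1,250,000 bytes, the 1.25 MB mentioned above. The sketch skips the extra step needed to squeeze into a strict megabyte.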

2. Aha!: the hand flip and the sorted signature

Column 2 is titled "Aha!": the flash of insight, the instant the right idea turns a hard problem trivial. It collects solutions that feel like magic tricks when you first see them, and obvious once you understand them. The first: rotate an array of n elements left by d positions. Take [1, 2, 3, 4, 5], shift left by d = 2, and you want [3, 4, 5, 1, 2]. The obvious solution copies everything into a fresh array (new[i] = old[(i + d) % n]): it works, but it doubles the memory. Bentley refuses that luxury and sets the real constraint: the same rotation in place, with no second array and only a constant handful of extra bytes. His answer, three reversals and still linear time:

// Step 1: reverse the first d elements
[1, 2, 3, 4, 5]  →  [2, 1, 3, 4, 5]
// Step 2: reverse the remaining n-d elements
[2, 1, 3, 4, 5]  →  [2, 1, 5, 4, 3]
// Step 3: reverse the whole array
[2, 1, 5, 4, 3]  →  [3, 4, 5, 1, 2]  ✓

Zero allocation. The trick: see the array as two blocks, [1, 2] then [3, 4, 5], that you want to swap. Reverse each block, then reverse the whole: the two blocks have traded places without anything ever being copied. Doug McIlroy demonstrated it with his hands: flip left, flip right, flip both.

And in my language, concretely?

The book is in C, but the move is the same in JavaScript. Rotating "in place" means swapping cells inside the same array with a single temp variable, never a second array:

function reverse(a, lo, hi) {
  while (lo < hi) {
    const t = a[lo]; a[lo] = a[hi]; a[hi] = t;   // swap in place, zero allocation
    lo++; hi--;
  }
}
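function rotateLeft(a, d) {          // assumes 0 <= d <= a.length
  const n = a.length;
  reverse(a, 0, d - 1);              // flip the first block
  reverse(a, d, n - 1);              // flip the second block
  reverse(a, 0, n - 1);              // flip the whole: the blocks have traded places
}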

But day to day you would write the clear copy, [...a.slice(d), ...a.slice(0, d)], and you would be right: memory is cheap, arrays are small, readability wins. The in-place trick only earns its cost when n is huge, RAM is tight, or a hot loop makes the garbage collector hurt. Go reads almost identically, with a[lo], a[hi] = a[hi], a[lo] for the swap.

Second trick: find every anagram family in a 230,000-word dictionary. Brute force compares all pairs — 14.7 hours, by the book's own math. The insight fits in one function: give each word a signature by sorting its letters alphabetically.

signature = lambda word: "".join(sorted(word))
signature("deposit")   # → "deiopst"
signature("dopiest")   # → "deiopst"  ← same signature → they're anagrams

Sort all words by their signature: anagrams land next to each other automatically. A three-stage Unix pipeline; the full dictionary runs in 18 seconds by the 2nd edition. The lesson common to both tricks: the right representation makes the algorithm trivial.
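The sort-then-group step can be sketched in a few lines of Python. This is my own illustration rather than the book's Unix pipeline: the name anagram_families is hypothetical, and the word list is a toy stand-in for the dictionary.

```python
from itertools import groupby


def signature(word):
    """A word's letters in alphabetical order: identical for all its anagrams."""
    return "".join(sorted(word))


def anagram_families(words):
    by_signature = sorted(words, key=signature)      # anagrams become neighbors
    groups = (list(g) for _, g in groupby(by_signature, key=signature))
    return [g for g in groups if len(g) > 1]         # drop words with no anagram


words = ["deposit", "dopiest", "pots", "stop", "tops", "pearl"]
print(anagram_families(words))
# → [['deposit', 'dopiest'], ['pots', 'stop', 'tops']]
```

Nothing here compares one word against another: the sort does all the pairing.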

3. The back of the envelope

The most reusable column of the book teaches you to estimate an order of magnitude before building, on the back of a napkin, in thirty seconds. The goal is not an exact figure, it is knowing whether an idea will run in one second or a thousand years, before you write a single line.

Two rules do the work. First, "two answers are better than one": estimate the same quantity two independent ways, and if they agree, trust the result. Bentley estimates the Mississippi's flow twice: once as the river's cross-section (width times depth) times its speed, then as the area of land whose rain drains into it times the yearly rainfall. Both land in the same order of magnitude, so the estimate holds.

Second, keep a few landmarks in your head to convert quickly. The handiest one here: a year is about π × 10⁷ seconds, roughly 31.4 million against a true figure near 31.6 million, so accurate to half a percent. That is Bentley's gem, remembered in a playful form: π seconds is a "nanocentury" (a billionth of a century).
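The landmark is easy to check. A small sketch, assuming a 365.25-day year (my choice, not a figure from the book):

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60       # 31,557,600 with a 365.25-day year
landmark = math.pi * 1e7                       # the rule of thumb

relative_error = abs(SECONDS_PER_YEAR - landmark) / SECONDS_PER_YEAR
nanocentury = 100 * SECONDS_PER_YEAR * 1e-9    # a billionth of a century, in seconds

print(f"year        = {SECONDS_PER_YEAR:,.0f} s")
print(f"pi * 10^7   = {landmark:,.0f} s")
print(f"error       = {relative_error:.2%}")
print(f"nanocentury = {nanocentury:.3f} s")
```

The error comes out just under half a percent, and the nanocentury lands within two hundredths of a second of π.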

The reflex in action. You are tempted to compare every pair in a set of a billion items. A billion squared is 10¹⁸ operations, and at a billion operations per second:

# a billion items compared pairwise
(10⁹)²  =  10¹⁸ operations
10¹⁸ ops ÷ 10⁹ ops/s         =  10⁹ seconds
# a year ≈ π × 10⁷ seconds
10⁹ s ÷ (3.15 × 10⁷ s/yr)    ≈  32 years

Thirty-two years for a loop you thought was harmless: verdict in before lunch, on the back of the envelope, without a single line of code.

The anecdote that justifies the whole column: Bob Martin, who ran a large software shop, is reviewing a proposed system at the Olympics. He times sending himself a one-character message and concludes the design works only if there are "at least a hundred and twenty seconds in each minute". Design rejected; the system that shipped a year later worked flawlessly.

The humbling appendix: a ten-question estimation quiz where you give 90% confidence ranges; most people get 3 to 6 right instead of the expected 9. We are all overconfident, measurably.

4. The TRS-80 that beats the Alpha

One problem (the maximum-sum contiguous subsequence), four algorithms, from cubic to linear. O(n³), "order n cubed", means: double the problem size and the time is multiplied by eight. O(n): the time doubles with the size, no more. (New to Big O? Grokking Algorithms builds it up from scratch, with pictures.) The book then stages the fight nobody forgets. In one corner: the cubic algorithm, compiled C, on a 533 MHz Alpha workstation. In the other: the linear algorithm, interpreted BASIC, on a 1970s TRS-80 running at 2.03 MHz. The crossover lands between n = 1,000 and n = 10,000; past it, the museum piece wins. At n = 1,000,000: 19 years for the Alpha, 5.4 hours for the TRS-80.
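For readers who want to see the two extremes side by side, here is a hedged Python sketch of a cubic and a linear solution to the maximum-sum problem. The function names and the sample input are mine, and both treat the empty run as having sum 0 (an assumption that matters only when every value is negative).

```python
def max_sum_cubic(x):
    """Try every (i, j) pair and re-add the slice each time: O(n^3)."""
    best = 0                                   # the empty run counts as sum 0
    for i in range(len(x)):
        for j in range(i, len(x)):
            best = max(best, sum(x[i:j + 1]))
    return best


def max_sum_linear(x):
    """One scan, remembering the best run that ends at the current position: O(n)."""
    best = ending_here = 0
    for value in x:
        ending_here = max(ending_here + value, 0)   # extend the run or start over
        best = max(best, ending_here)
    return best


data = [31, -41, 59, 26, -53, 58, 97, -93, -23, 84]
print(max_sum_cubic(data), max_sum_linear(data))    # → 187 187
```

Same answer, but the first does on the order of n³ additions and the second does n.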

A 250× slower machine, an interpreted language, and victory anyway: Big O is not academic.
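The crossover can be sanity-checked on the back of an envelope. This sketch backs the two cost coefficients out of the n = 1,000,000 timings quoted above (19 years and 5.4 hours). The pure cubic and pure linear cost models are my simplifying assumption.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600

n = 1_000_000
cubic_coeff = 19 * SECONDS_PER_YEAR / n**3     # seconds per n^3 on the Alpha
linear_coeff = 5.4 * 3600 / n                  # seconds per n on the TRS-80

# The curves meet where cubic_coeff * n^3 == linear_coeff * n.
crossover = math.sqrt(linear_coeff / cubic_coeff)
print(f"crossover near n = {crossover:,.0f}")
```

The result falls between 1,000 and 10,000, which agrees with the range given above.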

The 2nd edition's epilog adds a wink: rerunning the 1st edition's measurements 14 years later, Bentley found his Pentium II "almost exactly one thousand times faster than the venerable VAX", with nearly identical algorithmic coefficients. Machines change by powers of ten; the math does not move.

5. Binary search was published in 1946. A correct one took much longer.

To find a word in a paper dictionary, you don't start at page 1: you open in the middle, and depending on whether the word falls before or after, you only search one half. Binary search does the same in a sorted array: look at the middle element, throw away the half that cannot hold the target, repeat. Each step halves the space, so a million elements are searched in about twenty tries.

A childishly simple idea, and yet: the first binary search was published in 1946, and the first published version without bugs did not appear until 1962. Bentley uses it precisely because it is treacherous. The most famous bug hides in the line that computes the midpoint:

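int search(const int *a, int n, int target) {
  int lo = 0, hi = n - 1;
  while (lo <= hi) {
    // ✗ int mid = (lo + hi) / 2;    lo + hi can exceed the integer ceiling
    int mid = lo + (hi - lo) / 2;    // ✓ same midpoint, never near it
    if (a[mid] == target) return mid;
    if (a[mid] < target) lo = mid + 1; else hi = mid - 1;
  }
  return -1;                         // target is not in the array
}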

Why the trap? A machine integer has a ceiling. On a very large array, the sum lo + hi punches through it and "wraps" to a negative number, silently corrupting the midpoint: an out-of-bounds access, an endless loop or a wrong answer. The form lo + (hi - lo) / 2 gives the exact same number, and the algebra proves it: lo + (hi - lo)/2 = lo + hi/2 - lo/2 = (lo + hi)/2 (the equality survives integer division, because lo is a whole number). But it only ever handles values already within range, since hi - lo is never larger than hi: it never approaches the ceiling.

This precise bug slept in Java's standard library for nine years, spotted only in 2006.
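Python integers never overflow, so to watch the wrap happen I simulate a signed 32-bit integer by hand. The helper wrap_int32 is hypothetical, and the wrapping shown is Java's behavior (in C, signed overflow is undefined, which is worse).

```python
def wrap_int32(value):
    """Wrap a Python int into the signed 32-bit range, two's-complement style."""
    return (value + 2**31) % 2**32 - 2**31


lo, hi = 1_500_000_000, 2_000_000_000    # each fits in a signed 32-bit int

naive_sum = wrap_int32(lo + hi)          # 3,500,000,000 does not fit: it wraps
naive_mid = int(naive_sum / 2)           # truncates toward zero, like C and Java
safe_mid = lo + (hi - lo) // 2           # every intermediate value stays in range

print(naive_sum)   # → -794967296
print(naive_mid)   # → -397483648
print(safe_mid)    # → 1750000000
```

A negative midpoint used as an array index is exactly the kind of failure that only shows up on arrays with more than a billion elements, which is why it can sleep for years.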

Bentley uses this "simple" algorithm to teach a rare discipline: proving code instead of trying it. His tool, the loop invariant: a sentence that must stay true at every turn of the loop, whatever it just did. Here: "if the target is in the array, it lies somewhere between index lo and index hi." Write that sentence first, then derive each line from it. The code stops being a guess you check afterwards; it becomes a proof you build.
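Here is what "write the sentence first" can look like in Python, as a sketch of my own rather than the book's code. The invariant sits in the loop as an assertion. That check is linear-time and defeats the point of binary search, so it is there for teaching only.

```python
def binary_search(a, target):
    """Return an index of target in the sorted list a, or -1 if it is absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        # Invariant: if target is in a, it lies within a[lo..hi].
        assert target not in a or target in a[lo:hi + 1]
        mid = lo + (hi - lo) // 2
        if a[mid] < target:
            lo = mid + 1          # target cannot be in a[lo..mid]
        elif a[mid] > target:
            hi = mid - 1          # target cannot be in a[mid..hi]
        else:
            return mid
    return -1                     # the range is empty, so target is absent


print(binary_search([1, 3, 5, 7, 9, 11], 7))   # → 3
```

Each branch is justified by the invariant: it may only discard indexes that the sorted order proves cannot hold the target.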

Column 5 adds the tooling that makes the proof practical: a test harness (a throwaway program that runs the function on hundreds of cases), assertions (checks that stop dead the moment an invariant breaks), and automated timing. Nine versions of binary search ship with the book, one deliberately wrong, so the harness can prove it catches it. That is already TDD (Test-Driven Development, write the tests before the code), two decades before the term existed, in the C of the day.
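A harness in that spirit fits in a screen of Python. This is a sketch under my own assumptions: the names search_ok, search_broken and harness are hypothetical, and the planted bug is one I chose, not necessarily the one the book ships.

```python
def search_ok(a, t):
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if a[mid] < t:
            lo = mid + 1
        elif a[mid] > t:
            hi = mid - 1
        else:
            return mid
    return -1


def search_broken(a, t):
    """Deliberately wrong: the loop stops one step early."""
    lo, hi = 0, len(a) - 1
    while lo < hi:                              # planted bug: should be lo <= hi
        mid = lo + (hi - lo) // 2
        if a[mid] < t:
            lo = mid + 1
        elif a[mid] > t:
            hi = mid - 1
        else:
            return mid
    return -1


def harness(search, max_len=20):
    """Run search on every small case; return the (array, target) pairs it gets wrong."""
    failures = []
    for n in range(max_len + 1):
        a = list(range(0, 2 * n, 2))            # sorted even numbers: 0, 2, 4, ...
        for t in range(-1, 2 * n + 1):          # hits, gaps, and both ends
            got = search(a, t)
            ok = (got != -1 and a[got] == t) if t in a else (got == -1)
            if not ok:
                failures.append((a, t))
    return failures


print(len(harness(search_ok)))        # → 0
print(harness(search_broken)[0])      # → ([0], 0)
```

The harness covers a few hundred cases in a blink, and the smallest failing case it reports (a one-element array) points straight at the loop condition.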

6. Debugging is an exercise in disbelief

The debugging section is a short anthology of "impossible" bugs that all had a dull explanation. A programmer logs in fine sitting down, never standing up: seated, he touch-types from memory and hits the right keys; standing, he looks at the keyboard, where two keycaps had been swapped, and follows the wrong labels. A Chicago banking system quits the first time it handles data for Ecuador: the program reads the capital's name, Quito, as its quit command. A program that "worked once twice": it handles the first transaction correctly, fails on every later one, and does the same again after a restart, because a variable set up at load time was never reset after the first transaction.

The mindset fits in one image, which Bentley borrows from Rick Lemons: the best debugging lesson of his life came from a magic show, half a dozen impossible tricks in a row. None was truly impossible, and neither is your bug. In both cases you were watching the wrong hand. "Debugging is usually about refusing to believe": rule out the supernatural explanation and the logical one surfaces. Day to day, the bug that only shows up in production, or only after the Friday deploy, has nothing magical about it. Look for what changed: the environment, the cache, the order of requests.

[Illustration: in the front row of a magic show, a developer takes notes while watching the other hand of a magician pulling a dove from a hat. Caption: The best debugging lesson is a magic show: the impossible always has a boring explanation, in the other hand.]

7. Strings of pearls: the Bible, the Iliad and the grandfather of LLMs

Column 15, added in 2000, takes on large texts, and each problem comes down to picking the right data structure. First, count how often each word appears in the King James Bible (789,616 words in all, 29,131 distinct, "the" alone 62,053 times). The natural tool: a hash table, the structure that maps each key to its value in one direct lookup, exactly what a PHP associative array or a JavaScript object is. Here the key is a word, the value its counter. Bentley writes a hand-rolled one, 30 lines, tuned for English words, and it beats the C++ standard library's generic version by a factor of ten. The lesson: when you know the exact shape of your data, a small custom structure crushes the all-purpose tool.
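In Python the all-purpose version of that word counter is a one-liner over the built-in hash table. To be clear, this sketch is the generic tool, not Bentley's hand-tuned structure, and the regular expression is my own rough definition of a "word".

```python
import re
from collections import Counter


def word_counts(text):
    """Count each lowercase word; Counter is a dict, Python's built-in hash table."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


sample = "In the beginning God created the heaven and the earth."
counts = word_counts(sample)
print(counts.most_common(1))   # → [('the', 3)]
print(len(counts))             # → 8 distinct words
```

The key is the word, the value is its counter, and each update is one direct lookup.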

Second problem: find the longest passage that repeats in a text, say the Iliad. Comparing every position against every other would be far too slow. The trick goes through a suffix array. A suffix is the tail of the text from a given position; for "ILIAD" the suffixes are ILIAD, LIAD, IAD, AD, D. Sort them all alphabetically. The point: if a passage appears twice, the two suffixes that start at its two occurrences begin with the same characters, so the sort drops them side by side. All that is left is to compare each suffix to its neighbor and keep the longest shared opening. On the Iliad (807,503 suffixes), 4.8 seconds are enough to unearth a full sentence that Juno speaks and Minerva repeats word for word to Ulysses.
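The sort-and-compare-neighbors idea translates almost word for word into Python. A minimal sketch with names of my own: it copies every suffix to sort it, so it is fine for a paragraph and hopeless for the Iliad, where a C version sorting pointers is the right tool.

```python
def common_prefix_len(s, t):
    """Length of the shared opening of two strings."""
    n = 0
    while n < len(s) and n < len(t) and s[n] == t[n]:
        n += 1
    return n


def longest_repeat(text):
    """Return the longest substring occurring at least twice (overlaps allowed)."""
    # Suffix array: start positions, ordered by the suffix that begins at each one.
    starts = sorted(range(len(text)), key=lambda i: text[i:])
    best = ""
    for a, b in zip(starts, starts[1:]):        # only neighbors need comparing
        n = common_prefix_len(text[a:], text[b:])
        if n > len(best):
            best = text[a:a + n]
    return best


print(longest_repeat("banana"))   # → ana
```

For "banana" the sorted suffixes are a, ana, anana, banana, na, nana, and the neighbors ana and anana share the winning three characters.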

The showstopper: manufacture fake text that sounds like a source. The method is a Markov chain: at each step, look at the last k words produced (k is often 2), then draw the next word at random, weighted by the frequencies seen in the source. If "the sea" is followed by "was" two times out of three in the original, you pick it two times out of three. At order 2 (k = 2), trained on the book itself, the prose it produces is unsettlingly plausible; Bentley notes that "for purposes of parody, order-2 text is usually juiciest." In 2000 you wrote it in an evening. Take that exact principle, inflate it to a few billion parameters, train it on the whole internet, and you get the assistants we talk to all day long. The pearl aged into a pension fund.
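The whole generator really does fit in an evening. Here is a hedged word-level sketch in Python: the names build_model and generate are mine, and the toy source sentence is built so that "the sea" is followed by "was" two times out of three, as in the example above.

```python
import random
from collections import defaultdict


def build_model(words, k=2):
    """Map each run of k consecutive words to the list of words that followed it."""
    model = defaultdict(list)
    for i in range(len(words) - k):
        model[tuple(words[i:i + k])].append(words[i + k])
    return model


def generate(words, k=2, length=20, seed=None):
    rng = random.Random(seed)
    model = build_model(words, k)
    out = list(words[:k])                    # start from the source's opening words
    while len(out) < length:
        followers = model.get(tuple(out[-k:]))
        if not followers:                    # dead end: this state only ends the source
            break
        # Repeats in the list make a uniform pick frequency-weighted.
        out.append(rng.choice(followers))
    return " ".join(out)


source = "the sea was calm and the sea was cold and the sea rose".split()
print(build_model(source)[("the", "sea")])   # → ['was', 'was', 'rose']
print(generate(source, k=2, length=12, seed=0))
```

Storing every follower, duplicates included, is the lazy way to get the weighting: picking uniformly from the list picks "was" two times out of three.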

8. The rules that remain

Both epilogs are fictional interviews where Bentley grills himself ("people who interview themselves shouldn't criticize writing styles"). The six-item design checklist that emerges is the book's DNA: work on the right problem first, look at the data before designing, use the back of the envelope to kill bad ideas early, build a prototype before committing, keep it simple, and strive for elegance.

Each item is obvious in retrospect; each is routinely skipped.

Tom Duff's law gets a full citation as the best answer to "library or hand-rolled?": "whenever possible, steal code." In 2026 that's composer require, npm install and Stack Overflow, but the boundary has shifted: every dependency is now an attack surface and a liability (left-pad, the 2024 xz backdoor), so for a small need the hand-rolled version, which AI makes cheap, often beats the package that drags in forty others. The exception hardens in banking or health: security-critical code, encryption, password hashing, authentication, is always stolen from an audited library, never hand-rolled.

The line that best summarizes the book's aesthetic: "the cheapest, fastest and most reliable components of a computer system are those that aren't there." Every column ends up proving this. The bitmap sort works because no comparisons happen. The anagram pipeline works because one well-chosen transformation turns the grouping into a plain sort. Absence of mechanism is the mechanism.

The code-tuning appendix (Appendix 4) lists measured performance gains on specific operations — with a warning printed in bold: measure on YOUR machine, because the cost model changes with every CPU generation. The appendix's own example: a 12-byte allocation request in C actually consumed 48 bytes, because the allocator rounds up to its internal block size. Knowing the real cost is the whole point.

My take, honestly

I'm a web developer, not an algorithms researcher, and this book made me feel dumb in the best way. The phone-call story is forty years old, and I still catch myself doing what it warns against: rushing to build the feature I was handed instead of asking what the real problem is. How many times have I written the query, the loop, the cache, only to realize one good question in a meeting would have erased half the work? I now keep "fifteen minutes of questions before a week of code" as a literal budget.

What helped me most is the reflex of estimating an order of magnitude before diving in. The loop inside a loop that freezes the page at 10,000 rows, the N+1 query (one extra query per row shown) that passes locally and dies in production, the job you run "just to see" that grinds for three hours: Bentley would have done the math on a napkin and called it in thirty seconds. And the Markov chapter is uncanny to reread in 2026: the direct ancestor of the assistants I talk to all day fits in a few dozen readable lines, just inflated to a few billion parameters.

Not everything aged the same. The numbers are a Pentium-II time capsule, the code is old C that Bentley scolds himself for, and some of the middle columns get skimmed more than read. The chapter proving binary search correct took me two passes, I'll admit. But the reflexes have no expiry date: define the problem, estimate before building, refuse to believe your bugs. And the first has become THE skill of the AI era: an assistant will happily build you the on-disk merge sort for a week; only the human who asks the right questions gets the ten-second answer.

Odilon

Who is it for?

Read it if

You jump to code before interrogating the problem (the phone call is for you)
You can never tell if a thing will take seconds or hours: the envelope fixes that
You enjoyed Grokking Algorithms and want the artisan-grade sequel
You like essays with war stories more than textbooks with theorems

Skip it if

Dated examples break the spell for you: the machines here are museum pieces
You want a complete algorithms reference: it is fifteen essays, not a syllabus
You never touch performance-sensitive code and never will

Going further

The estimating mindset pairs with my Python course for actually timing things. In the library, Grokking Algorithms is the gentle on-ramp to the same material, Write Great Code vol. 2 continues the code-tuning appendix at machine level, and The Pragmatic Programmer shares the estimate-first culture (and the love of small sharp tools).

Programming Pearls