Stop Misquoting Donald Knuth!

DISCLAIMER: The following are my own personal opinions and are not shared, sanctioned or endorsed by anybody in particular, especially my employer. Any criticisms expressed herein are of a general nature and are neither inspired by nor directed at any person, entity, or product in particular.  Links to other online materials are used only for the purposes of illustration and are not meant to disparage or hurt the feelings of anybody in particular.

The Free Lunch ended a long time ago, and there are indications that the multi-core free lunch won’t last much longer.  Special-purpose accelerators will become much more common, but by definition they are special purpose and cannot solve every problem.  We need to start paying as much attention to the performance, efficiency, and scalability of our code as we do to all the other factors. “Will this waste resources?” is as important a question as “Will this break encapsulation?” or “Will this be maintainable?”  I am not the only person to have these ideas; lots of other people have discussed them relatively recently.  The fact that not all of them are graphics people suggests that we may be seeing the beginnings of a move in the right direction.

Things People Say

There are a few pieces of conventional wisdom that we often hear about code optimization:

  • Optimize code only when necessary
  • Only optimize the bottlenecks, and only after they are identified
  • Micro-optimization is a waste of time
  • Constant factors don’t matter
  • Engineering time costs more than CPU time
  • The machine’s so fast it won’t matter
  • Nobody will notice an occasional delay
  • We can just buy more servers

It’s easy to see where this attitude comes from.

For most of the development cycle, most teams spend most of their time writing code, testing code, reviewing code, debugging code, refactoring code, and ensuring that their functional requirements have been met (all the while reacting to inevitable design changes). Putting it all together takes time, a lot of time, and there are a lot of things competing for our attention. As a result, running time is usually far from people’s minds, because so much effort is required just to get things running at all.

Performance tuning is neither necessary nor desirable once the performance is within the required limits. Considerable time, and thus money, can be wasted chasing small inefficiencies, and in many cases the effort is rightly deemed futile if the running time is already less than the time it takes a human being to notice that the job is done. Adequate is often good enough, and better still if it can be pulled off more quickly and with less effort.

Modern computers are fast, ridiculously fast. Our machines are so absurdly fast that the threshold of adequacy is easily reached. Typical applications can and do waste an incredible amount of their capacity and get away with it.

Donald Knuth once wrote these unfortunate words:

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.”

This statement is often quoted in an attempt to de-emphasize code performance in favor of other factors which are considered more important:

  • Ease of use
  • Rapid development
  • Clarity and readability
  • Reusability
  • Risk of accidental bugs
  • Extensibility
  • Elegance
  • Object-Oriented Purity

There is a whole class of developer out there who rejects any optimization beyond occasional attention to asymptotic complexity.  They believe that their responsibility ends once they’ve picked the theoretically best algorithm, and that at this point things are as good as they’re likely to get.  Sometimes, they don’t even go that far, settling for n log n because somebody already wrote Quicksort for them.  They either won’t bother to do better, or they don’t think they can.  Note that the latter can cause the former.
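
To illustrate why picking the right complexity class isn’t the end of the story, here is a minimal C++ sketch (my own example, not anyone’s production code). Both loops below are O(n), but on typical hardware the contiguous version is many times faster, because the list chases a pointer into a likely-cold cache line for every single element:

    #include <list>
    #include <numeric>
    #include <vector>

    // O(n), contiguous and prefetch-friendly: the whole array streams
    // through the cache.
    long sum_vector(const std::vector<int>& v)
    {
        return std::accumulate(v.begin(), v.end(), 0L);
    }

    // Also O(n), but every node is a separate heap allocation reached
    // through a pointer, so each step can stall on a cache miss.
    long sum_list(const std::list<int>& l)
    {
        return std::accumulate(l.begin(), l.end(), 0L);
    }

Same asymptotic complexity, wildly different constant factors. That difference is exactly the part this class of developer has decided is not their problem.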

The Slippery Slope

The problem with this attitude is that it leads to complacency.  It allows the cycle eaters to multiply out of control.  We gloss over the small things, forgetting that the sum of a large number of small numbers is itself a large number. Waste creeps in, little by little, because we let it.

Bounds checks here, and exceptions there.

Integer overflow checks.

Passing smart pointers by value for no particular reason.

Using the wrong integer type.

Extra indirections just to speed up our compiles.

Extra temporary allocations for stylistic reasons.

Ignoring data locality.

All that stuff Mike Acton wrote about.
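
To make a couple of those concrete, here is a small C++ sketch (the names are invented for illustration). Passing a std::shared_ptr by value means an atomic reference-count increment on entry and a decrement on exit, on every call, even when the callee never shares ownership; and building a result out of temporaries re-allocates on every iteration when a single reserve would do:

    #include <cstddef>
    #include <memory>
    #include <string>
    #include <vector>

    struct Texture { /* pixel data, etc. */ };

    // Wasteful: copying the shared_ptr bumps an atomic refcount on every call,
    // even though this function never stores or shares the pointer.
    void draw_wasteful(std::shared_ptr<Texture> tex);

    // Cheaper: the caller keeps ownership; the function just borrows the object.
    void draw(const Texture& tex);

    // Wasteful: repeated operator+ creates fresh temporary strings (and usually
    // fresh heap allocations) on every iteration.
    std::string join_wasteful(const std::vector<std::string>& parts)
    {
        std::string result;
        for (const auto& p : parts)
            result = result + p + ", ";   // new temporaries each time around
        return result;
    }

    // Cheaper: size it once, then append in place.
    std::string join(const std::vector<std::string>& parts)
    {
        std::size_t total = 0;
        for (const auto& p : parts)
            total += p.size() + 2;
        std::string result;
        result.reserve(total);
        for (const auto& p : parts) {
            result += p;
            result += ", ";
        }
        return result;
    }

None of the cheap versions is harder to write than the wasteful one. The point is that each instance is individually too small to show up on anyone’s radar.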

I’ve listed lots of relatively low-level things up there, but that’s just because it’s the level I work at. The same sorts of mismanagement can and do occur at every level of the stack, and in every language. An extra string copy here, a spurious string->int conversion there. A missing database index, a poor schema design, webpages loading copious amounts of JavaScript. The same mentality on a different scale. We just don’t take small efficiencies seriously, and there are consequences.

The worst sorts of slippery slopes are so close to flat that we don’t realize we’re on them.  Sometimes, our users’ machines are so ridiculously fast that the waste goes unnoticed, but other times, our users experience annoying and unnecessary delays. And still other times, we utterly fail to function under load.

If you tell yourself, “it’s only a malloc, it’s nothing,” and you do this often enough, you will end up with 25,000 temporary allocs for a single keystroke.  They may only take 50 ns each, but I type 529 characters per minute.
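
For a rough sense of scale, using those numbers: 25,000 allocations at 50 ns each is about 1.25 ms of pure allocator overhead per keystroke, all of it spent on work nobody asked for. Here is a minimal sketch of how the pattern creeps in, and of the boring fix of keeping scratch buffers alive instead of rebuilding them per event (all names invented for illustration):

    #include <cstddef>
    #include <string>
    #include <vector>

    // Hypothetical per-keystroke handler. Every call rebuilds its scratch state
    // from nothing: a fresh vector, a fresh string, thrown away microseconds
    // later. Multiply by every layer that does the same thing and you get
    // thousands of allocations per keystroke.
    void on_keystroke_wasteful(char c, std::string& document)
    {
        document.push_back(c);
        std::vector<std::size_t> token_starts;
        token_starts.reserve(64);               // heap allocation
        std::string lowercased;
        lowercased.reserve(document.size());    // another one, growing with the document
        // ... retokenize, recolor, update completion lists ...
    }

    // Same work, but the scratch buffers persist between calls and are simply
    // cleared. clear() keeps the capacity, so once the buffers reach a steady
    // state, a keystroke performs no allocations at all.
    struct KeystrokeContext {
        std::vector<std::size_t> token_starts;
        std::string lowercased;

        void on_keystroke(char c, std::string& document)
        {
            document.push_back(c);
            token_starts.clear();   // capacity retained
            lowercased.clear();     // capacity retained
            // ... same retokenize/recolor work, reusing the buffers ...
        }
    };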

When these sorts of things are pointed out, people ought to respond by fixing the problem, so that they can deliver a better product.  Instead, they typically start arguing with you about why it’s not that big a deal.

Knuth and Small Efficiencies

The irony about Knuth’s quote is that the people who most often quote it would be horrified by its original context. Knuth made this statement in an article in which he argued in favor of the careful use of go-to statements for micro-optimization in performance-intensive loops. He would probably flunk a modern code review.  You can find the quote in the linked PDF, page 8, second column.

Even more ironic is something else that Knuth wrote in the same article. This gem of a paragraph, published in 1974, still rings true 40 years later:

The improvement in speed from Example 2 to Example 2a is only about 12%, and many people would pronounce that insignificant. The conventional wisdom shared by many of today’s software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by penny-wise-and-pound-foolish programmers, who can’t debug or maintain their “optimized” programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn’t bother making such optimizations on a one-shot job, but when it’s a question of preparing quality programs, I don’t want to restrict myself to tools that deny me such efficiencies.

Considering performance in the small is not “premature optimization”; it is simply good engineering, and good craftsmanship. As software engineers, we have largely forgotten our obligation to create quality programs where efficiency is concerned. We have built for ourselves a culture in which constant factors are the compiler’s problem and the hardware’s problem, in which our productivity, comfort, and profit are more important than the effective utilization of the user’s resources, and the quality of their experience under load.

As a user, I am tired of this.

I am tired of slow load times. The ratio of bytes loaded to load time should be very close to the I/O throughput of the machine. If it is not, somebody is wasting my time.  I am tired of programs not stopping immediately, the instant I click the little X, because somebody is traversing a large reference graph and doing lots of itty-bitty deletes. I am tired of seeing progress bars and splash screens.

As a developer, I am tired of my IDE slowing to a crawl when I try to compile multiple projects at a time. I am tired of being unable to trust the default behavior of the standard containers. I am tired of my debug builds being unusably slow by default.

As a citizen of planet Earth, I am tired of all the electricity that gets wasted by organizations who throw hardware at software problems, when a more efficient implementation might allow them to consume much, much less, and spend less money powering it all.

We need to stop wasting our customers’ resources.  We need to build a developer culture in which efficiency loss is not tolerated unless it is incurred consciously, to satisfy a functional requirement or to achieve some tangible benefit beyond mere expediency.  This is not merely a matter of taste or opinion; it is a matter of professional ethics.