“More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason – including blind stupidity.”
– W.A. Wulf
“On the other hand, we cannot ignore efficiency.”
– Jon Bentley
Many software engineers recommend what I call the “procrastination approach” to optimization. Delay optimization as much as possible, and don’t do it if you can avoid it. I agree with the basic premise. Optimizing too early or too often is not a good approach to engineering. Better to have a program that runs than a fast program that crashes. On the other hand, you’re not likely to write a successful app these days without doing optimization at some point in the process. Your compiler can help you, but you as a programmer understand more about your application than the compiler. As Michael Abrash puts it, the best compiler is “between your ears.”
There are many levels of optimization, but I’m going to focus on one in particular: C++ optimizations. Some of these techniques apply to other languages as well – like Java – but most are specific to C++. I’ll also cover how to configure your compiler for maximum C++ efficiency.
I originally wrote this document in 1998. A lot has changed since then, though many of the techniques listed are still relevant and valuable depending on your platform and compiler. As always, never accept optimization techniques at face value. Measure, measure, measure.
All of the examples are in C++. The code is designed to compile with any ANSI C++-compliant compiler. Some of the more complex techniques involve templates and the Standard Template Library. I used Microsoft Visual C++ 6.0 for the example programs, targeting PCs running Microsoft Windows 95/98 or NT.
Except where noted, all benchmarks and profiling were done on a 400 MHz Pentium II Dell Dimension XPS400 running NT 4.00.1381. Most profiling runs were done with compiler optimizations disabled, to keep compiler-specific options from influencing the results.
All performance graphs show relative performance. If the unoptimized run takes 200 ms and the optimized run takes 100 ms, the bar for the optimized run will be twice as tall as the bar for the unoptimized run (i.e. twice as fast). In other words, taller is better.
Most code examples use the following C++ objects for comparison:
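As a rough illustration of how such a relative figure can be produced, here is a minimal sketch. It assumes a C++11 compiler for `<chrono>`, and the names `relative_performance` and `time_ms` are hypothetical helpers, not part of the original benchmark code.

```cpp
#include <chrono>

// Relative performance as plotted in the graphs: baseline time divided by
// optimized time, so 200 ms versus 100 ms gives 2.0 (a bar twice as tall).
double relative_performance(double baseline_ms, double optimized_ms) {
    return baseline_ms / optimized_ms;
}

// Hypothetical helper: call `work` `iterations` times and return the
// elapsed wall-clock time in milliseconds.
template <typename Work>
double time_ms(Work work, int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Timing many iterations rather than a single call helps keep clock resolution from dominating the result.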
- string (standard C++ basic_string<char> class with an average of 32 characters per string)
- complex (standard C++ complex<double> class containing two double values)
- bitmap (bitmap class with expensive default and copy ctor; average of 10000 pixels)
There’s a right way and a wrong way to do optimization. Here are some strategies, common to all programming endeavors, that work, and some that don’t.
- Optimization Strategies that Bomb
- Optimization Strategies that Work
- Example of Selecting the Proper Algorithm
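To give a flavor of what algorithm selection means in practice, here is a small sketch of my own (not the example from that section). It assumes the data can be kept sorted, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Linear search: O(n) comparisons, works on unsorted data.
bool contains_linear(const std::vector<int>& values, int key) {
    return std::find(values.begin(), values.end(), key) != values.end();
}

// Binary search: O(log n) comparisons, but `sorted` must already be
// in ascending order.
bool contains_sorted(const std::vector<int>& sorted, int key) {
    return std::binary_search(sorted.begin(), sorted.end(), key);
}
```

For large containers that are searched often, the change of algorithm typically matters far more than any tuning of the linear loop. As always, measure.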
When you start working on your next app and begin to think about coding conventions, compilers, libraries, and general C++ issues, there are many factors to consider. In this section I weigh performance issues involved with C++ design considerations.
- Take advantage of STL containers
- Consider using references instead of pointers
- Consider two-phase construction
- Limit exception handling
- Avoid Runtime Type Identification
- Prefer stdio to iostream
- Evaluate alternative libraries
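As one illustration of the items above, here is a minimal sketch of the reference-versus-pointer choice. The function names are hypothetical, and whether the difference is measurable depends on your compiler and call sites.

```cpp
#include <cstddef>
#include <string>

// Pointer version: the callee has to cope with a null argument.
std::size_t length_via_pointer(const std::string* s) {
    return s ? s->size() : 0;
}

// Reference version: a reference must bind to an object, so the
// null check disappears from the function.
std::size_t length_via_reference(const std::string& s) {
    return s.size();
}
```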
Defy the software engineering mantra of “optimization procrastination.” These techniques can be added to your code today! In general, these methods not only make your code more efficient, but increase readability and maintainability, too.
- Pass class parameters by reference
- Postpone variable declaration as long as possible
- Prefer initialization over assignment
- Use constructor initialization lists
- Prefer operator= over operator alone (for example, += over +)
- Use prefix operators
- Use explicit constructors
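Several of these habits fit in a few lines. The sketch below is only an illustration: the `Employee` class and `total_name_length` function are hypothetical, not taken from the example programs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class Employee {
public:
    // explicit: blocks silent std::string -> Employee conversions.
    // const reference parameter: the argument is not copied at the call.
    // Initialization list: name_ is copy-constructed once, instead of
    // being default-constructed and then assigned.
    explicit Employee(const std::string& name) : name_(name) {}

    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// The vector is passed by const reference, so no Employee is copied.
std::size_t total_name_length(const std::vector<Employee>& staff) {
    std::size_t total = 0;
    // The iterator is declared where it is first needed, and the prefix
    // ++ does not have to return a copy of the old iterator.
    for (std::vector<Employee>::const_iterator it = staff.begin();
         it != staff.end(); ++it) {
        total += it->name().size();  // += rather than total = total + ...
    }
    return total;
}
```

None of these choices makes the code harder to read, which is why they can be applied from the start.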
Your app is up and running. The data structures are ideal, the algorithms sublime, the code elegant, but the program – well, it’s not quite living up to its potential. Time to get drastic, and with drastic measures, there are tradeoffs to consider. These optimizations are going to make your code less modular, harder to understand, and more difficult to maintain. They may cause unexpected side effects like code bloat. Your compiler may not even be able to handle some of the more advanced template-based techniques. Proceed with caution. Arm yourself with a good profiler.
- Inline functions
- Avoid temporary objects: the return value optimization
- Be aware of the cost of virtual functions
- Return objects via reference parameters
- Consider per-class allocation
- Consider STL container allocators
- The “empty member” optimization
- Template metaprogramming
- Copy on write
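For readers who have not met template metaprogramming, the classic compile-time factorial gives the idea. This is a generic sketch and not necessarily the example used in that section.

```cpp
// Primary template: Factorial<N>::value is N * Factorial<N - 1>::value,
// computed by the compiler while instantiating the templates.
template <unsigned N>
struct Factorial {
    static const unsigned long value = N * Factorial<N - 1>::value;
};

// Full specialization for 0 ends the recursion.
template <>
struct Factorial<0> {
    static const unsigned long value = 1;
};
```

An expression such as `Factorial<5>::value` is a compile-time constant, so no multiplication is left to do at run time. The price is longer compile times and code that is harder to read and debug.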
A good compiler can have a huge effect on code performance. Most PC compilers are good, but not great, at optimization. Be aware that sometimes the compiler won’t perform optimizations even though it can. The compiler assigns a higher priority to producing consistent and correct code than optimizing performance. Be thankful for small favors.
- C language settings
- C++ language settings
- The “ultimate” compiler settings
- Use the novtable option for abstract classes (Microsoft Visual C++)
- Indicate functions that don’t throw exceptions
- Use the fastcall calling convention (Microsoft Visual C++)
- Warning: Unsafe optimizations
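On the item about functions that don’t throw: with a C++11 or later compiler this is written as `noexcept`. In 1998 the available spelling was the empty exception specification `throw()`, which has since been deprecated. The function below is a hypothetical example, and how much the compiler gains from the promise varies by compiler.

```cpp
// noexcept promises that no exception will escape this function, which
// can let the compiler omit exception-handling bookkeeping at call sites.
int clamp_to_byte(int v) noexcept {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}
```

If a function marked this way does let an exception escape, the program is terminated, so reserve the annotation for functions that genuinely cannot throw.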
- STL Container efficiency table
- Relative costs of common programming operations
- C code tuning and C++ efficiency resources
Pete Isensee has worked on products ranging from database software to Internet applications to games. He’s even done all three simultaneously, creating multiplayer Internet games at WON.net and Emerald City Technologies. Pete has his degree in Computer Engineering. He’s been programming in C since 1986 and in C++ since 1993, focusing on class design, templates, robustness and optimization technology.
Pete’s homepage is www.tantalon.com
Special thanks to Brian Fiete, Melanie McClaire, Brian Ruud, the WON Viper and Titan teams, the HyperBole X-Files engineering team, my favorite gurus Steve McConnell and Scott Meyers, and my favorite girls Kristi, Ali and Tayla.