Dr. Dobb's Journal - June 2008 - (Page 40), www.ddj.com

State of the Art

Performance Portable C++
Taking full advantage of new architectures

by Jeff Keasler

Jeff is a computer scientist at Lawrence Livermore National Laboratory where he contributes to several software projects managed through the ASC program.

Programmers have two basic ways of organizing arrays of data; see Figure 1. The performance of each choice can vary greatly as code is ported from machine to machine and compiler to compiler.

It’s easy to switch between the Array-like and Struct-like implementations in Figure 1 by hiding the array details behind a class API. Listing One shows how a coordinate array is implemented as a performance portable Point class. There are two important features of the Point class implementation:

• Methods are inlined.
• Methods return direct references to the underlying data.

Together, these two features let almost all compilers optimize away most (if not all) of the class overhead, especially when interprocedural analysis has been enabled in the compiler optimization flags.

If you use classes having the above form, you can quickly switch between array layouts as you port code. The easiest way to do this is to create a configuration header file with system-specific layout choices, and #include that configuration file at the top of each array class header file. If you don’t hide the array implementation as I describe here, you can end up completely rewriting your software when switching from one form of array layout to the other.

The Benchmarks

To illustrate how array organization can affect performance, I use three examples that calculate area and volume (source-code listings are available online; see “Resource Center,” page 5):

• Triangle area, 8 FLOPs/Triangle; see Listing Two online.
• 2D Quadrilateral area, 8 FLOPs/Quadrilateral; see Listing Three online.
• 3D Brick volume, 60 FLOPs/Brick (Hexahedron); see Listing Four online.

I apply these calculations to hundreds of thousands of mesh elements. I include enough elements in my benchmark to have a memory footprint of over 20 megabytes, but I organize elements in a cache-optimal way, so cache reuse occurs. My benchmarks have two interacting array classes. I use a Point class to store coordinates, and a shape-specific class to store the point indices that define the shape.

The examples I’ve chosen have subtly different memory layouts and memory access patterns. The subtlety helps to emphasize how the interplay of algorithms and data layouts can influence the effectiveness of compiler optimizations in addition to memory latency.

The Results

Figures 2, 3, and 4 show the performance of Listings Two, Three, and Four, respectively. For each bar in the graphs, 20 runs were made, and the minimum time was used. There was little variance among the 20 runs since each run was made on a “dedicated” processor having no other users. Table 1 provides the processor/compiler details of each benchmark. All results within a given bar color are normalized against the minimum time for that color. This lets you measure the relative performance of a given data layout for a given environment.

(a)
    double x[10000] ;
    double y[10000] ;
    double z[10000] ;

(b)
    struct { double x,y,z ; } point[10000] ;

Figure 1: (a) Array-like, (b) Struct-like.

Listing One

    #include <vector>

    #define ML_STRUCT 0
    #define ML_ARRAY  1

    /* POINT_MEM is expected to be set by the configuration header. */
    #if POINT_MEM == ML_ARRAY

    class Point {
    public:
       Point(const int size) : m_x(size), m_y(size) {}
       inline double &x(const int idx) { return m_x[idx] ; }
       inline double &y(const int idx) { return m_y[idx] ; }
    private:
       Point() ;
       std::vector<double> m_x ;
       std::vector<double> m_y ;
    };

    #else /* ML_STRUCT */

    class Point {
    public:
       Point(const int size) : m_p(size) {}
       inline double &x(const int idx) { return m_p[idx].x ; }
       inline double &y(const int idx) { return m_p[idx].y ; }
    private:
       struct Coord { double x, y ; } ;
       Point() ;
       std::vector<Coord> m_p ;
    };

    #endif
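To make the configuration-header advice and the two-class benchmark structure concrete, here is a minimal sketch of how Listing One might be driven by a layout switch and used from a triangle-area kernel. It is not the article's Listing Two, which is only available online. The default value chosen for POINT_MEM, the Triangle class, and the functions triangleArea, totalArea, and demoUnitSquareArea are all hypothetical names introduced for illustration. In a real project the first few preprocessor lines would presumably live in the shared configuration header rather than in the same file.

```cpp
#include <cassert>
#include <vector>

// Assumed contents of a per-system configuration header (hypothetical).
#define ML_STRUCT 0
#define ML_ARRAY  1
#ifndef POINT_MEM
#define POINT_MEM ML_ARRAY   // assumed default; build with -DPOINT_MEM=0 for ML_STRUCT
#endif

#if POINT_MEM == ML_ARRAY
// Array-like layout: one contiguous vector per coordinate.
class Point {
public:
   Point(const int size) : m_x(size), m_y(size) {}
   inline double &x(const int idx) { return m_x[idx] ; }
   inline double &y(const int idx) { return m_y[idx] ; }
private:
   Point() ;
   std::vector<double> m_x ;
   std::vector<double> m_y ;
};
#else /* ML_STRUCT */
// Struct-like layout: x and y of one point sit next to each other.
class Point {
public:
   Point(const int size) : m_p(size) {}
   inline double &x(const int idx) { return m_p[idx].x ; }
   inline double &y(const int idx) { return m_p[idx].y ; }
private:
   struct Coord { double x, y ; } ;
   Point() ;
   std::vector<Coord> m_p ;
};
#endif

// Hypothetical shape class: three point indices per triangle.
class Triangle {
public:
   Triangle(const int size) : m_n0(size), m_n1(size), m_n2(size) {}
   inline int &n0(const int idx) { return m_n0[idx] ; }
   inline int &n1(const int idx) { return m_n1[idx] ; }
   inline int &n2(const int idx) { return m_n2[idx] ; }
   inline int size() const { return static_cast<int>(m_n0.size()) ; }
private:
   Triangle() ;
   std::vector<int> m_n0, m_n1, m_n2 ;
};

// Signed area of triangle t from the cross product of two edge vectors.
// Only the Point and Triangle accessors are used, so this code does not
// change when POINT_MEM changes.
inline double triangleArea(Point &pt, Triangle &tri, const int t)
{
   assert(t >= 0 && t < tri.size()) ;
   const int a = tri.n0(t), b = tri.n1(t), c = tri.n2(t) ;
   const double ux = pt.x(b) - pt.x(a) ;
   const double uy = pt.y(b) - pt.y(a) ;
   const double vx = pt.x(c) - pt.x(a) ;
   const double vy = pt.y(c) - pt.y(a) ;
   return 0.5 * (ux*vy - vx*uy) ;
}

// Sum the areas of all triangles in the mesh.
double totalArea(Point &pt, Triangle &tri)
{
   double sum = 0.0 ;
   for (int t = 0 ; t < tri.size() ; ++t) {
      sum += triangleArea(pt, tri, t) ;
   }
   return sum ;
}

// Tiny example mesh: the unit square split into two triangles.
double demoUnitSquareArea()
{
   Point pt(4) ;
   pt.x(0) = 0.0 ; pt.y(0) = 0.0 ;
   pt.x(1) = 1.0 ; pt.y(1) = 0.0 ;
   pt.x(2) = 1.0 ; pt.y(2) = 1.0 ;
   pt.x(3) = 0.0 ; pt.y(3) = 1.0 ;

   Triangle tri(2) ;
   tri.n0(0) = 0 ; tri.n1(0) = 1 ; tri.n2(0) = 2 ;
   tri.n0(1) = 0 ; tri.n1(1) = 2 ; tri.n2(1) = 3 ;

   return totalArea(pt, tri) ;
}
```

Because the kernel touches coordinates only through x() and y(), rebuilding with the other POINT_MEM value changes the memory layout without any edits to triangleArea or totalArea. That is the property that makes it cheap to time both layouts on each new machine and compiler.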