Dr. Dobb's Journal - June 2008 - (Page 50)

Effective Concurrency (continued from page 48)

(see Figure 2). So when you ask for just one byte on a line that’s not currently in cache, you incur two main costs:

• Speed: A cache miss where the cache hardware has to load the entire line from memory.
• Space: Cache overhead for storing the entire line in cache, even if you only ever touch one byte from the line.

And now comes the fun part. On multicore hardware, if one core writes to a byte of memory, then typically, as part of the hardware’s cache coherency protocol, that core will automatically (read: invisibly) take an exclusive write lock on that cache line. The good news is that this prevents other cores from causing trouble by trying to perform conflicting writes. The sad news is that it also means, well, taking a lock.

Sharing and False Sharing (Ping-Pong)

Consider the following code where two threads update two distinct global integers x and y, and assume we’ve disabled optimizations to prevent the optimizer from eliminating the loops entirely in this toy example:

    // Thread 1
    for( i = 0; i < MAX; ++i ) {
      ++x;
    }

    // Thread 2
    for( i = 0; i < MAX; ++i ) {
      ++y;
    }

Question: What relative performance would you expect if running Thread 1 in isolation versus running both threads:

• On a machine with one core?
• On a machine with two or more cores?

On a machine with one core, the program would probably take twice as long to run, as we’d probably get the same throughput (additions/sec), maybe with a little overhead for context switches as the operating system schedules the two threads interleaved on the single core.

On a machine with two or more cores, we’d probably expect to get a 2x throughput improvement as the two threads each run at full speed on their own cores. And that is what in fact will happen…but only if x and y are on different cache lines.
If x and y are on the same cache line, however, only one core can be updating the cache line at a time, because only one core can have exclusive access at a time; it’s as if the cache line is a token being passed between the threads/cores to say who is currently allowed to run. So the situation is exactly as if we had explicitly written:

    // Thread 1
    for( i = 0; i < MAX; ++i ) {
      lightweightMutexForXandY.lock();
      ++x;
      lightweightMutexForXandY.unlock();
    }

    // Thread 2
    for( i = 0; i < MAX; ++i ) {
      lightweightMutexForXandY.lock();
      ++y;
      lightweightMutexForXandY.unlock();
    }

Which of course is exactly what we said we would never willingly do: Only one thread can make progress at a time. This effect is called “false sharing” because, even though the cores are trying to update different parts of the cache line, that doesn’t matter; the unit of sharing is the whole line, and so the performance effect is the same as if the two threads were trying