MSDN Magazine - October 2008 - (Page 139) KEnny KErr WindoWs With C++ Exploring High-Performance Algorithms In the concurrency space, issues such as coordination, asynchronous behavior, responsiveness, and scalability get the lion’s share of the attention. These are the high-level topics that developers think about when designing applications. But, perhaps due to inexperience or the lack of good performance tools, some equally important topics are often overlooked. A case in point is highperformance algorithms. At the enterprise level, developers dwell on issues such as distributed file systems and caches, clusters, message queues, and databases. But what good are all those things if, at the core, you have inefficient algorithms and data structures? Algorithm efficiency is not as straightforward as you might think. A well-designed algorithm on a single processor can often outperform an inefficient implementation on multiple processors. But a well-designed algorithm today also needs to show measurable scalability and efficiency when multiple processors are available. To complicate matters further, algorithms optimized for a single processor are often difficult to parallelize, whereas slightly less-efficient algorithms can often provide better performance on multiple processors. As an example, I’m going to use Visual C++ and walk through the development of a fairly simple algorithm, but it’s not entirely trivial even if it may seem that way at first glance. Here’s what we need to implement: void MakeGrayscale(BYTE* const const const bitmap, int width, int height, int stride); Figure 1 Inefficient Single-Threaded Algorithm void MakeGrayscale(BYTE* bitmap, const int width, const int height, const int stride) { for (int x = 0; x < width; ++x) for (int y = 0; y < height; ++y) { const int offset = x * sizeof(Pixel) + y * stride; Pixel& pixel = *reinterpret_cast (bitmap + offset); } MakeGrayscale(pixel); } be obtained by blending 30% red, 59% green, and 11% blue for a given color pixel. Here’s a simple function to convert just one pixel to gray scale: void MakeGrayscale(Pixel& pixel) { const BYTE scale = static_cast (0.30 * pixel.Red + 0.59 * pixel.Green + 0.11 * pixel.Blue); pixel.Red = scale; pixel.Green = scale; pixel.Blue = scale; } The byte offset of a particular pixel within the bitmap can be obtained by calculating the product of its horizontal position and the size of a pixel, along with the product of its vertical position and the stride, and then adding these values together: offset = x * sizeof(Pixel) + y * stride The bitmap parameter points to an image consisting of 32 bits per pixel. Again, this is to keep this article focused. The absolute value of the stride indicates the number of bytes from one row of pixels in memory to the next. Padding may exist at the end of each row. The stride’s sign indicates whether these rows are top-down with a positive stride or bottom-up with a negative stride in memory. Let’s get the basics out of the way first. We can use the following structure to represent a pixel in memory: typedef unsigned char BYTE; // from windef.h struct Pixel { BYTE Blue; BYTE Green; BYTE Red; BYTE Alpha; }; How then might you implement the MakeGrayscale function? If you jumped right in without further consideration you might write something along the lines of the algorithm in Figure 1. At first glance this might look reasonable and, if passed, a small bitmap might even appear to work quite well. But what about larger bitmaps? What about a 20,000 pixel by 20,000 pixel bitmap? I happen to have a Dell PowerEdge with a Quad-Core Intel Xeon X3210 processor. This machine has a clock speed of 2.13GHz, a 1066MHz front-side bus, 8MB of L2 cache, as well as a variety of other cool features. Admittedly it’s not the very latest Intel Xeon processor, but it is certainly nothing to scoff at. It’s being driven by the 64-bit version of Windows Server 2008 and will do nicely for performance testing. Send your questions and comments to mmwincpp@microsoft.com. October 2008 139 A quick Web search reveals that a reasonable gray scale value can
For optimal viewing of this digital publication, please enable JavaScript and then refresh the page. If you would like to try to load the digital publication without using Flash Player detection, please click here.