Dr. Dobb's Journal - April 2008 - (Page 20) d04scarp_p4ds 2/13/08 11:45 AM Page 20 Core Technology by Daniele Paolo Scarpazza, Oreste Villa, and Fabrizio Petrini String Search on Multicore Processors Mapping fundamental algorithms onto parallel hardware While your current PC may be dual- or quad-core, the one sitting on your desk in a couple of years Daniele, Oreste, and Fabrizio were affiliated with the Pacific Northwest National Laboratory when this article was written. Daniele and Fabrizio have since moved to the Cell Solutions department of IBM. They can currently be contacted at scarpaz@scarpaz.com. will be a “many-core.” Multicores are the future, and because programming them is so different than programming for traditional processors, you must prepare for the change now, especially in mapping basic algorithms efficiently onto the new hardware. String searching is one of these basic algorithms. It has a host of applications, including search engines, network intrusion detection, virus scanners, spam filters, and DNA analysis, among others. The Cell processor, with its multiple cores, promises to speed-up string searching a lot. In this article, we show how we mapped string searching efficiently on the Cell. We present two implementations: • The fast implementation supports a small dictionary size (approximately 100 patterns) and provides a throughput of 40 Gbps, which is 100 times faster than reference implementations on x86 architectures. • The heavy-duty implementation is slower (3.3–4.3 Gbps), but supports dictionaries with tens of thousands of strings. This task is not trivial. We had to change our algorithm significantly to reach top performance. In particular, to exploit the memory subsystem at its best, we employ a pipelined parallelization strategy, and we shuffle the data layout to fight congestion—techniques that are unfamiliar to most programmers of traditional architectures. The source code that implements the techniques we present here is available electronically; see “Resource Center,” page 5. tion, which checks the payload against a database containing signatures of a large number of threats. But with network links getting faster (10 Gbps Ethernet, for instance), performing on-the-fly deeppacket inspection isn’t child’s play. For example, running the Snort (www.snort.org) intrusion-detection system on a Pentium 4 PC only ensures a filtering throughput between 200–400 Mbps. This is why network appliance manufacturers often employ specialized hardware like Field-Programmable Gate Arrays (FPGAs) or Application-Specific Instruction Processors (ASIPs). The Cell processor is a new player in this field, with the potential to offer lots of computing power at low price. In fact, its large production volumes (it’s used in the Sony PlayStation 3) contribute to keeping its price low. Programming the Cell: Rethinking Bottom-Up The Cell is so different from traditional processors that you can’t get the best out of it unless your code explicitly takes advantage of its hardware. The Cell contains nine processors—one Power Processing Element (PPE), a 64-bit PowerPC-like processor that is mainly used to run the operating system and load the tasks, and eight worker elements called “Synergistic Processing Elements” (SPEs). SPEs have 128-bit-wide registers and Single Instruction Multiple Data (SIMD) instructions. Why Is String Searching So Important? The Internet is a dirty place, with malware, spyware, spam, and viruses trying to penetrate your systems through all possible vulnerabilities. Undesired traffic cannot be filtered anymore on the basis of header information. What’s needed is“deep-packet inspec20 Dr. Dobb’s Journal l www.ddj.com l April 2008 http://www.snort.org http://www.ddj.com
Table of Contents Feed for the Digital Edition of Dr. Dobb's Journal - April 2008 Dr. Dobb's Journal - April 2008 Contents Hmmmm Alia Vox Developer Diaries Dr. Dobb's Excellence in Programming Award Conversations Fast String Search on Multicore Processors The Byzantine Generals Problem Optimizing Math-Intensive Applications with Fixed-Point Arithmetic Random Numbers in a Range Using Generic Programming The Agile Edge Effective Concurrency Swaine's Flames Dr. Dobb's Journal - April 2008 Dr. Dobb's Journal - April 2008 - Dr. Dobb's Journal - April 2008 (Page Cover1) Dr. Dobb's Journal - April 2008 - Dr. Dobb's Journal - April 2008 (Page Cover2) Dr. Dobb's Journal - April 2008 - Dr. Dobb's Journal - April 2008 (Page 1) Dr. Dobb's Journal - April 2008 - Dr. Dobb's Journal - April 2008 (Page 2) Dr. Dobb's Journal - April 2008 - Dr. Dobb's Journal - April 2008 (Page 3) Dr. Dobb's Journal - April 2008 - Contents (Page 4) Dr. Dobb's Journal - April 2008 - Contents (Page 5) Dr. Dobb's Journal - April 2008 - Hmmmm (Page 6) Dr. Dobb's Journal - April 2008 - Hmmmm (Page 7) Dr. Dobb's Journal - April 2008 - Hmmmm (Page 8) Dr. Dobb's Journal - April 2008 - Hmmmm (Page 9) Dr. Dobb's Journal - April 2008 - Alia Vox (Page 10) Dr. Dobb's Journal - April 2008 - Alia Vox (Page 11) Dr. Dobb's Journal - April 2008 - Developer Diaries (Page 12) Dr. Dobb's Journal - April 2008 - Developer Diaries (Page 13) Dr. Dobb's Journal - April 2008 - Developer Diaries (Page 14) Dr. Dobb's Journal - April 2008 - Developer Diaries (Page 15) Dr. Dobb's Journal - April 2008 - Dr. Dobb's Excellence in Programming Award (Page 16) Dr. Dobb's Journal - April 2008 - Dr. Dobb's Excellence in Programming Award (Page 17) Dr. Dobb's Journal - April 2008 - Conversations (Page 18) Dr. Dobb's Journal - April 2008 - Conversations (Page 19) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 20) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 21) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 22) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 23) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 24) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 25) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 26) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 27) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 28) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 29) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 30) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 31) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 32) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 33) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 34) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 35) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 36) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 37) Dr. Dobb's Journal - April 2008 - Optimizing Math-Intensive Applications with Fixed-Point Arithmetic (Page 38) Dr. Dobb's Journal - April 2008 - Optimizing Math-Intensive Applications with Fixed-Point Arithmetic (Page 39) Dr. Dobb's Journal - April 2008 - Optimizing Math-Intensive Applications with Fixed-Point Arithmetic (Page 40) Dr. Dobb's Journal - April 2008 - Optimizing Math-Intensive Applications with Fixed-Point Arithmetic (Page 41) Dr. Dobb's Journal - April 2008 - Optimizing Math-Intensive Applications with Fixed-Point Arithmetic (Page 42) Dr. Dobb's Journal - April 2008 - Optimizing Math-Intensive Applications with Fixed-Point Arithmetic (Page 43) Dr. Dobb's Journal - April 2008 - Random Numbers in a Range Using Generic Programming (Page 44) Dr. Dobb's Journal - April 2008 - Random Numbers in a Range Using Generic Programming (Page 45) Dr. Dobb's Journal - April 2008 - Random Numbers in a Range Using Generic Programming (Page 46) Dr. Dobb's Journal - April 2008 - Random Numbers in a Range Using Generic Programming (Page 47) Dr. Dobb's Journal - April 2008 - Random Numbers in a Range Using Generic Programming (Page 48) Dr. Dobb's Journal - April 2008 - The Agile Edge (Page 49) Dr. Dobb's Journal - April 2008 - The Agile Edge (Page 50) Dr. Dobb's Journal - April 2008 - The Agile Edge (Page 51) Dr. Dobb's Journal - April 2008 - Effective Concurrency (Page 52) Dr. Dobb's Journal - April 2008 - Effective Concurrency (Page 53) Dr. Dobb's Journal - April 2008 - Effective Concurrency (Page 54) Dr. Dobb's Journal - April 2008 - Effective Concurrency (Page 55) Dr. Dobb's Journal - April 2008 - Swaine's Flames (Page 56) Dr. Dobb's Journal - April 2008 - Swaine's Flames (Page Cover3) Dr. Dobb's Journal - April 2008 - Swaine's Flames (Page Cover4)
For optimal viewing of this digital publication, please enable JavaScript and then refresh the page. If you would like to try to load the digital publication without using Flash Player detection, please click here.