Dr. Dobb's Journal - September 2008 - (Page 37) d09kraus_p2db 7/14/08 11:00 AM Page 37 by Kirk J. Krauss State of the Art Matching Wildcards: An Algorithm A simple wildcard text-matching algorithm in a single while loop Some programs need to compare a lot of text strings quickly. Text output filters, search engines, and natural-language processors are examples of applications where rapid text string comparison is important. Besides simple character-by-character comparisons, many text comparison routines need to support matching on wildcard characters. One of the strings might treat the “*” character as a special wildcard character; for example, *sip* should match mississippi. Kirk is a software developer working on IBM Rational PurifyPlus. He can be contacted at kjkrauss@us.ibm.com. When I searched for a textbook wildcard string-matching algorithm that would fit this need, I discovered that a good algorithm is harder to find than I had expected. The algorithms that I found posted on the Internet were buggy, slow, and/or needlessly complex—typically all three. So I developed my own. Some colleagues then told me that I could have found similar algorithms among open-source regular expression matching routines. I looked at some of those routines, but the ones that I found use recursion. I find recursive algorithms difficult to understand and debug. They’re also exposed to stack overflows; for example, if the input strings are lacking proper termination. It’s difficult, if not impossible, to prevent stack overflows in a recursive algorithm and still keep it performing fast—and stack overflows are often fatal even in the face of exception-handling logic. When no stack space is available, it’s tricky to do so much as output a diagnostic message. There’s no such difficulty with my nonrecursive approach. Listing One is a coded-up version of my algorithm, which does all of its processing in a single while loop. It not only handles wildcard (“*”) characters, but also alternate string-termination characters. And it does all of this in a minimal number of clock cycles. It’s relatively compact and understandable, compared to the wildcard comparison routines I found by searching. The two text strings passed as input into the algorithm include a tame string, which contains no wildcard characters, and a wild string that may contain one or more wildcard characters. The two strings are traversed in parallel during processing. In Listing One, the first if clause in the loop checks for termination of the tame string. If the wild string has additional characters by the time processing reaches the end of the tame string, and if at least one of those additional characters in the wild string is not a wildcard character, then the strings don’t match. Most of the remaining code is in the big else clause corresponding to this first if clause. In the big else clause, the first thing that happens is that an instant character from each string (both tame and wild) is lowercased, if the caller has specified case insensitivity. The core wildcard matching code is found in the several lines of code following the lowercasing operation. This core code executes only when the instant characters from the tame and wild strings don’t match. One of those September 2008 l www.ddj.com l Dr. Dobb’s Journal 37 http://www.ddj.com
Table of Contents Feed for the Digital Edition of Dr. Dobb's Journal - September 2008 Dr. Dobb's Journal - September 2008 Contents Friday Night Fish Fry Alia Vox Developer Diaries Developer’s Notebook A Conversation With Erik Demaine Application Lifecycle Management Meets Model-Driven Development Building a Robust Development Environment Real Users Really Matter Matching Wildcards: An Algorithm The Android Mobile Phone Platform Managing Application Thread Use Signalling Integer Overflows in Java .NET Development & the IBM WebSphere Portal Server The Agile Edge Effective Concurrency Swaine’s Flames Dr. Dobb's Journal - September 2008 Dr. Dobb's Journal - September 2008 - Dr. Dobb's Journal - September 2008 (Page Cover1) Dr. Dobb's Journal - September 2008 - Dr. Dobb's Journal - September 2008 (Page Cover2) Dr. Dobb's Journal - September 2008 - Dr. Dobb's Journal - September 2008 (Page 1) Dr. Dobb's Journal - September 2008 - Dr. Dobb's Journal - September 2008 (Page 2) Dr. Dobb's Journal - September 2008 - Dr. Dobb's Journal - September 2008 (Page 3) Dr. Dobb's Journal - September 2008 - Contents (Page 4) Dr. Dobb's Journal - September 2008 - Contents (Page 5) Dr. Dobb's Journal - September 2008 - Friday Night Fish Fry (Page 6) Dr. Dobb's Journal - September 2008 - Friday Night Fish Fry (Page 7) Dr. Dobb's Journal - September 2008 - Friday Night Fish Fry (Page 8) Dr. Dobb's Journal - September 2008 - Friday Night Fish Fry (Page 9) Dr. Dobb's Journal - September 2008 - Alia Vox (Page 10) Dr. Dobb's Journal - September 2008 - Alia Vox (Page 11) Dr. Dobb's Journal - September 2008 - Developer Diaries (Page 12) Dr. Dobb's Journal - September 2008 - Developer Diaries (Page 13) Dr. Dobb's Journal - September 2008 - Developer’s Notebook (Page 14) Dr. Dobb's Journal - September 2008 - Developer’s Notebook (Page 15) Dr. Dobb's Journal - September 2008 - A Conversation With Erik Demaine (Page 16) Dr. Dobb's Journal - September 2008 - A Conversation With Erik Demaine (Page 17) Dr. Dobb's Journal - September 2008 - A Conversation With Erik Demaine (Page 18) Dr. Dobb's Journal - September 2008 - A Conversation With Erik Demaine (Page 19) Dr. Dobb's Journal - September 2008 - Application Lifecycle Management Meets Model-Driven Development (Page 20) Dr. Dobb's Journal - September 2008 - Application Lifecycle Management Meets Model-Driven Development (Page 21) Dr. Dobb's Journal - September 2008 - Application Lifecycle Management Meets Model-Driven Development (Page 22) Dr. Dobb's Journal - September 2008 - Application Lifecycle Management Meets Model-Driven Development (Page 23) Dr. Dobb's Journal - September 2008 - Application Lifecycle Management Meets Model-Driven Development (Page 24) Dr. Dobb's Journal - September 2008 - Application Lifecycle Management Meets Model-Driven Development (Page 25) Dr. Dobb's Journal - September 2008 - Building a Robust Development Environment (Page 26) Dr. Dobb's Journal - September 2008 - Building a Robust Development Environment (Page 27) Dr. Dobb's Journal - September 2008 - Building a Robust Development Environment (Page 28) Dr. Dobb's Journal - September 2008 - Building a Robust Development Environment (Page 29) Dr. Dobb's Journal - September 2008 - Building a Robust Development Environment (Page 30) Dr. Dobb's Journal - September 2008 - Building a Robust Development Environment (Page 31) Dr. Dobb's Journal - September 2008 - Real Users Really Matter (Page 32) Dr. Dobb's Journal - September 2008 - Real Users Really Matter (Page 33) Dr. Dobb's Journal - September 2008 - Real Users Really Matter (Page 34) Dr. Dobb's Journal - September 2008 - Real Users Really Matter (Page 35) Dr. Dobb's Journal - September 2008 - Real Users Really Matter (Page 36) Dr. Dobb's Journal - September 2008 - Matching Wildcards: An Algorithm (Page 37) Dr. Dobb's Journal - September 2008 - Matching Wildcards: An Algorithm (Page 38) Dr. Dobb's Journal - September 2008 - Matching Wildcards: An Algorithm (Page 39) Dr. Dobb's Journal - September 2008 - The Android Mobile Phone Platform (Page 40) Dr. Dobb's Journal - September 2008 - The Android Mobile Phone Platform (Page 41) Dr. Dobb's Journal - September 2008 - The Android Mobile Phone Platform (Page 42) Dr. Dobb's Journal - September 2008 - The Android Mobile Phone Platform (Page 43) Dr. Dobb's Journal - September 2008 - The Android Mobile Phone Platform (Page 44) Dr. Dobb's Journal - September 2008 - The Android Mobile Phone Platform (Page 45) Dr. Dobb's Journal - September 2008 - The Android Mobile Phone Platform (Page 46) Dr. Dobb's Journal - September 2008 - The Android Mobile Phone Platform (Page 47) Dr. Dobb's Journal - September 2008 - Managing Application Thread Use (Page 48) Dr. Dobb's Journal - September 2008 - Managing Application Thread Use (Page 49) Dr. Dobb's Journal - September 2008 - Managing Application Thread Use (Page 50) Dr. Dobb's Journal - September 2008 - Managing Application Thread Use (Page 51) Dr. Dobb's Journal - September 2008 - Managing Application Thread Use (Page 52) Dr. Dobb's Journal - September 2008 - Managing Application Thread Use (Page 53) Dr. Dobb's Journal - September 2008 - Signalling Integer Overflows in Java (Page 54) Dr. Dobb's Journal - September 2008 - Signalling Integer Overflows in Java (Page 55) Dr. Dobb's Journal - September 2008 - Signalling Integer Overflows in Java (Page 56) Dr. Dobb's Journal - September 2008 - Signalling Integer Overflows in Java (Page 57) Dr. Dobb's Journal - September 2008 - Signalling Integer Overflows in Java (Page 58) Dr. Dobb's Journal - September 2008 - .NET Development & the IBM WebSphere Portal Server (Page 59) Dr. Dobb's Journal - September 2008 - .NET Development & the IBM WebSphere Portal Server (Page 60) Dr. Dobb's Journal - September 2008 - .NET Development & the IBM WebSphere Portal Server (Page 61) Dr. Dobb's Journal - September 2008 - .NET Development & the IBM WebSphere Portal Server (Page 62) Dr. Dobb's Journal - September 2008 - .NET Development & the IBM WebSphere Portal Server (Page 63) Dr. Dobb's Journal - September 2008 - .NET Development & the IBM WebSphere Portal Server (Page 64) Dr. Dobb's Journal - September 2008 - The Agile Edge (Page 65) Dr. Dobb's Journal - September 2008 - The Agile Edge (Page 66) Dr. Dobb's Journal - September 2008 - The Agile Edge (Page 67) Dr. Dobb's Journal - September 2008 - Effective Concurrency (Page 68) Dr. Dobb's Journal - September 2008 - Effective Concurrency (Page 69) Dr. Dobb's Journal - September 2008 - Effective Concurrency (Page 70) Dr. Dobb's Journal - September 2008 - Effective Concurrency (Page 71) Dr. Dobb's Journal - September 2008 - Swaine’s Flames (Page 72) Dr. Dobb's Journal - September 2008 - Swaine’s Flames (Page Cover3) Dr. Dobb's Journal - September 2008 - Swaine’s Flames (Page Cover4)
For optimal viewing of this digital publication, please enable JavaScript and then refresh the page. If you would like to try to load the digital publication without using Flash Player detection, please click here.