Dr. Dobb's Journal - October 2008 - (Page 53)

As Figure 1 illustrates, the ImageSearcher application is made up of six main classes:

• ImageSearcherForm
• ImageDialog
• ImageViewer
• ThumbnailController
• ThumbnailImageEventHandler
• BuildIndex

Figure 1: ImageSearcher classes and their relationships.

ImageSearcherForm is the main point of entry into the application. It lets users enter the directory where the images are stored, the directory where the index is to be created, and the search parameters to use (start and end dates, user comments, and so on). The remaining classes control the display of the thumbnail images in the status window. This portion makes extensive reuse of code from Lieven’s Image Thumbnail Viewer. The BuildIndex class is where index creation and searching take place.

To use Lucene, I first create an index by instantiating an IndexWriter, using the following constructor:

IndexWriter idxWriter = new IndexWriter(indexDir, new StandardAnalyzer(), true);

where indexDir represents the path to the index directory. Text is analyzed with the StandardAnalyzer. The last argument is a Boolean that, if set to true, creates the index or overwrites the existing one; if set to false, it appends to the existing index.

Analyzers in Lucene can be used to tokenize text, extract relevant words, remove common words, stem the words (that is, reduce them to their root form; for example, “edits,” “editor,” and “editing” are condensed to “edit”), and perform any other processing before the text is stored in the index. The common analyzers provided by Lucene are:

• WhitespaceAnalyzer, which separates tokens based on whitespace.
• SimpleAnalyzer, which tokenizes the string into a set of words and converts them to lowercase.
• StandardAnalyzer, which tokenizes the string into a set of words, identifying acronyms, e-mail addresses, host names, and so on, converts the words to lowercase, and discards the basic English stop words (a, an, the, to).

A Lucene index is a sequence of files. All searching is done on this index. The raw EXIF metadata has to be read from the image files and passed to Lucene, where it can be indexed and searched.

The IndexWriter object is created in the BuildIndex constructor, which takes two arguments: the first is the directory containing your image files; the second is the directory in which the index files are created. Next, the IndexDocs() method is called. This method has one argument, the name of the directory containing your image files. It runs through each file in the specified directory, checks that it is a JPEG file, and creates an Image object from the file using Image.FromFile() from the System.Drawing namespace:

Image img = Image.FromFile(file);

Figure 2: ImageSearcher application startup screen.

For each image file processed, a Document object is created using the Document constructor as follows:

Document d = new Document();

Next, the EXIFextractor library is used to extract EXIF information from the image file. To keep the application simple, I extract only the “Date Time” and “User Comment” EXIF data. First, create the EXIFextractor instance:

Goheer.EXIF.EXIFextractor er = new Goheer.EXIF.EXIFextractor(ref img, "\n");

Next, retrieve the Date/Time EXIF data:

string s1 = (String)er["Date Time"];

Likewise, to extract the user comment, access the er object as follows:

string s2 = (String)er["User Comment"];

Documents are the primary retrievable items from a Lucene query. Each Document object is made up of one or more Field objects.
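Before turning to fields, here is a rough sketch of the directory scan that IndexDocs() is described as performing. The article does not say how the JPEG check is done, so this sketch assumes a simple file-extension test; the class name JpegScanner and the methods IsJpeg and FindJpegs are hypothetical and are not part of ImageSearcher or Lucene.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class JpegScanner
{
    // Hypothetical helper: treats a file as JPEG when its extension is
    // .jpg or .jpeg, ignoring case. This is an assumption; the article
    // does not show how the check is implemented.
    public static bool IsJpeg(string path)
    {
        string ext = Path.GetExtension(path);
        return ext.Equals(".jpg", StringComparison.OrdinalIgnoreCase)
            || ext.Equals(".jpeg", StringComparison.OrdinalIgnoreCase);
    }

    // Hypothetical helper: returns the JPEG files found directly inside
    // imageDir. Subdirectories are not searched.
    public static List<string> FindJpegs(string imageDir)
    {
        var result = new List<string>();
        foreach (string file in Directory.GetFiles(imageDir))
        {
            if (IsJpeg(file))
            {
                result.Add(file);
            }
        }
        return result;
    }
}
```

Each path returned by FindJpegs could then be handed to Image.FromFile(), as in the call shown earlier. An extension test is cheap but can be fooled by misnamed files, so a stricter check may be preferable in practice.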
Fields represent a section of the Document. Each contains the name of the section and the actual data associated with it. Each field holds information that you query against or display in your search results. Because I will be using the filename, the date and time the picture was taken, and the user comments in the search results, these keywords are added to the Document object as fields. Each of these keywords has an associated value. These values are the EXIF data extracted from the image file. Field values are a sequence of terms. I construct the Field object using the constructor: