Dr. Dobb's Journal - February 2008 - (Page 30) D02gray_p5ma.qxp 12/7/07 2:25 PM Page 30 Core Technology Matthew Curry and Jeff Gray BibPort: Creating Bibliographic References Mining bibliographic citations and Microsoft VSTO Bibliographic software such as BibTeX and EndNote help you construct customized databases of literature references and manage the citation and formatting of a list of references, making the insertion of new citations into documents a seamless operation. A database of citations serves as the repository for each of these tools and can record all pertinent information about specific publications—author names, title, publication type, and date of publication. However, when a large collection of articles already exists, the task of creating the database of references can be daunting. Manually typing or copying-and-pasting from existing documents into databases can be tedious and error prone. And with existing bibliographic software, there is no prepackaged solution to assist in automating the initial manual entry of references into the database. In this article, we present BibPort, an automated tool that extracts relevant information from a set of legacy documents to create a database of bibliographic references using Word 2007’s reference-management facilities. In the context of bibliographic text mining, this article is a case study in using Microsoft Visual Studio Tools for Office (VSTO) (msdn .microsoft.com/office/understanding/vsto) in combination with Visual Studio 2008 and Word 2007. Similarly, an export manager serves as a back-end for multiple export engines for citation database formats. Input/output processes are modularized such that any export engine is able to understand the output of any parser. Matthew is a Ph.D. student and Jeff an associate professor in the Department of Computer and Information Sciences at the University of AlabamaBirmingham. They can be contacted at curryml@cis.uab .edu and gray@cis.uab.edu, respectively. Parsing a Citation Distinguishing the particular type of a reference—a journal, conference paper, or book—within a specific citation format requires an inference to be made based on textual cues appearing in each entry. The inference of the reference type can be extended to differentiate between numerous citation formats. When possible, users may be consulted to disambiguate the conflicts resulting from typographical errors. Design Principles for BibPort Many organizations have specific formatting styles for bibliographic citation; the IEEE, ACM, APA, and MLA. A cursory web search yields several dozen citation styles in widespread use. Therefore, an important BibPort design goal addresses the flexibility needed to work with multiple citation formats. BibPort is not hard-coded to a specific format; new citation formats can be added by supplying new parsers. Furthermore, while individual bibliographic-management tools tend to have their own format, the choice of database format is flexible within BibPort. BibPort has a plug-in architecture that has a parser manager as a front-end to any number of citation formats. 30 Dr. Dobb’s Journal l www.ddj.com l February 2008 Ambiguity in a Typical Reference List Entry Figure 1 is a typical entry in a reference list formatted in APA style. In this entry, the author “A. Provenzale,” has a paper entitled “Page Swapping is Bad. What You Should Know” that has been accepted by “Memory Management and You: Novel Memory Management Techniques.” A noticeable problem occurs in this entry due to the title containing two sentences. It is impossible for the parser to determine where the specific title ends and journal name begins. Underlining the journal name is therefore crucial to the successful interpretation continued on page 34 http://msnd.microsoft.com/office/understanding/vsto http://msnd.microsoft.com/office/understanding/vsto http://www.ddj.com
Table of Contents Feed for the Digital Edition of Dr. Dobb's Journal - February 2008 Dr. Dobb's Journal - February 2008 Contents Hmmmm Alia Vox Developer Diaries Developer’s Notebook South American Software Development Conversations Inside Visual Studio 2008 BibPort: Creating Bibliographic References Continuous LINQ The ZK Framework Static Testing C++ Code The Agile Edge Effective Concurrency Swaine’s Flames Dr. Dobb's Journal - February 2008 Dr. Dobb's Journal - February 2008 - Dr. Dobb's Journal - February 2008 (Page Cover1) Dr. Dobb's Journal - February 2008 - Dr. Dobb's Journal - February 2008 (Page Cover2) Dr. Dobb's Journal - February 2008 - Dr. Dobb's Journal - February 2008 (Page 1) Dr. Dobb's Journal - February 2008 - Dr. Dobb's Journal - February 2008 (Page 2) Dr. Dobb's Journal - February 2008 - Dr. Dobb's Journal - February 2008 (Page 3) Dr. Dobb's Journal - February 2008 - Contents (Page 4) Dr. Dobb's Journal - February 2008 - Contents (Page 5) Dr. Dobb's Journal - February 2008 - Hmmmm (Page 6) Dr. Dobb's Journal - February 2008 - Hmmmm (Page 7) Dr. Dobb's Journal - February 2008 - Hmmmm (Page 8) Dr. Dobb's Journal - February 2008 - Hmmmm (Page 9) Dr. Dobb's Journal - February 2008 - Alia Vox (Page 10) Dr. Dobb's Journal - February 2008 - Alia Vox (Page 11) Dr. Dobb's Journal - February 2008 - Developer Diaries (Page 12) Dr. Dobb's Journal - February 2008 - Developer Diaries (Page 13) Dr. Dobb's Journal - February 2008 - Developer’s Notebook (Page 14) Dr. Dobb's Journal - February 2008 - Developer’s Notebook (Page 15) Dr. Dobb's Journal - February 2008 - South American Software Development (Page 16) Dr. Dobb's Journal - February 2008 - South American Software Development (Page 17) Dr. Dobb's Journal - February 2008 - South American Software Development (Page 18) Dr. Dobb's Journal - February 2008 - South American Software Development (Page 19) Dr. Dobb's Journal - February 2008 - Conversations (Page 20) Dr. Dobb's Journal - February 2008 - Conversations (Page 21) Dr. Dobb's Journal - February 2008 - Inside Visual Studio 2008 (Page 22) Dr. Dobb's Journal - February 2008 - Inside Visual Studio 2008 (Page 23) Dr. Dobb's Journal - February 2008 - Inside Visual Studio 2008 (Page 24) Dr. Dobb's Journal - February 2008 - Inside Visual Studio 2008 (Page 25) Dr. Dobb's Journal - February 2008 - Inside Visual Studio 2008 (Page 26) Dr. Dobb's Journal - February 2008 - Inside Visual Studio 2008 (Page 27) Dr. Dobb's Journal - February 2008 - Inside Visual Studio 2008 (Page 28) Dr. Dobb's Journal - February 2008 - Inside Visual Studio 2008 (Page 29) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 30) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 31) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 32) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 33) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 34) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 35) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 36) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 37) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 38) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 39) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 40) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 41) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 42) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 43) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 44) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 45) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 46) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 47) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 48) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 49) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 50) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 51) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 52) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 53) Dr. Dobb's Journal - February 2008 - BibPort: Creating Bibliographic References (Page 54) Dr. Dobb's Journal - February 2008 - Continuous LINQ (Page 55) Dr. Dobb's Journal - February 2008 - Continuous LINQ (Page 56) Dr. Dobb's Journal - February 2008 - Continuous LINQ (Page 57) Dr. Dobb's Journal - February 2008 - Continuous LINQ (Page 58) Dr. Dobb's Journal - February 2008 - Continuous LINQ (Page 59) Dr. Dobb's Journal - February 2008 - The ZK Framework (Page 60) Dr. Dobb's Journal - February 2008 - The ZK Framework (Page 61) Dr. Dobb's Journal - February 2008 - The ZK Framework (Page 62) Dr. Dobb's Journal - February 2008 - The ZK Framework (Page 63) Dr. Dobb's Journal - February 2008 - The ZK Framework (Page 64) Dr. Dobb's Journal - February 2008 - The ZK Framework (Page 65) Dr. Dobb's Journal - February 2008 - Static Testing C++ Code (Page 66) Dr. Dobb's Journal - February 2008 - Static Testing C++ Code (Page 67) Dr. Dobb's Journal - February 2008 - Static Testing C++ Code (Page 68) Dr. Dobb's Journal - February 2008 - Static Testing C++ Code (Page 69) Dr. Dobb's Journal - February 2008 - Static Testing C++ Code (Page 70) Dr. Dobb's Journal - February 2008 - The Agile Edge (Page 71) Dr. Dobb's Journal - February 2008 - The Agile Edge (Page 72) Dr. Dobb's Journal - February 2008 - The Agile Edge (Page 73) Dr. Dobb's Journal - February 2008 - Effective Concurrency (Page 74) Dr. Dobb's Journal - February 2008 - Effective Concurrency (Page 75) Dr. Dobb's Journal - February 2008 - Effective Concurrency (Page 76) Dr. Dobb's Journal - February 2008 - Effective Concurrency (Page 77) Dr. Dobb's Journal - February 2008 - Effective Concurrency (Page 78) Dr. Dobb's Journal - February 2008 - Effective Concurrency (Page 79) Dr. Dobb's Journal - February 2008 - Swaine’s Flames (Page 80) Dr. Dobb's Journal - February 2008 - Swaine’s Flames (Page Cover3) Dr. Dobb's Journal - February 2008 - Swaine’s Flames (Page Cover4)
For optimal viewing of this digital publication, please enable JavaScript and then refresh the page. If you would like to try to load the digital publication without using Flash Player detection, please click here.