Dr. Dobb's Journal - March 2008 - (Page 63) Listing One Listing One is code to get web page content. Java has built-in support for parsing HTML text. The HTMLEditorKit.ParserCallback class allows for text extraction. The Jericho HTML Parser (jerichohtml.sourceforge.net/ doc/index.html) is another option. Regular expression parsing can be done using Java’s java.util.regex.Matcher and Pattern classes. If full Perl 5 syntax support is required, the Jakarta ORO classes can be used. Multithreading to retrieve multiple website pages concurrently is greatly simplified by the java.util.concurrent package and HTTPClient’s MultiThreadedHttpConnectionManager class. Java 6 introduced additional features that aid in the UI implementation: results are stored using Java 6’s built-in JAXB support. The TableRowSorter and RowFilter classes are used for sorting and filtering the table of results. Last of all, the Substance Look and Feel (https://substance.dev.java.net) is used to give the application a polished appearance. DDJ public static String getContent(HttpClient httpClient, String url) { logger.info("getting content from url: " + url); GetMethod get = new GetMethod(url); get.setFollowRedirects(true); // impersonate browser get.setRequestHeader("User-Agent", “Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; Media Center PC 4.0; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)"); get.setRequestHeader("Accept", "image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/xaml+xml, application/vnd.ms-xpsdocument, application/x-ms-xbap, application/x-msapplication, */*"); get.setRequestHeader("Accept-Language", "en-us"); get.setRequestHeader("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7"); try { int statusCode = httpClient.executeMethod(get); logger.info("http status code: " + statusCode); InputStream is = get.getResponseBodyAsStream(); ByteArrayOutputStream baos = new ByteArrayOutputStream(2048); byte[] buff = new byte[2048]; int bytesRead = 0; while ( (bytesRead = is.read(buff)) != -1) { baos.write(buff, 0, bytesRead); } is.close(); baos.close(); String text = baos.toString(); logger.debug("raw html=" + text); return text; } catch (Exception ex) { logger.error(ex); } finally { get.releaseConnection(); } return null; } March 2008 l www.ddj.com l Dr. Dobb’s Journal 63 http://jerichohtml.sourceforge.net/doc/index.html http://jerichohtml.sourceforge.net/doc/index.html https://substance.dev.java.net http://www.sparxsystems.com http://www.sparxsystems.com http://www.ddj.com
Table of Contents Feed for the Digital Edition of Dr. Dobb's Journal - March 2008 Dr. Dobb's Journal - March 2008 Contents Hmmmm Alia Vox Developer Diaries Developer’s Notebook Social Networks and Software Development Conversations Detecting Bugs in Safety-Critical Code Change Code Without Fear Continuous Integration and Performance Testing Wt: A Web Toolkit Automating Release Notifications The Agile Edge Effective Concurrency Swaine’s Flames Dr. Dobb's Journal - March 2008 Dr. Dobb's Journal - March 2008 - (Page Belly1) Dr. Dobb's Journal - March 2008 - (Page Belly2) Dr. Dobb's Journal - March 2008 - Dr. Dobb's Journal - March 2008 (Page Cover1) Dr. Dobb's Journal - March 2008 - Dr. Dobb's Journal - March 2008 (Page Cover2) Dr. Dobb's Journal - March 2008 - Dr. Dobb's Journal - March 2008 (Page 1) Dr. Dobb's Journal - March 2008 - Dr. Dobb's Journal - March 2008 (Page 2) Dr. Dobb's Journal - March 2008 - Dr. Dobb's Journal - March 2008 (Page 3) Dr. Dobb's Journal - March 2008 - Contents (Page 4) Dr. Dobb's Journal - March 2008 - Contents (Page 5) Dr. Dobb's Journal - March 2008 - Hmmmm (Page 6) Dr. Dobb's Journal - March 2008 - Hmmmm (Page 7) Dr. Dobb's Journal - March 2008 - Hmmmm (Page 8) Dr. Dobb's Journal - March 2008 - Hmmmm (Page 9) Dr. Dobb's Journal - March 2008 - Alia Vox (Page 10) Dr. Dobb's Journal - March 2008 - Alia Vox (Page 11) Dr. Dobb's Journal - March 2008 - Developer Diaries (Page 12) Dr. Dobb's Journal - March 2008 - Developer Diaries (Page 13) Dr. Dobb's Journal - March 2008 - Developer’s Notebook (Page 14) Dr. Dobb's Journal - March 2008 - Developer’s Notebook (Page 15) Dr. Dobb's Journal - March 2008 - Social Networks and Software Development (Page 16) Dr. Dobb's Journal - March 2008 - Social Networks and Software Development (Page 17) Dr. Dobb's Journal - March 2008 - Social Networks and Software Development (Page 18) Dr. Dobb's Journal - March 2008 - Social Networks and Software Development (Page 19) Dr. Dobb's Journal - March 2008 - Conversations (Page 20) Dr. Dobb's Journal - March 2008 - Conversations (Page 21) Dr. Dobb's Journal - March 2008 - Detecting Bugs in Safety-Critical Code (Page 22) Dr. Dobb's Journal - March 2008 - Detecting Bugs in Safety-Critical Code (Page 23) Dr. Dobb's Journal - March 2008 - Detecting Bugs in Safety-Critical Code (Page 24) Dr. Dobb's Journal - March 2008 - Detecting Bugs in Safety-Critical Code (Page 25) Dr. Dobb's Journal - March 2008 - Detecting Bugs in Safety-Critical Code (Page 26) Dr. Dobb's Journal - March 2008 - Detecting Bugs in Safety-Critical Code (Page 27) Dr. Dobb's Journal - March 2008 - Change Code Without Fear (Page 28) Dr. Dobb's Journal - March 2008 - Change Code Without Fear (Page 29) Dr. Dobb's Journal - March 2008 - Change Code Without Fear (Page 30) Dr. Dobb's Journal - March 2008 - Change Code Without Fear (Page 31) Dr. Dobb's Journal - March 2008 - Change Code Without Fear (Page 32) Dr. Dobb's Journal - March 2008 - Change Code Without Fear (Page 33) Dr. Dobb's Journal - March 2008 - Change Code Without Fear (Page 34) Dr. Dobb's Journal - March 2008 - Change Code Without Fear (Page 35) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 36) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 37) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 38) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 39) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 40) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 41) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 42) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 43) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 44) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 45) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 46) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 47) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 48) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 49) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 50) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 51) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 52) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 53) Dr. Dobb's Journal - March 2008 - Continuous Integration and Performance Testing (Page 54) Dr. Dobb's Journal - March 2008 - Wt: A Web Toolkit (Page 55) Dr. Dobb's Journal - March 2008 - Wt: A Web Toolkit (Page 56) Dr. Dobb's Journal - March 2008 - Wt: A Web Toolkit (Page 57) Dr. Dobb's Journal - March 2008 - Wt: A Web Toolkit (Page 58) Dr. Dobb's Journal - March 2008 - Wt: A Web Toolkit (Page 59) Dr. Dobb's Journal - March 2008 - Automating Release Notifications (Page 60) Dr. Dobb's Journal - March 2008 - Automating Release Notifications (Page 61) Dr. Dobb's Journal - March 2008 - Automating Release Notifications (Page 62) Dr. Dobb's Journal - March 2008 - Automating Release Notifications (Page 63) Dr. Dobb's Journal - March 2008 - Automating Release Notifications (Page 64) Dr. Dobb's Journal - March 2008 - The Agile Edge (Page 65) Dr. Dobb's Journal - March 2008 - The Agile Edge (Page 66) Dr. Dobb's Journal - March 2008 - The Agile Edge (Page 67) Dr. Dobb's Journal - March 2008 - Effective Concurrency (Page 68) Dr. Dobb's Journal - March 2008 - Effective Concurrency (Page 69) Dr. Dobb's Journal - March 2008 - Effective Concurrency (Page 70) Dr. Dobb's Journal - March 2008 - Effective Concurrency (Page 71) Dr. Dobb's Journal - March 2008 - Swaine’s Flames (Page 72) Dr. Dobb's Journal - March 2008 - Swaine’s Flames (Page Cover3) Dr. Dobb's Journal - March 2008 - Swaine’s Flames (Page Cover4)
For optimal viewing of this digital publication, please enable JavaScript and then refresh the page. If you would like to try to load the digital publication without using Flash Player detection, please click here.