Discussions

News: HtmlUnit implementation charts of the latest snapshot

  1. HtmlUnit is a pure Java, GUI-less browser that allows high-level manipulation of web pages: filling in forms, clicking links, and accessing the attributes and values of specific elements within a page. You do not have to build lower-level TCP/IP or HTTP requests; you just call getPage(url), find a hyperlink, and click() it, and the HTML, JavaScript, and Ajax are processed automatically. The most common use of HtmlUnit is test automation of web pages, but it can also be used for web scraping or downloading website content. JavaScript support is one of HtmlUnit's most powerful features, to the extent that all the tests of complex JavaScript libraries such as jQuery, Google Web Toolkit, MochiKit, and Sarissa pass. Still, a frequently asked question is: "It works for many sites, but not always identically to real browsers, so how similar is it to my browser?" A Continuous Integration process provides the latest snapshot, and it now also reports in detail which properties and methods are implemented, missing, or incorrectly added. The reports are generated separately for each supported browser, namely Internet Explorer 6/7 and Firefox 2/3. You can view the charts and the web reports in the 'Build Artifacts' area of the CruiseControl server, which always holds the latest snapshot after each commit. The main website has more information about the project, and the development team is looking forward to your feedback.
  2. That's great. I used HtmlUnit in a past project to do screen scraping. Very easy to use and worked perfectly. Great job, people!!!
  3. I've used Apache HTTPClient for doing the same and it's worked out great. Here's a link to the About page for HttpClient and HttpCore: http://hc.apache.org/index.html Venkatt
  4. I've used Apache HTTPClient for doing the same and it's worked out great.

    Here's a link to the About page for HttpClient and HttpCore:
    http://hc.apache.org/index.html

    Venkatt
    HttpClient does not handle any JavaScript, DOM, or Ajax. It is a powerful library for lower-level HTTP request/response handling, and BTW it is one of the main libraries used by HtmlUnit :)
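
To make the distinction in the last reply concrete, here is a minimal sketch of what working at the lower level looks like: an HTTP library hands back the response body as text, and finding a hyperlink in it is left to you. The sketch uses only the JDK. The class `RawLinkExtractor`, the helper `extractHrefs`, and the hard-coded HTML string (a stand-in for a real response body) are all hypothetical names made up for illustration, not part of HttpClient or HtmlUnit. The regular expression is deliberately naive and only handles double-quoted `href` attributes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RawLinkExtractor {

    // Matches href="..." inside an <a ...> tag in the raw markup only.
    // Assumption: attributes are double-quoted; real-world HTML is messier.
    private static final Pattern ANCHOR_HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    /** Hypothetical helper: returns the href values of anchors present in the raw HTML text. */
    public static List<String> extractHrefs(String html) {
        List<String> hrefs = new ArrayList<>();
        Matcher m = ANCHOR_HREF.matcher(html);
        while (m.find()) {
            hrefs.add(m.group(1));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // Stand-in for a response body returned by an HTTP library.
        String body =
                "<html><body>"
              + "<a href=\"/static-link\">Static</a>"
              + "<script>"
              + "var a = document.createElement('a');"
              + "a.setAttribute('href', '/added-by-script');"
              + "document.body.appendChild(a);"
              + "</script>"
              + "</body></html>";

        // The script is never executed, so only the link written
        // directly in the markup is found.
        System.out.println(extractHrefs(body)); // prints [/static-link]
    }
}
```

The link that the script would add at runtime never shows up, because nothing here executes JavaScript or builds a DOM. That is the gap the first post says HtmlUnit fills when it processes HTML, JavaScript, and Ajax automatically.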