Discussions

News: HtmlUnit 2.1, a headless Java browser, released

  1. A new release of the pure Java headless browser is available. HtmlUnit allows high-level manipulation of web pages, such as filling in forms, clicking links, and accessing attributes and values of specific elements within the pages. You do not have to create lower-level TCP/IP or HTTP requests; you just call getPage(url), find a hyperlink, and click() it, and the HTML, JavaScript, and AJAX are processed automatically. The most common use of HtmlUnit is test automation of web pages (even with complex JavaScript libraries; for instance, the Google Web Toolkit 1.4.60 tests now pass), but it can also be used for web scraping or downloading website content. Version 2.0 includes many enhancements, such as a W3C DOM implementation, Java 5 features, better XPath support, and improved handling of incorrect HTML, in addition to the usual JavaScript enhancements, while version 2.1 mainly focuses on tuning performance issues reported by users. You can find more information on the main website. The team is looking forward to your feedback.
  2. I recently started using HtmlUnit. It's got a great interface. Often, I need to visit a part of a website and scrape its data. I usually write a script in wget or httpclient to accomplish this. But this can be a serious pain in the ass if a page along the way requires you to submit a form. I'll use HtmlUnit for this from now on. My only complaint about HtmlUnit is you end up doing a LOT of casting when you use it. Hopefully they'll reduce that in future versions.
  3. My only complaint about HtmlUnit is you end up doing a LOT of casting when you use it. Hopefully they'll reduce that in future versions.
    Did you try version 2.x? It uses generics and many types are already cast. On the other hand, how to cast something like page.getHtmlElementById()? Please provide your suggestions to the user-list.
  4. My only complaint about HtmlUnit is you end up doing a LOT of casting when you use it. Hopefully they'll reduce that in future versions.


    Did you try version 2.x? It uses generics and many types are already cast.

    On the other hand, how to cast something like page.getHtmlElementById()? Please provide your suggestions to the user-list.
    How about having a separate method for each expected type? E.g. page.getLIById(), page.getAById(), etc. If you fear that would clutter your page object, make a separate object for that. For example: page.getById().LI(), page.getById().A(), etc.
  5. Casting

    public <T extends HtmlElement> T getHtmlElementById(String id, Class<T> clazz) {
        ...
        HtmlElement e = ...;
        if (clazz.isAssignableFrom(e.getClass())) {
            return (T) e;
        }
        else {
            return null;
        }
    }

    It has the downside of needing to pass in the class, but gives you type safety:

    HtmlForm form = page.getHtmlElementById("myFrom", HtmlForm.class);
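    The class-token idea above can be tried out without HtmlUnit on the classpath. The following is only a minimal sketch: HtmlElement, HtmlForm, HtmlButton, and SketchPage are hypothetical stand-ins defined inside the example, not HtmlUnit's real classes, and the lookup is a plain map rather than a DOM search.

```java
import java.util.HashMap;
import java.util.Map;

public class ClassTokenDemo {
    // Hypothetical stand-ins for HtmlUnit's element types (assumption, not the real API).
    public static class HtmlElement {}
    public static class HtmlForm extends HtmlElement {}
    public static class HtmlButton extends HtmlElement {}

    // Hypothetical page that stores elements by id; not HtmlUnit's HtmlPage.
    public static class SketchPage {
        private final Map<String, HtmlElement> elementsById = new HashMap<>();

        public void add(String id, HtmlElement element) {
            elementsById.put(id, element);
        }

        // Returns the element as type T, or null when the id is unknown
        // or the element is not an instance of the requested class.
        public <T extends HtmlElement> T getHtmlElementById(String id, Class<T> clazz) {
            HtmlElement e = elementsById.get(id);
            return clazz.isInstance(e) ? clazz.cast(e) : null;
        }
    }

    public static void main(String[] args) {
        SketchPage page = new SketchPage();
        page.add("myForm", new HtmlForm());

        // No cast at the call site: the Class argument fixes T.
        HtmlForm form = page.getHtmlElementById("myForm", HtmlForm.class);
        // Asking for the wrong type yields null instead of an exception.
        HtmlButton button = page.getHtmlElementById("myForm", HtmlButton.class);

        System.out.println(form != null);
        System.out.println(button == null);
    }
}
```

    Using clazz.isInstance and clazz.cast keeps the check and the cast in one place and avoids an unchecked (T) cast, at the cost of the extra argument mentioned in the post.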
  6. Re: Casting

    public <T extends HtmlElement> T getHtmlElementById(String id, Class<T> clazz) {
        ...
        HtmlElement e = ...;
        if (clazz.isAssignableFrom(e.getClass())) {
            return (T) e;
        }
        else {
            return null;
        }
    }

    It has the downside of needing to pass in the class, but
    gives you type safety:

    HtmlForm form = page.getHtmlElementById("myFrom", HtmlForm.class);
    Thanks for the hint. Actually, Julien Henry suggested on the user-list having: public <T extends HtmlElement> T getHtmlElementById(String id) {} so that you can write HtmlButton button = page.getHtmlElementById(). However, Java 5 still suffers from http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5003431. And because HtmlUnit doesn't currently require Java 6, it will take some time to implement this.
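    For comparison, the variant without a Class argument can be sketched as follows. This is only an illustration under assumptions: the element and page classes are hypothetical stand-ins defined in the example, not HtmlUnit code, and it is compiled with a current JDK, so it says nothing about the Java 5 compiler bug linked above.

```java
import java.util.HashMap;
import java.util.Map;

public class InferredTypeDemo {
    // Hypothetical stand-ins for HtmlUnit's element types (assumption, not the real API).
    public static class HtmlElement {}
    public static class HtmlForm extends HtmlElement {}
    public static class HtmlButton extends HtmlElement {}

    // Hypothetical page that stores elements by id; not HtmlUnit's HtmlPage.
    public static class SketchPage {
        private final Map<String, HtmlElement> elementsById = new HashMap<>();

        public void add(String id, HtmlElement element) {
            elementsById.put(id, element);
        }

        // T is inferred from the assignment target at the call site.
        // The cast is unchecked, so a wrong target type is not detected here.
        @SuppressWarnings("unchecked")
        public <T extends HtmlElement> T getHtmlElementById(String id) {
            return (T) elementsById.get(id);
        }
    }

    public static void main(String[] args) {
        SketchPage page = new SketchPage();
        page.add("submit", new HtmlButton());

        // T is inferred as HtmlButton; no cast and no Class argument.
        HtmlButton button = page.getHtmlElementById("submit");
        System.out.println(button != null);

        try {
            // T is inferred as HtmlForm, but the stored element is a button.
            HtmlForm form = page.getHtmlElementById("submit");
            System.out.println("unexpected: " + form);
        } catch (ClassCastException e) {
            System.out.println("wrong target type fails at the call site");
        }
    }
}
```

    The trade-off in this sketch is that the call site is shorter, but a mismatch surfaces as a ClassCastException at the assignment rather than as a null return from a checked lookup.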
  7. Congratulations on 2.1

    Congratulations to the HtmlUnit team on getting 2.1 out the door so quickly. Wasn't it just last week that 2.0 shipped? :-) We have been using HtmlUnit for the past year in . We're using it to convert Selenium and TestGen4Web tests into functional tests, load and performance tests, and business service monitors. HtmlUnit incorporates the Rhino JavaScript engine and is very useful for Ajax testing.

    I see a lot of our customers recording functional tests of Ajax applications with Selenium. They play the functional test back through Firefox and the application responds. These tests work well for functional testing, including regression, integration, and acceptance testing. The dependency on running the test through the browser is where HtmlUnit gives Selenium users a boost: Selenium by itself is not useful for performance testing because each simulated user requires a new instance of the browser.

    To accomplish a load and performance test, I used a "beta" version of the PushToTest TestMaker's Transformer utility to transform the Selenium test into a JUnit TestCase in Jython. The resulting script uses HtmlUnit and Rhino to communicate with AjaxService. The experimental version of the Transformer is available here. We've had good luck with this approach.

    -Frank Cohen http://www.pushtotest.com
  8. The most common use of HtmlUnit is test automation of web pages
    An unfortunate side effect is that spammers are handed a perfect tool to spam even more :(