I was working on a task to parse the output of some Amazon web services. There are lots of ways to parse it, using DOM, SAX, or StAX, but all of them require a fair amount of coding. I wanted a quick fix, and I finally landed on jsoup, an open-source HTML parser (another HTML parser I like is HTMLParser). In this article I’m going to explain how to parse DZone HTML links in Java.

The code in this article retrieves the description of every link on DZone.

Note: This is not the best way to read links from DZone (you can use the RSS feeds instead). This tutorial is meant to take you through CSS selectors in Java.

All DZone pagination queries look like this:


I used an open-source Java library, jsoup, to parse this response and extract the description text of each link.
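To give a feel for what CSS selectors look like in jsoup, here is a minimal sketch. It assumes jsoup is on the classpath. The HTML string, the class names (`item`, `title`, `description`), and the `LinkDescriptionSketch` class are all made up for illustration; they are not DZone’s real markup, so the selectors would need adjusting for an actual response.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

class LinkDescriptionSketch {

    // Hypothetical markup for illustration only; NOT the real DZone response.
    static final String SAMPLE_HTML =
          "<div class=\"item\">"
        + "<a class=\"title\" href=\"http://example.com/a\">First link</a>"
        + "<p class=\"description\">Description of the first link</p>"
        + "</div>"
        + "<div class=\"item\">"
        + "<a class=\"title\" href=\"http://example.com/b\">Second link</a>"
        + "<p class=\"description\">Description of the second link</p>"
        + "</div>";

    // Returns one "title | href | description" string per matched item.
    static List<String> extractDescriptions(String html) {
        // Parse the HTML string into a jsoup Document.
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();

        // "div.item" is a CSS selector: every <div> with class "item".
        for (Element item : doc.select("div.item")) {
            // Selectors can be applied again, relative to the current element.
            String title = item.select("a.title").text();
            String href = item.select("a.title").attr("href");
            String description = item.select("p.description").text();
            result.add(title + " | " + href + " | " + description);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : extractDescriptions(SAMPLE_HTML)) {
            System.out.println(line);
        }
    }
}
```

For a live page you could probably swap `Jsoup.parse(html)` for `Jsoup.connect(url).get()`, which fetches and parses in one step; the selector part stays the same.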

Here are some sample tags from the DZone response:

