Tracking User Sentiments on Twitter with Twitter4J and Esper



  1. Managing big data and mining useful information from it is the hottest discussion topic in technology right now. The explosive growth of semi-structured data flowing from social networks like Twitter, Facebook and LinkedIn is making technologies like Hadoop and Cassandra a part of every technology conversation. So as not to fall behind the competition, all customer-centric organizations are actively engaged in creating social strategies. What can a company get out of data feeds from social networks? Think location-based services, targeted advertisements and algorithmic equity trading for starters. IDC Insights has some informative blogs on the relationship between big data and business analytics. Big data in itself will be meaningless unless the right analytic tools are available to sift through it, explains Barb Darrow in her blog post.


    Companies often listen in on social feeds to learn about customers' interest in, or perception of, their products. They are also trying to identify "influencers" (the people with the most connections in a social graph) so they can make better offers to such individuals and get better mileage out of their marketing. Companies involved in equity trading want to know which publicly traded companies are discussed on Twitter and what the users' sentiments about them are. From big companies like IBM to smaller start-ups, everyone is racing to make the most of the opportunities in big data management and analytics. Much documentation about big data, like the IBM ebook 'Big Data Platform', is freely available on the web. However, a lot of it covers theory only. Jouko Ahvenainen, in reply to Barb Darrow's post above, makes a good point that “many people who talk about the opportunity of big data are on too general level, talk about better customer understanding, better sales, etc. In reality you must be very specific, what you utilize and how”.


    It does sound reasonable, doesn't it? So I set out to investigate this a bit further by prototyping an idea, which is the only good way I know. If I could do it, anybody could do it. Read more here.



  2. The real power of sentiment mining

    It isn't in Twitter and other unreliable data sources. Many companies have vast databases of customer feedback. The first place to apply sentiment technology is trusted data, like CSR databases and other internal databases with unstructured data. Saying Twitter just sounds cool, but it's not really all that useful. It is far too easy to create a tweet storm and thereby negate any benefit of using sentiment mining on social media.

  3. The real power of sentiment mining

    Peter, thank you, I appreciate your comment. You could be right about the reliability of Twitter feeds. However, for many companies, the Twitter feed is one of the major components of their big data.

  4. So it has come to this? I wouldn't even define this as a crude implementation of sentiment analysis.

    Sentiment analysis is very difficult to accomplish, especially with tweets, due to the lack of a good corpus. The title of this posting is very misleading.

    I highly recommend changing it to "A brief example of Esper and Twitter4J".
    For those interested in doing sentiment analysis, I highly recommend looking at OpenNLP or LingPipe. There are a few Twitter corpora out there, but the largest I found was around 1,800 tweets.

  5. Richard, thank you, I appreciate your comment. My feedback is below.

    Title - I get what you are saying. I should be more careful when choosing a title.

    Complexity of NLP - Agreed. I mentioned an example of why it is complex and also linked to a learning resource for NLP. But just because coding NLP yourself is complex does not necessarily mean using it is. Complex event processing is an example. I will check out the NLP libraries you mentioned. Thanks for the tip.

    Not even a crude example - Why not? It does show Twitter users' collective sentiments/emotions. It looks too easy only because we are looking for just one emotion and nothing else. Abbreviations help communicate the exact sentiment. Just as a person would pick one group over another as being funnier from the laughter the groups generate, the code tracks the total count of "lol", thereby tracking a collective happiness index. If one omits this conversation from the blog post, then the blog is just an example of Twitter4J and Esper working together - your point exactly.
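
    To make the "lol" counting idea concrete, here is a minimal sketch in plain Java. It is not the code from the blog post, which uses Twitter4J and Esper; the class name LolCounter, the method countLol and the sample tweets are all made up for illustration, and the tweets are assumed to be already available as plain strings.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: not the blog post's Twitter4J/Esper code.
public class LolCounter {
    // Matches "lol" as a whole word, ignoring case ("LOL", "Lol", "lol").
    private static final Pattern LOL =
            Pattern.compile("\\blol\\b", Pattern.CASE_INSENSITIVE);

    /** Counts whole-word occurrences of "lol" across all tweet texts. */
    public static int countLol(List<String> tweets) {
        int total = 0;
        for (String text : tweets) {
            Matcher m = LOL.matcher(text);
            while (m.find()) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up sample tweets; a real feed would come from a streaming client.
        List<String> sample = List.of(
                "That demo was great LOL",
                "lol, did not expect that. Lol!",
                "Reading about lollipops");
        // "lollipops" is not counted because "lol" is not a whole word there.
        System.out.println(countLol(sample)); // prints 3
    }
}
```

    A real version would feed each incoming tweet into this kind of counter over a time window, which is the part an event processing engine would handle.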