Interesting comments on your linked post, Nikita.
A couple quick points.
Cascading is intended to be a general purpose compute model that will span multiple compute applications. It won't always be a perfect fit, but for specific workload types, it will be a huge benefit.
I'm not a GridGain user, but I do recognize there are applications that are best served by more reactive systems than what Hadoop can provide. But as I understand GridGain, Coherence, and other data-grid technologies, considering the cost/benefit ratio, I probably wouldn't stuff a petabyte of data on them for both scheduled and ad-hoc analysis. But for near real-time responses from a subset of hot data, then yes.
The bottom line is this. No one tool is a silver bullet, so a mix of applications will be necessary to solve most problems. Providing an easy way for them to integrate together and to participate in the same user-level workflows and automation is a major win for all parties involved. This is where Cascading aims to help.
Also, saying Hadoop and Cascading are for ETL would be missing the big picture. When you take away the data-warehouse (and associated RDBMS databases) and are able to perform complex analysis and data mining unrestrained without them, the concept of ETL goes away. Schema-restricted datastores (RDBMS and data-warehouses) are simply caches. ETL is only around to load the cache.
Hadoop is an unrestrained platform for performing data analysis (data mining, machine learning, data cleansing, etc.) on raw data. It starts to make very little sense to stuff data into a schema when you can lazily resolve the data into a 'view' for use by other processes. Where performance is necessary, caching of the 'views' becomes an attribute or switch, not an architectural component of the system. I touch on this concept here: Cascading and Hive
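To make the 'view' idea a bit more concrete, here is a rough Python sketch. It is not Cascading or Hadoop code, and every name in it (View, parse_line, read_raw, the sample lines) is made up for illustration. The only point is that fields are resolved from raw lines at read time, and caching is a flag on the view rather than a separate datastore you have to load.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Record = Dict[str, str]


def parse_line(line: str) -> Record:
    """Resolve one raw line into named fields at read time (hypothetical format)."""
    user, action = line.split(",", 1)
    return {"user": user.strip(), "action": action.strip()}


class View:
    """A lazily resolved 'view' over raw lines. Caching is a switch, not a store."""

    def __init__(self, source: Callable[[], Iterable[str]], cached: bool = False):
        self._source = source      # function that re-reads the raw data
        self._cached = cached      # the "attribute or switch"
        self._cache: Optional[List[Record]] = None

    def __iter__(self) -> Iterator[Record]:
        if self._cache is not None:
            # Cached view: serve the already resolved records.
            return iter(self._cache)
        records = (parse_line(line) for line in self._source())
        if self._cached:
            # First read of a cached view: materialize once and keep it.
            self._cache = list(records)
            return iter(self._cache)
        # Uncached view: resolve from the raw data on every read.
        return records


# Assumed sample data standing in for raw files.
raw_lines = ["alice, login", "bob, search", "alice, logout"]
reads: List[int] = []  # counts how often the raw data is scanned


def read_raw() -> Iterable[str]:
    reads.append(1)
    return raw_lines


lazy_view = View(read_raw)                  # re-reads raw data each time
cached_view = View(read_raw, cached=True)   # same view, cache switched on
```

Consumers iterate either view the same way; only the flag decides whether the raw data gets scanned again.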
Looking forward to your and Victoria's presentations at the Cloud Computing Group meetup. http://web.meetup.com/66/calendar/8561664
Oh and Billy, I thought you were doing something like this already for IBM? heh