Messages: 16
Data Transformation & Processing with Smooks v1.0
The Smooks team is proud to announce the release of Smooks v1.0.
The most commonly accepted definition of Smooks would be that it is a "Transformation Engine". However, at its core, Smooks is just a "Structured Data Event Stream Processor". The core code makes no mention of "data transformation". It is designed simply to support hooking custom "Visitor logic" into an Event Stream produced by a data Source of some kind (XML, CSV, EDI, Java etc).
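To make the "Visitor logic hooked into an Event Stream" idea concrete, here is a minimal sketch that uses only the JDK's SAX parser. It is not the Smooks API: the class name `EventStreamSketch`, the methods `process` and `itemCodes`, and the sample `<order>`/`<item>` message are all made up for illustration. It only shows the general shape of registering callbacks against element names and letting the stream drive them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch (not Smooks code): dispatches SAX start-element
// events to "visitor" callbacks registered by element name.
class EventStreamSketch {

    /** Parses the XML and calls the matching visitor for each start element. */
    static void process(String xml, Map<String, Consumer<Attributes>> visitors) {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                Consumer<Attributes> visitor = visitors.get(qName);
                if (visitor != null) {
                    visitor.accept(attrs);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not process event stream", e);
        }
    }

    /** Example visitor: collects the "code" attribute of every <item> element. */
    static List<String> itemCodes(String xml) {
        List<String> codes = new ArrayList<>();
        Map<String, Consumer<Attributes>> visitors = new HashMap<>();
        visitors.put("item", attrs -> codes.add(attrs.getValue("code")));
        process(xml, visitors);
        return codes;
    }

    public static void main(String[] args) {
        String xml = "<order><item code=\"A1\"/><item code=\"B2\"/></order>";
        System.out.println(itemCodes(xml));
    }
}
```

Nothing in `process` knows about transformation. What the stream is used for is decided entirely by the visitors that get registered.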
Of course, the most common application of this will be in the creation of Transformation solutions i.e. implementing Visitor logic that uses the Event Stream, produced from a Source of one type (XML, EDI, CSV, Java etc), to produce a Result of some other type (XML, EDI, CSV, Java etc). These Event Stream Processing capabilities enable more than just Message Transformation. We have implemented a range of other solutions on top of this basic processing model:
1. Java Binding Framework: Use the Event Stream to create and populate a Java Object Model. It can actually create and populate multiple Object Models concurrently (i.e. in a single pass of the message), which can be very useful when splitting messages (see below). It can create and populate Object Models whose hierarchies don't “line up” with that of the Source message.
2. Java to Java Transforms: Transform between Java Object Models of different types.
3. Message Splitting & Routing: Split up a message and route the “split messages” to different destinations (with native support for File, JMS and Database destinations). Supports conditional routing (Content Based Routing) of each split message to multiple destinations of different types and in different formats (concurrently) e.g. XML1 to D1, Java1 to D2, Java2 to D3, EDI to D4 etc. Supports complex splits, where each split message contains data from different sub-hierarchies of the Source message i.e. not just dumb XPath based fragment splitting.
4. Huge Message Processing: Process GB size messages through Transformation, Splitting & Routing, or Persistence.
5. Message Enrichment: Enrich messages with data from external sources (e.g. a Database). Using a Splitting & Routing example, imagine splitting Order-Item messages out of an Order message, where the Customer details in each Order-Item split-message need to be enriched with additional Customer info (e.g. addressing info), before routing the Order-Item split-message to a partner interface.
6. Fragment Based Transforms: Develop modular transformation logic, using a number of supported technologies (FreeMarker, XSLT, StringTemplate, Java, Groovy) and target them at message fragments. Avoid monolithic transformation solutions that are difficult to maintain. Avoid resource-hungry transformation pipelines. Supports mixing of technologies within the context of a single message transform.
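As a rough illustration of points 3 and 5 above (a split where each split message carries data from more than one sub-hierarchy of the Source), here is a sketch built on the JDK's StAX reader. It does not use Smooks, and the element names (`order`, `customer`, `item`), the `product` attribute and the class `OrderSplitSketch` are assumptions invented for the example. A real solution would also route each split message somewhere and escape the values it writes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch (not Smooks code): splits an order message into one
// message per item in a single streaming pass, copying the customer name
// from the order header into every split message.
class OrderSplitSketch {

    static List<String> split(String orderXml) {
        List<String> splitMessages = new ArrayList<>();
        String customer = null; // remembered from the header sub-hierarchy
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(orderXml));
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String name = reader.getLocalName();
                if (name.equals("customer")) {
                    customer = reader.getElementText();
                } else if (name.equals("item")) {
                    String product = reader.getAttributeValue(null, "product");
                    // No XML escaping here; fine for a sketch, not for production.
                    splitMessages.add("<order-item customer=\"" + customer
                            + "\" product=\"" + product + "\"/>");
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("Could not split order message", e);
        }
        return splitMessages;
    }

    public static void main(String[] args) {
        String order = "<order><customer>Joe</customer><items>"
                + "<item product=\"111\"/><item product=\"222\"/></items></order>";
        split(order).forEach(System.out::println);
    }
}
```

Because the input is read as a stream and only the customer name is held between events, the memory needed for parsing does not grow with the number of items. The sketch does still collect the split messages in a list, so for a truly large message each one would have to be handed to its destination as it is produced. Streaming is the same basic property that makes huge message processing (point 4) feasible.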
Smooks is proving to be a very useful tool in the ESB/SOA world. It has been part of JBossESB since the early days and has, more recently, been made available in a number of other ESB Platforms (Mule and Apache Synapse/WSO2, with others to follow).
For more on Smooks and the features outlined above, visit the project pages and download the v1.0 distribution. There are lots of well documented examples you can run out-of-the-box. Your feedback will be greatly appreciated!!
Message #252158
Re: Smooks project page
Looks like something went wrong with the links in the article...The Smooks project page can be found here:
http://milyn.codehaus.org/Smooks
Regards,
Daniel
Bleah. Know what happened? Those stupid &rdquo; things happened - all of the links had the entities around the attributes instead of the regular ASCII quotes. Sorry about that. It's been fixed.
Message #252177
Re: Performance
We don't have a formally documented set of metrics.
We've performed quite a few tests (of different kinds) in order to give ourselves an idea of the overhead. For example, comparing a very simple streamed XSL transform inside and outside Smooks, our tests show Smooks adding approx. 5%. Two points to note here, however:
1. The XSL in question was very simple and the test was loaded in favor of the XSL processor. If the transformation becomes more complex, requiring more random access, or simply requiring transforms not easily performed with XSLT e.g. date/string manipulation, Smooks offers more options for performing the transform in a more performant manner. In this case, we were basically interested in seeing the worst-case scenario from a Smooks perspective.
2. All non-commercial XSL Processors we tried broke down once the input message reached a certain size (< 100MB). Smooks was able to implement and perform the same transform up to 4GB (that was as far as we went) by implementing fragment based transforms. Greater than 100MB, you might say! Well sure, not many need this capability, but that's not the only point there IMO. Smooks' ability to offer alternatives and to keep going is in itself the interesting point (for me at least :-) ).
We implemented performance tests of other kinds too, but the problem is that we don't seem to have anything to compare against in order to make real sense of any numbers.
We've also done extensive profiling on Smooks in an effort to eliminate memory leaks etc. The code is written with a close eye to performance (the whole visitor model is stateless etc). Please take a look!!
There are of course many more tests we could run e.g. comparing the Java Binding functionality with frameworks such as XStream, JAXB etc, but the bottom line is... Smooks is in use in quite a few mission critical envs now and performance is not something we've had any "complaints" about (quite the opposite in fact). This of course is a somewhat hollow statement so one would really need to take Smooks and try it in their solution. I'm sure anyone that tries it will not be unhappy.
We're not claiming to be faster than X or Y. We're hoping people can look at Smooks and see that it has many quite cool capabilities that cannot be achieved easily elsewhere (at least in open source). We're convinced that the flexibility, maintainability etc that it offers will more than compensate for any performance related shortcomings it *might* have.
Message #252178
Re: Performance
Tom did some performance comparisons with XSLT a year ago. Not sure if he has anything more recent. But I'm sure if he has he'll post here.
Hey Mark... thanks :-)
So those XSLT comparison tests were performed using the Smooks v0.9 codebase, which didn't support the SAX based processing model. Now that Smooks supports a SAX processing model, those figures have swung back in Smooks' favor (considerably :-) ).
Message #252179
Re: Performance
Yeah, I figured they'd now represent a worst-case scenario. But I knew you'd have something ready :-)
Message #252185
XStream, java queues and db optimization
Looks good, it solves a very core processing issue in many systems - especially in the financial world. Data is transformed and enriched as it passes through multiple departments (ledger and subledgers) and reconciled across departments. Within a single application, data undergoes multiple transformations. So support for Java queues as a destination will enhance the throughput - transaction semantics and failover can be taken care of at the boundaries.
When a database is the destination, throughput becomes a concern if records are written one after another - batching is one solution. Some benchmarks we have done internally on db performance are at http://onelinejdbc.wiki.sourceforge.net/PerformanceComparison.
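For anyone unfamiliar with the batching idea, a minimal sketch follows. It is deliberately independent of any database or of Smooks: `BatchingSketch` and its `sink` callback are hypothetical names, and the sink simply receives each full batch. With JDBC, the sink would typically call `PreparedStatement.addBatch()` for each record and then `executeBatch()` once per batch instead of executing one statement per record.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: buffers records and hands them to a sink in batches,
// so the destination sees one write per batch rather than one per record.
class BatchingSketch<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();

    BatchingSketch(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one record and flushes automatically when the batch is full. */
    void write(T record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends any buffered records to the sink; call once more at end of input. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer)); // copy so the sink may keep the list
        buffer.clear();
    }

    public static void main(String[] args) {
        BatchingSketch<String> writer =
                new BatchingSketch<>(2, batch -> System.out.println("writing " + batch));
        writer.write("rec1");
        writer.write("rec2");
        writer.write("rec3");
        writer.flush(); // pushes the final partial batch
    }
}
```

The final `flush()` matters: without it the last partial batch would never reach the destination.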
Finally, on comparing with tools like XStream: we use XStream in our product in places where the volume of data to be dealt with is low. Comparing XStream with a straight Java reflection based binding (public fields or setters), the reflection based binding is twice as fast as XStream (with XStream using XPP). So if Smooks is using reflection where field and setter method objects are cached, it should be fine.
Thanks Sunil & Abinash http://sunilabinash.vox.com/ http://oneline.wiki.sourceforge.net/index.html
Message #252194
Re: XStream, java queues and db optimization
Finally, on comparing with tools like XStream: we use XStream in our product in places where the volume of data to be dealt with is low. Comparing XStream with a straight Java reflection based binding (public fields or setters), the reflection based binding is twice as fast as XStream (with XStream using XPP). So if Smooks is using reflection where field and setter method objects are cached, it should be fine.
Maurice Zeijen is currently looking at optimizing this using Javassist. We hope this will boost Java Binding performance considerably.
Message #252196
Re: XStream, java queues and db optimization
In Smooks 1.0 setter methods are being cached.
I am currently working on a BCM (Byte Code Manipulation) enhanced setter mechanism. It looks like calling the setter methods through a Javassist generated class, instead of a reflective method, makes the invocation of the setter method 10x faster. But I am still at an early stage...
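For readers wondering what "setter methods are being cached" can look like in practice, here is a small sketch of the general technique. It is not the Smooks 1.0 implementation: `SetterCacheSketch`, its cache key format and the `Customer` bean are all assumptions made for the example. The point is only that the reflective lookup happens once per class/property pair, and later calls reuse the cached `Method`.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch (not Smooks code): looks up a bean setter via
// reflection once, then reuses the cached Method for later invocations.
class SetterCacheSketch {
    // Keyed by class name + property; a real cache would also need to
    // consider class loaders and overloaded setters.
    private static final Map<String, Method> CACHE = new ConcurrentHashMap<>();

    static void set(Object bean, String property, Object value) {
        Class<?> type = bean.getClass();
        String key = type.getName() + "#" + property;
        Method setter = CACHE.computeIfAbsent(key, k -> findSetter(type, property));
        try {
            setter.invoke(bean, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not set " + property, e);
        }
    }

    /** The expensive part: scanning the public methods for "setXxx(arg)". */
    private static Method findSetter(Class<?> type, String property) {
        String name = "set" + Character.toUpperCase(property.charAt(0))
                + property.substring(1);
        for (Method method : type.getMethods()) {
            if (method.getName().equals(name) && method.getParameterCount() == 1) {
                return method;
            }
        }
        throw new IllegalArgumentException("No setter for " + property + " on " + type);
    }

    static int cacheSize() {
        return CACHE.size();
    }

    /** Example bean used only by this sketch. */
    public static class Customer {
        private String name;
        public void setName(String name) { this.name = name; }
        public String getName() { return name; }
    }

    public static void main(String[] args) {
        Customer customer = new Customer();
        set(customer, "name", "Joe");
        System.out.println(customer.getName() + " / cached setters: " + cacheSize());
    }
}
```

Caching removes the lookup cost but every call still goes through `Method.invoke`. A generated class that calls the setter directly, as with the Javassist approach described above, avoids that reflective call as well.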
Message #252204
RE: Comparable tool
OpenAdaptor has been rewritten recently to be a Spring application that relies on ApplicationContext.xml to declare all the connectors, processors, converters and the routing logic. There is also support for enriching data through a scriptprocessor bean using JavaScript. For most needs, the out of the box components are sufficient and do not require any custom Java code to be written. However, the interfaces are there to be implemented if custom components are required.
Sarwar
Message #252214
Re: RE: Comparable tool
Sounds a lot like what OpenAdaptor does (http://www.openadaptor.org)
Hey Sarwar.
Looks like an interesting project and looks to be covering some of the same usecases. I'll download and have a look at it. Thanks for pointing it out :-)
Message #252226
Re: RE: Comparable tool
Sounds a lot like what OpenAdaptor does (http://www.openadaptor.org)
Hey Sarwar.
Looks like an interesting project and looks to be covering some of the same usecases. I'll download and have a look at it. Thanks for pointing it out :-)
I had a quick look at openadaptor and it looks to me as though it has more in common with ESB i.e. exposing endpoints of different types (Http, JMS...), with routing and of course it has some conversion capabilities.
Message #252256
How does it compare with JIBX and JAXB
Hi
Smooks sounds like a great alternative to JIBX that I am using in one of our projects.
How does it compare with JIBX?
Message #252427
Re: How does it compare with JIBX and JAXB
Hi
Smooks sounds like a great alternative to JIBX that I am using in one of our projects.
How does it compare with JIBX?
We haven't done comparisons yet between the Javabean cartridge and other comparable libraries like JIBX and JAXB. You must know that the Javabean cartridge is only a part of Smooks, mainly used within a Transformation. For instance from EDI -> Javabeans -> XML. We should however do some comparisons some day.
I haven't actually worked with JIBX yet. But I took a quick look on their website. Here are some differences between JIBX and Smooks:
- JIBX uses a compiler to enhance the bean classes to be able to marshal and unmarshal XML. Smooks uses (cached) reflection to call the bean methods. In the next version we will probably use Byte Code Manipulation to generate classes that call the bean methods, which is a lot faster.
- With JIBX you can directly marshal objects to XML. With Smooks you need to create a template file, using FreeMarker or StringTemplate, to be able to write XML.
- I am not sure if JIBX can handle the flexible selection of nodes as Smooks can. What I mean is that I think Smooks can better handle a mismatch between the XML and the Javabeans.
- JIBX can only work with XML, whereas Smooks can work with just about anything structured.
The following advice is based on my gut feeling, because I haven't done an actual benchmark. If you simply want to marshal and unmarshal XML and you don't mind the compile step, then use JIBX (or JAXB); but if you need more flexibility, more control, or need to process something other than XML, use Smooks.