News: Kettle 2.2 goes LGPL

  1. Kettle 2.2 goes LGPL (12 messages)

    Up to now, the options for creating Management Information Systems (MIS) using open source tools were limited. Although numerous possibilities are appearing on the reporting side, the absence of a complete open-source ETL (Extraction, Transformation & Loading) tool has been causing problems for organisations with little money.

    I'm pleased to be able to place 158,000 lines of Java code under the LGPL license in the form of a complete ETL environment called Kettle. Kettle has been more than 4 years in the making and in that period has grown into a very usable and complete ETL tool. It is available for download from the Kettle project homepage at Javaforge.

    Kettle comes with 4 tools:
    • Spoon: GUI allowing you to design complex transformations
    • Pan: Batch executor of transformations (XML or in repository)
    • Chef: GUI allowing you to design complex jobs
    • Kitchen: Batch executor of jobs (XML or in repository)
    Note: Spoon & Chef use SWT from Eclipse to provide a native GUI on Windows, Linux, OSX and Motif. However, no dependencies exist between the GUIs and the runtime code.

    Interesting things to know about Kettle:
    • Transformations and jobs are made up of 100% metadata. This metadata is parsed by Kettle and executed. No code generation is involved.
    • At the moment there are around 35 different step types available to create transformations and about 10 job entry types.
    • Almost every popular database is supported including MySQL, PostgreSQL, MS Access, SQL Server, Oracle, DB2, Sybase, Informix, MaxDB, Firebird, AS/400, Ingres, Caché, ...
    • Kettle can be used for many things, but was developed to create and populate data warehouses. As such, slowly changing dimensions (Kimball Types I, II and III) and junk dimensions are supported in a single step.
    • Many advanced features exist to allow fast inserts, such as batch updates.
    • Kettle is one of the few ETL tools on the market to support partitioned tables on PostgreSQL by allowing records to be inserted into different inherited tables.
    • Kettle provides a plugin mechanism that allows you to create plugins for any possible data acquisition or transformation purpose. For example, the German company Proratio (http://www.proratio.de/kettle/kettleen/) provides a Kettle plugin to read information from an SAP application server. They have a free trial you can use for 30 days.
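
    For readers unfamiliar with the Kimball terminology above, here is a minimal in-memory sketch of what a Type II slowly changing dimension update does. It is not Kettle's implementation; the class, field and method names are all hypothetical, and a real dimension would live in a database table rather than a list.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

/** In-memory sketch of a Type II slowly changing dimension (hypothetical names throughout). */
final class ScdType2Sketch {

    /** One version of a dimension record. */
    static final class DimRow {
        final String naturalKey;
        final String attribute;
        final int version;
        final LocalDate validFrom;
        LocalDate validTo; // null while this is the current version

        DimRow(String naturalKey, String attribute, int version, LocalDate validFrom) {
            this.naturalKey = naturalKey;
            this.attribute = attribute;
            this.version = version;
            this.validFrom = validFrom;
        }
    }

    final List<DimRow> rows = new ArrayList<>();

    /** Returns the open (current) version for a natural key, or null if none exists. */
    DimRow current(String naturalKey) {
        for (DimRow row : rows) {
            if (row.naturalKey.equals(naturalKey) && row.validTo == null) {
                return row;
            }
        }
        return null;
    }

    /** Applies one incoming source record as of the given date. */
    void apply(String naturalKey, String attribute, LocalDate asOf) {
        DimRow cur = current(naturalKey);
        if (cur == null) {
            rows.add(new DimRow(naturalKey, attribute, 1, asOf)); // first version
        } else if (!cur.attribute.equals(attribute)) {
            cur.validTo = asOf; // close the old version, keeping its history
            rows.add(new DimRow(naturalKey, attribute, cur.version + 1, asOf));
        }
        // Unchanged attribute: nothing to do.
    }

    public static void main(String[] args) {
        ScdType2Sketch dim = new ScdType2Sketch();
        dim.apply("C1", "Brussels", LocalDate.of(2005, 1, 1));
        dim.apply("C1", "Ghent", LocalDate.of(2005, 6, 1));
        for (DimRow r : dim.rows) {
            System.out.println(r.naturalKey + " v" + r.version + " " + r.attribute
                    + " [" + r.validFrom + ", " + r.validTo + ")");
        }
    }
}
```

    By contrast, a Type I update would simply overwrite the attribute in place, and a Type III update would keep the previous value in an extra column instead of adding a row.
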
    For screencaps to see how Kettle works, go to http://www.kettle.be/en/screenshots.htm

    This message serves as an invitation to all Java and Data Warehouse developers to come and help make Kettle even better. Let me know what you all think and please join the Kettle development team!

    I hope you all have as much fun using Kettle as I had writing it.

    Matt
    ___________________________________________________________________________
    Matt Casters
    Brussels/Belgium, December 2005
    matt.casters@kettle.be

    Threaded Messages (12)

  2. Looks Good

    I took a look at the demos and it looks good on the usability front. Thanks for the great work and I look forward to testing it out. I will be comparing it with Octopus from ObjectWeb.

    Mpume
  3. error

    Seems very useful. I tried to download and run Spoon and I get the following error.

    I tried using JDK 1.4.1_05 and JDK 1.4.2_07

    Exception in thread "main" java.lang.UnsupportedClassVersionError: be/ibridge/kettle/spoon/Spoon (Unsupported major.minor version 49.0)
            at java.lang.ClassLoader.defineClass0(Native Method)
            at java.lang.ClassLoader.defineClass(ClassLoader.java:502)
            at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:123)
            at java.net.URLClassLoader.defineClass(URLClassLoader.java:250)
            at java.net.URLClassLoader.access$100(URLClassLoader.java:54)
            at java.net.URLClassLoader$1.run(URLClassLoader.java:193)
            at java.security.AccessController.doPrivileged(Native Method)
            at java.net.URLClassLoader.findClass(URLClassLoader.java:186)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:299)
            at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:272)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:255)
            at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:315)

    I could not find a support forum or email address on the Kettle site.

    Sanjiv
  4. Kettle Project

    The Kettle project page is at: http://kettle.javaforge.com
    The problem you're having is caused by the automatic build by the Javaforge service (which is using Java 1.5...)

    The forum is here: http://www.javaforge.com/proj/forum.do?proj_id=318

    The problem will be fixed by tomorrow. (or you could try JRE 1.5)
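
    For anyone hitting the same error: "major.minor version 49.0" in the stack trace refers to the class file format written by a Java 5 (1.5) compiler, which a 1.4 JVM refuses to load. A minimal sketch for checking which version a class file was compiled for is shown below. The class name ClassVersionCheck is made up for this example, and the sketch itself needs Java 7 or later because it uses java.nio.file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Reads the class file header: magic (4 bytes), minor version (2), major version (2). */
final class ClassVersionCheck {

    /** Returns the major version stored in the first 8 bytes of a class file. */
    static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("Too short to be a class file");
        }
        ByteBuffer header = ByteBuffer.wrap(classBytes); // big-endian by default, like class files
        if (header.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Not a Java class file");
        }
        header.getShort();                 // minor version, ignored here
        return header.getShort() & 0xFFFF; // major version as an unsigned value
    }

    /** Maps the major versions relevant to this thread to Java release names. */
    static String javaRelease(int major) {
        switch (major) {
            case 46: return "1.2";
            case 47: return "1.3";
            case 48: return "1.4";
            case 49: return "5.0";
            default: return "other";
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ClassVersionCheck path/to/Some.class
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        int major = majorVersion(bytes);
        System.out.println("major version " + major + " -> Java " + javaRelease(major));
    }
}
```

    A class reporting major version 49 needs a 1.5 runtime; one reporting 48 will also load on JDK 1.4.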

    Cheers,
    Matt
  5. error (FIXED)

    My informants reported that this problem is solved.
    To update, just grab the kettle.jar and put it in the lib/ directory.

    Matt
  6. Kettle

    Do you think we can load from multiple data sources while the target is one?

    The data sources will be: Mainframe IMS, Oracle, Sybase, DB2?

    Thx

    Jerry
  7. Kettle

    IMS is not (yet) supported unless of course you have an ODBC driver for it. In that case the answer is yes. (The other 3 are a no-brainer)

    If you have a JDBC driver for IMS, I'd gladly add support for it though...

    Combining the sources can be done with database lookups, in memory lookups (stream lookups), stream joins, ...
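
    To illustrate the in-memory (stream) lookup idea mentioned above, here is a minimal sketch: the smaller stream is read into a hash map, and each row of the main stream is then enriched from that map. This is not Kettle code; the class name, the row representation as String arrays and the default value are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of an in-memory lookup between two row streams (hypothetical names). */
final class StreamLookupSketch {

    /**
     * Appends the looked-up value to each main row, or defaultValue when the key is missing.
     * Lookup rows are assumed to be {key, value} pairs.
     */
    static List<String[]> lookup(List<String[]> mainRows, int keyIndex,
                                 List<String[]> lookupRows, String defaultValue) {
        // Read the (small) lookup stream fully into memory: key -> value.
        Map<String, String> table = new HashMap<>();
        for (String[] row : lookupRows) {
            table.put(row[0], row[1]);
        }
        // Stream over the main rows and add one field to each.
        List<String[]> out = new ArrayList<>();
        for (String[] row : mainRows) {
            String[] enriched = Arrays.copyOf(row, row.length + 1);
            String value = table.get(row[keyIndex]);
            enriched[row.length] = value != null ? value : defaultValue;
            out.add(enriched);
        }
        return out;
    }

    public static void main(String[] args) {
        // Imagine the orders come from one database and the countries from another.
        List<String[]> orders = Arrays.asList(
                new String[] {"1001", "BE"},
                new String[] {"1002", "XX"});
        List<String[]> countries = Arrays.asList(
                new String[] {"BE", "Belgium"},
                new String[] {"NL", "Netherlands"});
        for (String[] row : lookup(orders, 1, countries, "unknown")) {
            System.out.println(String.join(", ", row));
        }
    }
}
```

    Because the lookup side is held entirely in memory, this approach suits a small reference table joined against a large main stream; for a large lookup side, a database lookup or a sorted join is the more natural fit.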

    Cheers,

    Matt
  8. Plugins

    Thanks for contributing this useful software to the Java community!
    I've tried a simple text-file-to-database transformation and it works well.
    Is there any further documentation on the plugin mechanism? An FTP/HTTP/URL file reader would be quite useful, and maybe some web-service-related input...

    Kind regards,
    Andreas
  9. Plugins

    Hi,

    At the moment I only support FTP through a job entry (in Chef). Spoon then reads the text file in the normal way.
    That way you get more code reuse.
    I'm sure that a URL file reader wouldn't be much of a stretch.

    The job entries are not yet completely plugin-enabled, but that should get there in a couple of weeks or so. (docs...)

    I'm working on the plugin documentation, but basically you do it by implementing a kind of Model-View-Controller strategy:

      A class that implements StepMetaInterface (model: metadata)
      A class that implements StepDialogInterface (viewer: dialog)
      A class that implements StepInterface (controller: execute/read/write/modify/...)

    As an extra I added StepDataInterface because from version 3.0 Kettle will use thread pools and I want to be able to kill and restart threads at will. (better performance when a large number of steps is used) The data class is used to store all intermediate data like open ResultSets, streams, whatever...
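
    The division of roles described above can be sketched roughly as follows. Note that these are simplified, hypothetical stand-ins written for this example and not the real Kettle interfaces, whose method signatures are documented in the API docs. The dialog class is left out because it would need SWT.

```java
import java.util.Locale;

/** Hypothetical, simplified stand-ins for the plugin roles; NOT the real Kettle interfaces. */
final class StepPluginSketch {

    /** Model: the step's settings (the role the post assigns to StepMetaInterface). */
    static final class UpperCaseMeta {
        final int fieldIndex; // which field of the row to convert

        UpperCaseMeta(int fieldIndex) {
            this.fieldIndex = fieldIndex;
        }
    }

    /** Runtime state kept apart from the logic (the role assigned to StepDataInterface). */
    static final class UpperCaseData {
        long rowsProcessed;
    }

    /** Controller: processes rows using the settings and the runtime state (StepInterface role). */
    static final class UpperCaseStep {
        private final UpperCaseMeta meta;
        private final UpperCaseData data;

        UpperCaseStep(UpperCaseMeta meta, UpperCaseData data) {
            this.meta = meta;
            this.data = data;
        }

        /** Returns a copy of the row with the configured field upper-cased. */
        String[] processRow(String[] row) {
            String[] out = row.clone();
            out[meta.fieldIndex] = out[meta.fieldIndex].toUpperCase(Locale.ROOT);
            data.rowsProcessed++;
            return out;
        }
    }

    public static void main(String[] args) {
        UpperCaseData data = new UpperCaseData();
        UpperCaseStep step = new UpperCaseStep(new UpperCaseMeta(1), data);
        String[] result = step.processRow(new String[] {"1", "kettle"});
        System.out.println(String.join(", ", result) + " (rows: " + data.rowsProcessed + ")");
    }
}
```

    Keeping the counters, open streams and similar state in a separate data object means a fresh worker can be handed the same state, which fits the stated goal of stopping and restarting threads at will.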

    The simplest step is DummyTrans: check out the package be/ibridge/kettle/trans/step/dummytrans
    Copy the code into a new project / package and your plugin is almost done.
    Grab a plugin.xml from the examples in the distribution.

    --> The interfaces are shown here: http://www.kettle.be/docs/api/be/ibridge/kettle/trans/step/package-summary.html

    "ant javadoc" will work on the source. The output will end up in your Kettle directory under docs/api

    Thanks for the kind feedback.

    Matt
  10. Kettle 2.2 goes LGPL

    Matt,

    Do you have a development roadmap for where you want to take Kettle?
  11. Roadmap (sort of)

    Hi Chris,

    Here is a short list of things I'd like to do sooner or later... (ordered from sooner to later ;-)

    - Reusable transformations dubbed Mappings (in 'honour' of Oracle Warehouse Builder ;-)): the steps can be found now in the Experimental category. (almost complete)
    - Creation of a series of "Convert" steps to make data type conversions even easier
    - Re-enablement of Webstart for Spoon and Chef.
    - Plugins for database connections (why didn't I do this before :-( )
    - Thread pools for better performance when running large transformations
    - Creation of WEKA machine learning plugin (http://www.cs.waikato.ac.nz/~ml/index.html) for source table analyses, data mining, categorisation (any takers?)
    - Anything I need on my daily job as a data warehouse architect ;-)

    Cheers,

    Matt
  12. Kettle 2.2 goes LGPL

    Matt,

    Do you have any examples that show how to use a web service as a source or target?

    I want to know whether it is possible to use Kettle to read from a Spring web service as a source and insert the results into a database using JDBC.

    Thanks in advance for your answer

    Jaime MADRID
  13. Project homepage & Pentaho

    The Kettle project pages contain a lot of answers to various questions on Kettle usage as well as a Weekly Kettle tip: go to http://kettle.javaforge.com

    Also see the announcement of the Kettle acquisition by Pentaho: http://www.pentaho.org/index.php?option=com_content&task=view&id=130&Itemid=238

    And the FAQ on the merger: http://www.pentaho.org/index.php?option=com_content&task=view&id=131&Itemid=283

    Thanks,
    Matt