has been released. NUX is an open-source toolkit meant to make working with XML easier, by providing easy access to Xquery and Xpath, full-text searching through Lucene, pools and factories for documents, builders, transforms, and queries, a streaming path filter API, and the ability to query badly-formed HTML.
It is designed to enable maximum efficiency for on-the-fly matchmaking combining structured and fuzzy fulltext search in realtime streaming applications such as XQuery based XML message queues, publish-subscribe systems for Blogs/newsfeeds, text chat, data acquisition and distribution systems, application level routers, firewalls, classifiers, etc.
Arbitrary Lucene fulltext queries can be run from Java or from XQuery/XPath/XSLT via a simple extension function. The former approach is more flexible whereas the latter is more convenient. Lucene analyzers can split on whitespace, normalize to lower case for case insensitivity, ignore common terms with little discriminatory value such as "he", "in", "and" (stop words), reduce the terms to their natural linguistic root form such as "fishing" being reduced to "fish" (stemming), resolve synonyms/inflexions/thesauri (upon indexing and/or querying), etc. Also see Lucene Query Syntax as well as Query Parser Rules.
For the full release announcement, see http://lists.ibiblio.org/pipermail/xom-interest/2005-May/002299.html