News: Lucene 2.9 Released

  1. Lucene 2.9 Released (13 messages)

    On behalf of the Lucene dev community (a growing community far larger than just the committers), I would like to announce the release of Lucene 2.9.

    While we generally try to maintain full backwards compatibility between major versions, Lucene 2.9 has a variety of breaks that are spelled out in the 'Changes in backwards compatibility policy' section of CHANGES.txt. We recommend that you recompile your application with Lucene 2.9 rather than attempting to “drop” it in. This will alert you to any issues you may have to fix if you are affected by one of the backward compatibility breaks. As always, it's a really good idea to thoroughly read CHANGES.txt before upgrading.

    Lucene 2.9 comes with a bevy of new features, including:

    * Per segment searching and caching (can lead to much faster reopen among other things)
    * Near real-time search capabilities added to IndexWriter
    * New Query types
    * Smarter, more scalable multi-term queries (wildcard, range, etc.)
    * A freshly optimized Collector/Scorer API
    * Improved Unicode support and the addition of Collation contrib
    * A new Attribute based TokenStream API
    * A new QueryParser framework in contrib with a core QueryParser replacement impl included.
    * Scoring is now optional when sorting by Field, or using a custom Collector, gaining sizable performance when scores are not required.
    * New analyzers (PersianAnalyzer, ArabicAnalyzer, SmartChineseAnalyzer)
    * New fast-vector-highlighter for large documents
    * Lucene now includes high-performance handling of numeric fields. Such fields are indexed with a trie structure, enabling simple-to-use and much faster numeric range searching without having to externally pre-process numeric values into textual values.

    And many, many more features, bug fixes, optimizations, and various improvements.

    You can find the full list of changes here: http://lucene.apache.org/java/2_9_0/changes/Changes.html

    Many changes have also occurred in Lucene's Contrib area: http://lucene.apache.org/java/2_9_0/changes/Contrib-Changes.html

    Binary and source distributions are available at http://www.apache.org/dyn/closer.cgi/lucene/java/

    Lucene artifacts are also available in the Maven2 repository at http://repo1.maven.org/maven2/org/apache/lucene/

    The Next Release: The next release will be Lucene 3.0. This should come along shortly, and will remove all of the deprecated code in Lucene 2.9. Lucene 3.0 will also be the first release to move from Java 1.4 to Java 1.5 as a requirement.

    Thanks,
    Mark Miller
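    The trie-indexed numeric fields mentioned in the feature list can be pictured with a small sketch. This is not Lucene's actual encoding or API: the class `TriePrefixSketch`, the method `prefixTerms`, the `precisionStep` parameter as used here, and the `shift:prefix` term format are all assumptions made up for illustration. The idea it tries to show is only that each value is indexed at several precisions, so that a wide range can be matched by a few coarse terms.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a toy version of "index each number at several precisions".
// None of these names come from Lucene.
final class TriePrefixSketch {

    /**
     * Returns one term per precision level for a non-negative value.
     * Each term is "shift:prefix", where prefix is the value with its
     * lowest `shift` bits dropped.
     */
    static List<String> prefixTerms(long value, int precisionStep) {
        if (value < 0 || precisionStep < 1 || precisionStep > 63) {
            throw new IllegalArgumentException("unsupported value or precisionStep");
        }
        List<String> terms = new ArrayList<>();
        for (int shift = 0; shift < 64; shift += precisionStep) {
            terms.add(shift + ":" + (value >>> shift));
        }
        return terms;
    }

    public static void main(String[] args) {
        // prints [0:70000, 16:1, 32:0, 48:0]
        System.out.println(prefixTerms(70000L, 16));
    }
}
```

    With a step of 16 in this toy scheme, every value from 65536 through 131071 carries the term `16:1`, so a range query covering that whole block could match one coarse term instead of enumerating 65,536 distinct values.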

    Threaded Messages (13)

  2. congrats!

    Lucene is an exceptional open source library for text mining and search. Many of today's most popular websites and web tools are powered by it. Simply amazing. Keep up the good work.
  3. Just wondering, as I'm totally new to this kind of technology, but what is the advantage of using Lucene vs the full text search capabilities of a database such as PostgreSQL?
  4. Re: Lucene 2.9 Released

    Near real-time search capabilities added to IndexWriter
    You have just made my day, thanks! Lucene was/is/will be great :) Btw, you have skipped a few versions, why not 3.0 now?
  5. Re: Lucene 2.9 Released

    Excellent news. Wonder if there is a new book in the works too.
  6. Response

    Btw, you have skipped a few versions, why not 3.0 now?
    We have a standard policy of releasing a .9 version with many API deprecations; the next major version then removes the deprecated code. It's part of our back-compat policy. 3.0 likely won't include any new features.
    RE: a new book
    Yes, the new Lucene in Action will cover 3.0 and will be finished right after it's released. You can get the early edition (MEAP) now, and it's fantastic.
  7. Re: Lucene 2.9 Released

    Great job!
  8. Lucene VS db search

    Generally, DB full text search has been very slow and lacking in customizability. I won't single any one out, but go do some perf tests and you will quickly see what I mean. They can be very useful for limited needs, but you will find something like Lucene to be *much* faster, *much* more scalable, and with scads more customizability and options.
  9. Re: Lucene VS db search

    Generally, DB full text search has been very slow and lacking in customizability. I won't single any one out, but go do some perf tests and you will quickly see what I mean. They can be very useful for limited needs, but you will find something like Lucene to be *much* faster, *much* more scalable, and with scads more customizability and options.
    Plus using a db's search means you will need a db.
  10. Re: Lucene VS db search

    Plus using a db's search means you will need a db.
    I considered using Lucene, but I already have databases, and in general I see major gaps in doing only full-text indexing without a real database. I think many projects that use Lucene for plain full-text indexing (as Google does for websites) overlook the benefits of having structured meta information. I also could never warm to object-oriented databases. And if I want to use Lucene over the DB fulltext features, how can I combine an SQL query with the fulltext results? - I do see a challenge here. Is there a solution?
  11. Re: Lucene VS db search

    And if I want to use Lucene over the DB fulltext features, how can I combine an SQL query with the fulltext results? - I do see a challenge here. Is there a solution?
    Do not use SQL at all, why bother? :) I mean it.
  12. Re: Lucene VS db search

    Do not use SQL at all, why bother? :)
    I mean it.
    From my point of view there are major drawbacks in using just a full-text engine. Sometimes I have the feeling that it is fashionable to just get rid of relational database structures. I cannot understand this movement, as it fits only certain requirements.
  13. Re: Lucene and db search

    One approach is for the Lucene query to return primary keys. You can then use those to execute your SQL query... WHERE pk IN (...)

    Given you are often "paging" the search results this works well - limiting the number of primary keys to the number you want to display on the first page of results.
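    A minimal sketch of that approach, assuming the ranked primary keys have already come back from a Lucene search as a list of longs. Everything named here is hypothetical (`PagedKeyQuery`, `pageOf`, `buildSql`, and the `articles`/`id` table and column names); it only shows the paging and the parameterized IN clause, not the Lucene or JDBC calls around them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: not part of Lucene or JDBC.
final class PagedKeyQuery {

    /** Returns the primary keys for one page of ranked hits (page is 0-based). */
    static List<Long> pageOf(List<Long> rankedKeys, int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        long start = (long) page * pageSize; // long arithmetic avoids int overflow
        if (start >= rankedKeys.size()) {
            return Collections.emptyList();
        }
        int end = (int) Math.min(start + pageSize, rankedKeys.size());
        return new ArrayList<>(rankedKeys.subList((int) start, end));
    }

    /**
     * Builds "SELECT * FROM table WHERE pkColumn IN (?, ?, ...)" with one
     * placeholder per key. Table and column names are concatenated as-is,
     * so they must be trusted constants, never user input.
     */
    static String buildSql(String table, String pkColumn, int keyCount) {
        if (keyCount <= 0) {
            throw new IllegalArgumentException("need at least one key");
        }
        StringBuilder sql = new StringBuilder("SELECT * FROM ")
                .append(table).append(" WHERE ").append(pkColumn).append(" IN (");
        for (int i = 0; i < keyCount; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append('?');
        }
        return sql.append(')').toString();
    }

    public static void main(String[] args) {
        // Stand-in for keys returned by a search, best match first.
        List<Long> hits = Arrays.asList(42L, 7L, 19L, 3L, 88L);
        List<Long> firstPage = pageOf(hits, 0, 2);
        // prints SELECT * FROM articles WHERE id IN (?, ?)
        System.out.println(buildSql("articles", "id", firstPage.size()));
    }
}
```

    The keys for the page would then be bound to the placeholders of a prepared statement. Note that an IN clause does not preserve the order of the keys, so the returned rows need to be re-sorted to match the search ranking before display.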
  14. Re: Lucene and db search

    One approach is for the Lucene query to return primary keys. You can then use those to execute your SQL query... WHERE pk IN (...)

    Given you are often "paging" the search results this works well - limiting the number of primary keys to the number you want to display on the first page of results.
    I was also considering this option and I think this is the best. Thanks