-
Lucene 2.9 Released (13 messages)
- Posted by: Mark Miller
- Posted on: September 28 2009 07:56 EDT
On behalf of the Lucene dev community (a growing community far larger than just the committers) I would like to announce the release of Lucene 2.9.
While we generally try to maintain full backwards compatibility between major versions, Lucene 2.9 has a variety of breaks that are spelled out in the 'Changes in backwards compatibility policy' section of CHANGES.txt.
We recommend that you recompile your application with Lucene 2.9 rather than attempting to “drop” it in. This will alert you to any issues you may have to fix if you are affected by one of the backward compatibility breaks. As always, it's a really good idea to thoroughly read CHANGES.txt before upgrading.
Lucene 2.9 comes with a bevy of new features, including:
* Per segment searching and caching (can lead to much faster reopen among other things)
* Near real-time search capabilities added to IndexWriter
* New Query types
* Smarter, more scalable multi-term queries (wildcard, range, etc)
* A freshly optimized Collector/Scorer API
* Improved Unicode support and the addition of Collation contrib
* A new Attribute based TokenStream API
* A new QueryParser framework in contrib with a core QueryParser replacement impl included.
* Scoring is now optional when sorting by Field, or using a custom Collector, gaining sizable performance when scores are not required.
* New analyzers (PersianAnalyzer, ArabicAnalyzer, SmartChineseAnalyzer)
* New fast-vector-highlighter for large documents
* Lucene now includes high-performance handling of numeric fields. Such fields are indexed with a trie structure, enabling simple-to-use and much faster numeric range searching without having to externally pre-process numeric values into textual values.
And many, many more features, bug fixes, optimizations, and various improvements.
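To make the trie idea in the numeric-fields bullet a little more concrete, here is a minimal sketch in plain Java. It is not Lucene's code and does not use Lucene's API: the class `TrieRangeSketch`, its `SubRange` type, and the `split` method are hypothetical names invented for illustration, and the sketch only handles non-negative `long` values. The assumption is that each value is indexed several times, at progressively coarser precision (dropping `precisionStep` low-order bits per level), so that a range query can be covered by a handful of coarse prefix ranges plus a few fine-grained ones at the edges.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical class and method names, no Lucene dependency.
final class TrieRangeSketch {

    /** One sub-range over the terms stored at a single precision level. */
    static final class SubRange {
        final int shift; // number of low-order bits dropped at this level
        final long lo;   // inclusive lower bound on (value >>> shift)
        final long hi;   // inclusive upper bound on (value >>> shift)

        SubRange(int shift, long min, long max) {
            this.shift = shift;
            this.lo = min >>> shift;
            this.hi = max >>> shift;
        }

        /** True if the value, reduced to this precision level, falls inside the sub-range. */
        boolean matches(long value) {
            long prefix = value >>> shift;
            return prefix >= lo && prefix <= hi;
        }

        @Override
        public String toString() {
            return "shift=" + shift + " prefixes " + lo + ".." + hi;
        }
    }

    /** Splits the inclusive range [min, max] into sub-ranges of differing precision. */
    static List<SubRange> split(long min, long max, int precisionStep) {
        if (min < 0 || precisionStep < 1 || precisionStep > 32) {
            throw new IllegalArgumentException("sketch supports non-negative values and steps 1..32");
        }
        List<SubRange> out = new ArrayList<SubRange>();
        if (min > max) {
            return out; // an empty range matches nothing
        }
        for (int shift = 0; ; shift += precisionStep) {
            long diff = 1L << (shift + precisionStep);         // size of one block at the next level
            long mask = ((1L << precisionStep) - 1L) << shift; // bits decided at this level
            boolean hasLower = (min & mask) != 0L;             // min is not aligned to a coarser block
            boolean hasUpper = (max & mask) != mask;           // max does not end a coarser block
            long nextMin = (hasLower ? min + diff : min) & ~mask;
            long nextMax = (hasUpper ? max - diff : max) & ~mask;
            boolean done = shift + precisionStep >= 64
                    || nextMin > nextMax // nothing left in the middle
                    || nextMin < min     // arithmetic wrapped around
                    || nextMax > max;    // arithmetic wrapped around
            if (done) {
                out.add(new SubRange(shift, min, max));
                break;
            }
            if (hasLower) out.add(new SubRange(shift, min, min | mask));
            if (hasUpper) out.add(new SubRange(shift, max & ~mask, max));
            min = nextMin;
            max = nextMax;
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints the sub-ranges that together cover 5..58 with 2 bits per level.
        for (SubRange r : split(5, 58, 2)) {
            System.out.println(r);
        }
    }
}
```

The point of the design is that the number of sub-ranges grows with the number of precision levels rather than with the number of distinct values in the range, which is why this style of range search can avoid enumerating every matching term.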
You can find the full list of changes here:
http://lucene.apache.org/java/2_9_0/changes/Changes.html
Many changes have also occurred in Lucene's Contrib area:
http://lucene.apache.org/java/2_9_0/changes/Contrib-Changes.html
Binary and source distributions are available at
http://www.apache.org/dyn/closer.cgi/lucene/java/
Lucene artifacts are also available in the Maven2 repository at
http://repo1.maven.org/maven2/org/apache/lucene/
The Next Release:
The next release will be Lucene 3.0. This should come along shortly, and will remove all of the deprecated code in Lucene 2.9. Lucene 3.0 will also be the first release to move from Java 1.4 to Java 1.5 as a requirement.
Thanks,
Mark Miller
Threaded Messages (13)
- congrats! by sheesh kebab on September 28 2009 09:37 EDT
- Difference with native full text search in a DB? by augustientje bloem on September 29 2009 15:07 EDT
- Re: Lucene 2.9 Released by Istvan Soos on September 28 2009 11:48 EDT
- Re: Lucene 2.9 Released by Mark N on September 28 2009 12:51 EDT
- Response by Mark Miller on September 28 2009 16:24 EDT
- Re: Lucene 2.9 Released by Nicholas Hrycan on September 29 2009 03:05 EDT
- Lucene VS db search by Mark Miller on September 30 2009 08:53 EDT
- Re: Lucene VS db search by Mark N on October 01 2009 13:28 EDT
- Re: Lucene VS db search by Martin Wildam on October 02 2009 06:08 EDT
- Re: Lucene VS db search by Istvan Soos on October 02 2009 09:10 EDT
- Re: Lucene VS db search by Martin Wildam on October 16 2009 08:43 EDT
- Re: Lucene and db search by rob bygrave on October 05 2009 05:28 EDT
- Re: Lucene and db search by Martin Wildam on October 16 2009 08:43 EDT
-
congrats!
- Posted by: sheesh kebab
- Posted on: September 28 2009 09:37 EDT
- in response to Mark Miller
Lucene is an exceptional open source library for text mining and search. Many of today's most popular websites and web tools are powered by it. Simply amazing. Keep up the good work. -
Difference with native full text search in a DB?
- Posted by: augustientje bloem
- Posted on: September 29 2009 15:07 EDT
- in response to sheesh kebab
Just wondering, as I'm totally new to this kind of technology, but what is the advantage of using Lucene vs the full text search capabilities of a database such as PostgreSQL? -
Re: Lucene 2.9 Released
- Posted by: Istvan Soos
- Posted on: September 28 2009 11:48 EDT
- in response to Mark Miller
Near real-time search capabilities added to IndexWriter
You have just made my day, thanks! Lucene was/is/will be great :) Btw, you have skipped a few versions, why not 3.0 now? -
Re: Lucene 2.9 Released
- Posted by: Mark N
- Posted on: September 28 2009 12:51 EDT
- in response to Mark Miller
Excellent news. Wonder if there is a new book in the works too. -
Response
- Posted by: Mark Miller
- Posted on: September 28 2009 16:24 EDT
- in response to Mark Miller
Btw, you have skipped a few versions, why not 3.0 now?
We have a standard policy of releasing a .9 version with many API deprecations; the next major version then removes the deprecated APIs. This is part of our back-compat policy. 3.0 likely won't include any new features.
RE: a new book
Yes - the new Lucene in Action will cover 3.0 and will be finished right after it's released. You can get the early-edition MEAP now, and it's fantastic.
-
Re: Lucene 2.9 Released
- Posted by: Nicholas Hrycan
- Posted on: September 29 2009 03:05 EDT
- in response to Mark Miller
Great job! -
Lucene VS db search
- Posted by: Mark Miller
- Posted on: September 30 2009 08:53 EDT
- in response to Mark Miller
Generally, DB full text search has been very slow and lacking in customizability. I won't single any one out, but go do some perf tests and you will quickly see what I mean. They can be very useful for limited needs, but you will find something like Lucene to be *much* faster, *much* more scalable, and with scads more customizability and options. -
Re: Lucene VS db search
- Posted by: Mark N
- Posted on: October 01 2009 13:28 EDT
- in response to Mark Miller
Generally, DB full text search has been very slow and lacking in customizability. I won't single any one out, but go do some perf tests and you will quickly see what I mean. They can be very useful for limited needs, but you will find something like Lucene to be *much* faster, *much* more scalable, and with scads more customizability and options.
Plus using a db's search means you will need a db. -
Re: Lucene VS db search
- Posted by: Martin Wildam
- Posted on: October 02 2009 06:08 EDT
- in response to Mark N
Plus using a db's search means you will need a db.
I considered using Lucene, but I already have databases, and in general I see big gaps in doing only fulltext indexing without having a real database. I think many projects that use just Lucene for plain fulltext indexing (like Google does on websites) overlook the benefits of having structured meta information. I also could never make friends with object-oriented databases. And if I want to use Lucene over the DB fulltext features, how can I combine an SQL query with the fulltext results? - I do see a challenge here. Is there a solution? -
Re: Lucene VS db search
- Posted by: Istvan Soos
- Posted on: October 02 2009 09:10 EDT
- in response to Martin Wildam
And if I want to use Lucene over the DB fulltext features, how can I combine an SQL query with the fulltext results? - I do see a challenge here. Is there a solution?
Do not use SQL at all, why bother? :) I mean it. -
Re: Lucene VS db search
- Posted by: Martin Wildam
- Posted on: October 16 2009 08:43 EDT
- in response to Istvan Soos
Do not use SQL at all, why bother? :)
From my point of view there are major drawbacks in using just a fulltext engine. - Sometimes I have the feeling that it is modern to just get rid of relational database structures. I cannot understand this movement, as it fits only certain requirements.
I mean it. -
Re: Lucene and db search
- Posted by: rob bygrave
- Posted on: October 05 2009 17:28 EDT
- in response to Martin Wildam
One approach is for the Lucene query to return primary keys. You can then use those to execute your sql query... where pk in (...) Given you are often "paging" the search results this works well - limiting the number of primary keys to the number you want to display on the first page of results. -
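The approach described above could be sketched roughly as follows, in plain Java with no Lucene or JDBC dependency. Everything named here is hypothetical: the `PagedKeyLookup` class, its methods, and the `article` table and `id` column are assumptions made for illustration. The sketch assumes the Lucene search has already produced a relevance-ranked list of primary keys; it then cuts out one page of keys and builds a parameterized `IN (...)` statement with one placeholder per key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of Lucene or JDBC.
final class PagedKeyLookup {

    /** Returns the primary keys belonging to one page (pageIndex is zero-based). */
    static List<Long> pageOf(List<Long> rankedKeys, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long from = (long) pageIndex * pageSize;
        if (from >= rankedKeys.size()) {
            return Collections.emptyList(); // requested page is past the last hit
        }
        int to = (int) Math.min(rankedKeys.size(), from + pageSize);
        return new ArrayList<Long>(rankedKeys.subList((int) from, to));
    }

    /** Builds "SELECT * FROM table WHERE pk IN (?, ?, ...)" with one placeholder per key. */
    static String inClauseSql(String table, String pkColumn, int keyCount) {
        if (keyCount <= 0) {
            throw new IllegalArgumentException("need at least one key");
        }
        StringBuilder sql = new StringBuilder("SELECT * FROM ")
                .append(table).append(" WHERE ").append(pkColumn).append(" IN (");
        for (int i = 0; i < keyCount; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append('?');
        }
        return sql.append(')').toString();
    }

    public static void main(String[] args) {
        // Keys as they might come back from a search, best match first (made-up values).
        List<Long> hits = Arrays.asList(42L, 7L, 19L, 3L, 88L);
        List<Long> firstPage = pageOf(hits, 0, 2);
        System.out.println(firstPage);
        System.out.println(inClauseSql("article", "id", firstPage.size()));
    }
}
```

The keys would then be bound to the placeholders of a prepared statement. One caveat: an `IN` clause does not by itself return rows in the order the keys were listed, so the caller would typically re-sort the fetched rows to match the search ranking.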
Re: Lucene and db search
- Posted by: Martin Wildam
- Posted on: October 16 2009 08:43 EDT
- in response to rob bygrave
One approach is for the Lucene query to return primary keys. You can then use those to execute your sql query... where pk in (...)
I was also considering this option and I think this is the best. Thanks
Given you are often "paging" the search results this works well - limiting the number of primary keys to the number you want to display on the first page of results.