Apache Solr 1.1 is the first Solr release since the project joined the Apache Incubator. Solr is a high-performance full-text search server based on Lucene, written in Java 5, and easily extensible through plugins written in Java. Documents are added to a search collection via XML over HTTP, and the search collection is queried via HTTP to receive an XML response (or alternatively JSON, Python or Ruby text formats). Major features include:
- A powerful data schema
- External configuration via XML
- Faceted search
- Hit highlighting
- Flexible caching
- A web admin interface

- Posted by: Yonik Seeley
- Posted on: December 23 2006 07:32 EST
- Re: Solr 1.1, Lucene based search server, released by Robert McIntosh on December 24 2006 03:31 EST
- Xml Documents Only? by paul browne on December 26 2006 19:00 EST
- Solr and clustering by Tony Starks on December 27 2006 05:01 EST
- Re: Solr 1.1, Lucene based search server, released by Konstantin Ignatyev on December 27 2006 12:01 EST
- Re: Solr 1.1, Lucene based search server, released by Yonik Seeley on December 27 2006 20:45 EST
- Structured Data? by Steve Dyer on January 02 2007 17:28 EST
- Highlighting search key words in results obtained from Lucene by Anil Kumar on February 28 2008 06:41 EST
I recently used Solr on a project and it has worked wonderfully. The added features that were implemented on top of Lucene really helped us out.
Does Solr handle XML documents only? If so, it is a very useful (but incomplete) next step on top of Apache Lucene. Paul http://red-piranha.sourceforge.net
Does Solr handle XML Documents only?
No, XML is only used as the transport for indexing Lucene documents (Solr doesn't have "crawlers"). This assumes you already know how to split up your data into fields and values. Example: Lucene In Action Erik Hatcher Otis Gospodnetic 1932394281 [...]
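To make the "fields and values" point concrete, here is a minimal sketch of what such an XML add message might look like when built from Python. The `<add>`/`<doc>`/`<field name="...">` layout is Solr's XML update format; the field names (`title`, `author`, `isbn`) and the update URL are assumptions for illustration, since the markup of the example above did not survive. Only the values come from the post.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_message(fields):
    """Build a Solr <add> message from (field name, value) pairs."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields:
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


def build_update_request(xml_body, url="http://localhost:8983/solr/update"):
    """Prepare (but do not send) the HTTP POST that carries the message.

    The URL is an assumed local default, not something stated in the thread.
    """
    return urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Hypothetical field names; a repeated name stands for a multi-valued field.
book = [
    ("title", "Lucene In Action"),
    ("author", "Erik Hatcher"),
    ("author", "Otis Gospodnetic"),
    ("isbn", "1932394281"),
]
message = build_add_message(book)
request = build_update_request(message)
```

Passing `request` to `urllib.request.urlopen` would send it to a running server. The field names would have to match whatever the Solr schema actually defines.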
Can someone highlight the strengths of Solr in terms of clustering and replication? The Solr homepage mentions that it can do this and that, but doesn't describe HOW it achieves it. Project-based usages could help to decide whether this is a solution worth considering.
Can someone highlight the strengths of Solr in terms of clustering and replication?
Did you read the Wiki? http://wiki.apache.org/solr/ These items focus on what you want:
- CollectionDistribution
  - SolrCollectionDistributionScripts
  - SolrCollectionDistributionStatusStats
  - SolrCollectionDistributionOperationsOutline
- CollectionRebuilding
I hope it helps. Luis Neves
The Solr homepage mentions that it can do this and that, but doesn't describe HOW it achieves it.
Project-based usages could help to decide whether this is a solution worth considering.
Master/Worker pattern, similar to Terracotta DSO/Clustering. Basically, a centralized server handles all the updates, and "searcher" nodes each maintain a local copy of the index; the master server updates the worker nodes' indexes at intervals...
Could you provide comparison with Nutch please?
Could you provide comparison with Nutch please?
Nutch is more like an open-source Google... it's for crawling, converting, indexing, and searching websites. Solr is more of a general-purpose search server, and it assumes you already have structured data (like catalog data, music collections, etc).
If the data is structured, why not use a real database? Steve
Current free databases don't do full-text search well, and it's painful to try and do things like faceted search.
Hi everyone, can you please let me know what configuration is needed to highlight search keywords in results obtained from Lucene? I read somewhere that I have to pass hl=true when querying for a document in Lucene, and that the attributes to be highlighted will be enclosed in a tag in the response document received from Lucene. But I am not getting any attributes enclosed in a tag. Please let me know the steps for highlighting the search keys in the results that I get from Lucene. Thanks, Anil
You need to explicitly specify which fields you want to highlight: hl.fl=text1,text2
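As a rough sketch of how the two parameters fit together in one request, the snippet below assembles a query URL with both `hl=true` and `hl.fl`. The base URL, the query term `lucene`, and the helper name are assumptions for illustration. The field names `text1` and `text2` are the placeholders from the reply above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_highlight_query(query, fields,
                          base_url="http://localhost:8983/solr/select"):
    """Return a query URL that asks for highlighting on the given fields.

    base_url is an assumed local default; adjust it to the actual server.
    """
    params = {
        "q": query,
        "hl": "true",               # switch highlighting on
        "hl.fl": ",".join(fields),  # comma-separated fields to highlight
    }
    return base_url + "?" + urlencode(params)


url = build_highlight_query("lucene", ["text1", "text2"])
params = parse_qs(urlsplit(url).query)
```

Note that `urlencode` percent-encodes the comma in `hl.fl`. A server decoding the query string sees the original `text1,text2` value, as the `parse_qs` round trip shows.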