
Apache Solr 1.3.0 Released

Posted by: Grant Ingersoll on September 17, 2008
The Apache Solr team is happy to announce the availability of Solr 1.3.0 for public download. This version contains many enhancements and bug fixes, including:
- Distributed search capabilities
- Numerous Lucene and other performance improvements
- Support for multiple indexes in a single deployment
- SolrJ client and a binary response protocol for faster client-server communication
- Search Components that can be chained together to offer flexible query processing. Components cover existing functionality such as faceting, and add More Like This, Editorial Boosting (Query Elevation), and Spell Checking
- New DataImportHandler for easily indexing database content into Solr

See CHANGES.txt at http://svn.apache.org/repos/asf/lucene/solr/tags/release-1.3.0/CHANGES.txt for more details. The download is available from http://www.apache.org/dyn/closer.cgi/lucene/solr/. See the Solr Wiki for documentation: http://wiki.apache.org/solr/

About Apache Solr:
Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, a web administration interface and many more features. It runs in a Java servlet container such as Tomcat. For more information, refer to the Solr website at http://lucene.apache.org/solr/.

Threaded replies

·  Apache Solr 1.3.0 Released by Grant Ingersoll on Wed Sep 17 08:58:44 EDT 2008
  ·  Re: Apache Solr 1.3.0 Released.. what is it? Oh. by Joseph Ottinger on Wed Sep 17 12:48:12 EDT 2008
    ·  Re: Apache Solr 1.3.0 Released.. what is it? Oh. by Jared Bunting on Wed Sep 17 19:02:28 EDT 2008
      ·  Re: Apache Solr 1.3.0 Released.. what is it? Oh. by random fletch on Wed Sep 17 19:12:06 EDT 2008
  ·  Check out DBSight 1.6.0 by Chris Lu on Wed Sep 17 16:53:23 EDT 2008
    ·  Re: Check out DBSight 1.6.0 by Yonik Seeley on Wed Sep 17 17:33:40 EDT 2008
    ·  Re: Check out DBSight 1.6.0 by uri b on Wed Sep 17 18:36:09 EDT 2008
      ·  Re: Check out DBSight 1.6.0 by Chris Lu on Wed Sep 17 19:10:39 EDT 2008
        ·  Re: Check out DBSight 1.6.0 by Shalin Mangar on Sat Sep 20 05:20:09 EDT 2008
          ·  Re: Check out DBSight 1.6.0 by Chris Lu on Tue Sep 23 03:21:04 EDT 2008
  ·  Re: Apache Solr 1.3.0 Released by Emmanuel Bernard on Wed Sep 17 18:00:48 EDT 2008
    ·  Re: Apache Solr 1.3.0 Released by Yonik Seeley on Wed Sep 17 19:31:54 EDT 2008
  ·  Re: Apache Solr 1.3.0 Released by Kumar Mettu on Wed Sep 17 19:08:42 EDT 2008
  ·  Re: Apache Solr 1.3.0 Released by Sunil n Abinash - on Wed Sep 17 20:37:43 EDT 2008
  ·  Re: Apache Solr 1.3.0 Released by Mark Nuttall on Thu Sep 18 09:28:31 EDT 2008
    ·  Solr usages Scenarios by Grant Ingersoll on Thu Sep 18 11:14:37 EDT 2008
  ·  Lucene: Nutch, Hadoop/HBase, SOLR, Compass(?), DBSight(???) by Fuad Efendi on Sun Nov 02 22:33:01 EST 2008
  Message #268796

Re: Apache Solr 1.3.0 Released.. what is it? Oh.

Posted by: Joseph Ottinger on September 17, 2008 in response to Message #268762
Thank God you told me what I needed to know - like "What is Apache Solr?" - before telling me stuff I needed to know less, like what this release changed.

Oh, wait...

  Message #268803

Check out DBSight 1.6.0

Posted by: Chris Lu on September 17, 2008 in response to Message #268762
Solr has come a long way. Congratulations!

DBSight actually started in 2004, long before Solr. In fact, some features were in DBSight first and then copied into Solr, and some have not been copied yet.

If you have any problem with Solr, try DBSight. It is free to use, and is really Instant Scalable Full-Text Search On Any Database/Application. You can get started pretty quickly.

site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes

A DBSight customer, a shopping comparison site (anonymous per request), got 2.6 million euros in funding! So as long as you know how to pull data into the database, DBSight can earn some money for you.

  Message #268806

Re: Check out DBSight 1.6.0

Posted by: Yonik Seeley on September 17, 2008 in response to Message #268803
Actually, I started Solr in July 2004 (it wasn't open source at the time though... this was within CNET). And I assume you're using the term "copied into" loosely, as I was not even aware of DBSight for some time, and I can assure you that I've not personally copied any features from it.

  Message #268807

Re: Apache Solr 1.3.0 Released

Posted by: Emmanuel Bernard on September 17, 2008 in response to Message #268762
Sweet.
Are you guys using a trunk version of Lucene 2.4?
Do you know when 2.4 will come out officially?

  Message #268809

Re: Check out DBSight 1.6.0

Posted by: uri b on September 17, 2008 in response to Message #268803
Surely you're kidding!!! DBSight was founded by 3 ex-Oracle employees. Your main product targets database indexing and is heavily based on Lucene. Now... you have Solr, which is a generic search engine (not only targeting DBs) that was developed by the Lucene committers themselves. In fact, quite a few Solr features and Lucene enhancements done in Solr found their way into the Lucene code base (so in a sense DBSight benefits from Solr as well).

Yes... Solr came a long way, and it's still going. Extremely active community and development. Of course I would like to see better quality and structured code base, but if you ask me to choose a search solution, I would definitely go for Solr as it's developed by the same IR experts who brought us Lucene, Nutch, Hadoop, Tika, and Mahout.

And no... as someone who's been monitoring the code base of Solr for years now, as well as the user/dev forums, I can assure you that none of Solr's features/code was "copied" from your product. If anything, most of the new concepts and ideas come from the enterprise search market as a whole and the features offered by the big commercial players in it (which, I'm sorry to say, you're not one of).

About making money. You can check out the following link to see a (partial) list of companies making a lot of money with Solr-based products: http://wiki.apache.org/solr/PublicServers

Sorry for the somewhat "harsh" response... I just don't like to see people/companies take credit for other people's hard work.

Congratulations to all people involved in Solr for finally making this 1.3 release!!! I do hope though, that from now on, there will be more steady and shorter release cycles.

  Message #268812

Re: Apache Solr 1.3.0 Released.. what is it? Oh.

Posted by: Jared Bunting on September 17, 2008 in response to Message #268796
Hey...it's better than when we get a post that doesn't even mention what the product is, only talks about what has changed.

  Message #268813

Re: Apache Solr 1.3.0 Released

Posted by: Kumar Mettu on September 17, 2008 in response to Message #268762
Just 15 months ago, when the Solr 1.2 release was announced on TheServerSide, I was asking here for references from anyone using Solr in production:
http://www.theserverside.com/news/thread.tss?thread_id=45719#234206

Today I am more than happy that we chose Solr over any commercial product available in the market.

  Message #268814

Re: Check out DBSight 1.6.0

Posted by: Chris Lu on September 17, 2008 in response to Message #268809
Well, you are right and I was too quick. DBSight works on databases only. But it's funny to see that, after so many years, the "new" feature, the data importer, was learned from DBSight. You can check the JIRA entry.

  Message #268815

Re: Apache Solr 1.3.0 Released.. what is it? Oh.

Posted by: random fletch on September 17, 2008 in response to Message #268812
Shhhhhhh... it's a secret. There have been big changes, though, oh yes, BIG changes.

  Message #268816

Re: Apache Solr 1.3.0 Released

Posted by: Yonik Seeley on September 17, 2008 in response to Message #268807
Yes, we occasionally make Solr releases with trunk versions of Lucene that we feel comfortable enough with.
Barring any serious bugs, I'd estimate that we'll have Lucene 2.4 out by very early October.

  Message #268819

Re: Apache Solr 1.3.0 Released

Posted by: Sunil n Abinash - on September 17, 2008 in response to Message #268762
A very happy user of Solr and Hadoop. The ability to deal with even structured data is amazing.

Thanks
Sunil
http://sunilabinash.vox.com

  Message #268830

Re: Apache Solr 1.3.0 Released

Posted by: Mark Nuttall on September 18, 2008 in response to Message #268762
About Apache Solr:
Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, a web administration interface and many more features. It runs in a Java servlet container such as Tomcat. For more information, refer to the Solr website at http://lucene.apache.org/solr/.

Is there something somewhere that explains what it does and the scenarios in which to use it (besides the blurb and features)? It sounds very interesting, and I have used Lucene (Java and .NET) on a few projects.

  Message #268836

Solr usages Scenarios

Posted by: Grant Ingersoll on September 18, 2008 in response to Message #268830
Granted, I'm a bit biased, but I think you can use Solr pretty much anywhere you would Lucene, or for that matter, any other search vendor. I think it particularly shines in the application that has text, plus metadata (price, author, manufacturer, etc.) and you want to offer search and faceting (i.e. like what you see in the left hand side of Amazon.com when you do a search). I've also used it for general search, since it has all of Lucene's goodness in it. If you're used to Lucene, it's easy to work with Solr too. If you're not used to Lucene but want to take advantage of it, Solr is a much easier starting point, since you don't have to build up all the infrastructure to take the Lucene library and make it a search server.
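
As a rough sketch of what such a "search plus faceting" request can look like over Solr's HTTP API: the request parameters q, wt, facet and facet.field are standard Solr parameters, while the host, port, query text and the field names "manufacturer" and "author" are assumptions made up for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.StringJoiner;

final class FacetQueryUrl {
    // Builds a GET URL for Solr's select handler with field faceting switched on.
    static String build(String baseUrl, String query, String... facetFields) {
        StringJoiner params = new StringJoiner("&");
        params.add("q=" + encode(query));
        params.add("wt=json");      // ask for the JSON response format
        params.add("facet=true");   // enable faceting for this request
        for (String field : facetFields) {
            params.add("facet.field=" + encode(field)); // one count list per field
        }
        return baseUrl + "/select?" + params;
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical local Solr instance and hypothetical schema fields.
        System.out.println(build("http://localhost:8983/solr",
                "digital camera", "manufacturer", "author"));
    }
}
```

The response to a request like this carries the matching documents plus a count per distinct value of each facet field, which is the data behind an Amazon-style sidebar.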

One good starting point to answer your question on scenarios is the "Powered By" page on the wiki: http://wiki.apache.org/solr/PublicServers/ From there, you can see how a number of different people use it. Personally, I found that every time I did a Lucene project, I ended up writing something that more or less looked like Solr in terms of how it manages the Lucene indexes, etc. (this was before Solr was open sourced). Now, I just use Solr.

  Message #269412

Re: Check out DBSight 1.6.0

Posted by: Shalin Mangar on September 20, 2008 in response to Message #268814
We started writing DataImportHandler at AOL to simplify an extremely common use-case. A majority of users store content in databases which need to be transferred to Solr for scalable full text search. We thought it would be good to contribute such a feature back into Solr.

We were unaware of your product until I subscribed to the lucene java-user mailing list and saw one of your emails with the promotional footer text. This was well after we had suggested this feature to solr mailing list and opened the DataImportHandler issue in Solr's jira. We had already developed a large part of the functionality before proposing this to the solr mailing list. I'll leave it up to you to search the java-user archives and figure out the dates.

Let us not indulge in accusing one another and focus on adding value to our users. Let the users decide for themselves the merits of each solution.

  Message #269653

Re: Check out DBSight 1.6.0

Posted by: Chris Lu on September 23, 2008 in response to Message #269412
Thanks for clearing up my confusion and misunderstanding. We do not really follow the SOLR development process; we only saw some referrer visits from links like this:

http://marc.info/?l=solr-dev&m=117789117914453&w=2

I totally understand the same approach can happen independently.

And I know in order to survive, software companies always need to innovate, to bring easy-to-use software to the users.

  Message #272351

Lucene: Nutch, Hadoop/HBase, SOLR, Compass(?), DBSight(???)

Posted by: Fuad Efendi on November 02, 2008 in response to Message #268762
Congratulations, and thank you for sharing this very interesting Lucene implementation! Don't forget: it started as a shopping engine for CNET.

I didn't try DBSight but I noticed some noisy posts on Lucene-related message boards. I heard about DBSight from a colleague who suggested it "to have full text search for a database", and who believed it is a quick and easy solution.

I tried to evaluate DBSight and first of all browsed the available configuration settings directly in the WEB-INF folder and subfolders. Looks weak... I tried Compass before SOLR.

As a "search add-on" for an existing database, SOLR offers the most freedom. You don't even need to have a database for it: indeed, Lucene internals implement "data normalization" automatically for you. Behind the scenes, Apache Hadoop/HBase uses several layers of data compression of different kinds (different algorithms), which is also "data normalization", but not the way a DBA understands it...

Never ever try to automate full-text searches with databases!!!

For instance, Compass (Hibernate + Lucene) promises "transactional support" but... in some cases a "commit" may take a few minutes in Lucene (merging a few files), and what about "optimize"?

Recently I got a call from a well-known technology company. They have a client who needs SOLR to implement database full-text search for about 8-10 billion documents, and SOLR was chosen as the "simplest" solution. Are you kidding? Even pure Lucene can't handle that in a single index, and even SOLR shards will need 64 additional GET request parameters for such a distributed search!!!
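
For context, a minimal sketch of how a sharded request is addressed in Solr 1.3's distributed search: the shard addresses travel in the "shards" request parameter as a comma-separated list of host:port/path entries, so the request grows by one entry per shard. The host names, the "isbn" field and the count of 64 below are hypothetical, chosen only to match the figure mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

final class ShardsParam {
    // Joins shard addresses (host:port/path) into a "shards" request parameter.
    static String shardsParam(List<String> shardAddresses) {
        return "shards=" + String.join(",", shardAddresses);
    }

    public static void main(String[] args) {
        List<String> shards = new ArrayList<>();
        for (int i = 1; i <= 64; i++) {
            // Hypothetical host naming scheme, not taken from any real deployment.
            shards.add(String.format("search%02d.example.com:8983/solr", i));
        }
        String url = "http://search01.example.com:8983/solr/select?q=isbn:0123456789&"
                + shardsParam(shards);
        System.out.println(url.length() + " characters: " + url);
    }
}
```

Whichever node receives the request fans the query out to every listed shard and merges the results, so each additional shard adds both URL length and one more network round trip per query.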

Lucene uses FieldCache internally for performance optimizations, and it is the primary cause of the hundreds, if not thousands, of posts about OutOfMemoryError on the SOLR-user and Lucene-user mailing lists (including posts from DBSight technical staff). What is it? It is an array storing the "Field" value of each non-tokenized, non-boolean field for all documents stored in an index. For 10 billion documents with the simplest field, such as a Social Insurance Number or ISBN, a single Lucene index will need an array with an average size of 1 terabyte. SOLR can't handle such a distribution (unless you have hardware with a few terabytes of RAM).
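
The terabyte figure can be checked with back-of-the-envelope arithmetic. This sketch assumes one cached value per document and roughly 100 bytes per cached value (string characters plus Java object overhead); the 100-byte figure is an assumption picked for illustration, not a measurement.

```java
final class FieldCacheEstimate {
    // Rough heap estimate: one cached entry per document in the index.
    static long bytesNeeded(long documents, long bytesPerEntry) {
        // multiplyExact throws instead of silently overflowing on absurd inputs.
        return Math.multiplyExact(documents, bytesPerEntry);
    }

    public static void main(String[] args) {
        long documents = 10_000_000_000L; // 10 billion documents, as in the post
        long bytesPerEntry = 100L;        // assumed average size of one cached value
        long bytes = bytesNeeded(documents, bytesPerEntry);
        System.out.println(bytes / 1_000_000_000_000.0 + " TB for a single field");
    }
}
```

Under these assumptions a single cached field comes to 10^10 x 100 = 10^12 bytes, i.e. 1 TB (decimal), and every additional cached field adds the same again.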

A lot of work is going on in Lucene: for instance, to remove synchronization on the isDeleted() method, which is called for each query. It would be nice to have non-synchronized versions for read-only indexes.

SOLR is not as huge as the Lucene, LingPipe, or GATE projects, but it is an extremely effective tool. It is very easy to configure an XML schema instead of working directly with the Lucene API. The main selling point of SOLR (since the CNET-based project started and was contributed to Apache) is so-called "faceted search", which is simply calculating the intersection sizes of different DocSets (just look at the search results of modern price comparison sites: they show subset counts for different categories). However, that too was... an architectural mistake. Look at https://issues.apache.org/jira/browse/SOLR-711: counting frequencies of Terms for a given DocSet is faster than counting intersections.
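
The two counting strategies can be contrasted with a toy sketch. This is not Solr's code: it uses java.util.BitSet as a stand-in for a DocSet, assumes a single-valued field with a value for every document, and all class and method names are made up for illustration.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

final class FacetCounting {
    // Builds one doc set per distinct value from a doc -> value array.
    static Map<String, BitSet> invert(String[] valueOfDoc) {
        Map<String, BitSet> docsByValue = new LinkedHashMap<>();
        for (int doc = 0; doc < valueOfDoc.length; doc++) {
            docsByValue.computeIfAbsent(valueOfDoc[doc], v -> new BitSet()).set(doc);
        }
        return docsByValue;
    }

    // Strategy 1: intersect the result set with the doc set of every facet value.
    static Map<String, Integer> byIntersection(BitSet results, Map<String, BitSet> docsByValue) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> entry : docsByValue.entrySet()) {
            BitSet hits = (BitSet) results.clone();
            hits.and(entry.getValue());
            counts.put(entry.getKey(), hits.cardinality());
        }
        return counts;
    }

    // Strategy 2: walk the result set once and tally the value of each document.
    static Map<String, Integer> byTermTally(BitSet results, String[] valueOfDoc) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int doc = results.nextSetBit(0); doc >= 0; doc = results.nextSetBit(doc + 1)) {
            counts.merge(valueOfDoc[doc], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] valueOfDoc = {"canon", "nikon", "canon", "sony", "nikon", "canon"};
        BitSet results = new BitSet();
        results.set(0); results.set(1); results.set(2); results.set(5);
        System.out.println(byIntersection(results, invert(valueOfDoc))); // {canon=3, nikon=1, sony=0}
        System.out.println(byTermTally(results, valueOfDoc));            // {canon=3, nikon=1}
    }
}
```

The first strategy does work proportional to the number of distinct values, each costing a set intersection. The second does work proportional to the number of matching documents. That is one way to read the claim above for fields with many distinct values, although the tally relies on a per-document value array of the FieldCache kind.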

Lucene + Database: transactional???...

I started with Compass, then moved to Nutch, then - SOLR!!! Now I am using HBase just because the power of MySQL + InnoDB is not enough for a highly concurrent application. No need to index a database: instead, I am indexing data :)

Thanks,


http://www.tokenizer.org/bot.html
(Robot-based Shopping Engine)


All Content Copyright ©2007 TheServerSide