Compass 0.4, Search Engine/Object Mapping engine, Released



  1. Compass is a new search engine that allows programmers to map a Java object model into a Lucene search engine. This means your searches can operate over an object model.

    Compass uses a declarative mapping technology, called Object Search Engine Mapping (OSEM), to map object attributes into searchable metadata. An example of searching, from the web site:
    CompassHits hits = session.find("jack london");
    Author a = (Author)hits.get(0);
    Transaction support is incorporated, and the authors have indicated it has Spring support, although Spring isn't mentioned clearly on the site.
  2. There are many benefits to exposing your domain model through a search engine. The most obvious is that your application's users can create rich queries using common search engine expressions. When did you last expose your domain model to a user with an SQL client?

    The second major benefit is that you write less code than with conventional database queries, especially in modern layered applications with abstractions between each layer. So much coding!

    Finally, with Compass it's almost as easy to search multiple data models on different databases as on a single one. That is never a statement I'd make about conventional relational-database-centric queries.
  3. Alan,

    In your world - is the sky blue?
  4. I agree. This sounds very interesting. It is kind of a pain to have to perform a search of the database one way and of documents another.
  5. Compass and Databases

    I agree. This sounds very interesting. It is kind of a pain to have to perform a search of the database one way and of documents another.

    Compass is built from several modules, most of them based on the Compass::Core module. Compass::Core provides all the mapping and transactional support (among other things), which can then be used by the other modules.

    Compass::Gps, which I have already started to implement, aims to integrate with other "data sources", one of which is the database. The database integration is based on O/R mapping tools (mainly Hibernate). It will integrate with the Hibernate 3 event system to automatically synchronize the database and the search engine, and will also be able to index the whole database (through the domain model). What do you say to having a Google-like search engine that searches your domain model, with no effort other than declaring the mapping?

    Another point is that if the index is synchronized all the time, you can start looking at your complex SQL queries and might decide to turn some of them into search engine queries instead (for performance and simplicity).
  6. Grey actually, but thanks for asking ;-)

    I guess people said the same about O/R mapping when it started. I'm suggesting that exposing your domain model through a search engine has many interesting advantages.
  7. Quick question.

    I am wondering if the following use-case can be easily solved in your framework:

    There are several modules in the system (portlets in a portal): news, calendar, forum, document repository, etc. Each of them has its own independent domain model in Java and its own table structure in the DB. The task is to come up with a "site-wide" search that searches all modules at the same time by keyword. The keyword (or combination of keywords) has to be matched against different fields for each module, according to set priorities. Naturally, for the sake of performance, this site-wide search had better be served from a single Lucene index. Once that is built, a module-specific search can be run against the same index just by filtering on the module field of the index (something like: AND module=news).

    This is possible with raw Lucene. How would it be done in Compass and would Compass provide any benefits, over Lucene, in this scenario?
  8. Most important, Lucene is an amazing search engine, and Compass provides its features on top of Lucene (if Lucene were bad, Compass would have been bad by default; at least now Compass has a chance :-) ).
       The first and major benefit, since you already have a domain model, is that you don't have to do the work of mapping your domain model to the Lucene data model (Document and Field - think HashMap). You define the mapping and that's it.
       The other major benefit is that you are abstracted from the Lucene APIs. If you have tried to update an entry in Lucene, you will have seen that you need to have an identifier, delete the entry, and add the new entry (all done using different index interfaces). Compass will do all that for you, and faster than if you were using the Lucene core APIs.
        Another thing is that if your portal is updated dynamically, you can use Compass to update the search engine dynamically as well, thanks to its transactional support and its update support.
        The last thing is that when working with Lucene directly you have to write a substantial amount of code that Compass abstracts away (note that if you are batch indexing, Compass utilizes Lucene's superb indexing performance).
        There are a lot of other things that Compass gives you (common meta-data for one: a centralized place to define all your meta-data, which sounds crucial in a portal type of application), and much more to come in the (near) future. If you wish to learn more, visit the Compass site, and if you wish to discuss it in more detail, you can use the mailing lists.
        A long answer for a quick question :-)
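
The update pattern mentioned in this reply (have an identifier, delete the old entry, add the new one) can be sketched with a toy in-memory index. This is only an illustration under stated assumptions: `ToyIndex` and its `add`, `delete`, `update` and `field` methods are hypothetical names, not Lucene or Compass API, and every entry is assumed to carry an "id" field.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a search index; hypothetical, not the Lucene or Compass API.
class ToyIndex {
    // Each entry is a flat field -> value map, much like a document of fields.
    private final List<Map<String, String>> entries = new ArrayList<>();

    void add(Map<String, String> entry) {
        entries.add(new LinkedHashMap<>(entry));
    }

    // Remove every entry whose "id" field matches; returns how many were removed.
    int delete(String id) {
        int before = entries.size();
        entries.removeIf(e -> id.equals(e.get("id")));
        return before - entries.size();
    }

    // An "update" is delete-by-identifier followed by add (entry must have an "id").
    void update(Map<String, String> entry) {
        delete(entry.get("id"));
        add(entry);
    }

    int size() {
        return entries.size();
    }

    // Look up one field of the entry with the given id, or null if absent.
    String field(String id, String name) {
        for (Map<String, String> e : entries) {
            if (id.equals(e.get("id"))) {
                return e.get(name);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add(Map.of("id", "1", "title", "Old title"));
        index.update(Map.of("id", "1", "title", "New title"));
        System.out.println(index.size() + " entry, title = " + index.field("1", "title"));
    }
}
```

The point of the sketch is that the caller only sees `update`; the two-step delete and add is hidden behind it, which is the kind of abstraction described above.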
  9. Thanks,

    that's a nice answer about Compass's benefits in general, but I do not see it covering my use-case. From my very vague understanding, Compass does it all by mapping the domain model to the search engine, but in my use-case those two have to be very loosely coupled, as the site-wide search searches across the domain models of the modules/portlets.

    Is that possible in Compass? Any code snippet?
  10. It is possible in Compass, and it is one of the main reasons why you would use it. But, as you said, nothing speaks louder than an example.
       Let us assume that we have two portlets, one that displays emails and one that displays news items. The domain model has an Email entity, one of whose properties is called emailTitle, and a News entity, which has a newsTitle.
       Defining the mapping in Compass is as simple as:
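         <class name="eg.Email" alias="email">
                <property name="emailTitle"><meta-data>title</meta-data><meta-data>emailTitle</meta-data></property>
         </class>
         <class name="eg.News" alias="news">
                <property name="newsTitle"><meta-data>title</meta-data><meta-data>newsTitle</meta-data></property>
         </class>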
        What do you get? If you persist the Email object, its emailTitle property value will be searchable under the meta-data name title, and the same goes for the News object, even though the two have different property names. But Compass does more, since it can map a property to multiple meta-data names, as you see.
        So if I want to search for all the titles that have a certain value, I simply execute a search with the query "title:Test", and if I want to find just the news titles I can query "newsTitle:Test".
        By the way, it also solves your module question, since the alias concept in Compass can serve as the module. So searching only news can be: "+alias:news +title:Test".
       But wait, there is more: what happens if you want to search ALL the meta-data (keys)? Compass does it automatically! You just search for "Test". Nice?
       Compass also acknowledges the fact that multiple domain models will map to the same meta-data. That's why it has the common meta-data concept, to make your life easier.
        I left out the code snippet, which makes your life even simpler; just visit the site and read the core module documentation.
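
To make the four queries in this reply concrete, here is a toy in-memory model of the alias and meta-data idea. It is a sketch under assumptions, not the Compass API: `ToyMetaDataIndex`, `index` and `count` are hypothetical names, and matching is a plain substring check rather than real Lucene term matching.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of alias + meta-data mapping; hypothetical, not the Compass API.
class ToyMetaDataIndex {
    // One indexed object: its alias plus meta-data name -> value.
    private static final class Resource {
        final String alias;
        final Map<String, String> metaData = new HashMap<>();

        Resource(String alias) {
            this.alias = alias;
        }
    }

    private final List<Resource> resources = new ArrayList<>();

    // Index one property value under every meta-data name it is mapped to.
    void index(String alias, String value, String... metaDataNames) {
        Resource r = new Resource(alias);
        for (String name : metaDataNames) {
            r.metaData.put(name, value);
        }
        resources.add(r);
    }

    // Count resources with a matching value; a null alias or null name means "any".
    // Matching is a simple substring test, a simplification of real term search.
    int count(String alias, String name, String term) {
        int n = 0;
        for (Resource r : resources) {
            if (alias != null && !alias.equals(r.alias)) {
                continue;
            }
            boolean hit = false;
            for (Map.Entry<String, String> e : r.metaData.entrySet()) {
                if ((name == null || name.equals(e.getKey())) && e.getValue().contains(term)) {
                    hit = true;
                }
            }
            if (hit) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        ToyMetaDataIndex idx = new ToyMetaDataIndex();
        idx.index("email", "Test mail", "title", "emailTitle");
        idx.index("news", "Test story", "title", "newsTitle");
        System.out.println(idx.count(null, "title", "Test"));     // like title:Test
        System.out.println(idx.count(null, "newsTitle", "Test")); // like newsTitle:Test
        System.out.println(idx.count("news", "title", "Test"));   // like +alias:news +title:Test
        System.out.println(idx.count(null, null, "Test"));        // like Test
    }
}
```

The shared name title is what lets one query span both portlets, while the alias plays the role of the module filter from the original question.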
  11. Shay,

    an example speaks a thousand words :)

    Thanks a lot. This is very interesting. Can you drop some words about the level of stability of the code and roadmap?
  12. The current modules of Compass are considered stable at a beta level. The Compass project itself is considered alpha simply because it is the first release and there are more modules to come.

        The APIs that you will be using as a user are not expected to change, and neither are the core mappings. More features will follow, but I expect them to enrich the API rather than change it.

        The roadmap is very aggressive. The next release (0.5) is expected to have the GPS modules with at least the Hibernate device, plus more features in the core module (for example, I am working now on the ability to map several aliases to the same Lucene index instead of each one having its own index).

        As always, for any defects or help that you might require, you will find that I am here and committed to helping however I can, and that's why the 0.4.X releases are there! Just join the mailing lists and give it a go.
  13. Shay,

    Can an index be partitioned?

    Example: an index for a site-wide search (which you have already discussed) in a multisite environment. Each site should preferably have its own index. Each index will be generated from the same modules (hence the same entities/tables), but each index will only hold data filtered by site_id (portal_id... whatever).

    The assumption is that search should have site-scope. Searching across the sites is not needed.
  14. I am not sure that I understand, but if you mean that you want to share your mapping definitions and have different sites (independent of each other), each with its own index, then you can do that.

       Just define several compass.cfg.xml files, one per site (or better yet, create the configuration programmatically, so that only the connection settings need to change for each site, and not all the other settings and mappings), and use a different Compass instance for each site. You can store them in a Map of some sort, with the siteId as the key, and whenever you want to work with a specific site, just fetch the Compass instance by that site id and use it to create sessions and work with it.
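
The "Map with the siteId as the key" idea can be sketched as a small registry. This is a minimal sketch under assumptions: `SiteRegistry` and `forSite` are hypothetical names, the type parameter `T` stands in for a Compass instance, and the factory function is a placeholder for wherever the per-site configuration would actually be built.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Generic per-site cache. T stands in for a Compass instance; the factory is a
// hypothetical hook where the per-site configuration would be built.
class SiteRegistry<T> {
    private final Map<String, T> instances = new ConcurrentHashMap<>();
    private final Function<String, T> factory;

    SiteRegistry(Function<String, T> factory) {
        this.factory = factory;
    }

    // Return the instance for this site, creating it on first use.
    T forSite(String siteId) {
        return instances.computeIfAbsent(siteId, factory);
    }

    int size() {
        return instances.size();
    }

    public static void main(String[] args) {
        // A String plays the role of the per-site instance in this demo.
        SiteRegistry<String> registry = new SiteRegistry<>(id -> "index-for-" + id);
        System.out.println(registry.forSite("site-a"));
        System.out.println(registry.forSite("site-a") == registry.forSite("site-a"));
    }
}
```

Creating instances lazily keeps the shared mappings in one place (the factory) while each site still gets its own independent instance and index.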

    p.s. Soon the news post will disappear from the main site. I will try to monitor it more, but it is better if you send your questions to the user mailing lists.