Archive for November, 2006

Compass 1.1 M3 Released

Tuesday, November 28th, 2006

The Compass team is pleased to announce the release of version 1.1 M3. This is the third milestone release of version 1.1. Major features include:

  • Support for polymorphic relationships: Compass now supports polymorphic relationships when using OSEM. Polymorphic relationships are automatically identified by Compass, and all the relevant mappings are detected and added. In the case of one-to-one mappings, or one-to-many mappings using generics, the mappings are detected based on the class name. One-to-many relationships without generics can use a single reference alias mapping, with the rest of the derived mappings used automatically. There is still an option to explicitly list all the aliases using a comma-delimited list in the ref-alias.
  • Better cyclic mappings support: Compass now handles cyclic relationships much better, especially for component mappings. Note that since cyclic relationships are most often simply the same objects referencing each other, the default max-depth value has changed to 1 (from 5). For tree-based cyclic relationships, max-depth should probably be set to a higher value.
  • FS Transactional Log: Read committed transactions can now store most of the transaction log on the file system. This allows bigger transactions to be executed under read committed, at the expense of performance. The existing memory-based transaction log is still supported (and is the default).
  • Runtime Settings: Certain settings can now be set at the session level, changing part of Compass behavior on a per-session basis. The first runtime setting supported is the transaction log used for the current session.
  • JdbcDirectory support for Oracle 9i: JdbcDirectory (storing the index in the database) now supports Oracle 9i.
  • Initial XA Support: Compass can now join an XA transaction by enlisting itself as an XA resource within a JTA transaction manager, including participation in two-phase commit. Resuming crashed transactions is not supported.
  • Performance Improvement: Better concurrency in the Compass index cache.
  • Several bug fixes and minor features. Full release notes are available here.
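To make the max-depth note in the cyclic mappings item more concrete, here is a minimal, self-contained sketch in plain Java. It does not use the Compass API: the Node class, the marshal method and the exact counting rule are assumptions made purely for illustration. It only shows why a depth of 1 is enough for two objects that reference each other, while a deeper, tree-like chain gets cut short unless the depth is raised.

```java
import java.util.ArrayList;
import java.util.List;

class MaxDepthSketch {

    /** Hypothetical domain class whose 'next' reference may point back into a cycle. */
    static class Node {
        final String name;
        Node next;

        Node(String name) {
            this.name = name;
        }
    }

    /**
     * Marshals the root, then follows the cyclic 'next' reference at most
     * maxDepth times. The counting rule is an assumption made for this sketch.
     */
    static List<String> marshal(Node root, int maxDepth) {
        List<String> marshalled = new ArrayList<>();
        Node current = root;
        int depth = 0;
        while (current != null) {
            marshalled.add(current.name);
            if (depth == maxDepth) {
                break; // stop instead of looping around the cycle forever
            }
            current = current.next;
            depth++;
        }
        return marshalled;
    }

    public static void main(String[] args) {
        // Two objects referencing each other: depth 1 already captures both.
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a;
        System.out.println(marshal(a, 1)); // prints [a, b]

        // A deeper, tree-like chain: depth 1 truncates it, depth 3 keeps all of it.
        Node root = new Node("root");
        Node c1 = new Node("c1");
        Node c2 = new Node("c2");
        Node c3 = new Node("c3");
        root.next = c1;
        c1.next = c2;
        c2.next = c3;
        System.out.println(marshal(root, 1)); // prints [root, c1]
        System.out.println(marshal(root, 3)); // prints [root, c1, c2, c3]
    }
}
```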

Transactional Log Support In Compass

Monday, November 27th, 2006

Up until now, Compass stored the transaction data for read committed transactions in memory. This allowed for extremely fast transactions, but the transaction size was bounded by the JVM memory size. Compass now supports file-based transactional logs as well, allowing bigger transactions to be executed in read committed mode.

This should not be confused with the batch insert transaction isolation. Batch insert allows transactions to be as big or as long as required, but it only supports the create operation (which can cause duplicates) and does not support rollbacks. With the new transaction log implementation, longer read committed transactions can be supported, though they are still bounded by the JVM memory (to a lesser degree than with the memory-based transaction log). More information on the file system transactional log support can be found in the reference documentation under the Search Engine chapter.
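The trade-off between the two kinds of transaction log can be sketched in plain Java. This is not how Compass implements it: the TransLog interface and both classes below are hypothetical names invented for this illustration. The memory variant keeps every buffered operation on the heap, while the file variant appends operations to a temporary file and streams them back on commit, so far less has to be held in memory at the cost of extra I/O.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class TransLogSketch {

    /** Hypothetical abstraction over where uncommitted operations are buffered. */
    interface TransLog extends Closeable {
        void append(String operation) throws IOException;

        /** Streams the buffered operations, in order, to the given sink. */
        void replay(Consumer<String> sink) throws IOException;
    }

    /** Fast, but every buffered operation lives on the JVM heap. */
    static class MemoryTransLog implements TransLog {
        private final List<String> operations = new ArrayList<>();

        public void append(String operation) {
            operations.add(operation);
        }

        public void replay(Consumer<String> sink) {
            operations.forEach(sink);
        }

        public void close() {
            operations.clear();
        }
    }

    /** Slower, but buffered operations live in a temporary file instead of the heap. */
    static class FileTransLog implements TransLog {
        private final File file;
        private final RandomAccessFile raf;

        FileTransLog() throws IOException {
            file = File.createTempFile("translog", ".bin");
            raf = new RandomAccessFile(file, "rw");
        }

        public void append(String operation) throws IOException {
            raf.seek(raf.length());
            raf.writeUTF(operation);
        }

        public void replay(Consumer<String> sink) throws IOException {
            raf.seek(0);
            while (raf.getFilePointer() < raf.length()) {
                sink.accept(raf.readUTF()); // one operation in memory at a time
            }
        }

        public void close() throws IOException {
            raf.close();
            file.delete(); // rollback or end of commit: discard the log
        }
    }

    public static void main(String[] args) throws IOException {
        try (TransLog log = new FileTransLog()) {
            log.append("create doc 1");
            log.append("delete doc 2");
            List<String> applied = new ArrayList<>();
            log.replay(applied::add); // "commit": apply buffered operations in order
            System.out.println(applied);
        }
    }
}
```

Swapping new FileTransLog() for new MemoryTransLog() in main gives the same result, which mirrors the idea that the log type is a pluggable setting rather than a different transaction model.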

An important additional feature is support for session-level settings. The first feature to support this is the transactional log, whose type can now be changed at the session level. Here is an example of how it can be used:

CompassSession session = compass.openSession();
// Must be set before this session begins its Compass transaction
session.getSettings().setClassSetting(
  RuntimeLuceneEnvironment.Transaction.TransLog.TYPE,
  FSTransLog.class);

The above code sets Compass to work in a file-based transaction log mode. Note that this setting only applies to a Compass session that has yet to start a Compass transaction, so if a Compass transaction is already running, it will have no effect. If you are missing a Compass setting that should be settable at the session level, please raise a Jira request for it (and it will be supported if possible). In the next version, more settings will start to be supported at the session level.

Last, a note on the 1.1 M3 version. I don’t know if you noticed, but I sadly broke my one-month cycle of releasing Compass versions. This happened mainly due to polymorphic relationship support in Compass, which took much longer than I anticipated. 1.1 M3 is going to be released within this week, and it would be great if people downloaded it and played with it to give me feedback regarding backward compatibility with applications that already use Compass. M3 is going to be a huge step for Compass, especially in terms of OSEM completeness.

Search support: Hibernate/Lucene and the Compass at Javalobby

Monday, November 20th, 2006

Rick Ross started a discussion regarding my previous blog post. It would be really nice to hear your thoughts about it there.

Hibernate Search/Lucene

Sunday, November 19th, 2006

It seems like Hibernate has upgraded its support for Lucene integration. The first go at supporting full text search was pretty basic, and it looks like the upcoming version will have better support for it (some more information can be found here). Naturally, with Hibernate releasing such a library, the obvious question is where Compass fits into the picture. Let me start by answering some of the arguments raised by Emmanuel Bernard on the Lucene mailing list:

[Emmanuel Bernard] 1. not Yet Another API to deal with your domain model. If you already use an ORM (JPA or Hibernate), you are familiar with those APIs. Using compass implies that you have to use a different set of API to play with the object lifecycle (CRUD). Hibernate Search is integrated with the org.hibernate.Query interface, and all the CUD operations on the index are triggered from the Hibernate CUD operations.

Compass has integrated with the Hibernate lifecycle event mechanism since version 0.5. It actually supports Hibernate from version 3.0.x up to 3.2.x. This support is provided through the Compass GPS infrastructure, and CUD operations are mirrored to the search engine automatically. As for read / search operations, personally, I really don’t see the difference between using the Compass API for searching (with the many benefits it gives) and the Hibernate API (which is an extension on top of the original Hibernate API).

[Emmanuel Bernard] 2. Metadata are minimal and fit particularly well through annotations, so you don’t have yet another XML representation of the same domain model (Compass might now have annotations support, you’ll have to check)

Compass has supported annotations since version 0.9, and they are as minimal as possible, even simpler than the Hibernate ones. With Compass, you can also use XML mapping definitions instead of, or in combination with, the annotation support. Many developers prefer XML mappings over annotations, and with Compass they have this option.

[Emmanuel Bernard] 3. it’s all about managed objects (ie managed by the Session or the EntityManager)
Hibernate Search gives you back objects managed by the Session, so any change made to them will (by default) be synchronized with the database, this is the normal behavior of an ORM, but is not what you have from a Compass search. This approach fits well with the JBoss Seam approach of having all the application around the domain model and EJB 3.0

Compass allows you to map your domain model to the search engine. You can easily get the objects back from the hit results, and load the respective object from the Hibernate Session or Entity Manager (this can easily be abstracted away and used throughout the application with a few lines of code). This is actually one of the main benefits of Compass over the current Hibernate implementation: you don’t hit the database when searching and displaying search results. Moreover, you do not have to use objects in order to display the results; you can use Resources instead (which are the Compass abstraction on top of the Lucene Document).

[Emmanuel Bernard] 4. Not too much abstraction. From what I’ve heard, Compass borrow a lot of its design / classnames from Hibernate/Spring/Lucene. Compass tries to abstract those 3 technologies (at least Hibernate and Lucene), by providing its own infrastructure. What am trying to do with Hibernate Search is to keep the abstraction as light as possible. For advanced Lucene query you’ll have to use pure Lucene APIs, which is possible / natural with Hibernate Search

The main Compass API does use the same programming model as other ORM tools (Hibernate did not invent SessionFactory and Session, after all). The main drive behind using a similar API is that it makes Compass simple for users to adopt (not necessarily within an environment that uses an ORM tool) and that the model is applicable to Compass. Naturally, Compass does borrow a lot of Lucene semantics, but adds some enhancements on top of them (for example, Resource, which maps to a Lucene Document, is also identifiable and is associated with an alias/mapping definition). As for abstracting Spring, Compass does not really abstract Spring, but integrates with it in order to simplify the development process when working within a Spring environment (similar to what Spring provides for Hibernate). Last, Compass allows the user to work directly with Lucene classes where needed (for example, getting IndexReaders and Searchers).

[Emmanuel Bernard] I do not think that all your object properties belongs to the Index, and some of them will be put in the index with information degradation (ie store year/month rather than the whole date). So I do not believe there is a bidirectional relationship between your domain model and your index documents (for size, efficiency and accuracy purpose). For that matter, Compass cannot really truly index your database backed domain model and give back the object to you. Hibernate Search can because it delegate the object hydration to Hibernate Core.

First, Compass can truly index your full domain model, but I agree that many times this is not required. Compass allows full control over what gets saved into the index and what does not, and in which format. Compass allows the user to work only with Resources when displaying search results, and also allows working in a pure mode where unmarshalling is not supported. But many times the user would still like to work with the domain model, even though it is a degraded view of it, and Compass allows that by creating as much of the domain model as possible according to the mapping definitions.

So, what are the Compass features compared to Hibernate Lucene/Search? Actually, this is a difficult question to answer, since the Hibernate library has something like 5% of all the Compass features. First and foremost, you can use Compass within a Hibernate managed environment, but you can also use Compass where Hibernate is not used, with different ORM tools (JPA, OJB, JDO), and standalone. If we focus on a scenario where Hibernate is used, here is a short and by no means complete list of features:

  • Transactions and atomicity of transactions (get ready for index corruption with Hibernate)
  • Much better performance
  • Automatic indexing of the domain model based on ORM mappings and Compass mappings
  • ‘all’ property
  • Component mapping (allowing a related class to be indexed into the same Resource/Document)
  • Sub index hashing
  • Query builder and filter builder API
  • Contract mappings
  • Lucene Analyzer granularity down to the Property/Field level
  • Built-in highlighter support
  • Built-in support for Lucene extended analyzers and custom analyzers
  • Declarative configuration of all Compass features and many Lucene ones
  • Extensive support for converters, including dynamic languages
  • XML and Resource level mappings (in combination with OSEM)
  • Lucene caching support

And there are many more (I probably need another blog post for this).

In the end, I think this is good news. It has been realized that full text search makes a lot of sense in many applications, and we can see Hibernate responding. The response, I am guessing, came from user demand, which means that users require it. Up until now, Compass has been almost on its own in simplifying full text support within applications, and the publicity Hibernate brings is only going to help Compass.

Simulate @CompassContext in JEE

Friday, November 10th, 2006

@CompassContext is a great feature when using Spring. It simplifies using the Compass API within a transactional context (like Spring managed transactions). If your application uses a different managed environment, like JEE, the @CompassContext session injection can still be simulated.

If we take a JEE environment as an example, the Compass instance would usually be registered under JNDI. In order to create a CompassSession that simplifies usage under a managed environment, all that needs to be done is:

compassSession = CompassSessionTransactionalProxy.newProxy(compass);

Here, compassSession would typically be a class-level field (since it does not make sense to proxy CompassSession each time). Note that it is perfectly fine for the CompassSession to be a class-level field; the proxy will take care of creating sessions correctly based on the transaction context.
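For readers wondering how a single class-level proxy can hand out the right session per transaction, here is a rough sketch of the general technique using java.lang.reflect.Proxy. It is not the Compass implementation: the Session interface, the thread-local standing in for "the current transaction", and the endTransaction method are all assumptions made for illustration.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class TransactionalProxySketch {

    /** Hypothetical stand-in for a session interface; not the Compass API. */
    interface Session {
        String id();
    }

    /** Hypothetical stand-in for the transaction context: one session per thread. */
    private static final ThreadLocal<Session> CURRENT = new ThreadLocal<>();

    /** Returns a shareable proxy that resolves the real session on every call. */
    static Session newProxy(Supplier<Session> factory) {
        return (Session) Proxy.newProxyInstance(
            Session.class.getClassLoader(),
            new Class<?>[] { Session.class },
            (proxy, method, args) -> {
                Session target = CURRENT.get();
                if (target == null) {
                    target = factory.get(); // first use in this context: open a session
                    CURRENT.set(target);
                }
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // surface the real exception to the caller
                }
            });
    }

    /** In this sketch, ending the "transaction" simply forgets the bound session. */
    static void endTransaction() {
        CURRENT.remove();
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();
        Session shared = newProxy(() -> {
            String id = "session-" + counter.incrementAndGet();
            return () -> id;
        });

        // Same context: both calls are routed to the same underlying session.
        System.out.println(shared.id().equals(shared.id())); // prints true

        // Another thread (another context) gets its own underlying session.
        Thread other = new Thread(() -> System.out.println(shared.id()));
        other.start();
        other.join();
    }
}
```

The point of the design is that callers hold one long-lived reference, while the decision about which real session to use is deferred to each method call.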

At New York

Monday, November 6th, 2006

I am in New York (Manhattan) for the rest of the week (flying back on Thursday). Any Compass users are more than welcome to drop me an email; maybe we can arrange something on one of the coming evenings.