Archive for January, 2008

DZone Astrosurfing

Tuesday, January 29th, 2008

I was really happy when DZone came out. I was missing something like Digg but with more focus on technology and development, from a (mostly) Java point of view. So here I was, hoping that content many people find interesting would filter its way into my RSS reader.

Sadly, astrosurfing has found its way into DZone as well. What do I mean by astrosurfing? I mean content created by a vendor that gets pumped up by that same vendor and is then “regarded” as important news, even though the vendor did all the voting on it.

The most obvious example is Nikita Ivanov's blog, whose content is submitted to DZone. Let's take the following post (which I find pretty silly, but that's just me), which you can find here. If you look at the votes for the link you will find the following voters: dsetrakyan, mvandoornik, skh, sbob, dkharlamov, magdenko_alex. Strangely enough, they always vote for GridGain, and sometimes vote only for GridGain. This can be validated by going to other posts made by Nikita and applying the same logic.

Personally, I view this type of voting as spam. Naturally, other people (I wonder who ;) ) will disagree, but for me, the fact that a company finds its content interesting is irrelevant to the rest of the community.

I guess that the main question is how to tackle this. Rick and Matt are doing a wonderful job, and it is not a simple problem to solve. One option, which they started recently, is creating specific zones such as Groovy Zone with expert people to moderate it.

Another simple solution would be to ignore people who seem to vote for the same person all the time, as well as to reduce the voting power of new users. This is a simple mechanism that, though it can obviously be circumvented, should reduce these spam posts.
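
To make the idea concrete, here is a minimal sketch of such a weighting scheme. Everything in it is an assumption made for illustration (the class names, the 30 day "new user" window, the 80% threshold); it is not how DZone actually counts votes.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical vote-weighting sketch; nothing here is DZone's actual logic. */
final class VoteWeighting {

    // All thresholds below are illustrative assumptions.
    static final int NEW_ACCOUNT_DAYS = 30;        // accounts younger than this are "new"
    static final double NEW_ACCOUNT_WEIGHT = 0.25; // reduced voting power for new users
    static final int MIN_HISTORY = 5;              // votes needed before judging a pattern
    static final double MAX_SHARE = 0.8;           // share of votes given to one submitter

    /** A voter's account age plus how many votes they gave each submitter. */
    static final class Voter {
        final int accountAgeDays;
        final Map<String, Integer> votesBySubmitter;

        Voter(int accountAgeDays, Map<String, Integer> votesBySubmitter) {
            this.accountAgeDays = accountAgeDays;
            this.votesBySubmitter = votesBySubmitter;
        }

        int totalVotes() {
            return votesBySubmitter.values().stream().mapToInt(Integer::intValue).sum();
        }
    }

    /** Weight of one voter's vote on a story posted by the given submitter. */
    static double weight(Voter voter, String submitter) {
        int total = voter.totalVotes();
        int forSubmitter = voter.votesBySubmitter.getOrDefault(submitter, 0);
        // Ignore voters whose history is dominated by this one submitter.
        if (total >= MIN_HISTORY && forSubmitter >= MAX_SHARE * total) {
            return 0.0;
        }
        // New accounts still count, but for less.
        return voter.accountAgeDays < NEW_ACCOUNT_DAYS ? NEW_ACCOUNT_WEIGHT : 1.0;
    }

    /** Story score: the sum of the weighted votes. */
    static double score(List<Voter> voters, String submitter) {
        return voters.stream().mapToDouble(v -> weight(v, submitter)).sum();
    }
}
```

With these made-up numbers, a long-time user who gave 9 of their 10 votes to one submitter contributes nothing to that submitter's stories, a three day old account contributes a quarter of a vote, and a regular user contributes a full vote.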

I guess the main reason Digg works well and DZone falls a bit short is that so many people vote on Digg. When you only display content that was voted for a thousand times, a company such as GridGain with its 5 developers won’t be able to make such an impact. In this sense, the responsibility falls on our shoulders to vote (which I personally have not been doing as much as I should have).

As for low-voting sites, it is a bit hard to filter out spam content (without moderation). For example, let's say we suggest voting on people, and that those votes control the actual ranking of a certain story. You still run into the problem of who votes for these people. Other types of solutions run into the same “chicken and egg” problem.

What do you say? Do you have a better solution for this problem? Does this content bother you, or do you not even consider it spam?

Rick, Matt and the rest of the DZone (and other) people: thank you for all your effort, and I hope we will manage to find a solution for this…

Team America - Wonderful Quote

Sunday, January 27th, 2008

Just saw the Team America movie. Such a wonderful, nonsensical, ridiculous, funny movie. This quote from the movie really made my night, while coding Compass and GigaSpaces, what else :)

We’re dicks! We’re reckless, arrogant, stupid dicks. And the Film Actors Guild are pussies. And Kim Jong Il is an asshole. Pussies don’t like dicks, because pussies get fucked by dicks. But dicks also fuck assholes: assholes that just want to shit on everything. Pussies may think they can deal with assholes their way. But the only thing that can fuck an asshole is a dick, with some balls. The problem with dicks is: they fuck too much or fuck when it isn’t appropriate - and it takes a pussy to show them that. But sometimes, pussies can be so full of shit that they become assholes themselves… because pussies are an inch and half away from ass holes. I don’t know much about this crazy, crazy world, but I do know this: If you don’t let us fuck this asshole, we’re going to have our dicks and pussies all covered in shit!

Who do you replace with “We”, the “Film Actors Guild”, and “Kim Jong Il”? I sure as hell have my candidates ;). Sacre bleu!

Compass 2.0 M1 Released

Friday, January 25th, 2008

I am very pleased to announce the release of Compass 2.0 M1. Release notes can be found here, and download is here.

A lot of work has gone into this release and it includes many new features. The main ones are: integration with DataGrids, support for EclipseLink, embedded support for Hibernate, Toplink and EclipseLink (similar to the current OpenJPA one) allowing you to embed Compass within these ORMs, improved query string parsing, and much improved “all” property support (performance and functionality).

I will blog more, mainly about the improved integration with ORM frameworks, in later posts. Enjoy!

Compass 1.2.1 Released

Tuesday, January 22nd, 2008

Compass version 1.2.1, a bug fix release over 1.2, is released. Download here, and release notes are here.

Keyword driven search for Safari

Friday, January 18th, 2008

Just came across this little gem called Keywurl. I use keyword based navigation on Firefox all the time, and it's nice to have it on Safari as well (as I like Safari a bit better…).

Compass/Lucene and DataGrids

Friday, January 18th, 2008

The upcoming Compass release (2.0 M1) now has integration with both GigaSpaces and Coherence. There is the obvious integration of storing the Lucene index on a DataGrid, and there is a second, very interesting one. So read along:

Lucene Directory

Compass has an implementation of Lucene Directory for both GigaSpaces and Coherence. Here is an example of using it with GigaSpaces:

IJSpace space = SpaceFinder.find("jini://*/*/space", "test");
Directory dir = new GigaSpaceDirectory(space);
// use the Directory to open IndexWriter and Searcher

The Lucene index will now be stored on the DataGrid, which lets you use advanced DataGrid features such as partitioning and local cache support. The topology of the DataGrid is abstracted (by GigaSpaces/Coherence).

This pure Lucene implementation means that you can (and should :) ) use this feature with Compass (read on to see how), but it can also be used with Hibernate Search, Solr, or any other Lucene based framework.

Compass Index Store

Compass has built-in support for both the GigaSpaces and the Coherence directories. Here is an example of how to configure Compass to work with a GigaSpaces based Directory:

CompassConfiguration conf = CompassConfigurationFactory.newConfiguration();
conf.setConnection("space://test:jini://*/*/space");
Compass compass = conf.buildCompass();
 
// now we can use it
CompassSession session = compass.openSession();
CompassTransaction tr = session.beginTransaction();
session.save(new Author(1, "Jack London"));
tr.commit();
session.close();

The above code will connect to the Space (DataGrid) and index the content of the Author object into it. Note that in such a case, as in the pure Lucene mode, switching from a file system directory (for example) is just a matter of configuration. Here is how it can be configured with the Compass XML based configuration:

<compass name="default">
  <connection>
      <space indexName="test" url="jini://*/*/mySpace"/>
  </connection>
</compass>

Automatically Index/Mirror the Data Grid

DataGrids let you store POJOs, and Compass lets you index POJOs into a Search Engine. Using DataGrid features such as Write Behind (GigaSpaces Mirror support and Coherence CacheStore) we can integrate the two and automatically mirror changes that happen in the DataGrid to the Search Engine using Compass. This basically allows us to index the content of the DataGrid and to perform “Google like” search queries on it.

Compass comes with a CompassDataSource for GigaSpaces and a CompassCacheStore for Coherence that do exactly this. A quick note in this regard: since GigaSpaces works with pure POJOs, it is as simple as annotating the classes with the @Searchable annotation. Coherence is built around the Map API, and the remove operation of CacheStore only accepts the keys of the Map. This means that in such a case, the key should be (or hold) the ids of the searchable values.
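
To illustrate why the key has to carry the id, here is a small self-contained sketch of a Map style write behind store. It does not use the real Coherence or Compass classes; SearchIndexStub, MirroringStore and MirroredAuthor are hypothetical names made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the search index: documents are addressed by id. */
final class SearchIndexStub {
    private final Map<Long, String> documents = new HashMap<>();

    void save(long id, String text) { documents.put(id, text); }
    void delete(long id) { documents.remove(id); }
    boolean contains(long id) { return documents.containsKey(id); }
}

/** Hypothetical searchable value with an id, similar to the Author used above. */
final class MirroredAuthor {
    final long id;
    final String name;

    MirroredAuthor(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Map style store callbacks: store gets key and value, erase gets only the key. */
final class MirroringStore {
    private final SearchIndexStub index;

    MirroringStore(SearchIndexStub index) {
        this.index = index;
    }

    void store(Long key, MirroredAuthor value) {
        // The cache key must be the value's id, otherwise erase could not find the document.
        if (key.longValue() != value.id) {
            throw new IllegalArgumentException("cache key must match the searchable id");
        }
        index.save(value.id, value.name);
    }

    void erase(Long key) {
        // Only the key is available here, so it has to identify the indexed document.
        index.delete(key);
    }
}
```

If the cache key were something unrelated to the id (a session token, say), erase would have no way to tell the index which document to remove.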

Naturally, this support is independent of where you store the index. So basically, you can end up with a DataGrid holding your business data, mirror the changes made to it (create/update/delete operations) to the search engine using Compass, and have Compass store the content of the Lucene index in another DataGrid :).

This is where Compass's generic support for Object to Search Engine Mappings really shines. Any object can be mapped to the Search Engine, which opens up a window of possibilities beyond the more typical ORM integration.

For more information, head over to the Compass reference documentation for 2.0 M1 (will be released shortly).

Final Notes

As some of you may know, I am a GigaSpaces employee. I, as well as my company, strongly believe that a product should be chosen based on its merits. This is why this feature integrates with the two most popular and production ready DataGrids available today (with ObjectGrid coming soon). In our software world, it is all about choice, and Compass users have it.

One last note regarding Coherence. Currently, the license does not allow including the Coherence jar files within the Compass distribution or SVN. This means that SNAPSHOT builds (as is the state of 2.0 M1 now) do not include the compiled version of the Coherence support. If you want to get it, it is just a matter of downloading the Coherence jars and building Compass (a simple thing to do with the pure Compass distribution). A formal release will include this support built in (as I am the one that builds it). With GigaSpaces the matter was simpler as it has a community edition.

Acquisition Day

Wednesday, January 16th, 2008

Sun just bought MySQL, and Oracle just bought BEA. Interesting times… what do you say?

Tabs in Mac Terminal

Sunday, January 13th, 2008

What I am really missing on the Mac is the ability to switch tabs using Apple+[Num Key]. I really miss it in Safari, and even more in Terminal now that it supports tabs. This is why I was really happy to find this wonderful extension that makes it possible in Terminal.

Compass mentioned at 10 reasons to move to Grails

Sunday, January 13th, 2008

It seems like this blog entry is making some noise around the web. The last reason mentioned there is: “Search operations are based on Lucene (with a plugin)”. The Searchable plugin is really cool, and I am going to use it to start learning Grails and maybe start helping out there.

Dynamic Ranking in Compass

Friday, January 11th, 2008

I spent some time talking to a friend about Compass, and the question came up of how one would go about doing something like Google's ranking of web pages with Compass. I described to him one very simple option for doing that which he was not aware of, and I thought I would share it here for other Compass users as well:

So, let's take for example a simple Account object which has many possible Customers. One possible ranking algorithm when searching for an Account can be the number of Customers it has. So, the question here is how do you realize this requirement with Compass? One way to do it is to use a Compass feature that lets you map a property which will control the boost value of the “class”. Here is a snippet of code that shows how to do it with Compass:

@Searchable
public class Account {
 
     @SearchableBoostProperty
     public Integer getNumberOfCustomers() {
          return customers.size();
     }
 
}

Now, when searching for something that “hits” the Account object, its relevance will also be based on the number of customers it has. Also note that in many cases, especially if the number of customers can vary over a large range, you will need to map the values into a smaller range.
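
One way to do that mapping is a logarithmic scale. The sketch below is plain Java with no Compass dependency; the BoostScaling class, the cap and the maximum boost are assumptions picked for illustration, not values Compass prescribes.

```java
/** Hypothetical helper that squeezes a raw count into a small boost range. */
final class BoostScaling {

    /**
     * Maps a count to a boost between 1.0 and maxBoost on a logarithmic scale.
     * Counts at or above maxCount all get maxBoost; counts of zero or less get 1.0.
     */
    static float boostFor(int count, int maxCount, float maxBoost) {
        if (maxCount <= 0) {
            throw new IllegalArgumentException("maxCount must be positive");
        }
        if (count <= 0) {
            return 1.0f;
        }
        int capped = Math.min(count, maxCount);
        // log1p keeps small counts apart while flattening the difference between large ones.
        double fraction = Math.log1p(capped) / Math.log1p(maxCount);
        return (float) (1.0 + fraction * (maxBoost - 1.0));
    }
}
```

The boost getter in Account could then return something like BoostScaling.boostFor(customers.size(), 1000, 3.0f) instead of the raw size, so that an account with a thousand customers ranks higher than one with ten without completely drowning out the text relevance.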