Lucene Open Source CMS search

The focus on a user-friendly interface and a dynamic content structure diverts most CMSes from the most important job of a website: delivering content to its users.

We are not accusing any Open Source CMS vendors or communities of failing to design with the user in mind, but it is sometimes easier to focus on visual impression and human interaction than on seemingly "less" important functionality such as search. As the amount of content grows, a naive full-text search becomes considerably slower.

Developing good search engines is a profession of its own. To sidestep that complexity, many sites rely on tags or keyword systems instead. This type of navigation has been very popular among community websites like Wikipedia, YouTube and iStockphoto. Searching by keywords such as tags is very fast, but it cannot replace a full-text search: a tag lookup only finds content that someone has labelled with exactly that tag. Excessive use of tags is also a way of admitting that you have too much content to fit into a logical menu structure.
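To make the difference concrete, here is a minimal sketch in Python. The articles, tags and function names are invented for illustration and do not come from any particular CMS. It shows that a tag lookup is a fast exact-match lookup, but it misses an article whose body mentions a word nobody tagged.

```python
from collections import defaultdict

# Hypothetical articles: editor-assigned tags plus a free-text body.
ARTICLES = [
    {"id": 1, "tags": {"cms", "search"},
     "body": "Lucene adds full text search to a CMS."},
    {"id": 2, "tags": {"design"},
     "body": "A good menu structure helps users find content."},
]


def build_tag_index(articles):
    """Map each tag to the set of article ids that carry it."""
    index = defaultdict(set)
    for article in articles:
        for tag in article["tags"]:
            index[tag].add(article["id"])
    return index


def find_by_tag(index, tag):
    """Exact-match lookup: one dictionary access, independent of body text."""
    return sorted(index.get(tag, set()))


def find_in_body(articles, word):
    """Naive full-text lookup: checks the body of every article."""
    word = word.lower()
    return [a["id"] for a in articles if word in a["body"].lower()]


tag_index = build_tag_index(ARTICLES)
```

With this data, `find_by_tag(tag_index, "menu")` returns nothing because no editor used that tag, while `find_in_body(ARTICLES, "menu")` finds article 2.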

Even today, there are Open Source CMSes that implement search with an SQL query like the following:

SELECT * FROM content WHERE row LIKE '%word%'

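This is a slow and not very intelligent way to perform a full-text search: a pattern with a leading wildcard generally cannot use an ordinary index, so the database has to examine every row, and the results are not ranked by relevance. This is where Lucene can be an attractive alternative.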
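The following Python sketch contrasts the two approaches. It is a simplified illustration of the inverted-index idea that Lucene is built on, not Lucene's actual API. The sample documents and all function names are assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical content rows, keyed by id.
DOCS = {
    1: "Lucene is a full text search library",
    2: "Most CMS search uses a SQL LIKE query",
    3: "Full text search with an index stays fast",
}


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def scan_search(docs, word):
    """Equivalent of LIKE '%word%': substring test against every row."""
    word = word.lower()
    return sorted(doc_id for doc_id, text in docs.items()
                  if word in text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids containing it (built once)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def index_search(index, query):
    """Return ids of documents containing every term of the query."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


INDEX = build_inverted_index(DOCS)
```

The scan touches every document on every query, whereas the index is built once and each query then costs only a few dictionary lookups and set intersections. Note also that the scan matches substrings (searching for "ext" hits "text"), while the index matches whole terms.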

What is Lucene?

Lucene is a free and open source information retrieval library, originally implemented in Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache License. Lucene has been ported to other programming languages, including Perl, C#, C++, Python, Ruby and PHP.

Lucene is suitable for any application that requires full-text indexing and searching, and it has been widely recognized for its usefulness in building both internet search engines and local, single-site search.

Who can take advantage of Lucene?

If your Open Source CMS suffers from poor search performance or inaccurate search results, Lucene can be worth a try. Many Open Source CMSes already have extensions or modules that support Lucene, and some have implemented Lucene as part of their core functionality.

  • Alfresco (core)
  • Midgard (core)
  • Hippo CMS (core)
  • Sitellite (core)
  • OpenCMS (module)
  • eZ publish (module)
  • Drupal (module)

Search functionality is one of the important aspects when you choose an Open Source CMS. You can read more about other important aspects in the article “How to choose an Open Source CMS”.

eZ publish™ copyright © 1999-2009 eZ systems as