Wednesday, June 25, 2008

Choosing Open Source Software

Of late, Open Source Software (OSS) has become so critical to software companies that it is hard to imagine life without it. Software vendors often cannot ship code from the OSS community in their own products because of licensing restrictions (much as they would like to), but their internal IT environments are replete with it. Consider this - Operating System: Linux, Development Environment: Eclipse, Issue tracking: Bugzilla, Source code control: SVN, and the list goes on. A complete IT environment can be set up in a matter of hours, and we have multiple options at every turn.

Now let us turn our attention to a harder problem. Suppose one is releasing a product or a service. Can one use open source / free software as part of it? Obviously the first aspect to think about is licensing. Few tools out there can be embedded into other software as part of a product without constraints. However, many of these constraints are triggered by redistribution, so they generally do not apply to hosted services, where the original software is not redistributed. So for this discussion let's move past licensing and see how we should choose open source components. The following anecdote is instructive.

Recently we were implementing a messaging system, as part of which we needed a Mail Transfer Agent (in simpler words, a server that can send, receive and deliver emails). We looked at various free options - there are tons of them, some open source and others simply free of licensing costs. We eventually decided to go with qmail, primarily because we already had in-house expertise in setting it up and running it. All was going very smoothly until we ran some high-throughput performance tests. It turned out that the average delivery time per message increased non-linearly as the load increased. In other words, the more load we applied, the disproportionately longer the system took to deliver each message. This was unacceptable.


We did a bit of profiling of the system and could eventually attribute its behaviour to a queer problem. It is called "The silly Qmail Syndrome" (http://qmail.jms1.net/silly-qmail.shtml). The gist of the problem is that when a very large number of messages are sent in a short period of time, qmail keeps itself busy doing only one set of tasks (either sending messages out or classifying incoming messages). This starves the other task and in turn dramatically increases the end-to-end delivery time. The site above gives the solution, which comes from the community as a patch to the qmail code. The creators (Andre O and John S) have been kind enough to share it with the world in an easily consumable form. I shudder to think how our project would have been affected if we had not found this solution in time.
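To make the starvation concrete, here is a minimal sketch in Python. It is a toy model of my own, not qmail's code and not the patch: the function name `simulate`, the two policy names and the "one unit of work per tick" assumption are all made up for illustration. A single worker has to classify each message and then deliver it. The "greedy" policy keeps classifying as long as anything is waiting, while "alternate" takes turns between the two tasks.

```python
from collections import deque


def simulate(burst_size, policy):
    """Return the tick at which each message of a burst gets delivered.

    Toy model (an assumption, not how qmail is implemented): one worker
    performs one unit of work per tick, and every message needs two units,
    one to classify it and one to deliver it.
      "greedy"    -> classify whenever anything is waiting to be classified
      "alternate" -> take turns between the two tasks when both have work
    """
    incoming = deque(range(burst_size))  # messages waiting to be classified
    ready = deque()                      # classified, waiting for delivery
    delivered_at = {}
    tick = 0
    last_was_classify = False

    while incoming or ready:
        tick += 1
        if policy == "greedy":
            do_classify = bool(incoming)
        elif policy == "alternate":
            # Classify only if delivery is idle or it is classification's turn.
            do_classify = bool(incoming) and (not ready or not last_was_classify)
        else:
            raise ValueError(f"unknown policy: {policy}")

        if do_classify:
            ready.append(incoming.popleft())
        else:
            delivered_at[ready.popleft()] = tick
        last_was_classify = do_classify

    return [delivered_at[m] for m in range(burst_size)]


if __name__ == "__main__":
    for policy in ("greedy", "alternate"):
        times = simulate(1000, policy)
        print(policy, "first delivery at tick", times[0],
              "mean delivery tick", sum(times) / len(times))
```

In this toy model, a burst of 1000 messages sees no delivery at all under the greedy policy until tick 1001, whereas the alternating policy delivers its first message at tick 2. The real system is far more involved, but the shape of the problem is the same: one task hogging the loop pushes out the end-to-end delivery time of everything queued behind it.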

In conclusion, I would like to go back to the point I wanted to make. When choosing Free or Open Source software to be part of critical business applications (I wonder if there are any non-critical business applications!), the most important thing to look at is the community around the tool in question. Had there been no active community around qmail, there would have been no documented diagnosis and no ready patch for us to find. The references below have more thoughts on what else to look for. All in all, using the right open source tools can yield real cost savings without compromising quality.

Some References

1 comment:

Vikram said...

Very aptly put, Ajay. Being part of Sun, I am realizing this all the more. Customers are cringing at paying the heavy license fees, but at the same time are very scared about going to open source.