The Role of Open Source in Grid Computing: Past, Present and Future

April 30, 2008

It is not long now until the first Open Source Grid and Cluster Conference, to be held in Oakland, California from 13-15 May 2008. This upcoming event got me thinking about the role of open source in grid and cluster computing, in the past, present, and future.

My involvement with open source dates to the early days of Globus, in the late 1990s. At that time, I (and my colleagues Carl Kesselman and Steve Tuecke) resolved that, in order to reduce barriers to grid technology adoption, Globus software should be freely available to anyone. To this end, we chose to release Globus software under a variant of the BSD Unix license. (Later we moved to the more modern Apache 2.0 license.) Those licenses are non-viral and industry friendly: they allow users to do whatever they like with the software—except sue us, as I always hasten to add.

The path less traveled by
This choice was not as obvious ten years ago as it might be now. Indeed, following the early success of Netscape, many research institutions saw Internet software as a road to riches. Thus, it took a while to persuade our employers and funders that Globus software had little value as proprietary software: that as a grid toolkit, its value would grow proportionally with overall adoption, which in turn would be accelerated by the use of an open source license.

To a significant extent, the benefits that we predicted for open source release have indeed accrued: Globus software has been widely adopted, and many groups are doing amazing things with it, building exciting applications, powerful tools, and substantial infrastructures. (I can go on at tremendous length on this topic, but I will restrain myself.)

A community of contributions
As the scope and ambition of grid deployments and applications increased, we became increasingly interested in a second potential benefit of open source, namely community contributions. Thus we created first the Globus Alliance and then the Apache-like dev.globus community governance process. The result has been many contributions, ranging from ports, bug fixes and minor enhancements to major new features and components. The dev.globus community now includes dozens of projects, covering a wide spectrum of topics beyond the original Globus services, such as metascheduling (GridWay), virtual machine management (Workspace Service), monitoring (NetLogger), data access (OGSA-DAI) and service authoring (Introduce).

Concurrently with these grid developments, we have also seen the emergence of other high-quality open source grid software, such as Unicore, and an explosion in the availability of high-quality open source cluster software: for example, Rocks (cluster distribution); Ganglia and Nagios (monitoring); Condor and Sun Grid Engine (cluster and job management); MPICH (message passing); and PVFS (parallel file system). These and other systems have all leveraged the twin benefits of easy adoption and community contributions to achieve robustness, strong functionality, and large user communities. Many of them will be highlighted at the Open Source Grid and Cluster Conference.

The consumer choice
The open source grid and cluster universe is thus both vibrant and diverse. It is quite feasible to construct complete solutions to many grid and cluster computing problems from purely open source components. Thus, users face interesting choices. Should they prefer open source or proprietary solutions?

Users, in my experience, are not typically seeking open source solutions per se. Instead, they want software that offers simple and immediate solutions, flexibility in terms of support and sustainability, and low to moderate short-term and long-term costs.

Open source software has become popular because it can reduce supply-side risk and offer lower costs. These considerations also apply in the grid and cluster space. On the other hand, open source software has a reputation for requiring more integration (and thus expertise) than proprietary solutions. This concern applies less to grid and cluster computing, because there are fewer integrated solutions from any source. But integrated open source solutions to enterprise computing problems are appearing. For example, Univa UD’s UniCluster Express integrates Globus, Rocks, Ganglia, SGE and other technologies into a turnkey cluster scheduling and management stack.

The future for open source
Looking forward, I believe that users have an important role to play in creating and sustaining our grid and cluster technology base. If users are serious about avoiding vendor lock-in and keeping costs low, then they need to be more aggressive in supporting grid and cluster standards (and thus encouraging competition), and/or in adopting and supporting open source solutions (to ensure a vibrant open source software base).

They should also demand more from open source suppliers in terms of end-to-end solutions. There are success stories out there: in addition to UniCluster, I can mention the work of caBIG, the MEDICUS system for sharing medical images, MPIG for distributed application execution, Taverna and Kepler for workflow, and the Virtual Data Toolkit and the LHC Computing Grid stacks for processing high energy physics data (among many others). But we need more of them, so that we can expand the set of user needs addressed by turnkey solutions and thus reduce barriers to entry.

In science, where commercial solutions do not always meet unique requirements, more thought is needed on long-term sustainability of open source software. While there are some bright spots in this regard (for example, the US NSF’s Office of Cyberinfrastructure support of Globus and Condor, and the UK OMII’s support of OGSA-DAI and Taverna), the overall situation is less than ideal. A lot of money is being spent, but too much of that funding goes to projects where code is developed for some specific short-term purpose and then discarded when the project finishes. That’s too bad. A more sensible approach would relate all projects to an overall strategy of building and sustaining a broadly useful grid computing platform. Perhaps it is time to revisit priorities and plan international cooperation aimed at meeting application needs.

Overall, it’s an exciting time for open source grid and cluster computing. Much of this progress will be presented at the Open Source Grid and Cluster Conference. If you want to learn about what software exists, find out what people are doing with that software, and engage in discussions about what comes next, you should attend! If you do, I look forward to seeing you there.