| Research Agenda for the Semantic Grid | De Roure, Jennings and Shadbolt | December 2001 |
As soon as computers are interconnected and communicating we have
a distributed system, and the issues in designing, building and deploying
distributed computer systems have now been explored over many years. This
section considers e-Science infrastructure from a distributed systems
perspective. First it positions the Grid within the bigger picture of
distributed computing, asking whether it is subsumed by current solutions. Then
we look in more detail at the requirements and currently deployed technologies
in order to identify issues for the next generation of the infrastructure.
Since much of the grid computing development has addressed the data-computation
layer, this section particularly draws upon the work of that community.
Our discussion of the data/computation layer makes certain
assumptions of the underlying network fabric, which are not addressed in detail
here:
Furthermore
we do not study storage devices nor particular processing solutions. With these
caveats, the remainder of this section is organised as follows. Section 3.1
deals with grid computing as a distributed system, section 3.2 with the
requirements at the data-computational layer, section 3.3 with technologies that
are being used at this layer, section 3.4 with grid portals, section 3.5 with
grid deployments, and section 3.6 with the open research issues.
Fundamentally, distribution introduces notions of concurrency and
locality. The traditional view of a computation involves the data, program and
computational state (memory, processor registers) coinciding at one processing
node. There are many techniques for achieving this in a distributed system and
the following are illustrative examples:
·
Remote
Procedure Calls. The code is installed in advance on the ‘server’. The client
sends parameters, a computation is invoked, and a result is returned. The model
is essentially synchronous. This
approach is exemplified by the Distributed Computing Environment (DCE) from the
Open Software Foundation (now Open Group).
·
Services[1].
Again the code is server-side and the computation runs continuously (a UNIX
daemon or Windows service), clients connect and there is a bi-directional interaction
according to a high level communications protocol.
·
Distributed
object systems. Within the object-oriented paradigm, remote method invocation
replaces remote procedure calls, usually with support for object request
brokering.
·
Database
queries. The database is running on the server, a client sends a small program
(i.e. the query) and this is executed on the server, data is exchanged.
·
Client-side
scripting. The client downloads the script from the server and executes it
locally in a constrained environment, such as a Web browser.
·
Message
passing. This type of paradigm encapsulates distributed ‘parallel’ programming
such as MPICH or PVM (note additionally that process invocation may occur
through techniques such as a remote shell). It also includes message queuing
systems.
·
Peer-to-peer.
Those devices that traditionally act as clients (the office workstation, the
home computer) can also offer services, including computational services; i.e.
computation at the ‘edge’ of the network. In principle such systems can be
highly decentralised, promoting scalability and avoiding single points of
failure.
·
Web-based
computing. This involves the use of the Web infrastructure as a fabric for
distributed applications, and closely resembles Unix services and remote procedure
calls.
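The remote procedure call model in the list above can be illustrated with a minimal sketch. It uses XML-RPC from the Python standard library as a stand-in, not DCE, which is the exemplar named above. The function `add`, the helper names and the loopback address are assumptions made purely for the example.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """Code installed in advance on the 'server'."""
    return a + b


def start_server():
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def remote_add(a, b):
    """Send parameters, invoke the computation remotely, return the result."""
    server = start_server()
    host, port = server.server_address
    try:
        with ServerProxy(f"http://{host}:{port}") as proxy:
            # The call blocks until the reply arrives: the model is synchronous.
            return proxy.add(a, b)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(remote_add(2, 3))
```

The client never sees the code of `add`; it only names the procedure and supplies parameters, which is the essential difference from the client-side scripting model in the same list.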
Each of the above models is in widespread use and when taken
together they illustrate the current practice in distributed systems
deployment. It seems likely that grid computing will not be based on just one
of these technologies: they are the legacy with which the infrastructure must
interoperate, and although proponents of any one of the above might argue that
theirs is a universal solution, no dominant solution is apparent in current
practice. Of course they are closely inter-related and any system could be
expressed in multiple ways – different instances of a conceptual model.
So the key question to ask in this context is: will this change as grid
computing evolves? In some respects the same distributed systems issues continue to
apply. For example, although wide area networks now operate at LAN speeds, the
scale of the wide area has grown too. Availability of very high speed
networking might de-emphasise the need for locality, but the global
geographical scale of the Grid re-emphasises it; e.g. local memory access must
still be more efficient as the latency over the wider area is fundamentally
constrained by the speed of light. Data locality continues to be key to
application efficiency and performance. So although local computational resources
may use a variety of architectures (e.g. tightly coupled multiprocessors), over
the wide area the grid architecture is a classical distributed system.
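The speed-of-light constraint mentioned above can be made concrete with a little arithmetic. The sketch below computes a lower bound on round-trip time from distance alone, assuming propagation at the speed of light in vacuum; real networks are slower because of the transmission medium, routing and queuing. The example distances are illustrative assumptions.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # in vacuum; signals in fibre are slower


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time for a given one-way distance."""
    one_way_s = distance_km / SPEED_OF_LIGHT_KM_PER_S
    return 2 * one_way_s * 1000.0


# Illustrative distances only: a campus, a national link, an intercontinental link.
for label, km in [("campus", 1), ("national", 1_000), ("intercontinental", 10_000)]:
    print(f"{label} ({km} km): at least {min_round_trip_ms(km):.3f} ms per round trip")
```

At 10,000 km the bound is roughly 67 ms per round trip, which no increase in bandwidth can remove. This is why data locality still matters at global scale.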
What will certainly change is scale, and with it heterogeneity.
These are both perennial challenges in distributed systems, and both are
critical in the e-Science context. In fact, in many ways the “e-” prefix
denotes precisely these things, as characterised by phrases such as
‘distributed global collaborations’, ‘very large data collections’, and ‘tera-scale
computing resources’. With regard to scaling up the number of nodes,
difficulties are experienced with the computational and intellectual complexity
of the programs, with configuration of the systems, and with overcoming
increased likelihood of failure. Similarly, heterogeneity is inherent – as any
cluster manager knows, their only truly homogeneous cluster is their first
one! Furthermore, increasing scale also
involves crossing an increasing number of organisational boundaries, which
magnifies the need to address authentication and trust issues.
There is a strong sense of automation in distributed
systems: when humans cannot deal with the scale, they delegate to
processes on the distributed system to do so (e.g. through scripting), which
leads to autonomy within the systems. This implies a need for coordination,
which, in turn, needs to be specified programmatically at various levels –
including process descriptions. Similarly the increased likelihood of failure
implies a need for automatic recovery from failure: configuration and repair
cannot remain manual tasks.
The
inherent heterogeneity must also be handled automatically. Systems use varying
standards and system APIs, resulting in the need to port services and
applications to the plethora of computer systems used in a grid environment.
Agreed interchange formats help in general, because only n converters are
needed to enable n components to interoperate via one standard, as
opposed to the order of n² converters for them to interoperate directly with each other.
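The converter counts above can be checked with a short calculation. This sketch assumes one converter per component when a single agreed format is used, and one directed converter per ordered pair of distinct components otherwise. The second count is n(n - 1), which grows as n². The function names are hypothetical.

```python
def converters_via_standard(n: int) -> int:
    """One converter per component, to and from the agreed format."""
    return n


def converters_pairwise(n: int) -> int:
    """One directed converter for every ordered pair of distinct components."""
    return n * (n - 1)


for n in (2, 10, 100):
    print(n, converters_via_standard(n), converters_pairwise(n))
```

For 100 components the difference is 100 converters against 9,900, which is why agreed formats become more valuable as heterogeneity increases.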
To
introduce the matching of e-Science requirements to distributed systems
solutions, the next three subsections describe the three most prevalent
distributed systems solutions with broad claims and significant deployment,
considering whether each might meet the needs of the e-Science infrastructure.
These systems are: distributed object systems, web-based computing and
peer-to-peer solutions. Having positioned grid computing in the general context
of distributed systems, we then turn to the specific requirements,
technologies, and exemplars of the data-computational layer.
The
Common Object Request Broker Architecture (CORBA) is an open distributed
object-computing infrastructure being standardised by the Object Management
Group [OMG]. CORBA automates many common network programming tasks such as
object registration, location, and activation; request demultiplexing; framing
and error-handling; parameter marshalling and demarshalling; and operation
dispatching. Although CORBA provides a rich set of services, it does not
contain the grid level allocation and scheduling services found in Globus (see
section 3.3.1). It is, however, possible to integrate CORBA with the Grid.
The OMG
has been quick to demonstrate the role of CORBA in the grid infrastructure;
e.g. through the ‘Software Services Grid Workshop’ held in 2001. Apart from
providing a well-established set of technologies that can be applied to
e-Science, CORBA is also a candidate for a higher level conceptual model. It is
language-neutral and targeted to provide benefits on the enterprise scale, and
is closely associated with the Unified Modelling Language (UML).
One of
the concerns about CORBA is reflected by the evidence of intranet rather than
internet deployment, indicating difficulty crossing organisational boundaries;
e.g. operation through firewalls.
Furthermore, realtime and multimedia support was not part of the
original design.
While
CORBA provides a higher layer model and standards to deal with heterogeneity,
Java provides a single implementation framework for realising distributed
object systems. To a certain extent the Java Virtual Machine (JVM), together with
Java-based applications and services, is overcoming the problems associated
with heterogeneous systems, providing portable programs and a distributed
object model through remote method invocation (RMI). Where legacy code needs to
be integrated it can be ‘wrapped’ by Java code.
However,
the use of Java in itself has its drawbacks, the main one being computational
speed. This and other problems associated with Java (e.g. numerics and
concurrency) are being addressed by the likes of the Java Grande Forum (a
‘Grande Application’ is ‘any application, scientific or industrial, that
requires a large number of computing resources, such as those found on the
Internet, to solve one or more problems’) [JavaGrande]. Java has also been
chosen for UNICORE (see Section 3.5.3).
Thus what is lost in computational speed might be gained in terms of
software development and maintenance times when taking a broader view of the
engineering of grid applications.
Considerable imagination has been exercised in a variety of systems to overcome the Web’s architectural limitations in order to deliver applications and services. The motivation for this is often pragmatic, seeking ways to piggyback off such a pervasive infrastructure with so much software and infrastructure support. However the web infrastructure is not the ideal infrastructure in every case. There is a danger that e-Science infrastructure could fall into this trap, hence it is important to revisit the architecture.
The major architectural limitation is that information flow is always initiated by the client. If an application is ever to receive information spontaneously from an HTTP server then a role reversal is required, with the application accepting incoming HTTP requests. For this reason, more and more applications now also include HTTP server functionality (cf. peer-to-peer, see below). Furthermore, HTTP is designed around relatively short transactions rather than longer-running computations. Finally we note that typical web transactions involve a far smaller number of machines than grid computations.
When the Web is used as an infrastructure for distributed applications, information is exchanged between programs, rather than being presented for a human reader. This has been facilitated by XML (and the XML family of recommendations from W3C) as the standard intermediate format for information exchanged in process-to-process communication. XML is not intended for human readership – that it is human-readable text is merely a practical convenience for developers.
Given its ubiquitous nature as an information exchange medium, there is further discussion of the web infrastructure in Section 4.
One very plausible approach to address the concerns of scalability
can be described as decentralisation, though this is not a simple
solution. The traditional client-server model can be a performance bottleneck
and a single point of failure, but is still prevalent because decentralisation
brings its own challenges.
Peer-to-Peer (P2P) computing [Clark01a] (as implemented, for
example, by Napster [Napster], Gnutella [Gnutella], Freenet [Clarke01b] and
JXTA [JXTA]) and Internet computing (as implemented, for example, by the
SETI@home [SETI], Parabon [Parabon], and Entropia systems [Entropia]) are
examples of the more general computational structures that are taking advantage
of globally distributed resources. In P2P computing, machines share data and
resources, such as spare computing cycles and storage capacity, via the
Internet or private networks. Machines can also communicate directly and manage
computing tasks without using central servers. This permits P2P computing to
scale more effectively than traditional client-server systems, which must expand
a server’s infrastructure in order to grow, and this ‘clients as servers’
decentralisation is attractive with respect to scalability and fault tolerance
for the reasons discussed above.
However, there are some obstacles to P2P computing:
·
PCs
and workstations used in complex P2P applications will require more computing
power to carry the communications and security overhead that servers would
otherwise handle.
·
Security
is an issue as P2P applications give computers access to other machines’
resources (memory, hard drives, etc). Downloading files from other computers
makes the systems vulnerable to viruses. For example, Gnutella users were
exposed to the VBS_GNUTELWORM virus. Another issue is the ability to
authenticate the identity of the machines with their peers.
·
P2P
systems have to cope with heterogeneous resources, such as computers using a
variety of operating systems and networking components. Technologies such as
Java and XML will help overcome some of these interoperability problems.
·
One
of the biggest challenges with P2P computing is enabling devices to find one
another in a computing paradigm that lacks a central point of control. P2P
computing needs the ability to find resources and services from a potentially
huge number of globally based decentralised peers.
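One known response to the discovery problem in the last item above is query flooding with a hop limit, as used by Gnutella. The following is a minimal sketch of that idea over an in-memory graph. The `peers` structure, the peer names and the resource names are hypothetical stand-ins for real network connections.

```python
from collections import deque


def flood_search(peers, start, wanted, ttl):
    """Breadth-first flood from `start`, forwarding the query at most `ttl` hops.

    `peers` maps a peer name to a (resources, neighbours) pair.
    Returns the peers that hold `wanted`, in the order they were reached.
    """
    found = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        peer, hops = queue.popleft()
        resources, neighbours = peers[peer]
        if wanted in resources:
            found.append(peer)
        if hops == ttl:
            continue  # time-to-live exhausted: do not forward further
        for nxt in neighbours:
            if nxt not in seen:  # suppress duplicate deliveries
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return found


# Hypothetical overlay network with no central index.
peers = {
    "a": (set(), ["b", "c"]),
    "b": ({"dataset-1"}, ["a", "d"]),
    "c": (set(), ["a", "d"]),
    "d": ({"dataset-1"}, ["b", "c", "e"]),
    "e": ({"dataset-2"}, ["d"]),
}
print(flood_search(peers, "a", "dataset-1", ttl=2))
```

The hop limit bounds the traffic that each query generates, at the price of missing resources that lie further away. This trade-off is one reason why discovery remains a challenge at scale.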
A number of peer-to-peer storage systems are being developed
within the research community. In the grid context these raise interesting
issues of security and anonymity:
·
The
Federated, Available, and Reliable Storage for an Incompletely Trusted
Environment (FARSITE) serverless distributed file system [Bolosky00].
·
OceanStore,
a global persistent data store which aims to provide a ‘consistent,
highly-available, and durable storage utility atop an infrastructure comprised
of untrusted servers’ and scale to very large numbers of users. [Kubiatowicz00;
Zhuang01]
·
The
Self-Certifying File System (SFS), a network file system that aims to provide
strong security over untrusted networks without significant performance costs
[Fu00].
·
PAST
is a ‘large-scale peer-to-peer storage utility’ that aims to provide high
availability, scalability, and anonymity. Documents are immutable and the
identities of content owners, readers and storage providers are protected.
[Druschel01]
The emphasis of the early efforts in grid computing was driven by
the need to link a number of US national supercomputing centres. The I-WAY
project, a forerunner of Globus [Foster97a], successfully achieved this goal.
Today the grid infrastructure is capable of binding together more than just a
few specialised supercomputing centres. A number of key enablers have helped
make the Grid more ubiquitous, including the take up of high bandwidth network
technologies and adoption of standards, allowing the Grid to be viewed as a
viable distributed infrastructure on a global scale that can potentially support
diverse applications.
The previous section has shown that there are three main issues
that characterise computational and data grids:
At this layer, the main concern is ensuring that all the globally
distributed resources are accessible under the terms and conditions specified
by their respective service owners. This layer can consist of all manner of
networked resources ranging from computers and mass storage devices to
databases and special scientific instruments. These need to be organised so
that they provide resource-independent and application-independent services.
Examples of these services are an information service that provides uniform
access to information about the structure and state of grid resources or a
security service with mechanisms for authentication and authorization that can
establish a user’s identity and create various user credentials.
Against this background, the following are the main design
features required at this level of the Grid:
·
Administrative
Hierarchy – An administrative hierarchy is the way that each grid environment
divides itself up to cope with a potentially global extent. The administrative
hierarchy, for example, determines how administrative information flows through
the Grid.
·
Communication
Services – The communication needs of applications using a grid environment are
diverse, ranging from reliable point-to-point to unreliable multicast. The
communications infrastructure needs to support protocols that are used for
bulk-data transport, streaming data, group communications, and those used by
distributed objects. The network services used also provide the Grid with
important QoS parameters such as latency, bandwidth, reliability,
fault-tolerance, and jitter control.
·
Information
Services – A grid is a dynamic environment where the location and type of
services available are constantly changing. A major goal is to make all
resources accessible to any process in the system, without regard to the
relative location of the resource user. It is necessary to provide mechanisms
to enable a rich environment in which information about the Grid is reliably
and easily obtained by those services requesting the information. The grid
information (registration and directory) services provide the mechanisms for
registering and obtaining information about the structure, resources, services,
and status and nature of the environment.
·
Naming
Services – In a grid, as in any other distributed system, names are used to
refer to a wide variety of objects such as computers, services, or data. The
naming service provides a uniform name space across the complete distributed
environment. Typical naming services are provided by the international X.500 naming
scheme or DNS (the Internet's scheme).
·
Distributed
File Systems and Caching – Distributed applications, more often than not,
require access to files distributed among many servers. A distributed file
system is therefore a key component in a distributed system. From an
applications point of view it is important that a distributed file system can
provide a uniform global namespace, support a range of file I/O protocols,
require little or no program modification, and provide means that enable
performance optimisations to be implemented (such as the usage of caches).
·
Security
and Authorisation – Any distributed system involves all four aspects of
security: confidentiality, integrity, authentication and accountability.
Security within a grid environment is a complex issue requiring diverse
resources autonomously administered to interact in a manner that does not
impact the usability of the resources and that does not introduce security
holes/lapses in individual systems or the environments as a whole. A security
infrastructure is key to the success or failure of a grid environment.
·
System
Status and Fault Tolerance – To provide a reliable and robust environment it is
important that a means of monitoring resources and applications is provided. To
accomplish this, tools that monitor resources and applications need to be
deployed.
·
Resource
Management and Scheduling – The management of processor time, memory, network,
storage, and other components in a grid is clearly important. The overall aim
is to efficiently and effectively schedule the applications that need to
utilize the available resources in the distributed environment. From a user's
point of view, resource management and scheduling should be transparent and
their interaction with it should be confined to application submission. It is
important in a grid that a resource management and scheduling service can
interact with those that may be installed locally.
·
User
and Administrative GUI – The interfaces to the services and resources available
should be intuitive and easy to use as well as being heterogeneous in nature.
Typically, user and administrative access to grid applications and services is
through web-based interfaces.
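The registration and directory role of the information services described in the list above can be sketched as follows. This is a minimal in-memory illustration only. The class `InformationService`, its methods and the attribute names are hypothetical and do not correspond to any particular grid toolkit.

```python
class InformationService:
    """In-memory stand-in for a grid registration and directory service."""

    def __init__(self):
        self._entries = {}  # resource name -> attribute dictionary

    def register(self, name, **attributes):
        """Add a resource, or replace its attributes if already registered."""
        self._entries[name] = dict(attributes)

    def deregister(self, name):
        """Remove a resource; unknown names are ignored."""
        self._entries.pop(name, None)

    def query(self, **required):
        """Names of resources whose attributes match every required value."""
        return sorted(
            name
            for name, attrs in self._entries.items()
            if all(attrs.get(key) == value for key, value in required.items())
        )


info = InformationService()
info.register("cluster-a", kind="compute", os="linux", status="up")
info.register("cluster-b", kind="compute", os="linux", status="down")
info.register("archive-1", kind="storage", os="linux", status="up")
print(info.query(kind="compute", status="up"))
```

Because resources join, leave and change state, a real service would also need to age out stale entries and distribute the directory. The sketch leaves both out.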
There are growing numbers of Grid-related projects, dealing with
areas such as infrastructure, key services, collaborations, specific
applications and domain portals. Here we identify some of the most significant
to date.
Globus [Globus] [Foster97b] provides a software infrastructure that
enables applications to handle distributed, heterogeneous computing resources
as a single virtual machine. The Globus project is a U.S. multi-institutional
research effort that seeks to enable the construction of computational grids. A
computational grid, in this context, is a hardware and software infrastructure
that provides dependable, consistent, and pervasive access to high-end
computational capabilities, despite the geographical distribution of both
resources and users. A central element of the Globus system is the Globus
Toolkit, which defines the basic services and capabilities required to
construct a computational grid. The toolkit consists of a set of components
that implement basic services, such as security, resource location, resource
management, and communications.
It is necessary for computational grids to support a wide variety
of applications and programming paradigms. Consequently, rather than providing
a uniform programming model, such as the object-oriented model, the Globus
Toolkit provides a bag of services which developers of specific tools or
applications can use to meet their own particular needs. This methodology is
only possible when the services are distinct and have well-defined interfaces
(APIs) that can be incorporated into applications or tools in an incremental
fashion.
Globus is constructed as a layered architecture in which
high-level global services are built upon essential low-level core local
services. The Globus Toolkit is modular, and an application can exploit Globus
features, such as resource management or information infrastructure, without
using the Globus communication libraries. The Globus Toolkit currently consists
of the following (the precise set depends on Globus version):
·
An
HTTP-based ‘Globus Toolkit Resource Allocation Manager’ (GRAM) protocol is used
for allocation of computational resources and for monitoring and control of
computation on those resources.
·
An
extended version of the File Transfer Protocol, GridFTP, is used for data
access; extensions include use of connectivity layer security protocols,
partial file access, and management of parallelism for high-speed transfers.
·
Authentication
and related security services (GSI)
·
Distributed
access to structure and state information that is based on the Lightweight
Directory Access Protocol (LDAP). This service is used to define a standard
resource information protocol and associated information model.
·
Monitoring
of health and status of system components (HBM)
·
Remote
access to data via sequential and parallel interfaces (GASS) including an
interface to GridFTP.
·
Construction,
caching, and location of executables (GEM)
·
Advanced
Resource Reservation and Allocation (GARA)
Legion [Legion] [Grimshaw97] is an object-based ‘metasystem’,
developed at the University of Virginia. Legion provides the software
infrastructure so that a system of heterogeneous, geographically distributed,
high performance machines can interact seamlessly. Legion attempts to provide
users, at their workstations, with a single coherent virtual machine.
Legion takes a different approach from Globus to providing a grid
environment: it encapsulates all of its components as objects. The methodology
used has all the normal advantages of an object-oriented approach, such as data
abstraction, encapsulation, inheritance, and polymorphism. It can be argued
that many aspects of this object-oriented approach potentially make it ideal
for designing and implementing a complex environment such as a metacomputer.
However, using an object-oriented methodology does not come without a raft of
problems, many of which are tied up with the need for Legion to interact with
legacy applications and services.
Legion
defines the API to a set of core objects that support the basic services needed
by the metasystem. The Legion system has the following set of core object
types:
·
Classes
and Metaclasses – Classes can be considered managers and policy makers.
Metaclasses are classes of classes.
·
Host
objects – Host objects are abstractions of processing resources; they may
represent a single processor or multiple hosts and processors.
·
Vault
objects – Vault objects represent persistent storage, but only for the purpose
of maintaining the state of an object's persistent representation.
·
Implementation
Objects and Caches – Implementation objects hide the storage details of object
implementations and can be thought of as equivalent to an executable in UNIX.
·
Binding
Agents – A binding agent maps object IDs to physical addresses.
·
Context
objects and Context spaces – Context objects map context names to Legion object
IDs, allowing users to name objects with arbitrary-length string names.
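The two levels of naming in the list above (context names to object IDs, then object IDs to physical addresses) can be sketched as a pair of lookup tables. The class names, method names, identifiers and addresses below are hypothetical illustrations and are not Legion's API.

```python
class ContextSpace:
    """Maps human-readable context names to object IDs."""

    def __init__(self):
        self._names = {}

    def bind(self, name, object_id):
        self._names[name] = object_id

    def lookup(self, name):
        return self._names[name]


class BindingAgent:
    """Maps object IDs to physical addresses."""

    def __init__(self):
        self._addresses = {}

    def register(self, object_id, address):
        self._addresses[object_id] = address

    def resolve(self, object_id):
        return self._addresses[object_id]


def locate(context, agent, name):
    """Resolve a name in two steps: name -> object ID -> address."""
    return agent.resolve(context.lookup(name))


context = ContextSpace()
agent = BindingAgent()
context.bind("/home/alice/simulation", "object-0042")
agent.register("object-0042", ("host-a.example.org", 7000))
print(locate(context, agent, "/home/alice/simulation"))
```

Separating the two mappings means an object can move to another host by updating only the binding, while the name that users know stays unchanged.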
WebFlow [Akarsu98] [Haupt99] is a computational extension of the
web model that can act as a framework for wide-area distributed computing and
metacomputing. The main goal of the WebFlow design was to build a seamless
framework for publishing and reusing computational modules on the Web, so that
end users, via a web browser, can engage in composing distributed applications
using WebFlow modules as visual components and editors as visual authoring
tools. WebFlow has a three-tier Java-based architecture that can be considered
a visual dataflow system. The front-end uses applets for authoring,
visualization, and control of the environment. WebFlow uses a servlet-based
middleware layer to manage and interact with backend modules such as legacy
codes for databases or high performance simulations. WebFlow is analogous to
the Web: web pages can be compared to WebFlow modules, and the hyperlinks that
connect web pages to inter-modular dataflow channels. WebFlow content
developers build and publish modules by attaching them to web servers.
Application integrators use visual tools to link outputs of the source modules
with inputs of the destination modules, thereby forming distributed
computational graphs (or compute-webs) and publishing them as composite WebFlow
modules. A user activates these compute-webs by clicking suitable hyperlinks,
or customizing the computation either in terms of available parameters or by
employing some high-level commodity tools for visual graph authoring. The high
performance backend tier is implemented using the Globus toolkit:
·
The
Metacomputing Directory Services (MDS) is used to map and identify resources.
·
The
Globus Resource Allocation Manager (GRAM) is used to allocate resources.
·
The
Global Access to Secondary Storage (GASS) is used for high performance data
transfer.
The current WebFlow system is based on a mesh of Java-enhanced web
servers (Apache), running servlets that manage and coordinate distributed
computation. This management infrastructure is implemented by three servlets:
session manager, module manager, and connection manager. These servlets use URL
addresses and can offer dynamic information about their services and current
state. Each management servlet can communicate with others via sockets. The
servlets are persistent and application-independent. Future implementations of
WebFlow will use emerging standards for distributed objects and take advantage
of commercial technologies, such as CORBA, as the base distributed object
model.
WebFlow takes a different approach to both Globus and Legion. It
is implemented in a hybrid manner using a three-tier architecture that
encompasses both the Web and third party backend services. This approach has a
number of advantages, including the ability to plug in to a diverse set of
backend services. For example, many of these services are currently supplied by
the Globus toolkit, but they could be replaced with components from CORBA or
Legion.
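The compute-webs described above, in which outputs of source modules are linked to inputs of destination modules, amount to a dataflow graph. The sketch below shows one way such a graph could be executed in dependency order. It is not WebFlow code: the function `run_dataflow`, the module names and the data are hypothetical, and the standard library `graphlib` module (Python 3.9 or later) supplies the ordering.

```python
from graphlib import TopologicalSorter


def run_dataflow(modules, channels, sources):
    """Run each module once all of its upstream modules have produced output.

    modules:  name -> function taking a list of input values
    channels: name -> list of upstream module names feeding it
    sources:  name -> initial input values for modules with no upstream links
    """
    results = {}
    # static_order() yields every module after all of its predecessors.
    for name in TopologicalSorter(channels).static_order():
        upstream = channels.get(name, [])
        inputs = [results[u] for u in upstream] if upstream else sources[name]
        results[name] = modules[name](inputs)
    return results


# Hypothetical three-module pipeline: load -> square -> total.
modules = {
    "load": lambda inputs: list(inputs),
    "square": lambda inputs: [x * x for x in inputs[0]],
    "total": lambda inputs: sum(inputs[0]),
}
channels = {"load": [], "square": ["load"], "total": ["square"]}
sources = {"load": [1, 2, 3]}

results = run_dataflow(modules, channels, sources)
print(results["total"])
```

In a distributed setting each module would run on a backend resource and the channels would carry data between hosts. The ordering logic stays the same.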
Nimrod [Nimrod] is a tool designed to aid researchers undertaking
parametric studies on a variety of computing platforms. In Nimrod, a typical
parametric study is one that requires the same sequential application be
executed numerous times with different input data sets, until some problem
space has been fully explored.
Nimrod provides a declarative parametric modelling language for
expressing an experiment. Domain experts can create a plan for a parametric
computation (task farming) and use the Nimrod runtime system to submit, run,
and collect the results from multiple computers. Nimrod has been used to run
applications ranging from Bio-informatics and Operational Research, to the
simulation of business processes.
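A parametric study of the kind described above can be sketched as the same function run over the Cartesian product of its parameter values. This is a plain Python illustration and not Nimrod's declarative language. The function `simulate` and the parameter names and values are hypothetical stand-ins for a real application.

```python
from itertools import product


def expand_plan(parameters):
    """Yield one dict per point in the Cartesian product of parameter values."""
    names = list(parameters)
    for values in product(*(parameters[name] for name in names)):
        yield dict(zip(names, values))


def run_study(application, parameters):
    """Run the same application once per parameter combination."""
    return [(point, application(**point)) for point in expand_plan(parameters)]


def simulate(pressure, temperature):
    # Hypothetical stand-in for the sequential application under study.
    return pressure * temperature


results = run_study(simulate, {"pressure": [1, 2], "temperature": [10, 20, 30]})
for point, value in results:
    print(point, value)
```

Each point is independent of the others, which is what makes such studies suitable for task farming across many machines.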
A new system called Nimrod/G [Buyya00b] uses the Globus middleware
services for dynamic resource discovery and dispatching jobs over Grids. The
user need not worry about the way in which the complete experiment is set up,
data or executable staging, or management. The user can also set the deadline
by which the results are needed and the Nimrod/G broker tries to find the
optimal resources available and use them so that the user deadline is met or
the cost of computation is kept to a minimum.
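The trade-off between deadline and cost described above can be illustrated with a simple greedy heuristic: take resources in order of increasing cost until their combined throughput is enough to finish on time. This is only a sketch under stated assumptions. It is not Nimrod/G's scheduling algorithm, it is not guaranteed to minimise cost, and the resource names, rates and prices are hypothetical.

```python
def select_resources(resources, jobs, deadline_hours):
    """Pick cheap resources until their combined throughput meets the deadline.

    resources: list of (name, jobs_per_hour, cost_per_job) tuples.
    Returns the chosen names, or None if the deadline cannot be met.
    """
    needed_rate = jobs / deadline_hours
    chosen, rate = [], 0.0
    # Consider the cheapest resources first.
    for name, jobs_per_hour, cost_per_job in sorted(resources, key=lambda r: r[2]):
        if rate >= needed_rate:
            break
        chosen.append(name)
        rate += jobs_per_hour
    return chosen if rate >= needed_rate else None


# Hypothetical resources: (name, jobs per hour, cost per job).
resources = [("fast", 50, 3.0), ("cheap", 10, 1.0), ("medium", 20, 2.0)]
print(select_resources(resources, jobs=100, deadline_hours=4))
```

A tighter deadline forces the selection towards more expensive resources, and a deadline that no combination can meet yields `None`.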
The current focus of the Nimrod/G project team is on the use of
economic theories (as discussed in section 2.1) in grid resource management and
scheduling as part of a new framework called GRACE (Grid Architecture for
Computational Economy) [Buyya00a]. The components that make up GRACE include a
global scheduler (broker), bid-manager, directory server, and bid-server,
all interacting with grid middleware. GRACE also provides infrastructure APIs that
grid tools and applications programmers can use to develop software that supports the
computational economy. Grid resource brokers, such as Nimrod/G, use GRACE
services to dynamically trade with resource owners to select those resources
that best meet the user's cost or timeliness criteria.
Jini [Jini] is designed to provide a software infrastructure that
can form a distributed computing environment that offers network plug and play.
A collection of Jini-enabled processes constitutes a Jini community – a
collection of clients and services all communicating by the Jini protocols. In
Jini, applications will normally be written in Java and communicate using the
Java Remote Method Invocation (RMI) mechanism. Even though Jini is written in
pure Java, neither Jini clients nor services are constrained to be pure Java.
They may include native code, acting as wrappers around non-Java objects, or
even be written in some other language altogether. This enables a Jini
community to extend beyond the normal Java framework and link services and clients
from a variety of sources.
More fundamentally, Jini is primarily concerned with
communications between devices (not what devices do). The abstraction is the
service and an interface that defines a service. The actual implementation of
the service can be in hardware, software, or both. Services in a Jini community
are mutually aware and the size of a community is generally considered that of
a workgroup. A community’s lookup service (LUS) can be exported to other
communities, thus providing interaction between two or more isolated
communities.
In Jini a device or software service can be connected to a network
and can announce its presence. Clients that wish to use such a service can then
locate it and call it to perform tasks. Jini is built on RMI, which introduces
some constraints. Furthermore, Jini is not a distributed operating
system: an operating system provides services such as file access, processor
scheduling, and user logins, which Jini does not.
The five key concepts of Jini are:
·
Lookup
– search for a service and download the code needed to access it;
·
Discovery
– spontaneously find a community and join;
·
Leasing
– time-bounded access to a service;
·
Remote
Events – service A notifies service B of A’s state change. Lookup can notify
all services of a new service;
·
Transactions
– used to ensure that a system’s distributed state stays consistent.
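The interplay of lookup and leasing can be made concrete with a small sketch. This is not the Jini API (which is Java-based); the `LookupService` class and its `register`, `lookup` and `renew` methods are hypothetical names, and the injectable clock is an assumption made so that lease expiry can be exercised deterministically.

```python
import time


class LookupService:
    """Toy registry illustrating lookup and leasing; not the Jini API."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock    # injectable so lease expiry can be tested
        self._entries = {}     # service name -> (service, lease expiry time)

    def register(self, name, service, lease_seconds):
        """Advertise a service for a bounded period of time."""
        self._entries[name] = (service, self._clock() + lease_seconds)

    def lookup(self, name):
        """Return the service if its lease is still valid, otherwise None."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        service, expiry = entry
        if self._clock() >= expiry:
            del self._entries[name]   # lease lapsed: forget the service
            return None
        return service

    def renew(self, name, lease_seconds):
        """Extend a live lease; an expired or unknown service must re-register."""
        service = self.lookup(name)
        if service is None:
            raise KeyError(name)
        self._entries[name] = (service, self._clock() + lease_seconds)
```

Because a lease lapses unless it is renewed, a service that crashes or leaves the network eventually disappears from the registry without any explicit clean-up.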
The Common Component Architecture Forum [CCA] is attempting to
define a minimal set of standard features that a high performance component
framework would need to provide, or can expect, in order to be able to use
components developed within different frameworks. Such a standard will promote
interoperability between components developed by different teams across
different institutions.
The idea of using component frameworks to deal with the complexity
of developing interdisciplinary high performance computing applications is
becoming increasingly popular. Such systems enable programmers to accelerate
project development by introducing higher-level abstractions and allowing code
reusability. They also provide clearly defined component interfaces, which
facilitate the task of team interaction. These potential benefits have
encouraged research groups within a number of laboratories and universities to
develop and experiment with prototype systems. This has led to some
fragmentation and there is a need for interoperability standards.
There are three freely available systems whose primary focus is batching
and resource scheduling:
·
Condor
[Condor] is a software package for executing batch jobs on a variety of UNIX
platforms, in particular those that would otherwise be idle. The major features
of Condor are automatic resource location and job allocation, checkpointing,
and the migration of processes. These features are achieved without
modification to the underlying UNIX kernel. However, it is necessary for a user
to link their source code with Condor libraries. Condor monitors the activity
on all the participating computing resources; those machines that are
determined to be available are placed into a resource pool. Machines are then
allocated from the pool for the execution of jobs. The pool is a dynamic entity
– workstations enter when they become idle, and leave when they get busy.
·
The
Portable Batch System (PBS) [PBS] is a batch queuing and workload management
system (originally developed for NASA). It operates on a variety of UNIX
platforms, from clusters to supercomputers. The PBS job scheduler allows sites
to establish their own scheduling policies for running jobs in both time and
space. PBS is adaptable to a wide variety of administrative policies, and
provides an extensible authentication and security model. PBS provides a GUI
for job submission, tracking, and administrative purposes.
·
The
Sun Grid Engine (SGE) [SGE] is based on the software developed by Genias known
as Codine/GRM. In the SGE, jobs wait in a holding area and queues located on
servers provide the services for jobs. A user submits a job to the SGE, and
declares a requirements profile for the job. As soon as a queue becomes
available for execution of a new job, the SGE determines suitable jobs for the
queue and will dispatch the job with the highest priority or longest waiting
time; it will try to start new jobs in the least loaded and most suitable
queue.
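The dispatch rule described for the SGE can be sketched as follows. This is an illustration only, not SGE code: the `Job` and `Queue` records and the `dispatch` function are hypothetical, memory is used as a stand-in for the requirements profile, and ordering by priority first with waiting time as the tie-breaker is one assumed reading of "highest priority or longest waiting time".

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Job:
    name: str
    priority: int        # larger means more urgent
    submitted_at: float  # submission time; smaller means a longer wait
    min_memory_gb: int   # stand-in for the job's requirements profile


@dataclass
class Queue:
    name: str
    load: float          # current load, 0.0 (idle) to 1.0 (saturated)
    memory_gb: int


def dispatch(jobs: List[Job], queues: List[Queue]) -> Optional[Tuple[Job, Queue]]:
    """Choose the next job and the queue to start it in, or None."""
    # Highest priority first; among equals, the longest-waiting job first.
    for job in sorted(jobs, key=lambda j: (-j.priority, j.submitted_at)):
        # A queue is suitable if it satisfies the job's requirements.
        suitable = [q for q in queues if q.memory_gb >= job.min_memory_gb]
        if suitable:
            # Start the job in the least loaded suitable queue.
            return job, min(suitable, key=lambda q: q.load)
    return None
```

A Condor-style pool would differ mainly in that the set of queues (machines) changes over time as workstations become idle or busy.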
The Storage Resource Broker (SRB) [SRB] has been developed at the San Diego Supercomputer Center (SDSC) to provide “uniform access to distributed storage” across a range of storage devices, via a well-defined API. The SRB supports file replication, and this can occur either offline or on-the-fly. Interaction with the SRB is via a GUI. The SRB servers can be federated. The SRB is managed by an administrator, with authority to create user groups.
A key feature of the SRB is that it supports metadata associated with a distributed file system, such as location, size and creation date information. It also supports the notion of application-level (or domain-dependent) metadata, specific to the content and not generalisable across all data sets.
In contrast with traditional network file systems, the SRB is attractive for grid applications because it deals with large volumes of data that can transcend individual storage devices, handles metadata, and takes advantage of file replication.
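The distinction between file-system metadata, application-level metadata and replicas can be illustrated with a small catalogue sketch. It does not reproduce the SRB API; the `DataObject` and `Catalog` names, the field names and the example values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataObject:
    logical_name: str    # name that users see, independent of location
    size_bytes: int      # file-system style metadata
    created: str         # file-system style metadata (creation date)
    replicas: List[str] = field(default_factory=list)            # physical copies
    app_metadata: Dict[str, str] = field(default_factory=dict)   # domain-specific


class Catalog:
    """Toy metadata catalogue; not the SRB API."""

    def __init__(self):
        self._objects: Dict[str, DataObject] = {}

    def add(self, obj: DataObject) -> None:
        self._objects[obj.logical_name] = obj

    def replicate(self, logical_name: str, location: str) -> None:
        """Record a further physical copy of an existing object."""
        self._objects[logical_name].replicas.append(location)

    def locations(self, logical_name: str) -> List[str]:
        return list(self._objects[logical_name].replicas)

    def find(self, **criteria: str) -> List[str]:
        """Logical names whose application metadata match every criterion."""
        return sorted(
            name for name, obj in self._objects.items()
            if all(obj.app_metadata.get(k) == v for k, v in criteria.items())
        )
```

A client would query by domain attributes, for instance `catalog.find(instrument="ccd")`, and then choose among the returned object's replicas without needing to know in advance where the data is stored.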
A web portal allows application scientists and researchers to
access resources specific to a particular domain of interest via a web interface.
Unlike typical web subject portals, a grid portal may also provide access to
grid resources. For example, a grid portal may authenticate users, permit them
to access remote resources, help them make decisions about scheduling jobs, and
allow them to access and manipulate resource information obtained from and
stored in a remote database. Grid portal access can also be personalised by the
use of profiles which are created and stored for each portal user. These
attributes, and others, make grid portals the appropriate means for e-Science
users to access grid resources.
The NPACI HotPage [Hotpage] is a user portal that has been
designed to be a single point-of-access to computer-based resources. It
attempts to simplify access to resources that are distributed across member
organisations and allows them to be viewed as either an integrated grid system
or as individual machines.
The two key services provided by HotPage are information services
and resource access and management services. The information services are
designed to increase the effectiveness of users. They provide links to:
·
User
documentation and navigation;
·
News
items of current interest;
·
Training
and consulting information;
·
Data
on platforms and software applications;
·
Information
resources, such as user allocations and accounts.
HotPage also reports the status of resources and supports an easy
mechanism for submitting jobs to them.
The status information includes:
·
CPU
load/percent usage;
·
Processor
node maps;
·
Queue
usage summaries;
·
Current
queue information for all participating platforms.
HotPage’s interactive web-based service offers secure transactions
for accessing resources and allows the user to perform tasks such as command
execution, compilation, and running programs.
The SDSC GridPort toolkit [GridPort] is a reusable portal toolkit
that uses HotPage infrastructure. The two key components of GridPort are the
web portal services and the application APIs. The web portal module runs on a
web server and provides secure (authenticated) connectivity to the Grid. The
application APIs provide a web interface that helps end-users develop
customised portals (without having to know the underlying portal
infrastructure). GridPort is designed to allow the execution of portal services
and the client applications on separate web servers. The GridPort toolkit
modules have been used to develop science portals for application areas such
as pharmacokinetic modelling, molecular modelling, cardiac physiology, and
tomography.
The Grid Portal Collaboration is an alliance between NCSA, SDSC
and NASA IPG [NLANR]. The purpose of the Collaboration is to support a common
set of components and utilities to make portal development easier and allow
various portals to interoperate by using the same core infrastructure (namely
the GSI and Globus).
Example portal capabilities include the following:
·
Running
simulations either interactively or submitted to a batch queue.
·
File
transfer including: file upload, file download, and third party file transfers
(migrating files between various storage systems).
·
Querying
databases for resource/job specific information.
·
Maintaining
user profiles that contain information about past jobs submitted, resources
used, results information and user preferences.
The portal architecture is based on a three-tier model, where a
client browser communicates with a web server over a secure sockets connection
(via https). The web server is
capable of accessing various grid services using the Globus infrastructure. The
Globus toolkit provides mechanisms for securely submitting jobs to a Globus
gatekeeper, querying for hardware/software information using LDAP, and a secure
public key infrastructure (PKI) using GSI.
The portals discussion in this sub-section highlights the
characteristics and capabilities that are required in grid environments. Portals are also relevant at the information
and knowledge layers and the associated technologies are discussed in sections
4 and 5.
Having reviewed some of the most significant projects that are
developing grid technologies, in this section we now describe some of the major
projects that are deploying grid technologies. In particular we identify their
aims and scope, to indicate the direction of current activities.
The main objective of NASA’s Information Power Grid (IPG) [IPG] is
to provide NASA’s scientific and engineering communities with a substantial
increase in their ability to solve problems that depend on the use of
large-scale and/or distributed resources. The project team is focused on
creating an infrastructure and services to locate, combine, integrate, and manage
resources from across NASA centres, based on the Globus Toolkit. An important
goal is to produce a common view of these resources, and at the same time
provide for distributed management and local control. It is useful to note the
development goals:
·
Independent
but consistent tools and services that support a range of programming
environments used to build applications in widely distributed systems.
·
Tools,
services, and infrastructure for managing and aggregating dynamic collections
of resources: processors, data storage/information systems, communications
systems, real-time data sources and instruments, as well as human
collaborators.
·
Facilities
for constructing collaborative, application-oriented workbenches and problem
solving environments across NASA, based on the IPG infrastructure and
applications.
·
A
common resource management approach that addresses areas such as systems
management, user identification, resource allocations, accounting, and
security.
·
An
operational grid environment that incorporates major computing and data
resources at multiple NASA sites in order to provide an infrastructure capable
of routinely addressing larger scale, more diverse, and more transient problems
than is currently possible.
The EU
Data Grid [DataGrid] has three principal goals: middleware for managing the
grid infrastructure, a large scale testbed involving the upcoming Large Hadron
Collider (LHC) facility, and demonstrators (production quality for high energy
physics, also earth observation and biology). The DataGrid middleware is
Globus-based.
The GriPhyN (Grid Physics Network) [GriPhyN] collaboration, led by
the universities of Florida and Chicago, is composed of a team of experimental
physicists and information technology researchers who plan to implement the
first Petabyte-scale computational environments for data intensive science. The
requirements arise from four physics experiments involved in the project.
GriPhyN will deploy computational environments called Petascale Virtual Data
Grids (PVDGs) that meet the data-intensive computational needs of a diverse
community of thousands of scientists spread across the globe.
The Particle Physics Data Grid [PPG] is a consortium effort to
deliver an infrastructure for very widely distributed analysis of particle
physics data at multi-petabyte scales by hundreds to thousands of physicists.
It also aims to accelerate the development of network and middleware
infrastructure for data-intensive collaborative science.
The EuroGrid [EuroGrid] is a project funded by the European
Commission. It aims to demonstrate the use of grids in selected scientific and
industrial communities, address the specific requirements of these communities,
and highlight their use in the areas of biology, meteorology and computer-aided
engineering.
The objectives of the EuroGrid project include the support of the
EuroGrid software infrastructure, the development of software components, and
demonstrations of distributed simulation codes from different application areas
(biomolecular simulations, weather prediction, coupled CAE simulations,
structural analysis, real-time data processing).
The EuroGrid software is UNICORE (UNiform Interface to COmputing
REsources) [UNICORE] which has been developed for the German supercomputer
centres. It is based on Java-2 and uses Java objects for communication, with
the UNICORE Protocol Layer (UPL) handling authentication, SSL communication and
transfer of data; UNICORE pays particular attention to security.
The computation layer is probably the layer with the most software
technology that is currently available and directly useable. Nevertheless, as
this section has highlighted, distributed applications still offer many
conceptual and technical challenges. The requirements in section 3.2 express
some of these, and we highlight the following as key areas for further work:
Although we believe the service-oriented approach will prove to be
most useful at the information and knowledge layers, it is interesting to note
that it can be applied to the various systems discussed in this section. In
particular, it is compatible with the CORBA model, with the way Java is used as
a distributed environment, and with Globus. We expect to see this trend
continue in most of the significant grid technologies. We note that layered and
service-based architectures are common in descriptions of the systems discussed
in this section, notably Globus and IPG.
For example, IPG has a ‘grid common services layer’. Determining the core grid services in a
service-oriented architecture is also a research issue.
Java and
CORBA address several of the above issues and provide a higher level conceptual
model, in both cases based on distributed object systems. However most
currently deployed systems use Globus, with the Java-based UNICORE system an
interesting alternative. A key question in all of this work, however, is what
happens as scale increases? None of the systems considered in this section have
demonstrated very large scale deployments, though some are set to do so.
Pragmatically, some solutions will scale reasonably well, and the Web has
demonstrated that scalability can be achieved; e.g. consider search engines.
However we have significant concern about the need to cross organisational
boundaries and the obstacles this will impose, relating to system management
and security.
Many of these research issues are the subjects of working groups
in the Global Grid Forum [GGF].
[1] This notion of services is different from that described in section 2, although it can be regarded as a particular technical implementation of these ideas.