Research Agenda for the Semantic Grid: A Future e-Science Infrastructure
David De Roure, Nicholas Jennings and Nigel Shadbolt
Executive Summary
e-Science offers a promising
vision of how computer and communication technology can support and enhance the
scientific process. It does this by enabling scientists to generate, analyse,
share and discuss their insights, experiments and results in a more effective
manner. The underlying computer infrastructure that provides these facilities
is commonly referred to as the Grid. At this time, there are a number of grid
applications being developed and there is a whole raft of computer technologies
that provide fragments of the necessary functionality. However, there is
currently a major gap between these endeavours and the vision of e-Science in
which there is a high degree of easy-to-use and seamless automation and in
which there are flexible collaborations and computations on a global scale. To
bridge this practice–aspiration divide, this report presents a research agenda
whose aim is to move from the current state of the art in e-Science
infrastructure, to the future infrastructure that is needed to support the full
richness of the e-Science vision. Here the future e-Science research
infrastructure is termed the Semantic Grid (the relationship of the Semantic
Grid to the Grid is meant to parallel that of the Semantic Web to the Web).
In more detail, this
document analyses the state of the art and the research challenges that are
involved in developing the computing infrastructure needed for e-Science. In so
doing, a conceptual architecture for the Semantic Grid is presented. This
architecture adopts a service-oriented perspective in which distinct
stakeholders in the scientific process provide services to one another in
various forms of marketplace. The view presented in the report is holistic,
considering the requirements of e-Science and the e-Scientist at the
data/computation, information and knowledge layers. The data, computation and
information aspects are discussed from a distributed systems viewpoint and in
the particular context of the Web as an established large scale infrastructure.
A clear characterisation of the knowledge grid is also presented. This characterisation extends the emerging
metadata infrastructure with knowledge engineering techniques. These techniques
are shown to be the key to working with heterogeneous information and also to
working with experts and establishing communities of e-Scientists. The
underlying fabric of the Grid, including the physical layer and associated
technologies, is outside the scope of this document.
Having completed the
analysis, the report then makes a number of recommendations that aim to ensure
the full potential of e-Science is realised and that the maximum value is
obtained from the endeavours associated with developing the Semantic Grid.
These recommendations relate to the following aspects:
· The research issues associated with the technical and conceptual infrastructure of the Semantic Grid;
· The research issues associated with the content infrastructure of the Semantic Grid;
· The bootstrapping activities that are necessary to ensure the UK’s grid and e-Science infrastructure is widely disseminated and exemplified;
· The human resource issues that need to be considered in order to make a success of the UK e-Science and grid efforts;
· The issues associated with the intrinsic process of undertaking e-Science;
· The future strategic activities that need to be undertaken to maximise the value from the various Semantic Grid endeavours.
Contents
2.1 Justification of a Service-Oriented View
2.2.1 Service Owners and Consumers as Autonomous Agents
2.3 A Service-Oriented View of the Scenario
3.1 Grid Computing as a Distributed System
3.1.1 Distributed Object Systems
3.1.2 The Web as an Infrastructure for Distributed Applications
3.2 Data-Computational Layer Requirements
3.3 Technologies for the Data-Computational Layer
3.3.4 Nimrod/G Resource Broker and GRACE
3.3.6 The Common Component Architecture Forum
3.3.7 Batch/Resource Scheduling
3.4.2 The SDSC Grid Port Toolkit
3.4.3 Grid Portal Development Kit
4.1 Technologies for the Information Layer
4.1.2 Expressing Content and Metacontent
4.1.4 Towards an Adaptive Information Grid
4.1.5 The Web as an e-Science Information Infrastructure
4.1.6 Information Requirements of the Infrastructure
4.3 Information Layer Aspects of the Scenario
5.2 Ontologies and the Knowledge Layer
5.3 Technologies for the Knowledge Layer
5.4 Knowledge Layer Aspects of the Scenario
6.1 Technical and Conceptual Infrastructure
6.6 Future Proposed Directions