Research Agenda for the Semantic Grid:
A Future e-Science Infrastructure
David De Roure
Nicholas Jennings
Nigel Shadbolt
About the authors
David De Roure
Distributed Systems
Nigel Shadbolt
Advanced Knowledge Technologies
Nick Jennings
Agent Based Computing
Involved in three testbeds, two IRCs, TAG, ATF
Overview of talk
Motivation
Service oriented architectures
Distributed systems
Semantic Grid
Recommendations
Motivation
Vision of e-Science with a high degree of easy-to-use, seamless automation, and flexible collaborations and computations on a global scale
Gap between vision of e-Science and current endeavours
Three layer model compelling but much hand-waving about knowledge layer
Concern about scalability assumptions
Lack of holistic approach – Grid starting at socket on wall
Need for universal architecture
Aim
A Research Agenda aiming to move from the current state-of-the-art in e-Science infrastructure to the future infrastructure that is needed to support the full richness of the e-Science vision.
Caveat: This is not a blueprint for all e-Science infrastructure research!  The absence of recommendations doesn’t mean there are no research issues.
Why Semantic Grid?
The Semantic Grid is to the Grid what the Semantic Web is to the Web
Which begs a question…
What is the Semantic Web
“The Semantic Web is an extension of the current Web in which information is given a well-defined meaning, better enabling computers and people to work in cooperation.  It is the idea of having data on the Web defined and linked in a way that it can be used for more effective discovery, automation, integration and reuse across various applications.  The Web can reach its full potential if it becomes a place where data can be processed by automated tools as well as people”
- TBL
Document
Commissioned for Core e-Science Programme
Not a (or the) roadmap :-)
Aim to bridge communities
Not a comprehensive survey, e.g. physical layer and comms out of scope
Draft distributed in July to TAG
Samizdat publication was influential
Completed in December after further comment cycle
Final version ready for formal distribution December 2001
Document structure
Uses 3 layer model as a narrative (and sociological?) structure, not an architecture!
Introduction
Service-oriented architectures
Data/Computation
Information
Knowledge
Recommendations
Scenario
Architectural
Influences
Notion of Services
Abstract characterisation of some data or processing capabilities
Service of providing data on sea temperatures for last 50 years
Service of correlating sea temperature from data source 1 with annual rainfall from data source 2 and making prediction for coming year
Service of finding all data on climate change over past 20 years and being continually informed as new data becomes available
Service of finding all users who are interested in climate change
Notion of Services
Services have owner(s)
Owners set access conditions
Universal vs. Restricted access
Free vs. Paid for
Services have consumers
Service Level Agreement (contract) specifies relationship between service owners and service consumers
Services can be created and composed
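As a minimal sketch of the notion above (not an implementation from the document), a service can be modelled as a capability with an owner and access conditions, an agreement as the record linking owner and consumer, and composition as a new service built from two others. All names here (AccessConditions, Service, Agreement, agree, compose, lab-a, alice) are hypothetical, the temperature values are placeholders, and the rule that a composed service sums prices and intersects access lists is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class AccessConditions:
    allowed: Optional[frozenset] = None  # None means universal access
    price: float = 0.0                   # 0.0 means free


@dataclass(frozen=True)
class Service:
    name: str
    owner: str
    conditions: AccessConditions
    run: Callable  # the data or processing capability itself


@dataclass(frozen=True)
class Agreement:
    service: str
    owner: str
    consumer: str
    price: float


def agree(service: Service, consumer: str) -> Agreement:
    """Create an agreement if the owner's access conditions permit it."""
    allowed = service.conditions.allowed
    if allowed is not None and consumer not in allowed:
        raise PermissionError(f"{consumer} may not use {service.name}")
    return Agreement(service.name, service.owner, consumer,
                     service.conditions.price)


def compose(name: str, owner: str, first: Service, second: Service) -> Service:
    """Build a new service that feeds the output of `first` into `second`.

    Assumption: the composite is as restricted as its parts combined
    and costs the sum of their prices.
    """
    groups = [s.conditions.allowed for s in (first, second)
              if s.conditions.allowed is not None]
    allowed = frozenset.intersection(*groups) if groups else None
    price = first.conditions.price + second.conditions.price
    return Service(name, owner, AccessConditions(allowed, price),
                   lambda x: second.run(first.run(x)))


# Placeholder data source: invented values, universal and free.
temps = Service("sea-temperature", "lab-a", AccessConditions(),
                lambda years: [10.0 + 0.1 * y for y in range(years)])
# Restricted, paid-for processing service.
mean = Service("mean", "lab-b", AccessConditions(frozenset({"alice"}), 5.0),
               lambda xs: sum(xs) / len(xs))

combined = compose("mean-sea-temperature", "lab-b", temps, mean)
sla = agree(combined, "alice")
mean_of_3 = combined.run(3)
```

Here the composed service inherits the restriction and the price of its paid-for component, so a consumer outside the allowed set is refused at agreement time rather than at delivery time.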
The Grid: A Service Marketplace (1)
The Grid: A Service Marketplace (2)
The Grid: A Service Marketplace (3)
Key Functions
Service Owner
Service creation
Service advertisement
Service level agreement creation
Service delivery
Service Consumer
Service discovery
Service location
Service level agreement creation
Service result receipt
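The owner and consumer functions listed above can be sketched as operations on a shared registry. This is an illustrative toy under stated assumptions, not a design from the document: the Marketplace, Advert and Agreement names are hypothetical, keyword matching stands in for real discovery, and a Python callable stands in for a network endpoint.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Advert:
    owner: str
    name: str
    keywords: Set[str]
    endpoint: Callable[[str], str]  # stand-in for a network address


@dataclass
class Agreement:
    owner: str
    consumer: str
    service: str
    terms: str


class Marketplace:
    def __init__(self) -> None:
        self._adverts: Dict[str, Advert] = {}
        self.agreements: List[Agreement] = []

    # Owner side: advertise a service that has been created.
    def advertise(self, advert: Advert) -> None:
        self._adverts[advert.name] = advert

    # Consumer side: discover which services match an interest.
    def discover(self, keyword: str) -> List[str]:
        return sorted(a.name for a in self._adverts.values()
                      if keyword in a.keywords)

    # Consumer side: locate where a named service can be invoked.
    def locate(self, name: str) -> Callable[[str], str]:
        return self._adverts[name].endpoint

    # Both sides: record the service level agreement.
    def agree(self, name: str, consumer: str, terms: str) -> Agreement:
        advert = self._adverts[name]
        agreement = Agreement(advert.owner, consumer, name, terms)
        self.agreements.append(agreement)
        return agreement


market = Marketplace()
# Owner: create and advertise.
market.advertise(Advert("lab-a", "sea-temperature", {"climate", "ocean"},
                        lambda q: f"sea-temperature result for {q}"))
# Consumer: discover, locate, agree, then receive the delivered result.
found = market.discover("climate")
endpoint = market.locate(found[0])
sla = market.agree(found[0], "alice", "best effort")
result = endpoint("1950-2000")
```

Discovery and location are kept as separate calls to mirror the list above: the first answers "what exists?", the second "where do I invoke it?".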
Agent Technology:
A Canonical View
Distributed Systems
One-stop shop solutions?
Web-based computing?
Transactions aren’t large scale (or long-running)
Content is large scale but QoS is very poor
Web is powerful because it is so simple
Client-server constraint vs async message passing
Simplistic failure handling
Crossing firewalls is a relevant issue
One-stop shop solutions?
Distributed object systems e.g. CORBA?
Proven intranet solution
Software engineering paradigm e.g. UML
Not accepted as an internet computing solution
Peer-to-peer?
Self-organisation attractive
Concerns over heterogeneity and security
Resource discovery even more important
Hybrid solutions desirable (as well as inevitable)
Requirements
Broadly we need intranet features/services on an internet scale; e.g. reliable secure messaging, information consistency, directory services
The document catalogues a number of example technologies and deployments in order to establish requirements
Identifies research issues in line with GGF
Particular concern over need to cross organisational boundaries
Information Layer
Web is designed for information distribution
Useful as content infrastructure (see later!)
But beware well known problems:
Version control
Provenance
Quality
Pull model
Considerable activity in search, including distributed search
Links to databases community e.g. query decomposition and routing
Information Layer
Content and metacontent
XML(S)
RDF(S)
Need shared vocabularies
Semantic Web
“Semantic Web is not AI” - TBL
Some open research issues but this level established
Vision of metadata everywhere
What will motivate the tools for metadata creation?
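To illustrate content, metacontent and shared vocabularies, the sketch below holds RDF-style statements as plain (subject, predicate, object) tuples and queries them by pattern. It is a simplification for illustration rather than an RDF library: the example.org resources and the creator name are invented, the match helper is hypothetical, and the only real vocabulary used is the Dublin Core element namespace.

```python
# Shared vocabulary: the Dublin Core element set namespace.
DC = "http://purl.org/dc/elements/1.1/"

# Metacontent about two (invented) data sets, as RDF-style triples.
triples = {
    ("http://example.org/data/run42", DC + "creator", "A. Scientist"),
    ("http://example.org/data/run42", DC + "format", "text/csv"),
    ("http://example.org/data/run43", DC + "creator", "A. Scientist"),
    ("http://example.org/data/run43", DC + "format", "image/png"),
}


def match(statements, s=None, p=None, o=None):
    """Return the statements matching a pattern; None is a wildcard."""
    return sorted(t for t in statements
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))


# Because both descriptions use the same predicate for "creator",
# one query finds everything attributed to the same person.
by_creator = [t[0] for t in match(triples, p=DC + "creator", o="A. Scientist")]
csv_items = [t[0] for t in match(triples, p=DC + "format", o="text/csv")]
```

The query only works across data sets because both descriptions agree on the predicate names, which is the point of a shared vocabulary.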
Web services
Instantiation of service-oriented model
XML Protocol
Web Services Description Language
Universal Description Discovery and Integration
Workflow description
Web Services Flow Language
XLang
Other proposals emerging
Many issues discussed earlier apply to Web Services!
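A workflow description separates what is to be done from the engine that enacts it. The sketch below assumes a deliberately simple description format (a list of steps naming a service and the earlier steps whose outputs it consumes); it does not reproduce the syntax of Web Services Flow Language or XLang. The service names, the enact function and the input values are hypothetical placeholders.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Stand-ins for remote services; the numbers are placeholders, not data.
services = {
    "sea_temperature": lambda: [1.0, 2.0, 3.0],
    "annual_rainfall": lambda: [2.0, 4.0, 6.0],
    "correlate": pearson,
}

# The workflow description: pure data, independent of any engine.
workflow = [
    {"step": "temps", "service": "sea_temperature", "inputs": []},
    {"step": "rain", "service": "annual_rainfall", "inputs": []},
    {"step": "corr", "service": "correlate", "inputs": ["temps", "rain"]},
]


def enact(description, available):
    """Run each step in order, wiring earlier outputs to later inputs."""
    results = {}
    for step in description:
        args = [results[name] for name in step["inputs"]]
        results[step["step"]] = available[step["service"]](*args)
    return results


outputs = enact(workflow, services)
```

Keeping the description as data means the same workflow could be stored, exchanged, annotated with provenance, or enacted against a different set of service providers.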
Service composition example
‘Live’ Information Systems
i.e. not just publishing things at each other :-)
Examples
Live data and visualisation from experimental kit
Live video feeds
Videoconferencing
IRC, MUDs, chat rooms
Collaborative Virtual Environments
Access Grid
Also need metadata
How do these scale?
Pervasive Information Systems
Smart laboratory
Mediated spaces
Devices e.g. Electronic lab book
Presence
Of e-scientist and kit in virtual world
Of virtual world in physical world (augmented reality)
Research Issues
Strategies for e-Science content-types
Digital rights management in e-Science context
Provenance, for
Reuse
Repeat
Legal evidence
Adaptation and personalisation
Metadata in collaborative events
Collaboration for large scale community
Workflow description and enaction
Access control in context of automation
Semantic Grid
Knowledge layer
The aim of the knowledge layer is to act as an infrastructure to support the management and application of scientific knowledge to achieve particular types of goal and objective.
Note sheer scale of information content – we need abstracted and annotated content
Knowledge life cycle
Acquire – make info usable (could be tacit)
Model – bridges gap between acquisition and use
Retrieve – finding knowledge, or a subset
Reuse – rather than reacquire
Publish – right knowledge to right person at right time
Maintain – updating (and discarding)
Ontologies
Current work is extending RDF with concepts from knowledge representation languages
Shared vocabulary with rules/axioms for inference
DAML, DAML+OIL (UK strength)
W3C Web Ontology working group has been created
There are many kinds of ontologies
Domain, Task, Quality, Value, Personalisation, Argumentation
Need to be developed and maintained
Tools exist to work with emerging standards
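The idea of a shared vocabulary with rules for inference can be sketched with two rules in the style of RDF Schema: subclass relations are transitive, and an instance of a class is also an instance of its superclasses. The class and instance names are invented for illustration, the infer function is hypothetical, and the naive fixed-point loop is an assumption chosen for brevity rather than a recommended reasoner design.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

# A tiny (invented) domain ontology plus one annotated data set.
asserted = {
    ("SeaTemperatureSeries", SUBCLASS, "OceanObservation"),
    ("OceanObservation", SUBCLASS, "ClimateData"),
    ("run42", TYPE, "SeaTemperatureSeries"),
}


def infer(statements):
    """Apply the two rules repeatedly until no new facts appear."""
    facts = set(statements)
    while True:
        new = set()
        for (a, p, b) in facts:
            for (c, q, d) in facts:
                if b != c or q != SUBCLASS:
                    continue
                if p == SUBCLASS:
                    new.add((a, SUBCLASS, d))  # subclass is transitive
                elif p == TYPE:
                    new.add((a, TYPE, d))      # instances move up the hierarchy
        if new <= facts:
            return facts
        facts |= new


closure = infer(asserted)
# A search for climate data now finds run42, although nobody
# annotated it with that class directly.
climate_items = sorted(s for (s, p, o) in closure
                       if p == TYPE and o == "ClimateData")
```

Only three statements were asserted; the remaining facts follow from the axioms, which is what distinguishes an ontology from a flat list of tags.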
Knowledge services
Services needed for Semantic Grid:
Knowledge discovery (e.g. patterns in information sets)
Clustering and indexing
Ontology mapping
Dynamic annotation
Summarisation
Visualisation
Monitoring, diagnosis, assessment
Problem Solving Environments
Involve knowledge aspects
Exploiting resources to solve particular problems
Approach problem in terms of application area semantics
Provide semi-automation
Potentially supported by reasoning services
Long history in knowledge-based systems
Could incorporate collaboration e.g. Access Grid
Research Issues
Languages and infrastructures for knowledge services (e.g. DAML-S)
Methods for large-scale ontologies
Distributed annotation services
Knowledge capture tools as plugins
Dynamic linking, visualisation, navigation
Retrieval based on annotations
Routine natural language processing
Internet-based reasoning services
Promotion of collaboration, e.g. in Access Grid
Recommendations
Recommendations
Technical and conceptual infrastructure
Grid toolkits
Smart Laboratories
Service-oriented architectures
Agent-based approaches
Network philosophies
Trust and provenance
Recommendations
Content infrastructure
Metadata and annotation
Knowledge technologies
Integrated media
Content presentation
Recommendations
Bootstrapping activities
Starter kits
Exemplar and reference sites
Use cases
Recommendations
Human Resource issues
Community building
System support
Training
Recommendations
e-Science intrinsics
e-Science workflow and collaboration
Pervasive e-Science
Recommendations
Future proposed directions
Core computer science research
Extension to e-Anything
e-Health
e-Learning
e-Society
e-Business
…