Tao F.1, Chen L.1, Cox S.J.3, Shadbolt N.R.1, Puleston C.2, Goble C.2
1 School of Electronics and Computer Science
University of Southampton, UK
{lc, nrs, ft}@ecs.soton.ac.uk
2 Department of Computer Science
University of Manchester, UK
{carole, colin.puleston}@cs.man.ac.uk
3 School of Engineering Sciences
University of Southampton, UK
sjc@soton.ac.uk
Semantic Web technologies are evolving the Grid towards the Semantic Grid [2] to yield an intelligent grid which allows seamless process automation, easy knowledge reuse and collaboration within a community of practice. We discuss our endeavours in this direction in the context of Grid enabled optimisation and design search in engineering (“Geodise” project) [3]. In our work we have developed a semantics-based Grid-enabled computing architecture for Geodise. The architecture incorporates a service-oriented distributed knowledge management framework for providing various semantic and knowledge support. It uses ontologies as the conceptual backbone for information-level and knowledge-level computation. We also describe ontological engineering work and a service-oriented approach to ontology deployment. We present several application examples that show the benefit of semantic support in Geodise.
E-Science [1] offers a promising vision of future large scale science over the Internet where the sharing and coordinated use of diverse resources in dynamic, distributed virtual organisations is commonplace. The Grid [4] has been proposed as a fundamental computing infrastructure to support the vision of e-Science. Convergence between the Grid and recent developments in web service technologies [5] [11] [12] has seen Grid technologies evolving towards an Open Grid Services Architecture (OGSA) [6]. OGSA views the Grid as providing an extensible set of services, and it enables rapid assembly and disassembly of such services into transient confederations so that tasks wider than those enabled by the individual components can be accomplished. At this time, a number of Grid applications have been developed [3] [7] [26] and there is a whole raft of middleware that provides core Grid functionality, such as Globus [9] and Condor [10]. However, there is currently a major gap between these endeavours and the vision of e-Science, in which there is a high degree of easy-to-use and seamless automation and in which there are flexible collaborations and computations on a global scale. It has been commonly agreed [2] that the realisation of the e-Science vision will rely on how the heterogeneous resources of the Grid, which include data, information, hardware (clusters, servers etc.), software (computation codes), capabilities, and knowledge on how to use these assets, can be effectively described, represented, discovered, pre/post-processed, interchanged, integrated and eventually reused to solve problems.
The Semantic Grid [2], a future e-Science infrastructure, has been proposed to bridge the divide between practice and aspiration on the Grid. The Semantic Grid aims to support the full richness of the e-Science vision by considering the requirements of e-Science and the e-Scientist throughout their use of Grid resources. The enabling technologies that evolve the Grid to the Semantic Grid are the Semantic Web [13] [14] and advanced knowledge technologies [15]. The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. It is the idea of having data on the Web defined and linked in such a way that it can be used for more effective discovery, automation, integration, and reuse across various applications. Advanced knowledge technologies are concerned with the process of scientific knowledge management on the Grid in terms of a life cycle of knowledge-oriented activity that ranges over knowledge acquisition, modelling, retrieval, reuse, publishing and maintenance.
In this paper we illustrate how Semantic Grid technologies are being exploited to assist engineers in engineering design search and optimisation (EDSO). The goal is twofold. The first aim is to expose EDSO resources with relevant metadata, common vocabulary and shared meaning so that they can be shared and reused seamlessly. The second is to enable EDSO on the Grid by making use of semantic information in these EDSO resources. In the following we first introduce an integrated architecture for Grid-enabled design search and optimisation in engineering. The distinguishing feature of the architecture is the incorporation of the knowledge and ontology components, which migrates the Grid towards the Semantic Grid. Section 3 describes the core underlying technique of the Semantic Grid, i.e. ontology engineering work including ontology development, representation and deployment. Section 4 presents several application examples, which demonstrate how semantic support can facilitate various aspects of EDSO processes on the Grid. Finally some initial conclusions are drawn from our work on Geodise.
Grid enabled optimisation and design search in engineering (Geodise [3]) is one of the UK e-Science pilot projects. It is intended to enable engineers to carry out engineering design search and optimisation through seamless access to a state-of-the-art collection of optimisation and search tools, industrial strength geometry modelling and meshing tools (ProE, Gambit) and analysis codes (FLUENT), and distributed computing and data resources on the Grid. To achieve this objective, Geodise utilises two recent technologies, the Semantic Web and advanced knowledge technologies, to aid engineers in the design process through semantic support and the exploitation of EDSO domain knowledge, thus enabling new designs to be developed more rapidly or at lower cost. This requirement suggests that a knowledge infrastructure be developed to support the distributed management and application of scientific knowledge on the Grid. Figure 1 shows the Geodise architecture under the Semantic Grid paradigm. This architecture consists of four main components: the Geodise portal, the application service provider, the optimisation module and the computation module. The application service provider caters for both design and analysis tools integrated with support for databases that provide information about previous designs.

Figure 1: The semantic Grid architecture for engineering design search and optimisation
The optimisation component provides a variety of optimisation algorithms by which each design may be evaluated in terms of a selected objective function. The computation component calculates values for the objective function that is being optimised. All these components are viewed and implemented as web/Grid services and physically distributed. The user front end of Geodise is the Geodise portal, which allows users to locate and compose services they require, seeking advice as necessary.
Though the four modules described above form the main fabric of the Geodise architecture (i.e. the data, computation and applications), the components that are central to providing knowledge and intelligence for the Grid, and hence play a key role in the evolution of the Grid towards the Semantic Grid, are the ontology, the knowledge repository and the intelligent systems. The ontology component provides a shared, explicit specification of the conceptualisation for the EDSO domain. It consists of common vocabularies to represent domain concepts and the relationships between them. EDSO ontologies allow engineers to describe EDSO resources in a semantically consistent way so that they can be shared and processed by both machines and humans. Ontologies lay the foundation on which seamless access to heterogeneous distributed resources on the Grid can be achieved. The knowledge repository component is intended to expose accumulated design expertise and/or practices to designers so that new design runs can be conducted based on previous design experience. The EDSO knowledge repository contains the intellectual and knowledge-based assets of the EDSO domain. These assets include domain dependent, problem specific expertise embodied in a set of semantically enriched resources which have been produced and archived by the EDSO designers during previous design runs and can subsequently be reused in various ways to enhance their design capabilities in the future.
Intelligent systems aim to provide knowledge-based decision-making support for engineers to develop new designs. This may be done in the analysis codes and resources modules, for example, through an intelligent application manager that makes use of intelligence based on domain knowledge and an intelligent resource provider that makes use of intelligence on top of Grid infrastructure and/or middleware. In Geodise we initially concentrate on exploiting EDSO domain knowledge to facilitate problem solving. Knowledge-based support for decision-making can be provided at multiple knowledge intensive points of the design process and at multiple levels of granularity such as at the process level (what should be done next after a previous task), component level (if the next task is optimisation, which methods or algorithms should be chosen from among a suite of 40+ optimisers), or parameter level (if a genetic algorithm optimiser is selected, how to set the control parameters such as population size).
To realise the three key components, i.e. the ontology, knowledge repository and intelligent systems that underpin the idea of the Semantic Grid, we have proposed and developed an integrated service-oriented framework for distributed knowledge management [27] as shown in Figure 2. In this framework, knowledge about a specific domain is acquired, modelled and represented using a variety of techniques and formalisms. It includes ontologies, knowledge bases and other domain related information. This knowledge is then saved in a knowledge warehouse (or repository). All activities related to knowledge consumption and supply are realised as knowledge services. Users are provided with a community knowledge portal as the entrance point. The knowledge portal facilitates the use of knowledge with different levels of access control. The framework has a layered modular structure with each component dealing with a specific aspect of the knowledge engineering process in a co-ordinated way. For example, ontologies can be built from knowledge acquisition, and further used to create knowledge bases or to do semantic annotation. These knowledge bases or their associated annotation archives, having been semantically enriched, can then be exploited by the services. These services have mechanisms for querying or searching semantic content so as to facilitate knowledge publishing, use/reuse and maintenance.

Figure 2: The service-oriented knowledge management framework
As the enabling knowledge infrastructure for the Semantic Grid, the service-oriented knowledge management framework covers all aspects of the knowledge management lifecycle. However, as can be seen from Figure 1, ontologies play a central role in the success of the Semantic Grid and its applications. They serve as a conceptual backbone for automated information access, sharing and reuse, and also enable semantic-driven knowledge processing on the Semantic Grid [16]. Therefore this paper focuses on ontological engineering and the use of semantics for e-Science in the context of Geodise. In Geodise, ontologies have been created that capture the concepts and terms of the design process, in other words, the common vocabulary used by design engineers to describe what they do. In turn these ontologies are used to describe problem setup, database schemas, computation algorithms, design processes and design results with rich semantics. With the semantic information in place, communication barriers between people and software agents are removed. Resources become transparent to authorised users so that they can be seamlessly shared and aggregated for use. The benefit of conducting EDSO on the Semantic Grid is that engineers, in particular designers wishing to leverage previous expert use of the system, are able to share not only computational resources but also the wider knowledge of the community.
We have carried out extensive knowledge acquisition for the EDSO domain using the CommonKADS knowledge engineering methodology [25] and the PC PACK toolkit [31] [27]. The acquired knowledge is modelled as either ontologies or rules in knowledge bases. A set of ontologies has been built to conceptualise the characteristics of the EDSO domain using the OilEd ontology editor [29]. For example, Figure 3 shows the EDSO task ontology. The left panel displays the hierarchical structure of the EDSO tasks, plus all other information types that are relevant to EDSO tasks. The right panel is used to define an individual task by specifying its properties. Defining a property amounts to establishing relationships among concepts within one or multiple ontologies.

Figure 3: EDSO task ontology
Since components in the Geodise architecture are web/Grid services, concepts in the task ontology are in fact different types of services. An instance of a task is a service specified for accomplishing a designated function. This allows us to adopt the DAML-S web services description framework to describe the properties and functionality of an EDSO task service. To build the EDSO-specific task ontology, we have specialised the high-level concepts of DAML-S with terms from the EDSO domain ontologies while preserving the DAML-S service description structure, such as service profile, service model and service grounding. This makes the EDSO task ontology consistent in both structural description and content semantics. This in turn ensures that the EDSO task ontology can be shared and understood in the EDSO community, thus facilitating dynamic automated service discovery and composition. EDSO ontologies are represented in an expressive markup language with well-defined semantics, such as DAML+OIL [17] or OWL [18]. DAML+OIL takes an object-oriented-like approach, with the characteristics of the domain being described in terms of classes and properties. It builds upon existing Web standards, such as XML and RDF, and is underpinned by an expressive description logic. It supports the classification of concepts based on their property descriptions, which gives a description-based reasoning capability. Ontological reasoning can be used for subsumption and/or consistency checking. It can also be used as a concept match-maker. For instance, we can retrieve sets of ontological concepts matching arbitrarily defined queries through classification and subsumption reasoning. Ontological reasoning provides a foundation for semantics-based service discovery, as will be seen later when we use the EDSO task ontology to perform service composition. Figure 4 shows a fragment of the EDSO task ontology in DAML+OIL.

Figure 4: EDSO task ontology - the fragment of the geometry design task
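The subsumption checking and concept match-making described above can be illustrated with a minimal Python sketch. It is a deliberately simplified stand-in for description logic reasoning: it follows asserted parent links and compares inherited property descriptions, which is far weaker than what a reasoner over DAML+OIL provides. The class `TinyOntology`, the concept names and the property names are all hypothetical and are not taken from the EDSO ontologies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A concept described by its parent concepts and property descriptions."""
    name: str
    parents: tuple = ()
    properties: frozenset = frozenset()  # (property, filler) pairs


class TinyOntology:
    """Toy ontology: asserted hierarchy plus inherited property descriptions."""

    def __init__(self):
        self.concepts = {}

    def add(self, name, parents=(), properties=()):
        self.concepts[name] = Concept(name, tuple(parents), frozenset(properties))

    def ancestors(self, name):
        """All ancestors of a concept (transitive closure over parent links)."""
        seen = set()
        stack = list(self.concepts[name].parents)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.concepts[current].parents)
        return seen

    def all_properties(self, name):
        """Own property descriptions plus those inherited from ancestors."""
        props = set(self.concepts[name].properties)
        for ancestor in self.ancestors(name):
            props |= self.concepts[ancestor].properties
        return props

    def subsumes(self, general, specific):
        """True if `general` is `specific` itself or one of its ancestors."""
        return general == specific or general in self.ancestors(specific)

    def match(self, required_parent, required_properties):
        """Concepts under `required_parent` that carry all required properties."""
        wanted = set(required_properties)
        return sorted(
            name for name in self.concepts
            if self.subsumes(required_parent, name)
            and wanted <= self.all_properties(name)
        )


# Hypothetical fragment of a task ontology, for illustration only.
onto = TinyOntology()
onto.add("Task")
onto.add("GeometryDesignTask", ["Task"], [("hasOutput", "GeometryFile")])
onto.add("MeshGenerationTask", ["Task"],
         [("hasInput", "GeometryFile"), ("hasOutput", "MeshFile")])
onto.add("OptimisationTask", ["Task"], [("hasInput", "ObjectiveFunction")])
onto.add("GeneticAlgorithmTask", ["OptimisationTask"],
         [("hasParameter", "PopulationSize")])

print(onto.subsumes("Task", "GeneticAlgorithmTask"))
print(onto.match("Task", [("hasInput", "GeometryFile")]))
```

In this sketch a query is simply a parent concept plus a set of required property descriptions; a query for tasks that take a geometry file as input retrieves the mesh generation task, which mirrors in miniature how a service request could be matched against task descriptions.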
We have developed ontology services to facilitate the deployment of the EDSO ontologies in Geodise. The ontology service is implemented as a typical SOAP-based web service, independent of any specific domain. It can therefore access any DAML+OIL ontology that is available over the Internet. The ontology service consists of four components: an underlying data model that holds the ontology (the knowledge model) and allows the application to interact with it through a well-defined API; an ontology server that provides access to the concepts in the underlying ontology data model and their relationships; the FaCT reasoner [30], which provides reasoning capabilities; and a set of user APIs that interface the user’s applications with the ontology. By using the service’s APIs and the FaCT reasoner, common ontological operations, such as subsumption checking, retrieving definitional information, navigating concept hierarchies, and retrieving lexical information, can be performed when required.
As a standard web service, the ontology service is itself a type of knowledge asset and can be accessed, shared, and reused through its WSDL description. It has been developed using Java technologies and deployed using Apache Tomcat and Axis.
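To make the layering of the ontology service more concrete, the following Python sketch separates a data model, a pluggable reasoner and a user-facing API offering the four operations named above. It is an assumption-laden illustration only: the actual service is a Java/SOAP web service backed by the FaCT reasoner, and every class and method name here (`OntologyModel`, `ToldHierarchyReasoner`, `OntologyService` and so on) is hypothetical. The reasoner shown merely follows asserted parent links.

```python
from typing import Protocol


class Reasoner(Protocol):
    """Interface a reasoner is assumed to offer to the service."""

    def subsumes(self, model: "OntologyModel", general: str, specific: str) -> bool: ...


class OntologyModel:
    """Data model: holds concepts, their parents, definitions and labels."""

    def __init__(self):
        self.parents = {}      # concept -> set of direct parents
        self.definitions = {}  # concept -> definitional text
        self.labels = {}       # concept -> human-readable label

    def add_concept(self, name, parents=(), definition="", label=None):
        self.parents[name] = set(parents)
        self.definitions[name] = definition
        self.labels[name] = label or name


class ToldHierarchyReasoner:
    """Naive stand-in for a DL reasoner: follows asserted parent links only."""

    def subsumes(self, model, general, specific):
        frontier = [specific]
        visited = set()
        while frontier:
            current = frontier.pop()
            if current == general:
                return True
            if current not in visited:
                visited.add(current)
                frontier.extend(model.parents.get(current, ()))
        return False


class OntologyService:
    """User-facing API combining the data model and a reasoner."""

    def __init__(self, model, reasoner):
        self.model = model
        self.reasoner = reasoner

    def is_subsumed_by(self, specific, general):
        """Subsumption checking, delegated to the reasoner."""
        return self.reasoner.subsumes(self.model, general, specific)

    def definition_of(self, concept):
        """Retrieve definitional information."""
        return self.model.definitions[concept]

    def children_of(self, concept):
        """Navigate the concept hierarchy one level down."""
        return sorted(c for c, ps in self.model.parents.items() if concept in ps)

    def label_of(self, concept):
        """Retrieve lexical information."""
        return self.model.labels[concept]


# Hypothetical concepts, for illustration only.
model = OntologyModel()
model.add_concept("Task", definition="Any EDSO activity", label="task")
model.add_concept("Optimisation", ["Task"], "Search for an optimal design", "optimisation")
model.add_concept("Meshing", ["Task"], "Generate a mesh from a geometry", "meshing")
service = OntologyService(model, ToldHierarchyReasoner())
print(service.children_of("Task"))
```

Keeping the reasoner behind a small interface means the same user API could, in principle, be served by a full description logic reasoner without changing client code.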
Semantic support can be delivered in the Geodise problem solving environment (PSE) through the following application scenarios and exemplars.
Content in the software manual for the script editor is processed and enriched using a pre-defined ontology. This is demonstrated in Figure 5, where instances of command usages are generated manually in Protégé 2000 based on the usage ontology and the corresponding usage entry in the command manual of Gambit, a tool for generating meshes from a geometry. Each Gambit command can operate with a set of keywords and parameters in a certain syntax and grammar. In Geodise, engineers need to edit these domain scripts frequently, with guidance from the manual, to make sure that the scripts are correct.

Figure 5: Building Gambit command ontology
The ontology assisted domain script editor makes use of the pre-built command usage instances and colorizes the script syntax. It also provides real-time context sensitive hinting and auto-completion as illustrated in Figure 6. All these functionalities operate by consuming the semantically enriched content - the Gambit command usage instances.
Since the editor can load in any ontology, it is domain independent and has the potential to assist script editing in any other domain as long as the corresponding ontology is available.

Figure 6: Ontology assisted Gambit script editing
We have also demonstrated not only hints on parameters to configure a command, but also horizontal suggestions of 'next steps' based on expert knowledge. To develop this further would require additional knowledge capture.
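The context sensitive hinting and auto-completion described above can be sketched in Python as lookups over a list of command usage instances. This is a minimal illustration under stated assumptions: the commands, keywords and hints below are invented for the example and are not taken from the Gambit command manual, and a real editor would load the instances from the ontology rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandUsage:
    """One command usage instance: a command, its keywords and a short hint."""
    command: str
    keywords: tuple
    hint: str


# Hypothetical usage instances; not actual Gambit syntax.
USAGES = [
    CommandUsage("volume create", ("width", "depth", "height"), "Create a brick volume"),
    CommandUsage("volume mesh", ("size",), "Mesh an existing volume"),
    CommandUsage("face create", ("width", "height"), "Create a rectangular face"),
]


def complete_command(prefix, usages=USAGES):
    """Suggest commands whose name starts with the text typed so far."""
    prefix = prefix.strip().lower()
    return sorted(u.command for u in usages if u.command.startswith(prefix))


def hint_keywords(line, usages=USAGES):
    """For a recognised command, list the keywords not yet present on the line."""
    text = line.strip().lower()
    for usage in usages:
        if text.startswith(usage.command):
            used = set(text[len(usage.command):].split())
            return [k for k in usage.keywords if k not in used]
    return []


print(complete_command("vol"))
print(hint_keywords("volume create width 10"))
```

Because both functions depend only on the list of usage instances, swapping in instances built from a different ontology would retarget the same logic to another scripting domain.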
In Geodise, a Matlab structure, as shown in Figure 7, is used to capture all necessary metadata about a problem and produce a high level human readable abstraction. As a problem pre-set, the Matlab structure captures all problem definition related information embedded in the STEP file, such as all the possible design parameters which may be varied to modify the design.

Figure 7: Matlab structure for Geometry
The analysts can also change the default value as necessary. The result is an instance of a problem setup in the problem profile. The analyst can also load existing instances from the problem profile repository to carry on analyzing work conducted previously.

Figure 8: XML Schema of EDSO problem setup
However, in a collaborative environment, different CAD designers may use different metadata to describe a problem in the Matlab structure. This may cause inconsistency and inhibit sharing of previously generated CAD designs. Using an ontology avoids this. The ontologies can also be maintained separately at a centralized place, as demonstrated in Figure 8, and used in the construction of a Matlab structure to describe a problem. CAD designers interact with a set of ontology driven forms, demonstrated in Figure 9 and Figure 10, which are automatically generated based on a controlled set of vocabularies and relationships specified in the ontology. Once the form has been completed by the CAD designers, an instance of the component description is ready. This instance is passed to the following phases where it can be loaded again by analysts who, according to design requirements, can further specify the desired design variables by manipulating the list of design parameters (e.g. checking off some parameters), or by changing the range and default value of some parameters, etc. We call this analyst operation 'problem setup'. It takes place in a similar GUI and, once it is finished, we have an instance that represents a particular concrete problem setup. Note that the examples here are so far based on XML Schema and are only for demonstrating the scenario. The auto-GUI rendering uses Jaxfront [38] and the semantics are expressed in XML Schema using XML spy [39].

Figure 9: Instances of design variables

Figure 10: Instance of problem setup
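The two-stage process described above (a designer describing a component with controlled terms, then an analyst performing problem setup) can be sketched in Python as follows. This is only an illustration under stated assumptions: the vocabulary terms, the `Parameter` record and both function names are hypothetical, and the original work expresses the semantics in XML Schema with automatically rendered forms rather than in code like this.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Parameter:
    """A design parameter with its allowed range and default value."""
    name: str
    lower: float
    upper: float
    default: float


# Hypothetical controlled vocabulary; in Geodise the terms would come from the ontology.
VOCABULARY = {"chord_length", "wing_span", "sweep_angle"}


def describe_component(parameters):
    """Designer step: accept only parameters named with vocabulary terms."""
    unknown = [p.name for p in parameters if p.name not in VOCABULARY]
    if unknown:
        raise ValueError(f"terms not in the controlled vocabulary: {unknown}")
    return {p.name: p for p in parameters}


def setup_problem(component, selected, overrides=None):
    """Analyst step: keep the selected design variables and apply range or default changes."""
    overrides = overrides or {}
    problem = {}
    for name in selected:
        param = replace(component[name], **overrides.get(name, {}))
        if not param.lower <= param.default <= param.upper:
            raise ValueError(f"default of {name} lies outside its range")
        problem[name] = param
    return problem


# Illustrative values only.
component = describe_component([
    Parameter("chord_length", 1.0, 3.0, 2.0),
    Parameter("wing_span", 10.0, 30.0, 20.0),
])
problem = setup_problem(component, ["wing_span"], {"wing_span": {"upper": 25.0}})
print(problem)
```

Rejecting terms outside the vocabulary at the designer step is what keeps component descriptions from different designers consistent, so that any analyst can load and refine them.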
Scientific activities often involve constructing a workflow. We have developed a Workflow Construction Environment (WCE) for Geodise, as shown in Figure 11, which is intended to (1) exploit the semantically enriched services for semantic-based service discovery and reuse, (2) generate semantic workflows for use in future problem solving, and (3) provide knowledge-based advice [28] on service composition.
In the service-oriented Grid computing paradigm this process amounts to semantically enriching workflow instances and components so that resources on the Grid can be discovered and assembled easily. In other words, the successful orchestration of component services into a valid workflow specification is heavily dependent on bodies of domain knowledge as well as semantically enriched service descriptions.
Semantic service description is undertaken using ontologies accessed via the ontology services. The process of specifying semantic service descriptions is carried out in two steps. Firstly, domain ontologies, such as the task ontology and the function ontology, are created. Then, the domain specific service ontology is built using concepts from the domain ontologies. The semantic descriptions of domain-specific services are actually instances of concepts from the service ontology. Semantic service descriptions are stored in the Semantic Service Description component.
The main components for semantic resource enrichment, discovery and reuse are the Component (Service) Editor (the middle right panel), the Ontology Browser (the left panel) and the Workflow Editor (the middle panel). The Component (Service) Editor is a frame-like data-storage structure. It is used to specify a service description for service discovery or to define a service directly by filling in the required data fields. The structure of the Component Editor is dynamically generated in accordance with the service ontology, thus semantically enriching the service when the service is defined. The Ontology Browser displays ontologies that provide service templates for workflow construction. Workflows are built in the Workflow Editor, in which users either discover an appropriate service via semantic service matching or specify a semantically enriched service afresh. These services are connected in a semantically consistent way to form a workflow.
Each time a workflow is constructed for a particular design problem, it can be archived to form a semantically enriched problem/solution within a knowledge repository. This facilitates the re-use of previous designs, while avoiding the overhead of manually annotating the solution.

Figure 11: Knowledge guided workflow composition
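Semantic service matching and semantically consistent connection of services, as described for the Workflow Editor, can be sketched in Python as type checks against a small concept hierarchy. This is an illustrative sketch, not the WCE implementation: the hierarchy, the service names and the single-input, single-output `Service` record are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical fragment of a data-type hierarchy: child concept -> parent concept.
IS_A = {
    "IGESFile": "GeometryFile",
    "STEPFile": "GeometryFile",
    "GeometryFile": "File",
    "MeshFile": "File",
}


def is_kind_of(concept, target):
    """Walk up the hierarchy to test whether `concept` is a kind of `target`."""
    while concept is not None:
        if concept == target:
            return True
        concept = IS_A.get(concept)
    return False


@dataclass(frozen=True)
class Service:
    """A service description reduced to one input type and one output type."""
    name: str
    input_type: str
    output_type: str


def discover(services, available_type):
    """Find services whose input accepts data of the available type."""
    return [s.name for s in services if is_kind_of(available_type, s.input_type)]


def validate_workflow(chain):
    """Check that each service's output is acceptable as the next service's input."""
    return all(
        is_kind_of(producer.output_type, consumer.input_type)
        for producer, consumer in zip(chain, chain[1:])
    )


# Hypothetical services, for illustration only.
cad = Service("cad_export", "DesignParameters", "STEPFile")
mesher = Service("mesher", "GeometryFile", "MeshFile")
solver = Service("solver", "MeshFile", "ObjectiveValue")

print(discover([cad, mesher, solver], "STEPFile"))
print(validate_workflow([cad, mesher, solver]))
```

The mesher is discovered for a STEP file even though its description only mentions the more general geometry file, which is the kind of match that a purely syntactic comparison of type names would miss.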
In this paper, we have introduced the Semantic Grid architecture for engineering design search and optimisation. In particular, we presented an integrated service-oriented distributed knowledge management framework, which begins to migrate the Grid to the Semantic Grid. We have made use of the latest Semantic Web technologies and developed mechanisms and tools to provide semantic support for EDSO. While the context of the present research is design search and optimisation, the underlying infrastructure and approaches could be applied to many other types of Grid application. We believe that the Semantic Grid holds great promise for resource sharing, seamless automation and flexible collaboration on a widely distributed scale. Various ontologies have now been built, and we have developed a number of scenarios and applications to demonstrate how these ontologies can be reused in the EDSO domain to maximise the potential of the Semantic Web in Grid computation.
It has become clear through this work that exploiting semantics is not only desirable but also necessary and viable for e-Science on the Grid.