An increasing number of matchers are now capable of deriving mapping relations other than equivalence, such as subsumption, disjointness, or named relations. This is a necessity, given the need to compute alignments between ontologies at different granularity levels, or between ontologies that elaborate on non-equivalent elements. However, the evaluation of such mappings was only partially addressed by earlier editions of OAEI, and there was no gold standard against which to measure the effectiveness of existing matchers. The goals of this track are as follows. We wish to gain better insight into the types of non-equivalence mappings that state-of-the-art tools produce, and to report on existing methods for computing such "ordered/directed" mappings. We will evaluate the submitted mappings to understand the precision/recall values that current tools achieve, and to provide insight into the limitations and potential of existing methods. The track also aims to report on appropriate evaluation methods and measures for this type of mapping. Finally, we expect to enrich existing gold standards so as to produce benchmark series that will be useful to the community. The track is organized around two tasks. The first task focuses on a gold-standard based evaluation over a dataset derived from the OAEI'06 benchmark series and the OAEI'06 consensus workshop. The second task, called open-ended evaluation, aims to gain insight into the non-equivalence mappings produced over all available datasets.
Gold-Standard based Evaluation
Concerning the evaluation of subsumption relations, the track provides two datasets for the participants:
Artificial Ontologies Corpus: The first dataset is the OAEI 2006 benchmark series corpus, whose ontologies deal with bibliographic references. For each pair of ontologies we have created a gold standard containing only subsumption relations.
Real World Ontologies Corpus: The second dataset consists of all pairwise combinations of the OAEI 2006 consensus workshop track ontologies, which deal with conference organization. For each pair of ontologies, two gold standards containing subsumption relations are provided, each created in isolation by a different domain expert.
Both datasets can be downloaded from here.
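To make the pairing in the second dataset concrete, the following minimal Python sketch enumerates all pairwise combinations of a set of ontologies (the file names below are hypothetical placeholders, not the actual conference track files):

    from itertools import combinations

    # Placeholder names; the real files are those distributed with the
    # OAEI 2006 consensus workshop (conference) track.
    ontologies = ['Cmt.owl', 'ConfTool.owl', 'Crs.owl', 'Ekaw.owl', 'Sigkdd.owl']

    # Every unordered pair of distinct ontologies forms one test case.
    test_cases = list(combinations(ontologies, 2))
    print(len(test_cases), 'pairs, e.g.', test_cases[0])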
Open-ended Evaluation
For this task, we accept alignments containing non-equivalence mappings derived from any of the OAEI'09 datasets. The entire alignment obtained by a given tool should be submitted (i.e., also containing equivalence relations). Alignments should be submitted in the format described under the "Execution phase" heading of the main OAEI'09 page.
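Since submitted files mix equivalence and non-equivalence cells, the non-equivalence mappings will need to be separated out. As an illustration only (not the official track tooling), the following Python sketch extracts the non-equivalence correspondences from a file in the Alignment format (see the example under the gold-standard based evaluation below):

    import xml.etree.ElementTree as ET

    ALIGN = '{http://knowledgeweb.semanticweb.org/heterogeneity/alignment}'
    RDF = '{http://www.w3.org/1999/02/22-rdf-syntax-ns#}'

    def non_equivalence_mappings(path):
        """Return (entity1, entity2, relation) triples whose relation is not '='."""
        triples = []
        for cell in ET.parse(path).iter(ALIGN + 'Cell'):
            e1 = cell.find(ALIGN + 'entity1').get(RDF + 'resource')
            e2 = cell.find(ALIGN + 'entity2').get(RDF + 'resource')
            rel = cell.find(ALIGN + 'relation').text.strip()
            if rel != '=':
                triples.append((e1, e2, rel))
        return triples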
Gold-Standard based Evaluation
We will use precision, recall, and f-measure with respect to the gold standard. Participants must provide an alignment containing only subsumption relations between classes for each pair of ontologies in each dataset. Specifically, the format of the alignment should be as follows:
<?xml version='1.0' encoding='utf-8'?>
<rdf:RDF xmlns='http://knowledgeweb.semanticweb.org/heterogeneity/alignment'
         xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
         xmlns:xsd='http://www.w3.org/2001/XMLSchema#'>
  <Alignment>
    <xml>yes</xml>
    <level>0</level>
    <type>11</type>
    <onto1>http://oaei.ontologymatching.org/2006/benchmarks/101/onto.rdf</onto1>
    <onto2>http://oaei.ontologymatching.org/2006/benchmarks/103/onto.rdf</onto2>
    <uri1>http://oaei.ontologymatching.org/2006/benchmarks/101/onto.rdf</uri1>
    <uri2>http://oaei.ontologymatching.org/2006/benchmarks/103/onto.rdf</uri2>
    <map>
      <Cell>
        <!-- illustrative cell: the entity URIs and relation below are examples -->
        <entity1 rdf:resource='http://oaei.ontologymatching.org/2006/benchmarks/101/onto.rdf#Book'/>
        <entity2 rdf:resource='http://oaei.ontologymatching.org/2006/benchmarks/103/onto.rdf#Reference'/>
        <relation>&lt;</relation>
        <measure rdf:datatype='http://www.w3.org/2001/XMLSchema#float'>1.0</measure>
      </Cell>
    </map>
  </Alignment>
</rdf:RDF>
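The measures themselves are the standard ones. A minimal sketch in Python, assuming both the system alignment and the gold standard have been reduced to sets of (entity1, entity2, relation) triples (e.g., with the parsing sketch above):

    def evaluate(system, gold):
        """Return (precision, recall, f_measure) of `system` against `gold`."""
        system, gold = set(system), set(gold)
        correct = len(system & gold)
        precision = correct / len(system) if system else 0.0
        recall = correct / len(gold) if gold else 0.0
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
        return precision, recall, f_measure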
Open-ended Evaluation
Depending on the number of submitted mappings/systems, the evaluation might focus on a subset only, most likely those mappings returned by multiple systems. First, we will perform an automatic evaluation of the mappings using techniques recently developed at the Open University (OU). Second, we will ask participants to manually validate a small set of mappings (no more than 100-200 relations); they will be supported in this task by a GUI-based tool.
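For instance, restricting attention to mappings returned by several systems could be done as follows (an illustrative sketch, where `alignments` is assumed to hold one set of mapping triples per submitted system):

    from collections import Counter

    def mappings_returned_by(alignments, min_systems=2):
        """Mappings appearing in at least `min_systems` of the submitted
        alignments; `alignments` is one set of mapping triples per system."""
        counts = Counter(m for a in alignments for m in set(a))
        return {m for m, n in counts.items() if n >= min_systems}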
We would like to thank Ondrej Svab Zamazal for contributing to the creation of the dataset.
For the gold-standard based evaluation: George Vouros, Vassilis Spiliopoulos, University of the Aegean, Greece
For the open-ended evaluation: Marta Sabou, Mathieu d'Aquin, Open University, UK