To my friend, François…
and to my loved ones
(Hélène, David and Dominic)
JEAN BRODEUR
GEOSPATIAL DATA INTEROPERABILITY:
DEVELOPMENT OF THE CONCEPT OF
GEOSEMANTIC PROXIMITY
Thesis submitted
to the Faculté des études supérieures of Université Laval
for the degree of Philosophiæ Doctor (Ph.D.)
Département des sciences géomatiques
FACULTÉ DE FORESTERIE ET DE GÉOMATIQUE, UNIVERSITÉ LAVAL
QUÉBEC
JANUARY 2004
© Jean Brodeur, 2004
ABSTRACT
Since 1990, the number of geospatial databases has been growing both in Canada and
elsewhere in the world. These databases, accessible over the Internet, represent geographic
phenomena in similar but not identical ways. This makes it difficult to retrieve the
geospatial data that meet users' exact needs and to integrate them into coherent sets.
Geospatial data interoperability facilitates data integration at the computing level but,
to date, still does not solve the semantic problems. This thesis aims to identify, define
and explain the elements of semantic, spatial and temporal proximity for retrieving the
geospatial data that meet a user's particular need. It proposes a conceptual framework of
geospatial data interoperability derived from the communication process between two
individuals. It develops the notion of geosemantic proximity, which qualifies the
similarity between a geospatial concept of an agent and a geospatial conceptual
representation of a message. Finally, it presents a prototype in which software agents
communicate with one another. These agents use functions that rely on the notion of
geosemantic proximity to generate and recognize the geospatial conceptual representations
of messages from the geospatial concepts of their respective ontologies.
Jean Brodeur, student Date
Dr Yvan Bédard, supervisor Date Dr Bernard Moulin, co-supervisor Date
ABSTRACT
Since 1990, the number of geospatial databases has been growing both in Canada and
elsewhere in the world. These databases, accessible over the Internet, represent geographic
phenomena in similar but not identical ways, which makes it difficult to retrieve the
geospatial data that meet users' exact needs and to integrate them into coherent sets.
Geospatial data interoperability facilitates data integration at the computing level but,
to date, it still does not solve the semantic problems. This thesis aims to identify,
define and explain the elements of semantic, spatial and temporal proximity for retrieving
the geospatial data that meet a user's particular need. Retrieval here means the search
for, identification, selection and extraction of geospatial data from external sources.
This thesis was carried out in five stages: a preliminary study including a review of the
work that preceded this thesis, the development of a conceptual framework of
interoperability, the formalization of geosemantic proximity, the development of a
prototype and its experimentation, and finally the writing of the present thesis.
The literature review leads us to compare geospatial data interoperability to a
communication process between human beings. In a communication process, two individuals
exchange a large quantity of information in an interoperable manner. The second chapter of
this thesis is a synthesis, in light of the objectives of the thesis, of the following
subjects: cybernetics, communication, cognition, distributed databases, interoperability,
heterogeneity, semantic similarity and ontology.
The third chapter develops a conceptual framework of geospatial data interoperability
derived from the communication process between two individuals. A data user agent that
wants geospatial information submits a query in its own vocabulary to a data provider
agent. The provider agent interprets the query, assigns it a meaning, identifies the data
it holds and sends them to the user agent, which makes sure that they answer its query.
The notion of geosemantic proximity, formalized in the fourth chapter, establishes the
similarity between a geospatial concept of an agent and a geospatial conceptual
representation of a message. In the communication process, a concept similar to a
conceptual representation is used to assign that representation a meaning. Geosemantic
proximity is a reasoning function that agents use to produce and recognize the messages
they exchange.
A prototype that validates the computational feasibility of this notion of geosemantic
proximity is the subject of the fifth chapter of this thesis. The results obtained with
ontologies of the road network and the hydrographic network from five geospatial data
specifications demonstrate the effectiveness and potential of the notion of geosemantic
proximity.
FOREWORD
This thesis is the fruit of more than three years of sustained work, during which several
people gave me unfailing support in the trying moments as well as the exciting ones. I
wish to express all my gratitude to them here.
I would first like to express my deep gratitude to Dr Yvan Bédard, my research supervisor.
Yvan showed sustained interest in my project from the very beginning, when it was only an
idea, through to its conclusion. He believed in me and kept alive my interest in
completing this thesis. I offer him my warmest thanks for all the time he devoted, his
valuable advice and his friendship. Thank you, Yvan!
I also want to thank Dr Bernard Moulin, co-supervisor of this thesis, and Dr Geoffrey
Edwards, advisor. From the moment they became involved in this thesis, their judicious
comments pushed me to surpass myself in every respect in order to give this thesis a
direction, a meaning and even a soul. Thank you, Bernard! Thank you, Geoffrey!
I would like to acknowledge the support of the staff (particularly Suzie and Marie-Josée)
as well as the encouragement of my fellow students (Pierre, Marc and Rodolphe) at the SIRS
laboratory of the Centre de recherche en géomatique at Université Laval. Thank you to each
of you!
This thesis could not have come to be without the interest and support, throughout the
work, of my employer, Natural Resources Canada, and more particularly the Centre
d’information topographique de Sherbrooke (CITS). The unconditional support of the CITS
directors (Yves and Denis) and of all my colleagues and friends at the CITS was greatly
appreciated. My sincere thanks to each of you personally! My deep gratitude goes
particularly to my friend François Massé †. François, even after leaving this world so
suddenly, you stayed close to me, you accompanied me and you encouraged me throughout my
work, as you always did. François, wherever you are on the other side, I express my
deepest gratitude for the presence you show me and for the friendship that binds us. It is
with great emotion, François, that I dedicate this thesis to you!
I would also like to acknowledge the financial contribution of the GÉOIDE network of
centres of excellence in geomatics, project DEC#2 (Conceptualisation des fondations
technologiques pour la prise de décision à l’aide du World Wide Web).
Finally, I also wish to express my deepest gratitude to my wife, Hélène, and to my
children, David and Dominic. Their presence at my side, their support, their encouragement
and above all their love have been a constant source of inspiration for me. My loved ones,
it is with all my heart that I dedicate this thesis to you as well!
Thanks also to my parents, Rachel and Gilbert. P.S. Gilbert, you believed in me throughout
this adventure; we can now say mission accomplished. Thank you, Dad!
Four articles make up Chapters 3, 4 and 5 of this thesis as well as Annex C. I, Jean
Brodeur, author of this thesis, carried out all of the research. I wrote the manuscripts
of the four articles and am therefore their lead author. Dr Yvan Bédard, Dr Bernard Moulin
and Dr Geoffrey Edwards contributed to the articles by reviewing the manuscripts and
providing judicious comments. The first article, entitled « Revisiting the concept of
geospatial data interoperability within the scope of human communication processes », was
published in Volume 7, Number 2, of the journal Transactions in GIS (TGIS) in March 2003.
The manuscript of the second article, « Geosemantic proximity for geospatial data
interoperability », is currently being revised by the lead author so that it can be
expanded and then submitted to an international journal. The manuscript of the third
article, « A geosemantic proximity-based prototype for interoperability of geospatial
data », was submitted to the journal Computers, Environment and Urban Systems in June
2003. It is currently being revised by the lead author in order to submit the final
version to the journal's editor. Finally, the last article, « Geosemantic Proximity to
Improve Geospatial Information Discovery in a Wireless Environment », was published in
spring 2003 in Volume 57, Number 1, of the journal Geomatica, a special issue on the theme
« Internet and Mobile Geospatial Information Management ».
TABLE OF CONTENTS
Abstract
Abstract
Foreword
Table of contents
List of tables
List of figures
Chapter 1: Introduction
1.1 The problem of access to geospatial data
1.2 Objectives of this thesis
1.3 Research methodology
1.4 Presentation of the thesis
1.5 References
Chapter 2: Semantic, spatial and temporal interoperability: a parallel with the communication process
2.1 Interoperability and communication between systems
2.2 Human perception, knowledge and cognitive reasoning
2.3 The notion of ontology and the description of phenomena
2.3.1 Ontology, a philosophical point of view
2.3.2 An ontology, a computing point of view
2.4 Data heterogeneity, an obstacle to interoperability
2.4.1 System heterogeneity
2.4.2 Syntactic heterogeneity
2.4.3 Structural heterogeneity
2.4.4 Semantic heterogeneity
2.5 Main approaches to semantic interoperability
2.5.1 Data federation
2.5.2 Semantic similarity
2.5.3 The Semantic Formal Data Structure model
2.5.4 The Matching-Distance model
2.6 Discussion
2.7 References
Chapter 3: Geospatial data interoperability: proposal of a conceptual framework
3.1 Summary of the article
3.2 Abstract
3.3 Introduction
3.4 Interoperability and the human communication process
3.4.1 Communication process
3.4.2 Perception and cognition
3.4.3 Ontology and conceptual modelling for database development
3.4.4 Context
3.4.5 Semantic proximity
3.5 A conceptual framework of geospatial data interoperability
3.6 Ontology of geospatial data interoperability
3.6.1 The five ontological phases of geospatial data interoperability
3.6.2 Levels of ontology
3.7 Relationship between concept and conceptual representation
3.8 Geosemantic proximity
3.9 Conclusion
3.10 References
Chapter 4: Geosemantic proximity, a component of geospatial data interoperability
4.1 Summary of the article
4.2 Abstract
4.3 Introduction
4.4 Geospatial data interoperability and geosemantic proximity
4.4.1 Semantic similarity of geospatial data
4.4.2 Identity of geographic phenomena
4.4.3 Boundary of geoConcept and geoConceptRep
4.4.4 Geosemantic proximity and topology
4.5 Formalisation of geoConcept and geoConceptRep in relation with the context
4.6 Geosemantic proximity
4.6.1 Description of GsP
4.6.2 Examples
4.7 Prototype
4.8 Conclusion
4.9 References
Chapter 5: Experimentation with the semantic interoperability of geospatial data and geosemantic proximity: presentation of the GsP Prototype
5.1 Summary of the article
5.2 Abstract
5.3 Introduction
5.4 Geospatial data interoperability and communication
5.5 The GsP Prototype
5.5.1 Architecture
5.5.2 Implementation
5.5.3 Experimentation
5.6 Conclusion
5.7 References
Chapter 6: Conclusion
6.1 Summary
6.2 Discussion
6.3 Conclusions
6.4 Research perspectives
6.5 References
Annex A: Query about the street geoConcept encoded in an XML document
Annex B: Answer with the road geoConceptRep encoded in an XML document
Annex C: Geosemantic proximity in support of geospatial information discovery in a wireless environment
C.1 Summary of the article
C.2 Abstract
C.3 Introduction
C.4 Background
C.5 Geospatial Data Interoperability on the Web
C.6 Geosemantic Proximity and the Web
C.6.1 Examples
C.7 Experiments
C.8 Conclusion
C.9 References
Bibliography
LIST OF TABLES
Table 1: Spatial representations of the ISO 19107 standard
Table 2: Nature of structural conflicts in geospatial data
Table 3: Examples of phenomena abstracted differently in independent topographical databases
Table 4: Examples of geoConcepts recognizing geoConceptReps, both of different ontologies
LIST OF FIGURES
Figure 1: Activity diagram detailing the research method
Figure 2: Model of the communication process
Figure 3: Model of interaction between the referent, the signifier and the signified
Figure 4: Model of common knowledge
Figure 5: Modal approach to mental images
Figure 6: Amodal approach to mental images
Figure 7: Metamodel of a geospatial data repository
Figure 8: Different structures of the street concept
Figure 9: Five-level data federation architecture
Figure 10: Example of a knowledge network
Figure 11: Three-tier architecture of the SFDS
Figure 12: A conceptual framework of geospatial data interoperability
Figure 13: The three levels of ontology
Figure 14: Ontology of geospatial data interoperability
Figure 15: Geosemantic space
Figure 16: Various semantics of a building’s geometric representation
Figure 17: Geosemantic proximity Predicates
Figure 18: UML object class diagram of geoConcept and geoConceptRep
Figure 19: UML object class diagram of property types
Figure 20: UML object class diagram describing the types of intrinsic properties
Figure 21: UML object class diagram describing the types of extrinsic properties
Figure 22: The context of an abstraction K
Figure 23: Intersection between context of K and context of L
Figure 24: GsP predicates
Figure 25: Prototype principle
Figure 26: Network of geoConcepts
Figure 27: Architecture of the GsP Prototype
Figure 28: Object structure of a concept
Figure 29: UML class diagram of GEOABSTRACTION, GEOCONCEPT, and GEOCONCEPTREP
Figure 30: The agent window
Figure 31: The agent manager interface
Figure 32: Example of the prototype operation
Figure 33: Extract of the class road of the data dictionary of the NTDB road network
Figure 34: Road network UML class diagrams
Figure 35: Hydrographic network UML class diagrams
Figure 36: Observed success rates – Road Network
Figure 37: Observed success rates – Hydrographic Network
Figure C1: A Framework for Geospatial Data Interoperability
Figure C2: UML Class Diagram Describing Phenomenon, Abstraction, Context, Properties, and their Relationships
Figure C3: Intersection between context of K and context of L
Figure C4: The Sixteen Predicates of Geosemantic Proximity Relationships
CHAPTER 1
INTRODUCTION
Interest in digital geographic data and geographic information systems (GIS) dates back to
the 1960s and 1970s with the Canada Geographic Information System (CGIS), which is
considered one of the first GIS ever built (Coppock and Rhind, 1991). The CGIS was used to
analyse the data of the Canada Land Inventory. Since then, many organizations wishing to
take advantage of the potential of GIS technology for complex analyses of geospatial¹ data
and for the automation of their activities involving geospatial data have designed and
implemented geospatial databases that meet their specific needs. GIS are now present in
the capture, storage, use and distribution of geospatial data. All these activities have
contributed strongly to the development and current definition of GIS.
¹ By geospatial, we mean a spatial character supported by a geographic reference. In this
thesis, we speak of geospatial data and geospatial databases to refer to digital
geographic data and databases.
GIS technology is now widespread, and the number of geospatial databases is growing
significantly. In Canada, for example, there are the National Topographic Data Base (NTDB)
produced by Natural Resources Canada for national mapping and GIS applications (Ressources
naturelles Canada, 1996), the digital cartographic files (fichiers cartographiques
numériques, FCN) produced by Statistics Canada for census purposes (Statistique Canada,
1997), the VMap libraries produced for military needs (VMap, 1995), as well as many
larger-scale geospatial databases produced by the Canadian provinces (BC Ministry of
Environment Lands and Parks (Geographic Data BC), 1992; New Brunswick, 2000; OBM, 1996;
P.E.I. Geomatics Information Centre; Québec, 2000). These geospatial databases usually
represent the same geographic phenomena in similar but not identical ways. For example:
- waterbody, coastline, lake, river/stream, Lac;
- vegetation, wooded area, vineyard, orchard, milieu boisé, zone boisée;
- wetland, marsh, swamp, marsh/fen, milieu humide, marais;
- road, limited access road, autoroute, rue, chemin, artère, route collectrice, chemin
local, chemin municipal;
- railroad, railroad siding/railroad spur, railLine, voie ferrée, chemin de fer, triage de
chemin de fer.
² Description of the spatial pictograms: 0D; 1D; 2D; multiple geometry; alternative
geometry (see Bédard, Y. and M.-J. Proulx, 2002, Perceptory Web Site, WWW document,
http://sirs.scg.ulaval.ca/Perceptory).
The interest that other GIS users show in existing geospatial databases encourages
producing organizations to distribute their geospatial data. In the early 1990s, the main
issues in geospatial data distribution concerned the standardized formats for distributing
geographic data (e.g. SIF, DXF, ARCEXPORT, CCOGIF, SAIF, DIGEST, SDTS, DLG) and the
physical distribution media (e.g. magnetic tape, diskette, cassette). Today, the
widespread adoption of the Internet and the World Wide Web allows geospatial data
producers to offer their data online. To this end, governments are promoting the
deployment of geospatial data infrastructures to simplify access to geospatial data. Such
is the case in Canada with the Canadian Geospatial Data Infrastructure (CGDI)
(GeoConnections, 2002) and in the United States with the National Spatial Data
Infrastructure (NSDI) (FGDC, 2002). Users now have access to a multitude of geospatial
databases and browse the Internet in search of the geospatial data that meet their
specific needs.
This greater accessibility of the various geospatial databases gives GIS users more
freedom in their choice of data. Users now draw on geospatial data from several databases
and merge them to bring out information that consulting each database independently cannot
provide. The automatic merging of geospatial data from several databases nevertheless
remains a significant challenge.
The notion of geospatial data interoperability (McKee and Buehler, 1998) was proposed in
the early 1990s to simplify and strengthen the sharing, reuse and integration of
geospatial data. The objective of geospatial data interoperability is to achieve the full
dynamic integration of geospatial systems and databases residing on distinct nodes of a
network by treating them as a single system (Bédard, 1998). Through the interoperability
of geospatial systems and data, users will access multiple databases online from a single
window that will behave like a single virtual database (Fuller, 1999). The Open GIS
Consortium Inc. (OGC), Technical Committee 211 of the International Organization for
Standardization (ISO/TC 211), government organizations, the geomatics research community
and geomatics companies have together contributed to establishing the current foundations
of geospatial data interoperability. Eventually, geospatial data interoperability will
allow data from multiple sources to be accessed, integrated on the fly and analysed
regardless of the differences that characterize them.
The syntactic, structural, semantic, spatial and temporal heterogeneity of data, discussed in
more detail in section 2.4 of this thesis, is a major obstacle to geospatial data
interoperability (Bishr, 1997; Charron, 1995; Laurini, 1998; Ouksel and Sheth, 1999; Sheth,
1999). Although substantial progress has been made on syntactic and structural heterogeneity,
semantic, spatial and temporal heterogeneity remains a major challenge (Egenhofer et al.,
1999; Ouksel and Sheth, 1999; Rodriguez, 2000). It is characterized by the difference in
meaning that exists between representations of geospatial phenomena, and it adds to the
difficulty of retrieving and integrating geospatial data that meet users' specific needs. For
semantic, spatial and temporal interoperability to exist, a geospatial database must grasp
the meaning of a data user's query in order to answer it exactly and, conversely, the user
must be able to understand the answer and verify that it corresponds to the query (Sheth,
1999).
1.1 The problem of access to geospatial data
As mentioned above, users of geospatial data increasingly interact with several databases to
obtain data that meet their specific needs. However, they must know the exact vocabulary of
each geospatial database to formulate their queries and obtain the desired data. A twofold
problem emerges from this situation. The first problem is to locate, in the databases, the
object classes, attributes, and geometric and temporal representations that supply geospatial
data about the set of phenomena meeting a user's precise need. The second problem is to
integrate the geospatial data obtained from the databases into a coherent target set,
presenting them in a vocabulary the user can understand.
This thesis focuses specifically on the problem of retrieving and obtaining geospatial data.
By retrieval we mean the search for, identification, selection and extraction of geospatial
data from external sources. The problem of integrating geospatial data is outside the scope
of this thesis.
Problems of this kind arise frequently. For example, geospatial data producers such as the
Centre d’information topographique de Sherbrooke of Natural Resources Canada increasingly
collaborate with other data producers such as Statistics Canada, Elections Canada, the Base
de données toponymiques du Canada, the Levés officiels du Canada, Canada Aeronautical Chart
(CANAC), the provinces (e.g. British Columbia, Nova Scotia, Ontario) and federal agencies
(e.g. the Office des transports du Canada (OTC)) to build geospatial data warehouses from
existing geospatial data. The same is true of provinces and groupings of municipalities
(regional county municipalities, counties, etc.) that use municipal data, of 911 services
that use data from multiple sources, and of the general public, which uses several geospatial
databases accessible on the Web.
1.2 Objectives of this thesis
This thesis studies the notion of geosemantic proximity in order to identify, define and
explain the elements of semantic, spatial and temporal proximity involved in retrieving
geospatial data that meet a user's particular need, within the framework of geospatial data
interoperability. The starting hypothesis of this thesis is stated as follows: "the concept
of geosemantic proximity would help retrieve geospatial concepts that meet a user's specific
needs".
More precisely, the first objective of this thesis is to propose a conceptual framework for
geospatial data interoperability that integrates the notion of geosemantic proximity. The
second objective is to define the notion of geosemantic proximity. Geosemantic proximity
qualifies the semantic, spatial and temporal similarity between abstractions of geospatial
phenomena using the description of their respective contexts, and helps retrieve the object
classes that resemble a user's need. It adapts Egenhofer's topological approach of boundary
and interior for spatial data (Egenhofer, 1993; Egenhofer and Franzosa, 1991) to the problem
of semantic, spatial and temporal proximity of geospatial data. The third objective is to
validate the computational feasibility of this notion by means of a prototype, tested with
ontologies of the road network and the hydrographic network drawn from geospatial databases
of federal departments and Canadian provinces. Finally, the fourth and last objective is to
evaluate the proposed approach and identify its strengths and weaknesses.
1.3 Research methodology
The method adopted for this thesis is divided into five steps (Figure 1). The first step is
the preliminary research, which comprises the literature review, the definition of the
research project and the doctoral examination.
The literature review covers a set of themes closely related to the problem of geospatial
data interoperability. More specifically, it addressed the following themes: distributed
databases, federated databases, interoperability, semantic interoperability, heterogeneity in
databases, semantic similarity, semantic proximity, semantic distance, semantic networks,
ontology, cybernetics, communication, cognitive science and natural language.
Although geospatial data interoperability is fairly well covered in the literature (Abel et
al., 1998; Arctur et al., 1998; Bishr, 1998; Fuller, 1999; Goodchild et al., 1999; Herring,
1999; ISO/TC 211, 2002; Laurini, 1998; McKee, 1999; McKee and Buehler, 1998; Nebert, 1999;
Open GIS Consortium Inc., 2002; Vckovski et al., 1999; Voisard and Schweppe, 1998), the
semantic interoperability of geospatial data remains a marginal topic overall. Two doctoral
theses address the semantic interoperability of geospatial data. The first, by Y. Bishr in
1997, proposes the Semantic Formal Data Structure (SFDS) as a solution to semantic
interoperability (Bishr, 1997). The SFDS is based on a mediation of contexts between the
external model of geospatial databases and a model of a database federation (Proxy Context).
The second, by M. A. Rodriguez in 2000, suggests the Matching-Distance (MD) model to evaluate
quantitatively, by means of a conceptual distance, the semantic similarity between two
spatial entity classes (Rodriguez et al., 1999).
Semantic interoperability is, however, better covered in computer science through distributed
databases, database federations, ontologies, and so on. Lehmann (1992) and Sowa (1987)
discuss the models and structures of semantic networks that associate concepts. Frankhauser
et al. (1991) and Frankhauser and Neuhold (1992) propose a quantitative approach to determine
a conceptual distance between two concepts. Kashyap and Sheth (1996), Kashyap and Sheth
(1998), Sheth (1999) and Sheth and Kashyap (1992) analyze the different aspects of
heterogeneity in databases and present a qualitative approach to express semantic proximity
between object classes. The approach of Sheth and Kashyap takes more specific account of the
context of each object class. Context is considered a fundamental element of semantic
interoperability (Ouksel and Sheth, 1999). The literature also shows that the notion of
ontology (Gruber, 1993; Guarino, 1998), associated with the description of concepts, is
closely tied to the problem of semantic interoperability.
It is nevertheless relevant to revisit the notion of cybernetics (Campbell, 1982; Weiner,
1950), including communication theory (Darnell, 1971; Schramm, 1971), which, like
interoperability, is concerned with interaction between systems. Cybernetics and
communication theory lead us to study the aspects of perception, knowledge and reasoning
treated more specifically in cognitive science (Barsalou, 1999; Kosslyn, 1980; Lakoff, 1987;
Pylyshyn, 1981), as well as elements of natural language (Cherry, 1978; Denes and Pinson,
1971; Sowa, 1984); these aspects are directly associated with the process of communication
between human beings. Finally, we address the notion of ontology, as understood by certain
philosophers, which is concerned with describing reality as a whole. This literature review
surveys the elements that bear on semantic interoperability and makes it possible to position
our notion of geosemantic proximity within a conceptual framework for geospatial data
interoperability.
The definition of the research project, initially presented in (Brodeur, 2001), summarizes
the literature review (also presented in chapter 2), identifies the problem pursued in this
research (section 1.1), states the objective of the project (section 1.2) and describes the
research method.
The second step (Figure 1) is the development of the conceptual framework for
interoperability within which we study geosemantic proximity. Because human beings naturally
communicate in an interoperable way, our conceptual framework is derived from the process of
communication between human beings and includes elements specific to cognitive reasoning. The
result of this step is the subject of a first article, presented in chapter 3, which
describes the conceptual framework for geospatial data interoperability that we propose and
introduces the notion of geosemantic proximity.
The third step (Figure 1) formalizes the notion of geosemantic proximity. This notion is
inspired by the notion of semantic proximity developed by Kashyap and Sheth (1996) and Sheth
and Kashyap (1992), which is based on comparing two object classes while taking their
respective contexts into account. It considers the intrinsic and extrinsic properties of two
abstractions to describe their similarity and difference according to a nine-intersection
matrix involving their interior, boundary and exterior. In this thesis, however, we limit
ourselves to evaluating similarity through the correspondence of intrinsic and extrinsic
properties according to a four-intersection matrix, similarly to Egenhofer's topological
model of spatial data (Egenhofer, 1993; Egenhofer and Franzosa, 1991). As with the second
step, a second article, which forms chapter 4, describes the notion of geosemantic proximity.

Figure 1: Activity diagram detailing the research method
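To make the four-intersection idea concrete, the following is a minimal sketch, not the formalization given in chapter 4. It assumes that each abstraction is described by two sets of property labels, the intrinsic properties playing the role of an interior and the extrinsic properties that of a boundary. The class names `Abstraction` and `FourIntersection`, the property labels and the ordering of the matrix cells are all hypothetical choices made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;

/** Hypothetical description of an abstraction by two sets of property labels. */
final class Abstraction {
    final Set<String> intrinsic; // plays the role of the "interior"
    final Set<String> extrinsic; // plays the role of the "boundary"

    Abstraction(Set<String> intrinsic, Set<String> extrinsic) {
        this.intrinsic = intrinsic;
        this.extrinsic = extrinsic;
    }
}

final class FourIntersection {
    /** True when the two property sets share at least one label. */
    static boolean intersects(Set<String> a, Set<String> b) {
        return !Collections.disjoint(a, b);
    }

    /**
     * Returns a 2x2 matrix of empty/non-empty intersections, laid out
     * (by assumption) as
     *   [ boundary-boundary, boundary-interior ]
     *   [ interior-boundary, interior-interior ]
     */
    static boolean[][] matrix(Abstraction k, Abstraction r) {
        return new boolean[][] {
            { intersects(k.extrinsic, r.extrinsic), intersects(k.extrinsic, r.intrinsic) },
            { intersects(k.intrinsic, r.extrinsic), intersects(k.intrinsic, r.intrinsic) }
        };
    }

    public static void main(String[] args) {
        // Illustrative labels only; they are not taken from any real data specification.
        Abstraction street = new Abstraction(Set.of("name", "surface"), Set.of("junction"));
        Abstraction road = new Abstraction(Set.of("name", "lanes"), Set.of("junction", "river"));
        System.out.println(Arrays.deepToString(matrix(street, road)));
    }
}
```

Each cell records only whether an intersection is empty or not, which keeps the comparison qualitative, in the spirit of the topological model the text refers to.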
The fourth step of the methodology is the development of a software prototype to validate the
computational feasibility of the notion of geosemantic proximity. The prototype is built with
software agents programmed in Java™ that communicate with one another in XML. It integrates
the notion of geosemantic proximity into our conceptual framework for interoperability. An
experiment was then conducted using the prototype and geospatial data repositories serving as
application ontologies. The geospatial data repositories were built from the data
specifications for the road network and the hydrographic network of five existing geospatial
databases. A third article, presented in chapter 5, reports on the prototype and the
experiment.
Finally, the fifth step consists of consolidating everything in this thesis.
1.4 Organization of the thesis
Three articles written over the course of the research form the core of this thesis. Before
presenting them, chapter 2 reviews the notions that support our conceptual framework for
geospatial data interoperability and our notion of geosemantic proximity. Chapter 3 consists
of a first article that proposes our conceptual framework for interoperability. Chapter 4
contains a second article that presents our geosemantic proximity approach. Chapter 5
presents a third article on our software prototype, which we call the GsP Prototype. Chapter
6 concludes and presents the research elements to be considered in future work.
1.5 References
Abel, D J, B C Ooi, K L Tan, et S H Tan 1998 Towards Integrated Geographical
Information Processing. International Journal of Geographic Information
Science, 12(4): 334-371
Arctur, D, D Hair, G Timson, E P Martin, et R Fegeas 1998 Issues and Prospects for the
Next Generation of the Spatial Data Transfer Standard (SDTS). International
Journal of Geographic Information Science, 12(4): 403-425
Barsalou, L W 1999 Perceptual symbol systems. Behavioral and Brain Sciences, 22(4):
577-609
BC Ministry of Environment Lands and Parks (Geographic Data BC) 1992 Digital
Baseline Mapping at 1:20,000. Victoria, Province of British Columbia, BC
Ministry of Environment, Lands and Parks
Bédard, Y 1998 Analyse et conception de systèmes d’information à référence spatiale.
Notes de cours sur l’interopérabilité, Québec, Département des sciences
géomatiques, Université Laval
Bédard, Y, et M-J Proulx 2002 Perceptory Web Site. Web Page Document,
http://sirs.scg.ulaval.ca/Perceptory
Bishr, Y 1997 Semantic Aspects of Interoperable GIS. Ph.D. Dissertation, ITC
Publication
Bishr, Y 1998 Overcoming the Semantic and Other Barriers to GIS Interoperability.
International Journal of Geographic Information Science, 12(4): 299-314
Brodeur, J 2001 Interopérabilité des données géospatiales : Élaboration du concept de
proximité sémantique, spatiale et temporelle. Québec, Département des sciences
géomatiques, Université Laval
Campbell, J 1982 Grammatical Man: Information, Entropy, Language, and Life. New
York, Simon and Schuster
Charron, J 1995 Développement d’un processus de sélection des meilleures sources de
données cartographiques pour leur intégration à une base de données à référence
spatiale. Mémoire de maîtrise, Université Laval
Cherry, C 1978 On Human Communication: a Review, a Survey, and a Criticism.
Cambridge, Massachusetts, The MIT Press
Coppock, J T, et D W Rhind 1991 The History of GIS. In D J Maguire, M F Goodchild,
and D W Rhind (eds) Geographical Information Systems. New York, Longman
Scientific and Technical: 21-43
Darnell, D K 1971 Information Theory. In J A DeVito (ed) Communication: Concepts
and Processes. Englewood Cliffs, New Jersey, Prentice-Hall Inc: 37-45
Denes, P B, et E N Pinson 1971 The Speech Chain. In J A DeVito (ed) Communication:
Concepts and Processes. Englewood Cliffs, New Jersey, Prentice-Hall Inc: 3-11
Egenhofer, M 1993 A Model for Detailed Binary Topological Relationships. Geomatica,
47(3 & 4): 261-273
Egenhofer, M, et R D Franzosa 1991 Point-Set Topological Spatial Relations.
International Journal of Geographic Information Science, 5(2): 161-174
Egenhofer, M J, J Glasgow, O Günther, J R Herring, et D J Peuquet 1999 Progress in
Computational Methods for Representing Geographical Concepts. International
Journal of Geographic Information Science, 13(8): 775-796
FGDC 2002 NSDI. USGS, Web Page Document, http://www.fgdc.gov/nsdi/nsdi.html
Frankhauser, P, M Kracker, et E Neuhold 1991 Semantic vs. Structural Resemblance of
Classes. SIGMOD Record, 20(4): 59-63
Frankhauser, P, et E J Neuhold 1992 Knowledge Based Integration of Heterogeneous
Databases. In Proceedings of IFIP WG2.6 Database Semantics Conference on
Interoperable Database Systems (DS-5)/IFIP Transaction (A-25)
Elsevier Science Publishers B.V.: 155-175
Fuller, G W 1999 A Vision for a Global Geospatial Information Network (GGIN).
Photogrammetric Engineering & Remote Sensing, 65(5): 524-538
GeoConnections 2002 Canadian Geospatial Data Infrastructure (CGDI) architecture.
Electronic Document, http://www.geoconnections.org/architecture/index_e.html
Goodchild, M, M Egenhofer, R Fegeas, et C Kottman (eds) 1999 Interoperating
Geographic Information Systems. Boston, Massachusetts, Kluwer Academic
Publishers
Gruber, T R 1993 A Translation Approach to Portable Ontology Specification. Stanford,
California, Knowledge Systems Laboratory Technical Report KSL 92-71
Guarino, N 1998 Formal Ontology and Information Systems. In Proceedings of Formal
Ontology in Information Systems (FOIS '98). Amsterdam, IOS Press: 3-15
Herring, J R 1999 The OpenGIS Data Model. Photogrammetric Engineering & Remote
Sensing, 65(5): 585-588
ISO/TC 211 2002 Geographic information/Geomatics. Web Page Document,
http://www.statkart.no/isotc211/
Kashyap, V, et A Sheth 1996 Semantic and Schematic Similarities Between Database
Objects: A Context-Based Approach. The VLDB Journal, 5: 276-304
Kashyap, V, et A Sheth 1998 Semantic Heterogeneity in Global Information Systems: the
Role of Metadata, Context and Ontologies. In M P Papazoglou, and G Schlageter
(eds) Cooperative Information Systems-Trends and Directions. San Diego, CA,
Academic Press: 139-178
Kosslyn, S M 1980 Image and Mind. Cambridge, Massachusetts, Harvard University Press
Lakoff, G 1987 Women, Fire, and Dangerous Things - What Categories Reveal about the
Mind. Chicago, The University of Chicago Press
Laurini, R 1998 Spatial Multi-Database Topological Continuity and Indexing: a Step
Towards Seamless GIS Data Interoperability. International Journal of
Geographic Information Science, 12(4): 373-402
Lehmann, F 1992 Semantic Networks. Computers and Mathematics with Applications,
23(2-5): 50
McKee, L 1999 The Impact of Interoperable Geoprocessing. Photogrammetric
Engineering & Remote Sensing, 65(5): 565-566
McKee, L, et K Buehler (eds) 1998 The OpenGIS Guide. Wayland, Massachusetts,
OpenGIS Consortium Inc.
Nebert, D 1999 Interoperable Spatial Data Catalogs. Photogrammetric Engineering &
Remote Sensing, 65(5): 573-575
New Brunswick 2000 Guide d’utilisation de la Base de données topographiques
numériques (BDTN) du Nouveau-Brunswick. Fredericton, New Brunswick,
Services Nouveau-Brunswick
OBM 1996 Ontario Digital Topographic Database - 1:10,000, 1:20,000 - A Guide for
User. Toronto, Ontario, Ministry of Natural Resources
Open GIS Consortium Inc. 2002 OpenGIS Specifications. Open GIS Consortium Inc.,
Web Page Document, http://www.opengis.org/ogcSpecs.htm
Ouksel, A M, et A Sheth 1999 Semantic Interoperability in Global Information Systems:
A Brief Introduction to the Research Area and the Special Section. Sigmod
Record, 28(1): 5-12
P.E.I. Geomatics Information Centre User’s Guide to Digital and Hardcopy property and
Basemap Products. Charlottetown, P.E.I., Provincial Treasury - Taxation &
Property Records Division
Pylyshyn, Z W 1981 The Imagery Debate: Analogue Media Versus Tacit Knowledge.
Psychological Review, 88(1): 16-45
Québec 2000 Base de données topographiques du Québec (BDTQ) à l’échelle de
1/20 000 - Normes de production (Version 1.0). Québec, Ministère des
Ressources naturelles, Direction générale de l’information géographique, CD
Document
Ressources naturelles Canada 1996 Base nationale de données topographiques - normes
et spécifications. Sherbrooke, Québec, Centre d’information topographique –
Sherbrooke
Rodriguez, A, M J Egenhofer, et R D Rugg 1999 Assessing Semantic Similarities Among
Geospatial Feature Class Definitions. In Proceedings of Interoperating
Geographic Information Systems (Interop '99). Berlin, Springer-Verlag Lecture
Notes in Computer Science 1580: 189-202
Rodriguez, M A 2000 Assessing Semantic Similarity Among Entity Classes. Ph.D. Thesis,
University of Maine
Schramm, W 1971 The Nature of Communication Between Humans. In W Schramm, and
D F Robert (eds) The Process and Effects of Mass Communication. Champaign-
Urbana, IL, University of Illinois Press: 3-53
Sheth, A 1999 Changing Focus on Interoperability in Information Systems: From
Systems, Syntax, Structure to Semantics. In M Goodchild, M Egenhofer,
R Fegeas, and C Kottman (eds) Interoperating Geographic Information Systems.
Boston, Massachusetts, Kluwer Academic Publisher: 5-29
Sheth, A, et V Kashyap 1992 So Far (Schematically) Yet So Near (Semantically). In
Proceedings of IFIP WG2.6 Database Semantics Conference on Interoperable
Database Systems (DS-5)/IFIP Transaction (A-25)
Elsevier Science Publishers B.V.: 283-312
Sowa, J F 1984 Chapter 7: Limits of Conceptualization. In Conceptual Structures:
Information Processing in Mind and Machine. Reading, Massachusetts,
Addison-Wesley Publishing Company: 339-351
Sowa, J F 1987 Semantic Networks. In S C Shapiro (ed) Encyclopedia of Artificial
Intelligence. New York, John Wiley & Sons
Statistique Canada 1997 Fichiers numériques des limites et fichiers numériques
cartographiques, Recensement de 1996 (guide de référence). Ottawa, Ministère
de l’Industrie
Vckovski, A, K Brassel, et H J Schek (eds) 1999 Interoperating Geographic Information
Systems (Interop '99). Berlin, Springer-Verlag
VMap 1995 Vector Map (VMap), Level 1. Bethesda, MD, U.S. National Imagery and
Mapping Agency Mil-V-89033
Voisard, A, et H Schweppe 1998 Abstraction and Decomposition in Interoperable GIS.
International Journal of Geographic Information Science, 12(4): 315-333
Weiner, N 1950 The Human Use of Human Beings: Cybernetics and Society. Boston,
Houghton Mifflin
CHAPTER 2
SEMANTIC, SPATIAL AND TEMPORAL
INTEROPERABILITY: A PARALLEL WITH
THE COMMUNICATION PROCESS
Human beings naturally communicate with one another in an interoperable manner (i.e. they are
able to exchange information and to use the information exchanged), since they manage to
understand one another in their everyday dealings. We therefore begin this chapter on the
semantic, spatial and temporal interoperability of geospatial data with a review of the
process of communication between human beings. The communication process then leads us to
address certain aspects of human cognitive functioning, more specifically the perception,
abstraction and representation of the phenomena we observe. Next, we review the notions of
ontology as used in philosophy and in artificial intelligence. These notions are associated
with the representation of phenomena. Because distinct geospatial databases commonly
represent the same geographic phenomena differently, we also examine the different facets of
the heterogeneity inherent in data. We conclude the chapter with a review of the main
solutions proposed for semantic, spatial and temporal interoperability.
2.1 Interoperability and communication between systems
The study of the communication process begins with cybernetics and information theory
(Shannon, 1948; Weiner, 1950), whose foundations date back as far as 1948. Cybernetics is the
field concerned with the control that a system (a human being, a machine, etc.) exerts on its
environment in order to maintain order, organization and equilibrium. For example, a driver
constantly uses control processes to ensure that the car stays in the correct lane, does not
exceed the speed limit and heads in the right direction. The driver controls the car with the
steering wheel, the accelerator and the brake to keep it in a state of dynamic equilibrium. A
librarian ensures that books are properly classified and shelved in the appropriate stacks of
a library so that users can find the books they are looking for. A database administrator
models concepts, structures data and ensures the consistency of the databases he or she
maintains so that users can find the data they are looking for. As these examples show,
information is a notion closely associated with order and equilibrium. One measure of
information corresponds to the entropy of a system (Campbell, 1982). Information represents
the content that a system exchanges with other systems, taking into account the way the
system adapts to those other systems (Weiner, 1950). The proper functioning of a system
depends on the information it receives and the information it communicates.
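For reference, the entropy measure alluded to above is usually written as follows in information theory (Shannon, 1948). The notation is an assumption introduced here for illustration: a source can emit one of n possible messages, and p_i denotes the probability of the i-th message.

```latex
H = -\sum_{i=1}^{n} p_i \log_2 p_i
```

With two equally likely messages (p_1 = p_2 = 1/2), H equals 1 bit; when one message is certain (p_1 = 1), H equals 0 because receiving it removes no uncertainty.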
The exchange of information between systems is essentially a problem of effective
communication and control. A communication process fundamentally comprises three components:
a source, a message and a destination (Figure 2) (Schramm, 1971a; Shannon, 1948).
The source may be an individual, an organization, a newspaper, a radio or television station,
a Web site, a database, a GIS, etc. The knowledge that a source possesses cannot be
communicated directly. It is formulated as messages that are sent to a given destination.
Considering the case where the source and the destination are two people, the source first
chooses the information it wants to communicate to its destination. It then adapts that
information to the particular situation and to the recipient. Finally, it structures the
pieces of information together and encodes them in a message.
The message is made up of signals that may take the form of ink on paper, sound waves in the
air, light waves in optical fibre, electric current in the telephone network or the Internet,
etc. (Harrison, 1971). After encoding the message, the source places it in the communication
channel toward the recipient. At that moment, the message is released from the meaning the
source initially attributed to it. The source therefore no longer has control over the
message. Consequently, the message has no meaning in itself while in the communication
channel. It acts as a mediator between the source and the recipient. The communication
channel is divided into sub-channels: a primary channel and secondary channels. The primary
channel carries the signals of the main form, for example verbal signals during a discussion
between two individuals. The secondary channels, or "back channels", carry signals of an
accessory form, for example non-verbal signals in a discussion between two individuals, such
as nods, hand gestures, sighs, voice intonations, etc. (Harrison, 1971).
It is the recipient's responsibility, upon receiving the message, to decode it and give it a
particular meaning. To decode the message, the recipient essentially relies on the knowledge
he or she possesses. The recipient uses the knowledge that the signals of the message evoke
to assign a particular meaning to the message. Communication is effective if the meaning the
recipient gives to the message corresponds to the meaning the source initially gave it when
placing it in the communication channel.
Figure 2: Model of the communication process (from Schramm, 1971b)
As just stated, the meaning of a message corresponds to the sense that the signals evoke at
both the source and the destination. The sense given to signals represents the link a person
makes between the signals and the phenomena the signals designate. Ogden and Richards, as
well as several others, use a triangle to illustrate the interaction between a phenomenon
(i.e. the object, what we refer to, the referent), a signal that expresses the phenomenon in
a communication process (the sign or signifier) and the sense induced in the person (the
signified) (Figure 3) (Cherry, 1978; Daconta et al., 2003; Eco, 1988a; Eco, 1988b). Although
there is a link between the referent and the signifier that represents it, this link is only
indirect. It is realized through the sense that an individual gives to the signifier, which
is associated with the referent. The referent and the signifier act together to evoke a sense
in the individual.
Figure 3: Model of the interaction between the referent, the signifier and the signified
(triangle with vertices Referent, Signifier and Signified) (Cherry, 1978; Eco, 1988a)
The communication process also includes feedback (Weiner, 1950). Feedback is a control
mechanism that allows the source to adjust its messages according to the results obtained
from previous messages. For example, an individual A on the way to hotel X, located on street
Y in city Z, asks an individual B: "Can you tell me the way to hotel X?" If A judges that the
answer obtained from B is correct, A can consider that B understood the question. If, on the
other hand, B replies that he or she does not know hotel X, A can adjust the question by
asking the way to street Y in city Z. Here, feedback is the control mechanism that allows
individual A to make sure that the message is properly understood by B and to adjust the
message otherwise.
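The hotel example can be sketched as a simple loop in which the source reformulates its question until the recipient's reply shows that the message was understood. This is only an illustration under stated assumptions: the class `FeedbackDialogue`, its methods and the representation of B's knowledge as a map from place names to directions are hypothetical and are not part of the prototype described in this thesis.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class FeedbackDialogue {
    /** Recipient B: answers only when the place asked about is part of B's own knowledge. */
    static Optional<String> answer(Map<String, String> knowledgeOfB, String place) {
        return Optional.ofNullable(knowledgeOfB.get(place));
    }

    /** Source A: tries each formulation in turn, using B's reply as feedback. */
    static Optional<String> ask(List<String> formulations, Map<String, String> knowledgeOfB) {
        for (String place : formulations) {
            Optional<String> reply = answer(knowledgeOfB, place);
            if (reply.isPresent()) {
                return reply; // positive feedback: the message was understood
            }
            // negative feedback: fall through and reformulate with the next term
        }
        return Optional.empty(); // no formulation was understood
    }

    public static void main(String[] args) {
        // B does not know hotel X but does know street Y.
        Map<String, String> knowledgeOfB = Map.of("street Y", "second street on the left");
        List<String> formulations = List.of("hotel X", "street Y");
        System.out.println(ask(formulations, knowledgeOfB).orElse("no answer"));
        // prints "second street on the left"
    }
}
```

The ordering of `formulations`, from the most specific term to a more general one, stands in for the adjustment that A makes after each unsuccessful exchange.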
A source of noise can interfere with the proper transmission of a message. Noise that mixes
with the message creates disorder among the signals. It increases the level of uncertainty in
the transmission of the message and consequently reduces the level of information the message
conveys. A message affected by a source of noise becomes more complex, or even impossible, to
decode. Redundancy in the signals, usually regarded as an inefficiency in the message, takes
on a whole new importance here. It adds a safety factor to the transmission of a message and
increases the ability to understand the message (Schramm, 1971b). In this sense, redundancy
increases the level of information of a message. Redundancy can take several forms, such as
rules for constructing a message (e.g. the grammar rules of natural language make letters,
words or sequences of words predictable and, in that sense, redundant) (Campbell, 1982), the
parity bit in a byte, and the total page count on a fax cover sheet. Redundancy helps a
message reach its recipient with minimal distortion (Darnell, 1971).
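The parity bit mentioned above is a compact example of redundancy that lets a recipient detect noise. The sketch below assumes even parity over a 7-bit value, one common convention among several; the class and method names are hypothetical.

```java
final class ParityBit {
    /** Appends an even-parity bit to a 7-bit value, giving an 8-bit code word. */
    static int encode(int sevenBits) {
        int data = sevenBits & 0x7F;
        int parity = Integer.bitCount(data) & 1; // 1 when the number of 1 bits is odd
        return (data << 1) | parity;
    }

    /** A received code word is consistent when its total number of 1 bits is even. */
    static boolean isConsistent(int codeWord) {
        return (Integer.bitCount(codeWord & 0xFF) & 1) == 0;
    }

    public static void main(String[] args) {
        int sent = encode(0b1000001);   // the letter 'A' in 7-bit ASCII
        int noisy = sent ^ 0b00000100;  // noise flips one bit in the channel
        System.out.println(isConsistent(sent));  // prints true
        System.out.println(isConsistent(noisy)); // prints false
    }
}
```

The extra bit carries no new content, yet it allows the recipient to notice that a single bit was altered. A single parity bit cannot locate the error, and it misses errors that flip an even number of bits.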
Two people engaged in a communication process work constantly to maintain order in their
exchanges so as to understand each other. A disordered communication process works with
difficulty, or not at all. In a communication process, the source generates messages from its
own knowledge, just as the recipient deciphers them from his or her own knowledge. The
knowledge of the source and of the recipient comes from direct observation of the signals
given off by phenomena, from indirect observation of signals obtained from artificial sensors
(photographs, satellite images, etc.) and from interpreted signals coming from other people.
It is thanks to their common knowledge, or commonness (Schramm, 1971a), that the source and
the recipient maintain order and equilibrium in a communication process (Figure 4) and
succeed in understanding each other. A recipient who has no knowledge in common with the
source of a message cannot decipher the signals of the message correctly. For example, the
French term pont means something different to a dentist and to a ship's captain. If the
dentist does not know that a pont is a deck of a ship, and the captain does not know that a
pont is a prosthesis replacing one or more missing teeth (a dental bridge), it will be
difficult for them to communicate using this word, since it represents nothing common to both
individuals.
Schramm (1971a) and Bédard (1986) illustrate the common knowledge between source and
destination in much the same way (Figure 4). The two ellipses in Figure 4 correspond to
the cognitive models of the source and of the destination, respectively. The union of the
two ellipses represents all the knowledge of the source and the destination, whereas their
intersection represents specifically their common knowledge.
Figure 4: Model of common knowledge (from Schramm, 1971a)
2.2 Human perception, knowledge, and cognitive reasoning
As mentioned earlier, the human cognitive model develops from the direct and indirect
observation of phenomena. Human beings perceive phenomena through their senses (sight,
hearing, smell, touch, and taste) and through technological means that extend their
natural capacity for observation (e.g., Earth observation satellites, digital cameras,
radar). Human perception gives rise to perceptual states (Barsalou, 1999). A perceptual
state is a state of the brain made up of an unconscious neural representation of the
physical input to the senses and an optional conscious experience. Out of the full
complexity of the perceptual state of a phenomenon, selective attention retains only a
meaningful subset, depending on the context in which the phenomenon is observed. This
subset is stored as mental images in a person's long-term memory.
A mental image, also called a perceptual symbol (Barsalou, 1999), is an abstract
representation of a phenomenon. It is the concept in a person's memory that is associated
with the phenomenon. The form a mental image takes in memory is still a matter of
fundamental debate (Pylyshyn, 1981; Pylyshyn, in press). Two approaches are recognized in
the literature: the modal approach and the amodal approach.
The modal approach holds that mental images are stored in a structure close to that of the
perceptual state. A mental image is a representation analogous to a picture of a
phenomenon (Kosslyn, 1980; Kosslyn, 1981). Consequently, a person's neural system retains
an approximate image of the phenomenon (e.g., a picture of its shape and colour). The
person subsequently uses this mental image to recognize the phenomenon (Figure 5).
Figure 5: Modal approach to mental images (from Barsalou, 1999)
The amodal approach holds that mental images take a descriptive form (Pylyshyn, 1981;
Pylyshyn, in press). Part of the perceptual state of a phenomenon is transformed into
propositions that describe the set of significant properties of the object. The amodal
approach was inspired by developments in logic, statistics, mathematics, and computer
science, which gave rise to several representation languages such as feature lists,
frames, schemata, and semantic nets. Whereas a modal representation of a chair corresponds
to a kind of picture in memory (Figure 5), its amodal representation is a description of
the form "a chair = a back + a seat + 4 legs" (Figure 6).
Figure 6: Amodal approach to mental images (from Barsalou, 1999)
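The amodal description of a chair (a back, a seat, and four legs) can be written down as a small frame. The following Python sketch is only an illustration of that descriptive form: the names `CHAIR_FRAME` and `matches`, and the recognition rule itself, are assumptions and are not taken from Barsalou (1999) or Pylyshyn (1981).

```python
# Amodal (descriptive) representation of a chair as a frame: part -> expected count.
CHAIR_FRAME = {"back": 1, "seat": 1, "legs": 4}


def matches(frame: dict, observed: dict) -> bool:
    """Return True when every part required by the frame is observed with the
    expected count; extra observed properties (e.g. colour) are ignored."""
    return all(observed.get(part) == count for part, count in frame.items())


kitchen_chair = {"back": 1, "seat": 1, "legs": 4, "colour": "red"}
stool = {"seat": 1, "legs": 3}
print(matches(CHAIR_FRAME, kitchen_chair), matches(CHAIR_FRAME, stool))
```

A modal representation would instead store something closer to the picture itself, which is why no comparable listing of properties exists for it.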
Recently, Barsalou (1999) described a perceptual symbol as a record of the neural
activation that underlies perception. Perceived information is stored essentially in a
qualitative and functional manner. A perceptual symbol is not independent of the others in
long-term memory. It acts as an attractor within a network, grouping and connecting other,
similar perceptual symbols. On this point, the literature points to connectionist networks
(Barsalou, 1999; Laakso and Cottrell, 2000).
Perceptual symbols are structured and stored as concepts. A concept is a kind of simulator
that produces different conceptualizations (or conceptual representations) of itself, each
adapted to a different context. A concept brings together the knowledge and processes that
adequately represent a type of entity or event (Barsalou, 1999). For example, the concept
water area can be a simulator for the conceptual representations waterbody, coastline,
lake, river/stream, and lac (see Table 3, Section 3.3). A concept that corresponds to a
category is also able to recognize its members. In this sense, an entity belongs to a
category if that category can produce a satisfactory simulation of the entity. By the same
token, two individuals who represent the same category differently may each be able to
simulate the other's representation.
2.3 The notion of ontology and the description of phenomena
As we have seen, the concept is an important notion in cognition. It is what holds the
knowledge about a phenomenon or a category of phenomena. Typically, the architect of a
geospatial database specifies and structures the set of concepts for which the database
stores data. The concepts and their definitions make it possible to populate the database,
to manage it, and to use the data appropriately. In this section, we review the notion of
ontology as presented by certain philosophers and the one used in computer science, both
of which concern the description of phenomena and concepts.
2.3.1 Ontology, a philosophical point of view
The existence, knowledge, and description of Being and Truth are questions that have
occupied philosophers at least since Aristotle. Philosophers study these questions through
Ontology. Ontology corresponds to the description of the world (Peuquet et al., 1998) as
well as to a model and an abstract theory of the world (Smith and Mark, 1999). It is the
science of Being, of the "what is", of the types of entities, properties, categories, and
relations that make up reality (Lehmann, 1992; Peuquet et al., 1998; Smith and Mark,
1999). In philosophy, it is assumed that there is a single "true" reality to describe and,
consequently, a single Ontology independent of any language. Ontology is described either
by an abstraction of the formal elements that characterize all scientific domains (called
formal Ontology) or by a statement of the necessary and sufficient conditions that
describe a kind of entity in a given domain (called material Ontology) (Peuquet et al.,
1998).
Formal Ontology is the study of the structures shared across scientific domains, for
example the study of identity and difference, of unity and plurality, of properties and
relations, of parts and wholes, and of measure and quality. The study of the boundaries of
objects in space found in Casati et al. (1998), Smith (1994), Smith and Mark (1999), and
Smith and Varzi (2000) is an example of formal Ontology. These authors recognize two types
of boundaries: bona fide and fiat.
A bona fide boundary is a sharp, physical demarcation between two objects, characterized
by a qualitative and physical difference (Smith, 1994). This type of boundary is observed
in phenomena such as buildings, towers, racetracks, runways, and bridges.
A fiat boundary is a demarcation drawn by humans. It is a boundary of a theoretical,
mathematical, artificial, or virtual nature that bears no relation to the physical
description of an object (Smith, 1994). This type of boundary is used to describe, for
example, an administrative boundary, the boundary between two bodies of water (e.g.,
between the St. Lawrence River and the Gulf of St. Lawrence), the boundary of a rapid
within a watercourse, and the boundary between adjacent forest stand zones. Concepts and
categories can also be regarded as representations of phenomena of a fiat nature, since
they are defined in a purely theoretical manner. As a result, the topological notions of
boundary, interior, contact, separation, and continuity can be extended to express the
similarity between concepts and categories (Smith and Mark, 1999). Smith (1994) and Smith
and Mark (1999) hold that objects with fiat boundaries possess their own boundary and
that, in this sense, contact and separation between objects are established by the total
or partial coincidence of their respective boundaries.
Material Ontology concerns the study of the phenomena of a particular domain (e.g.,
natural, social, or institutional phenomena). It therefore corresponds more closely to the
notion of ontology found in artificial intelligence.
2.3.2 An ontology, a computer science point of view
In artificial intelligence, Gruber (1993) defines an ontology as an "explicit
specification of a conceptualization". It is a definition of the vocabulary that
represents a certain body of knowledge. An ontology includes definitions of the classes,
relations, functions, and so on that are specific to a domain. Other authors adhere
broadly to this definition of ontology:
- it is the layer that makes it possible to define the concepts of reality (Kashyap and
Sheth, 1996);
- it is a specific vocabulary and set of relations used to describe certain aspects of
reality, together with a set of explicit assumptions about the intended meaning of the
vocabulary (Ouksel and Sheth, 1999);
- it is the way an agent perceives the world, the elements that compose it, and the
processes that represent the interaction between those elements (Mackay, 1999);
- it is a keyword hierarchy, a schema, a metadata dictionary, or a complex terminology
expressed in a conceptual language (e.g., UML) (Sycara et al., 1999).
Guarino (1998), however, refines Gruber's definition: an ontology is a logical theory that
accounts for the intended meaning of a vocabulary, that is, its ontological commitment to
a particular conceptualization of the world. In this definition, Guarino establishes a
relation of commitment between an ontology and the conceptualization of the world. The
conceptualization of the world is independent of any language, as is Ontology (in
philosophy), whereas an ontology depends on language.
An ontology is a knowledge base that holds the description of a set of concepts with all
their properties (including spatial and temporal properties) as well as the relations that
exist between the concepts. It is usually taken to be accurate by a community of users
(Guarino, 1998). In practice, an ontology can take the form of a semantic network, a
taxonomy or classification, a thesaurus, a conceptual model, or a data repository.
A semantic network is a structure of interconnected nodes and arcs that represents a form
of knowledge (Lehmann, 1992; Sowa, 1987). A structure becomes semantic when it assigns
meaning to the arcs and nodes. The nodes correspond to conceptual units and the arcs to
the relations between those units. Lehmann (1992) surveys various forms of semantic
networks, including frame systems, relational graphs, hierarchical structures, and logical
graphs (e.g., conceptual graphs).
A taxonomy is a classification of the concepts of a particular domain into categories. It
describes hierarchically the concepts and categories that fall under more general
categories.
A thesaurus is a collection of concepts that constitutes an ontology (Meta Data Coalition,
1999). It is a controlled vocabulary organized in a known order, with specific types of
relations: equivalence (used for/use), hierarchy (broader term/narrower term), and
association (related term/related term) (Milstead, 1998). The concepts that make up a
thesaurus are represented essentially by terms (i.e., words).
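The three relation types of a thesaurus (equivalence, hierarchy, association) map naturally onto a small data structure. The Python sketch below is one possible layout under stated assumptions: the class `Thesaurus`, its method names, and the hydrographic terms are all hypothetical and are not drawn from Milstead (1998) or from any standard thesaurus.

```python
from collections import defaultdict


class Thesaurus:
    """Controlled vocabulary holding the three relation types described above."""

    def __init__(self) -> None:
        self.use = {}                    # equivalence: non-preferred -> preferred term
        self.broader = {}                # hierarchy: term -> its broader term
        self.related = defaultdict(set)  # association, stored symmetrically

    def add_equivalence(self, non_preferred: str, preferred: str) -> None:
        self.use[non_preferred] = preferred

    def add_hierarchy(self, narrower: str, broader: str) -> None:
        self.broader[narrower] = broader

    def add_association(self, a: str, b: str) -> None:
        self.related[a].add(b)
        self.related[b].add(a)

    def preferred(self, term: str) -> str:
        """Resolve a term to its preferred form (used for / use)."""
        return self.use.get(term, term)

    def narrower(self, term: str) -> list:
        """List the terms whose broader term is `term`."""
        return sorted(t for t, b in self.broader.items() if b == term)


th = Thesaurus()
th.add_equivalence("creek", "stream")     # creek USE stream
th.add_hierarchy("river", "watercourse")  # watercourse is the broader term
th.add_hierarchy("stream", "watercourse")
th.add_association("river", "lake")       # related terms
print(th.preferred("creek"), th.narrower("watercourse"), th.related["lake"])
```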
In the context of databases, a conceptual model represents a part of reality in a
simplified and abstract way for a particular need. It results from an analysis centred on
the data of interest to the users of the database (Bédard, 1999a; Simsion, 2001). A
conceptual model is the outcome of a reflection that determines which elements the
database needs in order to supply users with the necessary information (Collongues et al.,
1987; Simsion, 2001). It also serves to document, develop, and communicate the data of a
database (Bédard, 1999b). A conceptual model represents and organizes a set of concepts as
categories, object classes, properties, relations (including generalizations and
aggregations), roles, constraints, behaviours, geometries, temporalities, and so on, in a
lexical formalism (e.g., EXPRESS, Backus-Naur form) or a graphical one (e.g.,
Entity-Relationship, UML).
Data repositories are another form of ontology. A data repository is the set of metadata
that documents both the semantics and the structure of a database. It comprises a data
dictionary and a conceptual model (Brodeur et al., 2000; Gal, 1999; Sycara et al., 1999).
Brodeur et al. (2000) propose a data repository metamodel adapted to geospatial data,
represented as a UML class diagram (Figure 7). This metamodel presents the components
needed to describe geospatial concepts, including object classes, the relations between
object classes (association, dependency, generalization), descriptive, spatial, and
temporal characteristics, attribute value domains, operations, and constraints.
Figure 7: Geospatial data repository metamodel (from Brodeur et al., 2000)
Because phenomena are usually abstracted differently from one database to another, their
descriptions can be more or less detailed depending on the needs, the context, and the
experience of the database architects. Accordingly, the literature recognizes three levels
of granularity in ontologies: global ontologies, domain ontologies, and application
ontologies (Guarino, 1998).
A global ontology defines concepts at a high level of abstraction, independently of
specific domains or applications. It aims at an exhaustive representation of the concepts
of reality. A global ontology can be compared to a dictionary that gives the meaning of
concepts for general use. WordNet, Termium
(http://termiumplus.bureaudelatraduction.gc.ca), and Le grand dictionnaire terminologique
(http://www.granddictionnaire.com) are examples of global ontologies.
A domain ontology describes the concepts that are common to an information community or a
field of activity. It specializes the meaning and use of the concepts of global ontologies
for more restricted uses. This type of ontology can be compared to a lexicon that lists
the terms specific to a science or a technique. The National Standards for the Exchange of
Digital Topographic Data - Volume II – Topographic Codes and Dictionary of Topographic
Features (Canadian Council on Surveying and Mapping, 1984) is an example of a domain
ontology.
An application ontology lists the concepts specific to a particular use. It can be
compared to the glossary found at the end of a book, which specifies the meaning given to
certain terms and expressions used in that book. In the world of geospatial databases, an
application ontology also corresponds to a conceptual data model, a data dictionary, or a
geographic data product specification. For example, the Normes et spécifications de la
Base nationale de données topographiques du Canada (BNDT) (Ressources naturelles Canada,
1996) include an ontology that describes the concepts used to represent a set of
topographic phenomena.
2.4 Data heterogeneity, an obstacle to interoperability
The geospatial data interoperability problem consists in establishing effective
communication among all geospatial databases and users of geospatial data. However,
geospatial databases and users differ from one another in a number of ways, and these
heterogeneities limit their ability to interoperate. Bishr (1997), Ouksel and Sheth
(1999), and Sheth (1999) break down the heterogeneity between databases into four levels:
system heterogeneity, syntactic heterogeneity, structural heterogeneity, and semantic
heterogeneity.
2.4.1 System heterogeneity
Because databases often reside on different systems, the first efforts to achieve
interoperability between databases aimed to interconnect those systems. The work that led
to communication networks and communication protocols (e.g., Ethernet, TCP/IP, RPC, FTP,
HTTP) now makes it possible to connect systems running different operating systems (e.g.,
Windows, Unix, VMS, OS/400, Mac OS, Linux). Communication protocols define the set of
rules that allow different systems to communicate with one another and to share both data
files and resources.
Moreover, the interconnection of database management systems (DBMS) now makes it possible
to share not only data files between systems but also data between databases. Many
applications now access data stored in various DBMSs on different systems. These
applications use the Structured Query Language (SQL) and tools such as Open Database
Connectivity (ODBC) or Java Database Connectivity (JDBC) to connect to and communicate
with the DBMSs.
With the emergence during the 1990s of solutions for interconnecting heterogeneous
systems, efforts toward interoperability have since turned to the problems of data
heterogeneity.
2.4.2 Syntactic heterogeneity
The study of syntactic heterogeneity deals specifically with the physical representation
of data (Ouksel and Sheth, 1999), i.e., the signs and the order of the signs in a message
(Cherry, 1978). Syntax establishes the signs and defines the rules for ordering them in a
message. Syntactic heterogeneity is concerned with the form of the message rather than its
content. In an interoperability context, a system must first be able to decode the signs
of a message before it can understand them. Syntactic heterogeneity can be compared to the
language two individuals use to communicate. A person who does not know Chinese signs
(spoken or written) cannot understand a person who communicates essentially in Chinese;
decoding the signs is impossible because of their syntactic heterogeneity.
In geospatial data, syntactic heterogeneity arises when interacting systems have no common
geospatial data exchange format (CCOGIF, ArcExport, MID/MIF, DXF, Shape, etc.). The
approach of the OGC and ISO/TC 211 to overcoming syntactic heterogeneity between GIS is to
standardize a syntax for communicating geographic data. Among other things, these
organizations are actively working on the definition of the Geography Markup Language
(GML) (ISO/TC 211, 2003b; Open GIS Consortium Inc., 2001).
Syntactic heterogeneity also shows in the different forms used to represent geographic
information (Bishr, 1997). Fundamentally, there are two forms of representation of
geographic information: raster and vector. The raster form consists of a regular mosaic of
cells (also called pixels) to which values are assigned to represent a theme (e.g., forest
cover tree species, relief, precipitation). The vector form uses geometric representations
such as the point, the line, the surface, and the volume to describe the geometry of the
data. To simplify the syntax of vector data, ISO/TC 211 has, for example, standardized a
set of vector geometric representations for GIS (ISO/TC 211, 2003a). Table 1 summarizes
the main geometric representations of this standard.
Table 1: Spatial representations of the ISO 19107 standard
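The contrast between the raster and vector forms can be made concrete with a toy lake encoded both ways. This Python sketch is illustrative only: the grid values, the 10 m cell size, the coordinates, and the helper `polygon_area` are assumptions, and the structures shown are not ISO 19107 types.

```python
# Raster form: a regular grid of cells; each value encodes a theme (1 = water, 0 = land).
raster = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
cell_size = 10.0  # assumed ground size of one cell side, in metres
raster_area = sum(value for row in raster for value in row) * cell_size ** 2

# Vector form: the same lake as a surface bounded by a ring of (x, y) coordinates.
lake_ring = [(10.0, 10.0), (30.0, 10.0), (30.0, 30.0), (10.0, 30.0)]


def polygon_area(ring: list) -> float:
    """Area of a simple polygon from its vertices (shoelace formula)."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


print(raster_area, polygon_area(lake_ring))
```

Both encodings describe the same phenomenon, yet a system that reads only one of them cannot decode the other, which is the syntactic heterogeneity described above.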
2.4.3 Structural heterogeneity
Structural heterogeneity concerns differences in the way data are modelled. For example,
the concept street (rue) can be modelled as a value of the attribute classification of the
object class road (route). It can also be represented as a subclass of the object class
road (Figure 8). Bishr (1997), Charron (1995), and Sheth and Kashyap (1992) classify the
kinds of conflicts specific to structural heterogeneity in similar ways. In Table 2, we
group the structural conflicts of geospatial data under four headings: concept, property,
geometry, and temporality.
Figure 8: Different structures for the concept street
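The two structures of the concept street (rue) relative to the object class road (route) can be sketched as two small Python models. The class names `RoadA`, `RoadB`, and `StreetB` and the helper `is_street` are hypothetical; they only illustrate that the same concept sits in an attribute value in one schema and in the class hierarchy in the other.

```python
from dataclasses import dataclass


# Structure A: street is a value of the `classification` attribute of the class road.
@dataclass
class RoadA:
    name: str
    classification: str  # e.g. "highway", "street"


# Structure B: street is a subclass of the class road.
@dataclass
class RoadB:
    name: str


@dataclass
class StreetB(RoadB):
    pass


def is_street(feature) -> bool:
    """Answer the same question against either structure."""
    if isinstance(feature, RoadA):
        return feature.classification == "street"
    return isinstance(feature, StreetB)


features = [RoadA("Main", "street"), RoadA("A-20", "highway"), StreetB("Elm"), RoadB("R-132")]
print([is_street(f) for f in features])
```

An application that expects one structure has to be told explicitly how to find the concept in the other, which is what resolving a structural conflict amounts to.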
2.4.4 Semantic heterogeneity
Semantic heterogeneity is the difference in meaning between concepts. As illustrated in
Figure 3, the meaning of a concept is established by the link made between the signifier
and the referent. The difference between the cognitive models of two individuals who, for
example, associate identical signals with different phenomena and different signals with
the same phenomena illustrates semantic heterogeneity well. Because cognitive models
develop through the observation of phenomena in a particular context, context plays an
important role in the semantic heterogeneity of concepts. The context in which phenomena
are observed must therefore be considered in order to resolve the semantic heterogeneity
between concepts.
Ontologies are recognized as the means of maintaining the meaning given to the concepts of
a database. In geospatial databases, ontologies describe a set of concepts with their
definitions, properties, geometry, and temporality according to the context from which
they are abstracted. Evaluating the semantic similarity between two concepts from two
ontologies (e.g., railroad in VMap and railLine in BC Digital Base Line Mapping; see Table
3, Section 3.3) aims to resolve the semantic heterogeneity between the concepts. Bishr
(1997) asserts that semantic heterogeneity is the main barrier to geospatial data sharing.
Table 2: Nature of the structural conflicts in geospatial data
2.5 Main approaches to semantic interoperability
The very first databases were in many cases developed on a small scale within departments
of organizations of all sizes. Organizations quickly understood the importance of these
databases to their activities as a whole: "data is a corporate resource" (Sheth, 1999). It
is on this basis that the sharing, exchange, and pooling of data (in other words, data
interoperability) developed. This section reviews approaches that have marked the
development of semantic interoperability of geospatial data and that are precursors to
this thesis. More specifically, it covers the data federation approach (Sheth and Larson,
1990), the notion of semantic similarity (Kashyap and Sheth, 1996; Sheth and Kashyap,
1992), the Semantic Formal Data Structure model (Bishr, 1997), and the Matching-Distance
model (Rodriguez, 2000).
2.5.1 Data federation
The idea behind data federations is to give a community of users access to several
databases as though they formed a single one. A federation aims to pool data from several
independent databases. Because each database is usually implemented in a particular DBMS
and has its own data structure, a federation offers a solution to the syntactic and
structural heterogeneity of the participating databases.
Sheth (1999) presents a five-level federated database architecture (Figure 9). At the
lowest level are the individual databases with their respective conceptual models (local
schemas). The conceptual models of the databases may be expressed in various formalisms
(e.g., E/R, UML). Next come the component schemas. These are translations of the
databases' conceptual models that conform to the conceptual model of the federation. At
the third level are the external views of the databases (export schemas). These views
identify the data from the databases that are made available to the federation. An export
schema acts as a filter on a database to control access to its data and the transactions
against it. The conceptual model of the federation, also called the canonical model, is at
the fourth level (Sheth, 1999; Tari, 1992). It results from integrating the export schemas
of the databases that make up the federation (Spaccapietra et al., 1992; Tari, 1992). The
federation model coordinates the transactions submitted to it: it receives the
federation's transactions and distributes them to the individual databases. At the top
level are the external views (external schemas) of the federation. These views define data
sets formulated for specific users or applications. In a sense, they correspond to a
product specification.
Figure 9: Five-level data federation architecture (from Sheth, 1999)
Federations have developed along two lines: tightly coupled and loosely coupled. Tightly
coupled federations provide a robust and stable architecture. Data integration is strong,
but also very rigid. A tightly coupled federation requires substantial system
administration to ensure the integrity of the models and the data. This type of federation
is particularly suitable when the objectives of the federation are clearly defined, the
number of local databases is small, and the federation has good control over the
participating local databases (Kahng and McLeod, 1998). Loosely coupled federations have a
dynamic architecture. Each database in the federation is more autonomous, which allows
more flexibility in participating in the federation. In a loosely coupled federation, the
canonical model provides a partial integration of the participating databases (Kahng and
McLeod, 1998). A loosely coupled federation demands more effort from the users of the
data.
2.5.2 Semantic similarity
The notion of semantic similarity between concepts is studied in several disciplines,
notably psychology, cognition, and artificial intelligence. It expresses the resemblances
and differences that exist between two concepts. The notion of similarity has been
developed in several ways. Rodriguez (2000) surveys a range of approaches proposed for
evaluating the semantic similarity between concepts.
Semantic similarity has been studied in, among others, feature-based models. These
approaches evaluate the semantic similarity between two concepts (a and b) quantitatively
by means of a conceptual distance (D). Some approaches determine this conceptual distance
by comparing the properties of the two concepts. This is the case for the contrast model
(Equation 1) and the ratio model (Equation 2) of Tversky (1977). These models evaluate the
conceptual distance by comparing the common and distinctive properties of the concepts,
where A and B denote the sets of properties of a and b:

$$D(a,b) = \theta f(A \cap B) - \alpha f(A - B) - \beta f(B - A), \quad \theta, \alpha, \beta \ge 0 \qquad \text{(Equation 1)}$$

$$D(a,b) = \frac{\theta f(A \cap B)}{\theta f(A \cap B) + \alpha f(A - B) + \beta f(B - A)}, \quad \theta, \alpha, \beta \ge 0 \qquad \text{(Equation 2)}$$

Other approaches evaluate semantic similarity by means of a distance in a multidimensional
Euclidean semantic space (Equation 3) (Rips et al., 1973):

$$D(a,b) = \sqrt{\sum_{d=1}^{n} \left( X_{a,d} - X_{b,d} \right)^{2}} \qquad \text{(Equation 3)}$$

where:
d is a dimension
X is a semantic coordinate
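The three measures above can be tried out numerically. The following Python sketch is written under explicit assumptions: property sets are plain Python sets, the function f is taken to be set cardinality (`len`), the default weights are arbitrary, and the property names are invented for illustration. None of these choices comes from Tversky (1977) or Rips et al. (1973).

```python
import math


def contrast_model(A: set, B: set, theta=1.0, alpha=0.5, beta=0.5, f=len) -> float:
    """Equation 1: weighted common properties minus weighted distinctive ones."""
    return theta * f(A & B) - alpha * f(A - B) - beta * f(B - A)


def ratio_model(A: set, B: set, theta=1.0, alpha=0.5, beta=0.5, f=len) -> float:
    """Equation 2: share of the weighted common properties in the weighted total."""
    common = theta * f(A & B)
    denominator = common + alpha * f(A - B) + beta * f(B - A)
    if denominator == 0:
        raise ValueError("undefined when both property sets are empty")
    return common / denominator


def euclidean_distance(xa: list, xb: list) -> float:
    """Equation 3: distance between two concepts in an n-dimensional semantic space."""
    if len(xa) != len(xb):
        raise ValueError("both concepts need the same number of dimensions")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(xa, xb)))


# Hypothetical property sets for two concepts.
road = {"paved", "linear", "carries vehicles", "numbered"}
street = {"paved", "linear", "carries vehicles", "urban"}
print(contrast_model(road, street), ratio_model(road, street))
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
```

With α ≠ β the first two measures become asymmetric, so comparing a to b need not give the same value as comparing b to a.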
The semantic similarity between concepts has also been addressed through semantic
networks. Frankhauser et al. (1991) and Frankhauser and Neuhold (1992) use knowledge
networks to quantify the distance between two concepts in the same network. This distance
represents the shortest path linking the two concepts. A coefficient in the interval [0,1]
is assigned to each link joining two concepts; it expresses the weight given to the
relation between them. The semantic similarity between two non-adjacent concepts (i.e.,
separated by one or more concepts in the network) is determined by taking into account,
for each link, the relation involved (association a, generalization g, or specialization
s) and the weight of the relation. The relations are analysed pairwise and in order to
determine the nature of the resulting relation (a, g, or s), the distance function to use
(d1, d2, or d3), and the evaluation priority of the distance functions. Using the
knowledge network of Figure 10, the distance of the path footbridge – trail – road –
street (passerelle – sentier – route – rue) is d2(d2(0.7, 0.6), 0.9) = 0.378.
        Resulting relation    Distance function    Evaluation priority
        a    g    s           a    g    s          a    g    s
   a    a    a    a           d2   d2   d2         3    1    1
   g    a    g    a           d2   d3   d1         1    2    3
   s    a    a    s           d2   d1   d3         1    3    2

where the distance functions are:

$$d_1(\alpha, \beta) = \max(0, \alpha + \beta - 1)$$
$$d_2(\alpha, \beta) = \alpha\beta \qquad \text{(Equation 4)}$$
$$d_3(\alpha, \beta) = \min(\alpha, \beta)$$

Figure 10: Example of a knowledge network
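The pairwise combination of relations along a path can be sketched in Python. This is a simplified reading, not the authors' algorithm: links are folded strictly left to right, the evaluation-priority table is not applied, and, because Figure 10 is not reproduced here, the three links of the example path are assumed to be associations (a) with weights 0.7, 0.6, and 0.9. The names `RESULT`, `DIST`, and `path_distance` are hypothetical.

```python
import math


def d1(x: float, y: float) -> float:
    return max(0.0, x + y - 1.0)


def d2(x: float, y: float) -> float:
    return x * y


def d3(x: float, y: float) -> float:
    return min(x, y)


# Lookup tables indexed by the pair of relations being combined
# (a = association, g = generalization, s = specialization).
RESULT = {
    "a": {"a": "a", "g": "a", "s": "a"},
    "g": {"a": "a", "g": "g", "s": "a"},
    "s": {"a": "a", "g": "a", "s": "s"},
}
DIST = {
    "a": {"a": d2, "g": d2, "s": d2},
    "g": {"a": d2, "g": d3, "s": d1},
    "s": {"a": d2, "g": d1, "s": d3},
}


def path_distance(links: list) -> tuple:
    """Combine (relation, weight) links in path order, left to right.
    Simplification: the evaluation-priority table is ignored."""
    relation, weight = links[0]
    for next_relation, next_weight in links[1:]:
        weight = DIST[relation][next_relation](weight, next_weight)
        relation = RESULT[relation][next_relation]
    return relation, weight


# Assumed relation types for the path footbridge - trail - road - street.
relation, weight = path_distance([("a", 0.7), ("a", 0.6), ("a", 0.9)])
print(relation, math.isclose(weight, 0.378))
```

Under these assumptions the fold reproduces the nested form d2(d2(0.7, 0.6), 0.9) given in the text; a faithful implementation would also reorder evaluations according to the priority table.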
In an approach based on comparing abstractions within a specific context, Kashyap and
Sheth (1996), Kashyap and Sheth (1998), and Sheth and Kashyap (1992) introduce the notion
of semantic proximity to determine qualitatively the semantic similarity between two
objects. Semantic proximity compares two objects in terms of the domains of the object
classes and the states of the objects, according to a given context and a type of mapping
between the domains of the object classes:
semPro (O1, O2) = <ctx, app, (D1,D2), (E1,E2)> (Equation 5)
where:
Oi: an object
ctx: the comparison context
app: the type of mapping (appariement)
Di: the domain of Oi
Ei: the state (état) of Oi
Several authors identify context (ctx) as a fundamental notion in semantic
interoperability. Ouksel and Naiman (1993) introduce context as the knowledge
needed to reason about another system. Sciore et al. (1992) define context as the
meaning, content, organization, and properties of data. Guha (1990) presents
context as a characteristic associated with a subset of an ontology. Sheth and
Kashyap (1992) mention that context can be seen as (1) the association of an object
class with a database or an application, (2) the participation of an object class in a
relationship, (3) the association of an object class with an external view of the data
(external schema) or with a data model (internal schema), or (4) a named collection
of the domains of object classes. Context is the principal medium for representing
the very essence of a concept, i.e. the phenomenon or event to which it refers
(Ouksel and Sheth, 1999). Kashyap and Sheth (1996) and Kashyap and Sheth
(1998) propose a partial representation of context in the form of a collection of
"contextual" coordinates and values:
Context = <(C1, V1), (C2, V2), …, (Cn, Vn)>
where:
Ci: a role, an aspect of the context
Vi: the value assigned to the contextual coordinate Ci (may be a variable)
Example:
ContexteRoute = <(classification, {autoroute, principale, secondaire,
rue}), (revêtement, {pavée, non pavée}), (support de la route, {au
sol, autre})>
A challenge in semantic interoperability is to compare representations of
phenomena while taking context into account.
The mapping between the domains (app) of the objects is associated with the
structural component. It describes the relation that exists between the domains of
the two objects. Kashyap and Sheth (1996) and Kashyap and Sheth (1998) define
eight mapping relations:
Total 1-1 mapping: each value in the domain of one object class corresponds to a
value in the domain of the other object class, and vice versa.
Partial M-1 mapping: a value in the domain of one object class corresponds to
several values in the domain of the other object class; some values may have no
correspondence.
Generalization/specialization: the domain of one object class is the generalization
or the specialization of the domain of another object class; the domains of the two
object classes are the generalization or the specialization of the domain of a third
object class.
Aggregation: the domain of one object class is a collection of domains of other
object classes.
Functional dependency: the values in the domain of one object class depend on the
values in the domain of the other object class.
Any: any of the relations defined above.
None: no correspondence between the domains of the object classes.
The domain of an object class (Di) consists of the set of values that an object can
take. When an object class has several properties, the domain of the object class
corresponds to a subset of the cross product of all the values that the properties can
take, i.e. the accepted combinations of property values. For example, the domain of
route is Droute = ({autoroute, pavée, au sol}, {autoroute, non pavée, au sol},
{principale, pavée, au sol}, …, {rue, non pavée, autre}).
The state of an object (Ei) describes the representation of the object in the
database. It is the extension of a concept or phenomenon. For example, the state of
a route object is
Eroute = (rue, pavée, au sol)
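The relation between contextual coordinates, the domain Droute and a state Eroute can be illustrated with a short Python sketch. It is a hedged illustration only: the variable names are hypothetical, and the single rejected combination is an invented rule that merely shows how a domain can be a proper subset of the cross product; the source does not say which combinations are actually rejected.

```python
from itertools import product

# Contextual coordinates and their admissible values, taken from ContexteRoute.
contexte_route = {
    "classification": {"autoroute", "principale", "secondaire", "rue"},
    "revêtement": {"pavée", "non pavée"},
    "support de la route": {"au sol", "autre"},
}

# Full cross product of the property values, one tuple per combination,
# ordered as (classification, revêtement, support de la route).
all_combinations = set(product(*(sorted(v) for v in contexte_route.values())))

# Hypothetical restriction (assumption for illustration only):
# this combination is treated as not accepted.
rejected = {("autoroute", "non pavée", "autre")}

# The domain is the set of accepted combinations.
d_route = all_combinations - rejected

# The state of one object is a single combination drawn from the domain.
e_route = ("rue", "pavée", "au sol")

print(len(all_combinations), len(d_route), e_route in d_route)  # prints: 16 15 True
```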
Building on their definition of semantic proximity, Kashyap and Sheth (1996) and
Kashyap and Sheth (1998) develop a series of five independent predicates to
qualify the semantic proximity between two object classes:
Semantic incompatibility: semantic proximity where the domains of the object
classes have no possible correspondence. This is semantic dissimilarity, the
disjointness of the object classes.
Semantic resemblance: semantic proximity where the domains of the object classes
have no correspondence but where the objects play a similar role. They share a
common connotation.
Semantic relevance: semantic proximity where a mapping of any type between the
domains of the object classes exists in certain contexts. The object classes are
related regardless of the type of mapping.
Semantic relationship: semantic proximity where a partial mapping, a
generalization, or an aggregation exists between the domains of the objects in all
contexts. It is a form of overlap between the object classes.
Semantic equivalence: the strongest semantic proximity between object classes.
The two classes represent the same phenomena. A total mapping of the domains of
the object classes exists in all contexts. Semantic equivalence represents the
equality of the object classes.
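The five predicates can be read as a decision procedure over the mapping type and its contextual scope. The Python sketch below is one possible reading, not the formulation of Kashyap and Sheth: the enumeration, the function name `qualify` and its parameters are hypothetical, and treating a functional dependency that holds in all contexts as semantic relevance is an assumption, since the text does not assign that case to a predicate.

```python
from enum import Enum

class Mapping(Enum):
    TOTAL_1_1 = "total 1-1"
    PARTIAL_M_1 = "partial M-1"
    GENERALIZATION_SPECIALIZATION = "generalization/specialization"
    AGGREGATION = "aggregation"
    FUNCTIONAL_DEPENDENCY = "functional dependency"
    NONE = "none"

def qualify(mapping, in_all_contexts, similar_role=False):
    """Return the predicate that qualifies the proximity of two object classes.

    mapping         -- type of mapping found between the two domains
    in_all_contexts -- True if the mapping holds in every context
    similar_role    -- True if the objects play a similar role
    """
    if mapping is Mapping.NONE:
        # No correspondence between the domains.
        return "semantic resemblance" if similar_role else "semantic incompatibility"
    if in_all_contexts:
        if mapping is Mapping.TOTAL_1_1:
            return "semantic equivalence"
        if mapping in (Mapping.PARTIAL_M_1,
                       Mapping.GENERALIZATION_SPECIALIZATION,
                       Mapping.AGGREGATION):
            return "semantic relationship"
    # Any mapping that holds at least in some contexts (assumption: this
    # also covers a functional dependency holding in all contexts).
    return "semantic relevance"

print(qualify(Mapping.TOTAL_1_1, in_all_contexts=True))
print(qualify(Mapping.AGGREGATION, in_all_contexts=False))
print(qualify(Mapping.NONE, in_all_contexts=False, similar_role=True))
```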
2.5.3 The Semantic Formal Data Structure model
Bishr (1997) studies more specifically the semantic interoperability of geospatial
data in order to solve the information-sharing problem. He proposes the Semantic
Formal Data Structure (SFDS) model. This model is derived from the federated
database architecture presented earlier. However, the SFDS approach treats each
application domain independently (e.g. land use, geology).
The SFDS architecture (Figure 11) is divided into three tiers. The lower tier
contains the databases and their respective conceptual models. The middle tier
holds the external view of the data, which represents the elements of the database
that are made accessible to users. A context description is associated with the
external view. The context mediator sits at the third tier. It is composed of a
common ontology, a federated model, and a context description associated with
the federated model.
Figure 11: Three-tier architecture of the SFDS (from Bishr, 1997)
The classes of the federated model and of the external views are coupled to a
concept of the ontology. The ontology consists essentially of a hierarchy of
hypernyms and hyponyms that define a vocabulary. A class of an external view
and a class of the federated model that are coupled to the same concept of the
ontology are considered similar. Structural conflicts between these classes can then
be resolved. In the SFDS approach, a user can submit a query to the federation
using the vocabulary of an external view, through the context mediator. In this
approach, context is defined as a set of category definitions, class definitions, and
geometric descriptions:
Context = (category definition ∧ intensional class definition ∧
geometric description)
Example:
ContexteRoute = ∀route (définition = “Traffic way specially laid out for
the movement of motor vehicles.”
∧ classification ∈ {autoroute, principale, secondaire, rue} ∧
revêtement ∈ {pavée, non pavée} ∧ support ∈ {au sol,
autre} ∧ géométrie = )
2.5.4 The Matching-Distance model
Rodriguez (2000) studied semantic similarity between object classes for the
retrieval and integration of geospatial data that come from multiple sources and
meet a specific need. She proposes the Matching-Distance (MD) model to evaluate
quantitatively the semantic similarity between two classes of spatial objects. The
evaluation of semantic similarity relies on descriptions of object classes obtained
from ontologies. The MD model, however, does not take into account the
geometric and temporal properties of object classes (Rodriguez, 2000).
In the MD model, the semantic similarity between two object classes is evaluated
by a semantic distance (D). This semantic distance is a weighted sum of the
semantic distances obtained by comparing the parts (p), functions (f), and
attributes (a) of the two object classes:
\[ D(a, b) = \omega_p \cdot D_p(a, b) + \omega_f \cdot D_f(a, b) + \omega_a \cdot D_a(a, b) \qquad \text{(Equation 6)} \]
To measure the semantic distance, the MD model adapts the ratio model (Equation
2) (Tversky, 1977):
\[ D(a, b) = \frac{|A \cap B|}{|A \cap B| + \alpha(a, b) \cdot |A - B| + (1 - \alpha(a, b)) \cdot |B - A|} \qquad \text{(Equation 7)} \]
The factor α expresses the depth of the object classes relative to a common concept
within a hierarchical structure of generalization (IS_A) and aggregation
(Part/Whole) relations:
\[ \alpha(a, b) = \begin{cases} \dfrac{d(a, \mathrm{l.u.b.})}{d(a, b)} & \text{if } d(a, \mathrm{l.u.b.}) \le d(b, \mathrm{l.u.b.}) \\[2ex] 1 - \dfrac{d(a, \mathrm{l.u.b.})}{d(a, b)} & \text{if } d(a, \mathrm{l.u.b.}) > d(b, \mathrm{l.u.b.}) \end{cases} \qquad \text{(Equation 8)} \]
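Equations 6 to 8 can be combined in a short Python sketch. It is an illustration under explicit assumptions rather than Rodriguez's implementation: the function names are hypothetical, l.u.b. is read as the least upper bound (closest common superclass) of the two classes, d(a, b) is assumed to equal d(a, l.u.b.) + d(b, l.u.b.), the feature sets of route and rue are invented, and equal weights are used in place of the context-dependent weights ωi. Exact fractions are used so that the arithmetic can be checked by hand. Note that, with Equation 7, a value of 1 corresponds to identical feature sets.

```python
from fractions import Fraction

def alpha(d_a_lub, d_b_lub):
    """Equation 8, given the distances from a and b to their l.u.b.

    Assumption: d(a, b) = d(a, l.u.b.) + d(b, l.u.b.).
    """
    d_ab = d_a_lub + d_b_lub
    if d_ab == 0:
        return Fraction(1, 2)  # assumption: a and b are the same class
    ratio = Fraction(d_a_lub, d_ab)
    return ratio if d_a_lub <= d_b_lub else 1 - ratio

def ratio_measure(features_a, features_b, a):
    """Equation 7 applied to one kind of feature (parts, functions or attributes)."""
    common = len(features_a & features_b)
    denominator = (common
                   + a * len(features_a - features_b)
                   + (1 - a) * len(features_b - features_a))
    return Fraction(0) if denominator == 0 else Fraction(common) / denominator

def md_measure(desc_a, desc_b, weights, a):
    """Equation 6: weighted sum over the kinds of features listed in weights."""
    return sum(weights[k] * ratio_measure(desc_a[k], desc_b[k], a) for k in weights)

# Hypothetical class descriptions, for illustration only.
route = {"parts": {"chaussée", "accotement"},
         "functions": {"circuler"},
         "attributes": {"classification", "revêtement"}}
rue = {"parts": {"chaussée", "trottoir"},
       "functions": {"circuler"},
       "attributes": {"classification", "revêtement", "nom"}}

weights = {"parts": Fraction(1, 3), "functions": Fraction(1, 3), "attributes": Fraction(1, 3)}
a = alpha(1, 2)  # route is 1 link and rue 2 links below their common concept
print(a, md_measure(route, rue, weights, a))  # prints: 1/3 3/4
```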
In the MD model, context specifies the user's intention and, consequently, the
application domain. It is expressed by an operation, represented by a verb, that
applies to a set of object classes, each represented by a noun. For example, a user
may be interested in the object classes related to the context
C = circuler({automobile, vélo}), which characterizes the set of phenomena where
one can travel by car and by bicycle.
In the MD model, semantic similarity is evaluated both in terms of variability (i.e.
the distinguishing aspects) and in terms of commonality (i.e. the shared aspects)
between object classes. The difference between evaluating the semantic distance in
terms of variability and in terms of commonality is reflected in the way the weights
ωi (Equation 6) are calculated, a calculation in which context plays an important
role.
2.6 Discussion
The interoperability of geospatial data can be regarded as a communication process
between users and databases. Each user and each database has its own
representation of geospatial phenomena. For a user or a database, this
representation of phenomena constitutes its ontology. In a communication process,
we see an ontology as a knowledge base from which a source formulates the
messages it sends and from which a destination recognizes the messages it receives.
Semantic proximity is an operation on two representations of phenomena that
expresses their semantic similarity. In this thesis, we integrate semantic proximity
into the communication process. Semantic proximity is used by the source to
formulate messages from its own concepts. It is also used by the destination to
identify the concepts that give meaning to the message.
Several of the approaches mentioned in this chapter opt for a quantitative measure
to evaluate semantic proximity. Considering that human beings reason mostly in a
qualitative manner, we believe that a qualitative approach would be more
appropriate for expressing the semantic proximity of geospatial data. Moreover,
the qualitative approach seems better suited to the problem of heterogeneity in
databases. The semantic proximity approach developed in this thesis is based on
the topological model of interior and boundary to compare qualitatively the
context of a concept with that of a conceptual representation. It belongs to the
context-based approaches.
2.7 References
Barsalou, L W 1999 Perceptual symbol systems. Behavioral and Brain Sciences, 22(4):
577-609
Bédard, Y 1986 A Study of the Nature of Data Using a Communication-based
Conceptual Framework of Land Information. Ph.D. Dissertation, University of
Maine
Bédard, Y 1999a Principles of Spatial Database Analysis and Design. In P A Longley,
M F Goodchild, D J Maguire, and D W Rhind (eds) Geographical Information
Systems: Principles, Techniques, Applications and Management. New York, John
Wiley and Sons, Inc.: 413-424
Bédard, Y 1999b Visual Modelling of Spatial Database Towards Spatial PVL and UML.
Geomatica, 53(2): 169-186
Bishr, Y 1997 Semantic Aspects of Interoperable GIS. Ph.D. Dissertation, ITC
Publication
Brodeur, J, Y Bédard, et M J Proulx 2000 Modelling Geospatial Application Databases
using UML-based Repositories Aligned with International Standards in
Geomatics. In Proceedings of Eighth ACM Symposium on Advances in
Geographic Information Systems (ACMGIS) ACM Press: 39-46
Campbell, J 1982 Grammatical Man: Information, Entropy, Language, and Life. New
York, Simon and Schuster
Canadian Council on Surveying and Mapping 1984 National Standards for the Exchange
of Digital Topographic Data: Topographic Codes and Dictionary of Topographic
Features. Ottawa, Topographical Survey Division, Surveys and Mapping Branch,
Energy, Mines and Resources Canada
Casati, R, B Smith, et A C Varzi 1998 Ontological Tools for Geographic Representation.
In N Guarino (ed) Formal Ontology in Information Systems. Amsterdam, IOS
Press: 77-85
Charron, J 1995 Développement d’un processus de sélection des meilleures sources de
données cartographiques pour leur intégration à une base de données à référence
spatiale. Mémoire de maîtrise, Université Laval
Cherry, C 1978 On Human Communication: a Review, a Survey, and a Criticism.
Cambridge, Massachusetts, The MIT Press
Collongues, A, J Hugues, et B Laroche 1987 Merise - Méthode de conception. Paris,
Bordas
Daconta, M C, L J Obrst, et K T Smith 2003 The Semantic Web: A Guide to the Future of
XML, Web Services, and Knowledge Management. Indianapolis, Indiana, Wiley
Publishing, Inc.
Darnell, D K 1971 Information Theory. In J A DeVito (ed) Communication: Concepts
and Processes. Englewood Cliffs, New Jersey, Prentice-Hall Inc: 37-45
Eco, U 1988a Le signe. Bruxelles, Éditions Labor
Eco, U 1988b Sémiotique et philosophie du langage. France, Presses Universitaires de
France
Frankhauser, P, M Kracker, et E Neuhold 1991 Semantic vs. Structural Resemblance of
Classes. SIGMOD Record, 20(4): 59-63
Frankhauser, P, et E J Neuhold 1992 Knowledge Based Integration of Heterogeneous
Databases. In Proceedings of IFIP WG2.6 Database Semantics Conference on
Interoperable Database Systems (DS-5)/IFIP Transaction (A-25)
Elsevier Science Publishers B.V.: 155-175
Gal, A 1999 Semantic Interoperability in Information Services: Experience with
CoopWARE. Sigmod Record, 28(1): 8
Gruber, T R 1993 A Translation Approach to Portable Ontology Specification. Stanford,
California, Knowledge Systems Laboratory Technical Report KSL 92-71
Guarino, N 1998 Formal Ontology and Information Systems. In Proceedings of Formal
Ontology in Information Systems (FOIS '98). Amsterdam, IOS Press: 3-15
Guha, R V 1990 Micro-theories and contexts in Cyc. I. Basic issues. Austin, Texas,
Micro-electronics and Computer Technology Corporation Technical Report ACT-
CYC-129-90
Harrison, R P 1971 Other Ways of Packaging Information. In J A DeVito (ed)
Communication: Concepts and Processes. Englewood Cliffs, New Jersey,
Prentice-Hall Inc: 88-103
ISO/TC 211 2003a ISO 19107:2003 Geographic Information - Spatial Schema. Geneva,
Switzerland, International Organization for Standardization
ISO/TC 211 2003b WD 19136 Geographic Information - Geography markup language.
Geneva, Switzerland, International Organization for Standardization
Kahng, J, et D McLeod 1998 Dynamic Classificational Ontologies: Mediation of
Information Sharing in Cooperative Federated Database Systems, Context and
Ontologies. In M P Papazoglou, and G Schlageter (eds) Cooperative Information
Systems-Trends and Directions. San Diego, CA, Academic Press: 179-203
Kashyap, V, et A Sheth 1996 Semantic and Schematic Similarities Between Database
Objects: A Context-Based Approach. The VLDB Journal, 5: 276-304
Kashyap, V, et A Sheth 1998 Semantic Heterogeneity in Global Information Systems: the
Role of Metadata, Context and Ontologies. In M P Papazoglou, and G Schlageter
(eds) Cooperative Information Systems-Trends and Directions. San Diego, CA,
Academic Press: 139-178
Kosslyn, S M 1980 Image and Mind. Cambridge, Massachusetts, Harvard University Press
Kosslyn, S M 1981 The Medium and the Message in Mental Imagery: A Theory.
Psychological Review, 88(1): 46-66
Laakso, A, et G Cottrell 2000 Content and Cluster Analysis: Assessing Representational
Similarity in Neural Systems. Philosophical Psychology, 13(1): 47-76
Lehmann, F 1992 Semantic Networks. Computers and Mathematics with Applications,
23(2-5): 50
Mackay, D S 1999 Semantic Integration of Environmental Models for Application to
Global Information Systems and Decision Making. Sigmod Record, 28(1): 7
Meta Data Coalition 1999 Knowledge Management Model: Knowledge Description.
Milstead, J L 1998 NISO Z39.50: Standard for Structure and Organization of Information
Retrieval Thesauri. In Proceedings of Taxonomic Authority Files Workshop: 9
Open GIS Consortium Inc. 2001 Geography Markup Language (GML) 2.0. Wayland,
Massachusetts, Open GIS Consortium Inc.
Ouksel, A, et C Naiman 1993 Coordinating context building in heterogeneous
information systems. Journal of Intelligent Information Systems, 3: 151-183
Ouksel, A M, et A Sheth 1999 Semantic Interoperability in Global Information Systems:
A Brief Introduction to the Research Area and the Special Section. Sigmod
Record, 28(1): 5-12
Peuquet, D, B Smith, et B Brogaard 1998 The Ontology of Fields. In Proceedings of
Summer Assembly of the University Consortium for Geographic Information
Science
Pylyshyn, Z W 1981 The Imagery Debate: Analogue Media Versus Tacit Knowledge.
Psychological Review, 88(1): 16-45
Pylyshyn, Z W In Press Mental Imagery: In search of a theory. Behavioral and Brain
Sciences: 53
Ressources naturelles Canada 1996 Base nationale de données topographiques - normes
et spécifications. Sherbrooke, Québec, Centre d’information topographique –
Sherbrooke
Rips, L, J Shoben, et E Smith 1973 Semantic Distance and the Verification of Semantic
Relations. Journal of Verbal Learning and Verbal Behavior, 12: 1-20
Rodriguez, M A 2000 Assessing Semantic Similarity Among Entity Classes. Ph.D. Thesis,
University of Maine
Schramm, W 1971a How Communication Works. In J A DeVito (ed) Communication:
Concepts and Processes. Englewood Cliffs, New Jersey, Prentice-Hall Inc: 12-21
Schramm, W 1971b The Nature of Communication Between Humans. In W Schramm,
and D F Robert (eds) The Process and Effects of Mass Communication.
Champaign-Urbana, IL, University of Illinois Press: 3-53
Sciore, E, M Siegel, et A Rosenthal 1992 Context interchange using metaattributes. In
Proceedings of First International Conference on Information and Knowledge
Management (CIKM): 377-386
Shannon, C E 1948 A Mathematical Theory of Communication. The Bell System
Technical Journal, 27: 379-423, 623-656
Sheth, A 1999 Changing Focus on Interoperability in Information Systems: From
Systems, Syntax, Structure to Semantics. In M Goodchild, M Egenhofer,
R Fegeas, and C Kottman (eds) Interoperating Geographic Information Systems.
Boston, Massachusetts, Kluwer Academic Publisher: 5-29
Sheth, A, et V Kashyap 1992 So Far (Schematically) Yet So Near (Semantically). In
Proceedings of IFIP WG2.6 Database Semantics Conference on Interoperable
Database Systems (DS-5)/IFIP Transaction (A-25)
Elsevier Science Publishers B.V.: 283-312
Sheth, A P, et J A Larson 1990 Federated Database Systems for Managing Distributed,
Heterogeneous, and Autonomous Databases. ACM Computing Surveys, 22(3): 183-236
Simsion, G C 2001 Data Modeling Essentials - Analysis, Design, and Innovation.
Scottsdale, Arizona, Coriolis
Smith, B 1994 Fiat Objects. In Proceedings of Workshop on Parts and Wholes:
Conceptual Part-Whole Relations and Formal Mereology, 11th European
Conference on Artificial Intelligence: 15-23
Smith, B, et D Mark 1999 Ontology with Human Subjects Testing: An Empirical
Investigation of Geographic Categories. American Journal of Economics and
Sociology, 58(2): 245-272
Smith, B, et A C Varzi 2000 Fiat and Bona Fide Boundaries. Philosophy and
Phenomenological Research, 60(2): 401-420
Sowa, J F 1987 Semantic Networks. In S C Shapiro (ed) Encyclopedia of Artificial
Intelligence. New York, John Wiley & Sons
Spaccapietra, S, C Parent, et Y Dupont 1992 Model Independent Assertions for
Integration of Heterogeneous Schemas. VLDB Journal, 1(1): 81-126
Sycara, K, M Klusch, S Widoff, et J Lu 1999 Dynamic Service Matchmaking Among
Agents in Open Information Environments. Sigmod Record, 28(1): 47-53
Tari, Z 1992 Interoperability Between Database Models. In Proceedings of IFIP WG2.6
Database Semantics Conference on Interoperable Database Systems (DS-5)/IFIP
Transaction (A-25) Elsevier Science Publishers B.V.: 101-119
Tversky, A 1977 Features of similarity. Psychological Review, 84(4): 327-352
Wiener, N 1950 The Human Use of Human Beings: Cybernetics and Society. Boston,
Houghton Mifflin
CHAPTER 3
GEOSPATIAL DATA
INTEROPERABILITY:
PROPOSAL OF A CONCEPTUAL FRAMEWORK
Revisiting the Concept of Geospatial Data Interoperability
within the Scope of Human Communication Processes
(J. Brodeur, Y. Bédard, G. Edwards, et B. Moulin)
3.1 Summary of the article
Significant work carried out since the early 1990s has addressed the definition and
development of geospatial data interoperability. However, defining a conceptual
framework of geospatial data interoperability is a significant contribution toward
understanding what geospatial data interoperability is, recognizing the contribution
of each existing work, and stimulating new research on this problem.
The article that forms this chapter revisits the concept of interoperability from a
broader perspective by considering communication between human beings and
their cognitive functioning. Human communication appears to be a fertile
framework because human beings communicate interoperably more easily than
computers do. Consequently, this chapter proposes a conceptual framework of
interoperability that is broader than other existing frameworks and illustrates it
with concrete examples. A formal ontology of geospatial data interoperability
completes the description of the proposed conceptual framework. In this
framework, the notions of concept, context, proximity, and ontology appear
fundamental to developing our geosemantic proximity approach.
3.2 Abstract
Geospatial data interoperability has been the target of major efforts by standardization
bodies (e.g. OGC, ISO/TC 211) and the research community since the beginning of the
1990s. It is seen as a solution for sharing and integrating geospatial data, more
specifically to solve the syntactic, schematic, and semantic as well as the spatial and
temporal heterogeneities between various representations of real-world phenomena. A
few models have been proposed to automatically overcome heterogeneity of geospatial
data and, as a result, increase the interoperability of geospatial data. However, a
conceptual framework of geospatial data interoperability would contribute
to understanding geospatial data interoperability, clarify where existing
contributions specifically apply, and foster new contributions.
In this chapter, we revisit the concept of geospatial data interoperability within the
broader scope of human communication and cognition. Human communication appears
to be a rich framework since humans interoperate more easily than computers do.
Accordingly, we present a conceptual framework of geospatial data interoperability that
is broader in scope than existing frameworks and supported by several practical
examples. An ontology of geospatial data interoperability is also introduced in order to
refine the description of the conceptual framework. In such a communication-based
framework, the notions of concept, context, proximity, and ontology appear to be
fundamental elements. These elements constitute a new approach to geosemantic
proximity.
3.3 Introduction
For almost a decade, interoperability of geospatial data has been a prime concern in the
geospatial information community. Software developers, data producers, and users aim at
enabling the sharing and integration of geospatial data and geoprocessing resources
(Kottman, 1999). Organizations such as the OpenGIS Consortium Inc. (OGC) and
ISO/TC 211, as well as the research community, have pioneered the current
foundation for geospatial data interoperability. This community views interoperability as
a solution to problems arising from syntactic, structural, and semantic heterogeneities,
especially spatial and temporal heterogeneities, between data sources (Bishr, 1997;
Charron, 1995; Laurini, 1998; Ouksel and Sheth, 1999; Sheth, 1999).
Within the context of OGC, interoperability corresponds to “software components
operating reciprocally to overcome tedious batch conversion tasks, import/export
obstacles, and distributed resource access barriers imposed by heterogeneous processing
environments and heterogeneous data” (McKee and Buehler, 1998). Sondheim et al.
(1999) describe interoperability as a non-imposed, bottom-up approach where
heterogeneous systems, data models, and data sources deployed independently of one
another, can exchange data and handle queries (and other processing requests) as well as
make use of a common understanding of the data and requests.
Progress in geospatial interoperability is observed for syntactic heterogeneity (i.e. GIS
format translation and spatial data structure transformation, such as raster to vector
transformation) and geometric data-type definition. Progress is also observed for
structural heterogeneity, that is, differences in the internal organization of GIS
application data, the geodetic datum, map projections, and coordinate systems. Research
in geospatial interoperability, however, must go beyond geometry-related and database-
structure concerns to take into account semantics (Egenhofer, 1999; Rodriguez, 2000).
Reconciling both semantic and geometric heterogeneities between different geospatial
datasets describing the same phenomenon is deemed a major challenge. For example, it is
still a problem to reconcile representations such as wetlands, marshes/swamps, marshes,
swamps, marshes/fens, milieux humides, and marais (see Table 3) that are used to
describe the same phenomena.
The research presented in this thesis was motivated by the fact that users of
geospatial data increasingly have to deal with numerous data sources to meet
their specific needs. Examples of Canadian sources include the National Topographic
Data Base produced by the Department of Natural Resources Canada (Natural Resources
Canada, 1996); the Street Network Files, the Digital Boundary Files, and the Digital
Cartographic Files produced by Statistics Canada (Statistics Canada, 1997); the VMap
libraries produced for military purposes (VMap, 1995); the National Atlas of Canada
produced by the Department of Natural Resources Canada (Natural Resources Canada,
1996); and several provincial topographic data sources that are usually produced at
larger scales (see BC Ministry of Environment Lands and Parks (Geographic Data BC),
1992; New Brunswick, 2000; OBM, 1996; Québec, 2000). Typically, each data source
describes closely related topographical phenomena differently. See for instance
waterbody, lake, lake/pond, coastline, river/stream, and canal in Table 3 which lists
some examples of phenomena represented differently by different sources, both
geometrically and semantically. Consequently, the retrieval of geospatial data complying
with the user’s needs and, subsequently, the data’s integration into a coherent whole
remains a crucial challenge. This kind of interoperability problem could have been
addressed from different points of view (e.g. Harvey, 1997). Our perspective is strongly
influenced by the artificial intelligence and database modelling approaches to human
communication, negotiation, and ontologies, which are connected to but different from the
philosophical perspective of these topics.
Accordingly, the next section of this chapter reviews some fundamental notions of
communication, cognitive sciences, ontology, and database modelling that support the
proposed framework of interoperability presented in Section 3.5. In Section 3.6, we
depict the five phases and three levels of the ontology of geospatial data interoperability
Table 3: Examples of phenomena abstracted differently in independent topographical databases
Columns, in order: NTDB (1); VMap (2); BC Digital Baseline Mapping (BCDBM) (3); ON Digital Topographic Database (ONDTD) (4); BDTQ (5); Information sur les terres et les eaux pour la province du Nouveau-Brunswick (6)
- Waterbody - Watercourse - Irrigation Canal - Navigable Canal - Flooded area - Reservoir - Liquid depot/dump
- Lake/Pond - Lake subject to
inundation - River/stream - Coastline/shoreline
- Coastline - Ditch - Flooded land - Lake - River/stream
- Flooded land - Lake - River/stream
- Canal - Cours d’eau - Lac - Mare
- Canal - Rivière–trait double - Lac (?) - Littoral (?) - Lac de rivière (?)
- Wetland - Marsh/swamp - Marsh - Swamp
- Marsh/Fen - Milieu humide (végétation)
- Marais de canneberge (?) - Marais (?)
- Road - Limited access road
- Road - Car track
- Road - Accesway - Road
- Voie de communication
- Autoroute - Rue - Chemin - Route
- Artère (?) - Route collectrice (?) - Chemin local (?) - Chemin municipal (?) - Chemin d’accès aux
ressources naturelles (?) - Route en construction (?) - Rue (?)
- Vegetation - Trees - Orchard/plantation - Vineyard
- Wooded area - Vineyard - Orchard - Nursery
- Wooded area - Milieu boisé - Verger (aires
désignées)
- Clairière (?) - Bande défrichée (>100m) (?) - Pépinière (?) - Verger (?) - Rangée d’arbres (>100m) (?) - Zone boisée (>2m haut) (?)
- Railroad - Railroad - Railroad siding/railroad
spur
- RailLine - Rail line - Voie ferrée - Chemin de fer (?) - Triage de chemin de fer (?)
- Bridge - Obstacle to air
Navigation
- Bridge/overpass/viaduc (?) - Bridge - Trestle
- Bridge (roadway) - Bridge (railway) - Culvert (roadway) - Culvert (railway)
- Pont - Pont d’étagement
- Pont (?) - Ponceau (petit) (?)
Spatial pictogram descriptions: 0D; 1D; 2D; ?: unknown geometry; multiple geometry; alternate geometry (see Bédard, 1999b, and Brodeur et al., 2000, for more details).
Sources: 1 (Natural Resources Canada, 1996); 2 (VMap, 1995); 3 (BC Ministry of Environment Lands and Parks (Geographic Data BC), 1992); 4 (OBM, 1996); 5 (Québec, 2000); 6 (New Brunswick, 2000).
supported by the framework. In Section 3.7, we develop the mapping between a concept
and a conceptual representation. Section 3.8 introduces the approach we are working on
to assess semantic proximity between a geospatial concept and a geospatial conceptual
representation. This is called geosemantic proximity as it considers in a holistic way the
semantic, spatial, and temporal descriptions of geospatial concepts and geospatial
conceptual representations. Section 3.9 concludes this chapter and indicates future work.
3.4 Interoperability and the human communication process
The framework of geospatial data interoperability that will be discussed in the following
sections is supported by some theoretical notions used in the fields of human
communication and perception, ontology, data modelling, and, more specifically, by the
notions of context and semantic proximity found in artificial intelligence. This section
describes these concepts.
3.4.1 Communication process
The study of the different aspects of communication between all kinds of systems
originated in 1948 with Norbert Wiener’s ideas concerning cybernetics and the numerous
adaptations that followed, yielding new insights into human communication (Blake and
Haroldsen, 1975; Campbell, 1982; Sowa, 1984; Wiener, 1950). We believe that the
communication process between humans represents an ideal model of what
interoperability should be. It begins when an individual has something in mind
representing real-world phenomena and wants to communicate it to someone else.
The communication process is described as being composed of a source, a signal, a
communication channel, a destination, a possible source of noise, and feedback. In the
first stage of the communication process, the source has a representation of real-world
phenomena that corresponds, for humans, to their cognitive model (Bédard, 1986; Denes
and Pinson, 1971; Logie and Denis, 1991; Schramm, 1971b) and, for machines, to a part
of their physical memory. This model is developed through the direct observation
(detection and recognition of raw signals) of phenomena and from the observation of
preprocessed, intentional semantic signals from others (Bédard, 1986). The source selects
information to be communicated, transforms it into signals such as words (spoken or
written) or data, and organizes them into a message that is placed in the communication
channel (Campbell, 1982; Denes and Pinson, 1971). This is known as the encoding
process. It follows rules that are more or less formal depending on the context and the
nature of the encoder (human or machine). The resulting signals or data are physical
descriptions free of any intrinsic signification (Bédard, 1986; Campbell, 1982; Cherry,
1978; Schramm, 1971a). Signification is what signals evoke in the source and in the
destination, respectively, and consequently cannot be transmitted (Cherry, 1978). Signals
are the mediation component between the source and the destination, but their intended
meaning is not embedded within the signal. Once the destination has received the
message, the decoding process starts as it tries to understand the incoming signal, that is,
to find its intended meaning, and the communication process ends with the creation of the
destination’s evoked concepts (Bédard, 1986; Schramm, 1971a). The communication
process works properly when the destination’s evoked concepts are sufficiently
isomorphic to the source’s concepts, that is, when both represent the same real-world
phenomena. Feedback, or retroactive communication, may be used to improve
isomorphism. The notion of commonness (Schramm, 1971a) is fundamental to the
communication process: the source and destination must share a common
set of knowledge and signals for the process to work properly. The destination relies
on signals and referents (i.e. knowledge and beliefs of the world) to recognize the
message (Denes and Pinson, 1971; Krech and Crutchfield, 1971).
3.4.2 Perception and cognition
In the communication process, as well as in the case of interoperability, perception and
cognition play a leading role in building, structuring, and disseminating human
information. As mentioned previously, human communication begins when someone
wants to transmit information he or she has in mind to someone else. As such, cognitive models are
basic elements of human-to-human communication. They are built up from physical
signals captured through our sensory systems, which generate perceptual states
(Barsalou, 1999). Then, human selective attention extracts only subsets of interest
among these perceptual states (Barsalou, 1999; Krech and Crutchfield, 1971; Sears and
Freedman, 1971) and stores these permanently in memory as perceptual symbols
(Barsalou, 1999).
The literature essentially recognizes two modes of perceptual symbol representations.
The first corresponds to modal or analogical representations (Barsalou, 1999; Kettani and
Moulin, 1999; Kosslyn, 1980) and is isomorphic to the perceptual state such as an image
captured by the visual sensory system. It consists of the reproduction or conversion of
raw signals into memory. The second mode corresponds to amodal or propositional
representations (Barsalou, 1999; Kettani and Moulin, 1999) and refers to tacit knowledge
(Pylyshyn, 1981). Inspired by logics, statistics, mathematics, and computer science, it
corresponds to structures such as feature lists, frames, schemata, and semantic nets
(Barsalou, 1999; Lehmann, 1992). Moreover, Barsalou (1999) brought a new definition
of a perceptual symbol as a record of neural activation resulting from the perception
process where the neural system, which is common to imagery and perception, underlies
the conceptual knowledge. Perceptual symbols are more likely qualitative and functional,
and are not stored independently from others in memory (Barsalou, 1999; Krech and
Crutchfield, 1971).
In Barsalou’s theory, a perceptual symbol corresponds to a concept and behaves like a
simulator that generates conceptual representations (i.e. simulation of the concept). This
notion of simulator is similar to a kind of dynamic translator generating translations of
the concept on the fly for a specific use. Concepts are made of cognitive elements, which
are not directly accessible, and a translator function that encapsulates these elements. The
translator function reproduces these cognitive elements in the context of data processes.
A concept can only be communicated via selected data elements translated into physical
signals, which are conceptual representations. An extensive literature in semiology
defines rules about the best ways to create conceptual representations. For
instance, a concept corresponding to water area can be translated into a number of
conceptual representations such as waterbody, coastline, lake, and river/stream
represented by either surfaces or lines in different colors, shading and line styles (all
constituting possible simulations of water area). The concept through its translator
function can also recognize conceptual representations associated with the concept
(Barsalou, 1999). This is carried out by matching a simulation of the concept (i.e. a
generated conceptual representation) with the incoming conceptual representation; when
the matching fails, a new concept is instantiated.
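The behaviour of a concept as a simulator can be sketched in a few lines of code. This is only a minimal illustration and not Barsalou's formalism: the class name `Concept`, the methods `simulate` and `recognize`, the helper `decode`, and the use of exact string matching are assumptions made for the example. The water area representations are those named in the text.

```python
class Concept:
    """A concept hides its cognitive elements behind a translator function."""

    def __init__(self, name, representations):
        self.name = name
        # Hidden cognitive elements, reduced here (as an assumption) to the
        # set of conceptual representations the concept is able to simulate.
        self._representations = {r.lower() for r in representations}

    def simulate(self):
        """Generate the conceptual representations (simulations) of the concept."""
        return sorted(self._representations)

    def recognize(self, incoming):
        """Match an incoming conceptual representation against the simulations."""
        return incoming.lower() in self._representations


def decode(concepts, incoming):
    """Return the concept that recognizes `incoming`.

    When the matching fails for every known concept, a new concept is
    instantiated and added to the list, as described in the text.
    """
    for concept in concepts:
        if concept.recognize(incoming):
            return concept
    new_concept = Concept(incoming.lower(), [incoming])
    concepts.append(new_concept)
    return new_concept


concepts = [Concept("water area", ["waterbody", "coastline", "lake", "river/stream"])]
print(decode(concepts, "Lake").name)    # water area
print(decode(concepts, "Bridge").name)  # bridge (new concept instantiated)
print(len(concepts))                    # 2
```

In a realistic setting the exact match in `recognize` would be replaced by a graded comparison, which is the role given to geosemantic proximity in Section 3.8.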
3.4.3 Ontology and conceptual modelling for database development
The communication process, as a proper model to depict interoperability, involves
real-world phenomena along with their descriptions: the different human cognitive models
and physical models such as signals. Real-world phenomena, their identification, and
description have been studied within the realms of ontology and conceptual modelling for
database development.
In its philosophical meaning, ontology stands for the description of the world in itself
(Peuquet et al., 1998); a model and an abstract theory of the world (Smith and Mark,
1999); and, the science of being, of the type of entities, of properties, of categories, and
of relationships that compose reality (Bittner and Edwards, 2001; Lehmann, 1992;
Peuquet et al., 1998; Smith and Mark, 1999). It is described from two perspectives. The
first, called formal ontology, refers to shared structures between scientific domains such
as identity, plurality, and unity. The second, called material ontology, relates to the
conditions that are necessary to belong to an entity type within a given domain (Peuquet
et al., 1998). In artificial intelligence (AI), Gruber (1993a; 1993b) defines an
ontology as “an explicit specification of a conceptualisation” and Guarino (1998) as “a
logical theory accounting for the intended meaning of a formal vocabulary.” AI
definitions of ontology and the material ontology in philosophy tend to follow a similar
objective. As shown in Table 3, there may be multiple ways of describing a single
conceptualization. This is particularly reflected in Gruber’s definition of ontology, which
admits that each explicit description (i.e. specification, vocabulary) consists of one
specific ontology. Guarino’s definition goes further by considering the ontological
relationship (i.e. the “intended meaning”) that exists between a description (i.e. the
vocabulary) and the concept it evokes. Consequently, we can admit a relationship
between the philosophical and the AI notions of ontology only if we consider that
“conceptualization” in the AI context corresponds to the philosophical definition of
ontology (Rodriguez, 2000). We can also accept that descriptions from multiple
ontologies (as in AI) can ontologically refer to the same concept (or phenomenon). While
there is an obvious connection between the philosophical and the AI definition of
ontology, the latter is considered as the main orientation of this thesis. As such, we refer
to ontology as a formal representation of phenomena with an underlying vocabulary
including definitions and axioms that make the intended meaning explicit and describe
phenomena and their interrelationships.
As applied to databases, a conceptual model is a simplified, abstract representation of a
portion of reality resulting from a data-centered analysis of users’ interests (Bédard,
1999a; Simsion, 2001). It results from reflective thinking to better understand that part of
reality and to communicate information about it (Collongues et al., 1987; Simsion, 2001).
Bédard (1999b) mentions that conceptual models serve as tools for thinking,
communication, development, and documentation. Conceptual models retain, organize,
and store only features of interest in terms of general categories, object classes,
properties, relationships, generalizations, aggregations, roles, constraints, behaviour,
geometry, temporality, and so on, either in a lexical (e.g., EXPRESS, Backus-Naur
formalism) or graphical formalism (e.g., entity relationship formalism, UML) (Brodeur et
al., 2000). Ideally a data dictionary defining the semantics of each schema component is
included in a conceptual model, which makes the intended meaning of the modelled
feature explicit. In a conceptual model, objects must be unique in the context of the
database and, as such, characterized by only one combination of properties and
relationships (Collongues et al., 1987; Simsion, 2001). Because of the specific
perspective for which a data model is elaborated or because of the experience of the data
modeler, there is usually more than one data model to express the same part of reality
(Collongues et al., 1987; Simsion, 2001). The problem emerging from the existence of
different conceptual models is to establish the relationship between models describing the
same set of phenomena. Based on the definitions of data models and ontology, we see
ontology as a theoretic layer underlying conceptual modelling providing the means to
link classes and instances depicting the same part of reality differently but in similar
ways. This linkage of classes and instances is made possible through the analysis of
intrinsic and extrinsic properties (see Section 3.8) that give them their “identity.”
3.4.4 Context
The situation in which a real-world phenomenon is perceived, abstracted, and used
governs its description. This context affects the definition of concepts and conceptual
representations. Context is widely recognized as a fundamental notion in semantic
interoperability. It provides concepts and conceptual representations with real-world
semantics (Kashyap and Sheth, 1996; Ouksel and Sheth, 1999; Wisse, 2000). As reported
in (Kashyap and Sheth, 1996), however, context has been associated with various other
ideas such as knowledge for reasoning about other systems (Ouksel and Naiman, 1993);
signification, content, organization, and properties about data (Sciore et al., 1992); a
characteristic associated with a partition of an ontology (Guha, 1990); and membership in
a database, relationship, export schema, or internal schema (Sheth and Kashyap, 1992).
Moreover, Frank’s five tiers of ontology (Frank, 2001) introduce context as a main
component of the social ontology. Context has influence at the conceptual level as well as
at the implementation level. On the one hand, context drives how phenomena are
perceived and abstracted, resulting in different object classes, properties, geometries,
temporalities, relationships, and so on. On the other hand, it also acts at the
implementation level, for instance via specific data capture specifications such as “rapids
depicted on a map by three points less than 100 metres apart and stretching over a
distance of more than 100 metres are consolidated into a line.” Following a conceptual
orientation, context is in our approach associated with the manner in which an individual abstracts
real-world phenomena; the description of these phenomena is organized into intrinsic and
extrinsic properties of the corresponding concepts and conceptual representations.
Intrinsic properties refer to the literal meaning while extrinsic properties refer to the
dependencies with other concepts or conceptual representations (Guarino and Welty,
2000a). For example, the intrinsic properties of a road concept can be described by its
classification, surface type, status, and geometry while extrinsic properties can be the
relationships the road has with concepts such as built-up areas, bridges, dams, fords,
tunnels, and the behaviour of the road in different situations. These together describe the
context of the phenomenon. An important challenge in semantic interoperability of
geospatial data resides in increasing context-based reasoning capabilities.
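As an illustration, the road example above can be written as a small data structure that keeps intrinsic and extrinsic properties apart. This is a sketch under stated assumptions: the class `ConceptDescription`, the method `shared_intrinsic`, the `street` description, and all property values are hypothetical; only the property names and the related concepts of the road come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptDescription:
    """Context of a concept, split into intrinsic and extrinsic properties."""
    name: str
    intrinsic: dict = field(default_factory=dict)  # literal meaning
    extrinsic: dict = field(default_factory=dict)  # dependencies on other concepts

    def shared_intrinsic(self, other):
        """Names of the intrinsic properties that both descriptions define."""
        return sorted(set(self.intrinsic) & set(other.intrinsic))


road = ConceptDescription(
    name="road",
    intrinsic={
        "classification": "highway",  # hypothetical value
        "surface type": "paved",      # hypothetical value
        "status": "operational",      # hypothetical value
        "geometry": "line",           # hypothetical value
    },
    extrinsic={
        "related concepts": ["built-up area", "bridge", "dam", "ford", "tunnel"],
    },
)

# A second, hypothetical description used only to show a comparison.
street = ConceptDescription(
    name="street",
    intrinsic={"classification": "local", "geometry": "line"},
)

print(road.shared_intrinsic(street))  # ['classification', 'geometry']
```

Keeping the two groups of properties separate makes it possible to compare literal meaning and dependencies independently when two contexts are matched.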
3.4.5 Semantic proximity
In interoperability, as in the communication process, an incoming conceptual
representation must be recognized and given a specific meaning. This implies matching
the conceptual representation with a concept in the destination’s cognitive model. The
matching operation analyzes the context of both the conceptual representation and that of
the concept to retrieve commonalities between them. This issue has been studied in AI
through the notion of semantic proximity, which expresses the similarity between
conceptual representations such as in semantic networks (Cohen, 1982; Lehmann, 1992),
knowledge networks (Fankhauser et al., 1991; Fankhauser and Neuhold, 1992), and
context-based approaches (Kashyap and Sheth, 1996; Kashyap and Sheth, 1998).
Generally speaking, a semantic network is an interconnected node–arc-like structure such
as a frame system, a relational graph, or a hierarchy (e.g., lattice, tree, or acyclic graph) in
which nodes represent conceptual representations and arcs the relationships that exist
between such representations (Cohen, 1982; Lehmann, 1992). Semantic proximity deals
with the semantic relatedness between two different conceptual representations. The
closest conceptual representation to a given one is typically the one having the smallest
conceptual distance, i.e. the shortest path to a given conceptual representation in a
representational space (Rodriguez, 2000). Fankhauser et al. (1991) and Fankhauser and
Neuhold (1992) implement knowledge networks in which conceptual representations
(nodes) are linked to others by associations of the type generalization/specialization,
negative association, or positive association. A coefficient expressing the strength
between the two conceptual representations is assigned to each association. For two
non-associated conceptual representations, a relationship and strength are inferred by
traversing the network from one conceptual representation to the other, analyzing the
nature and the strength of each relationship. In (Kashyap and Sheth, 1996; Kashyap and
Sheth, 1998), the semantic proximity between two conceptual representations
corresponds to a comparison modulated by (1) the context of the comparison; (2) the
abstraction or mapping used to relate their respective domains; (3) the domains
(i.e. the possible values); and (4) the state vectors (i.e. the values they hold at a given
time). Predicates such as semantic resemblance, semantic relevance, semantic relation,
semantic equivalence, and semantic incompatibility are used to express qualitatively the
semantic proximity that exists between two conceptual representations.
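The idea of conceptual distance as the shortest path in a semantic network can be sketched as follows. This is a minimal illustration only: the small hydrographic hierarchy, the function names `conceptual_distance` and `closest`, and the choice of counting every arc as one unit are assumptions for the example, not part of the cited works.

```python
from collections import deque

# Hypothetical semantic network: nodes are conceptual representations,
# arcs are (undirected) relationships between them.
EDGES = [
    ("hydrography", "waterbody"),
    ("hydrography", "watercourse"),
    ("waterbody", "lake"),
    ("watercourse", "river"),
    ("watercourse", "stream"),
]

network = {}
for a, b in EDGES:
    network.setdefault(a, set()).add(b)
    network.setdefault(b, set()).add(a)


def conceptual_distance(network, start, goal):
    """Number of arcs on the shortest path between two nodes (None if unconnected)."""
    if start == goal:
        return 0
    visited = {start}
    queue = deque([(start, 0)])
    while queue:  # breadth-first search guarantees the shortest path
        node, dist = queue.popleft()
        for neighbour in network.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def closest(network, target, candidates):
    """Candidate with the smallest conceptual distance to `target`."""
    scored = [(conceptual_distance(network, target, c), c) for c in candidates]
    scored = [(d, c) for d, c in scored if d is not None]
    return min(scored)[1] if scored else None


print(conceptual_distance(network, "river", "stream"))  # 2
print(conceptual_distance(network, "river", "lake"))    # 4
print(closest(network, "river", ["lake", "stream"]))    # stream
```

A knowledge network in the sense described above would additionally carry an association type and a strength coefficient on each arc; the sketch ignores both.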
3.5 A conceptual framework of geospatial data interoperability
A few attempts have been made to automatically overcome semantic heterogeneity and
increase the interoperability of geospatial data, notably the Semantic Formal Data
Structure (Bishr, 1997), the Matching Distance model (Rodriguez, 2000), and the Isis
approach (Benslimane, 2001). In (Bishr, 1997), interoperability is defined as “the ability
of a system or components of a system to provide information sharing and
inter-application co-operative process control.” Accordingly, the Semantic Formal Data
Structure was elaborated to reconcile heterogeneous representations of a single concept
by combining loosely coupled federated schemata and a Proxy Context Mediator. In this
approach, users formulate queries in their own vocabulary and submit them to the
Context Mediator. Subsequently, the Context Mediator translates these queries according
to a shared context definition (i.e. a federated schema) and passes the queries to the data
source export schema to retrieve the data complying with the user’s queries.
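The mediation step described above can be pictured as a two-stage translation. The following sketch is only an illustration of that idea and does not reproduce the Semantic Formal Data Structure: the function `mediate`, the two mapping tables, and the export schema names `HYDRO_AREA` and `HYDRO_LINE` are hypothetical.

```python
# Hypothetical mappings; none of these names come from Bishr (1997).
USER_TO_SHARED = {"lake": "waterbody", "river": "watercourse"}
SHARED_TO_EXPORT = {"waterbody": "HYDRO_AREA", "watercourse": "HYDRO_LINE"}


def mediate(query_terms, user_to_shared, shared_to_export):
    """Translate user terms to export-schema terms through the shared context.

    Terms without a translation are returned separately so that the caller
    can report them instead of silently dropping them.
    """
    translated, unresolved = [], []
    for term in query_terms:
        shared = user_to_shared.get(term)      # user vocabulary -> shared context
        export = shared_to_export.get(shared)  # shared context -> export schema
        if export is None:
            unresolved.append(term)
        else:
            translated.append(export)
    return translated, unresolved


print(mediate(["lake", "river", "marsh"], USER_TO_SHARED, SHARED_TO_EXPORT))
# (['HYDRO_AREA', 'HYDRO_LINE'], ['marsh'])
```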
Rodriguez (2000) defines the semantic interoperability problem as the “identification of
semantically similar objects belonging to different databases and the resolution of their
schematic differences” (i.e. differences between the database schemata). She proposes the
Matching Distance model to evaluate the semantic similarity between object classes of
geospatial features. The variability and resemblance of class functions, parts, and
attributes are analyzed to compute a quantitative measure of similarity between object
classes. This model makes use of an acyclic graph with is-a and part/whole relationships,
and ontologies in which class definitions are given.
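To give an idea of how such a quantitative measure can be built, the sketch below compares two object classes feature set by feature set. The text above does not give the formula, so the example assumes a weighted sum of Tversky-style set ratios over functions, parts, and attributes; the equal weights, the value of `alpha`, and the `lake` and `river` feature sets are all hypothetical.

```python
def set_similarity(a, b, alpha=0.5):
    """Ratio of common features to common plus distinguishing features."""
    a, b = set(a), set(b)
    common = len(a & b)
    denominator = common + alpha * len(a - b) + (1 - alpha) * len(b - a)
    return common / denominator if denominator else 0.0


def class_similarity(c1, c2, weights=None):
    """Weighted sum of the similarities of functions, parts, and attributes."""
    weights = weights or {"functions": 1 / 3, "parts": 1 / 3, "attributes": 1 / 3}
    return sum(
        w * set_similarity(c1.get(kind, ()), c2.get(kind, ()))
        for kind, w in weights.items()
    )


# Hypothetical class definitions.
lake = {
    "functions": {"navigation", "recreation"},
    "parts": {"shore", "water"},
    "attributes": {"name", "area"},
}
river = {
    "functions": {"navigation", "recreation"},
    "parts": {"bank", "water"},
    "attributes": {"name", "length"},
}

print(round(class_similarity(lake, river), 3))  # 0.667
```

With `alpha` different from 0.5 the measure becomes asymmetric, which allows the similarity of a class to its superclass to differ from the reverse comparison.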
In the Isis approach (Benslimane, 2001), interoperability is defined as an operation in
which (1) clients and data sources adopt a common representation model, (2) clients and
data sources share a mutual understanding of common features, and (3) a process can
dynamically transform one object representation into another, adapting the semantics and
structure to the user’s needs. Based on this definition, the Isis solution was developed
following a context-based mediation orientation. In this solution, each independent
database is associated with a context co-operation schema. A context co-operation
schema results from the interpretation of a database export schema with a unique
reference context, which is derived from a domain ontology. Accordingly, data from one
database can be translated into another through their respective context
co-operation schemata.
In the light of these somewhat different approaches, we believe that a more global
framework would enhance understanding of geospatial data interoperability (Egenhofer,
1999) and provide a theoretical foundation to better appreciate each contribution and
foster new ones. In this section, we revisit the notion of interoperability within the
broader scope of human communication and cognition. People interact using different
representations of observable phenomena but regularly end up understanding each other.
We believe that interoperability consists in a process similar to human communication
(Bédard, 1986; Darnell, 1971; Denes and Pinson, 1971; Lippmann, 1971; Schramm,
1971a; Schramm, 1971b) in which independent systems automatically manipulate,
exchange, and integrate data coming from each other. This assumption motivated the
creation of the conceptual framework of geospatial data interoperability, which is
proposed hereafter, and of the concept of geosemantic proximity introduced in section
3.8.
First, let us assume the following situation. An individual, hereafter called a user agent
(Au), is looking for information about the hydrographic network in the area of the City of
Sherbrooke. He or she launches a query on a search engine with the keywords lake, river,
and Sherbrooke targeting a geospatial database, hereafter called the data provider agent
(Adp). Adp receives and interprets the request, searches for related information and,
referring to the content it is aware of, sends a response to Au. In other words, Adp provides
the main watercourses and waterbodies in the vicinity of Sherbrooke (for instance
Lac des Nations, Magog River, and Saint-François River). These elements
correspond exactly to Au’s query.
This situation illustrates what interoperability should be between two agents. In this case,
interoperability is associated with an interpersonal communication (Blake and Haroldsen,
1975) (i.e. a dialogue-like communication) between two agents, each of them using its
own vocabulary to express abstractions of real-world phenomena. As long as the two
agents have a common background and a common set of symbols, they regularly end up
understanding each other (Bédard, 1986; Schramm, 1971a).
Let us review this situation described as a communication process between the two
agents. Figure 12 depicts in greater detail the interaction between them. First, there is the
topographic reality as it exists at a given time and about which Au is looking for
hydrographic information (represented by R in the model).
Second, Au’s cognitive model of R is built from observed signals and his or her frame of
reference, that is, the set of rules and knowledge he or she uses to abstract phenomena. Au’s
cognitive model consists of properties that are judged significant. These properties are
joined together and structured within concepts. A concept is a simplified version of a
real-world phenomenon, or of part of it; the concept itself does not exist in reality and is entirely fictional
(Sowa, 1984). It is an abstract notion that denotes the “picture” an agent has in mind
(Bédard, 1986; Denis, 1994; Kettani and Moulin, 1999; Lippmann, 1971; Logie and
Denis, 1991; Schramm, 1971a). All concepts that Au has in mind constitute his or her
representation of reality, that is, his or her cognitive model, which is identified by R’ in
the framework.
As they are only abstractions, concepts cannot be communicated directly between agents
and, as such, they must be transformed into physical representations. This is known in the
communication process as the encoding operation. In this operation, only properties that
adequately translate a concept in a given situation are selected. These are then transferred
into signals of different types (words, abbreviations, punctuation, symbols, pictograms,
etc.), which are aligned in a specific order according to a set of rules (i.e. a grammar) to
build a conceptual representation. The encoded conceptual representation becomes the
physical component depicting partly or wholly the concept to which it refers. This
conceptual representation forms the third expression of the reality. It is illustrated by
Lake, River, and Sherbrooke in Figure 12 and is identified by R’’. It designates the data
transmitted and used for interoperability. Conceptual representations are released on the
communication channel free of any of Au’s intended meaning, and the message
representing Au’s request, “(Lake or River) in Sherbrooke?” (see Figure 12), travels to
its destination.
At the destination, Adp initiates the decoding operation, which aims at recognizing the
received conceptual representations and at assigning them appropriate meaning. In our
framework, this task is assigned to concepts (this will be discussed in more detail in
Section 3.7). Under perfect conditions, conceptual representations will induce concepts in
Adp that are isomorphic to Au’s concepts. However, in most situations, conceptual
representations will induce in Adp concepts of similar meaning to Au’s concepts. The set
of Adp’s concepts constitutes the fourth expression of reality. It is denoted by R’’’ and
illustrated by Waterbody, Watercourse, and Sherbrooke in the theoretical model
(Figure 12).
When conceptual representations have been recognized, Adp initiates the retrieval of
information complying with Au’s interests. However, as is the case in R’, concepts and
even tokens (i.e. instances of concepts) matching Au’s request cannot be transmitted
directly. As in Au’s cognitive model, concepts consist of internal representations
that are hidden from external agents. Consequently, concepts and tokens must be encoded
into conceptual representations and placed in a reply message on the communication
channel to reach Au. These encoded conceptual representations constitute the fifth
representation of the reality. It is denoted by R’’’’ and illustrated by Lac des Nations,
Magog River, and Saint-François River in Figure 12.
Once the reply reaches its destination, Au starts the decoding operation in order to
recognize the incoming message; he or she analyzes the conceptual representations
included in the message to assess whether they induce the original concepts in R’. If so, we say
that interoperability occurred during the interaction between the two agents. This means
that interoperability is not simply a one-directional communication process, but it
consists in a bi-directional process as clearly demonstrated in the conceptual framework.
It includes feedback in both directions which ensures that messages issued by Au and Adp
have reached their destination and have been understood properly. We believe that this
issue of bi-directional communication process takes on a fundamental character in the
search for a solution to semantic interoperability of geospatial data.
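The round trip between Au and Adp described in this section can be traced in a short sketch. It is an illustration under strong simplifying assumptions and not the prototype of this thesis: the function names, the recognition table of Adp, the naming rules used by Au to decode the reply, and the content of the database are hypothetical, and the Sherbrooke location constraint is left out.

```python
# Hypothetical knowledge of the data provider agent (Adp).
ADP_RECOGNITION = {"Lake": "Waterbody", "River": "Watercourse"}
ADP_DATABASE = {
    "Lac des Nations": "Waterbody",
    "Magog River": "Watercourse",
    "Saint-François River": "Watercourse",
}


def au_encode(concepts):
    """R' -> R'': Au encodes its concepts into signals (capitalized words)."""
    return [c.capitalize() for c in sorted(concepts)]


def adp_decode(message):
    """R'' -> R''': Adp assigns its own concepts to the signals it recognizes."""
    return {ADP_RECOGNITION[s] for s in message if s in ADP_RECOGNITION}


def adp_encode(concepts):
    """R''' -> R'''': Adp encodes the matching tokens into a reply message."""
    return sorted(name for name, c in ADP_DATABASE.items() if c in concepts)


def au_decode(reply):
    """R'''' -> R': Au assigns its own concepts to the names it receives."""
    evoked = set()
    for name in reply:
        if name.startswith(("Lac ", "Lake ")):  # assumed naming rule
            evoked.add("lake")
        elif name.endswith(" River"):           # assumed naming rule
            evoked.add("river")
    return evoked


r1 = {"lake", "river"}  # R': Au's concepts
r2 = au_encode(r1)      # R'': ['Lake', 'River']
r3 = adp_decode(r2)     # R''': {'Waterbody', 'Watercourse'}
r4 = adp_encode(r3)     # R'''': names of matching tokens
interoperable = au_decode(r4) == r1

print(r4)
print(interoperable)    # True
```

Only signals (`r2` and `r4`) cross the communication channel; each agent assigns meaning with its own knowledge, and interoperability is declared when the reply evokes the concepts Au started with.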
3.6 Ontology of geospatial data interoperability
As introduced in Section 3.4, ontology refers to a formal and accepted representation of
phenomena with an underlying vocabulary, including definitions that make the intended
meaning explicit. In the framework presented above, we have demonstrated that reality
takes various configurations beginning with the reality itself, human beings’ cognitive
representations (or their physical counterpart in machines), and physical representations.
The proposed conceptual framework introduces five different representations of the same
reality (R, R’, R’’, R’’’, and R’’’’). Each of these representations is a distinct facet of the
reality that occurs in the proposed framework of geospatial data interoperability. Frank
(2001) has already introduced a subdivision of ontology called the five tiers of ontology,
in which a distinction is made between the different abstraction levels that an
agent has to deal with when building its cognitive model, namely: physical reality,
observation of physical world, objects with properties, social reality, and subjective
knowledge. Here, we introduce a different and complementary subdivision of ontology
for the purpose of geospatial data interoperability composed of the five ontological
phases of geospatial data interoperability and the three levels of ontology presented
hereafter. These new subdivisions will allow us to view interoperability from a different
angle.
Figure 12: A conceptual framework of geospatial data interoperability
(adapted from the communication process of geographic information systems in (Bédard, 1986))
3.6.1 The five ontological phases of geospatial data interoperability
The five ontological phases of geospatial data interoperability consist of the five
different facets of reality that appear in the framework of geospatial data interoperability.
The first ontological phase consists of the reality itself (R), which is beyond description.
Each phenomenon has its own identity that makes it distinguishable from the others.
The second ontological phase is Au’s cognitive model of the reality, R’. It gathers all the
concepts that take place in Au’s memory. This phase results from direct observations (i.e.
through our sensorimotor system) and indirect observations (i.e. mechanical sensors, information
captured by other agents) of reality. It is a partial description of reality and can be viewed
as a subset of R corresponding to Au’s affordances (Gibson, 1979). This ontology is Au’s
internal representation of reality.
The third ontological phase is the set of conceptual representations R’’ (objects and
object classes) that are used to signify concepts of Au’s ontology. This ontological phase
uses a vocabulary, which accurately specifies the intended meaning attributed to the
different concepts. Each conceptual representation describes a concept within a specific
context. Therefore, more than one conceptual representation can refer to a given concept,
for example vegetation, tree, and wooded area (see Table 3 for other examples). It is also
possible that one conceptual representation refers to different concepts, depending on the
context in which they are used. This refers to the notion of polysemy. For example,
Bridge may refer to a road infrastructure, a hazard to air navigation, or even a hazard to
marine navigation.
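The many-to-many relationship between conceptual representations and concepts can be made concrete with a small lookup table keyed by representation and context. This is a sketch for illustration only: the context labels, the concept name "vegetation cover", and the three helper functions are hypothetical, while the representations and the meanings of Bridge come from the text.

```python
# (conceptual representation, context) -> concept.
# Context labels and the concept name "vegetation cover" are hypothetical.
MEANINGS = {
    ("bridge", "road network"): "road infrastructure",
    ("bridge", "air navigation"): "hazard to air navigation",
    ("bridge", "marine navigation"): "hazard to marine navigation",
    ("vegetation", "land cover"): "vegetation cover",
    ("tree", "land cover"): "vegetation cover",
    ("wooded area", "land cover"): "vegetation cover",
}


def concepts_for(representation):
    """All concepts a representation may refer to (more than one: polysemy)."""
    return {c for (r, _), c in MEANINGS.items() if r == representation}


def representations_for(concept):
    """All representations that can signify a given concept."""
    return {r for (r, _), c in MEANINGS.items() if c == concept}


def resolve(representation, context):
    """Concept intended in a given context, or None when it is unknown."""
    return MEANINGS.get((representation, context))


print(len(concepts_for("bridge")))                 # 3
print(sorted(representations_for("vegetation cover")))
print(resolve("bridge", "air navigation"))         # hazard to air navigation
```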
The fourth ontological phase consists of the set of Adp’s concepts and refers to the
database’s internal representation of reality R’’’. In the theoretical framework, database
agents such as Adp behave as cognitive agents in which descriptions of concepts are
internal representations of real-world phenomena that serve as interlingua in the
interaction with other agents. Database concepts also include functions that can afford
reasoning capabilities such as production and recognition of conceptual representations
(this will be discussed in Section 3.7).
The fifth and last ontological phase consists of the conceptual representations R’’’’ that
Adp’s concepts can produce. Like R’’, this ontological phase consists of physical
representations which use a vocabulary that aims to deliver the intended meaning of Adp’s
concepts (R’’’).
In the five ontological phases of geospatial data interoperability, we consider that each
ontological phase includes a set of properties describing the identity of phenomena. This
set of properties allows the binding of the different representations (R’ to R’’’’) with the
phenomena. Moreover, it is our interpretation that the five tiers of ontology (Frank, 2001)
deal more with the steps involved in cognition, which apply to R’ and R’’’ specifically.
In this regard, the five ontological phases of geospatial data interoperability deal more
with reality and its different representations that occur in the interaction between two
agents. Consequently, we feel that Frank's five tiers of ontology and our five ontological
phases of geospatial data interoperability are complementary.
3.6.2 Levels of ontology
Reality is usually abstracted and described in more or less detail depending on the
accuracy needed in a given situation. Accordingly, the meaning of concepts and
conceptual representations is described from more general to more specialized when used
in a global context, within a scientific community or within a specific application,
respectively. In the literature, authors refer typically to global ontology (Bergamaschi et
al., 1999; Guarino, 1998; Kahng and McLeod, 1998; Kashyap and Sheth, 1996; Kashyap
and Sheth, 1998; Sheth, 1999; Smith, 1999), domain ontology (Fowler et al., 1999;
Guarino, 1998; Kashyap and Sheth, 1996; Kashyap and Sheth, 1998; Sheth, 1999; Smith,
1999), and application ontology (Guarino, 1998; Kahng and McLeod, 1998; Kashyap and
Sheth, 1998; Smith, 1999; Sycara et al., 1999). These levels of ontology (Figure 13) are
characterized by a different granularity in the abstraction and description of phenomena.
A comparison can be made with a dictionary describing a set of generic terms, a lexicon
that is a brief dictionary specialized in a given science or technique, and a glossary that
appears at the end of a book defining the specific meaning of terms used in the book. At
the coarsest level of granularity, global ontology compiles concepts or conceptual
representations of a high and generic level of abstraction, independent of any specific
domain. Examples of such ontologies are WordNet
(http://www.cogsci.princeton.edu/~wn/), CYC (http://www.cyc.com/), TERMIUM Plus
(http://termiumplus.bureaudelatraduction.gc.ca), and Le grand dictionnaire
terminologique (http://www.granddictionnaire.com). At the middle level of granularity,
domain ontology makes an inventory of concepts or conceptual representations which are
accepted and shared within an information community. An example of this level of
ontology is the National Standards for the Exchange of Digital Topographic Data:
Topographic Codes and Dictionary of Topographic Features (Canadian Council on
Surveying and Mapping, 1984), which compiles, defines, and structures a set of terms
describing topographic phenomena. At the most detailed level of abstraction, an
application ontology lists, defines, and organizes concepts or conceptual representations
specific to an application. This kind of ontology is documented in many ways, for
instance application schema (ISO/TC 211, 2002), data dictionary, feature catalogue
(ISO/TC 211, 2001), repository (Brodeur et al., 2000), and data specification. For
example, let us mention the National Topographic Data Base—Standards and
Specifications (Natural Resources Canada, 1996) (http://scar.cits.rncan.gc.ca/bndt/),
VMap Specifications (VMap, 1995), British Columbia Specifications and Guidelines for
Geomatics (BC Ministry of Environment Lands and Parks (Geographic Data BC), 1992),
Ontario Digital Topographic Database-1:20,000, 1:10,000-A Guide for Users (OBM,
1996), Base de données topographiques du Québec (BDTQ) à l’échelle 1/20 000—
Normes de production (Québec, 2000), BD TOPO and BD CARTO
(http://www.ign.fr/fr/MP/BDGeo/), ATKIS-Digital Topographic Map 1:10,000
(http://www.atkis.de), and USGS-DLG (http://rockyweb.cr.usgs.gov/nmpstds/
dlgstds.html).
Figure 13: The three levels of ontology
As illustrated in Figure 13, the navigation between the different levels of ontology
follows a bottom-up approach. An agent (Au or Adp) initiates its reasoning using its own
knowledge and, if needed, that of other specific knowledge bases to which it has direct
access. This corresponds to the application ontology level. When required, domain
ontologies can be accessed to get shared conceptual representations within a specific
community to facilitate communication between agents. Domain ontology can also be
linked to other related domain ontologies to expand this level of knowledge. Finally,
domain ontologies can access global ontologies to get conceptual representations of
common usage. Again, such an ontology may be associated with others on the same level
to expand the global representation and knowledge of reality.
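The bottom-up navigation described above can be sketched as a simple lookup. This is only an illustrative sketch: the `Ontology` class, the `resolve` function, and the sample terms and definitions are hypothetical and are not part of the framework itself. The search starts at the agent's application ontology, consults related ontologies of the same level, and climbs to the next level only when nothing is found.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Ontology:
    """A named set of term definitions at one level of granularity (hypothetical)."""
    name: str
    level: str  # "application", "domain", or "global"
    definitions: Dict[str, str] = field(default_factory=dict)
    # Ontologies of the next, more general level that can be consulted.
    broader: List["Ontology"] = field(default_factory=list)
    # Related ontologies of the same level.
    related: List["Ontology"] = field(default_factory=list)


def resolve(term: str, start: Ontology) -> Optional[Tuple[str, str]]:
    """Search bottom-up and return (ontology name, definition), or None."""
    frontier = [start]
    seen = set()
    while frontier:
        # Collect the current level: the frontier plus related ontologies.
        level, queue = [], list(frontier)
        while queue:
            onto = queue.pop(0)
            if id(onto) in seen:
                continue
            seen.add(id(onto))
            level.append(onto)
            queue.extend(onto.related)
        for onto in level:
            if term in onto.definitions:
                return onto.name, onto.definitions[term]
        # Nothing found at this level: climb to the more general one.
        frontier = [b for onto in level for b in onto.broader]
    return None


# Hypothetical sample content, for illustration only.
global_onto = Ontology("generic dictionary", "global",
                       {"watercourse": "a channel through which water flows"})
marine_onto = Ontology("marine feature dictionary", "domain",
                       {"buoy": "a floating navigation aid"})
domain_onto = Ontology("topographic feature dictionary", "domain",
                       {"lake": "an inland body of standing water"},
                       broader=[global_onto], related=[marine_onto])
app_onto = Ontology("agent knowledge base", "application",
                    {"river": "a mapped linear watercourse"},
                    broader=[domain_onto])

print(resolve("river", app_onto))
print(resolve("buoy", app_onto))
print(resolve("watercourse", app_onto))
```

In this sketch a term is matched by exact spelling, which is a simplification; recognizing terms of similar meaning is precisely the problem that the rest of the chapter addresses.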
Accordingly, we propose, as a complement to the framework, the ontology of geospatial
data interoperability viewed as a two-dimensional subdivision. One dimension consists
of the five ontological phases of geospatial data interoperability and the other of the
three levels of ontology. This ontology of geospatial data interoperability consists of the
various configurations of real-world phenomena descriptions that take place in
interoperability. It shows the complexity and the components involved in geospatial data
interoperability. As shown in our conceptual framework, geospatial data interoperability
is not simply being able to access geospatial data in a given format and schema and use
them with a GIS. Even if geospatial data are transferred properly to a GIS, they have to
mean something; otherwise they are useless. Geospatial data
interoperability encompasses various abstractions and understandings of geospatial
phenomena that are in interaction thanks to the communication process in which multiple
ontologies of different granularities have to be considered in every phase of geospatial
data interoperability. As such, the ontology of geospatial data interoperability helps to
grasp and describe as a whole the scope of interoperability of geospatial data. Also, along
with the conceptual framework, it helps to understand all the relationships that exist
between the real-world phenomena and their various descriptions. The ontology of
geospatial data interoperability is then organized as shown in Figure 14, in which $O^{J}_{I}$
identifies one component of the ontology signifying the level I of the ontological phase J.
However, each component of this ontology may have different levels of relevance
according to the different situations in which different geospatial databases take part.
Figure 14: Ontology of geospatial data interoperability
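To make the two-dimensional subdivision concrete, the following sketch enumerates the fifteen components obtained by crossing the five ontological phases with the three levels of ontology. The numeric relevance attached to each component is a hypothetical device used here only to illustrate that components may matter more or less in a given situation; the framework does not prescribe such a measure.

```python
from itertools import product
from typing import Dict, List, Tuple

PHASES = (1, 2, 3, 4, 5)                      # index J: ontological phase
LEVELS = ("global", "domain", "application")  # index I: level of ontology

Component = Tuple[int, str]


def ontology_components() -> Dict[Component, float]:
    """Return every (phase, level) component with a default relevance of 1.0."""
    return {(phase, level): 1.0 for phase, level in product(PHASES, LEVELS)}


def most_relevant(components: Dict[Component, float], count: int = 3) -> List[Component]:
    """Rank components by their (hypothetical) relevance, highest first."""
    ranked = sorted(components.items(), key=lambda item: item[1], reverse=True)
    return [key for key, _ in ranked[:count]]


components = ontology_components()
# Assumed situation: the application ontology of the second phase dominates.
components[(2, "application")] = 2.0
print(len(components))
print(most_relevant(components, count=1))
```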
3.7 Relationship between concept and conceptual representation
With respect to the interoperability of geospatial data, encoding and decoding functions
are crucial components since they are responsible for generating and recognizing
geospatial conceptual representations, respectively. They are, to some extent, translation
functions. Generally speaking, translators have been typically implemented as
middleware components performing the conversion of a dataset from one data model and
data format to another. Such an approach assumes that a correlation between data models
and structures is already available or can be made in a timely and practical manner. This
situation is acceptable when dealing with a small number of databases. This is not the
case when navigating on the Internet and dealing with a large number of datasets, since
we have to deal with a practically infinite number of representations of reality.
Consequently, encoding and decoding functions are strongly tied to concepts in the
framework of geospatial data interoperability outlined in Section 3.5. Therefore, a
concept may be viewed as consisting of the set of knowledge with the accompanying
processes that an agent maintains about a phenomenon, which generate and recognize
different representations of the concept. This position is supported by Barsalou’s theory:
“… a concept is equivalent to a simulator. It is the knowledge and accompanying
processes that allow an individual to represent some kind of entity or event
adequately. A given simulator can produce limitless simulations of a kind, with
each simulation providing a different conceptualization of it. Whereas a concept
represents a kind generally, a conceptualization provides one specific way of
thinking about it.
… Once a simulator becomes established in memory for a category, it helps
identify members of the category on subsequent occasions …” (Barsalou, 1999)
This approach to the assessment of geospatial data interoperability is in itself different
from those already delineated in the geospatial information community, namely in
(Benslimane, 2001; Bishr, 1997), and appears to be better aligned with human
communication and cognition.
Accordingly, a concept appearing in R’ and R’’’ behaves as a “simulator” which can
generate different simulations of itself, that is, conceptual representations, as well as
recognize a conceptual representation that is bound to it. Essentially, a concept of the
brain or the machine is made of hidden data elements that are encapsulated by a
simulation function (shown in Figure 12 by a dotted ellipse that encompasses the
concept). The simulation function forms the main interface to access a concept.
This simulation function performs the encoding to produce conceptual representations
such as those appearing in R’’ and R’’’’. It is in some way a translation process that goes
from a hidden and more neutral representation—the concept—to a language-dependent
representation—the conceptual representation. This function selects and puts together
properties that adequately describe the concept in a specific situation. It uses a
vocabulary, punctuation, and grammar in order to build conceptual representations.
In order to produce conceptual representations, the concept’s simulation function
searches for the best way to describe the concept in a given situation. As such, the
function has to take into consideration other concepts of similar meaning. These concepts
are abstracted from a different context and are all organized in the same ontology.
Like perceptual symbols (see Section 3.4), concepts are not stored independently of
others in R’ and R’’’. On the contrary, they are to some extent a kind of attractor within a
dynamic network (as an ontology structure). When a new concept is introduced into the
network, existing similar concepts try to attract this new concept, to place it nearby, and
to build the necessary links with it. Hence, similar concepts are clustered together,
expressing some sort of proximity. On the one hand, a concept is defined as a discrete
notion (Sowa, 1984) that takes part in an ontology as a phenomenon in reality. On the
other hand, ontology is seen as a more continuous yet partial representation of reality
linking multiple concepts together. The concept’s simulation function takes advantage of
this ontology structure to produce and recognize conceptual representations that are best
suited to any given situation.
The simulation function also implements the decoding operation, that is, recognition of
conceptual representations that are bound to the concept. As stated by Barsalou, “… if the
simulator for a category can produce a satisfactory simulation of a perceived entity, the
entity belongs in the category.” So, in order to recognize a conceptual representation as a
member of a concept, a concept must be able to produce such a conceptual
representation. As illustrated in Figure 12, Waterbody and Watercourse are Adp’s
concepts that can produce the Lake and River conceptual representations and, as
such, Lake and River are bound to the Waterbody and Watercourse concepts.
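The idea of a concept acting as a simulator can be sketched as a class whose data are hidden behind encoding and decoding operations. The sketch below is an assumption-laden illustration: the class names, the property lists, and the rule that recognition requires an exact match are all hypothetical simplifications. In the framework, recognition relies on proximity rather than on strict equality.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class ConceptualRepresentation:
    """A language-dependent description exchanged between agents."""
    label: str
    properties: FrozenSet[str]


class Concept:
    """A concept whose data stay hidden behind a simulation function."""

    def __init__(self, name: str, simulations: Dict[str, FrozenSet[str]]):
        self.name = name
        # Hidden data: the properties used for each representation label.
        self._simulations = simulations

    def encode(self, label: str) -> ConceptualRepresentation:
        """Generate the conceptual representation identified by a label."""
        return ConceptualRepresentation(label, self._simulations[label])

    def decode(self, rep: ConceptualRepresentation) -> bool:
        """Recognize a representation if this concept could have produced it."""
        return any(rep == self.encode(label) for label in self._simulations)


# Hypothetical properties, for illustration only.
waterbody = Concept("Waterbody", {"Lake": frozenset({"name", "surface geometry"})})
watercourse = Concept("Watercourse", {"River": frozenset({"name", "line geometry"})})

lake = waterbody.encode("Lake")
print(waterbody.decode(lake))    # recognized: Waterbody can produce Lake
print(watercourse.decode(lake))  # not recognized by Watercourse
```

Other agents interact only with `encode` and `decode`; the dictionary of simulations is never exposed, which mirrors the encapsulation shown by the dotted ellipse in Figure 12.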
A conceptual representation describes a concept within a specific context. Context is
introduced here as a metaconcept that is omnipresent in the representation of real-world
phenomena (Wisse, 2000). A context is as imaginary and fictitious as is a concept. It
consists of elements that influence the use of a concept and provide its real signification.
Like Ouksel and Sheth (1999), we consider the context as the main vehicle that provides
real-world semantics. The context description is usually embedded in the components
defining and characterizing conceptual representations (Wisse, 2000) such as object
classes, properties, geometries, temporalities, domains, relationships, behaviours, and
memberships to datasets or ontologies.
Two conceptual representations of the same concept express a contextual variation.
Context adds a degree of freedom to the ontological representation of concepts. It is the
context that drives us to use different conceptual representations for the description of
real-world phenomena, and, for that reason, it is also a notion related to ontology. Let us
return to the Bridge example, which is a concept describing a road infrastructure, a
hazard to air navigation, or a hazard to marine operations (in three different contexts). In
this example, context consists of the elements that influence the use of the concept and
that specify its meaning. As we can observe in the example, the description of the context
is typically embedded in the properties of the conceptual representations. Each
conceptual representation has its own specific properties, such as structure type, the
elevation of the highest point, or the clearance between the watercourse and the bridge,
respectively. Independently of their specific descriptions, they all refer to the same
concept, which ontologically links all of these representations.
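The Bridge example can be written down as three conceptual representations that share one concept but differ by context. The data structure and the two helper functions below are hypothetical; only the contexts and the property names are taken from the example above.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ConceptualRepresentation:
    concept: str                # the concept being described
    context: str                # the situation in which it is used
    properties: FrozenSet[str]  # context-specific descriptive properties


road = ConceptualRepresentation(
    "Bridge", "road infrastructure", frozenset({"structure type"}))
air = ConceptualRepresentation(
    "Bridge", "hazard to air navigation",
    frozenset({"elevation of the highest point"}))
marine = ConceptualRepresentation(
    "Bridge", "hazard to marine operations",
    frozenset({"clearance between the watercourse and the bridge"}))


def same_concept(a: ConceptualRepresentation, b: ConceptualRepresentation) -> bool:
    """Two representations are ontologically linked when they share a concept."""
    return a.concept == b.concept


def contextual_variation(a: ConceptualRepresentation,
                         b: ConceptualRepresentation) -> FrozenSet[str]:
    """Properties that appear in only one of the two representations."""
    return a.properties ^ b.properties


print(same_concept(road, marine))
print(sorted(contextual_variation(road, air)))
```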
The framework and related ideas presented so far aim at situating the broad picture of
geospatial data interoperability. In the next section, we define the notion of geosemantic
proximity and identify where it applies within our framework.
3.8 Geosemantic proximity
The simulation function presented in the previous section generates and recognizes
conceptual representations and, thus, evaluates the semantic proximity that stands
between a concept and a conceptual representation, namely between R’ and R’’, R’’ and
R’’’, R’’’ and R’’’’, and R’’’’ and R’. In the semantic interoperability of geospatial data,
Bishr (1997) and Rodriguez (2000) mentioned that schematic heterogeneity can be solved
only between semantically similar representations. Because it evaluates the semantic
proximity, the simulation function becomes a key element of the conceptual framework
and also a prerequisite to solving schematic heterogeneity. Consequently, we develop the
notion of geosemantic proximity to concurrently assess the semantic, spatial, and
temporal similarities (as components of a geosemantic space; see Figure 15) between a
geospatial concept and a geospatial conceptual representation. Even if a geospatial
concept and a geospatial conceptual representation have the same semantics, their spatial
and temporal definitions may differ in several ways (Figure 16). For instance, the
geometry of a building in a dataset may refer to the precise footprint of the basement
while the geometry of a building in another dataset may refer to the precise footprint of
the roof. The footprint may be geometrically delineated with more or less detail because
of different geometric depiction constraints. Also, in a third dataset, the same building
may be represented as a point and be represented in a fourth dataset as a generalized
surface, where all details smaller than 10% of the width or length of the building are
ignored because of the difference in geometric granularity (e.g. different scales). All
these datasets provide buildings with the same semantics from a pure “object-class/attribute”
point of view (as is usually considered in semantic proximity analysis), but they have
different meanings from a geometric point of view, and such differences are explicitly
taken into account by what we call geosemantic proximity analysis.
Figure 15: Geosemantic space
Figure 16: Various semantics of a building’s geometric representation
According to this approach, work is currently underway to develop a methodology and a
computational model of geosemantic proximity that will end up with geosemantic
proximity predicates that are homomorphic with current spatial and temporal topological
predicates (Allen, 1983; Clementini and Di Felice, 1996; Egenhofer, 1993; Egenhofer et
al., 1994). In this methodology, geospatial concepts and geospatial conceptual
representations are compared to segments on a semantic axis made of an interior and a
boundary, and geosemantic proximity consists essentially of the intersection of their
respective contexts. On the one hand, the interior of a concept consists of its intrinsic
properties that are components providing literal meaning (e.g. identification, attributes,
attribute values, geometries, temporalities, and domain). On the other hand, the boundary
of a concept consists of its extrinsic properties that are components providing meaning
through relationships with other concepts (e.g. semantic, spatial, and temporal
relationships as well as behaviours). Consequently, intersection between intrinsic and
extrinsic properties leads to a set of geosemantic proximity predicates, as illustrated in
Figure 17. In a way similar to human reasoning, geosemantic proximity is then assessed
qualitatively taking into account the contexts of the respective representations (Kashyap
and Sheth, 1996).
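The intersection of intrinsic and extrinsic properties can be sketched as a four-intersection matrix of Boolean values. The sketch rests on several assumptions that are not part of the text above: properties are compared as plain strings, the cells are ordered boundary/boundary, boundary/interior, interior/boundary, interior/interior, and the `t`/`f` signature is only a hypothetical way of naming the sixteen possible outcomes. It does not reproduce the predicates of Figure 17.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Described:
    """A concept or conceptual representation reduced to two property sets."""
    interior: FrozenSet[str]  # intrinsic properties (literal meaning)
    boundary: FrozenSet[str]  # extrinsic properties (relationships, behaviours)


def four_intersection(k: Described, l: Described) -> Tuple[bool, bool, bool, bool]:
    """Record, for each pair of parts, whether the intersection is non-empty."""
    return (
        bool(k.boundary & l.boundary),
        bool(k.boundary & l.interior),
        bool(k.interior & l.boundary),
        bool(k.interior & l.interior),
    )


def signature(matrix: Tuple[bool, bool, bool, bool]) -> str:
    """Encode the matrix as four letters, 't' for non-empty and 'f' for empty."""
    return "".join("t" if cell else "f" for cell in matrix)


# Hypothetical property sets, for illustration only.
concept_k = Described(
    interior=frozenset({"name", "surface geometry"}),
    boundary=frozenset({"is a hydrographic feature"}))
representation_l = Described(
    interior=frozenset({"name", "surface geometry", "depth"}),
    boundary=frozenset({"is a hydrographic feature", "adjacent to shoreline"}))
unrelated = Described(
    interior=frozenset({"number of lanes"}),
    boundary=frozenset({"connects to road network"}))

print(signature(four_intersection(concept_k, representation_l)))
print(signature(four_intersection(concept_k, unrelated)))
```

The result is qualitative: it states which parts of the two descriptions share something, not how much they share, which is in keeping with the qualitative assessment described above.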
This notion of geosemantic proximity is being elaborated based on ontology (Guarino,
1999; Guarino and Welty, 2000a; Guarino and Welty, 2000b), fiat boundaries (Casati et
al., 1998; Smith, 1994; Smith and Mark, 1999; Smith and Varzi, 2000), theories of
temporal (Allen, 1983) and spatial topology (Clementini and Di Felice, 1996; Egenhofer,
1993; Egenhofer et al., 1994), context (Bishr, 1997; Kashyap and Sheth, 1996; Wisse,
2000), and semantic similarity (Kashyap and Sheth, 1996; Ouksel and Sheth, 1999;
Rodriguez, 2000). It is described in detail in the next
chapter.
Figure 17: Geosemantic proximity predicates (where K is a concept and L is a conceptual representation)
3.9 Conclusion
In this chapter, we have revisited the definitions of geospatial data interoperability and
proposed a conceptual framework based on the cognitive and communication sciences.
The interpersonal communication process between two agents, including the underlying
internal representation of concepts along with encoding and decoding operations, appears
to provide a rich framework to better understand the issues involved in geospatial data
interoperability, especially when extended to human-to-computer communication and
computer-to-computer communication. Central notions involved in this communication-
based framework are concept, conceptual representations, ontology, context, and
proximity.
The description of the conceptual framework is improved with an ontology of geospatial
data interoperability, which is presented in two dimensions: the five ontological phases of
geospatial data interoperability and three levels of ontology. The former describes the
different configurations of reality involved in geospatial data interoperability. The latter
consists of a subdivision of the different levels of granularity used in the description of
real-world phenomena, typically identified by global ontology, domain ontology, and
application ontology.
In the proposed framework, two elements characterize the idea of concept. First, the data
component is hidden and not directly accessible by other agents. Second, a “simulation”
function encapsulates the data component and essentially acts as the main interface for
accessing the concept. This simulation function performs the encoding and decoding
operation as found in the communication process in order to produce or recognize
conceptual representations. It appears to be a fundamental element for the assessment of
geospatial data interoperability. In addition, geosemantic proximity is a constituent
component of the simulation function.
Finally, conceptual representations denote physical representations, which serve as
mediating components between agents. They are essentially context-dependent,
conveying a concept in a specific situation.
In furthering this research, the theoretical model of geosemantic proximity will be
developed in detail. This will necessitate a formalization of geospatial concepts and
conceptual representations that must be aligned with the notion of context. Then a
prototype will be designed and implemented as validation and as the experimental phase
of this research. The expected results of this research should lead to significant progress
concerning the assessment of geospatial data interoperability.
Acknowledgements
The authors wish to acknowledge the contribution of Natural Resources Canada – Centre
for Topographic Information in supporting the first author for this research and of the
GEOIDE Network of Centres of Excellence in geomatics, project DEC#2 (Designing the
Technological Foundations of Spatial Decision-making with the World Wide Web), as
well as the contribution of Mike Major for the English editorial review.
3.10 References
Allen, J F 1983 Maintaining Knowledge about Temporal Intervals. Communications of the
ACM, 26(11): 832-843
Barsalou, L W 1999 Perceptual symbol systems. Behavioral and Brain Sciences, 22(4):
577-609
BC Ministry of Environment Lands and Parks (Geographic Data BC) 1992 Digital
Baseline Mapping at 1:20,000. Victoria, Province of British Columbia, BC
Ministry of Environment, Lands and Parks
Bédard, Y 1986 A Study of the Nature of Data Using a Communication-based
Conceptual Framework of Land Information. Ph.D. Dissertation, University of
Maine
Bédard, Y 1999a Principles of Spatial Database Analysis and Design. In P A Longley,
M F Goodchild, D J Maguire, and D W Rhind (eds) Geographical Information
Systems: Principles, Techniques, Applications and Management. New York, John
Wiley and Sons, Inc.: 413-424
Bédard, Y 1999b Visual Modelling of Spatial Database Towards Spatial PVL and UML.
Geomatica, 53(2): 169-186
Benslimane, D 2001 Interopérabilité de SIG : la solution Isis. Revue internationale de
géomatique, 11(1): 7-42
Bergamaschi, S, S Castano, and M Vincini 1999 Semantic Integration of Semistructured
and Structured Data Sources. Sigmod Record, 28(1): 54-60
Bishr, Y 1997 Semantics Aspects of Interoperable GIS. Ph.D. Dissertation, ITC
Publication
Bittner, T, and G Edwards 2001 Toward an Ontology for Geomatics. Geomatica, 55(4):
475-490
Blake, R H, and E O Haroldsen 1975 A Taxonomy of Concepts in Communication. New
York, Hastings House Publishers
Brodeur, J, Y Bédard, and M J Proulx 2000 Modelling Geospatial Application Databases
using UML-based Repositories Aligned with International Standards in
Geomatics. In Proceedings of Eighth ACM Symposium on Advances in
Geographic Information Systems (ACMGIS) ACM Press: 39-46
Campbell, J 1982 Grammatical Man: Information, Entropy, Language, and Life. New
York, Simon and Schuster
Canadian Council on Surveying and Mapping 1984 National Standards for the Exchange
of Digital Topographic Data: Topographic Codes and Dictionary of Topographic
Features. Ottawa, Topographical Survey Division, Surveys and Mapping Branch,
Energy, Mines and Resources Canada
Casati, R, B Smith, and A C Varzi 1998 Ontological Tools for Geographic
Representation. In N Guarino (ed) Formal Ontology in Information Systems.
Amsterdam, IOS Press: 77-85
Charron, J 1995 Développement d’un processus de sélection des meilleures sources de
données cartographiques pour leur intégration à une base de données à référence
spatiale. Mémoire de maîtrise, Université Laval
Cherry, C 1978 On Human Communication: a Review, a Survey, and a Criticism.
Cambridge, Massachusetts, The MIT Press
Clementini, E, and P Di Felice 1996 A Model for Representing Topological Relationships
Between Complex Geometric Features in Spatial Databases. Information
Sciences, 90(1-4): 121-136
Cohen, P 1982 Model of Cognition: Overview. In P R Cohen, and E A Feigenbaum (eds)
The Handbook of Artificial Intelligence. HeurisTech Press: 1-10
Collongues, A, J Hugues, and B Laroche 1987 Merise - Méthode de conception. Paris,
Bordas
Darnell, D K 1971 Information Theory. In J A DeVito (ed) Communication: Concepts
and Processes. Englewood Cliffs, New Jersey, Prentice-Hall Inc: 37-45
Denes, P B, and E N Pinson 1971 The Speech Chain. In J A DeVito (ed)
Communication: Concepts and Processes. Englewood Cliffs, New Jersey,
Prentice-Hall Inc: 3-11
Denis, M 1994 Image et Cognition. Paris, Presses Universitaires de France
Egenhofer, M 1993 A Model for Detailed Binary Topological Relationships. Geomatica,
47(3 & 4): 261-273
Egenhofer, M, D M Mark, and J R Herring 1994 The 9-Intersection: Formalism and Its
Use for Natural-Language Spatial Predicates. Santa Barbara, CA, University of
California, National Center for Geographic Information and Analysis Technical
Report 94-1
Egenhofer, M J 1999 Introduction: Theory and Concepts. In M Goodchild, M Egenhofer,
R Fegeas, and C Kottman (eds) Interoperating Geographic Information Systems.
Boston, Massachusetts, Kluwer Academic Publisher: 1-4
Fowler, J, B Perry, M Nodine, and B Bargmeyer 1999 Agent-Based Semantic
Interoperability in InfoSleuth. Sigmod Record, 28(1): 8
Frank, A U 2001 Tiers of ontology and Consistency Constraints in Geographic
Information Systems. International Journal of Geographical Information Science,
15(7): 667-678
Frankhauser, P, M Kracker, and E Neuhold 1991 Semantic vs. Structural Resemblance of
Classes. SIGMOD Record, 20(4): 59-63
Frankhauser, P, and E J Neuhold 1992 Knowledge Based Integration of Heterogeneous
Databases. In Proceedings of IFIP WG2.6 Database Semantics Conference on
Interoperable Database Systems (DS-5)/IFIP Transaction (A-25)
Elsevier Science Publishers B.V.: 155-175
Gibson, J J 1979 The Ecological Approach to Visual Perception. Boston, Houghton
Mifflin
Gruber, T R 1993a Toward Principles for the Design of Ontologies Used for Knowledge
Sharing. Palo Alto, California, Knowledge Systems Laboratory Technical Report
KSL 93-04
Gruber, T R 1993b A Translation Approach to Portable Ontology Specification. Stanford,
California, Knowledge Systems Laboratory Technical Report KSL 92-71
Guarino, N 1998 Formal Ontology and Information Systems. In Proceedings of Formal
Ontology in Information Systems (FOIS '98). Amsterdam, IOS Press: 3-15
Guarino, N 1999 The Role of Identity Conditions in Ontology Design. In Proceedings of
Spatial Information Theory - Cognitive and Computational Foundations of
Geographic Information Science, International Conference COSIT'99, 3-540-
66365-7. Berlin, Springer-Verlag Lecture Notes in Computer Science 1661: 221-
234
Guarino, N, and C Welty 2000a A Formal Ontology of Properties. In Proceedings of
Knowledge Engineering and Knowledge Management: Methods, Models and
Tools (12th International Conference, EKAW2000), 3-540-41119-4. Berlin,
Springer-Verlag Lecture Notes in Computer Science 1937: 97-112
Guarino, N, and C Welty 2000b Identity, Unity, and Individuation: Towards a Formal
Toolkit for Ontological Analysis. In Proceedings of ECAI-2000: The European
Conference on Artificial Intelligence. Amsterdam, IOS Press: 219-223
Guha, R V 1990 Micro-theories and contexts in Cyc. I. Basic issues. Austin, Texas,
Micro-electronics and Computer Technology Corporation Technical Report ACT-
CYC-129-90
Harvey, F 1997 Improving Multi-Purpose GIS Design: Participative Design. In
Proceedings of Spatial Information Theory: A Theoretical Basis for GIS,
International Conference COSIT'97, ISBN 3-540-63623-4. Berlin, Springer-
Verlag Lecture Notes in Computer Science 1329: 313-328
ISO/TC 211 2001 ISO/DIS 19110 Geographic Information - Feature Cataloguing
Methodology. Geneva, Switzerland, International Organization for
Standardization
ISO/TC 211 2002 ISO/DIS 19109 Geographic Information - Rules for Application
Schema. Geneva, Switzerland, International Organization for Standardization
Kahng, J, and D McLeod 1998 Dynamic Classificational Ontologies: Mediation of
Information Sharing in Cooperative Federated Database Systems, Context and
Ontologies. In M P Papazoglou, and G Schlageter (eds) Cooperative Information
Systems-Trends and Directions. San Diego, CA, Academic Press: 179-203
Kashyap, V, and A Sheth 1996 Semantic and Schematic Similarities Between Database
Objects: A Context-Based Approach. The VLDB Journal, 5: 276-304
Kashyap, V, and A Sheth 1998 Semantic Heterogeneity in Global Information Systems:
the Role of Metadata, Context and Ontologies. In M P Papazoglou, and
G Schlageter (eds) Cooperative Information Systems-Trends and Directions. San
Diego, CA, Academic Press: 139-178
Kettani, D, and B Moulin 1999 A spatial model based on the notions of spatial
conceptual map and of object’s influence areas. In Proceedings of Spatial
Information Theory, Cognitive and Computational Foundations of Geographic
Information Systems, International Conference COSIT'99. Berlin, Springer-
Verlag Lecture Notes in Computer Science 1661: 401-415
Kosslyn, S M 1980 Image and Mind. Cambridge, Massachusetts, Harvard University Press
Kottman, C 1999 The Open GIS Consortium and Progress Toward Interoperability in
GIS. In M Goodchild, M Egenhofer, R Fegeas, and C Kottman (eds)
Interoperating Geographic Information Systems. Boston, Massachusetts, Kluwer
Academic Publisher: 39-54
Krech, D, and R S Crutchfield 1971 Perceiving the World. In W Schramm, and D F
Robert (eds) The Process and Effects of Mass Communication. Champaign-
Urbana, IL, University of Illinois Press: 235-264
Laurini, R 1998 Spatial Multi-Database Topological Continuity and Indexing: a Step
Towards Seamless GIS Data Interoperability. International Journal of
Geographical Information Science, 12(4): 373-402
Lehmann, F 1992 Semantic Networks. Computers and Mathematics with Applications,
23(2-5): 50
Lippmann, W 1971 The World Outside and the Pictures in Our Heads. In W Schramm,
and D F Robert (eds) The Process and Effects of Mass Communication.
Champaign-Urbana, IL, University of Illinois Press: 265-286
Logie, R H, and M Denis (eds) 1991 Mental images in human cognition. Amsterdam,
North-Holland
McKee, L, and K Buehler (eds) 1998 The OpenGIS Guide. Wayland, Massachusetts,
OpenGIS Consortium Inc.
Natural Resources Canada 1996 National Topographic Data Base - Standards and
Specifications. Sherbrooke, Quebec, Centre for Topographic Information-
Sherbrooke
New Brunswick 2000 Guide d’utilisation de la Base de données topographiques
numériques (BDTN) du Nouveau-Brunswick. Fredericton, New Brunswick,
Services Nouveau-Brunswick
OBM 1996 Ontario Digital Topographic Database - 1:10,000, 1:20,000 - A Guide for
User. Toronto, Ontario, Ministry of Natural Resources
Ouksel, A, and C Naiman 1993 Coordinating context building in heterogeneous
information systems. Journal of Intelligent Information Systems, 3: 151–183
Ouksel, A M, and A Sheth 1999 Semantic Interoperability in Global Information
Systems: A Brief Introduction to the Research Area and the Special Section.
Sigmod Record, 28(1): 5-12
Peuquet, D, B Smith, and B Brogaard 1998 The Ontology of Fields. In Proceedings of
Summer Assembly of the University Consortium for Geographic Information
Science
Pylyshyn, Z W 1981 The Imagery Debate: Analogue Media Versus Tacit Knowledge.
Psychological Review, 88(1): 16-45
Québec 2000 Base de données topographiques du Québec (BDTQ) à l’échelle de
1/20 000 - Normes de production (Version 1.0). Québec, Ministère des
Ressources naturelles, Direction générale de l'information géographique, CD
Document
Rodriguez, M A 2000 Assessing Semantic Similarity Among Entity Classes. Ph.D. Thesis,
University of Maine
Schramm, W 1971a How Communication Works. In J A DeVito (ed) Communication:
Concepts and Processes. Englewood Cliffs, New Jersey, Prentice-Hall Inc: 12-21
Schramm, W 1971b The Nature of Communication Between Humans. In W Schramm,
and D F Robert (eds) The Process and Effects of Mass Communication.
Champaign-Urbana, IL, University of Illinois Press: 3-53
Sciore, E, M Siegel, and A Rosenthal 1992 Context interchange using metaattributes. In
Proceedings of First International Conference on Information and Knowledge
Management (CIKM): 377-386
Sears, D O, and J L Freedman 1971 Selective Exposure to Information: A Critical
Review. In W Schramm, and D F Robert (eds) The Process and Effects of Mass
Communication. Champaign-Urbana, IL, University of Illinois Press: 209-234
Sheth, A 1999 Changing Focus on Interoperability in Information Systems: From
Systems, Syntax, Structure to Semantics. In M Goodchild, M Egenhofer,
R Fegeas, and C Kottman (eds) Interoperating Geographic Information Systems.
Boston, Massachusetts, Kluwer Academic Publisher: 5-29
Sheth, A, and V Kashyap 1992 So Far (Schematically) Yet So Near (Semantically). In
Proceedings of IFIP WG2.6 Database Semantics Conference on Interoperable
Database Systems (DS-5)/IFIP Transaction (A-25)
Elsevier Science Publishers B.V.: 283-312
Simsion, G C 2001 Data Modeling Essentials - Analysis, Design, and Innovation.
Scottsdale, Arizona, Coriolis
Smith, B 1994 Fiat Objects. In Proceedings of Workshop on Parts and Wholes:
Conceptual Part-Whole Relations and Formal Mereology, 11th European
Conference on Artificial Intelligence: 15-23
Smith, B, and D Mark 1999 Ontology with Human Subjects Testing: An Empirical
Investigation of Geographic Categories. American Journal of Economics and
Sociology, 58(2): 245-272
Smith, B, and A C Varzi 2000 Fiat and Bona Fide Boundaries. Philosophy and
Phenomenological Research, 60(2): 401-420
Smith, K 1999 Unpacking the Semantics of Source and Usage to Perform Semantic
Reconciliation in Large-Scale Information Systems. Sigmod Record, 28(1): 6
Sondheim, M, K Gardels, and K Buehler 1999 GIS Interoperability. In P A Longley, M F
Goodchild, D J Maguire, and D W Rhind (eds) Geographical Information
Systems: Principles, Techniques, Applications and Management. New York, John
Wiley and Sons, Inc.: 347-358
Sowa, J F 1984 Chapter 7: Limits of Conceptualisation. In Conceptual Structures:
Information Processing in Mind and Machine. Reading, Massachusetts, Addison-Wesley
Publishing Company: 339-351
Statistics Canada 1997 Digital Boundary File and Digital Cartographic File 1996
Census (Reference Guide). Ottawa, Minister of Industry
Sycara, K, M Klusch, S Widoff, and J Lu 1999 Dynamic Service Matchmaking Among
Agents in Open Information Environments. Sigmod Record, 28(1): 47-53
VMap 1995 Vector Map (VMap), Level 1. Bethesda, MD, U.S. National Imagery and
Mapping Agency Mil-V-89033
Wiener, N 1950 The Human Use of Human Beings: Cybernetics and Society. Boston,
Houghton Mifflin
Wisse, P 2000 Metapattern: Context and Time in Information Models. Reading,
Massachusetts, Addison-Wesley
CHAPTER 4
GEOSEMANTIC PROXIMITY,
A COMPONENT OF GEOSPATIAL DATA
INTEROPERABILITY
Geosemantic Proximity for Geospatial Data Interoperability
(J. Brodeur, Y. Bédard, B. Moulin, and G. Edwards)
4.1 Summary of the article
In the previous chapter, we proposed a conceptual framework for geospatial data
interoperability grounded in the human communication process and the cognitive
sciences. There we introduced the notion of geosemantic proximity as a component of
geospatial data interoperability. This chapter presents an article that sets out the
theoretical aspects of geosemantic proximity. First, it develops the notions of geospatial
concept, geospatial conceptual representation, and context in the spirit of the human
communication process and of human cognitive functioning. Then, it defines the notion
of geosemantic proximity. Geosemantic proximity is a context-based approach that
qualitatively assesses the similarity between a geospatial concept and a geospatial
conceptual representation. It draws on the notion of topology and adapts the
four-intersection matrix, as used for geometric data, to the semantic context. Finally,
this chapter illustrates the relevance of geosemantic proximity with examples drawn
from existing geospatial data specifications.
4.2 Abstract
Aiming to improve the interoperability of geospatial data, recent research in the field of
geomatics addresses the issue of semantic interoperability. It has been acknowledged that
the semantics of geospatial data must be considered to achieve complete interoperability
of geospatial data. For instance, how similar or different is marsh/swamp as defined in
the Feature and Attribute Code Catalogue (FACC) of VMap libraries compared to
wetland as defined in the National Topographic Data Base (NTDB) of Canada? Do
they mean the same thing? Do their geometries mean the same thing? Could they be used
interchangeably?
In the previous chapter, we proposed a conceptual framework for geospatial data
interoperability that is essentially based on human communication and cognition
paradigms. We also introduced the notion of geosemantic proximity within the broader
issue of interoperability of geospatial data. In the present chapter, we present a theoretical
account of this notion. We first define the notions of geospatial concept, geospatial
conceptual representation, and context from the perspective of human communication
and cognition. Then we define the notion of geosemantic proximity, which results from a
context-based approach to qualitatively assess the similarity of a geospatial concept and a
geospatial conceptual representation. Geosemantic proximity is also influenced by
the well-known notion of topology and the 4-intersection matrix as applied to geometric
data. Finally, examples based on existing topographic database specifications illustrate
the value of geosemantic proximity.
4.3 Introduction
Nowadays geographic phenomena are depicted in numerous geospatial databases. These
databases were set up to support specific purposes but are now publicly available
through the Internet and geospatial data infrastructures such as the CGDI in Canada and the
NSDI in the United States. For instance, in Canada, users of geospatial data have access
to databases such as the National Topographic Data Base (NTDB) of the Department of
Natural Resources Canada (Natural Resources Canada, 1996); the Street Network Files,
the Digital Boundary Files, and the Digital Cartographic Files from Statistics Canada
(Statistics Canada, 1997); the VMap libraries (VMap, 1995); the National Atlas of
Canada produced at a smaller scale by the Department of Natural Resources Canada
(Natural Resources Canada, 1996); and also larger scale topographic data produced by
provincial departments (see BC Ministry of Environment Lands and Parks (Geographic
Data BC), 1992; New Brunswick, 2000; OBM, 1996; Québec, 2000). These databases
use different abstractions to represent highly similar phenomena. For example, forest-like
phenomena are depicted variously as vegetation in NTDB, trees in VMap,
wooded areas in the Ontario Digital Topographic Data Base, and milieu boisé in the
Base de données topographiques du Québec (N.B. the point, line, and surface pictograms
express the type of geometric representation used to depict the phenomenon
geographically; see (Bédard and Proulx, 2002) for more details). As a result, users
now have to deal with multiple geospatial databases to search, find, get, and integrate data
that correspond to their specific needs. In such operations, users frequently encounter
problems of syntactic, structural, and above all semantic heterogeneity (Ouksel and
Sheth, 1999; Sheth, 1999), including geometric and temporal heterogeneity (Bishr,
1997; Charron, 1995; Laurini, 1998). Hence, the combined use of geospatial data from
multiple geospatial databases frequently becomes a nightmare.
At the beginning of the 1990s, geospatial data interoperability (McKee and Buehler,
1998) became an important issue in the geospatial information community since it is seen
as a solution for sharing and integrating geospatial data and geoprocessing resources
(Kottman, 1999). The OpenGIS Consortium Inc., ISO/TC 211-Geographic
information/geomatics, the geospatial information industry, and the research community
worked together to build today’s foundation of geospatial data interoperability. Major
advances have been achieved mainly on syntactic and structural heterogeneities
(Rodriguez, 2000), as can be observed in the documents (ISO/TC 211, 2002a; ISO/TC
211, 2002b; ISO/TC 211, 2003a; ISO/TC 211, 2003b; ISO/TC 211, 2003c; Open GIS
Consortium Inc., 1999a; Open GIS Consortium Inc., 2001), which describe the structure,
the content, and the encoding of geospatial data and metadata. However, according to
(Bishr, 1997) and (Rodriguez, 2000), structural heterogeneity can only be solved for
representations of phenomena that are semantically similar. Therefore, semantic
heterogeneity must also be addressed to achieve complete interoperability of
geospatial data (Rodriguez, 2000).
Recently in (Brodeur and Bédard, 2001; Brodeur et al., 2003), we proposed a new
conceptual framework for geospatial data interoperability, which is based on the human
communication process (Schramm, 1971) and cognition. A key element of this
framework, which is used to define the notion of geosemantic proximity (GsP), consists
in the assessment of the semantic proximity between a geospatial concept (hereafter
called geoConcept) and a geospatial conceptual representation (hereafter called
geoConceptRep). The purpose of this chapter is to describe this notion of GsP in detail
and to support it with examples.
The remainder of this chapter is structured as follows. The next section reviews notions
related to geospatial data interoperability, which lead to the elaboration of the GsP
notion. Section 4.5 describes our formalisation of geoConcept and geoConceptRep in
relation to the context, which is fundamental in the development of GsP. The following
section develops the notion of GsP. In section 4.7, we introduce a software prototype
(presented in more detail in chapter 5) which serves to validate the GsP approach.
Finally, in section 4.8, we conclude and identify future work.
4.4 Geospatial data interoperability and geosemantic proximity
In (Brodeur and Bédard, 2001; Brodeur et al., 2003), we compare geospatial data
interoperability to an interpersonal communication process where people end up
understanding each other when they are communicating. In this conceptual framework,
as in human communication, geospatial data interoperability happens between two
software agents, which exchange information about geographic phenomena. Typically,
agents maintain geographic information in memory in the form of geoConcepts and
communicate this information using geoConceptReps. More specifically, geoConcepts
consist of abstract, internal, and persistent representations of geographic phenomena that
are maintained for humans in their cognitive model and for computer systems in their
physical memory. They are made of hidden data elements, which are encapsulated by a
simulation function (Barsalou, 1999). This function is a kind of translation function that
produces or recognizes the geoConceptReps which express the geoConcepts in a specific
situation. It corresponds to the main interface of a geoConcept. Thus, geoConceptReps
refer to the set of symbols that are used to communicate geoConcepts. They are transient
representations of geographic phenomena. In our framework, geospatial data
interoperability corresponds to a bi-directional communication process between a user
agent and a data provider agent, which includes a feedback mechanism in both
directions. Based on his/her/its own set of geoConcepts, the user agent sends a request for
information about geographic phenomena using his/her/its own geoConceptReps to the
data provider agent. When the data provider agent receives the request, he/she/it
interprets it to find geoConcepts he/she/it knows that match the received
geoConceptReps. Then, he/she/it gathers the information requested by the user agent
based on the identified geoConcepts, transforms it into new geoConceptReps, and sends
the geoConceptReps to the user agent. Once the user agent gets the answer, he/she/it has
to recognize the received geoConceptReps, that is again to find geoConcepts he/she/it
knows that match the received geoConceptReps. Then, he/she/it compares these
geoConcepts against those of the initial request to verify that they accurately answer
his/her/its initial request. If so, we can say that interoperability happens. In this
framework, GsP takes place in a geoConcept’s translation function. This translation
function supports both the user agent and the data provider agent in the production and
recognition of geoConceptReps; it qualitatively evaluates the similarity of the
geoConcept with the geoConceptRep.
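The request and answer cycle described above can be sketched in a few lines of Python. This is only an illustrative sketch under strong assumptions: the names Agent, GeoConceptRep, produce, recognize, and interoperate are hypothetical, the property labels are invented, and the recognition step uses a plain property-overlap test as a placeholder for the translation function in which GsP takes place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoConceptRep:
    """Transient message: a label plus the properties it carries (hypothetical layout)."""
    name: str
    properties: frozenset


@dataclass
class Agent:
    """Agent whose ontology maps geoConcept names to sets of properties (assumption)."""
    name: str
    ontology: dict

    def produce(self, concept_name):
        # Turn one of the agent's own geoConcepts into a geoConceptRep.
        return GeoConceptRep(concept_name, frozenset(self.ontology[concept_name]))

    def recognize(self, rep):
        # Placeholder matching: keep the geoConcepts sharing at least one property.
        return [k for k, props in self.ontology.items() if set(props) & rep.properties]


def interoperate(user, provider, concept_name):
    """Run one request/answer cycle and report whether the answer matches the request."""
    request = user.produce(concept_name)                 # user -> provider
    matched = provider.recognize(request)                # provider interprets the request
    answers = [provider.produce(k) for k in matched]     # provider -> user
    recognized = {k for rep in answers for k in user.recognize(rep)}
    return concept_name in recognized                    # feedback check on the user side


# Invented example data, for illustration only.
user = Agent("user", {"wetland": {"water-saturated", "vegetated"}, "glacier": {"ice"}})
provider = Agent("provider", {"marsh/swamp": {"water-saturated", "herbaceous"},
                              "road": {"paved"}})
print(interoperate(user, provider, "wetland"))
print(interoperate(user, provider, "glacier"))
```

In this sketch, the check performed by the user agent at the end plays the role of the feedback mechanism: interoperability is reported only when the answer leads back to the geoConcept of the initial request.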
Frank (2000) described a similar formalization, which involves user and producer agents
communicating with maps. This formalization draws a parallel between the facts of a
real-world situation and belief representations (agents’ cognition), which are
communicated using maps. Here, geoConcepts can be compared to belief representations,
since both are agents’ representations of facts of reality, and geoConceptReps to maps,
since both are means of communicating geospatial data.
4.4.1 Semantic similarity of geospatial data
The assessment of semantic similarity of geospatial data has also been studied in the
Semantic Formal Data Structure (SFDS) (Bishr, 1997), in the Matching Distance (MD)
model (Rodriguez, 2000), and in the Isis solution (Benslimane, 2001). In SFDS, three
components are involved in the assessment of semantic interoperability: an export
schema, a federated schema, and a proxy context mediator. The export schema defines
conceptual representations that are used to communicate database concepts to users. The
federated schema gathers the definition of domain specific concepts such as
transportation, hydrology, soils, and so on. The proxy context mediator consists in a
common ontology used to map conceptual representations of the export schema and the
concepts of the federated schema. Consequently, the semantic proximity corresponds
here to the fact that the conceptual representation of the export schema and the concept of
the federated schema are both linked to the same concept in the proxy context mediator.
The MD model is a method that measures the semantic proximity between two
geographic concepts. The semantic proximity consists in a conceptual distance computed
by analysing the common and distinguishing components between the two geographic
concepts. This conceptual distance is evaluated by a weighted sum of the semantic
proximity of the parts, functions, and attributes of the two geographic concepts. The
semantic proximity of parts, functions, and attributes respectively relies on the ratio of
their common components, |C1 ∩ C2|, to the sum of their common and distinguishing
components, |C1 ∩ C2| + α|C1 − C2| + (1 − α)|C2 − C1|.
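The ratio above and the weighted sum over parts, functions, and attributes can be illustrated with a short Python sketch. It is not the implementation of (Rodriguez, 2000): the function names, the treatment of α as a plain parameter, the convention of returning 0 when both sets are empty, and the sample components are all assumptions made for illustration.

```python
def matching_ratio(c1, c2, alpha=0.5):
    """Ratio of common components to common plus weighted distinguishing components."""
    common = len(c1 & c2)
    denominator = common + alpha * len(c1 - c2) + (1 - alpha) * len(c2 - c1)
    # Assumption: two empty component sets yield 0.0 rather than an error.
    return common / denominator if denominator else 0.0


def weighted_similarity(a, b, weights, alpha=0.5):
    """Weighted sum of the ratios computed per kind of component."""
    return sum(w * matching_ratio(a[kind], b[kind], alpha) for kind, w in weights.items())


# Invented components, for illustration only.
c1 = {"a", "b", "c"}
c2 = {"b", "c", "d", "e"}
print(matching_ratio(c1, c2, alpha=1.0))  # only the components specific to c1 count
print(matching_ratio(c2, c1, alpha=1.0))  # differs from the line above: asymmetric

concept = {"parts": {"lane", "shoulder"}, "functions": {"transport"}, "attributes": {"surface"}}
weights = {"parts": 0.5, "functions": 0.25, "attributes": 0.25}
print(weighted_similarity(concept, concept, weights))
```

With α different from 0.5, swapping the two arguments changes the result, which shows that the measure is not symmetric in general.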
The Isis solution is structured in two layers: the data and the mediation layers. The data
layer corresponds to heterogeneous databases and their respective data models. The
mediation layer is made up of the following components: (1) the universe of discourse
(i.e. a subset of reality); (2) a global ontology (i.e. an ontology of generic concepts); (3) a
context of reference (i.e. a domain specific ontology); and (4) the set of database-specific
co-operation contexts (i.e. interpretations of database data models according to a specific
context of reference). A co-operation context consists of a set of classes (called co-
operation classes) that are made of mediation roles, virtual classes and context
transformation functions. Essentially, in this solution, semantic proximity consists in the
mapping of co-operation classes of heterogeneous databases, which is based on the
comparison of their respective mediation roles. It is an asymmetric operation (i.e. the
mapping of C1 to C2 could be different from the mapping of C2 to C1) that qualifies the
semantic proximity between co-operation classes as impossible (or empty), partial, or
complete.
4.4.2 Identity of geographic phenomena
According to our conceptual framework for geospatial data interoperability, many
geoConcepts and geoConceptReps are used for the communication of information about
the same geographic phenomenon, namely geoConcepts of the two agent models
(cognitive or computerized) and geoConceptReps produced by geoConcepts of both
agents. Consequently, geoConcepts and geoConceptReps are not as important as the
phenomenon they designate. In the interoperability of geospatial data, geoConcepts of the
data provider agent should recognize in the received geoConceptReps the same
geographic phenomena as those referred to by the geoConcepts that the user agent used to
produce these geoConceptReps, and vice versa. Taking this into account, the identity of
geographic phenomena appears to be a notion closely related to the interoperability of
geospatial data. A geoConcept and a geoConceptRep must refer to the same set of
geographic phenomena in order to be perfectly interoperable and this means that the same
identity of geographic phenomena must be recognized from geoConcepts and
geoConceptReps.
In his Essay Concerning Human Understanding, philosopher John Locke defines identity
as follows:
« When we see anything to be in any place in any instant of time, we are
sure (be it what it will) that it is that thing, and not another which at the
same time exists in another place, how like and undistinguishable
soever it may be in all other respects: and in this consists identity (…) »
(Locke, 1689).
As such, identity is a meta-property, which allows us to distinguish and individualize
distinct geographic phenomena (Guarino and Welty, 2000a) as well as to recognize
representations corresponding to the same phenomena. Identity is a notion that
acknowledges the oneness of a phenomenon. Recognizing the same identity in
numerous representations of geographic phenomena makes it possible to identify which
representations are interoperable. This can be done by comparing the properties from
which the identity of the different representations can be recognized. According to
(Guarino and Welty, 2000a; Guarino and Welty, 2000b), these properties are said to carry
identity conditions. The comparison of these properties becomes possible because of the
agents’ common knowledge about real world phenomena, commonly known as
commonness (Schramm, 1971) in communication sciences. As such, recognition of the
same identity between a geoConcept and a geoConceptRep is possible if they have
sufficient commonness. Therefore, we consider identity as a basic notion to assess
interoperability of geospatial data.
In the development of geospatial databases, conceptual data modelers identify and
describe geographic phenomena using their own specific abstractions. Consequently, it is
common that phenomena are abstracted differently in distinct geospatial databases (e.g.
wetland vs. marsh/swamp, vegetation vs. wooded area) because of the specific goal for
which data models are elaborated or because of the experience of data modelers: this is
related to the context. More specifically, the context corresponds to the situation and the
circumstances in which a phenomenon is perceived, abstracted, and used (the notion of
context is detailed further in this section) (Brodeur et al., 2003). Accordingly, the identity
of a geographic phenomenon is depicted differently in the various abstractions of that
geographic phenomenon and, thus, the context contributes to a partial description of the
identity of such a phenomenon (Wisse, 2000). Consequently, the global description of the
identity of a geographic phenomenon consists of the union of all context dependent
descriptions of its identity. For instance, the union of all descriptions of a road segment
taken from the NTDB, the Street Network File, the VMap libraries, and other databases
results in the global description of the identity of this road segment. However, these
descriptions of phenomena might show inconsistencies between them, thus
complicating the issue of identity. Specifically, we recognize in a geoConcept and a
geoConceptRep the identity of a geographic phenomenon from their built-in properties,
which describe the context in which the phenomenon has been abstracted (Wisse, 2000).
Hence, the similarity between a geoConcept and a geoConceptRep, as is the case in the
notion of geosemantic proximity, can be assessed by comparing their respective
properties, especially those supporting the recognition of identity (this will be detailed in
section 4.6).
4.4.3 Boundary of geoConcept and geoConceptRep
In geospatial data modelling, conceptual data modelers abstract geographic phenomena
using geoConcepts and geoConceptReps. Accordingly, the definition of a geoConcept or
a geoConceptRep circumscribes all geographic phenomena that are intended by that
abstraction and not others. As such, we can imagine that a geoConcept or a
geoConceptRep is bounded in order to restrict its specific domain. The assessment of
similarities and differences between a geoConcept and a geoConceptRep, as is the case
in GsP, has to consider the respective extents and boundaries of the geoConcept and the
geoConceptRep. As such, we assume in the GsP notion that a geoConcept and a
geoConceptRep consist of an interior and a boundary, which correspond to intrinsic
properties and extrinsic properties respectively. We call intrinsic properties those
properties providing the literal meaning of a geoConcept or a geoConceptRep (e.g.
identification, attributes, attribute values, geometry, temporality, and domain) whereas
extrinsic properties are those properties providing meaning by their association with other
geoConcepts and geoConceptReps respectively (e.g. semantic, spatial, and temporal
relationships as well as behaviours). The notion of boundary has been studied in (Casati
et al., 1998; Smith, 1994; Smith and Mark, 1999; Smith and Varzi, 2000), which recognize
two types of boundaries: boundaries resulting from genuine or physical object
demarcation (also called bona fide boundaries) and boundaries resulting from human-driven
demarcation, which are essentially artificial, imaginary, or virtual (also called fiat
boundaries). According to (Smith, 1994; Smith and Mark, 1999), the boundaries of
geoConcepts and geoConceptReps are of the fiat type, and our notion of GsP agrees with
the kind of topology associated with fiat boundaries (Casati et al., 1998; Smith and Varzi,
2000).
4.4.4 Geosemantic proximity and topology
Semantic proximity is not a new notion. It has been studied in cognitive science,
psychology, linguistics, and artificial intelligence. It is used to express the similarity
between abstractions of real-world phenomena. As such, it provides a valuable
foundation to further the development of semantic interoperability of geospatial data.
Most of the methods use semantic networks to compute a conceptual distance between
abstractions, which consists in a quantitative assessment of the semantic proximity
(Lehmann, 1992; Rodriguez, 2000). As such, a semantic network constitutes an ontology
of a part of the reality. It is made up of an interconnected node-arc-like structure where
nodes represent abstractions and arcs are links between abstractions. Frame systems,
relational graphs, and hierarchies (e.g. lattices, trees, and acyclic graphs) are types of
structures commonly used to implement semantic networks (Lehmann, 1992; Sowa,
1987). In such structures, the semantic proximity consists of the smallest conceptual
distance between two abstractions that is computed from the network. To do so, a
coefficient is assigned to each arc, which expresses the conceptual distance between
linked abstractions (Frankhauser et al., 1991). For abstractions not linked directly, a
conceptual distance is inferred by traversing the network from one abstraction to the
other. When many paths are possible, the smallest conceptual distance expresses the
semantic proximity.
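The traversal described above amounts to a shortest-path search over a weighted network. The following Python sketch uses Dijkstra's algorithm for that purpose; it is one possible way to compute the smallest conceptual distance, not the method of any cited author. The function names, the choice of an undirected network, and the concepts and coefficients in the example are assumptions.

```python
import heapq


def link(network, a, b, coefficient):
    """Add an undirected arc carrying a conceptual-distance coefficient."""
    network.setdefault(a, {})[b] = coefficient
    network.setdefault(b, {})[a] = coefficient


def conceptual_distance(network, source, target):
    """Smallest sum of arc coefficients between two abstractions (Dijkstra)."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        distance, node = heapq.heappop(queue)
        if node == target:
            return distance
        if distance > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, coefficient in network.get(node, {}).items():
            candidate = distance + coefficient
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")  # no path links the two abstractions


# Invented network and coefficients, for illustration only.
network = {}
link(network, "marsh", "wetland", 1.0)
link(network, "wetland", "hydrography", 2.0)
link(network, "marsh", "vegetation", 4.0)
link(network, "vegetation", "hydrography", 1.0)
print(conceptual_distance(network, "marsh", "hydrography"))
```

Two paths connect marsh and hydrography in this example; the function returns the distance of the shorter one, which is the value that expresses the semantic proximity in this kind of approach.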
IS_A and PART_OF relationships are types of links frequently encountered in semantic
networks (Guarino, 1995; Guarino, 1999; Lehmann, 1992; Sowa, 1987) and refer, in
linguistics, to the study of hyponymy and meronymy, respectively. The IS_A relationship
establishes a specificity relationship (Kashyap and Sheth, 1996) between two concepts,
distinguishing the specific from the general concept and, as such, is asymmetric. The
PART_OF relationship distinguishes constituent elements from the whole. It is also an
asymmetric relationship. Rodriguez et al. (1999) recalled seven different uses of the
PART_OF relationship: component-object, member-collection, portion-mass, stuff-object,
feature-activity, place-area, and phase-process. These types of relationship detail
the specific role played by constituent elements and the whole in a PART_OF relation.
However, concepts may show other types of relations between them. These relationships
hold a semantics different from IS_A and PART_OF (e.g. person::possess::lot,
breakwater::protect::harbour) and are as important as IS_A and PART_OF relationships.
In object-oriented modeling (e.g. UML class diagram), IS_A relationships are
implemented as generalization/specialization relationships and PART_OF relationships
as aggregation and composition associations while other relationships are represented as
generic associations between classes (Object Management Group, 2001). As such, UML-
based repositories provide the necessary elements to develop geospatial ontologies.
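As a rough illustration of this mapping, the Python sketch below expresses an IS_A relationship through class inheritance, a PART_OF relationship through a collection of members, and a generic association through a plain reference. The class names and the specific relationships (limited access road as a kind of road, roads as members of a road network) are assumptions chosen for the example; only breakwater::protect::harbour is taken from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Road:
    """General concept."""
    name: str


@dataclass
class LimitedAccessRoad(Road):
    """IS_A: a specialization of Road (assumed for illustration)."""


@dataclass
class RoadNetwork:
    """PART_OF (member-collection): the whole aggregates its members."""
    name: str
    members: list = field(default_factory=list)


@dataclass
class Harbour:
    name: str


@dataclass
class Breakwater:
    """Generic association: breakwater::protect::harbour."""
    name: str
    protects: list = field(default_factory=list)


highway = LimitedAccessRoad("highway 40")
network = RoadNetwork("regional network", members=[highway, Road("main street")])
harbour = Harbour("old port")
breakwater = Breakwater("north breakwater", protects=[harbour])

print(isinstance(highway, Road))        # the specific concept is also a Road
print(highway in network.members)       # the road is part of the network
print(breakwater.protects[0].name)      # the association leads to the harbour
```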
Semantic proximity has also been studied in context-based approaches (Kashyap and
Sheth, 1996; Kashyap and Sheth, 1998). Several authors acknowledge that the context
plays a fundamental role in the abstraction of phenomena (Wisse, 2000) as well as in the
assessment of semantic proximity as applied to semantic interoperability (Bishr, 1997;
Harvey et al., 1999; Kashyap and Sheth, 1996; Kashyap and Sheth, 1998; Ouksel and
Sheth, 1999; Rodriguez, 2000). This notion is however perceived differently from one
author to another. As surveyed by Kashyap and Sheth (1996), the context can refer to:
- the needed knowledge to reason about another system;
- the signification, content, organisation and properties of data;
- a well-defined subset of an ontology;
- the association to one or multiple data sources;
- the relationship in which an object class participates;
- the relationship with an export or an external schema;
- a named collection of domains of objects; and
- the situation in which a particular semantic similarity holds between two objects.
The context can also refer to:
- the set of category definitions, class intension definitions, and geometric
descriptions (Bishr, 1997); and
- a set of tuples over operations associated with entity arguments (nouns)
(Rodriguez, 2000).
Specifically, the context is associated with the manner in which a phenomenon is abstracted
(Kashyap and Sheth, 1996; Kashyap and Sheth, 1998; Wisse, 2000). It consists of a set of
elements that influence the perception of a phenomenon, that make some properties more
attractive, and that affect the manner in which the abstraction of the phenomenon is used.
A concept resulting from the abstraction of a phenomenon is described according to a
given context and, as such, it is this context that gives the concept its intended
semantics (Bishr, 1997; Ouksel and Sheth, 1999; Wisse, 2000). For instance, in
geospatial conceptual modelling, object classes, properties, geometry, temporality,
behaviours, and relationships are identified and defined according to the context in which
phenomena are observed. As such, the context is described by way of metadata, data
models, and ontologies. In semantic interoperability of geospatial data, an additional
challenge consists in the capability of reasoning about the context, i.e. to compare
geoConcepts with geoConceptReps based on their respective contexts, which accounts
for a context-based approach. A context-based approach expresses the likeness of
abstractions of geographic phenomena qualitatively using a set of semantic proximity
predicates (e.g. semantic resemblance, semantic relevance, semantic relation, semantic
equivalence, and semantic incompatibility (Kashyap and Sheth, 1996)).
Adopting a context-based approach, we believe that semantic proximity of geospatial
data can be thought of in terms of topological relationships existing between abstractions.
Topological relationships have been extensively studied when it comes to temporal and
spatial information (Allen, 1983; Clementini and Di Felice, 1995; Clementini and Di
Felice, 1996; Egenhofer, 1993; Egenhofer et al., 1994a; Egenhofer and Sharma, 1992;
Egenhofer, 1997; Egenhofer et al., 1994b; Egenhofer and Franzosa, 1995; Egenhofer and
Mark, 1995). Topological relationships describe geometric-like properties existing
between abstractions of phenomena, which are invariant under continuous
transformations (e.g. translation, rotation, and scaling). In geographic information, the 4-
intersection and the 9-intersection models (Egenhofer, 1993; Egenhofer et al., 1994a) as
well as the calculus-based method (Clementini and Di Felice, 1996) have become
standard approaches (ISO/TC 211, 2003a; Open GIS Consortium Inc., 1999a). The
4-intersection model is based on the intersection between interiors and boundaries of two
geometric representations, and the 9-intersection model adds the notion of exterior. The
calculus-based method models the topology of spatial data using 2D geometric data
types, point, line, and area with three boundary operators: (A, b) returning the boundary b
of an area A, (L, f) returning the “from point” boundary f of a line L, and (L, t) returning
the “to point” boundary t of a line L. In these models, authors have produced sets of
mutually exclusive predicates, which qualitatively describe the commonness and the
difference between two spatial representations of a phenomenon (e.g. disjoint, meet,
equal, inside, contains, covers, coveredBy, and overlap). These predicates express the
similarity between two geometric representations. They have been tested with human
subjects and various results show that they are representative of human spatial reasoning
(Mark and Egenhofer, 1994).
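To make the 4-intersection model concrete, the Python sketch below computes the four intersections between the interiors and boundaries of two areas and maps the resulting pattern to one of the eight predicates for areas. It is a simplified illustration: areas are assumed to be axis-aligned rectangles discretized on an integer grid, and the names Region, rectangle, four_intersection, and relate are hypothetical.

```python
from typing import NamedTuple


class Region(NamedTuple):
    interior: frozenset
    boundary: frozenset


def rectangle(x0, y0, x1, y1):
    """Discretize an axis-aligned rectangle into interior and boundary grid cells."""
    cells = {(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)}
    boundary = {(x, y) for (x, y) in cells if x in (x0, x1) or y in (y0, y1)}
    return Region(frozenset(cells - boundary), frozenset(boundary))


def four_intersection(a, b):
    """Non-emptiness of (boundary-boundary, interior-interior,
    boundary(a)-interior(b), interior(a)-boundary(b))."""
    return (
        bool(a.boundary & b.boundary),
        bool(a.interior & b.interior),
        bool(a.boundary & b.interior),
        bool(a.interior & b.boundary),
    )


# The eight patterns that occur between two areas.
PREDICATES = {
    (False, False, False, False): "disjoint",
    (True, False, False, False): "meet",
    (True, True, False, False): "equal",
    (False, True, True, False): "inside",
    (False, True, False, True): "contains",
    (True, True, True, False): "coveredBy",
    (True, True, False, True): "covers",
    (True, True, True, True): "overlap",
}


def relate(a, b):
    """Name the topological relationship of a with respect to b."""
    return PREDICATES.get(four_intersection(a, b), "undefined")


print(relate(rectangle(0, 0, 4, 4), rectangle(2, 2, 6, 6)))
print(relate(rectangle(2, 2, 4, 4), rectangle(0, 0, 6, 6)))
```

Because each pattern of empty and non-empty intersections maps to exactly one name, the predicates are mutually exclusive, as stated above.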
We define the GsP (described in section 4.6) in terms of relationships between the
context of a geoConcept and the context of a geoConceptRep, in much the same way as
topological relationships. Hausdorff defined a concept of topological space based on four
axioms (Weisstein, 1999):
1. To each point x, there corresponds at least one neighbourhood U(x), and U(x)
contains x;
2. If U(x) and V(x) are two neighbourhoods of the same point x, then there exists a
neighbourhood W(x) of x such that W(x) is a subset of the intersection of U(x) and V(x);
3. If y is a point in U(x), then there exists a neighbourhood U(y) of y such that U(y) is
a subset of U(x);
4. For distinct points x and y, there exist two disjoint neighbourhoods U(x) and
U(y).
We assume in this thesis that the set of properties of a geoConcept or a geoConceptRep
describes an intensional representation of their context and the set of all occurrences of a
geoConcept or a geoConceptRep context constitutes their extension. Working with
Hausdorff axioms, if we suppose that (1) x and y are occurrences, and (2) U(x), V(x),
W(x) and U(y) are geoConcept or geoConceptRep contexts, then a geoConcept or a
geoConceptRep context can be considered as a point set and, therefore, as continuous.
Consequently, the notions of interior, boundary, and exterior can be applied to both
geoConcept and geoConceptRep contexts. However, this thesis only addresses the
formalization of interior and boundary. Accordingly, the notion of GsP is developed as a
4-intersection model, which is homomorphic to existing spatial and temporal topological
models. As such, we believe that GsP is best suited when dealing simultaneously with
geometric, temporal, and semantic data, as is the case for geospatial information.
4.5 Formalisation of geoConcept and geoConceptRep in relation to the context
This section presents the formalization of geoConcept and geoConceptRep, which will be
used to define the GsP.
GeoConcepts and geoConceptReps result from the existence of geographic phenomena,
their perception, and their abstraction (see Figure 18). In the scope of this thesis, we
assume that a geographic phenomenon consists of a fact (i.e. something that exists) or an
event that is observable through our senses (and their technological extensions such
as satellite sensors) and that is associated with a geographic position and a shape.
Typically, the abstraction of a geographic phenomenon occurs within a given context. As
such, the context constitutes an essential element in the perception and the abstraction of
a geographic phenomenon. It governs the way in which a geographic phenomenon is
perceived and abstracted as well as how the resulting abstraction will be used. However,
we must recognize that the context is essentially a fictitious and an imaginary notion that,
in fact, does not exist in reality. It is a meta-concept referring to all circumstances
surrounding the existence of a geographic phenomenon and its abstraction, which
provides the intended semantics to the abstraction (Ouksel and Sheth, 1999).
Consequently, the properties of an abstraction of a geographic phenomenon describe the
context to which this geographic phenomenon is subject. Therefore, an abstraction of a
geographic phenomenon consists in the representation of that geographic phenomenon,
which holds in a specific context. Thus, each geographic phenomenon abstracted in a
different context instantiates a different abstraction.
GeoConcept and geoConceptRep are two types of abstraction of geographic phenomena.
On the one hand, a geoConcept is a type of abstraction that an agent (e.g. human,
computerized system) persistently maintains in memory. A geoConcept constitutes one
element of the agent’s ontology. The agent’s ontology is the set of its geoConcepts with
their interrelationships.
Let:
K: a geoConcept,
A: the agent,
OA: the agent’s ontology.
Then:
K ∈ OA Equation 9
Referring to Table 3, road is an example of a geoConcept in an NTDB-based agent’s
ontology. The agent’s ontology also contains other geoConcepts such as
limited access road, waterbody, watercourse, irrigation canal, and so on.
The information that a concept holds is typically hidden and, therefore, not directly
accessible. As such, a geoConcept has a function (called simulate) corresponding to the
simulation function presented in (Barsalou, 1999). This function performs reasoning
procedures in order to produce and to recognize geoConceptReps that are similar to the
geoConcept. It constitutes the main interface of a geoConcept.
Figure 18: UML object class diagram of geoConcept and geoConceptRep
On the other hand, a geoConceptRep is a type of abstraction of a geographic
phenomenon, which consists of the representation of a geoConcept with symbols, words,
sounds, etc. A geoConceptRep is used to communicate data about that geographic
phenomenon from one agent to another. It results from a representation function (r)
over a geoConcept.
Let:
L: a geoConceptRep,
K: a geoConcept,
r: a representation function.
Then:
L = r(K) Equation 10
Continuing with the road example, an NTDB-based agent could represent the
geoConcept road as a geoConceptRep with the following XML encoding:
<conceptualRepresentation>
  <intrinsicProperties>
    <identification>
      <name>road</name>
      <definition>a road for the movement of motor vehicles.</definition>
    </identification>
    <descriptiveAttribute>
      <name>surface</name>
      <attributeValue>
        <name>hard surface</name>
        <definition>a surface made of concrete, asphalt, or tar gravel.</definition>
      </attributeValue>
      <attributeValue>
        <name>loose surface</name>
        <definition>a surface made of other than concrete, asphalt, or tar gravel.</definition>
      </attributeValue>
    </descriptiveAttribute>
    <geometry>1</geometry>
  </intrinsicProperties>
  <extrinsicProperties>
    <relationMembership>
      <relation>
        <name>share</name>
        <firstMember>road</firstMember>
        <secondMember>bridge</secondMember>
      </relation>
    </relationMembership>
  </extrinsicProperties>
</conceptualRepresentation>
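As an illustration only, the following Python sketch shows how a receiving agent might read such an encoding and separate its intrinsic from its extrinsic properties. The function name read_geoconcept_rep and the shape of the returned dictionary are assumptions made for this example; they are not part of the prototype described in this thesis. The definitions are omitted from the sample string for brevity.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the road geoConceptRep (definitions omitted for brevity).
ROAD_XML = """
<conceptualRepresentation>
  <intrinsicProperties>
    <identification><name>road</name></identification>
    <descriptiveAttribute>
      <name>surface</name>
      <attributeValue><name>hard surface</name></attributeValue>
      <attributeValue><name>loose surface</name></attributeValue>
    </descriptiveAttribute>
    <geometry>1</geometry>
  </intrinsicProperties>
  <extrinsicProperties>
    <relationMembership>
      <relation>
        <name>share</name>
        <firstMember>road</firstMember>
        <secondMember>bridge</secondMember>
      </relation>
    </relationMembership>
  </extrinsicProperties>
</conceptualRepresentation>
"""


def read_geoconcept_rep(xml_text):
    """Hypothetical helper: split a geoConceptRep into its property groups."""
    root = ET.fromstring(xml_text)
    intrinsic = root.find("intrinsicProperties")
    extrinsic = root.find("extrinsicProperties")
    # Each descriptive attribute maps to the names of its enumerated values.
    attributes = {
        attr.findtext("name"): [v.findtext("name") for v in attr.findall("attributeValue")]
        for attr in intrinsic.findall("descriptiveAttribute")
    }
    # Each relation membership becomes a (relation, first member, second member) tuple.
    relations = [
        (rel.findtext("name"), rel.findtext("firstMember"), rel.findtext("secondMember"))
        for rel in extrinsic.findall("relationMembership/relation")
    ]
    return {
        "name": intrinsic.findtext("identification/name"),
        "attributes": attributes,
        "geometry": intrinsic.findtext("geometry"),
        "relations": relations,
    }


rep = read_geoconcept_rep(ROAD_XML)
print(rep["name"], rep["attributes"], rep["relations"])
```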
Typically, an agent’s geoConcept produces a geoConceptRep, which is then placed in the
communication channel. When the geoConceptRep reaches a destination, geoConcepts of
the destination agent try to recognize it. Accordingly, a geoConceptRep constitutes a
physical but transient representation of a geographic phenomenon and has a determined
lifetime, i.e. it lasts from the moment it is produced to the moment it reaches its destination.
A set of properties describes an abstraction. Each property depicts one aspect of interest
of a geographic phenomenon according to the context. As such, a property constitutes a
partial description of the context in which the geographic phenomenon has been
observed. Properties are of two types: intrinsic properties and extrinsic properties (Figure
19).
Figure 19: UML object class diagram of property types
Intrinsic properties describe the essential nature of a geographic phenomenon, its
fundamental character. They provide the literal meaning of the abstraction. They are
independent of any external factor. An agent should be able to recognize the identity of a
geographic phenomenon from the set of intrinsic properties of an abstraction (Guarino
and Welty, 2000a). In the scope of this thesis, intrinsic properties are limited to the
following components: identification, characteristic, and abstraction domain (Figure 20).
Identification is the most fundamental property, which consists of the name and the
definition of the abstraction. Both denote what is intended by that abstraction.
Continuing with the previous road example, the name <road> and the definition <a
road for the movement of motor vehicles> constitute the road ’s identification.
A characteristic is a descriptor of one distinct character of a geographic phenomenon that
holds in a given context. We assume three types of characteristics: descriptive attribute,
geometry, and temporality. A descriptive attribute consists of a name and a definition.
When applicable, a set of attribute values defines the domain of the descriptive attribute
and, when the domain consists of a set of enumerated textual values, each value has a
name and a definition. Geometry is the characteristic that depicts the geographic position
and the shape of the geographic phenomenon whereas temporality is the characteristic
that describes a specific moment and the life span associated with the existence of the
geographic phenomenon and with its other characteristics (descriptive attributes and
geometry). The attribute <surface>, the attribute values <hard surface> and <loose
surface>, and the geometry <1> constitute characteristics of our previous road
example.
Figure 20: UML object class diagram describing the types of intrinsic properties
An abstraction also has a domain. This domain details the possible combinations of
descriptive attribute values along with the geometry type (e.g. point, line, or surface) and
the temporality type (e.g. punctual and durable time). It sets the extent of the abstraction
of the geographic phenomenon.
Extrinsic properties are elements that provide meaning because of the interaction of the
abstraction with external factors and, conversely, because of the influence of these
external factors on the abstraction. In the scope of this thesis, we limit extrinsic properties
to behaviours and to memberships within a relationship (Figure 21).
A behaviour consists of the set of observable reactions of an abstraction as a response to
a stimulation activated by the abstraction itself or by other abstractions (Merriam-
Webster Inc., 1994; Microsoft Corporation and Liris Interactive, 1996). For example, a
dam could behave as a bridge when roads are connected to both ends of the dam and cars
are allowed to pass on the dam to cross the river. However, we consider as extrinsic
properties only those behaviours that are stimulated by external abstractions.
A relationship expresses the dependency that exists between abstractions. Relationships
can be classified in different manners. Here, we consider IS_A relationships (which take
the form of inheritance and generalization/specialization relationships), PART_OF
relationships (which take the form of aggregation and composition relationships), as well
as other semantic, spatial, and temporal associations. Memberships of an abstraction in a
relationship qualify its dependency with another abstraction and, as such, constitute an
extrinsic property. In the road example, road is a member of the relationship between
<road> and <bridge>.
Figure 21: UML object class diagram describing the types of extrinsic properties
Although the different properties are classified as intrinsic or extrinsic, the comparison of
a geoConcept with a geoConceptRep analyses each property with its inherent semantics
(including the different types of relationship). For instance, let us consider that a car is a
vehicle, an automobile is a vehicle, and both car and automobile have an IS_A
relationship with vehicle. Car and automobile therefore have a common extrinsic
property.
In this thesis, we formalize the context of an abstraction as the union of all its intrinsic
and extrinsic properties (Equation 11):
Let:
K: an abstraction
CK: the context of K,
CK°: intrinsic properties of K,
∂CK: extrinsic properties of K.
Then:
CK = CK° ∪ ∂CK Equation 11
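To make Equation 11 concrete, here is a minimal Python sketch in which an abstraction is reduced to two sets of property labels and its context is their union. The class name Abstraction and the label strings (such as "name:road") are hypothetical and chosen only for this illustration; a real agent would hold much richer property descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Abstraction:
    """A geoConcept or geoConceptRep reduced to two sets of property labels."""

    name: str
    intrinsic: frozenset  # C°: identification, characteristics, abstraction domain
    extrinsic: frozenset  # ∂C: behaviours and relationship memberships

    @property
    def context(self) -> frozenset:
        # Equation 11: the context is the union of intrinsic and extrinsic properties.
        return self.intrinsic | self.extrinsic


# Illustrative labels loosely based on the road example; not an actual ontology.
road = Abstraction(
    name="road",
    intrinsic=frozenset({"name:road", "attribute:surface", "geometry:1"}),
    extrinsic=frozenset({"relation:share(road,bridge)"}),
)

print(sorted(road.context))
```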
We represent the context of K by a segment on a semantic axis (Figure 22). In this
metaphor, the intrinsic properties correspond to the interior of the segment whereas the
extrinsic properties correspond to the boundaries of the segment. This type of geometry
was chosen because the point geometry has, by definition, no boundary and the line
geometry, which is the next level, provides, in a two dimensional space, the necessary
properties for our purposes. The union of intrinsic and extrinsic properties, as defined in
Equation 11, defines the closure of the context of K.
Figure 22: The context of an abstraction K
4.6 Geosemantic proximity
As mentioned previously, a geoConcept and a geoConceptRep must refer to the same set
of geographic phenomena in order to be perfectly interoperable. The identity of
geographic phenomena described by a geoConcept must be the same as the identity of
geographic phenomena that are described by a geoConceptRep. In order to evaluate that a
geoConcept and a geoConceptRep refer to the same set of phenomena, we rely on the
similarities that exist between them, i.e. similarities between their intrinsic and extrinsic
properties. For example, a car and an automobile have the same number of wheels and
are engine propelled, i.e. a set of common intrinsic properties. They are both used to
transport people from one place to another and move along roads, i.e. a set of common
extrinsic properties. However, a bicycle and a car have different intrinsic properties
(different number of wheels and propulsion systems) but they are both used to transport
people from one place to another along roads or specially arranged passageways. As
such, they have no common intrinsic properties but similar extrinsic properties.
Following these examples, it is therefore possible to develop the different cases of
common intrinsic and extrinsic properties to express the similarity existing between a
geoConcept and a geoConceptRep. This section aims specifically at developing this set of
cases, which are central to the GsP notion. Consequently, GsP takes the form of a
function (g) over the geoConcept and the geoConceptRep, which determines the semantic,
spatial, and temporal similarity of the geoConcept with the geoConceptRep.
Let:
GsP: the semantic, spatial, and temporal similarity of the geoConcept
with the geoConceptRep,
K: a geoConcept,
L: a geoConceptRep,
g: a semantic, spatial, and temporal similarity function.
Then:
GsP = g(K, L) Equation 12
4.6.1 Description of GsP
GsP is an approach that determines qualitatively the similarity of semantic, geometric,
and temporal aspects of a geoConcept and of a geoConceptRep by comparing their
respective intrinsic and extrinsic properties. This approach aims at solving semantic,
geometric and temporal heterogeneities. It essentially relies on the intersection between
the context of a geoConcept (K) and the context of a geoConceptRep (L). Because we
transpose the geoConcept and the geoConceptRep to a geometric-like metaphor (i.e. a
line segment, see Figure 22), this allows us to develop the geosemantic proximity into a
set of topological relations between the geoConcept and the geoConceptRep using the
intersection of their respective context (Figure 23).
Let:
CK: Context of K,
CL: Context of L,
GsP (K,L): Geosemantic proximity between K and L.
Then:
GsP (K,L) = CK ∩ CL Equation 13
Figure 23: Intersection between context of K and context of L
Hence, from Equation 11 and Equation 13, GsP takes the form of a 4-intersection matrix
(Equation 14) between the intrinsic (C°) and extrinsic (∂C) properties of K and L. Each
member of the matrix evaluates the commonalities (semantic, geometric and temporal)
that exist between the context of K and the context of L. More specifically, ∂CK ∩ ∂CL
evaluates whether K and L participate in similar relationships (i.e. same semantic, spatial,
or temporal relationship type with the same external geoConcept or geoConceptRep) and
whether both have similar behaviours stimulated likewise by common external
geoConcepts or geoConceptReps. CK° ∩ CL° evaluates the correspondence of the intrinsic
properties between K and L. This intersection goes beyond the simple comparison of the
identifications of K and L, their descriptive attributes, geometry, temporality, and
domains of abstraction, respectively; it also compares the identification of K with the
descriptive attributes of L and their domains of values, and conversely. CK° ∩ ∂CL
evaluates whether L has relationships with K or has behaviour stimulated by K, and
reciprocally for ∂CK ∩ CL°. Section 4.6.2 illustrates with real-life examples how GsP
works.
GsP(K,L) = | ∂CK ∩ ∂CL    ∂CK ∩ CL° |
           | CK° ∩ ∂CL    CK° ∩ CL° |     Equation 14
In the comparison of properties between a geoConcept and a geoConceptRep, each
matrix element is evaluated as either empty (denoted by ∅ or f) or non-empty (denoted
by ¬∅ or t), expressing respectively that none or some properties are common. This leads
to the 16 (i.e. 2^4) possible GsP predicates (Figure 24), which are named
“GsP_” followed by a string of 4 “t” or “f” characters in row-major order (i.e. row by
row) according to the 4-intersection matrix.
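The naming scheme can be sketched in a few lines of Python. This is a simplified illustration, not the prototype's implementation: properties are assumed to be plain labels compared by equality, whereas the actual comparison analyses the semantics of each property. The cell order used here (extrinsic/extrinsic, K extrinsic/L intrinsic, K intrinsic/L extrinsic, intrinsic/intrinsic) is inferred from the predicate definitions given in this section. The function name gsp_predicate and the labels p1, q1, etc. are hypothetical.

```python
def gsp_predicate(k_intrinsic, k_extrinsic, l_intrinsic, l_extrinsic):
    """Return the GsP predicate name for K and L given as sets of property labels."""
    cells = [
        k_extrinsic & l_extrinsic,  # ∂CK ∩ ∂CL
        k_extrinsic & l_intrinsic,  # ∂CK ∩ CL°
        k_intrinsic & l_extrinsic,  # CK° ∩ ∂CL
        k_intrinsic & l_intrinsic,  # CK° ∩ CL°
    ]
    # A non-empty intersection is written "t", an empty one "f", row by row.
    return "GsP_" + "".join("t" if cell else "f" for cell in cells)


# Only the extrinsic properties are shared: the "meet" case.
meet_case = gsp_predicate({"p1", "p2"}, {"q1"}, {"p3"}, {"q1"})

# Intrinsic and extrinsic properties coincide: the "equal" case.
equal_case = gsp_predicate({"p1"}, {"q1"}, {"p1"}, {"q1"})

# Nothing in common: the "disjoint" case.
disjoint_case = gsp_predicate({"p1"}, {"q1"}, {"p2"}, {"q2"})

print(meet_case, equal_case, disjoint_case)
```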
Figure 24 presents a typology of the GsP predicates, which is organized in four distinct
subdivisions influenced by four poles: common intrinsic properties, common extrinsic
properties, no common intrinsic properties, and no common extrinsic properties. The
GsP predicates are detailed below according to this subdivision and examples follow
thereafter.
Figure 24: GsP predicates
The left section of Figure 24 refers to the predicates having common extrinsic properties
but no common intrinsic properties: GsP_tfff (or meet), GsP_tftf, GsP_ttff, and GsP_tttf.
The GsP_tfff (or meet) predicate characterizes that K and L basically refer to different
kinds of phenomena. However, because of their similar extrinsic properties with common
external factors, they evoke similar things. As such, only the intersection between
extrinsic properties of K and extrinsic properties of L is not empty.
GsP_tfff(K,L) :=
meet(K,L) :=
∀p [(p ∈ CK°) → (p ∉ CL°) ∧ (p ∉ ∂CL)] ∧
∀q [(q ∈ ∂CK) → (q ∉ CL°)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ ∂CL)]
3
GsP_tftf, GsP_ttff, and GsP_tttf predicates are specializations of the above GsP_tfff
predicate, where K and L, in addition to having common extrinsic properties, also have
extrinsic properties that depend on the other. More specifically, GsP_tftf characterizes
that L's extrinsic properties rely on K's intrinsic properties and, therefore, the intersection
between extrinsic properties of L and intrinsic properties of K is also not empty.
Reciprocally, GsP_ttff characterizes that K's extrinsic properties rely on L's intrinsic
properties and, therefore, the intersection between extrinsic properties of K and intrinsic
properties of L is also not empty. GsP_tttf corresponds to the cases where K's extrinsic
properties rely on L's intrinsic properties and conversely. Therefore in addition to the
intersection between extrinsic properties of K and extrinsic properties of L, the
intersection between extrinsic properties of K and intrinsic properties of L and the
intersection between extrinsic properties of L and intrinsic properties of K are not empty.
3 The spatial metaphor is used only to show the relationship between the geoConcept context and the
geoConceptRep context.
GsP_tftf(K,L) := ∀p [(p ∈ CK°) → (p ∉ CL°)] ∧
∀q [(q ∈ ∂CK) → (q ∉ CL°)] ∧
∃p [(p ∈ CK°) ∧ (p ∈ ∂CL)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ ∂CL)]
GsP_ttff(K,L) := ∀p [(p ∈ CK°) → (p ∉ CL°) ∧ (p ∉ ∂CL)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ CL°)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ ∂CL)]
GsP_tttf(K,L) := ∀p [(p ∈ CK°) → (p ∉ CL°)] ∧
∃p [(p ∈ CK°) ∧ (p ∈ ∂CL)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ CL°)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ ∂CL)]
The bottom subdivision of Figure 24 comprises predicates of no common intrinsic
properties and no common extrinsic properties: GsP_fftf, GsP_ftff, GsP_fttf, and GsP_ffff
(or disjoint). GsP_fftf, GsP_ftff, and GsP_fttf are similar to the above three predicates
with the exception that K and L share no extrinsic properties and simply rely on each other. So, GsP_fftf characterizes that
L's extrinsic properties rely on K's intrinsic properties and, therefore, only the intersection
between extrinsic properties of L and intrinsic properties of K is not empty. Reciprocally,
GsP_ftff characterizes that K's extrinsic properties rely on L's intrinsic properties and,
therefore, only the intersection between extrinsic properties of K and intrinsic properties
of L is not empty. GsP_fttf corresponds to the cases where K's extrinsic properties rely on
L's intrinsic properties and conversely. Accordingly, the intersection between extrinsic
properties of K and intrinsic properties of L and the intersection between extrinsic
properties of L and intrinsic properties of K are not empty.
GsP_fftf(K,L) := ∀q [(q ∈ ∂CK) → (q ∉ CL°) ∧ (q ∉ ∂CL)] ∧
∀p [(p ∈ CK°) → (p ∉ CL°)] ∧
∃p [(p ∈ CK°) ∧ (p ∈ ∂CL)]
GsP_ftff(K,L) := ∀p [(p ∈ CK°) → (p ∉ CL°) ∧ (p ∉ ∂CL)] ∧
∀q [(q ∈ ∂CK) → (q ∉ ∂CL)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ CL°)]
GsP_fttf(K,L) := ∀p [(p ∈ CK°) → (p ∉ CL°)] ∧
∀q [(q ∈ ∂CK) → (q ∉ ∂CL)] ∧
∃p [(p ∈ CK°) ∧ (p ∈ ∂CL)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ CL°)]
GsP_ffff (or disjoint) characterises that no commonality exists between intrinsic or
extrinsic properties of the geoConcept K (illustrated by a black segment) and the
geoConceptRep L (illustrated by a grey segment). Therefore, all four intersections are
empty.
GsP_ffff(K,L) :=
disjoint(K,L) :=
∀p [(p ∈ CK°) → (p ∉ CL°) ∧ (p ∉ ∂CL)] ∧
∀q [(q ∈ ∂CK) → (q ∉ CL°) ∧ (q ∉ ∂CL)]
The right subdivision of Figure 24 groups the predicates having common intrinsic
properties but no common extrinsic properties: GsP_ffft, GsP_fftt (or contains), GsP_ftft
(or inside), and GsP_fttt (or overlap). The GsP_ffft predicate applies when only
commonalities between intrinsic properties of K and L exist. As such, only the
intersection between intrinsic properties of K and intrinsic properties of L is not empty.
GsP_ffft(K,L) := ∀p [(p ∈ CK°) → (p ∉ ∂CL)] ∧
∀q [(q ∈ ∂CK) → (q ∉ CL°) ∧ (q ∉ ∂CL)] ∧
∃p [(p ∈ CK°) ∧ (p ∈ CL°)]
The GsP_fftt (or contains) predicate is possible when L is more specific than K. This
means that some intrinsic properties of K match all intrinsic properties of L and some are
associated with all L's extrinsic properties. Accordingly, only the intersection between K's
intrinsic properties and L's intrinsic properties and the intersection between K's intrinsic
properties and L's extrinsic properties are non-empty.
GsP_fftt(K,L) :=
contains(K,L) :=
∀q [(q ∈ ∂CK) → (q ∉ CL°) ∧ (q ∉ ∂CL)] ∧
∃p [(p ∈ CK°) ∧ (p ∈ CL°)] ∧
∃p [(p ∈ CK°) ∧ (p ∈ ∂CL)]
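The quantified definition of contains can be transcribed almost literally with Python's all and any. The sketch below is a hedged illustration under the simplifying assumption that properties are labels compared by equality; the function name contains and the sample labels (a broader concept K holding "street" and "highway" as attribute values, a narrower L related to "highway") are hypothetical.

```python
def contains(k_int, k_ext, l_int, l_ext):
    """Transcription of GsP_fftt: K is the geoConcept, L the geoConceptRep."""
    # ∀q [(q ∈ ∂CK) → (q ∉ CL°) ∧ (q ∉ ∂CL)]
    k_ext_shares_nothing = all(q not in l_int and q not in l_ext for q in k_ext)
    # ∃p [(p ∈ CK°) ∧ (p ∈ CL°)]
    common_intrinsic = any(p in l_int for p in k_int)
    # ∃p [(p ∈ CK°) ∧ (p ∈ ∂CL)]
    l_ext_relies_on_k_int = any(p in l_ext for p in k_int)
    return k_ext_shares_nothing and common_intrinsic and l_ext_relies_on_k_int


# Hypothetical labels: K is the broader concept, L the more specific one.
K_INT = {"street", "highway", "linear geometry"}
K_EXT = {"shares with bridge"}
L_INT = {"street", "linear geometry"}
L_EXT = {"highway"}

k_contains_l = contains(K_INT, K_EXT, L_INT, L_EXT)
l_contains_k = contains(L_INT, L_EXT, K_INT, K_EXT)
print(k_contains_l, l_contains_k)
```

With these labels the predicate holds for (K, L) but not for (L, K), which reflects that contains is not a symmetric predicate.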
Reciprocally, the GsP_ftft (or inside) predicate is possible when K is more specific than
L. Therefore, all K's intrinsic properties match L's intrinsic properties and all K's extrinsic
properties depend on L's intrinsic properties. Accordingly, only the intersection between
K's extrinsic properties and L's intrinsic properties and the intersection between K's
intrinsic properties and L's intrinsic properties are non-empty.
GsP_ftft(K,L) :=
inside(K,L) :=
∀p [(p ∈ CK°) → (p ∈ CL°)] ∧
∀q [(q ∈ ∂CK) → (q ∈ CL°)]
The GsP_fttt (or overlap) predicate applies when commonalities exist between intrinsic
properties of K and L as well as when extrinsic properties of K refer to intrinsic properties
of L, and conversely. Accordingly, the intersection between K's extrinsic properties and
L's intrinsic properties, the intersection between K's intrinsic properties and L's extrinsic
properties, and the intersection between K's intrinsic properties and L's intrinsic
properties are non-empty.
GsP_fttt(K,L) :=
overlap(K,L) :=
∀q [(q ∈ ∂CK) → (q ∉ ∂CL)] ∧
∃p [(p ∈ CK°) ∧ (p ∈ CL°)] ∧
∃p [(p ∈ CK°) ∧ (p ∈ ∂CL)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ CL°)]
Finally, the top section gathers predicates with both common intrinsic and common
extrinsic properties: GsP_tfft (or equal), GsP_tftt (or covers), GsP_ttft (or coveredBy),
and GsP_tttt. The GsP_tfft (or equal) predicate characterizes that K and L refer exactly to
the same set of phenomena. Therefore, there is a mapping between all K's and L's
intrinsic properties as well as between all K's and L's extrinsic properties. Accordingly,
only the intersection between K's intrinsic properties and L's intrinsic properties and the
intersection between K's extrinsic properties and L's extrinsic properties are non-empty.
GsP_tfft(K,L) :=
equal(K,L) :=
∀p [(p ∈ CK°) → (p ∈ CL°)] ∧
∀q [(q ∈ ∂CK) → (q ∈ ∂CL)]
The GsP_tftt (or covers) predicate is possible when L is more specific than K and both are
related similarly to common external factors. This implies that some of K's intrinsic
properties match all intrinsic properties of L, some are related to L's extrinsic properties,
and K and L have similar extrinsic properties with common external factors. Accordingly,
the intersection between K's extrinsic properties and L's extrinsic properties, the
intersection between K's intrinsic properties and L's extrinsic properties, and the
intersection between K's intrinsic properties and L's intrinsic properties are non-empty.
GsP_tftt(K,L) :=
covers(K,L) :=
∀q [(q ∈ ∂CK) → (q ∉ CL°)] ∧
∃p [(p ∈ CK°) ∧ (p ∈ CL°)] ∧
∃p [(p ∈ CK°) ∧ (p ∈ ∂CL)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ ∂CL)]
Reciprocally, the GsP_ttft (or coveredBy) predicate is possible when K is more specific
than L and K and L depend similarly on common external factors. Therefore, all K's
intrinsic properties match with L's intrinsic properties, part of K’s extrinsic properties
relies on some of L's intrinsic properties, and K's extrinsic properties are common with
L's extrinsic properties. Accordingly, the intersection between K's extrinsic properties and
L's extrinsic properties, the intersection between K's extrinsic properties and L's intrinsic
properties, and the intersection between K's intrinsic properties and L's intrinsic
properties are non-empty.
GsP_ttft(K,L) :=
coveredBy(K,L) :=
∀p [(p ∈ CK°) → (p ∈ CL°)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ CL°)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ ∂CL)]
The GsP_tttt predicate applies when commonalities exist between intrinsic and extrinsic
properties of K and intrinsic and extrinsic properties of L – i.e. both have common
intrinsic properties, extrinsic properties that rely on each other as well as similar extrinsic
properties that depend on common external factors. Accordingly, all four intersections
are non-empty.
GsP_tttt(K,L) := ∃p [(p ∈ CK°) ∧ (p ∈ CL°)] ∧
∃p [(p ∈ CK°) ∧ (p ∈ ∂CL)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ ∂CL)] ∧
∃q [(q ∈ ∂CK) ∧ (q ∈ CL°)]
As a result, the geosemantic proximity between a geoConcept K and a geoConceptRep L
is not necessarily a symmetric relation. This has also been acknowledged by Rodriguez
(2000) and Benslimane (2001). In GsP, predicates for which (∂CK ∩ CL°) and (CK° ∩ ∂CL)
have the same value (both empty or both non-empty) are symmetric, namely GsP_ffff (or
disjoint), GsP_ffft, GsP_tfft (or equal), GsP_fttt (or overlap), GsP_tttt, GsP_tfff (or
meet), GsP_tttf, and GsP_fttf. The remaining predicates are non-symmetric: GsP_fftt (or
contains), GsP_ftft (or inside), GsP_tftt (or covers), GsP_ttft (or coveredBy), GsP_tftf,
GsP_ttff, GsP_fftf, and GsP_ftff.
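The symmetry property can be checked mechanically. Exchanging the roles of K and L leaves the first and fourth matrix cells unchanged and swaps the second and third, so the converse of a predicate is obtained by swapping the two middle characters of its code. The short Python sketch below enumerates the 16 codes on that basis; the helper name converse is an assumption made for this illustration.

```python
from itertools import product


def converse(code):
    """Code of GsP(L, K) given the 4-character code of GsP(K, L)."""
    # Cells 2 (∂CK ∩ CL°) and 3 (CK° ∩ ∂CL) trade places when K and L are exchanged.
    return code[0] + code[2] + code[1] + code[3]


ALL_CODES = ["".join(bits) for bits in product("ft", repeat=4)]
SYMMETRIC = {c for c in ALL_CODES if converse(c) == c}
NON_SYMMETRIC = {c for c in ALL_CODES if converse(c) != c}

print(sorted(SYMMETRIC))
print(sorted(NON_SYMMETRIC))
print(converse("fftt"), converse("tftt"))
```

Under this reading, contains (fftt) and inside (ftft) are converses of each other, as are covers (tftt) and coveredBy (ttft).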
It is interesting to note that the GsP notion behaves similarly to other well-known
structures describing relationships between concepts. For instance, Thesaurus (Meta Data
Coalition, 1999; Milstead, 1998) uses the following relation types between concepts,
which map to GsP predicates:
Narrower term: GsP_tftt (covers), GsP_fftt (contains)
Broader term: GsP_ttft (coveredBy), GsP_ftft (inside)
Use/used for: GsP_tfft (equal)
Related term: GsP_tfff (meet), GsP_tftf, GsP_ttff, GsP_tttf, GsP_fftf, GsP_ftff,
GsP_fttf, GsP_ffft, GsP_fttt (overlap), GsP_tttt
Kashyap and Sheth (1996) and Sheth and Kashyap (1992) introduced a taxonomy of
semantic proximity predicates in order to characterize the semantic similarity between
concepts. Here again, GsP predicates map the semantic proximity predicates:
Semantic equivalence: GsP_tfft (equal)
Semantic relationship: GsP_tftt (covers), GsP_fftt (contains), GsP_ttft
(coveredBy), GsP_ftft (inside)
Semantic relevance: GsP_ffft, GsP_fttt (overlap), GsP_tttt
Semantic resemblance: GsP_tfff (meet), GsP_tftf, GsP_ttff, GsP_tttf,
GsP_fftf, GsP_ftff, GsP_fttf
Semantic incompatibility: GsP_ffff (disjoint)
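The correspondence listed above lends itself to a simple lookup table. The following Python sketch encodes it as a dictionary and verifies that every one of the 16 predicate codes falls into exactly one semantic proximity category. The names SEMANTIC_PROXIMITY and classify are assumptions introduced for this example.

```python
from itertools import product

# Mapping transcribed from the list above (predicate codes without the "GsP_" prefix).
SEMANTIC_PROXIMITY = {
    "semantic equivalence": ["tfft"],
    "semantic relationship": ["tftt", "fftt", "ttft", "ftft"],
    "semantic relevance": ["ffft", "fttt", "tttt"],
    "semantic resemblance": ["tfff", "tftf", "ttff", "tttf", "fftf", "ftff", "fttf"],
    "semantic incompatibility": ["ffff"],
}


def classify(code):
    """Return the semantic proximity category of a 4-character GsP code."""
    for category, codes in SEMANTIC_PROXIMITY.items():
        if code in codes:
            return category
    raise ValueError(f"unknown GsP code: {code!r}")


ALL_CODES = sorted("".join(bits) for bits in product("ft", repeat=4))
LISTED = sorted(c for codes in SEMANTIC_PROXIMITY.values() for c in codes)

print(LISTED == ALL_CODES)
print(classify("fftt"), "|", classify("ffff"))
```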
Also, WordNet (Miller et al., 1993) uses another set of relationships expressing the
relatedness between concepts, which can be compared to GsP predicates:
Synonyms: GsP_tfft (equal)
Coordinate terms: GsP_tfff (meet), GsP_tftf, GsP_ttff, GsP_tttf, GsP_tttt
Hypernyms: GsP_tftt (covers), GsP_fftt (contains)
Hyponyms: GsP_ttft (coveredBy), GsP_ftft (inside)
Holonyms/Meronyms: GsP_fftf, GsP_ftff, GsP_fttf
Although GsP predicates behave similarly to the above existing schemes, they provide a
more comprehensive set of predicates to express the semantic similarity between a
geoConcept and a geoConceptRep as we will illustrate below. We find that these
improvements justify the GsP approach, especially as the GsP predicates can be
computed automatically using existing ontologies.
4.6.2 Examples
Examples are outlined below to illustrate the use of GsP. They are based on
existing topographical databases and data product specifications that describe geographic
phenomena differently. In these examples, let us assume that an agent refers to a database
with its related data product specification as its explicit ontology. This agent compares a
geoConcept of its own ontology with a geoConceptRep received from another
agent, which adheres to a different ontology, in order to recognize it, and evaluates the
corresponding geosemantic proximity.
First, let us compare the geoConcept road from (BC Ministry of Environment Lands
and Parks (Geographic Data BC), 1992) to the geoConceptRep vegetation from
(Natural Resources Canada, 1996). In this case, road is defined as “a specially prepared
route on land for the movement of vehicles (other than railway vehicles)” while
vegetation refers to “an area covered with shrubs and/or trees.” Obviously, intrinsic and
extrinsic properties of road do not match any properties of vegetation. Therefore, the
geosemantic proximity between road and vegetation is GsP_ffff or disjoint.
Let us examine the case in which the geoConcept road from (Natural Resources
Canada, 1996) is compared to the geoConceptRep rue (street in English) from
(Québec, 2000). On the one hand, road refers to “road for the movement of motor
vehicle.” In road, street is one possible value of the road classification attribute and is
defined as “a public road in a residential or commercial area with buildings on one or
both sides.” On the other hand, rue is defined as “a communication thoroughfare lined
by buildings in a built-up area” (author's translation from the French definition). Rue
has to be connected to other streets or roads of other classifications. In this example,
rue maps to the attribute value street of the geoConcept road. Like road, it is geometrically
depicted as a linear feature. Moreover, rue also has a relationship with other types of
roads that are already included in road. Consequently, the geosemantic proximity
between road and rue is GsP_fftt or contains. Reciprocally, if we consider rue
as the geoConcept and road as the geoConceptRep, we can say that the geosemantic
proximity between rue and road is GsP_ftft or inside.
The following example compares the geoConcept wetland from (Natural Resources
Canada, 1996) to the geoConceptRep marsh/swamp from (VMap, 1995). On the one
hand, wetland is defined as “a water-saturated area, covered intermittently or
permanently with water; the vegetation may either be marsh (reeds, grass, and cattails)
and swamp (shrub and trees)”. It has a relationship with peat cutting, that is, an area where
peat is cut. On the other hand, marsh/swamp corresponds to “a saturated area, at times
covered with water, supporting vegetation which may include trees”. Like wetland, it
also has the same type of relationship with peat. Inasmuch as wetland and
marsh/swamp have essentially the same literal meaning, are both depicted
geometrically by a surface, and have a nearly identical relationship with peat, we can
consider that the geosemantic proximity between wetland and marsh/swamp is
GsP_tfft or equal.
In this next example, the geoConcept waterbody from (Natural Resources Canada,
1996) is compared to the geoConceptRep lake from (BC Ministry of Environment
Lands and Parks (Geographic Data BC), 1992). On the one hand, waterbody is simply
defined as “a body of water including rivers” (i.e. those rivers that are large enough to be
shown as surface). It includes all natural water areas and, also, has relationships with
other feature types such as dam, breakwater, and wharf. On the other hand, lake
represents “a body of fresh water that is completely surrounded by land.” It has
relationships with other kinds of water areas notably river/stream as well as with other
kinds of features such as dam, breakwater, and pier/wharf. As we can observe in this
example, the literal meaning of waterbody comprises the meaning of lake. Moreover,
lake can be considered as a specialization of waterbody that has relationships with
other sub-types of waterbody. They are both depicted geometrically by a surface. Also,
lake and waterbody have equivalent relationships with the same external abstractions.
Therefore, we can conclude that the geosemantic proximity between waterbody and
lake is GsP_tftt or covers. Reciprocally, if lake is considered the geoConcept and
waterbody the geoConceptRep, then the geosemantic proximity of lake with
waterbody is GsP_ttft or coveredBy.
In this example, we illustrate the geosemantic proximity between the geoConcept hazard
to air navigation and the geoConceptRep bridge, both from (Natural Resources
Canada, 1996). On the one hand, hazard to air navigation refers to “area containing a
structure or landform high enough to create a hazard to air navigation.” The attribute type
of hazard to air navigation can take the value bridge, to indicate those hazards that are
bridges of height equal to or greater than 60 metres. On the other hand, bridge
represents a “part of a road or railway built on a raised structure and serving to span an
obstacle, river, another road or railway, etc.” without any height restriction. It has a
relationship with hazard to air navigation as well. As we can note in this example,
hazard to air navigation and bridge have common intrinsic properties as both refer
to bridge as attribute or abstraction. They are also depicted geometrically in the same
manner (e.g. line). Finally, they have a relationship with each other and, as such, extrinsic
properties of one intersect intrinsic properties of the other. Accordingly, we can say that
the geosemantic proximity between hazard to air navigation and bridge is GsP_fttt or
overlap.
The last example illustrates the geosemantic proximity that exists between the
geoConcept bridge from (Natural Resources Canada, 1996) and geoConceptReps such
as gué (ford in English) from (BC Ministry of Environment Lands and Parks
(Geographic Data BC), 1992), ferryRoute from (BC Ministry of Environment Lands
and Parks (Geographic Data BC), 1992), or even tunnel from (VMap, 1995). Bridge,
as defined in the previous example, refers to “part of a road or railway built on a raised
structure and serving to span an obstacle, river, another road or railway, etc.” Ford is a
“place where it is possible to cross a river by foot” (author's translation from the French
definition) or by vehicle (as ford is a kind of link between some types of road).
FerryRoute corresponds to “the water route a ferry follows when transporting vehicles
and/or passengers.” Tunnel refers to “an underground or underwater passage, open at
both ends, and usually containing a road or a railroad.” All of these abstractions refer to
different kinds of phenomena but all have relationships with roads or railroads and, as
such, have a similar behaviour corresponding to a place or a path to span an obstacle for
people, vehicles, or trains, that links roads or railroads. As such, the geosemantic
proximity between bridge and ford, ferryRoute, and tunnel is GsP_tfff or meet.
In this section, we have described the notion of geosemantic proximity and have
illustrated its use. An approach such as GsP for the assessment of the semantic proximity
of geospatial data seems quite appropriate because it follows the same topological
paradigm as existing spatial and temporal topological theories commonly employed in
geographical information systems (Allen, 1983; Clementini and Di Felice, 1996;
Egenhofer, 1993; Egenhofer and Franzosa, 1991). The manner in which GsP is developed is also
compatible with the way spatial and temporal information is handled in ISO 19100–
International standards on geographic information/geomatics (ISO/TC 211, 2002a;
ISO/TC 211, 2003a) and in Open GIS Consortium Inc. specifications (Open GIS
Consortium Inc., 1999b).
4.7 Prototype
Although GsP predicates could be used for the development of ontologies to describe the
similarity between geoConcepts a priori, our fundamental intent is to use the GsP
approach to compute automatically the semantic similarity between a geoConcept and a
geoConceptRep based on existing ontologies. The semantic similarity will be expressed
qualitatively with the GsP predicates as it happens for the recognition and the production
of geoConceptReps in interoperability between systems (Brodeur et al., 2003). In our
prototype, interoperability consists in a communication process between a client agent
requesting geospatial information and a provider agent supplying geospatial data (Figure
25). The client agent sends a request about geospatial information based on its own
abstraction of geographic phenomena and vocabulary, i.e. its own ontology, for example
lake, river, and Sherbrooke. The request travels in the communication channel up to the
provider agent. Once the request reaches its destination, the provider agent works to
recognize the message using its own knowledge and vocabulary, i.e. its own set of
geoConcepts, for instance waterbody, watercourse, and Sherbrooke. Once the
message is recognized, the provider agent uses the geoConcepts that recognize the
geoConceptReps of the request to retrieve data from its database complying with the
client agent's request. Then, it encodes the data using its own vocabulary into
geoConceptReps, for instance Lac des Nations, Magog river, and St-François
river, and sends them to the client agent. When the client agent receives the data, it has to
recognize the data coming from the provider agent and to evaluate whether they fit its
request to complete the interoperability cycle. The GsP Prototype tests and
demonstrates the notion of GsP within the framework for geospatial data interoperability
presented in section 2. This section introduces this prototype.
Figure 25: Prototype principle
GsP Prototype was carried out using software agents (Nwana, 1996) developed in Java™
and communicating in XML. With the prototype, we instantiate agents (user and
provider) of identical and different geospatial ontologies to test the geosemantic
proximity concept. Ontologies were developed in the form of geospatial data repositories
(Brodeur et al., 2000). A geospatial data repository consists of a collection of metadata
that is structured so as to provide the meaning (i.e. the semantics) and the
structure of concepts maintained in a geospatial database. It includes a conceptual schema
and a data dictionary of geoConcepts. More specifically, we used Perceptory (Bédard,
1999; Bédard and Proulx, 2002), a spatially-extended UML visual modeling tool, to
compile the ontologies. Ontologies correspond to a network of interconnected
geoConcepts (Figure 26) in which geoConcepts are the nodes and associations between
geoConcepts are the arcs. Here, the ontology is comparable to an existing local database
schema (Bishr, 1997; Sheth, 1999) that is maintained persistently in the agent’s memory.
Each geoConcept is encapsulated by a set of functions, which provide reasoning
capabilities. These functions allow a geoConcept to recognize and to produce
geoConceptReps, and to assess automatically the GsP of the geoConcept with other
geoConceptReps. Each agent navigates from geoConcept to geoConcept in its ontology
using a network traversal function; we used the Breadth First Traversal algorithm.
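The traversal of such a network can be sketched as follows. This is a minimal illustration, written in Python for brevity although the prototype itself was developed in Java. The adjacency-list representation, the function name `breadth_first`, and the sample geoConcept names and associations are assumptions made for the example, not the prototype's actual data structures.

```python
from collections import deque


def breadth_first(ontology, start):
    """Yield geoConcept names in breadth-first order, starting from `start`."""
    visited = {start}
    queue = deque([start])
    while queue:
        concept = queue.popleft()
        yield concept
        for neighbour in ontology.get(concept, ()):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)


# Hypothetical ontology: geoConcepts are nodes, associations are arcs.
ontology = {
    "road": ["bridge", "tunnel"],
    "bridge": ["road", "watercourse"],
    "tunnel": ["road"],
    "watercourse": ["bridge", "waterbody"],
    "waterbody": ["watercourse"],
}

order = list(breadth_first(ontology, "road"))
# order == ["road", "bridge", "tunnel", "watercourse", "waterbody"]
```

A breadth-first order visits the geoConcepts directly associated with the starting one before more distant ones, which suits a search that hopes to find a close match near a recently used geoConcept.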
Figure 26: Network of geoConcepts
Once an agent receives a query about geoConceptReps, it initiates the recognition of
these geoConceptReps. Accordingly, the set of geoConceptReps is passed to a proxy,
which begins to visit geoConcepts recently used by the agent. Each geoConcept visited
evaluates its GsP with one geoConceptRep at a time. The geoConcept is placed in a list
when its GsP is different from “ffff” (disjoint). If a geoConcept having a GsP “tfft” (equal) is
found, it is then used to answer the request. Otherwise, the proxy begins to visit
geoConcepts of the ontology (i.e. geoConcepts stored in the geospatial data repository)
until a geoConcept estimates its GsP to be “tfft” or the ontology is traversed completely. All
geoConcepts with a GsP different from “ffff” (disjoint) are again placed in a list. If a
geoConcept of GsP equal to “tfft” is found, it is used to answer the request. Otherwise,
the found geoConcepts of GsP different from “ffff” are sorted by their respective GsP to
identify the most similar geoConcept, which is used to answer the query. Once the other
agent receives the answer to its query, it initiates an identical process to recognize the
geoConceptReps it receives.
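The recognition procedure described above can be sketched as follows. This is an illustration under assumptions: the function names, the lookup table standing in for the GsP assessment, and above all the ranking used to pick the "most similar" geoConcept (here, the number of 't' letters in the code) are hypothetical, since the text does not state how GsP codes are ordered.

```python
def recognise(rep, recent, ontology, assess, rank):
    """Return (geoConcept, GsP code) for the best match to `rep`.

    Recently used geoConcepts are visited first, then the whole ontology.
    An equal match ("tfft") is returned at once; otherwise the best-ranked
    non-disjoint candidate is returned, or (None, "ffff") if none exists.
    """
    candidates = []
    seen = set()
    for pool in (recent, ontology):
        for concept in pool:
            if concept in seen:
                continue
            seen.add(concept)
            code = assess(concept, rep)
            if code == "tfft":
                return concept, code
            if code != "ffff":
                candidates.append((concept, code))
    if not candidates:
        return None, "ffff"
    return max(candidates, key=lambda pair: rank(pair[1]))


def count_true(code):
    """Hypothetical ranking: more non-empty intersections means more similar."""
    return code.count("t")


# Stand-in for the GsP assessment, using the codes quoted earlier in the chapter.
TABLE = {
    ("bridge", "ford"): "tfff",
    ("hazard to air navigation", "bridge"): "fttt",
    ("bridge", "bridge"): "tfft",
}


def assess(concept, rep):
    return TABLE.get((concept, rep), "ffff")


concepts = ["lake", "hazard to air navigation", "bridge"]
exact = recognise("bridge", ["hazard to air navigation"], concepts, assess, count_true)
closest = recognise("ford", ["lake"], concepts, assess, count_true)
# exact == ("bridge", "tfft"); closest == ("bridge", "tfff")
```

Returning as soon as an equal match is found is a simplification of the two-phase description in the text; it yields the same answer because an equal match always takes precedence over the sorted list.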
Agents communicate with others using messages that include geoConceptReps.
GeoConceptReps are essentially transient representations of geographic phenomena. It is
the responsibility of geoConcepts for which information is wanted or those geoConcepts
that are used to answer a query to generate and encode the appropriate geoConceptReps.
In the GsP Prototype, query and answer messages consist of streams of XML data
encoded according to a predefined schema, i.e. an XML Schema, which determines the
semantics and the structure of the message.
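To make the message exchange concrete, a query could be encoded and decoded along the following lines. This is only a sketch: the element and attribute names (`query`, `sender`, `geoConceptRep`) are invented for the example and do not reproduce the prototype's XML Schema, and no schema validation is performed.

```python
import xml.etree.ElementTree as ET


def encode_query(sender, rep_names):
    """Serialise a list of geoConceptRep names into a hypothetical XML query."""
    root = ET.Element("query", attrib={"sender": sender})
    for name in rep_names:
        ET.SubElement(root, "geoConceptRep").text = name
    return ET.tostring(root, encoding="unicode")


def decode_query(xml_text):
    """Parse a query back into (sender, list of geoConceptRep names)."""
    root = ET.fromstring(xml_text)
    return root.get("sender"), [el.text for el in root.findall("geoConceptRep")]


message = encode_query("client", ["lake", "river", "Sherbrooke"])
sender, reps = decode_query(message)
```

The receiving agent only sees the names carried by the message; deciding which of its own geoConcepts they correspond to is the job of the recognition step, not of the XML layer.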
We used road and hydrologic network ontologies compiled from five distinct geospatial
data product specifications to test the prototype and more specifically the computer
feasibility of the geosemantic proximity concept:
- Standards and Specifications for the National Topographic Data Base (NTDB) of
Canada (Natural Resources Canada, 1996) (46 geoConcepts),
- Specifications for Digital and Hardcopy Property and Basemap Products of
Province of Prince Edward Island (PEIBP) (P.E.I. Geomatics Information Centre)
(100 geoConcepts),
- Specifications for the “Base de données topographiques du Québec” (BDTQ)
(Québec, 2000) (67 geoConcepts),
- Specifications for the Ontario Digital Topographic Database (ODTDB) (OBM,
1996) (24 geoConcepts), and
- Specifications for the Digital Baseline Mapping at 1:20000 of Province of British
Columbia (DBMBC) (BC Ministry of Environment Lands and Parks (Geographic
Data BC), 1992) (40 geoConcepts).
The experiment demonstrated that it is possible for software agents of identical and
distinct ontologies to communicate with each other. With the GsP approach, geoConcepts
assess their geosemantic proximity with geoConceptReps automatically for the
recognition or generation operations in order to send or answer queries. For instance, an
NTDB agent properly maps its geoConcept road with the geoConceptRep street
received as part of a query from a BDTQ agent and, reciprocally, the BDTQ agent properly
maps its street geoConcept with the road geoConceptRep generated by and received from
the NTDB agent as part of the answer. This was also observed between water
disturbance of NTDB and rapids of PEIBP and in many other cases. A detailed
description of the prototype along with the results of the experiments is presented in
chapter 5.
4.8 Conclusion
In the present chapter, we proposed a new approach to assess the semantic similarity
between geospatial abstractions (specifically between a geoConcept and a
geoConceptRep) called geosemantic proximity (GsP). It is basically a context-based
approach, which compares the context of a geoConcept to the context of a
geoConceptRep. The context of a geoConcept and of a geoConceptRep is expressed by
way of intrinsic (i.e. the literal meaning) and extrinsic (i.e. meaning influenced by
external aspects) properties, which globally provide the intended meaning (semantics) of
such abstractions.
More specifically, GsP includes a set of geosemantic predicates, which express the
commonalities between intrinsic and extrinsic properties of a geoConcept with a
geoConceptRep. It is based on the 4-intersection topological model and considers that the
interior of an abstraction corresponds to the set of intrinsic properties and boundaries
correspond to the set of extrinsic properties. As such, GsP follows existing spatial and
temporal approaches (Allen, 1983; Egenhofer, 1993; Egenhofer and Franzosa, 1991) and,
consequently, appears to be a convenient approach in the geospatial information realm.
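The 4-intersection underlying the predicates can be written out as below. This is a sketch under stated assumptions: K denotes a geoConcept and R a geoConceptRep (symbols chosen here for illustration), K° stands for the set of intrinsic properties and ∂K for the set of extrinsic properties. The ordering of the four entries, in particular of the two cross terms, is inferred from the codes used in this chapter (tfft for equal, tfff for meets, fttt for overlaps, ffff for disjoint) rather than stated in this section.

```latex
\[
GsP(K, R) =
\begin{bmatrix}
\partial K \cap \partial R & \partial K \cap R^{\circ} \\
K^{\circ} \cap \partial R & K^{\circ} \cap R^{\circ}
\end{bmatrix}
\]
```

Each entry is recorded as t when the intersection is non-empty and f when it is empty, and the four letters are read row by row. Under this reading, equal (tfft) means that extrinsic properties are shared and intrinsic properties are shared with no cross intersection, whereas meets (tfff) means that only extrinsic properties are shared.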
In practice, geoConcepts assess the geosemantic proximity in order to recognize and to
produce geoConceptReps that are semantically similar to them. As such, GsP plays an
essential role in realizing the semantic interoperability of geospatial data.
An experimental prototype is currently under development that will be tested against
existing provincial data sources along with the new Canadian GeoBase definition. The
9-intersection model should also be worked out to take into consideration the difference
or the variability (Rodriguez, 2000) of context in the GsP assessment. The expected
outcomes of this research will provide a new understanding of geospatial data
interoperability itself as well as a new way to achieve semantic interoperability of
geospatial data.
Acknowledgements
The authors wish to acknowledge the contribution of Natural Resources Canada – Centre
for Topographic Information in supporting the first author for this research and of the
GEOIDE Network of Centres of Excellence in geomatics, project DEC#2 (Designing the
Technological Foundations of Spatial Decision-making with the World Wide Web).
4.9 References
Allen, J F 1983 Maintaining Knowledge about Temporal Intervals. Communications of the
ACM, 26(11): 832-843
Barsalou, L W 1999 Perceptual symbol systems. Behavioral and Brain Sciences, 22(4):
577-609
BC Ministry of Environment Lands and Parks (Geographic Data BC) 1992 Digital
Baseline Mapping at 1:20,000. Victoria, Province of British Columbia, BC
Ministry of Environment, Lands and Parks
Bédard, Y 1999 Visual Modelling of Spatial Database Towards Spatial PVL and UML.
Geomatica, 53(2): 169-186
Bédard, Y, and M-J Proulx 2002 Perceptory Web Site. Web Page Document,
http://sirs.scg.ulaval.ca/Perceptory
Benslimane, D 2001 Interopérabilité de SIG : la solution Isis. Revue internationale de
géomatique, 11(1): 7-42
Bishr, Y 1997 Semantics Aspects of Interoperable GIS. Ph.D. Dissertation, ITC
Publication
Brodeur, J, and Y Bédard 2001 Geosemantic Proximity, a Component of Spatial Data
Interoperability. In Proceedings of International Workshop on "Semantics in
Enterprise Integration" (OOPSLA 2001): 6
Brodeur, J, Y Bédard, G Edwards, and B Moulin 2003 Revisiting the Concept of
Geospatial Data Interoperability within the Scope of a Human Communication
Process. Transactions in GIS, 7(2): 243-265
Brodeur, J, Y Bédard, and M J Proulx 2000 Modelling Geospatial Application Databases
using UML-based Repositories Aligned with International Standards in
Geomatics. In Proceedings of Eighth ACM Symposium on Advances in
Geographic Information Systems (ACMGIS) ACM Press: 39-46
Casati, R, B Smith, and A C Varzi 1998 Ontological Tools for Geographic
Representation. In N Guarino (ed) Formal Ontology in Information Systems.
Amsterdam, IOS Press: 77-85
Charron, J 1995 Développement d’un processus de sélection des meilleures sources de
données cartographiques pour leur intégration à une base de données à référence
spatiale. Mémoire de maîtrise, Université Laval
Clementini, E, and P Di Felice 1995 A Comparison of Methods for Representing
Topological Relationships. Information Sciences-Applications: An International
Journal, 3(3): 149-178
Clementini, E, and P Di Felice 1996 A Model for Representing Topological Relationship
Between Complex Geometric Features in Spatial Databases. Information
Sciences, 90(1-4): 121-136
Egenhofer, M 1993 A Model for Detailed Binary Topological Relationships. Geomatica,
47(3 & 4): 261-273
Egenhofer, M, and R D Franzosa 1991 Point-Set Topological Spatial Relations.
International Journal of Geographic Information Science, 5(2): 161-174
Egenhofer, M, D M Mark, and J R Herring 1994a The 9-Intersection: Formalism and Its
Use for Natural-Language Spatial Predicates. Santa Barbara, CA, University of
California, National Center for Geographic Information and Analysis Technical
Report 94-1
Egenhofer, M, and J Sharma 1992 Topological Consistency. In Proceedings of 5th
International Symposium on Spatial Data Handling IGU Commission of GIS:
335-343
Egenhofer, M J 1997 Spatial Relations: Models and Inferences. In Proceedings of
Tutorial 2 - 5th International Symposium on Spatial Databases (SSD'97): 83
Egenhofer, M J, E Clementini, and P Di Felice 1994b Topological relations between
regions with holes. International Journal of Geographic Information Science,
8(2): 129-142
Egenhofer, M J, and R D Franzosa 1995 On the equivalence of topological relations.
International Journal of Geographic Information Science, 9(2): 133-152
Egenhofer, M J, and D M Mark 1995 Modelling conceptual neighbourhoods of
topological line-region relations. International Journal of Geographic
Information Science, 9(5): 555-565
Frank, A U 2000 Spatial Communication with Maps: Defining the Correctness of Maps
Using a Multi-Agent Simulation. In C Freksa, W Brauer, C Habel, and K F
Wender (eds) Spatial Cognition II : Integrating Abstract Theories, Empirical
Studies, Formal Methods, and Practical Applications (International Workshop on
Maps and Diagrammatical Representations of the Environment, Hamburg,
August 1999). Berlin Heidelberg, Springer-Verlag: 80-99
Fankhauser, P, M Kracker, and E Neuhold 1991 Semantic vs. Structural Resemblance of
Classes. SIGMOD Record, 20(4): 59-63
Guarino, N 1995 Formal Ontology, Conceptual Analysis and Knowledge Representation.
International Journal of Human and Computer Studies, 43(5&6): 625-640
Guarino, N 1999 The Role of Identity Conditions in Ontology Design. In Proceedings of
Spatial Information Theory - Cognitive and Computational Foundations of
Geographic Information Science, International Conference COSIT'99, 3-540-
66365-7. Berlin, Springer-Verlag Lecture Notes in Computer Science 1661: 221-
234
Guarino, N, and C Welty 2000a A Formal Ontology of Properties. In Proceedings of
Knowledge Engineering and Knowledge Management: Methods, Models and
Tools (12th International Conference, EKAW2000), 3-540-41119-4. Berlin,
Springer-Verlag Lecture Notes in Computer Science 1937: 97-112
Guarino, N, and C Welty 2000b Identity, Unity, and Individuation: Towards a Formal
Toolkit for Ontological Analysis. In Proceedings of ECAI-2000: The European
Conference on Artificial Intelligence. Amsterdam, IOS Press: 219-223
Harvey, F J, W Kuhn, H Pundt, Y Bishr, and C Riedemann 1999 Semantic
Interoperability: A Central Issue for Sharing Geographic Information. The Annals
of Regional Science, 33(2): 213-233
ISO/TC 211 2002a ISO 19108:2002 Geographic Information - Temporal Schema.
Geneva, Switzerland, International Organization for Standardization
ISO/TC 211 2002b ISO/DIS 19109 Geographic Information - Rules for Application
Schema. Geneva, Switzerland, International Organization for Standardization
ISO/TC 211 2003a ISO 19107:2003 Geographic Information - Spatial Schema. Geneva,
Switzerland, International Organization for Standardization
ISO/TC 211 2003b ISO 19115:2003 Geographic Information - Metadata. Geneva,
Switzerland, International Organization for Standardization
ISO/TC 211 2003c ISO/DIS 19118 Geographic Information - Encoding. Geneva,
Switzerland, International Organization for Standardization
Kashyap, V, and A Sheth 1996 Semantic and Schematic Similarities Between Database
Objects: A Context-Based Approach. The VLDB Journal, 5: 276-304
Kashyap, V, and A Sheth 1998 Semantic Heterogeneity in Global Information Systems:
the Role of Metadata, Context and Ontologies. In M P Papazoglou, and
G Schlageter (eds) Cooperative Information Systems-Trends and Directions. San
Diego, CA, Academic Press: 139-178
Kottman, C 1999 The Open GIS Consortium and Progress Toward Interoperability in
GIS. In M Goodchild, M Egenhofer, R Fegeas, and C Kottman (eds)
Interoperating Geographic Information Systems. Boston, Massachusetts, Kluwer
Academic Publisher: 39-54
Laurini, R 1998 Spatial Multi-Database Topological Continuity and Indexing: a Step
Towards Seamless GIS Data Interoperability. International Journal of
Geographic Information Science, 12(4): 373-402
Lehmann, F 1992 Semantic Networks. Computers and Mathematics with Applications,
23(2-5): 50
Locke, J 1689 An Essay Concerning Human Understanding. Web Page Document,
http://humanum.arts.cuhk.edu.hk/Philosophy/Locke/echu/
Mark, D M, and M Egenhofer 1994 Modeling Spatial Relations Between Lines and
Regions: Combining Formal Mathematical Models and Human Subjects Testing.
In M Egenhofer, D M Mark, and J R Herring (eds) The 9-Intersection: Formalism
and Its Use for Natural-Language Spatial Predicates (Technical Report).
NCGIA: 29-83
McKee, L, and K Buehler (eds) 1998 The OpenGIS Guide. Wayland, Massachusetts,
OpenGIS Consortium Inc.
Merriam-Webster Inc. 1994 Merriam-Webster’s Collegiate Dictionary - Electronic
Edition - Version 1.2. CD Document
Meta Data Coalition 1999 Knowledge Management Model: Knowledge Description.
Microsoft Corporation and Liris Interactive, 1996 Bibliorom Larousse, Version 1.0.
Microsoft Corporation and Liris Interactive, CD Document
Miller, G A, R Beckwith, C Fellbaum, D Gross, and K Miller 1993 Introduction to
WordNet: An On-line Lexical Database. Princeton, Cognitive Science
Laboratory, Princeton University
Milstead, J L 1998 NISO Z39.50: Standard for Structure and Organization of Information
Retrieval Thesauri. In Proceedings of Taxonomic Authority Files Workshop: 9
Natural Resources Canada 1996 National Topographic Data Base - Standards and
Specifications. Sherbrooke, Quebec, Centre for Topographic Information-
Sherbrooke
New Brunswick 2000 Guide d’utilisation de la Base de données topographiques
numériques (BDTN) du Nouveau-Brunswick. Fredericton, New Brunswick,
Services Nouveau-Brunswick
Nwana, H S 1996 Software Agents: An Overview. The Knowledge Engineering Review,
11(2): 205-244
Object Management Group 2001 OMG Unified Modeling Language Specification
(version 1.4). Needham MA, OMG
OBM 1996 Ontario Digital Topographic Database - 1:10,000, 1:20,000 - A Guide for
User. Toronto, Ontario, Ministry of Natural Resources
Open GIS Consortium Inc. 1999a OpenGIS Simple Features Specification for SQL.
Wayland, Massachusetts, OpenGIS Consortium Inc.
Open GIS Consortium Inc. 1999b Topic 1: Feature Geometry. Wayland, Massachusetts,
Open GIS Consortium Inc.
Open GIS Consortium Inc. 2001 Geography Markup Language (GML) 2.0. Wayland,
Massachusetts, Open GIS Consortium Inc.
Ouksel, A M, and A Sheth 1999 Semantic Interoperability in Global Information
Systems: A Brief Introduction to the Research Area and the Special Section.
Sigmod Record, 28(1): 5-12
P.E.I. Geomatics Information Centre User’s Guide to Digital and Hardcopy property and
Basemap Products. Charlottetown, P.E.I., Provincial Treasury - Taxation &
Property Records Division
Québec 2000 Base de données topographiques du Québec (BDTQ) à l’échelle de
1/20 000 - Normes de production (Version 1.0). Québec, Ministère des
Ressources naturelles, Direction générale de l’information géographique, CD
Document
Rodriguez, A, M J Egenhofer, and R D Rugg 1999 Assessing Semantic Similarities
Among Geospatial Feature Class Definition. In Proceedings of Interoperating
Geographic Information Systems (Interop '99). Berlin, Springer-Verlag Lecture
Notes in Computer Science 1580: 189-202
Rodriguez, M A 2000 Assessing Semantic Similarity Among Entity Classes. Ph.D. Thesis,
University of Maine
Schramm, W 1971 How Communication Works. In J A DeVito (ed) Communication:
Concepts and Processes. Englewood Cliffs, New Jersey, Prentice-Hall Inc: 12-21
Sheth, A 1999 Changing Focus on Interoperability in Information Systems: From
Systems, Syntax, Structure to Semantics. In M Goodchild, M Egenhofer,
R Fegeas, and C Kottman (eds) Interoperating Geographic Information Systems.
Boston, Massachusetts, Kluwer Academic Publisher: 5-29
Sheth, A, and V Kashyap 1992 So Far (Schematically) Yet So Near (Semantically). In
Proceedings of IFIP WG2.6 Database Semantics Conference on Interoperable
Database Systems (DS-5)/IFIP Transaction (A-25)
Elsevier Science Publishers B.V.: 283-312
Smith, B 1994 Fiat Objects. In Proceedings of Workshop on Parts and Wholes:
Conceptual Part-Whole Relations and Formal Mereology, 11th European
Conference on Artificial Intelligence: 15-23
Smith, B, and D Mark 1999 Ontology with Human Subjects Testing: An Empirical
Investigation of Geographic Categories. American Journal of Economics and
Sociology, 58(2): 245-272
Smith, B, and A C Varzi 2000 Fiat and Bona Fide Boundaries. Philosophy and
Phenomenological Research, 60(2): 401-420
Sowa, J F 1987 Semantic Networks. In S C Shapiro (ed) Encyclopedia of Artificial
Intelligence. New York, John Wiley & Sons
Statistics Canada 1997 Digital Boundary File and Digital Cartographic File 1996
Census (Reference Guide). Ottawa, Minister of Industry
VMap 1995 Vector Map (VMap), Level 1. Bethesda, MD, U.S. National Imagery and
Mapping Agency Mil-V-89033
Weisstein, E W 1999 Topological space. Mathworld. wolfram.com, Web page
Document, http://mathworld.wolfram.com/TopologicalSpace.html
Wisse, P 2000 Metapattern: Context and Time in Information Models. Reading,
Massachusetts, Addison-Wesley
WordNet, a lexical database for the English language. Web Page Document,
http://www.cogsci.princeton.edu/~wn/
CHAPTER 5
EXPERIMENTING WITH THE SEMANTIC
INTEROPERABILITY OF GEOSPATIAL DATA
AND GEOSEMANTIC PROXIMITY:
PRESENTATION OF THE GSP PROTOTYPE
A Geosemantic Proximity-based prototype for the interoperability of geospatial data
(J. Brodeur, Yvan Bédard, Bernard Moulin)
5.1 Summary of the article
Chapters 3 and 4 presented a new perspective on geospatial data interoperability
that integrates the semantic aspect, with a conceptual framework of interoperability
based on the communication process and on the notion of geosemantic proximity.
Although the relevance of our conceptual framework and of the notion of geosemantic
proximity seems established so far, it becomes necessary to experiment with the whole
to demonstrate that such an approach is realistic. To this end, this chapter presents an
article on the development of a prototype that applies our conceptual framework and our
notion of geosemantic proximity to achieve the interoperability of geospatial
data: the GsP Prototype. On the one hand, it describes the architecture and the
operation of the GsP Prototype. It is a system based on software agents that
interact with each other using XML messages. Each agent has its own
ontology, which consists of a geospatial data repository. On the other hand, this chapter
presents the tests conducted to verify the proper operation of the prototype using
ontologies built from the data specifications of five existing
databases.
5.2 Abstract
The research agenda related to the interoperability of geospatial data is influenced by the
increased accessibility of geospatial databases on the Internet, as well as their sharing and
their integration. Although it is now possible to get and use geospatial data independently
of their syntax and structure, it is still difficult for users to find the exact data they need as
long as they do not know the precise vocabulary used by the organizations supporting
geospatial databases. It is now a necessity to take into consideration the semantics of
geospatial data to enable its full interoperability.
To this end, we designed a new conceptual framework for geospatial data interoperability
and introduced the notion of geosemantic proximity based on human communication and
cognition paradigms. This chapter reviews this framework and the notion of geosemantic
proximity. It also presents the GsP Prototype, which demonstrates the relevance of our
framework and of the notion of geosemantic proximity for geospatial data
interoperability. More specifically, we describe the architecture of the GsP Prototype, its
implementation, and tests conducted so far.
5.3 Introduction
Many geospatial databases have been set up during the last twenty years by different
organizations to establish information bases corresponding to their specific needs. In this
respect, the National Topographic Data Base (NTDB) (Natural Resources Canada, 1996)
was elaborated for national mapping and GIS application purposes in Canada. The
VMap libraries (VMap, 1995), which also include topographic features of Canada, were
developed for military purposes. Moreover, Statistics Canada established the Street
Network Files and the Digital Cartographic Files for socio-demographic and enumeration
purposes (Statistics Canada, 1997). Additional topographic data sources produced at
larger scales by provincial departments (e.g. OBM, 1996; Québec, 2000) are also other
Canadian geospatial database examples. Each of these examples describes topographic
features in different manners. To illustrate this, we have observed that forest-like
phenomena are abstracted as vegetation in NTDB, trees in VMap, wooded area
in Ontario Digital Topographic Data Base, and milieu boisé in the Base de données
topographiques du Québec (where the pictograms point out the type of geometry used to
map the feature: point, line, or polygon; see (Bédard and Proulx, 2002)
for the description of spatial pictograms).
Since these organizations found that their respective databases are of general interest,
they made them available to the public. Today, the Internet, the Web, and geospatial data
infrastructures such as the Canadian Geospatial Data Infrastructure (CGDI)
(GeoConnections, 2002) and the National Spatial Data Infrastructure (NSDI) (FGDC,
2002) facilitate the access to these geospatial databases.
Because users have access to several topographic databases, they expect to find, get and
integrate the exact data they need from various databases according to their own
perception and abstraction of the topographic reality. Hence, such a situation raises
problems of syntactic, structural, semantic, geometric, and temporal heterogeneities
between geospatial databases (Bishr, 1997; Charron, 1995; Laurini, 1998; Ouksel and
Sheth, 1999; Sheth, 1999).
The idea of interoperability of geospatial databases was promoted in the nineties to
overcome the above-mentioned heterogeneity problems and to allow the sharing and the
integration of geospatial data and geospatial resources (Kottman, 1999). The current basis
of geospatial data interoperability has been worked out by organizations such as the Open
GIS Consortium Inc. (OGC), ISO/TC 211, governmental organizations, the geographic
information industry and the geographic information academic community. They have
made considerable progress particularly with regard to syntactic and structural
heterogeneities (Egenhofer, 1999; Ouksel and Sheth, 1999; Rodriguez, 2000). Documents
such as (ISO/TC 211, 2003b; ISO/TC 211, 2003a; Open GIS Consortium Inc., 1999;
Open GIS Consortium Inc., 2001) define the content and the structure of geometric data
as well as the syntactical description of geospatial data. But, to enable complete
interoperability of geospatial data, it is essential to go beyond structural and syntactic
heterogeneities and to address semantic heterogeneities as well as geometric and
temporal heterogeneities (Egenhofer, 1999; Ouksel and Sheth, 1999).
Recently, we proposed a conceptual framework for geospatial data interoperability based
on an analogy with human communication and have also introduced the notion of
geosemantic proximity (GsP) (Brodeur and Bédard, 2001; Brodeur et al., 2003; Brodeur
et al., 2002) as a solution to problems of semantic, spatial, and temporal heterogeneities
of geospatial data. We also developed an experimental prototype, called GsP Prototype,
to validate both our conceptual framework for geospatial data interoperability and the
notion of geosemantic proximity. This chapter specifically aims at presenting this
prototype and the experiments we have conducted so far.
The remaining sections of this chapter are structured as follows. The next section reviews
geospatial data interoperability in the context of the communication process, the notion of
geosemantic proximity, and the notion of a geospatial repository, which serves as the
agent’s application ontology (Gruber, 1993; Guarino and Welty, 2000) in the prototype.
In section 5.5, we present the GsP Prototype, its architecture, its operation, and tests. We
conclude and present future work in section 5.6.
5.4 Geospatial data interoperability and communication
Because people usually understand each other when communicating, we suggested that
interoperability of geospatial data conforms to a human communication process (Brodeur
et al., 2003). Harvey (2002) and Xhu and Lee (2002) also support this idea. As such, we
developed a conceptual framework for geospatial data interoperability as a human-like
communication process. In this section, we review our conceptual framework for
geospatial data interoperability, which is the foundation of the architecture of the
prototype presented in the next section.
According to (Schramm, 1971), a human communication process involves a source, a
message, and a destination. When the source transmits information to the destination,
he/she encodes a message, that is, identifies the information to be communicated and
transforms it into physical signals. At this point, the message is still tied to the source’s
meaning. Afterwards, the source releases the message in the communication channel
towards the destination. Then, the message is released from the source’s meaning. The
message plays a mediating role between the source and the destination. When the
message arrives at its destination, the destination begins to decode it. It recognises the
signals that compose it and assigns them a specific meaning. The communication process
is working perfectly when the source’s meaning and the destination’s meaning of the
message are the same. However, a possible source of noise can interfere with the message’s
signals in the communication process and affect the transmission of the message. On the
other hand, the communication process includes a feedback mechanism, which acts as a
function to check how well the communication is performed. For instance, feedback may
inform the source whether the destination has understood the message properly. As we
can see, multiple representations of reality take place in the human communication
process, namely the source’s and destination’s cognitive models, and the physical signals
used for the message transmission. In the communication process, by definition the
source and the destination succeed in exchanging information when they interoperate
with each other.
The source’s and destination’s cognitive models result from the direct and the indirect
observation (e.g. through sensors such as Earth observation satellite or aerial digital
camera) of real-world phenomena and intentionally-produced signals received from other
people. Human sensory systems capture signals and form so-called perceptual states
(Barsalou, 1999). From perceptual states, the human selective attention collects the
properties of interest and records them permanently as perceptual symbols, also known as
concepts (Barsalou, 1999). As a cognitive element, a concept can never be accessed
directly by another individual. It must be translated into physical signals, here called
conceptual representations, in order to be communicated. A concept therefore consists of
hidden data elements and a translation function that encapsulates these data
elements. This translation function operates in two directions: (1) to generate conceptual
representations when one wants to send a message and (2) to recognize conceptual
representations when one wants to understand a received message.
Based on the human communication process, we developed a conceptual framework for
geospatial data interoperability (Figure 12) (Brodeur et al., 2003). Let us use the
following situation to explain our framework. An individual (shown as a user agent in
Figure 12, Au) wants information about the hydrologic network for flood analysis within a
predefined area of the city of Sherbrooke. He/she encodes a query to have information
about lakes and rivers in the specified area—i.e. the conceptual representations—and
sends the query to a geospatial database (shown as a data provider agent in Figure 12,
Adp). When the database gets the request, it decodes it, that is, it finds and assigns the concepts
of the database that recognise the conceptual representations received, for instance
watercourses and waterbodies in the neighbourhood of Sherbrooke. According to its
interpretation, the database then gathers data, encodes them, and sends them (i.e. Lac des
Nations, Magog River, and Saint-François River) to the individual, who judges
that the received data answer his/her initial request exactly. In this situation, the individual
and the geospatial database use their respective vocabulary to communicate. They end up
understanding each other because of their common set of conceptual representations and
backgrounds, as well as reasoning capabilities, which enable them to recognize and
generate messages as described in section 3.5.
Our framework, illustrated in more detail in Figure 12, encompasses five different
expressions of the same topographic reality (R, R’, R’’, R’’’, and R’’’’). These
expressions, which we call the five ontological phases of geospatial data interoperability
(Brodeur et al., 2003), are related because of the communication process. First, we have
the topographic reality (R) at a given time about which Au wants information. This
topographic reality is beyond description. Second, we have Au’s set of properties (R’)
organized into concepts that represent R. R’ refers to Au’s cognitive model. Third,
we have the set of conceptual representations (R’’) that Au generates to communicate data
about R’. These conceptual representations consist of relevant properties that describe R’
concepts in a given context. They consist of the data employed for interoperability with
Adp. In Figure 12, R’’ is illustrated by “Lakes or Rivers within Sherbrooke.” Fourth, we have
the set of concepts (R’’’), which Adp maintains. When Adp receives R’’ conceptual
representations, it uses R’’’ concepts to recognize and assign a meaning to R’’ conceptual
representations and, afterwards, to collect the information that complies with Au’s initial
request. In Figure 12, these concepts correspond to watercourses, waterbodies, and
Sherbrooke. In turn, Adp again uses R’’’ concepts along with the corresponding
information to encode conceptual representations (R’’’’) to answer Au. These conceptual
representations consist of Lac des Nations, Magog River, and Saint-François River in
Figure 12. Finally, when Au receives R’’’’ conceptual representations, he/she decodes
them, that is, recognises them and assigns them a meaning, and validates them against R’
concepts. If R’’’’ conceptual representations correspond to the requested R’ concepts,
then we can say that interoperability happens between Au and Adp. Accordingly,
interoperability is a bi-directional communication process that includes a feedback
mechanism in both directions, to control the proper reception of messages and ensure that
they were understood properly.
As mentioned previously, R, R’, R’’, R’’’, and R’’’’ are different facets of
reality, all of which pertain to ontology, even if they have similarities. In philosophy,
ontology is a subject matter dealing with:
- the description of the world (Peuquet et al., 1998);
- a model and an abstract theory of the world (Smith and Mark, 1999);
- the science of being (Bittner and Edwards, 2001; Peuquet et al., 1998);
- the science of the type of entities, of the objects, of the properties, of the
categories, and of relationships, which constitute the world (Lehmann, 1992;
Peuquet et al., 1998; Smith and Mark, 1999).
Ontology is also a subject of interest in artificial intelligence and databases. It has been
defined by Gruber (1993) as “an explicit specification of a conceptualisation”. However,
Guarino (1998) refined Gruber’s definition taking into account the philosophical
meaning of ontology and defined ontology as “a logical theory accounting for the
intended meaning of a formal vocabulary.” Hence, in the scope of this thesis, we consider
an ontology as being “a formal representation of phenomena with an underlying
vocabulary and axioms including definitions that make the intended meaning explicit and
describe phenomena and their interrelationships” (Brodeur et al., 2003).
In the database realm, the representation of real-world phenomena is widely developed
using conceptual models (e.g. E-R or UML models) and feature dictionaries. Together,
these two components constitute a comprehensive set of metadata describing the content
and the structure of databases, which are better known as database repositories (Brodeur
et al., 2000; Jones, 1991; Marco, 2000; Moriarty, 1990; Prabandham et al., 1990). A
conceptual model is a tool to capture abstract representations of real-world phenomena
from a data-centered analysis perspective. It is also used to support the development of
databases. It structures and stores features of interest using general categories, object
classes, properties, relationships, generalizations, aggregations, roles, constraints,
behaviours, and more specifically in the context of geospatial databases, geometry and
temporality. The dictionary stores the intended meaning (in other words the semantics) of
all elements that compose the conceptual model. In geographic information, Perceptory
(Bédard and Proulx, 2002) is a tool specially developed to build, manage, and exploit
geospatial data repositories. It consists of a UML-based conceptual modeling tool
enhanced with the Plug-in for Visual Language (PVL) (Bédard, 1999; Bédard and
Proulx, 2002) for spatial and temporal data modeling and an object class dictionary. As
such, geospatial repositories developed with Perceptory can serve as application
ontologies.
Practitioners of different backgrounds and professional experiences typically abstract
identical phenomena and develop geospatial databases with their respective repository
differently. The situation and the circumstances surrounding the perception of geospatial
phenomena guide the manner with which these geospatial phenomena are abstracted.
This refers more specifically to the context. The context is an abstract notion, which
drives the definition of concepts and conceptual representations, and the choice of
properties that are used for their description (Simsion, 2001). It is the context that
provides the inherent semantics to concepts and conceptual representations (Kashyap and
Sheth, 1996). Hence, the same part of the topographic reality is typically represented
differently from one database to another because of their specific context. This causes
interoperability problems when merging data from different geospatial databases.
Notwithstanding this, context is a fundamental element for the assessment of the
semantic, spatial, and temporal interoperability of geospatial data. Accordingly, the
assessment of semantic, spatial, and temporal interoperability of geospatial data needs
reasoning capabilities that take the context into consideration. Keeping this in mind, we
developed the notion of geosemantic proximity following a context-based orientation
(Brodeur et al., 2002).
We mentioned earlier that a concept has a translation function in order to generate and
recognize conceptual representations. Hence, geosemantic proximity (GsP) consists of a
basic component of this translation function, which specifically applies to geospatial
concepts. GsP evaluates qualitatively the semantic similarity (Kashyap and Sheth, 1996;
Sheth and Kashyap, 1992) of a geospatial concept (hereafter called a geoConcept) with a
geospatial conceptual representation (hereafter called a geoConceptRep) by the
comparison of their respective context. In GsP, the context (C) consists of the set of
inherent properties of a geoConcept or a geoConceptRep. These properties are classified
in two types: intrinsic and extrinsic. Intrinsic properties (C°) provide the literal meaning
of the geoConcept or the geoConceptRep. They consist of the identification, attributes,
attribute values, geometries, temporalities, and domain of a geoConcept or a
geoConceptRep. Extrinsic properties (∂C) are properties that are subject to external
factors. They give meaning by the action that these factors exercise on the geoConcept or
the geoConceptRep. Behaviours as well as semantic, spatial, and temporal relationships
are kinds of extrinsic properties. We use a segment (Figure 22), which holds in a
semantic space, to illustrate the context of a geoConcept or geoConceptRep. Intrinsic
properties correspond to the interior of the segment, whereas extrinsic properties
correspond to the boundary of the segment. Hence, the context (C) consists of the union
of intrinsic and extrinsic properties: C = C° ∪ ∂C. Therefore, the GsP of a geoConcept
(K) with a geoConceptRep (L) can now be defined by the intersection of their respective
context, GsP(K,L) = C_K ∩ C_L, which becomes a 4-intersection matrix when consolidated
with intrinsic (C°) and extrinsic (∂C) properties (Equation 11).
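One arrangement of the 4-intersection matrix that is consistent with the definitions above and with the predicate names listed in the text (for instance GsP_tfft for equal, GsP_tfff for meet, and GsP_fftt for contains) is sketched here. The cell order is inferred from those names and should be read as an assumption, not as a quotation of Equation 11.

```latex
GsP(K, L) =
\begin{bmatrix}
\partial C_K \cap \partial C_L & \partial C_K \cap C^{\circ}_L \\
C^{\circ}_K \cap \partial C_L & C^{\circ}_K \cap C^{\circ}_L
\end{bmatrix}
```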
Each component of the matrix can be evaluated empty (denoted by f or false) or not
empty (denoted by t or true). Accordingly, we derived sixteen (2^4) predicates, which are
written in row-major form (i.e. row by row) with the prefix “GsP_”:
GsP_ffff (or disjoint), GsP_ffft, GsP_fftt (or contains), GsP_tfft (or equal), GsP_ftft (or
inside), GsP_tftt (or covers), GsP_ttft (or coveredBy), GsP_fttt (or overlap), GsP_tttt,
GsP_tfff (or meet), GsP_tftf, GsP_tttf, GsP_ttff, GsP_fttf, GsP_fftf, GsP_ftff (Brodeur et
al., 2002). These predicates are used to qualify the GsP of a geoConcept with a
geoConceptRep.
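As a minimal sketch of how a predicate can be derived from two contexts, the following Python fragment models a context as two sets of property labels and evaluates the four intersections row by row. It is illustrative only: the names `Context`, `gsp_predicate`, and `predicate_name` and the property labels are hypothetical, plain set membership stands in for the semantic comparison of properties, and the cell order (∂K∩∂L, ∂K∩L°, K°∩∂L, K°∩L°) is an assumption chosen to agree with the named predicates above.

```python
from dataclasses import dataclass

# Names that the text gives to some of the sixteen predicates.
NAMED = {
    "ffff": "disjoint", "fftt": "contains", "tfft": "equal",
    "ftft": "inside", "tftt": "covers", "ttft": "coveredBy",
    "fttt": "overlap", "tfff": "meet",
}


@dataclass(frozen=True)
class Context:
    """Context C of a geoConcept or geoConceptRep: C = C° ∪ ∂C."""
    intrinsic: frozenset = frozenset()  # C° (interior of the segment)
    extrinsic: frozenset = frozenset()  # ∂C (boundary of the segment)


def gsp_predicate(k_ctx, l_ctx):
    """Return the predicate of geoConcept K with geoConceptRep L.

    ASSUMED cell order (row major): ∂K∩∂L, ∂K∩L°, K°∩∂L, K°∩L°.
    """
    cells = (
        k_ctx.extrinsic & l_ctx.extrinsic,
        k_ctx.extrinsic & l_ctx.intrinsic,
        k_ctx.intrinsic & l_ctx.extrinsic,
        k_ctx.intrinsic & l_ctx.intrinsic,
    )
    return "GsP_" + "".join("t" if cell else "f" for cell in cells)


def predicate_name(predicate):
    """Look up the name given in the text, if any."""
    return NAMED.get(predicate[len("GsP_"):], "unnamed")


# Hypothetical property labels, for illustration only.
road = Context(frozenset({"geometry:line", "classification:street"}),
               frozenset({"relation:junction"}))
same = Context(road.intrinsic, road.extrinsic)
lake = Context(frozenset({"geometry:surface"}), frozenset({"relation:shore"}))

print(gsp_predicate(road, same), predicate_name(gsp_predicate(road, same)))
print(gsp_predicate(road, lake), predicate_name(gsp_predicate(road, lake)))
```

Under these assumptions, two identical contexts yield GsP_tfft (equal) and two contexts with no shared property yield GsP_ffff (disjoint).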
Let us use the following example to illustrate the relevance of GsP. According to our
conceptual framework, a user agent, which is based on the Base de données
topographiques du Québec (BDTQ) ontology, aims to update its road network
information. It asks a data provider agent, which is based on the National Topographic
Data Base (NTDB), for information about street (i.e. an encoded geoConceptRep).
When the data provider agent receives the request, it looks through the geoConcepts it
knows to find one that is geosemantically (i.e. semantically, spatially and temporally)
similar to street. The data provider agent identifies that its geoConcept road has an
attribute classification, which can take the value street, whose definition is similar to that of the
geoConceptRep street. Also, road and street have the same type of geometry. As
such, they hold common intrinsic properties. As defined in BDTQ, street possesses
relationships with other road classes. However, these road classes are already included in the
road description. As such, street’s extrinsic properties intersect with road’s
intrinsic properties. Accordingly, the GsP of the road geoConcept when compared to the
street geoConceptRep is GsP_fftt (or contains) and, as such, road can be used to
answer the request of the user agent about street.
As used in (Fowler et al., 1999; Payne et al., 2002; Sycara et al., 1999), software agents
appear well suited to implementing the user agents and data provider agents illustrated in our
conceptual framework and to experimenting with the GsP notion within a prototype. According to
Nwana (1996), a software agent is defined as “a component of a software and/or
hardware which is capable of acting exactingly in order to accomplish tasks on behalf of
its user.” In the specific context of the prototype on semantic, spatial, and temporal
interoperability of geospatial data presented in the following section, user and data
provider agents are deployed as software agents, which own a particular ontology to
interoperate with other agents. However, a description of software agents is beyond the
scope of this thesis and can be found in (Nwana, 1996; Nwana and Wooldridge, 1996).
5.5 The GsP Prototype
To evaluate the GsP notion, we built a software prototype, called GsP Prototype, which
conforms to our interoperability conceptual framework illustrated in Figure 12. With the
GsP Prototype, software agents are instantiated and can interoperate with each other.
This section presents successively a high-level architecture of the prototype, the way the
prototype operates, and the experiments conducted so far.
5.5.1 Architecture
The architecture of GsP Prototype illustrated in Figure 27 depicts a communication
process, which takes place between two software agents (Agent A and Agent B)
interacting through a communication channel. It details more specifically one agent’s
internal structure and operations as well as the manner in which agents exchange
information. However, this architecture is not limited to only two agents but can be
expanded to multiple agents interacting in pairs.
In this architecture, all agents have an identical internal structure and operate in the same
manner. They communicate using messages composed of geoConceptReps encoded in
XML streams. When an agent receives a message, it captures the inner XML
geoConceptReps of the message and places them in a transitory internal data structure
containing geoConceptReps. Each geoConceptRep stored in this data structure can be
compared to a human perceptual state.
The geoConceptReps are then passed to a Proxy. The Proxy is a server responsible for
finding geoConcepts that match the geoConceptReps in order to assign them a meaning.
This is the recognition process. The Proxy has to mediate between two geoConcept
storages: geoConMem and geoRep.
GeoConMem is a cache memory limited in size, which stores for a short period the most
recent geoConcepts (the geoConcept structure is detailed further in this section) used by
the agent. It may be compared to the short-term memory of a human being.
GeoRep consists of a geospatial data repository that holds the description of all
geoConcepts that the agent knows; it is a direct access storage. In this case, the geoRep
storage is implemented using Perceptory and consists of a graph representation of
geoConcepts in a UML class diagram along with a dictionary which manages the
description of semantic, spatial, and temporal properties of geoConcepts. GeoRep may be
compared to the long-term memory of a human being.
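The short-term storage can be pictured as a small bounded cache placed in front of the full repository. The Python sketch below is illustrative only: the text does not state which geoConcept is dropped when geoConMem is full, so a least-recently-used policy is assumed here, and the method names `remember` and `names` are hypothetical.

```python
from collections import OrderedDict


class GeoConMem:
    """Bounded cache of recently used geoConcepts (ASSUMED policy: LRU)."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self._items = OrderedDict()  # name -> geoConcept, oldest first

    def remember(self, name, geo_concept):
        """Store a geoConcept as the most recently used one."""
        self._items[name] = geo_concept
        self._items.move_to_end(name)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used

    def names(self):
        """Names currently cached, oldest first."""
        return list(self._items)


cache = GeoConMem(capacity=2)
cache.remember("road", object())
cache.remember("street", object())
cache.remember("road", object())  # road becomes the most recent again
cache.remember("lake", object())  # street is the oldest and is dropped
print(cache.names())
```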
When processing, the Proxy examines one by one all the geoConceptReps that the agent
received in a message. For each geoConceptRep of the geoConceptReps data structure,
the Proxy looks first in geoConMem to visit the geoConcepts it stores until a geoConcept
that has a GsP of “GsP_tfft” (or equal) with the geoConceptRep is located. It is the
geoConcept that is responsible for evaluating its GsP with the geoConceptRep. As such, it
compares all its intrinsic (i.e. identification, attributes, attribute values, geometries,
temporalities, and domain) and extrinsic (i.e. relationships and behaviours) properties to
the geoConceptRep’s intrinsic and extrinsic properties as in the 4-intersection matrix
presented in the previous section. However, if no geoConcept shows a GsP of “GsP_tfft”
(or equal) with the geoConceptRep, then the Proxy continues to search in geoRep to find
the geoConcept most similar to the geoConceptRep, that is, the geoConcept
that has the highest GsP with the geoConceptRep. To this end, the Proxy visits geoRep’s
geoConcepts to compute their respective GsP with the geoConceptRep. As such, it uses a
graph traversal algorithm and begins with the geoConcept of the geoConMem cache
memory that has the highest GsP with the geoConceptRep. GeoRep provides
geoConcepts to the Proxy using a geoConcept data structure. When the Proxy gets a
geoConcept from geoRep, it evaluates its GsP with the geoConceptRep and stores it. This
process continues until a geoConcept that has a GsP of “GsP_tfft” (or equal) with the
geoConceptRep is found or all concepts are visited. When the process is completed,
geoConcepts having a GsP different from “GsP_ffff” (or disjoint) with the
geoConceptRep are then sorted from the highest to the lowest GsP. The geoConcept with
the highest GsP constitutes the most similar geoConcept to the geoConceptRep, which is
then used to assign a meaning to the geoConceptRep.
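The recognition procedure described above can be sketched as follows. This is a simplified Python illustration, not the prototype's Java code. Three points are assumptions: the two storages are plain dictionaries, the graph traversal of geoRep is replaced by simple iteration, and the ordering of predicates from highest to lowest GsP (the hypothetical `rank` function) is invented for the example, since the text does not define it here.

```python
EQUAL, DISJOINT = "GsP_tfft", "GsP_ffff"


def rank(predicate):
    """ASSUMPTION: equal ranks highest, then more non-empty cells rank higher."""
    return 5 if predicate == EQUAL else predicate[len("GsP_"):].count("t")


def recognize(geo_concept_rep, geo_con_mem, geo_rep, gsp):
    """Return (name, predicate) pairs, most similar geoConcept first.

    geo_con_mem and geo_rep map names to geoConcepts; gsp(concept, rep)
    returns a predicate string such as "GsP_tfft".
    """
    results = {}
    # 1. Short-term memory first; stop as soon as an equal geoConcept is found.
    for name, concept in geo_con_mem.items():
        results[name] = gsp(concept, geo_concept_rep)
        if results[name] == EQUAL:
            return [(name, EQUAL)]
    # 2. Otherwise visit the repository until equal is found or all are visited.
    for name, concept in geo_rep.items():
        if name not in results:
            results[name] = gsp(concept, geo_concept_rep)
        if results[name] == EQUAL:
            break
    # 3. Discard disjoint geoConcepts and sort from highest to lowest GsP.
    kept = [(n, p) for n, p in results.items() if p != DISJOINT]
    return sorted(kept, key=lambda item: rank(item[1]), reverse=True)


# Toy data: predicates are looked up in a table instead of being computed.
table = {"road": "GsP_fftt", "lake": "GsP_ffff", "bridge": "GsP_ffft"}
repository = {name: name for name in table}
matches = recognize("street", {}, repository, lambda concept, rep: table[concept])
print(matches)
```

With this toy table, road is returned first, bridge second, and lake is discarded as disjoint. An empty list corresponds to the case where no meaning can be assigned to the geoConceptRep.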
It might happen that no geoConcept is found similar to the geoConceptRep and,
accordingly, no meaning can be assigned to the geoConceptRep. In that case, the agent is
not able to answer the other agent about this geoConceptRep. The resulting set of
geoConcepts matching the geoConceptReps of the message is then used by the agent to
reply to the other agent. As such, the geoConcepts generate geoConceptReps that are then
encoded in an XML stream and sent through the communication channel to the other
agent.
Similarly to concepts that compose human cognitive models, geoConcepts obtained from
either geoConMem or geoRep consist here of non-visible data elements (or private as in
Java or C++), which are obviously inaccessible to other agents. These data elements are
encapsulated by three functions: recognize, generate, and gspRelate (Figure 28). The
recognize and generate functions serve as the main geoConcept interfaces and are
supported by gspRelate, which evaluates the GsP.
Figure 27: Architecture of the GsP Prototype (from Brodeur and Bédard, 2002)
Figure 28: Object structure of a concept (from Brodeur and Bédard, 2002)
Figure 29 draws a more detailed description of the geoConcept object structure in a UML
class diagram. In this diagram, geoConcepts conform to the class GEOCONCEPT, which
inherits its data structure from the class GEOABSTRACTION. GEOABSTRACTION aims
at defining the properties used to identify and describe a geospatial phenomenon. These
properties are divided into two types: intrinsic and extrinsic.
On the one hand, the class INTRINSICPROPS accounts for intrinsic properties and
captures the identification, descriptiveAtts (i.e. descriptive attributes), geometries,
temporalities, and the domainComponents (i.e. various components of the domain) of a
GEOABSTRACTION. Essentially, the identification refers to the name and the definition
given to the GEOABSTRACTION. The descriptiveAtts report on the inherent
characteristics of a phenomenon. A name, a definition, and a domain of values
distinguish each descriptive attribute from another. Geometries refer to the various types
of geometry such as simple geometry (e.g. point, line, or surface), geometric aggregate,
complex geometry, and alternate geometry (Bédard, 1999) that are used to depict the
phenomenon spatially along with its inherent semantics (e.g. building basement
footprint). Similarly to geometries, temporalities refer to the various types of temporality
such as instant, period, temporal aggregate, and alternate temporality (Bédard, 1999) that
are used to depict the phenomenon temporally along with its inherent semantics (e.g. date
when the building construction is completed). The domain consists of the numerous
combinations of attribute values, geometry, and temporality that the GEOABSTRACTION
can take where each combination refers to one domainComponent.
On the other hand, the class EXTRINSICPROPS provides the details of the extrinsic
properties. Extrinsic properties are described in terms of behaviours and memberships in
relationships (relationMembership). A behaviour refers to an operation that a
phenomenon can accomplish. A name, a definition, a list of parameters, and a return type
differentiate each behaviour of a phenomenon. A membership in a relationship expresses
the participation of the phenomenon in a relationship with another phenomenon. It
identifies the relationship (name of the relationship and the list of members), the role
played by the GEOABSTRACTION and its minimum and maximum cardinalities.
The class GEOCONCEPT has the three functions mentioned above (recognize, generate,
and gspRelate). The function recognize takes a geoConceptRep as an input. It identifies
the geoConcepts that are similar to that geoConceptRep, prioritized by their GsP. The
function gspRelate assists the function recognize by computing the GsP of the
geoConcept with the geoConceptRep. It evaluates to what extent the geoConcept matches
the geoConceptRep. Finally, the generate function
produces a geoConceptRep of this geoConcept, which holds in a specific context. Again,
the gspRelate function assists the generate function to ensure that the generated
geoConceptRep is similar to the geoConcept.
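The relationship between the three classes and the three functions can be sketched in Python as follows. This is a hedged illustration, not the prototype's Java code: properties are reduced to sets of hypothetical labels, the cell order used by `gsp_relate` is an assumption, and `generate` simply copies all properties, whereas the prototype selects those relevant to a specific context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoAbstraction:
    """Shared data structure: a name plus intrinsic and extrinsic properties."""
    name: str
    intrinsic: frozenset = frozenset()
    extrinsic: frozenset = frozenset()


@dataclass(frozen=True)
class GeoConceptRep(GeoAbstraction):
    """Encoded data of a geoConcept; it has no functions of its own."""


@dataclass(frozen=True)
class GeoConcept(GeoAbstraction):
    """Adds recognize, generate, and gsp_relate to the shared structure."""

    def gsp_relate(self, rep):
        """Evaluate the four intersections row by row (ASSUMED cell order)."""
        cells = (self.extrinsic & rep.extrinsic, self.extrinsic & rep.intrinsic,
                 self.intrinsic & rep.extrinsic, self.intrinsic & rep.intrinsic)
        return "GsP_" + "".join("t" if cell else "f" for cell in cells)

    def recognize(self, rep):
        """Return the predicate when the rep is not disjoint, otherwise None."""
        predicate = self.gsp_relate(rep)
        return None if predicate == "GsP_ffff" else predicate

    def generate(self):
        """Produce a geoConceptRep of this geoConcept and check it with gsp_relate."""
        rep = GeoConceptRep(self.name, self.intrinsic, self.extrinsic)
        if self.gsp_relate(rep) == "GsP_ffff":
            raise ValueError("generated geoConceptRep is disjoint from its geoConcept")
        return rep


# Hypothetical property labels, for illustration only.
road = GeoConcept("road", frozenset({"geometry:line"}), frozenset({"relation:junction"}))
road_rep = road.generate()
lake_rep = GeoConceptRep("lake", frozenset({"geometry:surface"}))
print(road.recognize(road_rep), road.recognize(lake_rep))
```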
Figure 29: UML class diagram of GEOABSTRACTION, GEOCONCEPT, and
GEOCONCEPTREP
Being a subtype of the class GEOABSTRACTION, GEOCONCEPTREP also inherits the
data structure of GEOABSTRACTION (Figure 29). Accordingly, GEOCONCEPTREP’s
data structure and GEOCONCEPT’s data structure are identical. Because a
geoConceptRep is essentially encoded data of a geoConcept, the class
GEOCONCEPTREP does not possess any function. When an agent releases
geoConceptReps into the communication channel, it transforms them into an XML stream
and sends this XML stream to its destination. Accordingly, the XML encoding of
geoConceptReps adheres to a predefined definition described either in a Document Type
Definition (DTD) or an XML Schema. For the purpose of the prototype, the XML
encoding of geoConceptReps satisfies the following DTD:
<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XML Spy v4.0.1 U (http://www.xmlspy.com) by Jean Brodeur
(Natural Resources Canada) -->
<!ELEMENT GsPmessage (conceptualRepresentation*)>
<!ATTLIST GsPmessage
type CDATA #REQUIRED
recognition (true | false) "true">
<!ELEMENT conceptualRepresentation (intrinsicProperties, extrinsicProperties?)>
<!ELEMENT intrinsicProperties (identification, descriptiveAttribute*, geometry*,
temporality*, domainElement*)>
<!ELEMENT identification (name, definition?)>
<!ELEMENT descriptiveAttribute (name, definition?, attributeValue*)>
<!ELEMENT attributeValue (name, definition?)>
<!ELEMENT geometry (#PCDATA)>
<!ELEMENT temporality (#PCDATA)>
<!ELEMENT domainElement (attValue+, geometry?, temporality?)>
<!ELEMENT attValue (descriptiveAttribute, attributeValue)>
<!ELEMENT extrinsicProperties (relationMembership*, behaviour*)>
<!ELEMENT relationMembership (relation, role?, cardMin?, cardMax?)>
<!ELEMENT relation (name, firstMember, secondMember?)>
<!ELEMENT behaviour (name, definition, parameter+, returnType)>
<!ELEMENT parameter (conceptualRepresentationName, defaultValue?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT definition (#PCDATA)>
<!ELEMENT role (#PCDATA)>
<!ELEMENT cardMin (#PCDATA)>
<!ELEMENT cardMax (#PCDATA)>
<!ELEMENT firstMember (#PCDATA)>
<!ELEMENT secondMember (#PCDATA)>
<!ELEMENT conceptualRepresentationName (#PCDATA)>
<!ELEMENT defaultValue (#PCDATA)>
<!ELEMENT returnType (#PCDATA)>
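To make the DTD more concrete, the Python sketch below builds a small query message that follows the element structure declared above and reads it back with the standard `xml.etree.ElementTree` module. The message content (a single geoConceptRep named street with a geometry value of line) is hypothetical and is not the document of annex 1. Note that ElementTree does not validate against a DTD, so the default value of the recognition attribute is supplied by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical message laid out according to the DTD above.
MESSAGE = """<GsPmessage type="query">
  <conceptualRepresentation>
    <intrinsicProperties>
      <identification>
        <name>street</name>
      </identification>
      <geometry>line</geometry>
    </intrinsicProperties>
  </conceptualRepresentation>
</GsPmessage>"""


def read_message(xml_text):
    """Return the message type, the recognition flag, and the geoConceptRep names."""
    root = ET.fromstring(xml_text)
    names = [
        rep.findtext("intrinsicProperties/identification/name")
        for rep in root.findall("conceptualRepresentation")
    ]
    # The DTD declares "true" as the default for recognition; it is applied
    # here explicitly because the parser does not read the DTD.
    return root.get("type"), root.get("recognition", "true"), names


print(read_message(MESSAGE))
```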
5.5.2 Implementation
Based on the above architecture, GsP Prototype was implemented with Java™ and XML
technologies in combination with Perceptory–based geospatial repositories. The reasons
supporting this choice of technologies are:
1. XML is a widely recognized technology for the communication of
information;
2. Java libraries are available to process XML documents (namely the Java API for
XML Processing (JAXP) (Sun Microsystems Inc., 2002), which includes the Xalan
(The Apache Software Foundation, 2002) and the Xerces (The Apache Software
Foundation, 2002) libraries for parsing and manipulating XML documents);
3. the development is portable to the Web; and
4. Perceptory is a technology very well suited to developing geospatial repositories
conforming to ISO 19103 Geographic information - Conceptual schema language
(ISO/TC 211, 2001b) and ISO 19110 Geographic information - Methodology for
feature cataloguing (ISO/TC 211, 2001a), which can then serve as agents’
ontologies.
This section presents in detail the implementation of the prototype and the way it
operates.
The GsP Prototype uses interfaces of two kinds: software agent interfaces and an Agent
Manager interface. A software agent appears as a window (Figure 30). The window’s
title bar identifies the agent’s name along with its ontology source name (e.g.
agent1 (NTDB_RN)). The remaining part of the window is divided into two sections: the
Console and the Communication Monitor.
The Console section consists of three components. The first component is a drop-down
menu, which presents the list of geoConcepts that compose the agent’s ontology. Each
geoConcept is identified by a unique name. The next item is the Send Query button.
When clicked, this button initiates a query about the geoConcept selected from the drop-
down menu towards an external agent. The external agent is identified by filling its name
in the External Agent field of the Communication Monitor section. The last component of the
Console is a field in which the agent displays messages.
The Communication Monitor section shows the different steps of the communication
process that are accomplished. When an agent receives a message from an external agent,
the name of the external agent appears in the External Agent field. Following this, the
agent extracts the geoConceptReps from the message and displays the name of the
geoConceptRep being processed in the Processing geoConceptRep (R’’/R’’’’) field one
by one. Then, the agent initiates the recognition process of the geoConceptRep and, as
such, visits the geoConcepts of the ontology until one is found similar to the
geoConceptRep. Once a geoConcept is identified, its name is displayed in the
geoConcept (R’/R’’’) field. When a reply is expected by the external agent (e.g. answer
to a query), the corresponding geoConcept generates a geoConceptRep of itself and the
name of the transmitted geoConceptRep is displayed in the Transmitting
geoConceptRep (R’’/R’’’’) field.
Figure 30: The agent window
The Agent Manager interface (Figure 31) is used to instantiate software agents and
to display an agent’s state upon user request. The instantiation of an agent requires two
elements: its identification and the name of an ontology source. The agent’s identification
is a unique identifier. The ontology source name consists of the source name of a
geospatial data repository. In our case, it corresponds to an ODBC data source name,
which refers to the database containing the geospatial data repository. Once the name
field and the ontology field are filled in, the agent is instantiated by clicking the New
button. At this time, the agent is alive but not active. It becomes active by clicking the
Start button. The agent’s state can be set inactive (or sleeping) but still alive by clicking
the Stop button. This is needed for management purposes. Even if the agent is inactive, it
keeps all its properties and when it is re-started (by clicking the Start button again), it
becomes active again. Finally, an agent is completely eliminated by clicking the Kill
button. At any time, it is possible to look at an agent’s state simply by filling in the
agent’s name in the name field and by pressing “return”. The agent’s state can be one of
the following:
- Null: the agent does not exist;
- Operating: the agent is alive and active;
- Sleeping: the agent is alive but not active.
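The agent life cycle driven by the New, Start, Stop, and Kill buttons can be summarised with a small state tracker. The Python sketch below is illustrative only: the class `AgentManager` and its method names are hypothetical. It also assumes that a newly instantiated agent, which is alive but not active, is reported as Sleeping.

```python
class AgentManager:
    """Tracks agent states: 'Null', 'Operating', or 'Sleeping'."""

    def __init__(self):
        self._agents = {}  # agent name -> (ontology source name, active flag)

    def new(self, name, ontology):
        """Instantiate an agent: alive but not yet active."""
        if name in self._agents:
            raise ValueError(f"agent {name!r} already exists")
        self._agents[name] = (ontology, False)

    def start(self, name):
        """Make the agent active; its properties are kept."""
        ontology, _ = self._agents[name]
        self._agents[name] = (ontology, True)

    def stop(self, name):
        """Make the agent inactive (sleeping) but still alive."""
        ontology, _ = self._agents[name]
        self._agents[name] = (ontology, False)

    def kill(self, name):
        """Eliminate the agent completely."""
        del self._agents[name]

    def state(self, name):
        if name not in self._agents:
            return "Null"
        return "Operating" if self._agents[name][1] else "Sleeping"


manager = AgentManager()
manager.new("agent1", "NTDB_RN")
manager.start("agent1")
print(manager.state("agent1"), manager.state("agent2"))
```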
Figure 31: The Agent Manager interface
Figure 32 illustrates the way the prototype operates. In this Figure, agent2 (BDTQ_RN)
sends a query to agent1 (NTDB_RN) for information about street. As such, it uses its
geoConcept street to generate a geoConceptRep of the same name (i.e. street), encodes it
as in the XML document presented in annex 1 (where the attribute type of the
GsPmessage element is set to “query”), and sends the document to agent1.
When agent1 receives the XML document, it identifies its source and displays the name
in the External Agent field—i.e. agent2. Following this, it extracts the message type—
i.e. query—and the included geoConceptReps—i.e. street. Then, it processes the
geoConceptReps one by one. In this example, there is only the geoConceptRep street to
process. As such, agent1 displays the name street in the Processing geoConceptRep
(R’’/R’’’’) field.
To process the geoConceptRep street, agent1 looks first for geoConcepts in its short-term
memory. If no geoConcept has a GsP of GsP_tfft (or equal) to street, then it goes on
searching its long-term memory until a geoConcept showing a GsP of GsP_tfft (or equal)
with street is found or until all geoConcepts have been visited. As we can see in Figure
32, agent1 visited all geoConcepts. The computation of the GsP of a geoConcept with
street takes into consideration their identification, their descriptive, geometric, and
temporal properties (i.e. the intrinsic properties) as well as their behaviours and their
memberships to relationships (i.e. the extrinsic properties), respectively. As the
geoConcept road shows common intrinsic properties with street and has the most
significant GsP—i.e. GsP_ffft—, agent1 displays its name in the geoConcept (R’/R’’’)
field and as such uses it to assign a meaning to the geoConceptRep street. Now with the
geoConcept road, agent1 can answer agent2’s request. It produces a geoConceptRep of
the same name, displays the name in the Transmitting geoConceptRep (R’’/R’’’’) field
(i.e. road), encodes the geoConceptRep, and sends it to agent2 using the XML document
shown in annex 2.
In turn, when agent2 receives the XML document, it initiates a similar process as agent1
did. It identifies the message originator and displays its name in the External Agent
field—i.e. agent1—, extracts the message type—i.e. answer—and the geoConceptRep—
i.e. road—, and processes it. Then, agent2 displays the name road in the field Processing
geoConceptRep (R’’/R’’’’) and computes that its geoConcept street is similar (e.g.
GsP_ffft) to the geoConceptRep road. As such, agent2 displays the name street in the
geoConcept (R’/R’’’) field. Therefore, agent2 acknowledges that road answers its initial
query and thus interoperability happens. Because the message is an answer, no further
action is required and the process stops at this point.
Figure 32: Example of the prototype operation
5.5.3 Experimentation
Using the above software agent-based prototype, we conducted experiments on
road and hydrographic networks to assess the strength of our approach. These two themes
were chosen because they are candidates for the essential content and the
desirable content, respectively, of the Canadian GeoBase (GeoBase, 2001), which is
currently being developed. Briefly, GeoBase consists of “the fundamental geographic
information that describes Canadian landmass above and below water” (CCOG -
Working Group on “Base Data Quality Issue”, 2001) that is established in co-operation
with Canadian federal, provincial, and territorial mapping agencies. The experimentation
aimed at assessing computer feasibility and strength of the GsP approach a priori. Tests
were limited to the interaction of software agents using on the one hand identical
ontologies and on the other hand different ontologies.
As such, we built UML-based geospatial data repositories on road and hydrographic
networks with Perceptory using the following topographic data product specifications:
a) National Topographic Data Base – Standards and Specifications of Canada
(Natural Resources Canada, 1996) (NTDB);
b) User's Guide to Digital and Hardcopy property and Basemap Products of Prince
Edward Island (P.E.I. Geomatics Information Centre) (PEIBP);
c) Quebec Topographic Data Base 1:20 000 – Production Standards (Québec, 2000)
(QTDB);
d) Ontario Digital Topographic Database – 1:10,000, 1:20,000– A Guide for User
(OBM, 1996) (ODTDB);
e) Digital Baseline Mapping at 1:20,000 of the province of British Columbia (BC
Ministry of Environment Lands and Parks (Geographic Data BC), 1992)
(BCDBM).
Figures 34 and 35 show UML class diagrams corresponding to both themes of these data
product specifications.
Each object class and relationship is documented in a data dictionary, which provides its
semantics and its inherent properties, as shown in Figure 33 for the class road of the
NTDB road network (Figure 34a).
Figure 33: Extract of the class road of the data dictionary
of the NTDB road network (made with Perceptory)
Figure 34a: NTDB Road network class diagram
Figure 34b: PEIBP Road network class diagram
Figure 34c: QTDB Road network class diagram
Figure 34d: ODTDB Road network class diagram
Figure 34e: BCDBM Road network class diagram
Figure 34: Road network UML class diagrams (a: NTDB, b: PEIBP, c: QTDB, d: ODTDB, and e: BCDBM)
Figure 35a: NTDB Hydrographic network class diagram
Figure 35b: PEIBP Hydrographic network class diagram
Figure 35c: QTDB Hydrographic network class diagram
Figure 35d: ODTDB Hydrographic network class diagram
Figure 35e: BCDBM Hydrographic network class diagram
Figure 35: Hydrographic network UML class diagrams (a: NTDB, b: PEIBP, c: QTDB, d: ODTDB, and e: BCDBM)
Software agents were instantiated using the above geospatial data repositories, which
served as application ontologies. We used ten different software agents, one for each
ontology, with 46 road network geoConcepts distributed among the five road network
ontologies and 44 hydrographic network geoConcepts distributed among the five
hydrographic network ontologies. We placed the road network agents in interaction with
one another using the road network geoConcepts, and did the same with the hydrographic
network agents. The results presented hereafter show the success rate, that is, the
proportion of cases in which a data provider agent answered a query from a user agent
adequately. The data provider agent either answered that it had not understood the query
given its own ontology, or answered the query with a similar geoConcept, which the user
agent recognized as such. When the interacting software agents shared the same ontology,
we observed that they used the same geoConcept in all cases to generate and to recognize
the geoConceptRep of the message, which results in a success rate of 100% for both the
road and hydrographic networks (Figures 36 and 37). For example, a message received by
an NTDB road network-based agent that included a geoConceptRep generated from the
geoConcept Highway exit of another NTDB road network-based agent was always
recognized by the geoConcept Highway exit, with a GsP of GsP_tfft (equal) with the
geoConceptRep.
When software agents of different ontologies but related to the same network (road or
hydrographic) were interacting, a geoConcept of the destination agent succeeded in
recognizing the geoConceptRep generated by the source agent when common intrinsic
and extrinsic properties were identified. We observed that software agents recognized
messages received from another software agent of a different ontology with a success rate
ranging from 30% to 100% depending on the ontologies, with a mean of 59% for the road
network and 61% for the hydrographic network (Figures 36 and 37). The difference
between these results and 100% is explained by our use of an artificial root geoConcept to
link the sub-networks composing an ontology, so that a graph traversal algorithm could
navigate from one geoConcept to another within the ontology. This artificial root
geoConcept caused undesirable situations: for instance, when a geoConcept and a
geoConceptRep both had a relationship with this root geoConcept, they showed a false
geosemantic proximity.
Figure 36: Observed success rates – Road Network
[Chart not reproduced: success rate from 0% to 100%; labels NTDB, PEIBP, BDTQ (i.e. QTDB), ODTDB, BCDBM]
Figure 37: Observed success rates – Hydrographic Network
[Chart not reproduced: success rate from 0% to 100%; labels NTDB, PEIBP, BDTQ (i.e. QTDB), ODTDB, BCDBM]
Table 4 illustrates a few examples of software agents’ geoConcepts that automatically
recognize geoConceptReps encoded by another software agent, where the two agents use
different ontologies.
Table 4: Examples of geoConcepts recognizing geoConceptReps, both of different
ontologies.
Agent: geoConcept   | Agent: Ontology | Recognizes (with the corresponding GsP) | Agent: geoConceptRep | Agent: Ontology
Road                | NTDB            | GsP_tfft (equal)                        | Road                 | PEIBP
Road                | NTDB            | GsP_ffft                                | Street               | QTDB
Trail               | ODTDB           | GsP_ffft                                | Trail                | NTDB
Lake                | BCDBM           | GsP_tfft (equal)                        | Lake                 | PEIBP
Coastline           | PEIBP           | GsP_tfft (equal)                        | Coastline            | BCDBM
Rocky Ledge/Reef    | PEIBP           | GsP_ffft                                | Rocky Ledge/Reef     | NTDB
Water disturbance   | NTDB            | GsP_ffft                                | Rapids               | PEIBP
Disappearing stream | NTDB            | GsP_ffft                                | Sinkhole             | PEIBP
In all these examples, the geosemantic proximity between the geoConcept and the
geoConceptRep may seem obvious in certain cases (e.g. trail or coastline) because they
appear to be identical abstractions. They are nevertheless essentially different
abstractions that are similar by virtue of their inherent properties. It is because of this
similarity that the geoConcept can be used to assign a meaning to the geoConceptRep.
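The false geosemantic proximity attributed earlier to the artificial root geoConcept can be illustrated with a small sketch. It assumes, for illustration only, that the extrinsic properties of a geoConcept are the geoConcepts it is related to in the ontology graph; the name ROOT, the toy ontologies, and the extrinsic helper are hypothetical. Ignoring the root is shown as one conceivable mitigation, not as something reported for the prototype.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Edge = Tuple[str, str]

ROOT = "ROOT"  # hypothetical name for the artificial root geoConcept


def extrinsic(concept: str, edges: Iterable[Edge],
              ignore: FrozenSet[str] = frozenset()) -> Set[str]:
    """Collect the geoConcepts related to `concept`, minus any ignored ones."""
    related = set()
    for a, b in edges:
        if a == concept:
            related.add(b)
        elif b == concept:
            related.add(a)
    return related - ignore


# Two toy ontologies; each ties its otherwise unconnected sub-networks to ROOT.
provider_edges = [(ROOT, "Lake"), (ROOT, "Road"), ("Road", "Bridge")]
user_edges = [(ROOT, "Street"), (ROOT, "Building")]

# Lake and Building are unrelated abstractions, yet both are linked to ROOT.
with_root = extrinsic("Lake", provider_edges) & extrinsic("Building", user_edges)
without_root = (extrinsic("Lake", provider_edges, frozenset({ROOT}))
                & extrinsic("Building", user_edges, frozenset({ROOT})))

print(with_root)     # non-empty intersection: a false proximity
print(without_root)  # empty once the artificial root is ignored
```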
However, it is still possible that an agent’s geoConcept may not recognize a
geoConceptRep encoded and transmitted by another agent. The agent’s ability to
recognize a geoConceptRep resides in the richness of its ontology in terms of the
geoConcepts it knows and the relationships between geoConcepts.
These results demonstrate that interoperability is possible between software agents of
different ontologies although their respective data product specifications have not been
developed explicitly for that purpose. To increase the level of interoperability of
geospatial data, organizations involved in geospatial data acquisition, management, and
dissemination should consider developing meaningful geoConcepts, in terms of both
content and relationships with one another, regardless of how they are
implemented in geospatial databases. The integration of software agents of domain and
global ontologies in the prototype would also be an important improvement for geospatial
data interoperability. Finally, the extraction of semantic information from definitions of
geoConcepts, attributes, and attribute values typically stored in natural language in
geospatial data repositories would further enhance the evaluation of GsP.
5.6 Conclusion
In this chapter, we reviewed our conceptual framework for geospatial data
interoperability, which has been derived from human communication and cognition
theories. In this framework, user and data provider agents maintain in memory a set of
geoConcepts, which constitute their respective ontologies. Agents communicate
geoConcepts to others by generating and transmitting representations of the
geoConcepts (i.e. geoConceptReps). When receiving a message, an agent goes through
its geoConcepts to find those that recognize the message’s geoConceptReps and then
gives them a meaning. The notion of geosemantic proximity supports these capabilities
of a geoConcept, i.e. generating and recognizing geoConceptReps. Through
geosemantic proximity, a geoConcept assesses its semantic, spatial, and temporal
similarity with a geoConceptRep. More specifically, the geoConcept evaluates the
correspondence of its intrinsic and extrinsic properties with those of the geoConceptRep
and expresses it in a 4-intersection matrix.
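The evaluation of the 4-intersection matrix can be sketched as follows. This is a simplified illustration under stated assumptions: properties are modelled as plain sets of labels, the letter t stands for a non-empty intersection and f for an empty one, and the letters are ordered extrinsic/extrinsic, extrinsic/intrinsic, intrinsic/extrinsic, intrinsic/intrinsic. That ordering is chosen to agree with the examples in this chapter (GsP_ffft for common intrinsic properties only, GsP_tfft for equal), but the order of the two middle letters is an assumption. The Context type, the property labels, and the gsp_relate function name are hypothetical.

```python
from typing import FrozenSet, NamedTuple


class Context(NamedTuple):
    """Context of a geoConcept or geoConceptRep as two sets of property labels."""
    intrinsic: FrozenSet[str]
    extrinsic: FrozenSet[str]


def gsp_relate(concept: Context, rep: Context) -> str:
    """Name the predicate from the emptiness of the four intersections."""
    intersections = (
        concept.extrinsic & rep.extrinsic,
        concept.extrinsic & rep.intrinsic,  # assumed position of this cell
        concept.intrinsic & rep.extrinsic,  # assumed position of this cell
        concept.intrinsic & rep.intrinsic,
    )
    return "GsP_" + "".join("t" if cell else "f" for cell in intersections)


# Toy property labels, invented for the example.
road = Context(frozenset({"carries vehicles", "linear geometry"}),
               frozenset({"connects to junction"}))
street = Context(frozenset({"carries vehicles", "urban"}),
                 frozenset({"bordered by sidewalk"}))

print(gsp_relate(road, street))  # only intrinsic properties in common
print(gsp_relate(road, road))    # same context on both sides
```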
Also, we presented the GsP Prototype that was developed to test the notion of
geosemantic proximity within our framework. The GsP Prototype consists of software
agents that communicate with each other by sending queries and replies. A software
agent possesses its own ontology, which consists of a geospatial data repository. This
agent accomplishes tasks on behalf of a user when it communicates with a data provider,
and conversely, it accomplishes tasks on behalf of a data provider when it communicates
with a user. As such, an agent communicates geoConcepts by generating and transmitting
geoConceptReps in XML. An agent recognizes and assigns a meaning to a
geoConceptRep by using the geoConcept of its ontology that has the most significant
geosemantic proximity with the geoConceptRep. The GsP Prototype was tested with
agents using identical ontologies and with agents using different ontologies. For this
purpose, geospatial data repositories were built from five different geospatial data
product specifications to serve as the agents’ application ontologies. In the
experimentation conducted, agents using identical ontologies always ended up
understanding each other: the geoConcept that recognized the geoConceptRep was
always identical to the one that had produced it. Agents using different
ontologies ended up understanding each other when geoConcepts and geoConceptReps
showed sufficient commonalities. Limitations between agents of different ontologies come
from the poverty of the ontologies, in terms of both the number of geoConcepts and their
inherent structure, as well as from the difficulty of handling definitions in natural language.
Although we consider the prototype and the experimentation to be successful, a number
of issues still need to be addressed, notably (1) the development of more rigorous
ontologies, (2) the extraction of intrinsic and extrinsic properties from natural language
definitions of geoConcepts, attributes, attribute values, etc., and (3) the integration of
application, domain, and global ontologies and their interactions. Finally, we believe that
this research is a step toward achieving the complete interoperability of
geospatial data.
Acknowledgements
The authors wish to acknowledge the contribution of Natural Resources Canada – Centre
for Topographic Information in supporting the first author for this research and of the
GEOIDE Network of Centres of Excellence in geomatics, project DEC#2 (Designing the
Technological Foundations of Spatial Decision-making with the World Wide Web).
5.7 References
Barsalou, L W 1999 Perceptual symbol systems. Behavioral and Brain Sciences, 22(4):
577-609
BC Ministry of Environment Lands and Parks (Geographic Data BC) 1992 Digital
Baseline Mapping at 1:20,000. Victoria, Province of British Columbia, BC
Ministry of Environment, Lands and Parks
Bédard, Y 1999 Visual Modelling of Spatial Database Towards Spatial PVL and UML.
Geomatica, 53(2): 169-186
Bédard, Y, and M-J Proulx 2002 Perceptory Web Site. Web Page Document,
http://sirs.scg.ulaval.ca/Perceptory
Bishr, Y 1997 Semantics Aspects of Interoperable GIS. Ph.D. Dissertation, ITC
Publication
Bittner, T, and G Edwards 2001 Toward an Ontology for Geomatics. Geomatica, 55(4):
475-490
Brodeur, J, and Y Bédard 2001 Geosemantic Proximity, a Component of Spatial Data
Interoperability. In Proceedings of International Workshop on “Semantics in
Enterprise Integration” (OOPSLA 2001): 6
Brodeur, J, and Y Bédard 2002 Extending Geospatial Repositories with Geosemantic
Proximity Functionalities to Facilitate the Interoperability of Geospatial Data. In
Proceedings of ISPRS, SDH2002 and CIG Joint International Symposium on
Geospatial Theory, Processing and Application: 6
Brodeur, J, Y Bédard, G Edwards, and B Moulin 2003 Revisiting the Concept of
Geospatial Data Interoperability within the Scope of a Human Communication
Process. Transactions in GIS, 7(2): 243-265
Brodeur, J, Y Bédard, B Moulin, and G Edwards 2002 Geosemantic Proximity for
Geospatial Data Interoperability. (Manuscript)
Brodeur, J, Y Bédard, and M J Proulx 2000 Modelling Geospatial Application Databases
using UML-based Repositories Aligned with International Standards in
Geomatics. In Proceedings of Eighth ACM Symposium on Advances in
Geographic Information Systems (ACMGIS) ACM Press: 39-46
CCOG - Working Group on “Base Data Quality Issue” 2001 A New Vision for Quality
Base Geographic Information in Canada – “GeoBase”.
Charron, J 1995 Développement d’un processus de sélection des meilleures sources de
données cartographiques pour leur intégration à une base de données à référence
spatiale. Mémoire de maîtrise, Université Laval
Egenhofer, M J 1999 Introduction: Theory and Concepts. In M Goodchild, M Egenhofer,
R Fegeas, and C Kottman (eds) Interoperating Geographic Information Systems.
Boston, Massachusetts, Kluwer Academic Publisher: 1-4
FGDC 2002 NSDI. USGS, Web Page Document, http://www.fgdc.gov/nsdi/nsdi.html
Fowler, J, B Perry, M Nodine, and B Bargmeyer 1999 Agent-Based Semantic
Interoperability in InfoSleuth. Sigmod Record, 28(1): 8
GeoBase 2001 National GeoBase Definition.
GeoConnections 2002 Canadian Geospatial Data Infrastructure (CGDI) architecture.
Electronic Document, http://www.geoconnections.org/architecture/index_e.html
Gruber, T R 1993 Toward Principles for the Design of Ontologies Used for Knowledge
Sharing. Palo Alto, California, Knowledge Systems Laboratory Technical Report
KSL 93-04
Guarino, N 1998 Formal Ontology and Information Systems. In Proceedings of Formal
Ontology in Information Systems (FOIS '98). Amsterdam, IOS Press: 3-15
Guarino, N, and C Welty 2000 A Formal Ontology of Properties. In Proceedings of
Knowledge Engineering and Knowledge Management: Methods, Models and
Tools (12th International Conference, EKAW2000), 3-540-41119-4. Berlin,
Springer-Verlag Lecture Notes in Computer Science 1937: 97-112
Harvey, F J 2002 Semantic Interoperability and Citizen/Government Interaction. In
Proceedings of Spatial Data Handling 2002: Joint International Symposium on
Geospatial Theory, Processing, and Applications: 13
ISO/TC 211 2001a ISO/DIS 19110 Geographic Information - Feature Cataloguing
Methodology. Geneva, Switzerland, International Organization for
Standardization
ISO/TC 211 2001b ISO/PDTS 19103 Geographic Information - Conceptual Schema
Language. Geneva, Switzerland, International Organization for Standardization
ISO/TC 211 2003a ISO 19107:2003 Geographic Information - Spatial Schema. Geneva,
Switzerland, International Organization for Standardization
ISO/TC 211 2003b ISO 19115:2003 Geographic Information - Metadata. Geneva,
Switzerland, International Organization for Standardization
Jones, M 1991 Brave New World: A Vision of IRDS. Database Programming and
Design: 43-49
Kashyap, V, and A Sheth 1996 Semantic and Schematic Similarities Between Database
Objects: A Context-Based Approach. The VLDB Journal, 5: 276-304
Kottman, C 1999 The Open GIS Consortium and Progress Toward Interoperability in
GIS. In M Goodchild, M Egenhofer, R Fegeas, and C Kottman (eds)
Interoperating Geographic Information Systems. Boston, Massachusetts, Kluwer
Academic Publisher: 39-54
Laurini, R 1998 Spatial Multi-Database Topological Continuity and Indexing: a Step
Towards Seamless GIS Data Interoperability. International Journal of
Geographic Information Science, 12(4): 373-402
Lehmann, F 1992 Semantic Networks. Computers and Mathematics with Applications,
23(2-5): 50
Marco, D 2000 Building and Managing the Meta Data Repository. Wiley
Moriarty, T 1990 Are You Ready for a Repository? Database Programming and Design:
62-71
Natural Resources Canada 1996 National Topographic Data Base - Standards and
Specifications. Sherbrooke, Quebec, Centre for Topographic Information-
Sherbrooke
Nwana, H S 1996 Software Agents: An Overview. The Knowledge Engineering Review,
11(2): 205-244
Nwana, H S, and M Wooldridge 1996 Software Agent Technologies. BT Technology
Journal, 14(4): 68-78
OBM 1996 Ontario Digital Topographic Database - 1:10,000, 1:20,000 - A Guide for
User. Toronto, Ontario, Ministry of Natural Resources
Open GIS Consortium Inc. 1999 OpenGIS Simple Features Specification for SQL.
Wayland, Massachusetts, OpenGIS Consortium Inc.
Open GIS Consortium Inc. 2001 Geography Markup Language (GML) 2.0. Wayland,
Massachusetts, Open GIS Consortium Inc.
Ouksel, A M, and A Sheth 1999 Semantic Interoperability in Global Information
Systems: A Brief Introduction to the Research Area and the Special Section.
Sigmod Record, 28(1): 5-12
P.E.I. Geomatics Information Centre User’s Guide to Digital and Hardcopy property and
Basemap Products. Charlottetown, P.E.I., Provincial Treasury - Taxation &
Property Records Division
Payne, T R, M Paolucci, R Singh, and K Sycara 2002 Communicating Agents in Open
Agent Systems. In Proceedings of First GSFC/JPL Workshop on Radical Agent
Concepts (WRAC): 10
Peuquet, D, B Smith, and B Brogaard 1998 The Ontology of Fields. In Proceedings of
Summer Assembly of the University Consortium for Geographic Information
Science
Prabandham, M, W J Selfridge, and D D Mann 1990 The Role of IRDS. Database
Programming and Design: 41-48
Québec 2000 Base de données topographiques du Québec (BDTQ) à l’échelle de
1/20 000 - Normes de production (Version 1.0). Québec, Ministère des
Ressources naturelles, Direction générale de l’information géographique, CD
Document
Rodriguez, M A 2000 Assessing Semantic Similarity Among Entity Classes. Ph.D. Thesis,
University of Maine
Schramm, W 1971 The Nature of Communication Between Humans. In W Schramm, and
D F Robert (eds) The Process and Effects of Mass Communication. Champaign-
Urbana, IL, University of Illinois Press: 3-53
Sheth, A 1999 Changing Focus on Interoperability in Information Systems: From
Systems, Syntax, Structure to Semantics. In M Goodchild, M Egenhofer,
R Fegeas, and C Kottman (eds) Interoperating Geographic Information Systems.
Boston, Massachusetts, Kluwer Academic Publisher: 5-29
Sheth, A, and V Kashyap 1992 So Far (Schematically) Yet So Near (Semantically). In
Proceedings of IFIP WG2.6 Database Semantics Conference on Interoperable
Database Systems (DS-5)/IFIP Transaction (A-25)
Elsevier Science Publishers B.V.: 283-312
Simsion, G C 2001 Data Modeling Essentials - Analysis, Design, and Innovation.
Scottsdale, Arizona, Coriolis
Smith, B, and D Mark 1999 Ontology with Human Subjects Testing: An Empirical
Investigation of Geographic Categories. American Journal of Economics and
Sociology, 58(2): 245-272
Statistics Canada 1997 Digital Boundary File and Digital Cartographic File 1996
Census (Reference Guide). Ottawa, Minister of Industry
Sun Microsystems Inc. 2002 Java™ API for XML Processing (JAXP). Sun
Microsystems Inc., Web Page Document, http://java.sun.com/xml/jaxp/index.html
Sycara, K, M Klusch, S Widoff, and J Lu 1999 Dynamic Service Matchmaking Among
Agents in Open Information Environments. Sigmod Record, 28(1): 47-53
The Apache Software Foundation 2002 Xalan-Java version 2.4.0. The Apache Software
Foundation, Web Page Document, http://xml.apache.org/xalan-j/
VMap 1995 Vector Map (VMap), Level 1. Bethesda, MD, U.S. National Imagery and
Mapping Agency Mil-V-89033
Xhu, Z, and Y C Lee 2002 Semantic Heterogeneity of Geodata. In Proceedings of ISPRS
Commission IV Symposium 2002: Joint International Symposium on Geospatial
Theory, Processing, and Applications: 6
CHAPTER 6
CONCLUSION
This thesis addresses the problem of semantic, spatial, and temporal interoperability
of geospatial data. As we have seen, geospatial databases usually represent the same
geographic phenomena in similar but not identical ways, each representation having its
own specific meaning. Now that geospatial databases are accessible on the Internet, the
differences observed in the representations of geographic phenomena cause problems
both in searching for information that meets users’ exact needs and in integrating
geospatial data.
6.1 Summary
Chapter 1 presents the problem addressed by this thesis, namely locating and obtaining
geospatial data that meet users’ exact needs. More precisely, we seek to elucidate,
identify, and define the elements of semantic, spatial, and temporal proximity involved in
locating geospatial data that meet a user’s particular need, within the context of
geospatial data interoperability.
In Chapter 2, we reviewed the notions that we consider closely related to the semantic,
spatial, and temporal interoperability of geospatial data. We first reconsidered the
communication process between systems. The communication process between human
beings appears to be a remarkable model of interoperability, since human beings succeed
in communicating with one another and in exchanging a large volume of information in
an interoperable manner. The communication process led us to examine human cognition
in order to understand why a human being is able to recognize a set of signals and assign
a meaning to them. We then reviewed the notion of ontology, which is closely tied to
knowledge and to the description of real-world phenomena. Next, we reviewed the
different types of heterogeneity of geospatial data in order to understand how to assess
the similarity between geospatial data. We completed the chapter by going over some
solutions to the heterogeneity of geospatial data and some methods for assessing
semantic similarity.
In Chapter 3, we turned to the overall problem of geospatial data interoperability. We
proposed a conceptual framework based on the communication process and on the
cognitive sciences. This conceptual framework comes with an ontology of geospatial
data interoperability organized along two dimensions. The first dimension refers to the
five ontological phases of geospatial data interoperability, which characterize the various
representations of phenomena involved in the communication process: the geospatial
phenomena themselves, the cognitive representations of the phenomena, or geospatial
concepts, of the source and of the destination, and the signals used to communicate the
geospatial concepts, or geospatial conceptual representations. The second dimension
distinguishes the levels of granularity found in the representation of phenomena: global
ontology, domain ontology, and application ontology. Geospatial concepts, being
cognitive representations, cannot be accessed directly. Consequently, they are
encapsulated by a simulation function (Barsalou, 1999) that serves as their interface.
This function of the geospatial concept generates and recognizes geospatial conceptual
representations.
In Chapter 4, we proposed the notion of geosemantic proximity to assess the semantic,
spatial, and temporal similarity between a geospatial concept and a geospatial conceptual
representation. The approach is based on comparing the context of a geospatial concept
with the context of a geospatial conceptual representation. The context of a geospatial
concept and of a geospatial conceptual representation is essentially represented by the
set of their intrinsic and extrinsic properties. Intrinsic properties describe the specific
nature and inherent meaning of a geospatial concept or geospatial conceptual
representation independently of external factors. Extrinsic properties provide a meaning
for a geospatial concept or geospatial conceptual representation according to the
influence that external factors (i.e. other geospatial concepts or geospatial conceptual
representations) exert on it. The context of a geospatial concept or geospatial conceptual
representation is likened to a segment on a semantic axis, where the intrinsic properties
correspond to the interior of the segment and the extrinsic properties to its boundaries.
Geosemantic proximity is then expressed as a four-intersection matrix in which each
intersection is evaluated as empty or non-empty. From this matrix, sixteen geosemantic
proximity predicates are derived to express the similarity between a geospatial concept
and a geospatial conceptual representation. Geosemantic proximity is a method that
supports the qualitative reasoning of a geospatial concept for generating and recognizing
geospatial conceptual representations.
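The count of sixteen follows from the four intersections each taking one of two values (2^4 = 16). A minimal sketch of the enumeration is given below; it assumes the naming pattern seen in Chapter 5 (GsP_ followed by four letters, with t read here as non-empty and f as empty), and the variable name predicates is purely illustrative.

```python
from itertools import product

# Each of the four intersections is either empty ("f") or non-empty ("t").
predicates = ["GsP_" + "".join(cells) for cells in product("ft", repeat=4)]

print(len(predicates))
print(predicates)
```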
Finally, in Chapter 5 we described the GsP Prototype. This prototype validates the
computational feasibility of the notion of geosemantic proximity. It consists of software
agents that communicate with one another and exchange geospatial information. Each
agent has its own ontology, made up of a set of interrelated geospatial concepts. A
geospatial data repository developed with Perceptory (Bédard and Proulx, 2002)
constitutes each ontology used in the prototype. Three functions encapsulate each
geospatial concept:
- a function that generates geospatial conceptual representations,
- a function that recognizes geospatial conceptual representations, and
- a function that computes the geosemantic proximity (gspRelate) between the
geospatial concept and a geospatial conceptual representation.
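The three functions listed above can be pictured as methods of a single class. The sketch below is an assumption-laden illustration, not the prototype's code: only the name gspRelate comes from the text (rendered here as gsp_relate); the class names, the set-based property model, the ordering of the four letters, and the rule that any predicate other than GsP_ffff counts as recognition are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class GeoConceptRep:
    """Representation transmitted between agents (hypothetical structure)."""
    name: str
    intrinsic: FrozenSet[str]
    extrinsic: FrozenSet[str]


@dataclass(frozen=True)
class GeoConcept:
    """Geospatial concept encapsulated by three functions (hypothetical structure)."""
    name: str
    intrinsic: FrozenSet[str]
    extrinsic: FrozenSet[str]

    def generate(self) -> GeoConceptRep:
        """Produce a representation carrying the concept's name and properties."""
        return GeoConceptRep(self.name, self.intrinsic, self.extrinsic)

    def gsp_relate(self, rep: GeoConceptRep) -> str:
        """Qualify the proximity as a four-letter predicate (t: non-empty, f: empty)."""
        cells = (
            self.extrinsic & rep.extrinsic,
            self.extrinsic & rep.intrinsic,
            self.intrinsic & rep.extrinsic,
            self.intrinsic & rep.intrinsic,
        )
        return "GsP_" + "".join("t" if cell else "f" for cell in cells)

    def recognize(self, rep: GeoConceptRep) -> bool:
        """Assumed rule: recognition succeeds unless all intersections are empty."""
        return self.gsp_relate(rep) != "GsP_ffff"


road = GeoConcept("road", frozenset({"carries vehicles"}),
                  frozenset({"connects to junction"}))
lake = GeoConcept("lake", frozenset({"standing water"}),
                  frozenset({"fed by river"}))

rep = road.generate()
print(road.gsp_relate(rep), road.recognize(rep))
print(lake.gsp_relate(rep), lake.recognize(rep))
```

Keeping the three functions on the concept itself reflects the idea stated in the summary of Chapter 3 that a concept is reached only through its interface.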
The agents communicate with one another by exchanging geospatial conceptual
representations encoded in XML. We validated the prototype with road network and
hydrographic network ontologies built from the geospatial data specifications of the
National Topographic Data Base (Natural Resources Canada, 1996), the user’s guide to
the digital cartographic data of Prince Edward Island (P.E.I. Geomatics Information
Centre), the production standards of the Quebec Topographic Data Base (Québec, 2000),
the user’s guide to the Ontario Digital Topographic Database (OBM, 1996), and the
digital mapping standards of British Columbia (BC Ministry of Environment Lands and
Parks (Geographic Data BC), 1992). Using these ontologies, we observed that:
- two agents with the same ontology always succeed in understanding each other by
using the same geospatial concepts;
- two agents, each with an ontology distinct from the other’s, understand each other
insofar as the concepts of their respective ontologies have enough intrinsic and
extrinsic properties in common.
6.2 Discussion
This thesis offers a renewed view of geospatial data interoperability based on the
communication process between human beings and on human cognition. It also
incorporates an innovative approach to retrieving geospatial information that allows data
users to communicate with data providers (i.e. database servers) in their own vocabulary
(i.e. ontology). Data users and providers automatically recognize, that is, interpret, the
content of the messages they receive thanks to their knowledge base (i.e. ontology) and
to the reasoning capacity of geospatial concepts to qualify their semantic, spatial, and
temporal similarity with geospatial conceptual representations. Data providers respond
more precisely to queries received in a vocabulary that is not their own. In this sense, we
believe that our initial hypothesis, namely that “geosemantic proximity would help locate
geospatial concepts that meet a user’s specific needs”, is verified (1) by the definition of
a conceptual interoperability framework that situates the notion of geosemantic
proximity, (2) by the development of this notion taking into account the meaning of
geospatial concepts and geospatial conceptual representations, including their spatial and
temporal representations, and (3) by the development of a prototype validating
computational feasibility. We note that the ontologies, the reasoning processes of user
and provider agents (including the geosemantic proximity predicates), and the messages
the agents send one another are all elements that influence the locating of geospatial data
specific to a need. However, the geosemantic proximity approach is limited to assessing
the similarity that exists between geospatial concepts and geospatial conceptual
representations, without considering their differences. Studying the differences would
add a significant component to geosemantic proximity analysis.
The results of this thesis point toward the development of more intelligent data servers,
accessible on the Web, that can understand users’ queries regardless of the vocabulary
used to formulate them. We believe that the approach proposed in this thesis will
increase the efficiency and effectiveness of the interaction between users and geospatial
data servers. We expect the results of this research to have effects in several projects.
More specifically, we have in mind Web applications using a wireless Internet
connection, where data servers that integrate our approach will better grasp the meaning
of queries and answer them more precisely. This will minimize the amount of data
travelling over the network and reduce the interaction time with data servers as well as
the associated costs. We also have in mind applications associated with decision support
systems, which foster effective decision making. In this context, our approach would
help integrate knowledge from multiple sources that are usually described differently at
the schematic, semantic, spatial, and temporal levels.
6.3 Conclusions
In light of the results of this thesis, and more specifically of the demonstration carried
out with the GsP Prototype, we conclude the following:
- the proposed conceptual framework realistically represents the interaction that
exists between two systems in a context of semantic, spatial, and temporal
interoperability of geospatial data;
- the geospatial concepts that make up a system’s ontology describe the meaning
given to geographic phenomena as well as to the signals used to represent them.
The concept and its description are the semantics!
- geospatial concepts are not communicated directly. Geospatial conceptual
representations, which are physical signals, serve to communicate geospatial
concepts between systems;
- geospatial conceptual representations are adapted to the specific context, since
they express a user’s precise need for geographic information and convey the
specific data that meet that need;
- the notion of geosemantic proximity supports the functions for recognizing and
generating geospatial conceptual representations that are built into geospatial
concepts, by qualifying the semantic, spatial, and temporal similarity of a
geospatial concept with a geospatial conceptual representation;
- the four-intersection model used to evaluate geosemantic proximity effectively
qualifies the similarity between a geospatial concept and a geospatial conceptual
representation by comparing the full set of their intrinsic properties and their
extrinsic properties;
- the four-intersection model considers only the resemblance of a geospatial
concept to a geospatial conceptual representation and would benefit from
considering their differences;
- the empty/non-empty qualification of each of the four intersections between
intrinsic and extrinsic properties also deserves to be enriched in order to describe
the state of each intersection more precisely;
- the geosemantic proximity approach fits very well within the body of reasoning
methods applied to geospatial data, since it draws on the topological approaches
commonly used with spatial and temporal data (Allen, 1983; Egenhofer, 1993;
Egenhofer and Franzosa, 1991);
- ontologies that are rich in content and structure foster better interoperability
with geospatial databases, since they offer a wider range of geospatial concepts,
intrinsic properties, and extrinsic properties;
- a geospatial data repository composed of a UML model and a data dictionary is an
adequate tool for building an application ontology, since it allows each concept
to be described in detail with its descriptive, spatial, and temporal
characteristics as well as its behaviours, and allows the relationships between
concepts and the role each concept plays in those relationships to be established.
6.4 Research Perspectives
The conceptual framework for geospatial data interoperability and the notion of
geosemantic proximity presented in this thesis represent substantial progress for
geospatial data interoperability. However, several questions and problems require
further study in order to increase geospatial data interoperability, notably:
- the comparison of results obtained with the geosemantic proximity approach with
those obtained from human subjects;
- the analysis of the natural-language definitions associated with the intrinsic and
extrinsic properties (including the semantics of geometry and temporality) of
geospatial concepts and geospatial conceptual representations in order to bring
out properties that are not explicitly described; conceptual graphs are a
promising avenue;
- the analysis of the difference between a geospatial concept and a geospatial
conceptual representation from a qualitative point of view, which would enrich the
notion of geosemantic proximity; the nine-intersection model appears to offer the
characteristics needed to take the difference into account;
- a more precise qualification of the state of each intersection in the
four-intersection matrix, and eventually in the nine-intersection matrix, to
characterize whether some or all of the properties of the geospatial concept
correspond to some or all of the properties of the geospatial conceptual
representation;
- the dynamic and automatic updating of an agent’s ontology based on the recognition
of geospatial conceptual representations, in order to increase its capacity to
generate and recognize geospatial conceptual representations;
- the interaction between application-ontology, domain-ontology, and
global-ontology agents to increase geospatial data interoperability;
- the development of a set of rules for building conceptual models of geospatial
data and for using the UML formalism to develop geospatial ontologies; current
models of geospatial databases are usually presented at the logical or even the
physical level and, consequently, do not offer all the knowledge needed to
support semantic, spatial, and temporal interoperability;
- the evaluation, adaptation, and testing of the Geography Markup Language
(GML), the Resource Description Framework (RDF), and the Web Ontology Language
(OWL) for communicating geospatial conceptual representations such as those
used in the GsP Prototype; GML is a specification of the OpenGIS Consortium Inc.
used for the exchange of geospatial data; RDF and OWL are specifications of the
World Wide Web Consortium Inc. for developing the Semantic Web.
6.5 References
Allen, J F 1983 Maintaining Knowledge about Temporal Intervals. Communications of the
ACM, 26(11): 832-843
Barsalou, L W 1999 Perceptual symbol systems. Behavioral and Brain Sciences, 22(4):
577-609
BC Ministry of Environment Lands and Parks (Geographic Data BC) 1992 Digital
Baseline Mapping at 1:20,000. Victoria, Province of British Columbia, BC
Ministry of Environment, Lands and Parks
Bédard, Y, and M-J Proulx 2002 Perceptory Web Site. Web Page Document,
http://sirs.scg.ulaval.ca/Perceptory
Egenhofer, M 1993 A Model for Detailed Binary Topological Relationships. Geomatica,
47(3 & 4): 261-273
Egenhofer, M, and R D Franzosa 1991 Point-Set Topological Spatial Relations.
International Journal of Geographical Information Systems, 5(2): 161-174
OBM 1996 Ontario Digital Topographic Database - 1:10,000, 1:20,000 - A Guide for
User. Toronto, Ontario, Ministry of Natural Resources
P.E.I. Geomatics Information Centre User’s Guide to Digital and Hardcopy property and
Basemap Products. Charlottetown, P.E.I., Provincial Treasury - Taxation &
Property Records Division
Québec 2000 Base de données topographiques du Québec (BDTQ) à l’échelle de
1/20 000 - Normes de production (Version 1.0). Québec, Ministère des
Ressources naturelles, Direction générale de l’information géographique, CD
Document
Ressources naturelles Canada 1996 Base nationale de données topographiques - normes
et spécifications. Sherbrooke, Québec, Centre d’information topographique –
Sherbrooke
ANNEX A
QUERY ABOUT THE STREET GEOCONCEPT
ENCODED IN AN XML DOCUMENT
<?xml version="1.0" encoding="UTF-8"?>
<GsPmessage type="query">
<conceptualRepresentation>
<intrinsicProperties>
<identification>
<name>street</name>
<definition>rue : voie de communication généralement
bordée de bâtiments dans une agglomération.</definition>
</identification>
<geometry>1</geometry>
</intrinsicProperties>
<extrinsicProperties>
<relationMembership>
<relation>
<name>Inheritance</name>
<firstMember>street</firstMember>
<secondMember>communication route</secondMember>
</relation>
<role>subtype</role>
</relationMembership>
<relationMembership>
<relation>
<name>Inheritance</name>
<firstMember>street</firstMember>
<secondMember>street paved</secondMember>
</relation>
<role>supertype</role>
</relationMembership>
<relationMembership>
<relation>
<name>Inheritance</name>
<firstMember>street</firstMember>
<secondMember>street unpaved</secondMember>
</relation>
<role>supertype</role>
</relationMembership>
</extrinsicProperties>
</conceptualRepresentation>
</GsPmessage>
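As an illustration only, the following Python sketch shows how a message with the structure of the query above could be read with the standard library. The function name `read_message` and the trimmed copy of the message (definition and two relation memberships omitted) are assumptions made for this example; they are not part of the GsP Prototype.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the Annex A query (one relation membership kept for brevity).
QUERY_XML = """<?xml version="1.0" encoding="UTF-8"?>
<GsPmessage type="query">
  <conceptualRepresentation>
    <intrinsicProperties>
      <identification>
        <name>street</name>
      </identification>
      <geometry>1</geometry>
    </intrinsicProperties>
    <extrinsicProperties>
      <relationMembership>
        <relation>
          <name>Inheritance</name>
          <firstMember>street</firstMember>
          <secondMember>communication route</secondMember>
        </relation>
        <role>subtype</role>
      </relationMembership>
    </extrinsicProperties>
  </conceptualRepresentation>
</GsPmessage>"""


def read_message(xml_text):
    """Hypothetical helper: return (type, concept name, geometry, relations)."""
    # Parse from bytes so the encoding declaration is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    rep = root.find("conceptualRepresentation")
    name = rep.findtext("intrinsicProperties/identification/name")
    geometry = rep.findtext("intrinsicProperties/geometry")
    relations = []
    for membership in rep.findall("extrinsicProperties/relationMembership"):
        relations.append((
            membership.findtext("relation/name"),
            membership.findtext("relation/firstMember"),
            membership.findtext("relation/secondMember"),
            membership.findtext("role"),
        ))
    return root.get("type"), name, geometry, relations


msg_type, name, geometry, relations = read_message(QUERY_XML)
print(msg_type, name, geometry, relations)
```

The intrinsic properties (name, geometry) and the extrinsic properties (relation memberships and roles) are returned separately, which mirrors the split made in the message itself.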
ANNEX B
ANSWER WITH THE ROAD GEOCONCEPTREP
ENCODED IN AN XML DOCUMENT
<?xml version="1.0" encoding="UTF-8"?>
<GsPmessage type="answer?street">
<conceptualRepresentation>
<intrinsicProperties>
<identification>
<name>road</name>
<definition>a road for the movement of motor vehicles.</definition>
</identification>
<descriptiveAttribute>
<name>classification</name>
<attributeValue>
<name>highway</name>
<definition>a road for motor vehicles designed for high-speed travel,
usually lacking rail or road intersections. each roadway of a highway is
an entity occurrence.</definition>
</attributeValue>
<attributeValue>
<name>main</name>
<definition>a numbered thoroughfare. access road may not be
numbered. </definition>
</attributeValue>
<attributeValue>
<name>secondary</name>
<definition>a local or rural road not assigned a number. </definition>
</attributeValue>
<attributeValue>
<name>street</name>
<definition>a public road in a residential or commercial area with
buildings on one or both sides.</definition>
</attributeValue>
<attributeValue>
<name>rapid transit</name>
<definition>a road restricted to vehicles of the public
transportation.</definition>
</attributeValue>
<attributeValue>
<name>unknown</name>
<definition>not possible to determine the road classification from the
data source.</definition>
</attributeValue>
</descriptiveAttribute>
<descriptiveAttribute>
<name>support</name>
<attributeValue>
<name>ground level</name>
<definition>a road built directly on ground level.</definition>
</attributeValue>
<attributeValue>
<name>other</name>
<definition>all road support known other than those listed for this
attribute (e.g. bridge, tunnel, or dam).</definition>
</attributeValue>
<attributeValue>
<name>unknown</name>
<definition>not possible to determine the road support from the data
source.</definition>
</attributeValue>
</descriptiveAttribute>
<descriptiveAttribute>
<name>surface</name>
<attributeValue>
<name>hard surface</name>
<definition>a surface made of concrete, asphalt, or tar
gravel.</definition>
</attributeValue>
<attributeValue>
<name>loose surface</name>
<definition>a surface made of other than concrete, asphalt, or tar
gravel.</definition>
</attributeValue>
<attributeValue>
<name>unknown</name>
<definition>not possible to determine the road surface from the data
source. </definition>
</attributeValue>
</descriptiveAttribute>
<descriptiveAttribute>
<name>status</name>
<attributeValue>
<name>under construction</name>
<definition>a road in the preliminary stages of construction, which
would include activities such as grading and/or building embankments
for a bridge, and on which traffic is prohibited for an extended period
of time.</definition>
</attributeValue>
<attributeValue>
<name>unknown</name>
<definition>not possible to determine the road status from the data
source. </definition>
</attributeValue>
<attributeValue>
<name>operational</name>
<definition>road that can be used or is in full operation.</definition>
</attributeValue>
</descriptiveAttribute>
<descriptiveAttribute>
<name>number of lanes</name>
</descriptiveAttribute>
<descriptiveAttribute>
<name>number</name>
</descriptiveAttribute>
<geometry>1</geometry>
</intrinsicProperties>
<extrinsicProperties>
<relationMembership>
<relation>
<name/>
<firstMember>road</firstMember>
<secondMember>root</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>connect</name>
<firstMember>barrier/gate</firstMember>
<secondMember>road</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>connect</name>
<firstMember>highway exit</firstMember>
<secondMember>road</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>connect</name>
<firstMember>ferry route</firstMember>
<secondMember>road</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>connect</name>
<firstMember>road</firstMember>
<secondMember>road</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>connect</name>
<firstMember>road</firstMember>
<secondMember>limited-use road</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>connect</name>
<firstMember>railway</firstMember>
<secondMember>road</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>share</name>
<firstMember>railway</firstMember>
<secondMember>road</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>connect</name>
<firstMember>road</firstMember>
<secondMember>built-up area</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>share</name>
<firstMember>road</firstMember>
<secondMember>built-up area</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>connect</name>
<firstMember>road</firstMember>
<secondMember>dam</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>share</name>
<firstMember>road</firstMember>
<secondMember>dam</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>connect</name>
<firstMember>road</firstMember>
<secondMember>ford</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>connect</name>
<firstMember>road</firstMember>
<secondMember>trail</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>share</name>
<firstMember>road</firstMember>
<secondMember>bridge</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>share</name>
<firstMember>dyke/levee</firstMember>
<secondMember>road</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>share</name>
<firstMember>tunnel</firstMember>
<secondMember>road</secondMember>
</relation>
</relationMembership>
<relationMembership>
<relation>
<name>share</name>
<firstMember>road</firstMember>
<secondMember>snowshed</secondMember>
</relation>
</relationMembership>
</extrinsicProperties>
</conceptualRepresentation>
</GsPmessage>
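The answer above is typed "answer?street", and the term street appears in it as a value of the classification attribute of road. The Python sketch below shows one simple way to locate where a queried term occurs in such an answer. It works on a trimmed copy of the message, and the plain name comparison in the hypothetical function `locate_term` is an assumption made for illustration; it is not the matching performed by the GsP Prototype.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the Annex B answer (two attributes, definitions omitted).
ANSWER_XML = """<GsPmessage type="answer?street">
  <conceptualRepresentation>
    <intrinsicProperties>
      <identification><name>road</name></identification>
      <descriptiveAttribute>
        <name>classification</name>
        <attributeValue><name>highway</name></attributeValue>
        <attributeValue><name>street</name></attributeValue>
      </descriptiveAttribute>
      <descriptiveAttribute>
        <name>surface</name>
        <attributeValue><name>hard surface</name></attributeValue>
      </descriptiveAttribute>
      <geometry>1</geometry>
    </intrinsicProperties>
  </conceptualRepresentation>
</GsPmessage>"""


def locate_term(xml_text, term):
    """Hypothetical helper: list (concept, attribute, value) places naming term."""
    root = ET.fromstring(xml_text)
    intrinsic = root.find("conceptualRepresentation/intrinsicProperties")
    concept = intrinsic.findtext("identification/name")
    hits = []
    if concept == term:
        # The term is the name of the concept itself.
        hits.append((concept, None, None))
    for attribute in intrinsic.findall("descriptiveAttribute"):
        attr_name = attribute.findtext("name")
        for value in attribute.findall("attributeValue"):
            if value.findtext("name") == term:
                # The term is one of the values of this attribute.
                hits.append((concept, attr_name, term))
    return hits


hits = locate_term(ANSWER_XML, "street")
print(hits)
```

Here the only hit is the classification value of road, which is consistent with road being returned as the answer to a query about street.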
ANNEX C
GEOSEMANTIC PROXIMITY
IN SUPPORT OF
GEOSPATIAL INFORMATION DISCOVERY
IN A WIRELESS ENVIRONMENT
Geosemantic Proximity to Improve Geospatial
Information Discovery in a Wireless Environment
(J. Brodeur, Y. Bédard and B. Moulin)
C.1 Summary of the Article
Chapters 3 to 5 of this thesis essentially develop interoperability and the notion of
geosemantic proximity. This annex proposes the notion of geosemantic proximity as an
element that minimizes interaction with geospatial data sources accessible on the Web
and increases the efficiency of search engines for geospatial information. These
geospatial data sources, built for particular needs according to different ontologies,
can now be accessed from pocket computers or personal digital assistants (PDAs)
connected to the Internet over wireless links, and from Web browsers on cell phones
that use the WAP protocol. The narrow bandwidth currently available on the Internet for
wireless connections therefore calls for geospatial data search engines that make the
interaction between the user and the data server more efficient. More specifically, this
annex takes up the conceptual framework for geospatial data interoperability presented
in Chapter 3 and the notion of geosemantic proximity presented in Chapter 4 to interact
with databases that can be used in a wireless environment. Examples illustrate the
relevance of this notion, which supports the efficient search for geospatial data on the
Web, especially in a wireless environment. Finally, we briefly present results obtained
with our prototype.
C.2 Abstract
Today, more and more geospatial data sources, which have been created for specific
purposes using different ontologies, may be searched using a Pocket PC or Palm PDA with
a wireless connection to the Internet, as well as WAP-based Web browsers on cell phones.
In this annex, we propose a solution to increase the efficiency of search engines when
looking for geospatial data. More specifically, we describe a framework for geospatial
data interoperability and the notion of geosemantic proximity to interact with geospatial
databases that could be used in a wireless environment. Examples illustrate the suitability
of this notion to support efficient searching for geospatial data over the Web, especially
in a wireless environment. Finally, we briefly address preliminary results obtained with
our prototype.
C.3 Introduction
It is well known that topographic elements are depicted differently in various geospatial
data sources. For instance, the National Topographic Data Base (NTDB) provided by
Natural Resources Canada, the Street Network Files by Statistics Canada, and the VMap
libraries for military purposes depict Canada differently. There are also several other
topographic data sources produced by provincial departments that depict parts of
Canadian topographic elements, e.g. BC Digital Base Line Mapping (Geographic Data
BC) and the Base de données topographiques du Québec (BDTQ). Typically, these data
sources provide different abstractions of the topographic reality, resulting in data sharing
and integration problems when users try to merge data from two or more sources. For
example, a water area is represented as a waterbody in the NTDB, a lake/pond in
VMap libraries, a lake in BC Digital Base Line Mapping, and a “lac” in Base de
données topographiques du Québec (N.B. point, line, and surface pictograms
symbolize the kind of geometry used to describe the phenomenon geographically
(Bédard, 1999)). Increasingly, such geospatial data sources are becoming readily available
on the Web. For someone using either a Pocket PC or Palm Personal Digital Assistant
(PDA) with a wireless connection to the Internet, or a WAP-based Web browser on a cell
phone, selecting the most appropriate data source requires tedious keying of several
queries before getting the best answer and results in unnecessary and costly data transfer.
Wireless technologies require highly efficient search engines that can identify the
desired geospatial data sources very precisely in order to minimize both the data keying
on these devices, which are awkward to type on, and the cost of data transfer. These facts
led us to develop new approaches to better interoperate with geospatial data sources on
the Web.
Interoperability of geospatial data is considered a solution to various problems, such as
sharing and integrating geospatial data on the fly. It provides the means to solve
syntactic, structural, semantic, geometric, and temporal heterogeneities (Bishr, 1997;
Charron, 1995). Standardization organizations, such as the Open GIS Consortium Inc.
(OGC) and ISO/TC 211-Geographic information/Geomatics, as well as the research
community have built solid foundations of geospatial data interoperability regarding
syntactic and structural heterogeneities (e.g. ISO/TC 211, 2003a; ISO/TC 211, 2003b;
Open GIS Consortium Inc., 1999; Open GIS Consortium Inc., 2001, which give content,
structure, and syntactical descriptions of geospatial data). However, as structural
heterogeneities can only be solved for semantically similar representations of phenomena
(Bishr, 1997), assessing the semantic proximity of geospatial data becomes an important
issue for geospatial data interoperability.
Accessing available geospatial data sources on the World Wide Web in an interoperable
mode nevertheless remains an unresolved issue, one that becomes especially important
with technologies such as PDAs and wireless applications. When interacting with
geospatial data sources, people using PDAs or WAP-enabled cell phones are usually not
aware of the data specifications of these sources, their data dictionaries, or their
technical thesauri, and so cannot get exactly the information they need. Also, given the
current Internet bandwidth for such technologies, searching for geospatial information
on the Web using different keywords could result in tedious and expensive operations.
In this annex, we present a framework for geospatial data interoperability and, more
particularly, the new approach of geosemantic proximity, which simultaneously takes into
consideration the geometric and semantic characteristics of an object and plays an
important role in the framework. Geosemantic proximity is seen here as an approach that
facilitates the search for geospatial information on the Web, based on the user’s
vocabulary, which results in both time and cost savings.
The remainder of this annex is structured as follows. The next section reviews the basic
elements upon which our framework and the notion of geosemantic proximity have been
delineated. Section C.5 presents our framework of geospatial data interoperability.
Section C.6 describes the approach of geosemantic proximity. In section C.7, we mention
a recently developed prototype and preliminary results. We conclude and present future
work in section C.8.
C.4 Background
The framework of geospatial data interoperability and the approach of geosemantic
proximity presented in the following sections are based on studies on human
communication, cognition, database modeling, artificial intelligence (AI), and
geographical information; especially those related to ontology, context, semantic
proximity, topology, mapping specifications, and semantic interoperability. We consider
the human communication process (Schramm, 1971; Weiner, 1950) to be a powerful
representation of interoperability. Human communication corresponds to the process
involving an individual who transmits to someone else something that he has in mind and
that describes phenomena of a given reality. It is essentially composed of a human
source, signals, a communication channel, a human destination, possible noise, and a
feedback component. Cognitive models of the source and the destination refer to signals
(raw and transmitted) that reach their sensory systems and generate perceptual states
(Barsalou, 1999), also called percepts. Human attention selects and records only
properties that appear pertinent and structures them into concepts, or perceptual symbols
(Barsalou, 1999). Concepts are composed of both hidden data-like elements and a
translation process that (1) converts data elements into conceptual representations and (2)
recognizes conceptual representations. Conceptual representations are the physical
symbols used to convey the concept in specific situations.
When communicating, humans deal with multiple representations of real-world
phenomena. The description of real-world phenomena has been studied by people
working in AI (ontology (Gruber, 1993)) and database modeling. Conceptual database
modeling consists of abstracting parts of reality from a data-centered perspective
(Simsion, 2001) in order to convey information about them. Multiple conceptual models could
describe the same portion of reality differently according to the needs of different
systems or users, leading to interoperability problems when integrating the data. In such
cases, an ontology could provide means to facilitate the integration of such data since it
provides linkage elements such as identity (described later in this section), which allow
interoperability.
The context influences the abstraction of real-world phenomena. Context is here defined
as the situation or the circumstances in which phenomena are observed, which drive the
selection of distinctive intrinsic and extrinsic properties, and provide the intended
semantics (Kashyap and Sheth, 1996; Ouksel and Sheth, 1999; Wisse, 2000). When
dealing with geospatial data interoperability, it becomes essential to take the context into
account. Semantic proximity in a context-based perspective is an approach well defined
in the literature (Kashyap and Sheth, 1996; Ouksel and Sheth, 1999; Sheth and Kashyap,
1992) that supports reasoning functionalities and expresses the semantic relationships
between conceptual representations using qualitative predicates such as semantic
resemblance, semantic relevance, semantic relation, semantic equivalence, and semantic
incompatibility.
As mentioned above, conceptual representations are physical symbols used to convey
details about concepts. However, concepts and conceptual representations have to refer to
the same set of phenomena to be interoperable. Therefore, they are not as important as
the phenomena to which they refer. As such, the notion of identity of phenomena appears
to be significantly related to geospatial data interoperability in the sense that concepts and
conceptual representations involved in a communication process should refer to the same
phenomena. In other words, the identity of phenomena must be recognized from source
and destination concepts as well as from the conceptual representation. Identity is then
defined as a meta-property that allows us to distinguish and individualize geographic
phenomena (Guarino and Welty, 2000) as well as to recognize representations that refer
to the same phenomenon.
We can envision that a concept and a conceptual representation are made of intrinsic
properties providing literal meaning and bounded by extrinsic properties restricting the
scope of the concept or the conceptual representation. A concept and a conceptual
representation can be associated to a segment on a semantic axis. The interior of the
segment corresponds to the set of intrinsic properties of the concept or the conceptual
representation whereas the boundary of the segment corresponds to the set of extrinsic
properties. In this regard, the notion of topology as studied in geospatial information by
authors such as (Clementini and Di Felice, 1994; Egenhofer, 1993; Egenhofer and
Franzosa, 1991; Egenhofer et al., 1994) is here extended for the purpose of semantic
interoperability within the approach of geosemantic proximity.
Let us take the example of a road to clarify the above notions of intrinsic and extrinsic
properties of a concept and a conceptual representation and the associated notions of
interior and boundary. On the one hand, a road can be described by its classification type
(e.g. highway, main, secondary, and so on), its surface type (e.g. paved or unpaved), its
road number or road name, and its geometric representation (e.g. a line). These represent
intrinsic properties and, as such, the interior of the road concept. On the other hand, a
road can have relationships with other features such as built-up areas, railways, bridges,
ferry routes, and other roads. The memberships of a road in these relationships represent
extrinsic properties and, as such, are boundaries of the road concept.
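The road example can be written down as a small data structure. The Python sketch below is an illustration under stated assumptions: the class name `Abstraction` and the property labels are hypothetical and simply restate the properties listed in the text, with intrinsic properties playing the role of the interior and extrinsic properties the role of the boundary.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Abstraction:
    """A concept or a conceptual representation with its two property sets."""
    name: str
    intrinsic: frozenset = field(default_factory=frozenset)  # interior
    extrinsic: frozenset = field(default_factory=frozenset)  # boundary


# Property labels are illustrative; they restate the road example in the text.
road = Abstraction(
    name="road",
    intrinsic=frozenset({
        "classification", "surface", "number", "geometry:line",
    }),
    extrinsic=frozenset({
        "relation:built-up area", "relation:railway", "relation:bridge",
        "relation:ferry route", "relation:road",
    }),
)

print(sorted(road.intrinsic))
print(sorted(road.extrinsic))
```

Keeping the two sets separate is what later allows interior and boundary to be compared independently.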
C.5 Geospatial Data Interoperability on the Web
In Figure C1, we illustrate geospatial data interoperability as an interpersonal
communication-like process. For example, this process corresponds to a user agent (Au),
which could be an individual using a pocket PC with a wireless link to the Internet, who
wants information about the road network within the area of Sherbrooke and queries a
geospatial data source, i.e. a data provider agent (Adp), which could be a geospatial
database on a server also connected to the Internet, about streets within the Sherbrooke
area. As soon as Adp gets the request and interprets it using its personal knowledge (e.g.
road and Sherbrooke), it first locates the information corresponding to Au’s request,
then translates it into a form that is understandable by Au (e.g. King Street, Portland
Blvd, and so on), and sends it to Au. Au evaluates the answer he has received and
determines if it corresponds exactly to his request. The two agents can understand each
other because they share a common background and a set of symbols that they use.
Hence, in order to develop our framework for geospatial data interoperability, we use five
expressions of the topographic reality, R, R’, R’’, R’’’, and R’’’’, each representing a
separate ontology, which is related to the others through the communication process.
Together, they form what we call the five ontological phases of geospatial data
interoperability. R corresponds to the topographic reality as it appears to Au at a given
instant and for which Au wants information. R cannot be directly described. R’ refers to
Au’s abstraction of R, which consists of a set of selected properties structured in concepts
in order to form Au’s cognitive model. R’ is called Au’s affordances of R (Gibson, 1979).
R’’ joins together the conceptual representations that Au generates to translate the
significant properties of Au’s concepts in a given situation. These conceptual
representations are physical signals that use a vocabulary to depict the concepts partly or
wholly and to specify the intended meaning. These signals transit through the
communication channel to reach Adp. R’’’ consists of the set of Adp’s concepts. These
concepts are used to decode and recognize the conceptual representations of R’’ and grant
them a specific semantics. In an ideal situation, Adp’s concepts have a meaning closely
similar to Au’s initial concepts. R’’’’ designates the conceptual representations sent back
to Au. They are retrieved from Adp’s knowledge base and encoded before being
transmitted.
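The round trip from R’ to R’’’’ can be mimicked with a toy exchange between two agents. The Python sketch below is only an illustration: the vocabularies, the function names `encode` and `decode`, and the rule of picking the receiver’s concept that shares the most properties are all assumptions made for this example, not the recognition function of the framework.

```python
# Hypothetical vocabularies: each agent maps its own terms to properties.
USER_CONCEPTS = {  # R': concepts of the user agent Au
    "street": {"line", "bordered by buildings", "vehicle traffic"},
}
PROVIDER_CONCEPTS = {  # R''': concepts of the data provider agent Adp
    "road": {"line", "vehicle traffic", "classification"},
    "railway": {"line", "train traffic"},
}


def encode(concepts, term):
    """Turn one concept into a message (a conceptual representation)."""
    return {"term": term, "properties": sorted(concepts[term])}


def decode(concepts, message):
    """Pick the receiver's concept sharing the most properties (assumed rule)."""
    wanted = set(message["properties"])
    best = max(concepts, key=lambda name: len(concepts[name] & wanted))
    # Return None when even the best candidate shares no property at all.
    return best if concepts[best] & wanted else None


query = encode(USER_CONCEPTS, "street")         # R'': sent by Au
recognized = decode(PROVIDER_CONCEPTS, query)   # recognized against R'''
answer = encode(PROVIDER_CONCEPTS, recognized)  # R'''': sent back to Au
print(recognized, answer)
```

Neither agent sees the other’s concepts directly; only the encoded messages travel between them, as in the framework.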
Since the encoding and decoding translation processes are typically viewed as
middleware components, in our framework they are tied to the concepts that appear in R’
and R’’’. These processes generate and recognize the conceptual representations that
match the concepts. They also take into account the respective contexts of the concept
and the conceptual representation.
As illustrated in this framework, geospatial data interoperability is a bi-directional
process that also includes feedback in both directions in order to ensure that messages
have reached the destination and are understood properly. We think that this is an
important issue when considering semantic interoperability of geospatial data.
Figure C1: A Framework for Geospatial Data Interoperability
This communication process is typical on the Web. People surf the Web to find
geospatial information using their respective knowledge and vocabulary. They also use
their knowledge and vocabulary to recognize answers they get and to evaluate them
against their queries. However, as geospatial data sources are not able to recognize
messages encoded in vocabularies other than their own, people have to know in advance the
exact vocabulary or must have access to the metadata repositories describing the
geospatial data (i.e. to have access to data sources’ ontologies) to query the geospatial
data sources. This makes the interaction with geospatial data sources arduous on the
Web. As a result, semantic interoperability with geospatial data sources available on the
Web is still a problem and automatic solutions are more and more needed. Hence, we
propose the notion of geosemantic proximity to resolve this issue.
C.6 Geosemantic Proximity and the Web
As illustrated in Figure C1, agents exchange personal knowledge by communicating
conceptual representations. On the Web, user agents’ concepts and data provider agents’
concepts (as illustrated in Figure C1) must be able to recognize conceptual
representations in the incoming signals and to generate conceptual representations
translating part of their own knowledge. When considering spatial information, an
important aspect is the assessment of geosemantic proximity between a concept and a
conceptual representation. This section presents this new notion of geosemantic
proximity, which takes the context into consideration.
The context is thought of as a meta-concept omnipresent when abstracting phenomena. It
governs the way phenomena are perceived and is typically described by intrinsic
properties (i.e. properties of literal meaning, such as identification, attributes, attribute
values, geometries, temporalities, domain) and extrinsic properties (i.e. properties
providing meaning because of their association with other abstractions, such as semantic,
spatial, and temporal relationships as well as behaviours). Figure C2 illustrates the
relationships that exist between phenomenon, context, abstraction, concept, conceptual
representation, property, intrinsic property, and extrinsic property in a UML class
diagram.
Figure C2: UML Class Diagram Describing Phenomenon,
Abstraction, Context, Properties, and their Relationships
We view the context of a concept K (CK) (just as for the context of a conceptual
representation) as consisting of the union of the intrinsic properties (CK°) and the
extrinsic properties (∂CK) of CK (Equation C1).
CK = CK° ∪ ∂CK (Equation C1)
Where:
CK = Context of concept K
CK° = Intrinsic properties of CK
∂CK = Extrinsic properties of CK
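Equation C1 can be illustrated with a minimal sketch, assuming that each kind of property is modelled as a plain set of labels. The class name `Context` and the property labels below are hypothetical and serve only as an illustration; they are not taken from the prototype or from any product specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Context C_K of a concept K, split as in Equation C1 (illustrative only)."""
    intrinsic: frozenset  # C_K°: properties of literal meaning
    extrinsic: frozenset  # ∂C_K: properties arising from relationships

    def all_properties(self) -> frozenset:
        # C_K = C_K° ∪ ∂C_K
        return self.intrinsic | self.extrinsic


# Hypothetical property labels, invented for this example.
road = Context(
    intrinsic=frozenset({"attribute:name", "geometry:line"}),
    extrinsic=frozenset({"relationship:connects-to-junction"}),
)
print(sorted(road.all_properties()))
```

Keeping the two sets separate, rather than storing only their union, is what later allows intrinsic and extrinsic properties to be compared independently.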
We present the intrinsic properties as the interior of a segment on a semantic axis and the
extrinsic properties as the boundaries of that segment. We use this representation in order
to exploit the topological relationships between the context of a concept and the context
of a conceptual representation.
Geosemantic proximity (GsP) is a context-based approach, which compares intrinsic and
extrinsic properties of a spatial concept to those of a spatial conceptual representation in
order to express their similarity qualitatively. It is used by the translation process tied to
the concept and determines how a given conceptual representation matches this concept.
It consists of the intersection of the concept K’s context and the conceptual representation
L’s context (Equation C2 and Figure C3).
GsP (K,L) = CK ∩ CL (Equation C2)
Where:
CK = Context of concept K
CL = Context of conceptual representation L
GsP (K,L) = Geosemantic proximity between K and L
Figure C3: Intersection between context of K and context of L
We expand GsP into a four-intersection matrix (as used for spatial topological
relationships (Egenhofer, 1993)), which develops the four distinct intersections between
the respective intrinsic and extrinsic properties of the concept K’s context and the
conceptual representation L’s context (Equation C3). Each member of the matrix can be
evaluated as empty, denoted ∅ or f (false), or non-empty, denoted ¬∅ or t (true).
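\[
\mathrm{GsP}(K,L) =
\begin{bmatrix}
\partial C_K \cap \partial C_L & \partial C_K \cap C_L^{\circ} \\
C_K^{\circ} \cap \partial C_L & C_K^{\circ} \cap C_L^{\circ}
\end{bmatrix}
\]
(Equation C3)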
(N.B. The notation used in Equation C3 is the same as the one used by Egenhofer for spatial
relationships (Egenhofer, 1993).)
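The four-intersection matrix can be sketched as follows, assuming again that intrinsic and extrinsic properties are plain sets of labels. The function names `four_intersection` and `predicate` and all property labels are hypothetical; this is an illustration of the matrix, not the prototype's code. The last lines check that the four cells together give back the single intersection of Equation C2.

```python
def four_intersection(k_intrinsic, k_extrinsic, l_intrinsic, l_extrinsic):
    """Return the four cells of the four-intersection matrix, row by row."""
    return (
        k_extrinsic & l_extrinsic,  # ∂C_K ∩ ∂C_L
        k_extrinsic & l_intrinsic,  # ∂C_K ∩ C_L°
        k_intrinsic & l_extrinsic,  # C_K° ∩ ∂C_L
        k_intrinsic & l_intrinsic,  # C_K° ∩ C_L°
    )


def predicate(cells):
    """Encode each cell as t (non-empty) or f (empty)."""
    return "GsP_" + "".join("t" if cell else "f" for cell in cells)


# Hypothetical property labels, invented for this example.
k_int = {"geometry:line", "attribute:surface"}
k_ext = {"relationship:crosses-river"}
l_int = {"geometry:line"}
l_ext = {"relationship:part-of-network"}

cells = four_intersection(k_int, k_ext, l_int, l_ext)
print(predicate(cells))  # GsP_ffft

# The union of the four cells equals C_K ∩ C_L.
whole = (k_int | k_ext) & (l_int | l_ext)
assert set().union(*cells) == whole
```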
Hence, sixteen (2⁴) different predicates are derived. Each predicate is named after the
intersection values of the four-intersection matrix, listed row by row. As shown in
Figure C4, the predicates are gathered into four groups:
- the upper subdivision shows the predicates characterized by common intrinsic and
extrinsic properties (GsP_tfft/equal, GsP_ttft/coveredBy, GsP_tftt/covers, and
GsP_tttt);
- the right subdivision shows predicates characterized by common intrinsic
properties and no common extrinsic properties (GsP_ffft, GsP_fftt/contains,
GsP_ftft/inside, and GsP_fttt/overlap);
- the bottom subdivision shows the predicates characterized by no common
intrinsic properties and no common extrinsic properties (GsP_fttf, GsP_ftff,
GsP_fftf, and GsP_ffff/disjoint); and
- the left subdivision shows the predicates characterized by common extrinsic
properties and no common intrinsic properties (GsP_tttf, GsP_tftf, GsP_ttff, and
GsP_tfff/meet).
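The grouping above can be reproduced with a short sketch. It assumes only what the list states: the first letter of a predicate records common extrinsic properties and the last letter records common intrinsic properties. The names `NAMED`, `group`, and `groups` are hypothetical helpers introduced for illustration.

```python
from itertools import product

# The eight predicates that carry a topological name in the text.
NAMED = {
    "tfft": "equal", "ttft": "coveredBy", "tftt": "covers",
    "fftt": "contains", "ftft": "inside", "fttt": "overlap",
    "ffff": "disjoint", "tfff": "meet",
}


def group(code):
    """Assign a four-letter predicate code to one of the four subdivisions."""
    common_extrinsic = code[0] == "t"  # ∂C_K ∩ ∂C_L is non-empty
    common_intrinsic = code[3] == "t"  # C_K° ∩ C_L° is non-empty
    if common_intrinsic and common_extrinsic:
        return "upper"
    if common_intrinsic:
        return "right"
    if common_extrinsic:
        return "left"
    return "bottom"


codes = ["".join(bits) for bits in product("tf", repeat=4)]
groups = {}
for code in codes:
    groups.setdefault(group(code), []).append(code)

for name, members in groups.items():
    labels = [f"GsP_{c}" + (f"/{NAMED[c]}" if c in NAMED else "") for c in members]
    print(name, labels)
```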
C.6.1 Examples
Let us look at some examples to illustrate how the GsP predicates can be used. In these
examples, we assume an agent is associated with a predefined ontology, which describes
a set of concepts using explicit intrinsic and extrinsic properties. This agent compares
concepts of its associated ontology with conceptual representations it receives as part of a
message from another agent in order to recognize these conceptual representations. These
conceptual representations were typically encoded using the ontology of the agent
transmitting the message. So, when the agent’s concept road as described in (BC
Ministry of Environment Lands and Parks (Geographic Data BC), 1992) is compared to
the conceptual representation vegetation as described in (Natural Resources Canada,
1996), road shows no explicit common intrinsic properties nor explicit common
extrinsic properties with vegetation. Thus the geosemantic proximity of road with
vegetation is GsP_ffff (or disjoint). (N.B. Such an assessment makes no assumption with
regard to the spatial relationships that exist between road and vegetation instances.)
Figure C4: The Sixteen Predicates of Geosemantic Proximity Relationships
However, the comparison of the agent’s concept road defined in (Natural Resources
Canada, 1996) with the conceptual representation street (from the French rue as
defined in (Québec, 2000)) reveals that road has an attribute street and that both have
the same type of geometric representation. As a result, they have common intrinsic
properties. Also, as part of its description, the conceptual representation street has
relationships with other kinds of roads that are included in the concept road;
consequently, the extrinsic properties of street are related to the intrinsic properties of road.
Therefore, we can say that the geosemantic proximity of road with street is GsP_fftt
(or contains). Inversely, if we consider street as the concept and road as the
conceptual representation, the geosemantic proximity of street with road is GsP_ftft
(or inside). As another example, when comparing the agent’s concept hazard to air
navigation with the conceptual representation bridge, both described in (Natural
Resources Canada, 1996), one can see on the one hand that hazard to air navigation
has a specific attribute that includes high bridges. On the other hand, they have one
common geometric representation (line in this case). As such, hazard to air
navigation has common intrinsic properties with bridge. Also, hazard to air
navigation and bridge have relationships with each other; as such, the intrinsic
properties of one are related to the extrinsic properties of the other. Accordingly, we can say
that the geosemantic proximity of hazard to air navigation with bridge is GsP_fttt
(or overlap).
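The road and street example can be sketched in code to show why exchanging the roles of concept and conceptual representation turns contains into inside. The property labels are hypothetical stand-ins chosen to mirror the reasoning in the text; they are not extracted from the NTDB or BDTQ specifications, and the function `gsp` is an illustrative helper rather than the prototype's implementation.

```python
def gsp(k_intrinsic, k_extrinsic, l_intrinsic, l_extrinsic):
    """Return the GsP predicate of concept K with conceptual representation L."""
    cells = (
        k_extrinsic & l_extrinsic,  # ∂C_K ∩ ∂C_L
        k_extrinsic & l_intrinsic,  # ∂C_K ∩ C_L°
        k_intrinsic & l_extrinsic,  # C_K° ∩ ∂C_L
        k_intrinsic & l_intrinsic,  # C_K° ∩ C_L°
    )
    return "GsP_" + "".join("t" if cell else "f" for cell in cells)


# Hypothetical labels: road lists street and highway among its kinds,
# while street relates to highway through a relationship.
road_int = {"geometry:line", "kind:street", "kind:highway"}
road_ext = {"relationship:junction"}
street_int = {"geometry:line", "kind:street"}
street_ext = {"kind:highway"}

print(gsp(road_int, road_ext, street_int, street_ext))  # GsP_fftt (contains)
print(gsp(street_int, street_ext, road_int, road_ext))  # GsP_ftft (inside)
```

Exchanging K and L transposes the matrix, so the two middle letters of the predicate swap while the first and last letters stay the same.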
C.7 Experiments
An experimental prototype was developed recently to validate the GsP approach within
the proposed framework. It was developed in Java and XML, and makes use of geospatial
repositories elaborated with Perceptory (Bédard and Proulx, 2002), a UML-based case
tool that supports geographic information standards of the ISO 19100 series. The
prototype computes automatically the geosemantic proximity of a geoConcept when this
geoConcept is compared to a geoConceptRep. As a result, it simplifies and reduces the
time-consuming task of mapping geoConcepts with geoConceptReps, while minimizing
subjective interpretations and possible mapping errors. Experiments are currently being
conducted using ontologies on road networks and hydrographic networks using product
specifications such as (1) Standards and Specifications for the National Topographic Data
Base (NTDB) of Canada, (2) Specifications for the Digital Baseline Mapping at 1:20000
of Province of British Columbia (DBMBC), (3) Specifications for the Ontario Digital
Topographic Database (ODTDB), (4) Specifications for the “Base de données
topographiques du Québec” (BDTQ), and (5) Specifications for Digital and Hardcopy
Property and Basemap Products of Province of Prince Edward Island (PEIBP).
Preliminary results of the experiment are promising. For example, using geospatial data
repositories that were developed with the above product specifications, the prototype
automatically maps the geoConcept road from NTDB to the geoConceptRep street from
BDTQ with a geosemantic proximity of GsP_ffft, and the geoConcept water
disturbance from NTDB to the geoConceptRep rapids from PEIBP with a
geosemantic proximity of GsP_ffft. The prototype is described in detail in
Chapter 5 of this thesis.
C.8 Conclusion
In this appendix, we recognized that it is essential to take the semantics of geospatial data
into consideration to facilitate and improve the search for geospatial data on the Web,
especially in PDA and WAP-based wireless environments where keying queries is
tedious and data transfer is costly. As a solution, we have presented a conceptual framework
for the semantic interoperability of geospatial data that results from a bidirectional
communication process (Figure C1) involving a user agent and a data provider
agent. In this framework, geosemantic proximity plays a major role for geospatial data
interoperability. It expresses, qualitatively, the semantic similarity of a geospatial concept
with a geospatial conceptual representation based on comparison of their intrinsic and
extrinsic properties, which is developed using a four-intersection matrix. Examples have
been presented to demonstrate the suitability of such an approach. A prototype was
developed recently and experiments are presently being carried out to assess the strengths
and the weaknesses of the approach.
Although our framework for geospatial data interoperability, the notion of geosemantic
proximity, and the preliminary results of our prototype appear promising for accessing
geospatial data sources in an interoperable manner, the experiments presently being
conducted need to be finalized, documented, and discussed. Other issues need to be
investigated further, notably the development of ontologies in the context of semantic
interoperability of geospatial databases and the analysis of natural language definitions in
order to extract more intrinsic and extrinsic properties of geospatial concepts and
geospatial conceptual representations.
Acknowledgements
The authors wish to acknowledge the contribution of Natural Resources Canada – Centre
for Topographic Information, which supports the first author for this research; the
GEOIDE Network of Centres of Excellence in geomatics, project DEC#2 (Designing the
Technological Foundations of Spatial Decision making on the World Wide Web); the
Geomatics Information Centre of Prince Edward Island, Transportation and Public
Works, which have provided information about their geospatial data; as well as the
contribution of Mike Major for the English revision.
C.9 References
Barsalou, L W 1999 Perceptual symbol systems. Behavioral and Brain Sciences, 22(4):
577-609
BC Ministry of Environment Lands and Parks (Geographic Data BC) 1992 Digital
Baseline Mapping at 1:20,000. Victoria, Province of British Columbia, BC
Ministry of Environment, Lands and Parks
Bédard, Y 1999 Visual Modelling of Spatial Database Towards Spatial PVL and UML.
Geomatica, 53(2): 169-186
Bédard, Y, and M-J Proulx 2002 Perceptory Web Site. Web Page Document,
http://sirs.scg.ulaval.ca/Perceptory
Bishr, Y 1997 Semantics Aspects of Interoperable GIS. Ph.D. Dissertation, ITC
Publication
Charron, J 1995 Développement d’un processus de sélection des meilleures sources de
données cartographiques pour leur intégration à une base de données à référence
spatiale. Master's thesis, Université Laval
Clementini, E, and P Di Felice 1995 A Comparison of Methods for Representing
Topological Relationships. Information Sciences-Applications: An International
Journal, 3(3): 149-178
Egenhofer, M 1993 A Model for Detailed Binary Topological Relationships. Geomatica,
47(3 & 4): 261-273
Egenhofer, M, and R D Franzosa 1991 Point-Set Topological Spatial Relations.
International Journal of Geographic Information Science, 5(2): 161-174
Egenhofer, M, D M Mark, and J R Herring 1994 The 9-Intersection: Formalism and Its
Use for Natural-Language Spatial Predicates. Santa Barbara, CA, University of
California, National Center for Geographic Information and Analysis Technical
Report 94-1
Gibson, J J 1979 The Ecological Approach to Visual Perception. Boston, Houghton
Mifflin
Gruber, T R 1993 A Translation Approach to Portable Ontology Specification. Stanford,
California, Knowledge Systems Laboratory Technical Report KSL 92-71
Guarino, N, and C Welty 2000 A Formal Ontology of Properties. In Proceedings of
Knowledge Engineering and Knowledge Management: Methods, Models and
Tools (12th International Conference, EKAW2000), Juan-les-Pins, France. Berlin,
Springer-Verlag Lecture Notes in Computer Science 1937: 97-112
ISO/TC 211 2003a ISO 19107:2003 Geographic Information - Spatial Schema. Geneva,
Switzerland, International Organization for Standardization
ISO/TC 211 2003b ISO 19115:2003 Geographic Information - Metadata. Geneva,
Switzerland, International Organization for Standardization
Kashyap, V, and A Sheth 1996 Semantic and Schematic Similarities Between Database
Objects: A Context-Based Approach. The VLDB Journal, 5: 276-304
Natural Resources Canada 1996 National Topographic Data Base - Standards and
Specifications. Sherbrooke, Quebec, Centre for Topographic Information –
Sherbrooke
Open GIS Consortium Inc. 1999 Topic 1: Feature Geometry. Wayland, Massachusetts,
Open GIS Consortium Inc.
Open GIS Consortium Inc. 2001 Geography Markup Language (GML) 2.0. Wayland,
Massachusetts, Open GIS Consortium Inc.
Ouksel, A M, and A Sheth 1999 Semantic Interoperability in Global Information
Systems: A Brief Introduction to the Research Area and the Special Section.
Sigmod Record, 28(1): 5-12
Québec 2000 Base de données topographiques du Québec (BDTQ) à l’échelle de
1/20 000 - Normes de production (Version 1.0). Québec, Ministère des
Ressources naturelles, Direction générale de l’information géographique, CD
Document
Schramm, W 1971 How Communication Works. In J A DeVito (ed) Communication:
Concepts and Processes. Englewood Cliffs, New Jersey, Prentice-Hall Inc: 12-21
Sheth, A, and V Kashyap 1992 So Far (Schematically) Yet So Near (Semantically). In
Proceedings of IFIP WG2.6 Database Semantics Conference on Interoperable
Database Systems (DS-5)/IFIP Transaction (A-25), Lorne, Victoria, Australia.
Elsevier Science Publishers B.V.: 283-312
Simsion, G C 2001 Data Modeling Essentials - Analysis, Design, and Innovation.
Scottsdale, Arizona, Coriolis
Wiener, N 1950 The Human Use of Human Beings: Cybernetics and Society. Boston,
Houghton Mifflin
Wisse, P 2000 Metapattern: Context and Time in Information Models. Reading,
Massachusetts, Addison-Wesley
BIBLIOGRAPHY
Abel, D J, B C Ooi, K L Tan, and S H Tan 1998 Towards Integrated Geographical
Information Processing. International Journal of Geographic Information
Science, 12(4): 334-371 {§1}
Albrecht, J 1999 Towards interoperable geo-information standards: A comparison of
reference models for geo-spatial information. The Annals of Regional Sciences,
33: 151-169
Allen, J F 1981 An interval-based representation of temporal knowledge. In Proceedings
of International Joint Conference on Artificial Intelligence: 221-226
Allen, J F 1983 Maintaining Knowledge about Temporal Intervals. Communications of the
ACM, 26(11): 832-843 {§3, §4, §6}
Allen, J F 1984 Towards a general theory of action and time. Artificial Intelligence, 23:
123-154
Allen, J F, and P J Hayes 1985 A common-sense theory of time. In Proceedings of 9th
International Joint Conference on Artificial Intelligence: 528-531
Arctur, D, D Hair, G Timson, E P Martin, and R Feagas 1998 Issues and Prospects for the
Next Generation of the Spatial Data Transfer Standard (SDTS). International
Journal of Geographic Information Science, 12(4): 403-425 {§1}
Barsalou, L W 1999 Perceptual symbol systems. Behavioral and Brain Sciences, 22(4):
577-609 {§1, §2, §3, §4, §5, §6, §C}
Bateson, G 2002 Mind and Nature : A Necessary Unity. Cresskill, New Jersey, Hampton
Press
BC Ministry of Environment Lands and Parks (Geographic Data BC) 1992 Digital
Baseline Mapping at 1:20,000. Victoria, Province of British Columbia, BC
Ministry of Environment, Lands and Parks {§1, §3, §4, §5, §6, §C}
Beckwith, R, G A Miller, and R Tengi 1993 Design and Implementation of the WordNet
Lexical Database and Searching Software (Five Papers on Wordnet). Princeton,
Cognitive Science Laboratory, Princeton University
Bédard, Y 1986 A Study of the Nature of Data Using a Communication-based
Conceptual Framework of Land Information. Ph.D. Dissertation, University of
Maine {§2, §3}
Bédard, Y 1998 Analyse et conception de systèmes d’information à référence spatiale.
Notes de cours sur l’interopérabilité, Québec, Département des sciences
géomatiques, Université Laval {§1}
Bédard, Y 1999a Principles of Spatial Database Analysis and Design. In P A Longley,
M F Goodchild, D J Maguire, and D W Rhind (eds) Geographical Information
Systems: Principles, Techniques, Applications and Management. New York, John
Wiley and Sons, Inc.: 413-424 {§2, §3}
Bédard, Y 1999b Visual Modelling of Spatial Database Towards Spatial PVL and UML.
Geomatica, 53(2): 169-186 {§2, §3, §4, §5, §C}
Bédard, Y, and M-J Proulx 2002 Perceptory Web Site. Web Page Document,
http://sirs.scg.ulaval.ca/Perceptory {§1, §4, §5, §6, §C}
Behrens, C, L Shklar, C Basu, N Yeager, and E Au 1999 The Geospatial Interoperability
Problem: Lessons Learned from Building the Geolens Prototype. In M Goodchild,
M Egenhofer, R Fegeas, and C Kottman (eds) Interoperating Geographic
Information Systems. Boston, Massachusetts, Kluwer Academic Publisher: 249-
266
Bennett, D A, G A Wade, and R Sengupta 1999 Geographical Modeling in Heterogeneous
Computing Environments. In M Goodchild, M Egenhofer, R Fegeas, and C
Kottman (eds) Interoperating Geographic Information Systems. Boston,
Massachusetts, Kluwer Academic Publisher: 149-164
Benslimane, D 2001 Interopérabilité de SIG : la solution Isis. Revue internationale de
géomatique, 11(1): 7-42 {§3, §4}
Berendt, B 1996 The utility of mental images: How to construct stable mental models in
an unstable image medium. In Proceedings of First European Workshop on
Cognitive Modeling. Berlin, Technische Universität Berlin, Fachbereich
Informatik Report No. 96-39: 97-103
Bergamaschi, S, S Castano, and M Vincini 1999 Semantic Integration of Semistructured
and Structured Data Sources. Sigmod Record, 28(1): 54-60 {§3}
Bettini, C, S Jajodia, and S Wang 2000 Time Granularities in Databases, Data Mining
and Temporal Reasoning. Berlin, Springer-Verlag
Bishr, Y 1996 A Mechanism for Object Identification and Transfer in a Heterogeneous
Distributed GIS. In Proceedings of 7th International Symposium on Spatial Data
Handling: A.1-A.13
Bishr, Y 1997 Semantics Aspects of Interoperable GIS. Ph.D. Dissertation, ITC
Publication {§1, §3, §5, §C}
Bishr, Y 1998 Overcoming the Semantic and Other Barriers to GIS Interoperability.
International Journal of Geographic Information Science, 12(4): 299-314 {§1}
Bishr, Y 1999a Draft White Paper for the Creation of Semantic SIG. OpenGIS
Consortium Inc. 99-053
Bishr, Y 1999b A Global Unique Persistent Object ID for Geospatial Information
Sharing. In Proceedings of Interoperating Geographic Information Systems
(Interop '99). Berlin, Springer-Verlag Lecture Notes in Computer Science 1580:
55-64
Bishr, Y, H Pundt, W Kuhn, and M Radwan 1999a Probing the Concept of Information
Communities-A First Step Toward Semantic Interoperability. In M Goodchild,
M Egenhofer, R Fegeas, and C Kottman (eds) Interoperating Geographic
Information Systems. Boston, Massachusetts, Kluwer Academic Publisher: 55-69
Bishr, Y, H Pundt, and C Ruther 1999b Proceeding on the Road of Semantic
Interoperability - Design of a Semantic Mapper Based on a Case Study from
Transportation. In Proceedings of Interoperating Geographic Information
Systems (Interop '99). Berlin, Springer-Verlag Lecture Notes in Computer
Science 1580: 203-215
Bittner, T, and G Edwards 2001 Toward an Ontology for Geomatics. Geomatica, 55(4):
475-490 {§3, §5}
Blake, R H, and E O Haroldsen 1975 A Taxonomy of Concepts in Communication. New
York, Hastings House Publishers {§3}
Bolduc, P 1994 Projet de levés intégrés : Némésis. Québec, Département des sciences
géomatiques, Université Laval
Borgo, S, N Guarino, C Masolo, and G Vetere 1997 Using a Large Linguistic Ontology
for Internet-Based Retrieval of Object-Oriented Components. In Proceedings of
9th International Conference on Software Engineering and Knowledge
Engineering (SEKE 97): 528-534
Brodeur, J 2001 Interopérabilité des données géospatiales : Élaboration du concept de
proximité sémantique, spatiale et temporelle. Québec, Département des sciences
géomatiques, Université Laval {§1}
Brodeur, J, and Y Bédard 2001 Geosemantic Proximity, a Component of Spatial Data
Interoperability. In Proceedings of International Workshop on "Semantics in
Enterprise Integration" (OOPSLA 2001): 6 {§4, §5}
Brodeur, J, and Y Bédard 2002 Extending Geospatial Repositories with Geosemantic
Proximity Functionalities to Facilitate the Interoperability of Geospatial Data. In
Proceedings of ISPRS, SDH2002 and CIG Joint International Symposium on
Geospatial Theory, Processing and Application: 6 {§5}
Brodeur, J, Y Bédard, G Edwards, and B Moulin 2003a Revisiting the Concept of
Geospatial Data Interoperability within the Scope of a Human Communication
Process. Transactions in GIS, 7(2): 243-265 {§4, §5}
Brodeur, J, Y Bédard, and B Moulin 2002a A Geosemantic Proximity-Based Prototype
for Interoperability of Geospatial Data. (Manuscript)
Brodeur, J, Y Bédard, and B Moulin 2003b Geosemantic Proximity to Improve
Geospatial Information Discovery in a Wireless Environment. Geomatica, 57(1):
341-350
Brodeur, J, Y Bédard, B Moulin, and G Edwards 2001 Geosemantics Proximity and Data
Fusion. Presentation at the workshop GeoSpatial information Revision and Fusion
Brodeur, J, Y Bédard, B Moulin, and G Edwards 2002b Geosemantic Proximity for
Geospatial Data Interoperability. (Manuscript)
Brodeur, J, Y Bédard, and M J Proulx 2000 Modelling Geospatial Application Databases
using UML-based Repositories Aligned with International Standards in
Geomatics. In Proceedings of Eighth ACM Symposium on Advances in
Geographic Information Systems (ACMGIS) ACM Press: 39-46 {§2, §3, §4, §5}
Brodeur, J, and F Massé 2001 Standardization in Geomatics in Canada and in ISO/TC
211. Geomatica, 55(1): 91-106
Brodie, M L 1992 The Promise of Distributed Computing and the Challenges of Legacy
Information Systems. In Proceedings of IFIP WG2.6 Database Semantics
Conference on Interoperable Database Systems (DS-5)/IFIP Transaction (A-25)
Elsevier Science Publishers B.V.: 1-30
Brunig, M, A B Cremers, H J Götze, S Schmidt, S Shumilov, and A Siehl 1999 First
Steps Towards an Interoperable GIS - an Example from Southern Lower Saxony.
Physics and Chemistry of the Earth Part A - Solid Earth and Geodesy, 24(3): 179-
189
Calvo Garzon, F 2000 State space semantics and conceptual similarity: Reply to
Churchlands. Philosophical Psychology, 13(1): 77-95
Câmara, G, R Thomé, U Freitas, and A Monteiro 1999 Interoperability in Practice:
Problems in Semantic Conversion from Current Technology to OpenGIS. In
Proceedings of Interoperating Geographic Information Systems (Interop '99).
Berlin, Springer-Verlag Lecture Notes in Computer Science 1580: 129-138
Campbell, J 1982 Grammatical Man: Information, Entropy, Language, and Life. New
York, Simon and Schuster {§1, §2, §3}
Canadian Council on Surveying and Mapping 1984 National Standards for the Exchange
of Digital Topographic Data: Topographic Codes and Dictionary of Topographic
Features. Ottawa, Topographical Survey Division, Surveys and Mapping Branch,
Energy, Mines and Resources Canada {§2, §3}
Casati, R, B Smith, and A C Varzi 1998 Ontological Tools for Geographic
Representation. In N Guarino (ed) Formal Ontology in Information Systems.
Amsterdam, IOS Press: 77-85 {§2, §3, §4}
CCOG - Working Group on “Base Data Quality Issue” 2001 A New Vision for Quality
Base Geographic Information in Canada – “GeoBase”. {§5}
Charron, J 1995 Développement d’un processus de sélection des meilleures sources de
données cartographiques pour leur intégration à une base de données à référence
spatiale. Master's thesis, Université Laval {§1, §2, §3, §4, §5, §C}
Cherry, C 1978 On Human Communication: a Review, a Survey, and a Criticism.
Cambridge, Massachusetts, The MIT Press {§1, §2, §3}
Clément, G, C Larouche, P Morin, and D Gouin 1999 Interoperating Geographic
Information Systems Using the Open Geospatial Datastore Interface (OGDI). In
M Goodchild, M Egenhofer, R Fegeas, and C Kottman (eds) Interoperating
Geographic Information Systems. Boston, Massachusetts, Kluwer Academic
Publisher: 283-300
Clementini, E, and P Di Felice 1995 A Comparison of Methods for Representing
Topological Relationships. Information Sciences-Applications: An International
Journal, 3(3): 149-178 {§4, §C}
Clementini, E, and P Di Felice 1996 A Model for Representing Topological Relationship
Between Complex Geometric Features in Spatial Databases. Information
Sciences, 90(1-4): 121-136 {§3, §4}
Cohen, P 1982 Model of Cognition: Overview. In P R Cohen, and E A Feigenbaum (eds)
The Handbook of Artificial Intelligence. HeurisTech Press: 1-10 {§3}
Collongues, A, J Hugues, and B Laroche 1987 Merise - Méthode de conception. Paris,
Bordas {§2, §3}
Coppock, J T, and D W Rhind 1991 The History of GIS. In D J Maguire, M F Goodchild,
and D W Rhind (eds) Geographical Information Systems. New York, Longman
Scientific and Technical: 21-43 {§1}
Costanza, P 2001 Transmigration of Object Identity: The programming Language
GILGUL. In Proceedings of Conference on Object-Oriented Programming,
Systems, Languages and Applications (OOPSLA 2001), Doctoral Symposium, 1-
58113-441-X Association for Computing and Machinery Inc. Companion: 5-6
Daconta, M C, L J Obrst, and K T Smith 2003 The Semantic Web: A Guide to the Future
of XML, Web Services, and Knowledge Management. Indianapolis, Indiana, Wiley
Publishing, Inc. {§2}
Darnell, D K 1971 Information Theory. In J A DeVito (ed) Communication: Concepts
and Processes. Englewood Cliffs, New Jersey, Prentice-Hall Inc: 37-45 {§1, §2,
§3}
Denes, P B, and E N Pinson 1971 The Speech Chain. In J A DeVito (ed)
Communication: Concepts and Processes. Englewood Cliffs, New Jersey,
Prentice-Hall Inc: 3-11 {§1, §3}
Denis, M 1994 Image et Cognition. Paris, Presses Universitaires de France {§3}
Devogele, T, C Parent, and S Spaccapietra 1998 On Spatial Database Integration.
International Journal of Geographic Information Science, 12(4): 335-352
Dong, G, and K Ramamohanarao 1992 Representation and Translation of Queries in
Heterogeneous Databases with Schematic Discrepancies. In Proceedings of IFIP
WG2.6 Database Semantics Conference on Interoperable Database Systems (DS-
5)/IFIP Transaction (A-25) Elsevier Science Publishers B.V.: 177-189
Doyle, A, D Dietrick, J Ebbinghaus, and P Ladstatter 1999 Building a Prototype
OpenGIS Demonstration from Interoperable GIS Components. In Proceedings of
Interoperating Geographic Information Systems (Interop '99). Berlin, Springer-
Verlag Lecture Notes in Computer Science 1580: 139-149
Eco, U 1988a Le signe. Bruxelles, Éditions Labor {§2}
Eco, U 1988b Sémiotique et philosophie du langage. France, Presses Universitaires de
France {§2}
Edmond, D, and M P Papazoglou 1998 Reflexion Is the Essence of Cooperation. In M P
Papazoglou, and G Schlageter (eds) Cooperative Information Systems-Trends and
Directions. San Diego, CA, Academic Press: 233-260
Egenhofer, M 1993 A Model for Detailed Binary Topological Relationships. Geomatica,
47(3 & 4): 261-273 {§1, §3, §4, §6, §C}
Egenhofer, M, and R D Franzosa 1991 Point-Set Topological Spatial Relations.
International Journal of Geographic Information Science, 5(2): 161-174 {§1, §4,
§6, §C}
Egenhofer, M, D M Mark, and J R Herring 1994a The 9-Intersection: Formalism and Its
Use for Natural-Language Spatial Predicates. Santa Barbara, CA, University of
California, National Center for Geographic Information and Analysis Technical
Report 94-1 {§3, §4, §C}
Egenhofer, M, and M A Rodriguez 1999 Relation Algebras over Containers and
Surfaces: An Ontological Study of a Room Space. Journal of Spatial Cognition
and Computation, 1: 23
Egenhofer, M, and J Sharma 1992 Topological Consistency. In Proceedings of 5th
International Symposium on Spatial Data Handling IGU Commission of GIS:
335-343 {§4}
Egenhofer, M J 1997 Spatial Relations: Models and Inferences. In Proceedings of
Tutorial 2 - 5th International Symposium on Spatial Databases (SSD'97): 83 {§4}
Egenhofer, M J 1999 Introduction: Theory and Concepts. In M Goodchild, M Egenhofer,
R Fegeas, and C Kottman (eds) Interoperating Geographic Information Systems.
Boston, Massachusetts, Kluwer Academic Publisher: 1-4 {§3, §5}
Egenhofer, M J, E Clementini, and P Di Felice 1994b Topological relations between
regions with holes. International Journal of Geographic Information Science,
8(2): 129-142 {§4}
Egenhofer, M J, and R D Franzosa 1995 On the equivalence of topological relations.
International Journal of Geographic Information Science, 9(2): 133-152
Egenhofer, M J, J Glasgow, O Günther, J R Herring, and D J Peuquet 1999 Progress in
Computational Methods for Representing Geographical Concept. International
Journal of Geographic Information Science, 13(8): 775-796 {§1}
Egenhofer, M J, and D M Mark 1995 Modelling conceptual neighbourhoods of
topological line-region relations. International Journal of Geographic
Information Science, 9(5): 555-565 {§4}
Ekenberg, L, and P Johannesson 1995 Conflictfreeness as a Basis for Schema Integration.
In Proceedings of CISMOD: 1-13
Evans, J D 1999 Organizational and Technological Interoperability for Geographic
Information Infrastructures. In M Goodchild, M Egenhofer, R Fegeas, and
C Kottman (eds) Interoperating Geographic Information Systems. Boston,
Massachusetts, Kluwer Academic Publisher: 401-414
Fellbaum, C 1993 English Verbs as a Semantic Net. Princeton, Cognitive Science
Laboratory, Princeton University
Fellbaum, C, D Gross, and K Miller 1993 Adjectives in WordNet. Princeton, Cognitive
Science Laboratory, Princeton University
FGDC 2002 NSDI. USGS, Web Page Document, http://www.fgdc.gov/nsdi/nsdi.html
{§1, §5}
Fodor, J A, and Z W Pylyshyn unknown Connectionism and Cognitive Architecture: A
Critical Analysis. Document, http://ruccs.rutgers.edu/pub/papers/jaf.pdf
Fonseca, F, and C A Davis, Jr. 1999 Using the Internet to Access Geographic
Information: An Open GIS Prototype. In M Goodchild, M Egenhofer, R Fegeas,
and C Kottman (eds) Interoperating Geographic Information Systems. Boston,
Massachusetts, Kluwer Academic Publisher: 313-324
Fonseca, F, and M Egenhofer 1999 Ontology-Driven Geographic Information Systems.
In Proceedings of 7th International Symposium on Advances in Geographic
Information Systems (ACMGIS) ACM Press: 14-19
Fonseca, F, M Egenhofer, D Clodoveu, and G Câmara in press Semantic Granularity in
Ontology-Driven Geographic Information Systems. Annals of Mathematics and
Artificial Intelligence
Fonseca, F T, M Egenhofer, A D Clodoveu, Jr., and K A V Borges 1999 Ontologies and
Knowledge Sharing in Urban GIS. Computer Environment and Urban Systems (in
press): 19
Forte, E, F Haenni, K Warkentyne, E Duval, K Cardinaels, E Varvaet, K Hendrikx, M W
Forte, and F Simillion 1999 Semantic and Pedagogic Interoperability Mechanisms
in the ARIADNE Educational Repository. Sigmod Record, 28(1): 6
Fowler, J, B Perry, M Nodine, and B Bargmeyer 1999 Agent-Based Semantic
Interoperability in InfoSleuth. Sigmod Record, 28(1): 8 {§3, §5}
Frank, A U 2000 Spatial Communication with Maps: Defining the Correctness of Maps
Using a Multi-Agent Simulation. In C Freksa, W Brauer, C Habel, and K F
Wender (eds) Spatial Cognition II : Integrating Abstract Theories, Empirical
Studies, Formal Methods, and Practical Applications (International Workshop on
Maps and Diagrammatical Representations of the Environment, Hamburg,
August 1999). Berlin Heidelberg, Springer-Verlag: 80-99 {§4}
Frank, A U 2001 Tiers of ontology and Consistency Constraints in Geographic
Information Systems. International Journal of Geographic Information Science,
15(7): 667-678 {§3}
Frank, A U, and W Kuhn 1999 A Specification Language for Interoperable GIS. In
M Goodchild, M Egenhofer, R Fegeas, and C Kottman (eds) Interoperating
Geographic Information Systems. Boston, Massachusetts, Kluwer Academic
Publisher: 123-132
Frankhauser, P, M Kracker, and E Neuhold 1991 Semantic vs. Structural Resemblance of
Classes. SIGMOD Record, 20(4): 59-63 {§1, §2, §3, §4}
Frankhauser, P, and E J Neuhold 1992 Knowledge Based Integration of Heterogenous
Databases. In Proceedings of IFIP WG2.6 Database Semantics Conference on
Interoperable Database Systems (DS-5)/IFIP Transaction (A-25)
Elsevier Science Publishers B.V.: 155-175 {§1, §2, §3}
Fuller, G W 1999 A Vision for a Global Geospatial Information Network (GGIN).
Photogrammetric Engineering & Remote Sensing, 65(5): 524-538 {§1}
Gahegan, M 1995 Proximity Operators for Qualitative Spatial Reasoning. In Proceedings
of Spatial Information Theory: a Theoretical Basis for GIS, International
Conference COSIT'95, 3-540-60392-1. Berlin, Springer-Verlag Lecture Notes in
Computer Science 988: 31-44
Gahegan, M N 1999 Characterizing the Semantic Content of Geographic Data, Models,
and Systems. In M Goodchild, M Egenhofer, R Fegeas, and C Kottman (eds)
Interoperating Geographic Information Systems. Boston, Massachusetts, Kluwer
Academic Publisher: 71-83
Gal, A 1999 Semantic Interoperability in Information Services: Experience with
CoopWARE. Sigmod Record, 28(1): 8 {§2}
Geller, J, Y Perl, and E J Neuhold 1991 Structure and Semantic in OODB Class
Specifications. SIGMOD Record, 20(4): 40-43
GeoBase 2001 National GeoBase Definition. {§5}
GeoConnections 2002a Canadian Geospatial Data Infrastructure (CGDI) architecture.
Electronic Document, http://www.geoconnections.org/architecture/index_e.html
{§1, §5}
GeoConnections 2002b Geoconnections. Web Page Document, http://geoconnexions.org
GeoConnections Framework Data Node CGDI Framework Data Definition. Electronic
Document,
http://www.geoconnections.org/english/rfp/announcements/moreinfo/RFP_SD_de
finition.pdf
Getta, J R 1992 Translation of Extended Entity-Relationship Database Model into
Object-Oriented Database Model. In Proceedings of IFIP WG2.6 Database
Semantics Conference on Interoperable Database Systems (DS-5)/IFIP
Transaction (A-25) Elsevier Science Publishers B.V.: 87-100
Gibson, J J 1979 The Ecological Approach to Visual Perception. Boston, Houghton
Mifflin {§3, §C}
Goodchild, M 1995 Attribute Accuracy. In S C Guptill, and J L Morrison (eds) Elements
of Spatial Data Quality. Pergamon: 59-79 {§1}
Green, P, and M Rosemann 2000 Integrated Process Modeling: An Ontological
Evaluation. Information Systems, 25(2): 73-87
Gruber, T R 1993a Toward Principles for the Design of Ontologies Used for Knowledge
Sharing. Palo Alto, California, Knowledge Systems Laboratory Technical Report
KSL 93-04 {§3, §5}
Gruber, T R 1993b A Translation Approach to Portable Ontology Specification. Stanford,
California, Knowledge Systems Laboratory Technical Report KSL 92-71 {§1, §2,
§3, §C}
Guarino, N 1995 Formal Ontology, Conceptual Analysis and Knowledge Representation.
International Journal of Human and Computer Studies, 43(5&6): 625-640 {§4}
Guarino, N 1997 Semantic Matching: Formal Ontological Distinctions for Information
Organization, Extraction, and Integration. In M T Pazienza (ed) Information
Extraction: A Multidisciplinary Approach to an Emerging Information
Technology. Berlin, Springer-Verlag: 139-170
Guarino, N 1998 Formal Ontology and Information Systems. In Proceedings of Formal
Ontology in Information Systems (FOIS '98). Amsterdam, IOS Press: 3-15 {§1,
§2, §3, §5}
Guarino, N 1999 The Role of Identity Conditions in Ontology Design. In Proceedings of
Spatial Information Theory - Cognitive and Computational Foundations of
Geographic Information Science, International Conference COSIT'99, 3-540-
66365-7. Berlin, Springer-Verlag Lecture Notes in Computer Science 1661: 221-
234 {§3, §4}
Guarino, N, and C Welty 2000a A Formal Ontology of Properties. In Proceedings of
Knowledge Engineering and Knowledge Management: Methods, Models and
Tools (12th International Conference, EKAW2000), 3-540-41119-4. Berlin,
Springer-Verlag Lecture Notes in Computer Science 1937: 97-112 {§3, §4, §5,
§C}
Guarino, N, and C Welty 2000b Identity, Unity, and Individuation: Towards a Formal
Toolkit for Ontological Analysis. In Proceedings of ECAI-2000: The European
Conference on Artificial Intelligence. Amsterdam, IOS Press: 219-223 {§3, §4}
Guarino, N, and C Welty 2000c Ontological Analysis of Taxonomic Relationships. In
A Laender, and V Storey (eds) 19th International Conference on Conceptual
Modeling (ER-2000). Salt Lake City, Utah, Berlin, Springer-Verlag: 210-224
Guarino, N, and C Welty 2000d Towards a methodology for ontology based model
engineering. In Proceedings of ECOOP-2000 Workshop on Model Engineering
Guha, R V 1990 Micro-theories and contexts in Cyc. I. Basic issues. Austin, Texas,
Micro-electronics and Computer Technology Corporation Technical Report ACT-
CYC-129-90 {§2, §3}
Güting, R H 1994 An Introduction to Spatial Database Systems. VLDB Journal, 3(4):
357-399
Hakimpour, F, and S Timpf 2001 Using Ontologies for Resolution of Semantic
Heterogeneity in GIS. In Proceedings of 4th Agile Conference on Geographic
Information Science
Han, J, R T Ng, Y Fu, and S K Dao 1998 Dealing with Semantic Heterogeneity by
Generalization-Based Data Mining Techniques. In M P Papazoglou, and
G Schlageter (eds) Cooperative Information Systems-Trends and Directions. San
Diego, CA, Academic Press: 207-231
Härder, T, S Günter, and J Thomas 1999 The Intrinsic Problems of Structural
Heterogeneity and an Approach to Their Solution. The VLDB Journal, 8: 25-43
Harrison, R P 1971 Other Ways of Packaging Information. In J A DeVito (ed)
Communication: Concepts and Processes. Englewood Cliffs, New Jersey,
Prentice-Hall Inc: 88-103 {§2}
Härtig, M, and K R Dittrich 1992 An Object-Oriented Integration Framework for
Building Heterogeneous Database Systems. In Proceedings of IFIP WG2.6
Database Semantics Conference on Interoperable Database Systems (DS-5)/IFIP
Transaction (A-25) Elsevier Science Publishers B.V.: 33-53
Harvey, F 1997 Improving Multi-Purpose GIS Design: Participative Design. In
Proceedings of Spatial Information Theory: A Theoretical Basis for GIS,
International Conference COSIT'97, ISBN 3-540-63623-4. Berlin, Springer-
Verlag Lecture Notes in Computer Science 1329: 313-328 {§3}
Harvey, F J 1999 Designing for Interoperability: Overcoming Semantic Differences. In
M Goodchild, M Egenhofer, R Fegeas, and C Kottman (eds) Interoperating
Geographic Information Systems. Boston, Massachusetts, Kluwer Academic
Publisher: 85-97
Harvey, F J 2002 Semantic Interoperability and Citizen/Government Interaction. In
Proceedings of Spatial Data Handling 2002: Joint International Symposium on
Geospatial Theory, Processing, and Applications: 13 {§5}
Harvey, F J, W Kuhn, H Pundt, Y Bishr, and C Riedemann 1999 Semantic
Interoperability: A Central Issue for Sharing Geographic Information. The Annals
of Regional Science, 33(2): 213-233 {§4}
Hernandez, D, E Clementini, and P Di Felice 1995 Qualitative Distances. In Proceedings
of Spatial Information Theory: a Theoretical Basis for GIS, International
Conference COSIT'95, 3-540-60392-1. Berlin, Springer-Verlag Lecture Notes in
Computer Science 988: 45-57
Herring, J R 1999 The OpenGIS Data Model. Photogrammetric Engineering & Remote
Sensing, 65(5): 585-588 {§1}
Hofmann, C 1999 A Multi Tier Framework for Accessing Distributed Heterogeneous
Spatial Data in Federation Based EIS. In Proceedings of 7th International
Symposium on Advances in Geographic Information Systems (ACMGIS) ACM
Press: 140-145
Hsiao, D K 1992a Federated Databases and Systems: Part I - A Tutorial on Their Data
Sharing. VLDB Journal, 1(1): 127-179
Hsiao, D K 1992b Federated Databases and Systems: Part II - A Tutorial on Their
Resource Consolidation. VLDB Journal, 1(2): 285-310
ISO/TC 211 1999 CD.2 19102 Geographic Information - Overview (N723). Geneva,
Switzerland, International Organization for Standardization
ISO/TC 211 2001a DTS 19103 Geographic Information - Conceptual Schema Language
(N1082). Geneva, Switzerland, International Organization for Standardization
{§5}
ISO/TC 211 2001c ISO/DIS 19110 Geographic Information - Feature Cataloguing
Methodology. Geneva, Switzerland, International Organization for
Standardization {§3, §5}
ISO/TC 211 2001d ISO/PDTS 19103 Geographic Information - Conceptual Schema
Language. Geneva, Switzerland, International Organization for Standardization
{§5}
ISO/TC 211 2002a Geographic information/Geomatics. Web Page Document,
http://www.statkart.no/isotc211/ {§1}
ISO/TC 211 2002b ISO 19101:2002 Geographic Information - Reference Model.
Geneva, Switzerland, International Organization for Standardization
ISO/TC 211 2002c ISO 19108:2002 Geographic Information - Temporal Schema.
Geneva, Switzerland, International Organization for Standardization {§4}
ISO/TC 211 2002d ISO 19111:2002 Geographic Information - Spatial Referencing by
Coordinates. Geneva, Switzerland, International Organization for Standardization
ISO/TC 211 2002e ISO/DIS 19109 Geographic Information - Rules for Application
Schema. Geneva, Switzerland, International Organization for Standardization {§3,
§4}
ISO/TC 211 2002f ISO/DIS 19119 Geographic Information - Services. Geneva,
Switzerland, International Organization for Standardization
ISO/TC 211 2003a ISO 19107:2003 Geographic Information - Spatial Schema. Geneva,
Switzerland, International Organization for Standardization {§2, §4, §5, §C}
ISO/TC 211 2003b ISO 19112:2003 Geographic Information - Spatial Referencing by
Geographic Identifier. Geneva, Switzerland, International Organization for
Standardization
ISO/TC 211 2003c ISO 19115:2003 Geographic Information - Metadata. Geneva,
Switzerland, International Organization for Standardization {§4, §5, §C}
ISO/TC 211 2003d ISO/DIS 19118 Geographic Information - Encoding. Geneva,
Switzerland, International Organization for Standardization {§4}
ISO/TC 211 2003e WD 19136 Geographic Information - Geography markup language.
Geneva, Switzerland, International Organization for Standardization {§2}
Johnson, W 1971 The Fateful Process of Mr. A Talking to Mr. B. In J A DeVito (ed)
Communication: Concepts and Processes. Englewood Cliffs, New Jersey,
Prentice-Hall Inc: 22-35
Jones, M 1991 Brave New World: A Vision of IRDS. Database Programming and
Design: 43-49 {§5}
Kahng, J, and D McLeod 1998 Dynamic Classificational Ontologies: Mediation of
Information Sharing in Cooperative Federated Database Systems, Context and
Ontologies. In M P Papazoglou, and G Schlageter (eds) Cooperative Information
Systems-Trends and Directions. San Diego, CA, Academic Press: 179-203 {§2,
§3}
Kashyap, V, and A Sheth 1996 Semantic and Schematic Similarities Between Database
Objects: A Context-Based Approach. The VLDB Journal, 5: 276-304 {§1, §2, §3,
§4, §5, §C}
Kashyap, V, and A Sheth 1998 Semantic Heterogeneity in Global Information Systems:
the Role of Metadata, Context and Ontologies. In M P Papazoglou, and
G Schlageter (eds) Cooperative Information Systems-Trends and Directions. San
Diego, CA, Academic Press: 139-178 {§1, §2, §3, §4}
Kaski, S 1997 Data Exploration Using Self-Organizing Maps. Ph.D. Thesis, Helsinki
University of Technology
Kemp, K K, and A Vckovsky 1998 Towards an Ontology of Fields. In Proceedings of
3rd International Conference on GeoComputation
Kettani, D 1999 Conception et implantation d’un modèle spatial qualitatif qui s’inspire
du raisonnement spatial de l’être humain. Ph.D. Thesis, Université
Laval
Kettani, D, and B Moulin 1999 A spatial model based on the notions of spatial
conceptual map and of object’s influence areas. In Proceedings of Spatial
Information Theory, Cognitive and Computational Foundations of Geographic
Information Systems, International Conference COSIT'99. Berlin, Springer-
Verlag Lecture Notes in Computer Science 1661: 401-415 {§3}
Kim, T J 1999 Metadata for geo-spatial data sharing: A comparative analysis. The Annals
of Regional Science, 33: 171-181
Kosslyn, S M 1980 Image and Mind. Cambridge, Massachusetts, Harvard University Press
{§1, §2, §3}
Kosslyn, S M 1981 The Medium and the Message in Mental Imagery: A Theory.
Psychological Review, 88(1): 46-66 {§2}
Kottman, C 1999 The Open GIS Consortium and Progress Toward Interoperability in
GIS. In M Goodchild, M Egenhofer, R Fegeas, and C Kottman (eds)
Interoperating Geographic Information Systems. Boston, Massachusetts, Kluwer
Academic Publisher: 39-54 {§3, §4, §5}
Krech, D, and R S Crutchfield 1971 Perceiving the World. In W Schramm, and D F
Robert (eds) The Process and Effects of Mass Communication. Champaign-
Urbana, IL, University of Illinois Press: 235-264 {§3}
Kuhn, W 2001 Ontologies in support of activities in geographical space. International
Journal of Geographic Information Science, 15(7): 613-631
Kuhn, W 2002 Modeling The Semantics of Geographic Categories through Conceptual
Integration. In Proceedings of Geographic Information Science, Second
International Conference, GIScience 2002, 3-540-44253-7. Berlin, Springer-
Verlag Lecture Notes in Computer Science 2478: 108-118
Kuipers, B 2000 The Spatial Semantic Hierarchy. Artificial Intelligence, 119: 191-233
Laakso, A, and G Cottrell 2000 Content and Cluster Analysis: Assessing
Representational Similarity in Neural Systems. Philosophical Psychology, 13(1): 47-76
{§2}
Lakoff, G 1987 Women, Fire, and Dangerous Things - What Categories Reveal about the
Mind. Chicago, The University of Chicago Press {§1}
Lasswell, H D 1971 The Structure and Function of Communication in Society. In W
Schramm, and D F Robert (eds) The Process and Effects of Mass Communication.
Champaign-Urbana, IL, University of Illinois Press: 83-99
Laurini, R 1998 Spatial Multi-Database Topological Continuity and Indexing: a Step
Towards Seamless GIS Data Interoperability. International Journal of
Geographic Information Science, 12(4): 373-402 {§1, §3, §4, §5}
Lehmann, F 1992 Semantic Networks. Computers and Mathematics with Applications,
23(2-5): 50 {§1, §2, §3, §4, §5}
Lippmann, W 1971 The World Outside and the Pictures in Our Heads. In W Schramm,
and D F Robert (eds) The Process and Effects of Mass Communication.
Champaign-Urbana, IL, University of Illinois Press: 265-286 {§3}
Locke, J 1689 An Essay Concerning Human Understanding. Web Page
Document, http://humanum.arts.cuhk.edu.hk/Philosophy/Locke/echu/ {§4}
Logie, R H, and M Denis (eds) 1991 Mental images in human cognition. Amsterdam,
North-Holland {§3}
Mackay, D S 1999 Semantic Integration of Environmental Models for Application to
Global Information Systems and Decision Making. Sigmod Record, 28(1): 7 {§2}
Marco, D 2000 Building and Managing the Meta Data Repository. Wiley {§5}
Mark, D M, and M Egenhofer 1994 Modeling Spatial Relations Between Lines and
Regions: Combining Formal Mathematical Models and Human Subjects Testing.
In M Egenhofer, D M Mark, and J R Herring (eds) The 9-Intersection: Formalism
and Its Use for Natural-Language Spatial Predicates (Technical Report).
NCGIA: 29-83 {§4}
Mark, D M, C Freksa, S C Hirtle, R Lloyd, and B Tversky 1999 Cognitive Models of
Geographical Space. International Journal of Geographic Information Science,
13(8): 747-774
Martel, C 1999 Développement d’un cadre théorique pour la gestion des représentations
multiples dans la base de données spatiales. Master's thesis, Université
Laval
McGranagham, M 1997 Spatial data models in current commercial RDBMS. In
Proceedings of Auto-Carto 13 ACSM/ASPRS 5: 136-144
McKee, L 1999 The Impact of Interoperable Geoprocessing. Photogrammetric
Engineering & Remote Sensing, 65(5): 565-566 {§1}
McKee, L, and K Buehler (eds) 1998 The OpenGIS Guide. Wayland, Massachusetts,
OpenGIS Consortium Inc. {§1, §3, §4}
Meijler, T D, and O Nierstrasz 1998 Beyond Objects: Components. In M P Papazoglou,
and G Schlageter (eds) Cooperative Information Systems-Trends and Directions.
San Diego, CA, Academic Press: 49-78
Mennis, J L, D J Peuquet, and L Qian 2000 A Conceptual Framework for Incorporating
Cognitive Principles into Geographical Database Representation. International
Journal of Geographic Information Science, 14(6): 501-520
Merriam-Webster Inc. 1994 Merriam-Webster’s Collegiate Dictionary - Electronic
Edition - Version 1.2. CD Document {§4}
Meta Data Coalition 1999 Knowledge Management Model: Knowledge Description. {§2,
§4}
Microsoft Corporation and Liris Interactive 1996 Bibliorom Larousse, Version 1.0.
Microsoft Corporation and Liris Interactive, CD Document {§4}
Miller, G A 1993 Nouns in WordNet: A Lexical Inheritance System. Princeton,
Cognitive Science Laboratory, Princeton University {§4}
Miller, G A, R Beckwith, C Fellbaum, D Gross, and K Miller 1993 Introduction to
WordNet: An On-line Lexical Database. Princeton, Cognitive Science
Laboratory, Princeton University
Milstead, J L 1998 NISO Z39.50: Standard for Structure and Organization of Information
Retrieval Thesauri. In Proceedings of Taxonomic Authority Files Workshop: 9
{§2, §4}
Moriarty, T 1990 Are You Ready for a Repository? Database Programming and Design:
62-71 {§5}
Navarro, D J, and M D Lee 2001 Clustering Using the Contrast Model. In Proceedings of
Twenty-Third Annual Conference of the Cognitive Science Society, 0-8058-4152-
0 Lawrence Erlbaum Associates, Publishers: 684-689
Navarro, D J, and M D Lee 2002 Commonalities and Distinctions in Featural Stimulus
Representations. In Proceedings of Twenty-Fourth Annual Conference of the
Cognitive Science Society: 685-690
Nebert, D 1999 Interoperable Spatial Data Catalogs. Photogrammetric Engineering &
Remote Sensing, 65(5): 573-575 {§1}
Nouveau-Brunswick 1991 Normes concernant l’information sur les terres et les eaux
pour la province du Nouveau-Brunswick. Corporation d’information
géographique
Nouveau-Brunswick 2000 Guide d’utilisation de la Base de données topographiques
numériques (BDTN) du Nouveau-Brunswick. Services Nouveau-Brunswick {§1,
§3, §4}
Nwana, H S 1996 Software Agents: An Overview. The Knowledge Engineering Review,
11(2): 205-244 {§4, §5}
Nwana, H S, and M Wooldridge 1996 Software Agent Technologies. BT Technology
Journal, 14(4): 68-78 {§5}
Object Management Group 2001 OMG Unified Modeling Language Specification
(version 1.4). Needham MA, OMG {§4}
OBM 1996 Ontario Digital Topographic Database - 1:10,000, 1:20,000 - A Guide for
User. Toronto, Ontario, Ministry of Natural Resources {§1, §3, §4, §5, §6}
Open GIS Consortium Inc. 1999a OpenGIS Simple Features Specification for SQL.
Wayland, Massachusetts, OpenGIS Consortium Inc. {§4, §5}
Open GIS Consortium Inc. 1999b Topic 0: Abstract Specification Overview. Wayland,
Massachusetts, Open GIS Consortium Inc.
Open GIS Consortium Inc. 1999c Topic 1: Feature Geometry. Wayland, Massachusetts,
Open GIS Consortium Inc. {§4, §C}
Open GIS Consortium Inc. 1999d Topic 5: Features. Wayland, Massachusetts, Open GIS
Consortium Inc.
Open GIS Consortium Inc. 1999e Topic 8: Relationship Between Features. Wayland,
Massachusetts, Open GIS Consortium Inc.
Open GIS Consortium Inc. 1999f Topic 10: Feature Collections. Wayland,
Massachusetts, Open GIS Consortium Inc.
Open GIS Consortium Inc. 1999g Topic 11: Metadata. Wayland, Massachusetts, Open
GIS Consortium Inc.
Open GIS Consortium Inc. 1999h Topic 14: Semantic and Information Communities.
Wayland, Massachusetts, Open GIS Consortium Inc.
Open GIS Consortium Inc. 2001a Geography Markup Language (GML) 2.0. Wayland,
Massachusetts, Open GIS Consortium Inc. {§2, §4, §5, §C}
Open GIS Consortium Inc. 2001b Open GIS Consortium - Spatial connectivity for a
changing world. Document, http://www.opensgis.org
Open GIS Consortium Inc. 2002 OpenGIS Specifications. Open GIS Consortium Inc.,
Web Page Document, http://www.opengis.org/ogcSpecs.htm {§1}
Ouksel, A, and C Naiman 1993 Coordinating context building in heterogeneous
information systems. Journal of Intelligent Information Systems, 3: 151-183 {§2,
§3}
Ouksel, A M, and A Sheth 1999 Semantic Interoperability in Global Information
Systems: A Brief Introduction to the Research Area and the Special Section.
Sigmod Record, 28(1): 5-12 {§1, §2, §3, §4, §5, §C}
P.E.I. Geomatics Information Centre User’s Guide to Digital and Hardcopy property and
Basemap Products. Charlottetown, P.E.I., Provincial Treasury - Taxation &
Property Records Division {§1, §4, §5, §6}
Papazoglou, M P, and J Hoppenbrouwers 1999 Contextualizing the Information Space in
Federated Digital Libraries. Sigmod Record, 28(1): 7
Parent, C, and S Spaccapietra 1998 Issues and approaches of database integration.
Communications of the ACM, 41(5): 166-178
Patel, K C 2001 The Difference Between Reading and Understanding and the Impact on
Scalability and Performance (Part 3) Semantics and Context. XML Journal, 2(2):
36-39
Payne, T R, M Paolucci, R Singh, and K Sycara 2002 Communicating Agents in Open
Agent Systems. In Proceedings of First GSFC/JPL Workshop on Radical Agent
Concepts (WRAC): 10 {§5}
Peuquet, D, B Smith, and B Brogaard 1998 The Ontology of Fields. In Proceedings of
Summer Assembly of the University Consortium for Geographic Information
Science {§2, §3, §5}
Plewe, B S, and S R Johnson 1999 Automated Metadata Interpretation to Assist in the
Use of Unfamiliar GIS Data Sources. In M Goodchild, M Egenhofer, R Fegeas,
and C Kottman (eds) Interoperating Geographic Information Systems. Boston,
Massachusetts, Kluwer Academic Publisher: 203-214
Ploux, S 1997 Modélisation et traitement informatique de la synonymie. Lingvisticæ
Investigationes, XXI(1): 1-28
Ploux, S, and B Victorri 1998 Construction d’espaces sémantiques à l’aide de
dictionnaires de synonymes. T.A.L., 39(1): 161-182
Prabandham, M, W J Selfridge, and D D Mann 1990 The Role of IRDS. Database
Programming and Design: 41-48 {§5}
Priest, G 1999 Semantic Closure, Descriptions and Non-Triviality. Journal of
Philosophical Logic, 28(6): 549-558
Pylyshyn, Z W 1981 The Imagery Debate: Analogue Media Versus Tacit Knowledge.
Psychological Review, 88(1): 16-45 {§1, §2, §3}
Pylyshyn, Z W 1999 What is in your Mind? (Manuscript)
Pylyshyn, Z W In Press Mental Imagery: In search of a theory. Behavioral and Brain
Sciences: 53 {§2}
Québec 2000 Base de données topographiques du Québec (BDTQ) à l’échelle de
1/20 000 - Normes de production (Version 1.0). Québec, Ministère des
Ressources naturelles, Direction générale de l’information géographique, CD
Document {§1, §3, §4, §5, §6, §C}
Québec-Office de la langue française 2000 Le grand dictionnaire terminologique. Web
Page Document, www.granddictionnaire.com
RBMS Bibliographic Standards Committee XIII. Thesaurus Construction and
Maintenance Guidelines.
Ressources naturelles Canada 1996 Base nationale de données topographiques - normes
et spécifications. Sherbrooke, Québec, Centre d’information topographique –
Sherbrooke {§1, §2, §3, §4, §5, §6, §C}
Ressources naturelles Canada 2002 L’Atlas national du Canada en ligne. Web Page
Document, http://atlas.gc.ca
Riedemann, C, and W Kuhn 1999 What are Sports Grounds? (Or: Why Semantics
Requires Interoperability). In Proceedings of Interoperating Geographic
Information Systems (Interop '99). Berlin, Springer-Verlag Lecture Notes in
Computer Science 1580: 217-229
Rips, L, J Shoben, and E Smith 1973 Semantic Distance and the Verification of
Semantic Relations. Journal of Verbal Learning and Verbal Behavior, 12: 1-20
{§2}
Rishe, N, W Sun, D Barton, Y Deng, C Orji, M Alexopoulos, L Loureiro, C Ordonez, M
Sanchez, and A Shaposhnikov 1995 Florida International University High
Performance Database Research Center. SIGMOD Record, 24(3): 71-76
Rodriguez, A, M J Egenhofer, and R D Rugg 1999 Assessing Semantic Similarities
Among Geospatial Feature Class Definition. In Proceedings of Interoperating
Geographic Information Systems (Interop '99). Berlin, Springer-Verlag Lecture
Notes in Computer Science 1580: 189-202 {§1, §4}
Rodriguez, M A 2000 Assessing Semantic Similarity Among Entity Classes. Ph.D. Thesis,
University of Maine {§1, §2, §3, §4, §5}
Rugg, R D, M Egenhofer, and W Kuhn 1997 Formalizing Behavior of Geographic
Feature Types. Geographical Systems, 4(2): 8
Salgé, F 1995 Semantic Accuracy. In S C Guptill, and J L Morrison (eds) Elements of
Spatial Data Quality. Pergamon: 139-151
Saltor, F, G Castellanos, and M Garcia-Solaco 1992 Overcoming Schematic
Discrepancies in Interoperable Databases. In Proceedings of IFIP WG2.6
Database Semantics Conference on Interoperable Database Systems (DS-5)/IFIP
Transaction (A-25) Elsevier Science Publishers B.V.: 191-205
Saltor, F, M G Castellanos, and M Garcia-Solaco 1991 Suitability of Data Models as
Canonical Models for Federated Databases. SIGMOD Record, 20(4): 44-48
Sargent, P 1999 Feature Identities, Descriptors and Handles. In Proceedings of
Interoperating Geographic Information Systems (Interop '99). Berlin, Springer-
Verlag Lecture Notes in Computer Science 1580: 41-53
Schramm, W 1971a How Communication Works. In J A DeVito (ed) Communication:
Concepts and Processes. Englewood Cliffs, New Jersey, Prentice-Hall Inc: 12-21
{§2, §3, §4}
Schramm, W 1971b The Nature of Communication Between Humans. In W Schramm,
and D F Robert (eds) The Process and Effects of Mass Communication.
Champaign-Urbana, IL, University of Illinois Press: 3-53 {§1, §2, §3, §5, §C}
Sciore, E, M Siegel, and A Rosenthal 1992 Context interchange using metaattributes. In
Proceedings of First International Conference on Information and Knowledge
Management (CIKM): 377-386 {§2, §3}
Sears, D O, and J L Freedman 1971 Selective Exposure to Information: A Critical
Review. In W Schramm, and D F Robert (eds) The Process and Effects of Mass
Communication. Champaign-Urbana, IL, University of Illinois Press: 209-234
{§3}
Sester, M 2000 Knowledge Acquisition for the Automatic Interpretation of Spatial Data.
International Journal of Geographic Information Science, 14(1): 1-24
Shannon, C E 1948 A Mathematical Theory of Communication. The Bell System
Technical Journal, 27: 379-423, 623-656 {§2}
Sheth, A 1999 Changing Focus on Interoperability in Information Systems: From
Systems, Syntax, Structure to Semantics. In M Goodchild, M Egenhofer,
R Fegeas, and C Kottman (eds) Interoperating Geographic Information Systems.
Boston, Massachusetts, Kluwer Academic Publisher: 5-29 {§1, §2, §3, §4, §5}
Sheth, A, and V Kashyap 1992 So Far (Schematically) Yet So Near (Semantically). In
Proceedings of IFIP WG2.6 Database Semantics Conference on Interoperable
Database Systems (DS-5)/IFIP Transaction (A-25)
Elsevier Science Publishers B.V.: 283-312 {§1, §2, §3, §4, §5, §C}
Sheth, A P, and J A Larson 1990 Federated Database Systems for Managing Distributed,
Heterogeneous, and Autonomous Databases. ACM Computing Surveys, 22(3): 183-
236 {§2}
Simsion, G C 2001 Data Modeling Essentials - Analysis, Design, and Innovation.
Scottsdale, Arizona, Coriolis {§2, §3, §5, §C}
Smith, B 1994 Fiat Objects. In Proceedings of Workshop on Parts and Wholes:
Conceptual Part-Whole Relations and Formal Mereology, 11th European
Conference on Artificial Intelligence: 15-23 {§2, §3, §4}
Smith, B, and D Mark 1999 Ontology with Human Subjects Testing: An Empirical
Investigation of Geographic Categories. American Journal of Economics and
Sociology, 58(2): 245-272 {§2, §3, §4, §5}
Smith, B, and A C Varzi 2000 Fiat and Bona Fide Boundaries. Philosophy and
Phenomenological Research, 60(2): 401-420 {§2, §3, §4}
Smith, H, and K Poulter 1999 The Role of Shared Ontology in XML-Based Trading
Architecture. Web Page Document, http://www.ontology.org/main/papers/cacm-
agents99.html
Smith, K 1999 Unpacking the Semantics of Source and Usage to Perform Semantic
Reconciliation in Large-Scale Information Systems. Sigmod Record, 28(1): 6
{§3}
Sondheim, M, K Gardels, and K Buehler 1999 GIS Interoperability. In P A Longley, M F
Goodchild, D J Maguire, and D W Rhind (eds) Geographical Information
Systems: Principles, Techniques, Applications and Management. New York, John
Wiley and Sons, Inc.: 347-358 {§3}
Sowa, J F 1984 Chapter 7: Limits of Conceptualisation. In Conceptual Structures:
Information Processing in Mind and Machine. Reading, Massachusetts, Addison-
Wesley Publishing Company: 339-351 {§1, §3, §4}
Sowa, J F 1987 Semantic Networks. In S C Shapiro (ed) Encyclopedia of Artificial
Intelligence. New York, John Wiley & Sons {§1, §2}
Spaccapietra, S, and C Parent 1990 Intégration de vues et relativisme sémantique. In
Proceedings of Vièmes Journées Base de Données Avancées: 17
Spaccapietra, S, C Parent, and Y Dupont 1991 Automating Heterogeneous Schema
Integration. École polytechnique fédérale de Lausanne, Laboratoire de bases de
données, Département d’informatique
Spaccapietra, S, C Parent, and Y Dupont 1992 Model Independent Assertions for
Integration of Heterogeneous Schemas. VLDB Journal, 1(1): 81-126
Statistique Canada 1997 Fichiers numériques des limites et fichiers numériques
cartographiques, Recensement de 1996 (guide de référence). Ottawa, Ministère
de l’Industrie {§1, §3, §4, §5}
Stock, K, and D Pullar 1999 Identifying Semantically Similar Elements in Heterogeneous
Spatial Databases Using Predicate Logic Expressions. In Proceedings of
Interoperating Geographic Information Systems (Interop '99). Berlin, Springer-
Verlag Lecture Notes in Computer Science 1580: 231-252
Storey, V C 1993 Understanding Semantic Relationships. VLDB Journal, 2(4): 455-488
Sui, D Z 1998 GIS-Based Urban Modelling: Practices, Problems, and Prospects.
International Journal of Geographic Information Science, 12(7): 651-671
Sun Microsystems Inc. 2002 Java™ API for XML Processing (JAXP). Sun
Microsystems Inc., Web Page Document, http://java.sun.com/xml/jaxp/index.html
{§5}
Sycara, K, M Klusch, S Widoff, and J Lu 1999 Dynamic Service Matchmaking Among
Agents in Open Information Environments. Sigmod Record, 28(1): 47-53 {§3,
§5}
Tari, Z 1992 Interoperability Between Database Models. In Proceedings of IFIP WG2.6
Database Semantics Conference on Interoperable Database Systems (DS-5)/IFIP
Transaction (A-25) Elsevier Science Publishers B.V.: 101-119 {§2}
Taylor, M 1999 Zthes: a Z39.50 Profile for Thesaurus Navigation. Web Page Document,
http://lcweb.loc.gov/z3950/agency/profiles/zthes-02.html
The Apache Software Foundation 2002a Xalan-Java version 2.4.0. The Apache
Software Foundation, Web Page Document, http://xml.apache.org/xalan-j/ {§5}
The Apache Software Foundation 2002b Xerces-Java Java Parser Readme. The Apache
Software Foundation, Web Page Document, http://xml.apache.org/xalan-j/
Tversky, A 1977 Features of similarity. Psychological Review, 84(4): 327-352 {§2}
Urban, S D, and J Wu 1991 Resolving Semantic Heterogeneity Through the Explicit
Representation of Data Models Semantics. SIGMOD Record, 20(4): 55-58
Vckovski, A 1999 Interoperability and Spatial Information Theory. In M Goodchild,
M Egenhofer, R Fegeas, and C Kottman (eds) Interoperating Geographic
Information Systems. Boston, Massachusetts, Kluwer Academic Publisher: 31-37
Vckovski, A, K Brassel, and H J Schek (eds) 1999 Interoperating Geographic
Information Systems (Interop '99). Berlin, Springer-Verlag {§1}
Ventrone, V, and S Heiler 1991 Semantic Heterogeneity as a Result of Domain
Evolution. SIGMOD Record, 20(4): 16-20
VMap 1995 Vector Map (VMap), Level 1. Bethesda, MD, U.S. National Imagery and
Mapping Agency Mil-V-89033 {§1, §3, §4, §5}
Voisard, A, and H Schweppe 1998 Abstraction and Decomposition in Interoperable GIS.
International Journal of Geographic Information Science, 12(4): 315-333 {§1}
Waterson, A, and A Preece 1999 Verifying Ontological Commitment in Knowledge-
Based Systems. Knowledge-Based Systems, 12(1-2): 45-54
Weiner, N 1950 The Human Use of Human Beings: Cybernetics and Society. Boston,
Houghton and Mifflin {§1, §2, §3, §C}
Weisstein, E W 1999 Topological space. Mathworld.wolfram.com, Web Page
Document, http://mathworld.wolfram.com/TopologicalSpace.html {§4}
Wiederhold, G 1999 Mediation to Deal with Heterogeneous Data Sources. In Proceedings
of Interoperating Geographic Information Systems (Interop '99). Berlin, Springer-
Verlag Lecture Notes in Computer Science 1580: 1-16
Wisse, P 2000 Metapattern: Context and Time in Information Models. Reading,
Massachusetts, Addison-Wesley {§3, §4, §C}
WordNet, a lexical database for the English language. Web Page Document,
http://www.cogsci.princeton.edu/~wn/ {§4}
Wurm, L H 2000 The Adaptive Value of Lexical Connotation in Speech Perception.
Cognition and Emotion, 14(2): 177-191
Xhu, Z, and Y C Lee 2002 Semantic Heterogeneity of Geodata. In Proceedings of ISPRS
Commission IV Symposium 2002: Joint International Symposium on Geospatial
Theory, Processing, and Applications: 6 {§5}
Yan, L, and T Ling 1992 Translating Relational Schema With Constraints Into OODB
Schema. In Proceedings of IFIP WG2.6 Database Semantics Conference on
Interoperable Database Systems (DS-5)/IFIP Transaction (A-25)
Elsevier Science Publishers B.V.: 69-85