| Linkable geographic ontologies | | BIBAK | Full-Text | 1 | |
| Francisco J. Lopez-Pellicer; Mário J. Silva; Marcirio Chaves | |||
| The performance of some tasks in Information Retrieval is strongly related
to the extent and quality of the geographic knowledge about named places. This
paper presents a conceptualization of the geographic knowledge, the Geo-Net
vocabulary, and a tool for building large knowledge bases of named places, the
GKB management system, developed in the GREASE-II project. The Geo-Net
vocabulary is a conceptual model for describing geographic places, including
their names, types, relationships and footprints. It uses URIs and the RDF data
model to expose, share and connect pieces of geographic knowledge to each other
and to related data on the Web. The GKB system is a multi-paradigm knowledge
management system that enables the development of geographic ontologies with
the Geo-Net vocabulary. This paper also presents a geographic ontology of
Portugal, Geo-Net-PT 02, created with the Geo-Net vocabulary and the GKB
system. Keywords: geo-ontologies, geographic information retrieval, geographic knowledge base,
linked data | |||
| Towards mapping of alpine route descriptions | | BIBAK | Full-Text | 2 | |
| Michael Piotrowski; Samuel Läubli; Martin Volk | |||
| We describe a corpus of historic mountaineering accounts and on-going work
on geocoding toponyms and route descriptions in these accounts. Mountaineering
accounts contain a wealth of geographic information but its extraction for
purposes of geographic information retrieval poses specific challenges, in
particular the distinction between toponyms pertinent to route descriptions and
those mentioned in descriptions of panoramas. We describe some preliminary
considerations for natural language cues to distinguish between these two types
of occurrences. Keywords: cultural heritage data, geographic information retrieval, mountaineering
accounts, route extraction, toponym resolution | |||
| Unnamed locations, underspecified regions, and other linguistic phenomena in geographic annotation of water-based locations | | BIBAK | Full-Text | 3 | |
| Johannes Leveling | |||
| This short paper investigates how locations in or close to water masses in
topics and documents (e.g. rivers, seas, oceans) are referred to. For this
study, 13 topics from the GeoCLEF topics 2005-2008 aiming at documents on
rivers, oceans, or sea names were selected and the corresponding relevant
documents retrieved and manually annotated.
Results of the geographic annotation indicate that i) topics aiming at locations close to water contain a wide variety of spatial relations (indicated by different prepositions), ii) unnamed locations can be generated on-the-fly by referring to movable objects (e.g. ships, planes) travelling along a path, iii) underspecified regions are referenced by proximity or distance or directional relations. In addition, several generic expressions (e.g. "in international waters") are frequently used, but refer to different underspecified regions. Keywords: GIR, annotation, toponyms | |||
| An ontology of place and service types to facilitate place-affordance geographic information retrieval | | BIBAK | Full-Text | 4 | |
| Ahmed N. Alazzawi; Alia I. Abdelmoty; Christopher B. Jones | |||
| In order to facilitate place-affordance queries on the Web, this work
proposes the employment of an ontology of place and service types. While other
works defined place-affordance by associating a place with its physical
objects, the conceptual view of a place-affordance in this work is based on
associating a place type with its typical service types, which is reflected in
the ontology construction methodology. Preliminary results, as well as an
overview of the current work, are briefly introduced. Keywords: place ontology, place-affordance, semantic web | |||
| Towards automated georeferencing of Flickr photos | | BIBAK | Full-Text | 5 | |
| Olivier Van Laere; Steven Schockaert; Bart Dhoedt | |||
| We explore the task of automatically assigning geographic coordinates to
photos on Flickr. Using an approach based on k-medoids clustering and Naive
Bayes classification, we demonstrate that the task is feasible, although high
accuracy can only be expected for a portion of all photos. Based on this
observation, we stress the importance of adaptive approaches that estimate
locations at different granularities for different photos. Keywords: georeferencing, naive Bayes classification, web 2.0 | |||
| Geotagging: using proximity, sibling, and prominence clues to understand comma groups | | BIBAK | Full-Text | 6 | |
| Michael D. Lieberman; Hanan Samet; Jagan Sankaranarayanan | |||
| Geotagging is the process of recognizing textual references to geographic
locations, known as toponyms, and resolving these references by assigning each
lat/long values. Typical geotagging algorithms use a variety of heuristic
evidence to select the correct interpretation for each toponym. A study is
presented of one such heuristic which aids in recognizing and resolving lists
of toponyms, referred to as comma groups. Comma groups of toponyms are
recognized and resolved by inferring the common threads that bind them
together, based on the toponyms' shared geographic attributes. Three such
common threads are proposed and studied -- population-based prominence,
distance-based proximity, and sibling relationships in a geographic hierarchy
-- and examples of each are noted. In addition, measurements are made of these
comma groups' usage and variety in a large dataset of news articles, indicating
that the proposed heuristics, and in particular the proximity and sibling
heuristics, are useful for resolving comma group toponyms. Keywords: comma groups, geotagging, toponyms | |||
| Evaluation of georeferencing | | BIBAK | Full-Text | 7 | |
| Richard Tobin; Claire Grover; Kate Byrne; James Reid; Jo Walsh | |||
| In this paper we describe a georeferencing system which first uses
Information Extraction techniques to identify place names in textual documents
and which then resolves the place names against a choice of gazetteers. We have
used the system to georeference three digitised historical collections and have
evaluated its performance against human annotated gold standard samples from
the three collections. We have also evaluated its performance on the SpatialML
corpus which is a geo-annotated corpus of newspaper text. The main focus of
this paper is the evaluation of georesolution and we discuss evaluation methods
and issues arising from the evaluation. Keywords: evaluation, georeferencing, named entity recognition, toponym resolution | |||
| A GIR architecture with semantic-flavored query reformulation | | BIBAK | Full-Text | 8 | |
| Nuno Cardoso; Mário J. Silva | |||
| Most geographic queries include references to entities (geographic and
non-geographic). Grounding such entities is essential to properly understand
the user's information need. As statistical-based query reformulation
strategies work at term level, not entity level, they don't use the semantic
information given by such entities, which is considerably relevant for the
types of queries that should be handled by GIR systems. We motivate the need for
a semantic-flavored query reformulation approach for geographic information
retrieval systems and describe a GIR architecture where query reformulation
focuses on i) grounding entities in the query, ii) selecting a reasoning
strategy according to the user information need, and iii) generating a
reformulated query containing answers and related entities for a more focused
retrieval step. Reformulated queries obtain the answers by accessing a
knowledge base. Keywords: evaluation, geographic ontology, geographical information retrieval,
information management, query reformulation | |||
| OGC catalog service for heterogeneous earth observation metadata using extensible search indices | | BIBAK | Full-Text | 9 | |
| Isao Kojima; Masahiro Kimoto; Akiyoshi Matono | |||
| In this paper, we propose an extensible information retrieval system based
on data typed indices. The indices are constructed for various data types and
are customized and extensible. Based on this system, we have implemented a
catalog service of earth observation metadata. Using this system, it is
possible to search through a large amount of metadata with heterogeneous
schema. Fast response time is also achieved regardless of the number of
individual query results. Keywords: earth observation, heterogeneous data integration, information retrieval,
open geospatial consortium, search engines | |||
| TWinner: understanding news queries with geo-content using Twitter | | BIBAK | Full-Text | 10 | |
| Satyen Abrol; Latifur Khan | |||
| In the present world scenario, where the search engine wars are becoming
fiercer than ever, it becomes necessary for each search engine to realize the
intent of the user query to be able to provide him with more relevant search
results. Amongst the various categories of search queries, a major portion is
constituted by those having news intent. Seeing the tremendous growth of social
media users, the spatial-temporal nature of the media can prove to be a very
useful tool to improve the search quality. In our work we examine the
development of such a tool that combines social media in improving the quality
of web search and predicting whether the user is looking for news or not. We go
one step beyond the previous research by mining Twitter messages, assigning
weights to them and determining keywords that can be added to the search query
to act as pointers to the existing search engine algorithms suggesting to it
that the user is looking for news. We conduct a series of experiments and show
the impact that TWinner has on the results. Keywords: geographic information retrieval, news queries, search engines | |||
| Getting context on the go: mobile urban exploration with ambient tag clouds | | BIBAK | Full-Text | 11 | |
| Matthias Baldauf; Rainer Simon | |||
| Tag clouds are a well-established concept for organizing and visualizing
large amounts of user-generated content annotated with keywords. Applied on
mobile devices, so-called 'ambient tag clouds' which are based on surrounding
georeferenced and tagged resources may act as compact location descriptors.
This paper presents our on-going work towards more expressive ambient tag
clouds. By analyzing locative textual Web content, such representations
summarizing available background information can be generated without
explicitly assigned tags. Thus, these ambient tag clouds enable the mobile
exploration of a place's semantics beyond visible objects and common
points-of-interest. Keywords: location-based service, tag cloud, user-generated content | |||
| Geographical classification of documents using evidence from Wikipedia | | BIBAK | Full-Text | 12 | |
| Rafael Odon de Alencar; Clodoveu Augusto Davis, Jr.; Marcos André Gonçalves | |||
| Obtaining or approximating a geographic location for search results often
motivates users to include place names and other geography-related terms in
their queries. Previous work shows that queries that include geography-related
terms correspond to a significant share of the users' demand. Therefore, it is
important to recognize the association of documents to places in order to
adequately respond to such queries. This paper describes strategies for text
classification into geography-related categories, using evidence extracted from
Wikipedia. We use terms that correspond to entry titles and the connections
between entries in Wikipedia's graph to establish a semantic network from which
classification features are generated. Results of experiments using a news
data-set, classified over Brazilian states, show that such terms constitute
valid evidence for the geographical classification of documents, and
demonstrate the potential of this technique for text classification. Keywords: geographic information retrieval, geospatial evidence, text classification | |||
| Images and perceptions of neighbourhood extents | | BIBAK | Full-Text | 13 | |
| Paul Clough; Robert Pasley | |||
| In this paper, we describe an experiment in which we use an online
questionnaire to elicit people's perception of the extents of smaller vague
regions, such as neighbourhoods. Our approach uses images of street scenes
rather than landmarks or placenames. Keywords: questionnaire, user study, vague regions | |||
| A web platform for the evaluation of vernacular place names in automatically constructed gazetteers | | BIBAK | Full-Text | 14 | |
| Florian A. Twaroch; Christopher B. Jones | |||
| Vernacular place names pose a research challenge in geographic information
retrieval. There is a long standing demand from investigators for a reference
collection to train their methods and evaluate their models and data. However
no large collection of informal place names associated with type and footprint
data is currently available to the GIR community. The present contribution
discusses the implementation of a web platform to collect such an evaluation
data set. Design considerations of the user interface are addressed and we
present first results of a nationwide attempt to collect the vernacular place
names of Great Britain. Our result will aid further research in automatic
gazetteer construction, considering vernacular place names. Keywords: evaluation, gazetteer services, vernacular place names | |||
| Grounding toponyms in an Italian local news corpus | | BIBAK | Full-Text | 15 | |
| Davide Buscaldi; Bernardo Magnini | |||
| In this paper we present a study carried out over toponyms contained in an
Italian news collection, in order to determine the degree of ambiguity of
toponyms and how difficult it could be to resolve such ambiguities. The results
show that frequent toponyms are usually less ambiguous than rare toponyms. The
resolution of ambiguities on a sample of 1,042 toponyms with different features
confirms that ambiguous toponyms are spatially autocorrelated. Keywords: geographic information retrieval, toponym resolution | |||
| Extraction and exploration of spatio-temporal information in documents | | BIBAK | Full-Text | 16 | |
| Jannik Strötgen; Michael Gertz; Pavel Popov | |||
| In the past couple of years, there have been significant advances in the
areas of temporal information retrieval (TIR) and geographic information
retrieval (GIR), each focusing on extracting and utilizing temporal and
geographic information, respectively, from documents for search and exploration
tasks. Interestingly, there is little work that combines models,
techniques and applications from these two areas to support scenarios and
applications where temporal and geographic information in combination provide
interesting meaningful nuggets in document exploration tasks, such as
visualizing a chronological sequence of events with their locations.
In this paper, we present an approach that combines the two areas of TIR and GIR. Using temporal and geographic information extracted from documents and recorded in temporal and geographic document profiles, we show how co-occurrences of such information are determined and spatio-temporal document profiles are computed. Such profiles then provide the basis for a variety of document search and exploration tasks, such as visualizing the sequences of events on a map. We present a prototypical implementation of our system and demonstrate the effectiveness of combining GIR and TIR in the context of document exploration tasks. Keywords: UIMA, information retrieval, spatial data, temporal data, text mining | |||
| Leveraging back-of-the-book indices to enable spatial browsing of a historical document collection | | BIBAK | Full-Text | 17 | |
| Michael Piotrowski | |||
| We describe ongoing work on detecting toponyms in back-of-the-book indices
to geocode historical documents not available in full text; the goal is
specifically to provide spatial browsing for the Collection of Swiss Law
Sources. We discuss some of the peculiarities of handcrafted indices and
approaches for coping with them. Keywords: cultural heritage data, law sources, spatial browsing, toponym resolution | |||
| Using the geographic scopes of web documents for contextual advertising | | BIBAK | Full-Text | 18 | |
| Ivo Anastácio; Bruno Martins; Pável Calado | |||
| Geotargeting is a specialization of contextual advertising where the
objective is to target ads to Website visitors concentrated in well-defined
areas. Current approaches involve targeting ads based on the physical location
of the visitors, estimated through their IP addresses. However, there are many
situations where it would be more interesting to target ads based on the
geographic scope of the target pages, i.e., on the general area implied by the
locations mentioned in the textual contents of the pages. Our proposal applies
techniques from the area of geographic information retrieval to the problem of
geotargeting. We address the task through a pipeline of processing stages,
which involves (i) determining the geographic scope of target pages, (ii)
classifying target pages according to locational relevance, and (iii)
retrieving ads relevant to the target page, using both textual contents and
geographic scopes. Experimental results attest to the adequacy of the proposed
methods in each of the individual processing stages. Keywords: contextual advertisement, geographic information retrieval, geographic text
mining, geotargeting | |||
| Geographic signatures for semantic retrieval | | BIBAK | Full-Text | 19 | |
| David S. Batista; Mário J. Silva; Francisco M. Couto; Bibek Behera | |||
| Annotating data to support decision-making: a case study | | BIBAK | Full-Text | 20 | |
| Carla Geovana N. Macário; Jefersson A. dos Santos; Claudia Bauzer Medeiros; Ricardo da S. Torres | |||
| Georeferenced data are a key factor in many decision-making systems.
However, their interpretation is user and context dependent so that, for each
situation, data analysts have to interpret them, a time-consuming task. One
approach to alleviate this task is the use of semantic annotations to store
the produced information. Annotating data is however hard to perform and prone
to errors, especially when executed manually. This difficulty increases with
the amount of data to annotate. Moreover, annotation requires
multi-disciplinary collaboration of researchers, with access to heterogeneous
and distributed data sources and scientific computations. This paper
illustrates our solution to approach this problem by means of a case study in
agriculture. It shows how our implementation of a framework to automate the
annotation of geospatial data can be used to process real data from remote
sensing images and other official Brazilian data sources. Keywords: geospatial data, geospatial standards, remote sensing image classification,
semantic annotation | |||
| Learning to rank for geographic information retrieval | | BIBAK | Full-Text | 21 | |
| Bruno Martins; Pável Calado | |||
| The task of Learning to Rank is currently getting increasing attention,
providing a sound methodology for combining different sources of evidence. The
goal is to design and apply machine learning methods to automatically learn a
function from training data that can sort documents according to their
relevance. Geographic information retrieval has also emerged as an active and
growing research area, addressing the retrieval of textual documents according
to geographic criteria of relevance. In this paper, we explore the usage of a
learning to rank approach for geographic information retrieval, leveraging on
the datasets made available in the context of the previous GeoCLEF evaluation
campaigns. The idea is to combine different metrics of textual and geographic
similarity into a single ranking function, through the use of the SVMmap
framework. Experimental results show that the proposed approach can outperform
baselines based on heuristic combinations of features. Keywords: geographic information retrieval, learning to rank | |||
| Spatial diversity, do users appreciate it? | | BIBAK | Full-Text | 22 | |
| Jiayu Tang; Mark Sanderson | |||
| Spatial diversity is a relatively new branch of research in the context of
spatial information retrieval. It tries to answer a user's query with results
that are not only relevant but also spatially diversified so that they are from
many different locations. Although the assumption that spatially diversified
results may meet users' needs better seems reasonable, there has been little
hard evidence in the literature indicating so. In this paper, we will show our
follow-up work on the novel approach to investigating user preference on
spatial diversity by using Amazon Mechanical Turk. Keywords: Amazon Mechanical Turk, spatial diversity, user study | |||
| A probabilistic model of geographic relevance | | BIBAK | Full-Text | 23 | |
| Stefano De Sabbata; Tumasch Reichenbacher | |||
| In this paper, we present a new model for the assessment of Geographic
Relevance. This model is drawn from Okapi BM25, thus it takes into account not
only a score for each dimension of relevance but also the distribution of these
scores within the collection. Preliminary results suggest that the relevance
estimation of top-ranked objects is more sensitive to small changes in the user
context. Keywords: GRBM25, Okapi BM25, geographic relevance | |||
| How geographic was GikiCLEF?: a GIR-critical review | | BIBAK | Full-Text | 24 | |
| Diana Santos; Nuno Cardoso; Luís Miguel Cabral | |||
| In this paper we take stock of GikiCLEF with respect to its appropriateness
for the evaluation of GIR systems. We measure its degree of
dealing with geographic matter, and offer GIRA, the final resource, for GIR
evaluation purposes. Keywords: Wikipedia, crosslinguality, evaluation, geographical IR, multilinguality,
question answering | |||