Automatic Extraction of Geographic Context from Textual Data
Volume 2, Issue 1 (2014), pp. 229–237
Pub. online: 25 August 2014
Type: Article
Open Access
Received
9 May 2013
Accepted
23 October 2013
Published
25 August 2014
Abstract
The amount of information on the internet grows exponentially. General access to this huge amount of data is no longer enough; it is becoming necessary to use different kinds of automatic filters to retrieve just the information one actually wants. One approach to information filtering and retrieval is context analysis, in which one of the contexts of interest is the geographic context. This paper studies the problem and methodology of geoparsing, the recognition of geographic names in unstructured textual content with the aim of extracting geographic context. A prototype geoparsing system is developed that automatically analyzes unstructured text, recognizes geographic information, and marks geographic names. Preliminary empirical evaluation of the system on real-world news articles showed that its average rate of geographic name recognition ranges from about 75% to 100%. Possible applications of the developed prototype include automated grouping of texts by their geographic contexts (e.g., in news portals) and location-based search.
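To make the idea of recognizing and marking geographic names concrete, the following is a minimal Python sketch of one common approach, a gazetteer lookup. It is not the prototype described in this paper: the function name `mark_place_names`, the `[[...]]` marker syntax, and the tiny `GAZETTEER` set are all assumptions made for illustration.

```python
import re

# Hypothetical toy gazetteer (an assumption for this sketch); a real system
# would draw on a large place-name database.
GAZETTEER = {"Riga", "Latvia", "New York", "Paris"}


def mark_place_names(text, gazetteer=GAZETTEER):
    """Return (marked_text, found_names) using exact, case-sensitive lookup.

    Each recognized name is wrapped in [[...]] markers; the marker syntax
    is an arbitrary choice for this illustration.
    """
    if not gazetteer:
        return text, []

    # Try longer names first so multi-word names are matched as a whole.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b"
    )

    found = []

    def _mark(match):
        # Record the name and wrap it in markers.
        found.append(match.group(1))
        return "[[" + match.group(1) + "]]"

    return pattern.sub(_mark, text), found


marked, found = mark_place_names("Flights from Riga to New York resume.")
print(marked)
print(found)
```

A plain lookup like this cannot resolve ambiguous names (for example, a place name that is also a surname), which is one reason practical geoparsers add disambiguation on top of it.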