TU Berlin

INDREX: In-database relation extraction
Citation key DBLP:journals/is/KiliasLA15
Author Torsten Kilias, Alexander Löser, and Periklis Andritsos
Pages 124–144
Year 2015
DOI 10.1016/j.is.2014.11.006
Journal Inf. Syst.
Volume 53
Abstract Relation extraction transforms the textual representation of a relationship into the relational model of a data warehouse. Early systems, such as SystemT by IBM or the open source system GATE, solve this task with handcrafted rule sets that the system executes document by document. The user must therefore carry out a highly interactive and iterative process of reading a document, expressing rules, testing these rules on the next document, and refining the rules. Until now, these systems have leveraged neither the full potential of built-in declarative query languages nor the indexing and query optimization techniques of a modern RDBMS, which would enable interactive rule refinement by the user across documents and on the entire corpus. We propose the INDREX system, which for the first time enables a user to describe corpus-wide extraction tasks in a declarative language and permits the user to run interactive rule refinement queries. To enable this functionality, we extend a standard PostgreSQL with a set of white-box user-defined functions that enable corpus-wide transformations from sentences into relations. We store the text corpus and rules in the same RDBMS that already holds domain-specific structured data. As a result, (1) the user can leverage this data to further adapt rules to the target domain, (2) the user does not need an additional system for rule extraction, and (3) the INDREX system can leverage the full power of the built-in indexing and query optimization techniques of the underlying RDBMS. In a preliminary study we report on the feasibility of this disruptive approach and show multiple queries in INDREX on the REUTERS-News’97 corpora.
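
The general idea described in the abstract (sentences, rules, and domain data living in one database, with rules expressed as corpus-wide queries) can be illustrated with a minimal sketch. It is not the INDREX implementation: it uses Python's built-in SQLite module instead of PostgreSQL, and the table names `sentences` and `companies`, the function `acq_part`, the regular expression, and the sample sentences are all hypothetical, chosen only for illustration.

```python
import re
import sqlite3

# Hypothetical schema: one row per sentence, plus a domain table in the same database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sentences (doc_id INTEGER, sent_id INTEGER, text TEXT)")
conn.execute("CREATE TABLE companies (name TEXT PRIMARY KEY)")

# Invented sample data, not taken from any real corpus.
conn.executemany("INSERT INTO sentences VALUES (?, ?, ?)", [
    (1, 1, "Acme Corp acquired Widget Inc for 5 mln dlrs."),
    (1, 2, "Shares rose after the news."),
    (2, 1, "Globex acquired Initech last week."),
])
conn.executemany("INSERT INTO companies VALUES (?)",
                 [("Acme Corp",), ("Widget Inc",), ("Globex",)])

# A deliberately naive extraction rule for an "acquired" relation (assumption).
ACQ = re.compile(r"^(?P<buyer>.+?) acquired (?P<target>.+?)(?: for | last |\.)")

def acq_part(text, part):
    """Return the 'buyer' or 'target' group if the rule matches, else None."""
    m = ACQ.search(text)
    return m.group(part) if m else None

# Register the rule as a user-defined SQL function.
conn.create_function("acq_part", 2, acq_part)

def extract_all(db):
    """Apply the rule to every sentence of the corpus in one query."""
    return db.execute("""
        SELECT doc_id,
               acq_part(text, 'buyer')  AS buyer,
               acq_part(text, 'target') AS target
        FROM sentences
        WHERE acq_part(text, 'buyer') IS NOT NULL
        ORDER BY doc_id, sent_id
    """).fetchall()

def extract_known(db):
    """Refine the rule: keep only matches whose arguments appear in the domain table."""
    return db.execute("""
        SELECT doc_id,
               acq_part(text, 'buyer')  AS buyer,
               acq_part(text, 'target') AS target
        FROM sentences
        WHERE acq_part(text, 'buyer')  IN (SELECT name FROM companies)
          AND acq_part(text, 'target') IN (SELECT name FROM companies)
        ORDER BY doc_id, sent_id
    """).fetchall()

all_matches = extract_all(conn)
known_matches = extract_known(conn)
```

In this sketch, refining a rule means editing a query and rerunning it over all sentences at once, rather than stepping through the corpus one document at a time. The second query shows how structured data already in the database can constrain the extracted tuples.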
