TU Berlin

Database Systems and Information Management Group

Generation and Evaluation of N-ary extraction patterns for Open Information Extraction

Short info

Candidate: Stephan Pieper

Advisor: Alan Akbik

Desired degree: Master


Current relation extraction methods mine information from natural language texts (e.g., Web pages) and recognize semantic relations between entities such as persons or locations. However, traditional IE (Information Extraction) systems are limited to a predefined set of relations, while modern OIE (Open Information Extraction) approaches are often restricted to binary, verb-based relations. This is a serious limitation: 40% of the semantic relations in English are N-ary (they have more than two arguments), and they are mainly expressed by verbs and nouns. As a result, the relations extracted by these methods are erroneous or incomplete, the systems suffer from low recall, or they fail to scale to the size and diversity of the Web.

To address these problems, it is desirable to generate and apply N-ary relation extraction rules automatically. We propose a prototypical automatic pipeline that learns domain-independent N-ary extraction patterns and applies them to distill N-ary verb-based and noun-based relations from text. Our method is inspired by two observations. On the one hand, semantically annotated corpora contain the structure needed to mine and train efficient N-ary extractors for verb-based and noun-based relations. On the other hand, dependency parsing techniques reveal the ‘buried’ linguistic knowledge that is encoded in the dependency parse trees of texts. We therefore combine both approaches to create an open N-ary relation extraction system.
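
To make the idea of combining annotated corpora with dependency parses more concrete, the following is a minimal sketch and not the pipeline developed in the thesis. It assumes that a sentence is already parsed into tokens with a head index and a dependency label, and that a semantic annotation maps role names to token positions. All names (`Token`, `path_to`, `learn_pattern`, `apply_pattern`), the role names, and the Stanford-style labels (`nsubj`, `dobj`, `prep`, `pobj`) are illustrative assumptions. A pattern here is simply the dependency path from each annotated argument up to the predicate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    head: int  # index of the head token, -1 for the root
    dep: str   # dependency label on the edge to the head


def path_to(tokens, start, pred):
    """Labels on the path from `start` up to `pred`, or None if unreachable."""
    labels = []
    i = start
    while i != pred:
        if i == -1:  # walked past the root without meeting the predicate
            return None
        labels.append(tokens[i].dep)
        i = tokens[i].head
    return tuple(labels)


def learn_pattern(tokens, pred, roles):
    """Turn one annotated sentence into a pattern {role: dependency path}."""
    pattern = {}
    for role, idx in roles.items():
        path = path_to(tokens, idx, pred)
        if path is not None:
            pattern[role] = path
    return pattern


def apply_pattern(tokens, pattern):
    """Return (predicate, arguments) for every token that fills all roles."""
    results = []
    for pred in range(len(tokens)):
        args = {}
        for i in range(len(tokens)):
            if i == pred:
                continue
            path = path_to(tokens, i, pred)
            for role, wanted in pattern.items():
                if path == wanted and role not in args:
                    args[role] = tokens[i].text
        if len(args) == len(pattern):
            results.append((tokens[pred].text, args))
    return results


# Hypothetical annotated training sentence: "Obama visited Paris in 2009"
train = [
    Token("Obama", 1, "nsubj"),
    Token("visited", -1, "root"),
    Token("Paris", 1, "dobj"),
    Token("in", 1, "prep"),
    Token("2009", 3, "pobj"),
]
pattern = learn_pattern(train, pred=1, roles={"agent": 0, "theme": 2, "time": 4})

# Hypothetical unseen sentence: "Merkel met Hollande in 2012"
unseen = [
    Token("Merkel", 1, "nsubj"),
    Token("met", -1, "root"),
    Token("Hollande", 1, "dobj"),
    Token("in", 1, "prep"),
    Token("2012", 3, "pobj"),
]
extractions = apply_pattern(unseen, pattern)
print(extractions)
```

Because the pattern stores only dependency paths and no lexical items, it transfers to a different verb with the same syntactic frame, yielding a three-argument relation. The same property makes it over-general: any prepositional object would be accepted as `time`, so a realistic system would need additional constraints on the arguments.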

