TU Berlin

Database Systems and Information Management Group

Design and Implementation of a System for Distributed Execution of Pattern-Based Information Extraction

Short info

Candidate: Michail Melnikov

Advisor: Alan Akbik


Rule-based information extraction from natural language text has seen renewed interest in recent years. Incorporating deep syntactic information, such as typed dependencies, into the extraction rules makes it possible to describe grammatical patterns over dependency trees more precisely. Despite this potential, manually writing such extractors is very difficult for people without extensive NLP knowledge.

The Propminer project provides a workflow that helps users create SQL-like queries describing partial graphs over dependency trees, with grammatical constraints on the nodes. Because queries are generated automatically from annotated sentences and the software is interactive, users can quickly generate extractors and evaluate their results over millions of sentences. This document presents how Propminer assists users in creating such grammatical patterns and extracting structured information from text.
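To make the idea of a partial graph over a dependency tree with node constraints more concrete, the following is a minimal sketch in Python. It is not Propminer's query language or implementation: the sentence, its hand-written parse, the `Token` class, the pattern format, and the `match` function are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int   # position in the sentence
    text: str  # surface form
    pos: str   # part-of-speech tag


# Hypothetical, hand-written dependency parse of "Einstein was born in Ulm".
tokens = [
    Token(0, "Einstein", "NNP"),
    Token(1, "was", "VBD"),
    Token(2, "born", "VBN"),
    Token(3, "in", "IN"),
    Token(4, "Ulm", "NNP"),
]
# Each edge is (head index, typed dependency label, dependent index).
edges = [
    (2, "nsubjpass", 0),
    (2, "auxpass", 1),
    (2, "prep", 3),
    (3, "pobj", 4),
]

# The pattern is a partial graph: named nodes with constraints,
# plus the typed dependencies that must hold between them.
pattern_nodes = {
    "subject": lambda t: t.pos == "NNP",
    "verb": lambda t: t.text == "born",
    "prep": lambda t: t.text == "in",
    "place": lambda t: t.pos == "NNP",
}
pattern_edges = [
    ("verb", "nsubjpass", "subject"),
    ("verb", "prep", "prep"),
    ("prep", "pobj", "place"),
]


def match(tokens, edges, pattern_nodes, pattern_edges):
    """Return every binding of pattern nodes to distinct tokens that
    satisfies all node constraints and all pattern edges."""
    edge_set = set(edges)
    names = list(pattern_nodes)
    results = []

    def extend(binding):
        if len(binding) == len(names):
            # All nodes are bound: check the required dependencies.
            if all(
                (binding[head].idx, label, binding[dep].idx) in edge_set
                for head, label, dep in pattern_edges
            ):
                results.append({n: t.text for n, t in binding.items()})
            return
        name = names[len(binding)]
        for tok in tokens:
            if pattern_nodes[name](tok) and tok not in binding.values():
                extend({**binding, name: tok})

    extend({})
    return results


matches = match(tokens, edges, pattern_nodes, pattern_edges)
print(matches)
```

The brute-force search is only suitable for single sentences; evaluating patterns over millions of sentences, as described above, would need indexing and distribution that this sketch does not attempt.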

