Annotating Corpora with Information Structure
Kordula De Kuthy & Arndt Riester
Language & Computation
Week Two - 11.00-12.30 - Level: I
In this course, participants will become acquainted with contemporary information structure theory and with state-of-the-art methods for creating manually annotated text and speech corpora. The course is aimed at students and researchers in both theoretical and computational linguistics. We first introduce classical and contemporary interpretations of information-structural notions, including focus, aboutness topic, contrastive topic, givenness, anaphora and information status, as well as the relevant underlying frameworks and concepts, such as Alternative Semantics, discourse trees and Questions under Discussion. Turning from theory to practice, we then introduce participants to the EXMARaLDA annotation tool. In a practical exercise, participants will analyse texts with respect to referential information status (anaphora) and lexical information status (semantic relations). We will also discuss ways of identifying focus and topic constituents in corpus data. By the end of the course, participants will have worked through a detailed information-structural analysis of an English text or piece of spoken language, and will have acquired methods for analysing their own preferred type of linguistic data.