Sebastian Drude, Freie Universität Berlin and Museu Paraense Emílio Goeldi

Digitizing and annotating texts and field recordings

Given that several initiatives worldwide currently explore the new
field of documentation of endangered languages, the E-MELD project
proposes to survey and unite procedures, techniques and results in
order to achieve its main goal, "the formulation and promulgation of
best practice in linguistic markup of texts and lexicons". In this
context, this year's workshop deals with the processing of recorded
texts.

I assume the most valuable contribution I could make to the workshop
is to show the procedures and methods used in the Awetí Language
Documentation Project. The procedures applied in the Awetí Project
are not necessarily representative of all the projects in the DOBES
program, and they may very well fall short in several respects of
being best practice, but I hope they might provide a good and concrete
starting point for comparison, criticism and further discussion.

The procedures to be presented include:

* taping with digital devices,

* digitizing (provisionally in the field, later in final form by the TIDEL team at the Max Planck
Institute in Nijmegen),

* segmenting and transcribing, using the Transcriber program,

* translating (on paper, or while transcribing),

* adding more specific annotation, using the Shoebox program,

* converting the annotation to the ELAN format developed by the TIDEL team, and annotating
further with ELAN.

The focus will be on the different types of annotation. In particular,
I will present, justify and discuss Advanced Glossing, a text
annotation format for language documentation developed by H.-H. Lieb
and myself, and show how it can be applied using the Shoebox program.
The Shoebox setup used in the Awetí Project will be shown in greater
detail, including lexical databases and semi-automatic interaction
between different database types (jumping, interlinearization).
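
As a rough illustration of the kind of data involved, the sketch below
reads one backslash-marked text record of the sort Shoebox stores and
pairs the tokens of two annotation tiers. The marker names (ref, tx,
ge, ft) follow common Shoebox conventions and are assumptions here, not
the Awetí Project's actual setup; the sample record is an invented
placeholder, not Awetí data; and pairing by token position is a
simplification of how interlinear text is really aligned.

```python
def parse_record(text):
    """Split one backslash-marked record into a {marker: value} dict."""
    fields = {}
    marker = None
    for line in text.splitlines():
        if line.startswith("\\"):
            # A new field: "\tx some text" -> marker "tx", value "some text".
            marker, _, value = line[1:].partition(" ")
            fields[marker] = value.strip()
        elif marker is not None and line.strip():
            # Continuation line: append it to the field opened last.
            fields[marker] = (fields[marker] + " " + line.strip()).strip()
    return fields


def align_tiers(fields, tiers=("tx", "ge")):
    """Pair whitespace-separated tokens across tiers, position by position."""
    columns = [fields[t].split() for t in tiers]
    if len({len(c) for c in columns}) != 1:
        raise ValueError("tiers have different token counts")
    return list(zip(*columns))


# Invented placeholder record (hypothetical markers and content).
SAMPLE = (
    "\\ref demo.001\n"
    "\\tx wordA wordB\n"
    "\\ge gloss-A gloss-B\n"
    "\\ft free translation here\n"
)

if __name__ == "__main__":
    record = parse_record(SAMPLE)
    for text_token, gloss in align_tiers(record):
        print(text_token, gloss)
```

A richer glossing scheme would add further tiers to the same record;
the dictionary-per-record layout above is only one possible choice.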

(Freie Universität Berlin and Museu Paraense Emílio Goeldi, with
funding from the Volkswagen Foundation.)