Getting started with the ELTK

The ELTK is intended:

  • to manipulate linguistics data
  • to translate un- or semi-structured data into RDF (according to an ontology)
  • to expose RDF data in various display forms (e.g., HTML or PDF)
  • to provide the basis for web frameworks and services for linguistics

This section gives a general description of each of these broad aims. Specific uses of ELTK code are covered in later sections.

Data manipulation

The KB (knowledge base) package contains the Meta and KBComponent modules. These modules are used together to import the OWL+RDFS+RDF data model into the Python OOP environment.
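To make the idea of importing an ontology into the Python OOP environment more concrete, here is a minimal sketch of one possible approach: creating Python classes at run time from a class hierarchy. It does not use the actual Meta or KBComponent modules. The `ONTOLOGY` dictionary, the `build_classes` function, the class names, and the `http://example.org/onto#` namespace are all assumptions made for illustration.

```python
# Hypothetical mini-ontology: class name -> parent class name (None = root).
# The names are illustrative only; they are not taken from GOLD or the ELTK.
ONTOLOGY = {
    "LinguisticUnit": None,
    "Morpheme": "LinguisticUnit",
    "Affix": "Morpheme",
}


def build_classes(ontology):
    """Create one Python class per ontology class, preserving subclass links."""
    classes = {}

    def build(name):
        # Reuse a class that has already been created.
        if name in classes:
            return classes[name]
        parent = ontology[name]
        base = build(parent) if parent is not None else object
        # type(name, bases, namespace) creates a new class at run time.
        # The namespace URI below is a placeholder, not a real ontology URI.
        cls = type(name, (base,), {"uri": "http://example.org/onto#" + name})
        classes[name] = cls
        return cls

    for name in ontology:
        build(name)
    return classes


classes = build_classes(ONTOLOGY)
affix = classes["Affix"]()
# An instance of a subclass is also an instance of its ancestors.
print(isinstance(affix, classes["LinguisticUnit"]))
```

With classes generated this way, ordinary Python mechanisms such as `isinstance` and `issubclass` follow the subclass relations declared in the data model.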

Data migration

Various readers are used to migrate so-called “legacy data”, i.e., files that contain no semantic markup and are therefore not yet in a format compatible with the GOLD Community of Practice. Each type of legacy data file or format requires its own reader class, so supporting a new format means writing a new reader. For instance, there are readers for:

  • Praat TextGrid files
  • ELAN .eaf files
  • Leipzig Glossing Rules (in plain text format)
  • simple termsets in CSV format (author terms mapped to GOLD URIs)
  • BibTeX files

Quite separate in terms of functionality is the reader for LinkedData.
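As an illustration of what a legacy-data reader does, the following is a minimal sketch for the termset case, built only on Python's standard `csv` module. It is not the ELTK reader. The function name `read_termset`, the assumed two-column layout (author term, then URI), and the sample URIs are hypothetical.

```python
import csv
import io


def read_termset(stream):
    """Return a dict mapping each author term to a URI.

    Assumes two columns per row: the author's term, then the URI.
    Rows with fewer than two columns (including blank rows) are skipped.
    """
    mapping = {}
    for row in csv.reader(stream):
        if len(row) < 2:
            continue
        term, uri = row[0].strip(), row[1].strip()
        mapping[term] = uri
    return mapping


# Placeholder data; these URIs are illustrative, not verified GOLD identifiers.
SAMPLE = (
    "pst,http://example.org/gold/PastTense\n"
    "pl,http://example.org/gold/Plural\n"
)

termset = read_termset(io.StringIO(SAMPLE))
print(termset["pst"])
```

A real reader would go on to emit RDF statements from this mapping. The sketch stops at the parsing step, which is the part that differs from one legacy format to the next.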

Data display

Once the data are migrated to an interoperable RDF format, various display structures (dictionaries, interlinear glossed text, tabular paradigms) are needed for rendering. It is of little use, for instance, to display raw RDF as the result of a data search. Instead, we opt for display formats such as formatted HTML or PDF.
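The display step can be sketched as follows for interlinear glossed text, rendered as an HTML table with one row of words and one row of glosses. This is a minimal sketch rather than ELTK code. The function name `igt_to_html` and the `igt` CSS class are assumptions, and the input is given as plain Python lists, not as data read from an RDF store.

```python
from html import escape


def igt_to_html(words, glosses, translation):
    """Render one interlinear glossed example as an HTML fragment."""
    if len(words) != len(glosses):
        raise ValueError("each word needs exactly one gloss")
    # Escape all text so that characters such as '<' cannot break the markup.
    word_row = "".join("<td>%s</td>" % escape(w) for w in words)
    gloss_row = "".join("<td>%s</td>" % escape(g) for g in glosses)
    return (
        '<table class="igt">'
        "<tr>" + word_row + "</tr>"
        "<tr>" + gloss_row + "</tr>"
        "</table>"
        "<p>" + escape(translation) + "</p>"
    )


# Turkish 'ev-ler' (house-PL), used here only as sample input.
fragment = igt_to_html(["ev-ler"], ["house-PL"], "houses")
print(fragment)
```

Aligning each word with its gloss in the same table column follows the usual layout of interlinear examples. A PDF renderer could take the same three inputs and produce a different output format.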