

Current version: 0.4

Get the ELTK

How to cite the ELTK:

Farrar, Scott and Moran, Steve. (2008). "The e-Linguistics Toolkit". In Proceedings of e-Humanities – an emerging discipline: Workshop in the 4th IEEE International Conference on e-Science. IEEE/Clarin, IEEE Press.

Questions? Suggestions?

Join the Google group:

You can also take part in the development at the dev site.


Purpose: To lay the foundations for a cyberinfrastructure for linguistics.

The e-Linguistics Toolkit (ELTK) is a Python library for working with linguistics data and a key component of the e-Linguistics project. It serves the following functions:

  • to manipulate linguistics data
  • to translate un- or semi-structured data into RDF (using an ontology)
  • to expose RDF data in various display forms
  • to provide the basis for web frameworks and services for linguistics

As such, the toolkit is meant to facilitate a cyberinfrastructure for the field of linguistics. That is, it is meant to create a pathway for data interoperability in terms of encoding, format, and content. Achieving data interoperability requires significant manipulation (transformation, validation, and merging) of the original source data. Finally, the ELTK seeks to leverage NLP techniques to serve the whole of linguistics.
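
To give a rough idea of what translating semi-structured data into RDF can involve, here is a minimal sketch in plain Python. It does not use the ELTK API. The namespaces, the `LexicalEntry` type, the field names, the sample entry, and the functions `escape_literal` and `record_to_ntriples` are all hypothetical and chosen only for illustration. A real pipeline would map fields to terms from an actual ontology and would also need the validation and merging steps mentioned above.

```python
# Hypothetical namespaces for illustration only; not ELTK or ontology identifiers.
DATA_NS = "http://example.org/data/"
VOCAB_NS = "http://example.org/vocab/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text):
    """Escape characters that are special inside an N-Triples string literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def record_to_ntriples(record_id, record, type_name="LexicalEntry"):
    """Turn one flat record (a dict of field -> string) into N-Triples lines."""
    subject = f"<{DATA_NS}{record_id}>"
    # First triple states the (hypothetical) type of the record.
    lines = [f"{subject} <{RDF_TYPE}> <{VOCAB_NS}{type_name}> ."]
    # One triple per field, in sorted order so the output is deterministic.
    for field, value in sorted(record.items()):
        lines.append(f'{subject} <{VOCAB_NS}{field}> "{escape_literal(value)}" .')
    return lines


# A made-up lexical entry standing in for semi-structured source data.
entry = {"form": "kitabu", "gloss": "book", "partOfSpeech": "noun"}
for line in record_to_ntriples("entry1", entry):
    print(line)
```

Each printed line is one subject-predicate-object statement. Emitting N-Triples by hand keeps the sketch free of dependencies; an RDF library would normally handle serialization and escaping.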



We gratefully acknowledge the National Science Foundation and the National Endowment for the Humanities for supporting the development of these tools through the following grants:

  • DEL-0555303 (NSF) / FN‑50004‑06 (NEH), The Documentation and Preservation of Western Beboid Languages of Cameroon
  • BCS-0720670, Implementing the GOLD Community of Practice: Laying the Foundations for a Linguistics Cyberinfrastructure

Copyright © 2009 Scott Farrar and Steve Moran