Character Converter

The character conversion module converts between any two string conventions, e.g., Unicode IPA and Praat symbols. It is useful for converting Praat TextGrid annotations to LaTeX (and on to PDF), and for converting data that is not in Unicode to Unicode IPA. Here, Unicode IPA means IPA symbols encoded in UTF-8. The module provides the CharConverter class, with its constructor and its convert method.

Illustration of usage

Here’s how to create a converter from X-SAMPA (used, e.g., in ELAN) to Unicode IPA:

>>> elan_converter = CharConverter('xsampa', 'uni')

Now convert a string in X-SAMPA to Unicode IPA:

>>> elan_converter.convert('kO nEnE Oku')
'k\xc9\x94 n\xc9\x9bn\xc9\x9b \xc9\x94ku'

The value shown above is the escaped form of the UTF-8 bytes. To see the symbols themselves, use a print statement:

>>> print elan_converter.convert('kO nEnE Oku')
kɔ nɛnɛ ɔku
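The two outputs above differ only in presentation: the first is the escaped representation of the UTF-8 bytes, the second is how those bytes look once rendered. The sketch below reproduces this with a plain byte literal and does not use CharConverter. It is written for Python 3, where the decoding step is explicit, and the variable names are illustrative only.

```python
# The byte string shown in the example above, as UTF-8 encoded bytes.
raw = b'k\xc9\x94 n\xc9\x9bn\xc9\x9b \xc9\x94ku'

# Decoding turns each two-byte sequence into a single IPA character.
text = raw.decode('utf-8')

print(text)       # kɔ nɛnɛ ɔku
print(len(raw))   # 15 bytes
print(len(text))  # 11 characters
```

Each of the three IPA symbols occupies two bytes in UTF-8, which is why the byte count is larger than the character count.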

The converter also handles common LaTeX escape sequences.

eltk.utils.CharConverter.latex_charmap: A separate dictionary is used for conversion to LaTeX. Keys are characters that are common in LaTeX documents, e.g., ö. Values are lists that include the corresponding LaTeX escape sequence, e.g., \"{o}. The ipa_charmap is not used because many of these characters are not IPA symbols.
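As a rough illustration of the shape described here, the sketch below builds a two-entry dictionary and applies it character by character. The exact contents and list layout of the real latex_charmap are not given in this page, so the one-element lists, the name `to_latex`, and the local `latex_charmap` variable are all assumptions made for the example.

```python
# Hypothetical two-entry excerpt; the real dictionary may hold more
# fields per list and certainly holds more characters.
latex_charmap = {
    u'\u00f6': ['\\"{o}'],  # ö
    u'\u00fc': ['\\"{u}'],  # ü
}

def to_latex(text, charmap=latex_charmap):
    """Hypothetical helper: replace each mapped character with its
    LaTeX escape sequence and copy everything else unchanged."""
    return u''.join(charmap[ch][0] if ch in charmap else ch for ch in text)

print(to_latex(u'M\u00fcller'))  # M\"{u}ller
```

This helper only illustrates the data structure; with the library itself, conversion goes through a CharConverter instance.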

The usage is the same:

>>> latex_converter=CharConverter('uni','latex')
>>> print latex_converter.convert('ü')
\"{u}

eltk.utils.CharConverter.ipa_charmap

The data source for the character converter. Keys are IPA symbols as unicode objects, and values are lists of the corresponding symbols in the other conventions:

unicode: [informal name, tipa, praat, xsampa]

For example:

'ɖ':['vd retroflex plosive','\:d','\d.','d`']

Thus, ‘\:d’ is the tipa symbol used for the voiced retroflex plosive.
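To make the column layout concrete, the sketch below looks up a symbol across conventions using the single entry shown above. The helper name `lookup` and the `COLUMNS` list are hypothetical and not part of the library; only the entry itself and the column order come from this page.

```python
# Column order as documented: [informal name, tipa, praat, xsampa].
COLUMNS = ['name', 'tipa', 'praat', 'xsampa']

# One-entry excerpt of the charmap, copied from the example above.
# Backslashes are doubled so Python reads them literally.
ipa_charmap = {
    u'\u0256': ['vd retroflex plosive', '\\:d', '\\d.', 'd`'],  # ɖ
}

def lookup(symbol, source, target, charmap=ipa_charmap):
    """Hypothetical helper: translate one symbol from the source
    convention to the target convention."""
    for uni, values in charmap.items():
        # Label each list position, then add the key as the 'uni' column.
        row = dict(zip(COLUMNS, values))
        row['uni'] = uni
        if row[source] == symbol:
            return row[target]
    raise KeyError(symbol)

print(lookup('d`', 'xsampa', 'tipa'))  # \:d
```

Treating the dictionary key as one more column ('uni') lets the same lookup work in either direction, for example from Unicode to Praat.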

CharConverter.__init__(self, source, target)

Creates an instance of a character converter. Source and target each name a convention and take one of the same six strings: ‘uni’, ‘name’, ‘tipa’, ‘praat’, ‘xsampa’, or ‘latex’.

Parameters:	source (string) – ‘uni’, ‘name’, ‘tipa’, ‘praat’, ‘xsampa’, or ‘latex’
	target (unicode, string) – ‘uni’, ‘name’, ‘tipa’, ‘praat’, ‘xsampa’, or ‘latex’
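The sketch below shows one way a constructor with these two arguments could check its input. Whether the real CharConverter validates the names, and how it reports a bad one, is not stated in this page; the class name `SketchConverter`, the `CONVENTIONS` tuple, and the use of ValueError are assumptions made for the example.

```python
# The six convention names listed in the parameter description.
CONVENTIONS = ('uni', 'name', 'tipa', 'praat', 'xsampa', 'latex')

class SketchConverter(object):
    """Hypothetical stand-in for CharConverter, constructor only."""

    def __init__(self, source, target):
        # Reject anything that is not one of the documented names.
        for label, value in (('source', source), ('target', target)):
            if value not in CONVENTIONS:
                raise ValueError('%s must be one of %s, got %r'
                                 % (label, ', '.join(CONVENTIONS), value))
        self.source = source
        self.target = target

converter = SketchConverter('xsampa', 'uni')
print(converter.source, converter.target)
```

Checking the names up front means a misspelled convention fails at construction time rather than on the first conversion.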

CharConverter.convert(self, string)

Returns a new string in which each symbol of the source convention is replaced by its target value from the character map. String is the string to be converted.

Parameter:	string (string) – string to be converted
Return type:	unicode
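The replacement step can be pictured with the following sketch. It is not the library's implementation: the function name, the longest-match-first strategy, and the three-entry mapping are assumptions, with the mapping values taken from the X-SAMPA example and the charmap entry shown earlier on this page.

```python
def convert(text, mapping):
    """Replace every source symbol found in text with its target symbol."""
    # Try longer symbols first so that 'd`' is matched before a bare 'd'.
    symbols = sorted(mapping, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for sym in symbols:
            if text.startswith(sym, i):
                out.append(mapping[sym])
                i += len(sym)
                break
        else:
            # No symbol matches at this position: copy the character as is.
            out.append(text[i])
            i += 1
    return u''.join(out)

# Hypothetical X-SAMPA to Unicode excerpt.
xsampa_to_uni = {'O': u'\u0254', 'E': u'\u025b', 'd`': u'\u0256'}

print(convert('kO nEnE Oku', xsampa_to_uni))  # kɔ nɛnɛ ɔku
```

Matching longer symbols first matters for conventions such as X-SAMPA and tipa, where one IPA character can be written with several ASCII characters.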
