Miscellaneous Utilities

The functions module contains various utility functions, mostly for string processing, detection of elements in interlinear glossed text (IGT) such as reduplicated forms and clitic forms, and TIPA LaTeX clean-up.

Linguistics utilities

These functions were moved here from the Leipzig reader. They are mostly used to process IGT.

eltk.utils.functions.validate(lingunits, glosses, translation)

Prints a warning to stdout if the number of source items does not match the number of gloss items.

Parameters:
  • lingunits (list) – units from line 1 of IGT
  • glosses (list) – units from line 2 of IGT
  • translation (str) – translation string from line 3 of IGT
Returns:

pass or fail

Return type:

boolean
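
The count check described above can be sketched as follows. This is not the ELTK source: `validate_counts` is a hypothetical stand-in, the warning text is invented, and the choice of True for a match is an assumption. The sketch accepts the translation only to mirror the documented signature and does not inspect it.

```python
def validate_counts(lingunits, glosses, translation):
    """Hypothetical sketch: compare the number of line-1 and line-2 units."""
    if len(lingunits) != len(glosses):
        # Warn on stdout, as the documentation describes.
        print("Warning: %d source items but %d gloss items"
              % (len(lingunits), len(glosses)))
        return False
    # The translation string is not checked in this sketch.
    return True


if __name__ == "__main__":
    print(validate_counts(["les", "chiens"], ["the", "dogs"], "the dogs"))
    print(validate_counts(["les", "chiens"], ["the"], "the dogs"))
```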

eltk.utils.functions.makeID(string='')

Returns a randomly generated ID, e.g., from ‘blah’, generate ‘blah2345’.

Parameter: string (string) – the prefix
Returns: a random id string
Return type: string
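
A minimal sketch of the prefix-plus-random-suffix idea, assuming a four-digit numeric suffix as in the ‘blah2345’ example. The name `make_id` and the suffix length are assumptions, not the ELTK implementation.

```python
import random


def make_id(prefix=""):
    """Hypothetical sketch: append four random digits to the prefix."""
    suffix = "".join(random.choice("0123456789") for _ in range(4))
    return prefix + suffix


if __name__ == "__main__":
    print(make_id("blah"))  # output differs on every run
```
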
eltk.utils.functions.parseIGTWord(s)

Custom splitter for strings that mix the delimiters ‘-’, ‘=’, and ‘~’, as is common within words of IGT lines. Whitespace and ‘.’ are not considered delimiters.

Parameter: s (string) – A string containing IGT delimiters
Returns: list of items to which the delimiters apply
Return type: list
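
One way to split on these three delimiters is a regular-expression character class. The sketch below is an assumption about the behavior, not the ELTK code: `split_igt_word` is a hypothetical name, and it drops the delimiters, whereas the real function may keep them attached to the items.

```python
import re


def split_igt_word(s):
    """Hypothetical sketch: split on '-', '=' and '~' only."""
    # '.' and whitespace are deliberately left out of the character class.
    return [part for part in re.split(r"[-=~]", s) if part]


if __name__ == "__main__":
    print(split_igt_word("dog-PL=DEF"))
    print(split_igt_word("1SG.NOM"))
```
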
eltk.utils.functions.parseIGTLine(line)

Returns an array (list of lists) of the items within a line of IGT. It does not assign a linguistic category to any of the items.

Parameter: line (string) – A string containing a line of IGT
Returns: list of items/words in the line of IGT
Return type: list
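
The list-of-lists shape can be illustrated by splitting the line on whitespace and then splitting each word on the IGT delimiters. This is a sketch under assumptions: `split_igt_line` is a hypothetical name, and dropping the delimiters is a simplification that may not match the real function.

```python
import re


def split_igt_line(line):
    """Hypothetical sketch: one inner list per whitespace-separated word."""
    return [
        [part for part in re.split(r"[-=~]", word) if part]
        for word in line.split()
    ]


if __name__ == "__main__":
    print(split_igt_line("les chien-s"))
```
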
eltk.utils.functions.normalizeLine(line)

Normalize the string, getting rid of extra spaces, tabs, etc.

Parameter: line (string) – A typical line of IGT
Returns: A normalized line
Return type: string
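
Whitespace normalization of this kind is commonly done by splitting on any whitespace and rejoining with single spaces. The sketch below assumes that is all the normalization involves; `normalize_whitespace` is a hypothetical name and the real function may do more.

```python
def normalize_whitespace(line):
    """Hypothetical sketch: collapse runs of spaces, tabs and newlines."""
    # str.split() with no argument splits on any whitespace run
    # and discards leading and trailing whitespace.
    return " ".join(line.split())


if __name__ == "__main__":
    print(repr(normalize_whitespace("  les\t chien-s \n")))
```
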
eltk.utils.functions.findClitics(word)

Returns a list of clitics for a word that uses ‘=’ as a delimiter. The strategy is to locate the stem based on relative length.

Parameter: word (string) – A word with =’s delimiters
Returns: A list of clitic forms
Return type: list
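
The length-based strategy can be read as: split on ‘=’, treat the longest segment as the stem, and report the rest as clitics. The sketch below follows that reading, which is an assumption; `find_clitics_sketch` is a hypothetical name, and taking the first longest segment on ties is an arbitrary choice.

```python
def find_clitics_sketch(word):
    """Hypothetical sketch: every '='-separated segment except the longest."""
    parts = word.split("=")
    if len(parts) < 2:
        return []  # no clitic boundary in this word
    # Index of the first longest segment, taken to be the stem.
    stem_index = parts.index(max(parts, key=len))
    return [p for i, p in enumerate(parts) if i != stem_index]


if __name__ == "__main__":
    print(find_clitics_sketch("house=the"))
```
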
eltk.utils.functions.findRedups(word)

Return a list of reduplicated morphemes for words containing ~’s as delimiters.

Parameter:word (string) – A word with ~’s delimiters
Returns:The reduplicated form
Return type:string
eltk.utils.functions.findInfixes(word)

Return a list of infix forms, those surrounded by <...>’s.

Parameter: word (string) – A word with infixes delimited by <...>’s
Returns: list of infix forms
Return type: list
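
Extracting material between angle brackets is a natural fit for `re.findall`. The sketch below is an illustration rather than the ELTK code; `find_infixes_sketch` is a hypothetical name, and returning the forms without their brackets is an assumption.

```python
import re


def find_infixes_sketch(word):
    """Hypothetical sketch: collect the text inside each <...> pair."""
    # [^<>]+ keeps each match inside a single pair of brackets.
    return re.findall(r"<([^<>]+)>", word)


if __name__ == "__main__":
    print(find_infixes_sketch("b<um>ili"))
```
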
eltk.utils.functions.unpack(l, delimiter='')

Recursive function: returns a string by concatenating all elements of a list. Inserts a space between material from each list element and removes any delimiters, e.g., ‘-’. For example, [['abc'], ['c'], ['d-', 'e']] returns 'abc c de'.

Parameters:
  • l (list) – A list of lists of strings
  • delimiter (str) – include a final delimiter, e.g., ‘-’; default is ‘’.
Returns:

concatenated string of all list contents

Return type:

string
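
The documented example can be reproduced with a small recursive flattener. This is a sketch under assumptions, not the ELTK code: `unpack_sketch` and `_flatten` are hypothetical names, the set of delimiters removed (‘-’, ‘=’, ‘~’) is assumed, and the `delimiter` parameter is omitted because its exact effect is not described here.

```python
import re


def _flatten(item):
    """Concatenate nested strings, removing IGT delimiters."""
    if isinstance(item, str):
        return re.sub(r"[-=~]", "", item)
    # Recurse into nested lists and join their pieces without spaces.
    return "".join(_flatten(sub) for sub in item)


def unpack_sketch(items):
    """Hypothetical sketch: one space between top-level elements."""
    return " ".join(_flatten(item) for item in items)


if __name__ == "__main__":
    print(unpack_sketch([["abc"], ["c"], ["d-", "e"]]))  # abc c de
```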