The Online Database of Interlinear Text
About ODIN's Advanced Search
Until recently, search across ODIN has been limited to queries for specific language names and codes. The newly developed Advanced Search lets you compose more sophisticated queries across the database: you can search for data in specific language families, for examples with specific grammatical content, and even for data that contain certain pre-defined constructions. Results are returned as a list of language names and codes, with a Data option that lets you display the interlinear data relevant to your query by language.
Please note: All Interlinear Glossed Text (IGT) discovered by ODIN has been mined from online resources, many of which are copyrighted. The search features described here are intended to make these resources easier to locate and to test more sophisticated query and data manipulation techniques specifically designed for the linguistic community, as envisaged within the GOLD Community Model (Farrar & Lewis 2005). Since all data returned by ODIN are cited (author, title, year, source URL), we ask that you in turn cite the source documents appropriately should you choose to use any of the data retrieved for your own research purposes. We also ask that you cite ODIN as the means you used to locate the data. Thank you.
Search Criteria
There are three basic search criteria that you can specify:
- Concept Search - With concept search, you can search the gloss line of IGT for markup that encodes certain grammatical concepts. Grammatical concepts include such notions as tense, aspect, and case and their values, such as Present Tense, Perfective Aspect, and Ergative Case. Concept vocabulary is normalized to terms contained in the General Ontology of Linguistic Description (GOLD). For example, Perfective Aspect can be marked up in any number of ways, such as PERF, PFV, or PF. Search across ODIN normalizes these terms to a common form contained in GOLD, in this case PerfectiveAspect. Since much of the vocabulary normalization is done by hand, not all vocabulary for all examples and languages has been normalized. Note also that normalization is often done on a per-language or per-document basis so that "over-normalization" does not occur. Over-normalization would arise with alternative uses of common markup terms, such as PERF used to mean Perfect Tense instead of the more common Perfective Aspect.
- Language Family - Language Families were derived from the Ethnologue Database. We are working with LinguistList to increase the size of the language family inventory currently available.
- Constructions/Features - The Constructions/Features query allows you to search for IGT examples that contain certain constructions or have certain linguistically relevant "features". For instance, you can search for Passive, Conditional, Counterfactual, and Raising Constructions, among others. You can also search for examples that have Negative Polarity (whether sentential or otherwise), examples that are Ungrammatical, or examples that contain multiple Wh words. Since IGT is not typically marked up above the morphological level, most constructions in IGT can only be guessed at, either by inferring their existence from the presence of certain markup vocabulary (such as a morpheme marked with PASS indicating a passive construction), or by examining the gloss and translation lines for clues as to constructions that may exist in the source language data. For example, a conditional structure in the English translation (a clause headed by if or when) probably indicates that a similar structure exists in the source language. Likewise, a passive in the English translation probably indicates a passive or passive-like structure in the source. The search engine thus "makes guesses" about what the language data contain, but the guesses are educated ones. You, the linguist, should review the results to ensure that the data do in fact contain the structures you are interested in. You can see the full list of construction queries that are available by clicking here.
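The concept normalization and construction guessing described in the list above can be illustrated with a small sketch. This is not ODIN's implementation. The mapping tables, the document identifier "doc-042", the concept names other than PerfectiveAspect, and both function names are assumptions made up for the example.

```python
import re

# Hypothetical default mapping from gloss abbreviations to GOLD-style names.
# Only "PerfectiveAspect" is named in the text; the rest are illustrative.
DEFAULT_CONCEPTS = {
    "PERF": "PerfectiveAspect",
    "PFV": "PerfectiveAspect",
    "PF": "PerfectiveAspect",
    "PASS": "PassiveVoice",
}

# Hypothetical per-document overrides, for sources that use a common tag
# in a less common sense (this is what guards against over-normalization).
DOCUMENT_OVERRIDES = {
    "doc-042": {"PERF": "PerfectTense"},
}


def normalize_gloss(gloss_line, doc_id=None):
    """Return the set of normalized concept names found in one gloss line."""
    overrides = DOCUMENT_OVERRIDES.get(doc_id, {})
    concepts = set()
    # Split on whitespace and the usual morpheme separators (-, =, .).
    for tag in re.split(r"[\s\-=.]+", gloss_line):
        # Document-specific meaning wins over the default meaning.
        concept = overrides.get(tag) or DEFAULT_CONCEPTS.get(tag)
        if concept:
            concepts.add(concept)
    return concepts


def guess_constructions(gloss_line, translation, doc_id=None):
    """Make educated guesses about constructions; results need human review."""
    guesses = set()
    # Clue 1: markup vocabulary in the gloss line (a morpheme tagged PASS).
    if "PassiveVoice" in normalize_gloss(gloss_line, doc_id):
        guesses.add("Passive")
    # Clue 2: "if" or "when" in the English translation line.
    if re.search(r"\b(if|when)\b", translation, flags=re.IGNORECASE):
        guesses.add("Conditional")
    return guesses
```

With these tables, `normalize_gloss("dog eat-PFV")` and `normalize_gloss("dog eat-PERF")` yield the same concept, while the same PERF tag in the hypothetical document "doc-042" is read as a perfect tense. The translation clue is deliberately crude (it fires on any "when"), which is why the results still call for review by a linguist.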
Search Results
Search results are returned first as a list of languages and associated links. The links allow you to access additional information about the languages whose data match your search criteria. The links are arranged in the following columns:
- Language Name - The most common name for the language (according to the Ethnologue).
- Language Code - This is the Ethnologue Code for the language, which links to the Ethnologue description of this language.
- Language Profile - This is a profile for the language, as determined from markup vocabulary mined from IGT for the language. The language profile follows the standard described in the GOLD Community of Practice (Farrar & Lewis 2005), and contains a list of terms used to describe grammatical concepts, as well as a listing of grammatical concepts used in describing data for the language. Profiles provided by ODIN are not meant to be authoritative or complete, but merely represent fragments of a language's grammar.
- Resources - Resources are the documents and websites where IGT for the language was discovered. You can pull up the full list of papers containing data on a particular language by clicking this link.
- Data - The data link lists IGT that matches the search criteria specified. The data returned include all instances of IGT that match your criteria exactly, but only those that can be displayed. Some data are lost in the extraction and conversion process and therefore cannot be displayed. Other data cannot be displayed because citation information was either not available or could not be found. (IGT is displayed directly by ODIN only if it has citation information: author, title, year, and source URL.)
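The display rule for the Data column can be sketched as a simple filter. This is only an illustration of the rule as stated, not ODIN's code. The record layout (a dict with "citation" and "igt_lines" keys), the field names, and the function names are assumptions.

```python
# Citation fields the text says must be present before IGT is displayed.
# The key names themselves are hypothetical.
REQUIRED_CITATION_FIELDS = ("author", "title", "year", "source_url")


def is_displayable(record):
    """True if a record survived extraction and carries a full citation."""
    citation = record.get("citation") or {}
    has_citation = all(citation.get(field) for field in REQUIRED_CITATION_FIELDS)
    # An empty or missing "igt_lines" stands in for data lost in conversion.
    has_text = bool(record.get("igt_lines"))
    return has_citation and has_text


def displayable(records):
    """Keep only the matching records that may be shown to the user."""
    return [record for record in records if is_displayable(record)]
```

Under this sketch, a record that matches the query but lacks, say, a source URL is dropped from the displayed list rather than shown without attribution.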