Gurung man speaking with team member Kristine Hildebrandt, June 2012. Photo credit: Oliver Bond

Motivations and Goals
There are many methods that can be used to compile information about a language’s grammar and lexicon in order to build an adequate descriptive account. One common and well-tested avenue is that of elicitation, through which phonological contrasts may be established, and where paradigms and other constructions may be built and compared to uncover the inflectional and derivational categories relevant for the language.

However, we must also recognize that a language is more than the combined results of separate levels of grammatical analysis (phonetics-phonology, morpho-syntax, lexical and grammatical semantics); it is also a communicative tool. Utterances are tailored and shaped to particular grammatical (and social/cultural) functions. As such, to truly come to know a language is to do so through the collection and analysis of texts from a wide range of representative speakers and speech genres. This is by no means an easy task, and admittedly it is one of the slowest dimensions of our project to develop, but we are making steady progress nevertheless.

Our approach to text collection attempts to conform to Bird and Simons’ (2003) “best practices recommendations”, particularly:

  • Format: We aim to release text collections with open-access transcription and translation conventions, available through freely accessible applications like ELAN and Toolbox
  • Discovery: The texts are accompanied by metadata and metadescription such that the discourse event does not become an uninterpretable artifact over time; as such, the text becomes a vehicle for grammatical analysis, while important contextual information remains associated with the text
  • Access: Only texts that have been approved by the speech community are available for public access and analysis; in most cases, publicly available texts represent oral histories, personal experiences, object or procedural descriptions and demonstrations about practices and materials that are of some cultural significance to the community, and that the community wishes to share with the world.

Some texts have been acquired by dynamic-interactive stimuli (photos and video prompts) and guided interaction (e.g. a tour of a residence, demonstrations of agricultural techniques). Other exemplars include monologic procedural texts (e.g. activities, recipes, visual landscape or object descriptions), interviews between friends and relatives, and narrative autobiographies/histories.
Texts were recorded either for audio archive only (via a Marantz recorder and an Audio-Technica omnidirectional stereo microphone), with a Canon FS200 video camera (with external microphone input), or with a Sony HD solid-state recorder, also with external microphone input (which we love, by the way: beautiful sound and resolution).


Outputs & Metadata

The range and detail of metadata and metadescription vary. Minimally we include place/time/participants/genre/duration/recording equipment data. [Under revision] Many of our recorded texts and discourses from Gyalsumdo speakers are now available via the SHANTI (Sciences Humanities and Arts Network of Technological Initiatives) media base collection (connected to the Tibetan Himalayan Library at the University of Virginia).
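As an illustration only, the minimal metadata set listed above could be represented as a simple record. This is a sketch under stated assumptions, not the project's actual schema: the class name, field names, and the `missing_fields` helper are hypothetical, and the sample values are placeholders rather than real session data.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass
class RecordingMetadata:
    """Hypothetical record holding the minimal metadata for one recorded text."""

    place: str
    recorded_on: date
    participants: list[str]
    genre: str
    duration_seconds: int
    equipment: str

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are empty (blank, empty list, or zero)."""
        return [name for name, value in vars(self).items() if not value]


# Placeholder values for illustration only; not taken from an actual session.
example = RecordingMetadata(
    place="(village name)",
    recorded_on=date(2012, 6, 1),
    participants=["(speaker)", "(interviewer)"],
    genre="procedural description",
    duration_seconds=300,
    equipment="(recorder and microphone)",
)
print(example.missing_fields())
```

A check like `missing_fields` is one way to flag a record that lacks part of the minimal set before a text is archived.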

Access to Discourses

These materials are available under a Creative Commons license. By downloading these files, you automatically accept the terms and conditions of the Attribution-NonCommercial-ShareAlike 3.0 License.

The main page for the discourses can be found here. You may need to disable the “https” setting in order to view the video in certain browsers (e.g. Safari).


1. Gurung (Tamu, Tamu kye, Ethnologue ISO 639-3 gvr, Glottocode west2414)

2. Gyalsumdo (Lama Bhaasa, Glottocode gyal1235)

3. Nar and Phu (Chhyprung, Nar Toe, single ISO npa, Glottocode narp1239)

4. Nyeshangte (Manange, Manang kye, ISO nmm, Glottocode mana1288)

