Building the Resource
In the absence of the hundreds of years of traditional text-based historical-linguistic work available for Eurasian languages, digital methods emerge as the best means available for systematically compiling and exploring the existing textual data for language families and contact in Pre-Colonial South America.
Our next step is to annotate the texts carefully for metadata and linguistic features through lemmatisation, part-of-speech tagging, the mapping of sound-spelling equivalences, and morphological tagging.
Using standard annotation categories from the Text Encoding Initiative (TEI) guidelines, an XML-based encoding scheme, together with bespoke tools developed in-house, we will create a searchable database of the textual material.
These methods will allow the research team, and future users, to trace related features over time and across languages, turning back the clock as far as the data allow in order to probe the relationships between those languages.
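To make the annotation workflow described above more concrete, the following is a minimal sketch of how word-level TEI annotation might be loaded and queried. It is an illustration under stated assumptions, not the project's actual tooling: the word forms, lemmas, and tag values are hypothetical placeholders, the XML is a simplified fragment rather than a complete valid TEI document, and the helper names `load_tokens` and `search` are invented for this example. The `lemma`, `pos`, and `msd` attributes on `<w>` follow the TEI P5 guidelines; everything else is assumed.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical, simplified TEI fragment: placeholder forms, not real data.
SAMPLE = f"""<TEI xmlns="{TEI_NS}">
  <text>
    <body>
      <p>
        <s xml:id="s1">
          <w lemma="lemmaA" pos="NOUN" msd="Case=Acc">formA1</w>
          <w lemma="lemmaB" pos="VERB" msd="Person=3">formB1</w>
        </s>
        <s xml:id="s2">
          <w lemma="lemmaA" pos="NOUN" msd="Number=Plur">formA2</w>
        </s>
      </p>
    </body>
  </text>
</TEI>"""


def load_tokens(xml_text):
    """Flatten every <w> element into a dict of its annotations."""
    root = ET.fromstring(xml_text)
    tokens = []
    for sentence in root.iter(f"{{{TEI_NS}}}s"):
        sentence_id = sentence.get(f"{{{XML_NS}}}id")
        for word in sentence.iter(f"{{{TEI_NS}}}w"):
            tokens.append({
                "sentence": sentence_id,
                "form": (word.text or "").strip(),
                "lemma": word.get("lemma"),
                "pos": word.get("pos"),
                "msd": word.get("msd"),
            })
    return tokens


def search(tokens, **criteria):
    """Return the tokens whose annotations match every given criterion."""
    return [
        token for token in tokens
        if all(token.get(key) == value for key, value in criteria.items())
    ]


if __name__ == "__main__":
    tokens = load_tokens(SAMPLE)
    # Find every attested form of one lemma, with the sentence it occurs in.
    for hit in search(tokens, lemma="lemmaA"):
        print(hit["sentence"], hit["form"], hit["msd"])
```

Searching by lemma rather than by surface form is what lets variant spellings of the same word be retrieved together, which is the kind of cross-text comparison the annotation is meant to support. A production database would presumably rely on an indexed store rather than an in-memory list.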