About

The Open Corpus of Veps and Karelian Languages (abbr. VepKar) contains dictionaries and corpora of the Finnic languages spoken by the peoples of Karelia.

The VepKar project continues the work done on the Veps language corpus. The Karelian language corpus includes three varieties, which can be considered either dialects or separate languages: Karelian Proper, Livvi-Karelian, and Ludic Karelian. All three have a newly created written tradition (the “младописьменный”, mladopis'mennyy, type of languages).

A corpus is an information retrieval system built on a collection of texts in some language, stored in digital form. It is more than a simple set of texts: a corpus should make it possible to search within the texts and to attach (bind) additional information to them. The search capabilities of the corpus rely on an electronic dictionary, which is part of the corpus and is linked to its texts.

The software is open source (Dictorpus on GitHub), and the linguistic data is released under an open license (CC-BY).

The VepKar project involves researchers from institutes of the Karelian Research Centre of the Russian Academy of Sciences (RAS).