About the VepKar project

Livvin piämurdehes Vepsän-karjalkielizes avvokorpusas

Welcome to VepKar, the Open Corpus of Veps and Karelian Languages, which contains dictionaries and corpora of the Baltic-Finnic languages of the peoples of Karelia.

The VepKar project continues the work on the Veps Language corpus. Staff of the Karelian Research Centre of the Russian Academy of Sciences compile the dictionary and add texts to the corpus of Veps and Karelian languages. The Karelian corpus covers the Karelian Proper, Livvi-Karelian and Ludic Karelian dialects, which are young written languages, that is, languages with a newly created writing tradition (Russian “младописьменный”, mladopis'mennyy).

The corpus manager developed for the project, Dictorpus, is open source. The database, including dictionaries and texts (see the list of database dumps), is also released under an open license (CC-BY).

The name "Dictorpus" combines the words dictionary (DICTionary) and corpus (cORPUS). Dictorpus is designed for teams of linguists working with the languages of the world. At present, the program supports Veps and Karelian and takes their specific features into account.

See the publication list.

What is a language corpus?

A corpus is an information and reference system based on a collection of texts in electronic form. The electronic dictionary included in the corpus lets you search and process the texts quickly. These are the corpora and dictionaries that we develop within the VepKar project.
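As a rough illustration of how a dictionary can speed up text search, here is a minimal Python sketch. It does not reflect how Dictorpus is actually implemented: the data, the names DICTIONARY, TEXTS, build_index and search, and the use of English words are all hypothetical and chosen only to show the idea of linking wordforms in texts to dictionary entries.

```python
from collections import defaultdict

# Hypothetical toy data, not taken from VepKar.
# Each dictionary entry (lemma) lists the wordforms that belong to it.
DICTIONARY = {
    "house": ["house", "houses"],
    "go": ["go", "goes", "went", "gone"],
}

# Hypothetical corpus: text id -> text.
TEXTS = {
    1: "She went to the old house",
    2: "Two houses stand by the lake",
    3: "He goes home",
}


def build_index(dictionary, texts):
    """Map each lemma to the ids of texts containing any of its wordforms."""
    form_to_lemma = {
        form: lemma
        for lemma, forms in dictionary.items()
        for form in forms
    }
    index = defaultdict(set)
    for text_id, text in texts.items():
        # Naive tokenization: lowercase and split on whitespace.
        for token in text.lower().split():
            lemma = form_to_lemma.get(token)
            if lemma is not None:
                index[lemma].add(text_id)
    return index


def search(index, lemma):
    """Return the sorted ids of texts in which the lemma occurs in any form."""
    return sorted(index.get(lemma, set()))


INDEX = build_index(DICTIONARY, TEXTS)
print(search(INDEX, "house"))
print(search(INDEX, "go"))
```

Because the index is keyed by dictionary entry rather than by surface form, a single query finds every text that uses the word in any of its listed forms. This matters most for languages with rich inflection.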

VepKar in numbers

The Open Corpus of Veps and Karelian Languages was opened on July 24, 2016. The corpus currently contains:
54653 dictionary entries
2887 texts in 6 languages and 46 dialects