About the VepKar project


Welcome to VepKar, an open corpus of the Veps and Karelian languages, containing dictionaries and corpora of the Baltic-Finnic languages of the peoples of Karelia.

The VepKar project continues the work on the Vepsian language corpus. Staff of the Karelian Research Center of the Russian Academy of Sciences compile the dictionary and add texts to the corpus of the Vepsian and Karelian languages. The Karelian corpus covers the Karelian Proper, Livvi-Karelian and Ludic Karelian dialects, which have a newly created written tradition (“младописьменный”, mladopis'mennyy: newly literate languages).

The software behind the VepKar corpus is Dictorpus, an open-source project developed by us; the data are open as well (CC-BY). The name "Dictorpus" combines the dictionary (DICTionary) and the corpus (cORPUS). Dictorpus is designed for teams of linguists working with the languages of the world. At present, the program supports the Vepsian and Karelian languages and takes their specific features into account.

What is a "language corpus"

A corpus is an information and reference system based on a collection of texts in electronic form. The electronic dictionary included in the corpus lets you search and process the texts quickly. These are the corpora and dictionaries that we develop within the VepKar project.

VepKar in numbers

The Open corpus of Veps and Karelian languages was opened on July 24, 2016. The corpus currently contains:
17992 dictionary entries
1747 texts in 6 languages and 42 dialects