About VepKar

Varžinaiskarielah da VepKar-korpussah näh

Welcome to VepKar, the Open corpus of Veps and Karelian languages, which contains dictionaries and text corpora of the Baltic-Finnic languages of the peoples of Karelia.

The VepKar project is a continuation of the work on the Veps language corpus. Staff of the Karelian Research Centre of the Russian Academy of Sciences expand the dictionary and add texts to the corpus of Veps and Karelian languages. The corpus of the Karelian language includes the Karelian Proper, Livvi-Karelian and Ludic Karelian dialects, which have a newly created writing tradition (the “младописьменный”, mladopis'mennyy, type of language).

The corpus manager developed for the project, Dictorpus, is open source. The database, including dictionaries and texts (see the list of database dumps), is also available under an open license (CC-BY).

The project name "Dictorpus" reflects the union of a dictionary (DICTionary) and a corpus (cORPUS). The program is designed for teams of linguists working with the languages of the world. At present, it supports the Veps and Karelian languages and takes their specific features into account.

See the publication list.

What is a "language corpus"

A corpus is an information and reference system based on a collection of texts in electronic form. This linguistic corpus includes texts and dictionaries stored in a database, together with a computer program (the corpus manager) for searching and processing the data.

VepKar in numbers

The Open corpus of Veps and Karelian languages was opened on July 24, 2016. The corpus currently contains:
64 458 articles about words
3 956 texts in 46 dialects