News
June 28, 2022
Scientists are creating a speech corpus of Baltic Finnic languages of Karelia. Unique audio material reflecting the state of Karelian and Veps dialects will become available in open access. A field trip to Medvezhyegorsky District took place in June as part of this project. It is one of the areas in Karelia where the Karelian-Proper supradialect has been best preserved.
On June 20–24, Irina Novak, Denis Kuzmin, and Natalia Pellinen of the Institute of Linguistics, Literature and History (ILLH) KarRC RAS, together with Andrey Krizhanovsky of the Institute of Applied Mathematical Research (IAMR) KarRC RAS, visited Padany, Selgi, Syargozero, Shalgovaara, Vengigora, Yevgora, Karelskaya Maselga and other settlements in the Medvezhyegorsky District of Karelia.

“We’ve managed to collect and update linguistic, ethnographic, and folklore materials from residents of the Padany village cluster, one of the few areas in the republic where the Karelian-Proper supradialect remains well preserved. Audio recordings of our conversations with speakers of the local subdialects will be included in the future speech corpus,” says Natalia Pellinen, Junior Researcher at the Linguistics Section of ILLH KarRC RAS.

The trip was part of a project funded by Russian Science Foundation grant No. 22-28-20215, “Creating the speech corpus of Baltic Finnic languages of Karelia” (Project Leader: Alexandra Rodionova, Researcher, ILLH KarRC RAS). The project envisages creating an audio module on the platform of the Open Corpus of Veps and Karelian Languages (VepKar), which was launched by staff of ILLH and IAMR KarRC RAS in 2016.

A corpus is a reference data system built on a collection of texts in electronic form. It comprises texts and dictionaries stored in a database, as well as software for text search and processing. VepKar is an open corpus of the Veps and Karelian languages. The new project will supplement it with a collection of audio-recorded texts in different dialects, each accompanied by a transcription, markup, and a Russian translation.

“Having archival and field audio samples of Karelian and Veps speech digitized in the Speech Corpus format will make subsequent processing and storage of the material easier. It will also bring unique audio materials reflecting the state of Karelian and Veps dialects since the mid-20th century into scientific circulation and open access. These materials are deposited at the Audio Records Archive of ILLH KarRC RAS and urgently need to be digitized to ensure their preservation,” the project application states.

One of the project outputs will be a multimedia map of the Baltic Finnic subdialects of Karelia. It will let any user become familiar with the different varieties of the republic’s indigenous languages and will help present the diverse features of both the living and the lost Baltic Finnic dialectal speech of Karelia. The map will be useful in educational activities and for developing tourism in the region.
