October 28, 2022
Language and culture studies in the digital era were discussed by participants of the 19th Bubrikh’s Readings conference, held at KarRC RAS on October 26–27. Scientists presented their research results and useful digital resources. One of these is VepKar, an open corpus with over 4,000 texts in Veps and Karelian. Specialists are now using it to create a spoken corpus of the Balto-Finnic languages of Karelia.
Bubrikh’s Readings, a scientific conference where experts from Russian regions discussed various aspects of Finno-Ugric linguistics, folklore and literature studies, history, and modern ethno-social processes, has just finished at the Karelian Research Centre RAS. Over the two conference days, 28 oral presentations and three poster presentations were delivered.

On the second day, a group of young scientists from the Higher School of Economics (Moscow) participated in the conference. In July, staff of HSE University’s Arctic Social Sciences and Humanities Laboratory visited the Institute of Linguistics, Literature and History KarRC RAS on the way back from their expedition to northern districts of Karelia. In October, the researchers had the chance to share their observations and findings with participants of Bubrikh’s Readings.

The cross-cutting theme was the use of new technologies in language, folklore, and literature studies, ethnology, and history. The discussions centered on the digitization of linguistic and cultural heritage, digital collections, databases, electronic dictionaries, language corpora, and innovative research methods, among other topics.



In particular, speaking of the concepts of “childhood” and “children’s” in Karelian, Natalya Pellinen, Junior Researcher at ILLH KarRC RAS, pointed out the important role of digitized dictionaries. She gave the example of A.V. Punzhina’s dictionary of Tver dialects of Karelian, the electronic version of which became available in 2021. “The dictionary has a user-friendly interface and search system. As a result, picking out the vocabulary is now hundreds of times faster,” the researcher remarked. Another valuable resource she mentioned for studies in Balto-Finnic phonetics, grammar, and vocabulary is the Veps and Karelian Open Corpus, VepKar.

The VepKar portal, which contains over 4,000 texts in 46 dialects, is now familiar to most researchers in Finno-Ugric studies. It was developed jointly by specialists at the Institute of Linguistics, Literature and History KarRC RAS and the Institute of Applied Mathematical Research KarRC RAS. However, textual data alone is not enough for adequate phonetic research, so this year the scientists took on another major task: creating a spoken corpus of the Balto-Finnic languages of Karelia. To this end, they applied for and received a grant from the Russian Science Foundation.

– The Spoken Corpus module to be developed will be a collection of audio texts in different Karelian and Veps dialects, supplied with mark-up and Russian translations, – said Alexandra Rodionova, Project Leader and Researcher at ILLH KarRC RAS.



The Audio Record Archives of ILLH KarRC RAS store a large number of archival and field-collected audio samples of Karelian and Veps speech that need to be digitized.

– Applying modern technology and methods to the field material accumulated over many decades, in combination with the latest data, can help to fill many of the gaps previously identified by linguists in this system. Digitizing archival and field-collected audio samples of Karelian and Veps speech in a spoken corpus format will make the material easier to handle and store in the future. It will also bring into scientific discourse unique audio material representing the state of Karelian and Veps dialects from the mid-20th century onwards, and will make that material readily available, – Alexandra Rodionova remarked.



As part of this project, an audio map of the dialects of the Balto-Finnic languages of Karelia will be produced. It will be filled with materials from expeditions, from open sources such as TV broadcasts in ethnic languages, and from the ILLH Audio Record Archives. By the end of the project, the map will contain at least 100 audio samples in all major supradialects of Karelian and in Veps. Notably, the audio map is already linked to a fragment in the Valdai dialect. It was recorded in 1990 in the Novgorod Region by the Karelian philologist and dialectologist Alexandra Punzhina (1934–2020). The recording was discovered only recently in the scientist’s personal archives.

– Previously, Valdai Karelian speech records could not be found in our archives, in Kotus (Institute for the Languages of Finland), or in the Pushkin House archives. There used to be some in the Tartu archives, but they were destroyed by fire. Recording Valdai speech anew is no longer possible. In effect, this will remain the only record of Valdai speech, – Alexandra Rodionova said.

Wrapping up the conference, Irma Mullonen, Organizing Committee chairperson, Chief Researcher at ILLH KarRC RAS, and Corr. Academician, remarked that Bubrikh’s Readings had not merely left a good impression but had also given a clear understanding of ongoing activities and plans for the future.



– We now share a feeling and an understanding that a transition to a new level of digital material use has taken place. We have mastered these methods, we apply them, and we create our own, – Irma Mullonen emphasized.

The best presentations from the conference will be published in the journals Proceedings of Petrozavodsk State University, Yearbook of Finno-Ugric Studies, and Finno-Ugric World. Bubrikh’s Readings is a biennial event. Petrozavodsk State University is the provisional host for 2024.
