News

One of CLARIN's priorities is to support young researchers. CLARIN develops teaching materials and supports the participation of PhD students in CLARIN conferences and other community activities. To showcase doctoral research that makes use of CLARIN language resources and tools, this year's Tour de CLARIN features a Young Scholars Special Edition. The latest issue includes an interview with RTU Liepāja doctoral student Guna Rābante-Buša.

In the interview, Guna Rābante-Buša discusses her research on Latvian speech and phonetics, with a particular focus on creating high-quality speech corpora and language resources. In her research, Guna uses several Latvian speech corpora, including LATE-conversational, LATE-media, the phonetically annotated corpus fonLATE, and BalsuTalka. She explains that reliable speech datasets are essential not only for linguistic research but also for the development of modern speech technologies such as automatic speech recognition and speech synthesis. The interview also highlights the challenges of working with a less-resourced language such as Latvian.

The interview further emphasizes the importance of interdisciplinary collaboration between linguists, engineers, and computer scientists. CLARIN plays a key role in this process by providing infrastructure that enables language resources to be shared, accessed, and reused by the international research community.

Finally, Guna Rābante-Buša encourages young researchers to combine expertise in linguistics with digital technologies. She believes that the future of language research lies in interdisciplinary work and in developing open, FAIR language resources that can benefit both the scientific community and society through improved language technologies.

On 19 and 20 May, the annual Centre Meeting organized by CLARIN ERIC took place in Frankfurt in a hybrid format, gathering everyone involved in setting up and hosting CLARIN centres. The main focus of the meeting was to discuss the latest updates in technical infrastructure, with special attention dedicated to the development of CLARIN repositories and tools, security solutions, as well as the role of artificial intelligence and Large Language Models in resource searching and query generation.

The meeting also featured a Technical Centres Committee session, alongside several presentation slots and lightning talks. In these sessions, member state representatives discussed a common citation framework, DDoS mitigation and the use of Cloudflare, the new AAI proxy setup, and FAIR assessments, while also sharing experiences on "vibe coding" and upcoming DSpace versions.

On 11–12 February, the Strategy Days organized by CLARIN ERIC took place in Athens.

The main objective of this two-day meeting was to define CLARIN’s strategic goals for the next five years across its focus areas: users, language resources and tools, technical infrastructure, and overall governance. Special attention was given to artificial intelligence and how CLARIN can further strengthen its role as a trusted repository of language data and a knowledge hub for user support.

During the Strategy Days, the National Coordinator Forum (NCF) meeting also took place. At this meeting, national coordinators exchanged updates on major activities in the Member States, and the CLARIN Board of Directors presented CLARIN ERIC’s recent achievements and outlined plans for the future. The national coordinators also discussed the best ways to align national CLARIN activities and plans with the overall strategy of CLARIN ERIC.

On April 21–22, 2026, the international workshop “CLARIN in University Curricula”, organized by CLARIN ERIC, took place in Utrecht (Netherlands). The workshop brought together university teachers, researchers, and CLARIN national representatives from different European countries. CLARIN-LV was represented at the workshop by Ilze Auziņa, a senior researcher at the Institute of Mathematics and Computer Science at the University of Latvia.

The workshop continued the strategic process initiated at the CLARIN@Universities workshop in 2019 and further discussed during the CLARIN Cafés “How to use CLARIN in (online) Higher Education” (2020) and “Towards guidelines for integrating CLARIN into Teaching – Lessons Learnt from UPSKILLS” (2021). The workshop focused particularly on collaboration between national CLARIN consortia and CLARIN ERIC to strengthen CLARIN’s visibility at universities and develop solutions tailored to educational needs. Key topics included curriculum development, research data management, data citation, challenges related to artificial intelligence and large language models, as well as practical strategies for sustainable integration.

In the working groups, participants discussed how CLARIN resources, tools and services can support the study process in fields such as linguistics, digital humanities, and language technologies, as well as social sciences, history and literary studies. They also shared their experiences of integrating CLARIN resources and tools into the learning process and discussed the development of joint, reusable and add-on materials.

The year 2025 marked an important milestone in the activities of CLARIN Latvia (CLARIN-LV), as it continued to expand and enhance its repository of language resources and tools.

Throughout the year, CLARIN-LV actively introduced the CLARIN research infrastructure to students, academic staff, and researchers, highlighting its value for research and innovation. CLARIN Latvia also strengthened national and international collaboration and fostered knowledge exchange within the Latvian research community and the CLARIN ERIC consortium.

To promote access to high-quality data for researchers in the humanities and social sciences, the CLARIN-LV repository was enriched with new digital language resources, including speech corpora, lexical databases, and dictionaries. The most viewed language resources from the repository were Tēzaurs.lv (more than 1000 views per month), the Balanced Corpus of Modern Latvian (around 250 views per month), and the LATE Dev&Test Set for ASR (around 220 views per month). Significant contributions to the repository’s content were made by the DHELI and Language Technology Initiative projects. Although most language resources are open access, more than 120 users have registered in the CLARIN-LV repository, not only from Latvia but also from the Netherlands, Iceland, Poland, Sweden, and other countries.

In cooperation with other members of the CLARIN ERIC consortium, the CLARIN Flagship Project PressMint was launched to compile a multilingual, comparable, annotated, translated and interoperable set of corpora of European historical newspapers from around the start of the 20th century. Two CLARIN-LV consortium members, the National Library of Latvia and the Institute of Mathematics and Computer Science of the University of Latvia, participate in this project. CLARIN-LV also became a member of the CLARIN Knowledge Centre on Large Language Models for the Humanities and Social Sciences (LLMs4SSH), established in 2025.

The CLARIN infrastructure and language resources were introduced to computer science students in the course “Fundamentals of Language Technologies” as well as to linguistics students in the course “Introduction to Computational Linguistics.” In December, CLARIN-LV organized a practical workshop for university teachers on the Digital Humanities course registry, where participants learned how to register courses.

Tour de CLARIN publishes interview with Guna Rābante-Buša

Representatives of CLARIN Technical Centres Meet at the Annual Centre Meeting

CLARIN National Coordinators Meet at the Annual CLARIN Strategy Days

CLARIN in University Curricula Workshop

CLARIN-LV: key developments in 2025