<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body style="font-size: 10pt; font-family: Verdana,Geneva,sans-serif">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<table width="640" cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td>
<table width="640" cellspacing="0" cellpadding="0"
border="0" bgcolor="#ffffff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px; min-width:
640px;" width="640" cellspacing="0"
cellpadding="0" border="0"
bgcolor="#ffffff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td style="padding: 0px 40px;"
align="center">
<table width="100%"
cellspacing="0"
cellpadding="0" border="0"
align="center">
<tbody>
<tr>
<td width="110"
valign="middle"
align="left"><img
src="cid:part1.ZxzIVvXc.0GZR3Jtd@ijs.si" alt="" class="" width="101"
height="30"></td>
<td width="20"
height="1"> </td>
<td valign="middle"
align="right">
<table width="100%"
cellspacing="0"
cellpadding="0"
border="0"
align="center">
<tbody>
<tr>
<td
style="font-family:
Poppins,sans-serif; font-size: 21px; line-height: 31.5px; font-weight:
bold; color:
#0080ad;"
align="right">CLASSLA
Mailing List</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td
style="line-height: 10px;
min-height: 10px;"
height="10"> </td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table width="640" cellspacing="0" cellpadding="0"
border="0" bgcolor="#ffffff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px; min-width:
640px;" width="640" cellspacing="0"
cellpadding="0" border="0"
bgcolor="#ffffff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td align="center">
<table
style="border-top:
3px double #ededf3;
border-collapse: initial;"
width="100%"
cellspacing="0"
cellpadding="0" border="0"
align="center">
<tbody>
<tr>
<td
style="line-height:
0px; min-height:
0px;" height="0"> </td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table width="640" cellspacing="0" cellpadding="0"
border="0" bgcolor="#ffffff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px; min-width:
640px;" width="640" cellspacing="0"
cellpadding="0" border="0"
bgcolor="#ffffff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td
style="line-height: 10px;
min-height: 10px;"
height="10"> </td>
</tr>
</tbody>
</table>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td style="padding: 0px 40px;"
align="center">
<table
style="border-radius:
2px;" width="560"
cellspacing="0"
cellpadding="0" border="0"
align="center">
<tbody>
<tr>
<td
style="padding:
0px 40px; border:
1px solid #e6e6e6;
border-radius: 2px;"
bgcolor="#FCFCFC"
align="center">
<table width="100%"
cellspacing="0"
cellpadding="0"
border="0"
align="center">
<tbody>
<tr>
<td
height="30"> </td>
</tr>
<tr>
<td
id="m_925030267947577449gmail-m_6504557075424313283gmail-m_-5089897522223699477bodyText-8"
style="font-family: Poppins,sans-serif; font-size: 14px; line-height:
21px; color:
#000000;"><b
style="font-weight:normal;"
id="docs-internal-guid-76a5ddbe-7fff-2ad0-085c-05d2275b990b">
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Dear all,</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">As the year comes to a close, we would like to share a brief summary of the main activities and progress made at the CLASSLA Knowledge Centre for South Slavic Languages during 2024.</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:700;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">CLASSLA web corpora for South Slavic languages</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">This year, we set up a crawling infrastructure for the (bi)annual collection of web corpora for South Slavic languages – the </span><a
href="https://aclanthology.org/2024.lrec-main.291/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">CLASSLA-web corpora collection</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">. The first version of the corpora, CLASSLA-web 1.0, comprising 11 billion words in 7 languages, was </span><a
href="https://www.clarin.si/ske/#open" style="text-decoration:none;"
moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">included in the CLARIN.SI concordancers</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> in 2023 and </span><a
href="https://www.clarin.si/repository/xmlui/discover?query=%22CLASSLA-web%22&submit=Search&filtertype_1=title&filter_relational_operator_1=contains&filter_1=%22CLASSLA-web%22&filtertype_2=title&filter_relational_operator_2=contains&filter_2=&query=&rpp=10&sort_by=dc.date.issued_dt&order=desc"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">released on the CLARIN.SI repository this year</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">. The web corpora are linguistically annotated with an </span><a
href="https://zenodo.org/records/13936406" style="text-decoration:none;"
moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">improved CLASSLA-Stanza</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> tool for linguistic annotation of South Slavic languages (</span><a
href="https://clarin.si/oznacevalnik/eng" style="text-decoration:none;"
moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">available as a service here</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">) and a multilingual genre classifier </span><a
href="https://huggingface.co/classla/xlm-roberta-base-multilingual-text-genre-classifier"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">X-GENRE</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">. Owing to their large size and recency, the CLASSLA-web corpora have already proven very useful for the development of large language models for South Slavic languages, and were included in the training datasets for the </span><a
href="https://huggingface.co/cjvt/GaMS-1B" style="text-decoration:none;"
moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">GaMS</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> (Generative Model for Slovene) model and the </span><a
href="https://huggingface.co/gordicaleksa/YugoGPT"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">YugoGPT</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> model for Bosnian, Croatian, and Serbian. The next version of the CLASSLA web corpora has already been collected, and the release is planned for 2025.</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:700;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">CLASSLA-Express</span><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> </span><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:700;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">workshop series</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">In collaboration with Ivana Filipović Petrović, Jelena Parizoska and Petya Osenova, we organized seven </span><a
href="https://www.clarin.si/info/k-centre/workshops/classla-express/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">CLASSLA-Express workshops</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> in five South Slavic countries, attended by over 120 participants. The workshops focused on introducing concordancers, CLASSLA-web corpora, and CLARIN.SI services to linguists, lexicographers, language teachers, digital humanities scholars, and students. Feedback was extremely positive, and we are planning additional workshops for 2025, with sessions to be held in Bulgaria, Croatia, and Slovenia, and an expansion beyond the South Slavic region to locations such as Austria. The workshops will also feature new topics, including </span><a
href="https://www.clarin.si/info/k-centre/workshops/#September_2024_Round_Table_on_the_Usage_of_Large_Language_Models_in_Corpus-Linguistic_Research"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">the application of large language models in corpus linguistics and lexicography</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">. Stay tuned for more details about the upcoming workshops!</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:700;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Benchmarking LLMs for South Slavic languages and dialects</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">The rapid advancements in large language models have also reached South Slavic languages, and evaluating their capabilities has become crucial for understanding the strengths and limitations of these models for our languages and for guiding future development in both academic and applied settings. To this end, we benchmarked large language models for South Slavic languages and </span><a
href="http://hdl.handle.net/11356/1766" style="text-decoration:none;"
moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">dialects, including the Torlak, Chakavian, and Cerkno dialects</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">, on the task of commonsense reasoning. </span><a
href="https://aclanthology.org/2024.vardial-1.18/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">The results</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> showed impressive capabilities of GPT models in handling South Slavic languages, demonstrating not only strong performance but also an ability to adapt to dialects. Remarkably, these models achieved high levels of accuracy in target dialects when provided with only a handful of examples. We are excited to continue our benchmarking activities as part of the LLM4DH and </span><a
href="https://alt-edic.eu/projects/llms4eu/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">LLMs4EU</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> projects, which will extend over the next few years.</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:700;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Speech technologies</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">We continued dipping our toes into the world of speech technology. Our efforts included the development of an </span><a
href="https://huggingface.co/classla/whisper-large-v3-mici-princ"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">automatic speech recognition (ASR) system tailored to the Chakavian dialect</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> based on the </span><a
href="https://huggingface.co/datasets/classla/Mici_Princ"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Mići Princ dataset</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">. We also worked on the </span><a
href="https://mezzanine.um.si/en/mezzanine-english/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Mezzanine</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">, </span><a
href="https://arxiv.org/abs/2409.15397" style="text-decoration:none;"
moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">ParlaSpeech</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> and </span><a
href="https://zenodo.org/records/13936420" style="text-decoration:none;"
moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Mak na konac</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> projects, which focus on developing spoken corpora and benchmarking speech technologies for Slovenian, Croatian and Serbian. In addition to developing various speech technologies, such as the </span><a
href="https://huggingface.co/classla/wav2vecbert2-filledPause"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">classifier for filled pauses in speech (eem)</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> that works splendidly for a range of South Slavic languages, we started building the CLASSLA infrastructure for speech research by </span><a
href="https://www.clarin.si/ske/#dashboard?corpname=parlaspeech_hr"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">publishing the ParlaSpeech corpora on the concordancers as well</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">. We are currently working on further enriching these corpora with disfluency information, primary stress position, and boundaries of prosodic units.</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:700;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Sharing knowledge on language resources for South Slavic languages</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">As a knowledge centre, one of our core activities is sharing valuable information and supporting users in their work with language resources and technologies. Over the past year, we have responded to numerous helpdesk inquiries regarding access to resources and their use. In addition to providing direct support, we also maintain informative materials to help users navigate available resources – the CLASSLA FAQs for </span><a
href="https://www.clarin.si/info/k-centre/faq4slovene/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Slovenian</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">, </span><a
href="https://www.clarin.si/info/k-centre/faq4croatian/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Croatian</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">, </span><a
href="https://www.clarin.si/info/k-centre/faq4serbian/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Serbian</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">, </span><a
href="https://www.clarin.si/info/k-centre/faq4bulgarian/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Bulgarian</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">, and </span><a
href="https://www.clarin.si/info/k-centre/faq4macedonian/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Macedonian</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">. Furthermore, we released a new </span><a
href="https://github.com/clarinsi/Slovenian-Language-Technologies-Overview/tree/main"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">overview of Slovenian language technologies</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">, summarizing the state-of-the-art language technologies for Slovenian.</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:700;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Monitoring the usage of language resources</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">We also actively supported our parent organization, </span><a
href="https://www.clarin.si/info/about/" style="text-decoration:none;"
moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">CLARIN.SI</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">, by monitoring the usage of freely accessible language resources and concordancers provided by the CLARIN.SI infrastructure. This allowed us to gain valuable insights into which datasets, technologies, and corpora are used the most. We were pleased to discover significant usage from outside Slovenia, with users frequently querying corpora in over 18 different languages. We invite you to watch a brief </span><a
href="https://www.clarin.si/info/end-of-year-review-clarin-si-in-2024/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">1-minute video</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> presenting key statistics, including the number of visits, the most popular resources, and a closer look at concordancer usage.</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">We are also very happy with the uptake of our </span><a
href="https://huggingface.co/classla" style="text-decoration:none;"
moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Hugging Face page</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">, from which our ParlaSpeech corpora have been downloaded more than 6,000 times in the last few months. Our models are also heavily used, with the recently published </span><a
href="https://huggingface.co/classla/multilingual-IPTC-news-topic-classifier"
style="text-decoration:none;" moz-do-not-send="true"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#1155cc;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;-webkit-text-decoration-skip:none;text-decoration-skip-ink:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">multilingual IPTC news topic classifier</span></a><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"> being downloaded almost 13,000 times in the past four months.</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">We would like to take this opportunity to thank all our collaborators for another incredibly productive year and to express our gratitude to you for staying engaged with our activities. We look forward to another year of exciting developments and continued collaboration. Wishing you all a successful and fulfilling year ahead, both professionally and personally.</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Best wishes,</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:12pt;margin-bottom:12pt;"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Nikola Ljubešić, Taja Kuzman and other CLASSLAers</span><b
style="font-weight:normal;"
id="docs-internal-guid-76a5ddbe-7fff-2ad0-085c-05d2275b990b"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#2b2b2b;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"></span></b><b
style="font-weight:normal;"
id="docs-internal-guid-a3ea52a0-7fff-23c9-7eab-7ccea5e45852"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;"></span></b></p>
</b> </td>
</tr>
<tr>
<td
height="30"> </td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td
style="line-height: 10px;
min-height: 10px;"
height="10"> </td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table width="640" cellspacing="0" cellpadding="0"
border="0" bgcolor="#e6f4ff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px; min-width:
640px;" width="640" cellspacing="0"
cellpadding="0" border="0"
bgcolor="#e6f4ff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td
style="line-height: 20px;
min-height: 20px;"
height="20"> </td>
</tr>
</tbody>
</table>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td style="padding: 0px 40px;"
align="center">
<table width="100%"
cellspacing="0"
cellpadding="0" border="0"
align="center">
<tbody>
<tr>
<td
style="font-family:
Poppins,sans-serif;
font-size: 14px;
font-weight: bold;
line-height: 21px;
color: #111111;"
align="left"><a
href="https://www.clarin.si/info/k-centre/" target="_blank"
rel="noopener
noreferrer"
moz-do-not-send="true">CLASSLA: The Knowledge Centre for South Slavic
Languages</a></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td height="10"> </td>
</tr>
</tbody>
</table>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td style="padding: 0px 40px;"
align="center">
<table width="100%"
cellspacing="0"
cellpadding="0" border="0"
align="center">
<tbody>
<tr>
<td align="center">
<table
style="width:
267px; min-width:
267px;"
width="267"
cellspacing="0"
cellpadding="0"
border="0"
align="left">
<tbody>
<tr>
<td
id="m_925030267947577449gmail-m_6504557075424313283gmail-m_-5089897522223699477footerText-10"
style="font-family: Poppins,sans-serif; font-size: 12px; line-height:
18px; color:
#111111;"
align="left">
<p
style="margin-top:
0px;
margin-bottom:
10px;"><a
href="http://clarin.si/" target="_blank" rel="noopener noreferrer"
moz-do-not-send="true">CLARIN.SI</a></p>
<p
style="margin-top:
0px;
margin-bottom:
10px;">Jožef
Stefan
Institute</p>
<p
style="margin-top:
0px;
margin-bottom:
0px;">Jamova
cesta 39,
Ljubljana<br>
Slovenia</p>
</td>
</tr>
<tr>
<td
height="25"> </td>
</tr>
</tbody>
</table>
<table
style="width:
267px; min-width:
267px;"
width="267"
cellspacing="0"
cellpadding="0"
border="0"
align="right">
<tbody>
<tr>
<td
id="m_925030267947577449gmail-m_6504557075424313283gmail-m_-5089897522223699477footerUnsubscribeText-10"
style="font-family: Poppins,sans-serif; font-size: 12px; line-height:
18px; color:
#111111;"
align="right">
<p
style="margin-top:
0px;
margin-bottom:
0px;"><br>
<span
style="font-size:
10px;"></span></p>
</td>
</tr>
<tr>
<td
height="10"> </td>
</tr>
<tr>
<td
style="font-family:
Poppins,sans-serif; font-size: 12px; line-height: 18px; color: #111111;"
align="right"> </td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div dir="ltr"> </div>
</div>
</div>
</body>
</html>