<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body style="font-size: 10pt; font-family: Verdana,Geneva,sans-serif">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<table width="640" cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td>
<table width="640" cellspacing="0" cellpadding="0"
border="0" bgcolor="#ffffff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px; min-width:
640px;" width="640" cellspacing="0"
cellpadding="0" border="0"
bgcolor="#ffffff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td style="padding: 0px 40px;"
align="center">
<table width="100%"
cellspacing="0"
cellpadding="0" border="0"
align="center">
<tbody>
<tr>
<td width="110"
valign="middle"
align="left"><img
src="cid:part1.6icJekrb.a0jv0hUi@ijs.si" alt="" class="" width="101"
height="30"></td>
<td width="20"
height="1"> </td>
<td valign="middle"
align="right">
<table width="100%"
cellspacing="0"
cellpadding="0"
border="0"
align="center">
<tbody>
<tr>
<td
style="font-family:
Poppins,sans-serif; font-size: 21px; line-height: 31.5px; font-weight:
bold; color:
#0080ad;"
align="right">CLASSLA
Mailing List</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td
style="line-height: 10px;
min-height: 10px;"
height="10"> </td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table width="640" cellspacing="0" cellpadding="0"
border="0" bgcolor="#ffffff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px; min-width:
640px;" width="640" cellspacing="0"
cellpadding="0" border="0"
bgcolor="#ffffff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td align="center">
<table
style="border-top:
3px double #ededf3;
border-collapse: initial;"
width="100%"
cellspacing="0"
cellpadding="0" border="0"
align="center">
<tbody>
<tr>
<td
style="line-height:
0px; min-height:
0px;" height="0"> </td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table width="640" cellspacing="0" cellpadding="0"
border="0" bgcolor="#ffffff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px; min-width:
640px;" width="640" cellspacing="0"
cellpadding="0" border="0"
bgcolor="#ffffff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td
style="line-height: 10px;
min-height: 10px;"
height="10"> </td>
</tr>
</tbody>
</table>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td style="padding: 0px 40px;"
align="center">
<table
style="border-radius:
2px;" width="560"
cellspacing="0"
cellpadding="0" border="0"
align="center">
<tbody>
<tr>
<td
style="padding:
0px 40px; border:
1px solid #e6e6e6;
border-radius: 2px;"
bgcolor="#FCFCFC"
align="center">
<table width="100%"
cellspacing="0"
cellpadding="0"
border="0"
align="center">
<tbody>
<tr>
<td
height="30"> </td>
</tr>
<tr>
<td
id="m_925030267947577449gmail-m_6504557075424313283gmail-m_-5089897522223699477bodyText-8"
style="font-family: Poppins,sans-serif; font-size: 14px; line-height:
21px; color:
#000000;"> <b
style="font-weight:normal;"
id="docs-internal-guid-a3ea52a0-7fff-23c9-7eab-7ccea5e45852"><span
style="font-size:11pt;font-family:Arial,sans-serif;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">
</span></b> <b style="font-weight:normal;"
id="docs-internal-guid-6ee16cf6-7fff-30de-72ab-3e256a4a6768">
<p dir="ltr"
style="line-height:1.38;background-color:#ffffff;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 15pt 0pt;"><font
size="2"
face="Arial"><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Dear all,
</span></font></p>
<p dir="ltr"
style="line-height:1.38;background-color:#ffffff;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 15pt 0pt;"><font
size="2"
face="Arial"><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">As you might have noticed, we recently extended our efforts to provide language resources and technologies from standard South Slavic languages to South Slavic dialects as well (you might have heard about the COPA datasets in the Cerkno, Torlak and Chakavian dialects, which are the stars of </span><a
href="https://sites.google.com/view/vardial-2024/shared-tasks/dialect-copa"
style="text-decoration:none;" moz-do-not-send="true"><span
style="color: rgb(17, 85, 204); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: underline; text-decoration-skip-ink: none; vertical-align: baseline; white-space: pre-wrap;">the DIALECT-COPA shared task at the VarDial 2024 workshop</span></a><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> in Mexico City). Now, we are pleased to announce the first resources for speech technologies for Chakavian micro-dialects of Croatian: </span><a
href="http://hdl.handle.net/11356/1765" style="text-decoration:none;"
moz-do-not-send="true"><span
style="color: rgb(17, 85, 204); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: underline; text-decoration-skip-ink: none; vertical-align: baseline; white-space: pre-wrap;">the Mići Princ dataset</span></a><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> and an </span><a
href="https://huggingface.co/classla/whisper-large-v3-mici-princ"
style="text-decoration:none;" moz-do-not-send="true"><span
style="color: rgb(17, 85, 204); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: underline; text-decoration-skip-ink: none; vertical-align: baseline; white-space: pre-wrap;">automatic speech recognition model for Chakavian</span></a><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">, both openly available.</span></font></p>
<p dir="ltr"
style="line-height:1.38;background-color:#ffffff;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 15pt 0pt;"><font
size="2"
face="Arial"><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The </span><a
href="http://hdl.handle.net/11356/1765" style="text-decoration:none;"
moz-do-not-send="true"><span
style="color: rgb(17, 85, 204); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: underline; text-decoration-skip-ink: none; vertical-align: baseline; white-space: pre-wrap;">Mići Princ dataset</span></a><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> is a "text and speech" dialectal translation of Antoine de Saint-Exupéry's "Le Petit Prince" (The Little Prince) into various Chakavian micro-dialects, released by Udruga Calculus and the Peek&Poke museum</span><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">, both in the form of a </span><a
href="https://www.peekpoke.hr/mici-princ-an-edition-of-the-little-prince-in-the-chakavian-dialect-book-presentation/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="color: rgb(17, 85, 204); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: underline; text-decoration-skip-ink: none; vertical-align: baseline; white-space: pre-wrap;">printed book</span></a><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> and an </span><a
href="https://www.peekpoke.hr/mici-princ-the-little-prince-in-the-chakavian-dialect-audio-book-presentation-and-exhibition/"
style="text-decoration:none;" moz-do-not-send="true"><span
style="color: rgb(17, 85, 204); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: underline; text-decoration-skip-ink: none; vertical-align: baseline; white-space: pre-wrap;">audio book</span></a><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">. Almost every character in the book was translated into and narrated in a different micro-dialect (for which we would once again like to thank the large team of translators and audio book narrators, especially the main translator, Tea Perinčić).</span></font></p>
<p dir="ltr"
style="line-height:1.38;background-color:#ffffff;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 15pt 0pt;"><font
size="2"
face="Arial"><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Following the creation of the Mići Princ dataset, our colleagues Peter Rupnik and Nikola Ljubešić aligned the text and speech to develop the first openly available dataset for Chakavian automatic speech recognition (ASR). </span><a
href="http://hdl.handle.net/11356/1765" style="text-decoration:none;"
moz-do-not-send="true"><span
style="color: rgb(17, 85, 204); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: underline; text-decoration-skip-ink: none; vertical-align: baseline; white-space: pre-wrap;">The dataset is published on the CLARIN.SI repository</span></a><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">, as well as on </span><a
href="https://huggingface.co/datasets/classla/Mici_Princ"
style="text-decoration:none;" moz-do-not-send="true"><span
style="color: rgb(17, 85, 204); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: underline; text-decoration-skip-ink: none; vertical-align: baseline; white-space: pre-wrap;">Hugging Face, where you can listen to it</span></a><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">.</span></font></p>
<font size="2"
face="Arial"><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">We are also pleased to introduce a model trained on this dataset: </span><a
href="https://huggingface.co/classla/whisper-large-v3-mici-princ"
style="text-decoration:none;" moz-do-not-send="true"><span
style="color: rgb(17, 85, 204); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: underline; text-decoration-skip-ink: none; vertical-align: baseline; white-space: pre-wrap;">Whisper-large-v3-mici-princ</span></a><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">, an automatic speech recognition model for Chakavian. By fine-tuning OpenAI's Whisper large-v3 model on the Mići Princ dataset, </span><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">we reduced the character error rate by 66%. You are welcome to </span><a
href="https://huggingface.co/classla/whisper-large-v3-mici-princ"
style="text-decoration:none;" moz-do-not-send="true"><span
style="color: rgb(17, 85, 204); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: underline; text-decoration-skip-ink: none; vertical-align: baseline; white-space: pre-wrap;">try it out on Hugging Face</span></a><span
style="color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">!
Best regards,
The CLASSLA team</span></font></b> </td>
</tr>
<tr>
<td
height="30"> </td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td
style="line-height: 10px;
min-height: 10px;"
height="10"> </td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table width="640" cellspacing="0" cellpadding="0"
border="0" bgcolor="#e6f4ff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px; min-width:
640px;" width="640" cellspacing="0"
cellpadding="0" border="0"
bgcolor="#e6f4ff" align="center">
<tbody>
<tr>
<td>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td
style="line-height: 20px;
min-height: 20px;"
height="20"> </td>
</tr>
</tbody>
</table>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td style="padding: 0px 40px;"
align="center">
<table width="100%"
cellspacing="0"
cellpadding="0" border="0"
align="center">
<tbody>
<tr>
<td
style="font-family:
Poppins,sans-serif;
font-size: 14px;
font-weight: bold;
line-height: 21px;
color: #111111;"
align="left"><a
href="https://www.clarin.si/info/k-centre/" target="_blank"
rel="noopener
noreferrer"
moz-do-not-send="true">CLASSLA: The Knowledge Centre for South Slavic
Languages</a></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td height="10"> </td>
</tr>
</tbody>
</table>
<table
style="width: 640px;
min-width: 640px;" width="640"
cellspacing="0" cellpadding="0"
border="0" align="center">
<tbody>
<tr>
<td style="padding: 0px 40px;"
align="center">
<table width="100%"
cellspacing="0"
cellpadding="0" border="0"
align="center">
<tbody>
<tr>
<td align="center">
<table
style="width:
267px; min-width:
267px;"
width="267"
cellspacing="0"
cellpadding="0"
border="0"
align="left">
<tbody>
<tr>
<td
id="m_925030267947577449gmail-m_6504557075424313283gmail-m_-5089897522223699477footerText-10"
style="font-family: Poppins,sans-serif; font-size: 12px; line-height:
18px; color:
#111111;"
align="left">
<p
style="margin-top:
0px;
margin-bottom:
10px;"><a
href="http://clarin.si/" target="_blank" rel="noopener noreferrer"
moz-do-not-send="true">CLARIN.SI</a></p>
<p
style="margin-top:
0px;
margin-bottom:
10px;">Jožef
Stefan
Institute</p>
<p
style="margin-top:
0px;
margin-bottom:
0px;">Jamova
cesta 39,
Ljubljana<br>
Slovenia</p>
</td>
</tr>
<tr>
<td
height="25"> </td>
</tr>
</tbody>
</table>
<table
style="width:
267px; min-width:
267px;"
width="267"
cellspacing="0"
cellpadding="0"
border="0"
align="right">
<tbody>
<tr>
<td
id="m_925030267947577449gmail-m_6504557075424313283gmail-m_-5089897522223699477footerUnsubscribeText-10"
style="font-family: Poppins,sans-serif; font-size: 12px; line-height:
18px; color:
#111111;"
align="right">
<p
style="margin-top:
0px;
margin-bottom:
0px;"><br>
<span
style="font-size:
10px;"></span></p>
</td>
</tr>
<tr>
<td
height="10"> </td>
</tr>
<tr>
<td
style="font-family:
Poppins,sans-serif; font-size: 12px; line-height: 18px; color: #111111;"
align="right"> </td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div dir="ltr"> </div>
</div>
</div>
</body>
</html>