<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" class="" style="caret-color: rgb(0, 0, 0); font-family: Verdana, Geneva, sans-serif;"><tbody class=""><tr class=""><td class=""><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" bgcolor="#ffffff" class=""><tbody class=""><tr class=""><td class=""><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" bgcolor="#ffffff" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td class=""><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td align="center" class="" style="padding: 0px 40px;"><table border="0" width="100%" cellspacing="0" cellpadding="0" align="center" class=""><tbody class=""><tr class=""><td align="left" valign="middle" width="110" class=""><img id="m_925030267947577449gmail-m_6504557075424313283gmail-m_-5089897522223699477logoBlock-4" class="gmail_canned_response_image" border="0" style="display: block;" apple-inline="yes" src="cid:65E041C1-9E15-46B2-B001-60AE2E089AAF@Home"></td><td width="20" height="1" class=""> </td><td align="right" valign="middle" class=""><table border="0" width="100%" cellspacing="0" cellpadding="0" align="center" class=""><tbody class=""><tr class=""><td align="right" class="" style="font-family: Poppins, sans-serif; font-size: 21px; line-height: 31.5px; font-weight: bold; color: rgb(0, 128, 173);">CLASSLA Mailing List</td></tr></tbody></table></td></tr></tbody></table></td></tr></tbody></table><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td height="10" class="" style="line-height: 
10px; min-height: 10px;"> </td></tr></tbody></table></td></tr></tbody></table></td></tr></tbody></table><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" bgcolor="#ffffff" class=""><tbody class=""><tr class=""><td class=""><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" bgcolor="#ffffff" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td class=""><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td align="center" class=""><table border="0" width="100%" cellspacing="0" cellpadding="0" align="center" class="" style="border-top-width: 3px; border-top-style: double; border-top-color: rgb(237, 237, 243); border-collapse: initial;"><tbody class=""><tr class=""><td height="0" class="" style="line-height: 0px; min-height: 0px;"> </td></tr></tbody></table></td></tr></tbody></table></td></tr></tbody></table></td></tr></tbody></table><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" bgcolor="#ffffff" class=""><tbody class=""><tr class=""><td class=""><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" bgcolor="#ffffff" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td class=""><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td height="10" class="" style="line-height: 10px; min-height: 10px;"> </td></tr></tbody></table><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td align="center" class="" style="padding: 0px 40px;"><table border="0" width="560" cellspacing="0" cellpadding="0" align="center" class="" style="border-top-left-radius: 2px; border-top-right-radius: 2px; 
border-bottom-right-radius: 2px; border-bottom-left-radius: 2px;"><tbody class=""><tr class=""><td align="center" bgcolor="#FCFCFC" class="" style="padding: 0px 40px; border: 1px solid rgb(230, 230, 230); border-top-left-radius: 2px; border-top-right-radius: 2px; border-bottom-right-radius: 2px; border-bottom-left-radius: 2px;"><table border="0" width="100%" cellspacing="0" cellpadding="0" align="center" class=""><tbody class=""><tr class=""><td height="30" class=""> </td></tr><tr class=""><td id="m_925030267947577449gmail-m_6504557075424313283gmail-m_-5089897522223699477bodyText-8" class="" style="font-family: Poppins, sans-serif; font-size: 14px; line-height: 21px; color: rgb(0, 0, 0);"><p class="" style="margin-top: 0px; margin-bottom: 10px; line-height: 21px;">Hi, for the last time in 2021! </p><p class="">We are happy to wrap up this recent surge in CLASSLA reports (there was, and still is, quite a lot to catch up on) with news of the first open speech-to-text system for Croatian, which we have developed in recent weeks. You might have heard that CLASSLA was working lightly but persistently on entering the world of speech throughout 2021, and this is the first tangible result of these bootstrapping efforts. Real bootstrapping was needed, as not a minute of training data was available before we started our journey! And we have only just started, so expect many exciting developments in 2022.</p><p class="">You can check out the system at <a href="https://huggingface.co/classla/wav2vec2-xls-r-parlaspeech-hr" class="">https://huggingface.co/classla/wav2vec2-xls-r-parlaspeech-hr</a>. Feel free to try out the examples, but also to upload or record your own speech. The system is currently based on (only) 72 hours of transcripts from the Croatian parliament, and already achieves a rather low word error rate of 13% and a character error rate of 5% (yes, on in-domain test data). 
The system is an initial proof of concept, but it can already be very useful on good-quality speech.</p><p class="">We are continuing our efforts to improve and extend the current dataset, already branded as ParlaSpeech-HR, and plan to publish its initial version in early 2022. We furthermore plan to pilot and improve our system on more demanding tasks, such as the transcription of vernaculars. If you have interesting use cases, especially with existing recordings and transcripts for model adaptation, drop us an e-mail!</p><p class="">The presented results are a joint effort of (in order of appearance) Nikola Ljubešić, Ivo-Pavao Jazbec, Vuk Batanović, Lenka Bajčetić, Danijel Korzinek and Peter Rupnik. These results would not have been possible without the wider collaboration around the ParlaMint project, for which we thank Darja Fišer, Tomaž Erjavec, Maciej Ogrodniczuk and Petya Osenova. Together we really are stronger!</p><div class="">We wish you holidays full of peace and joy and a collaborative 2022!</div><div class=""><br class=""></div><div class="">The CLASSLA team</div></td></tr><tr class=""><td height="30" class=""><br class=""> </td></tr></tbody></table></td></tr></tbody></table></td></tr></tbody></table><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td height="10" class="" style="line-height: 10px; min-height: 10px;"> </td></tr></tbody></table></td></tr></tbody></table></td></tr></tbody></table><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" bgcolor="#e6f4ff" class=""><tbody class=""><tr class=""><td class=""><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" bgcolor="#e6f4ff" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td class=""><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" class="" style="width: 640px; min-width: 
640px;"><tbody class=""><tr class=""><td height="20" class="" style="line-height: 20px; min-height: 20px;"> </td></tr></tbody></table><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td align="center" class="" style="padding: 0px 40px;"><table border="0" width="100%" cellspacing="0" cellpadding="0" align="center" class=""><tbody class=""><tr class=""><td align="left" class="" style="font-family: Poppins, sans-serif; font-size: 14px; font-weight: bold; line-height: 21px; color: rgb(17, 17, 17);"><a href="https://www.clarin.si/info/k-centre/" target="_blank" rel="noopener noreferrer" class="">CLASSLA: The Knowledge Centre for South Slavic Languages</a></td></tr></tbody></table></td></tr></tbody></table><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td height="10" class=""> </td></tr></tbody></table><table border="0" width="640" cellspacing="0" cellpadding="0" align="center" class="" style="width: 640px; min-width: 640px;"><tbody class=""><tr class=""><td align="center" class="" style="padding: 0px 40px;"><table border="0" width="100%" cellspacing="0" cellpadding="0" align="center" class=""><tbody class=""><tr class=""><td align="center" class=""><table border="0" width="267" cellspacing="0" cellpadding="0" align="left" class="" style="width: 267px; min-width: 267px;"><tbody class=""><tr class=""><td id="m_925030267947577449gmail-m_6504557075424313283gmail-m_-5089897522223699477footerText-10" align="left" class="" style="font-family: Poppins, sans-serif; font-size: 12px; line-height: 18px; color: rgb(17, 17, 17);"><p class="" style="margin-top: 0px; margin-bottom: 10px;"><a href="http://clarin.si/" target="_blank" rel="noopener noreferrer" class="">CLARIN.SI</a></p><p class="" style="margin-top: 0px; margin-bottom: 10px;">Jožef Stefan Institute</p><div 
class="" style="margin-top: 0px; margin-bottom: 0px;">Jamova cesta 39, Ljubljana<br class="">Slovenia</div></td></tr><tr class=""></tr></tbody></table><br class=""></td></tr></tbody></table></td></tr></tbody></table></td></tr></tbody></table></td></tr></tbody></table></td></tr></tbody></table></body></html>