From taja.kuzman at ijs.si Fri Oct 17 11:13:04 2025
From: taja.kuzman at ijs.si (Taja Kuzman Pungeršek)
Date: Fri, 17 Oct 2025 11:13:04 +0200
Subject: [CLASSLA] Parliamentary ParlaCAP Dataset and CAP Topic Classifier
Message-ID:

CLASSLA Mailing List

CLARIN.SI is pleased to announce the release of the ParlaCAP dataset: an extension of the ParlaMint 5.0 collection enriched with sentiment and topic annotations, as well as extended metadata on parties and democracies. The dataset contains around 8 million speeches from 28 European parliaments and is provided in a tabular format, which makes the ParlaMint corpora easier to use for social and political science research.

As part of the OSCARS ParlaCAP project, the dataset was published through the Croatian CESSDA node CROSSDA, thereby promoting collaboration between infrastructures.

We have also released a multilingual topic classifier that uses the CAP (Comparative Agendas Project) labels, along with tutorials for analysing ParlaCAP data in Python.

More information is available here.

CLASSLA: The Knowledge Centre for South Slavic Languages
CLARIN.SI
Jožef Stefan Institute
Jamova cesta 39, Ljubljana
Slovenia