This repository provides the source code for the Irapuarani team's participation in SemEval-2025 Task 10, Subtask 2.
In accordance with the data agreement terms for this task, as stated below, we do not release any data.
However, we make our source code and models available through this repository.
The dataset may include content which is protected by copyright of third parties. It may only be used in the context of this shared task, and only for scientific research purposes. The dataset may not be redistributed or shared in part or full with any third party. You may not share your passcode with others or give access to the dataset to unauthorised users. Any other use is explicitly prohibited.
All models are available in this Hugging Face collection.
Gabriel Assis, Lívia de Azevedo, João Vitor de Moraes, Laura Alvarenga, and Aline Paes. 2025. Irapuarani at SemEval-2025 Task 10: Evaluating Strategies Combining Small and Large Language Models for Multilingual Narrative Detection. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 38–48, Vienna, Austria. Association for Computational Linguistics.
@inproceedings{assis-etal-2025-irapuarani,
title = "Irapuarani at {S}em{E}val-2025 Task 10: Evaluating Strategies Combining Small and Large Language Models for Multilingual Narrative Detection",
author = "Assis, Gabriel and
de Azevedo, L{\'i}via and
  de Moraes, Jo{\~a}o Vitor and
Alvarenga, Laura and
Paes, Aline",
editor = "Rosenthal, Sara and
Ros{\'a}, Aiala and
Ghosh, Debanjan and
Zampieri, Marcos",
booktitle = "Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.semeval-1.7/",
pages = "38--48",
ISBN = "979-8-89176-273-2",
abstract = "This paper presents the Irapuarani team{'}s participation in SemEval-2025 Task 10, Subtask 2, which focuses on hierarchical multi-label classification of narratives from online news articles. We explored three distinct strategies: (1) a direct classification approach using a multilingual Small Language Model (SLM), disregarding the hierarchical structure; (2) a translation-based strategy where texts from multiple languages were translated into a single language using a Large Language Model (LLM), followed by classification with a monolingual SLM; and (3) a hybrid strategy leveraging an SLM to filter domains and an LLM to assign labels while accounting for the hierarchy. We conducted experiments on datasets in all available languages, namely Bulgarian, English, Hindi, Portuguese and Russian. Our results show that Strategy 2 is the most generalizable across languages, achieving test set rankings of 21st in English, 9th in Portuguese and Russian, 7th in Bulgarian, and 10th in Hindi."
}
This research was financed by CNPq (National Council for Scientific and Technological Development), grant 307088/2023-5; FAPERJ (Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro), processes SEI-260003/002930/2024 and SEI-260003/000614/2023; and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Brasil (CAPES), Finance Code 001. This work was also supported by compute credits from a Cohere Labs Research Grant. These grants are intended to support academic partners conducting research aimed at releasing scientific artifacts and data for socially beneficial projects.