@federetyk
Contributor

Addresses #33

Description

This PR generalizes dataset indexing within tasks from the Language enum to arbitrary string identifiers (dataset_id). The current architecture limits each task to at most one dataset per language, so a task cannot have multiple monolingual datasets for the same language, cross-lingual datasets, or multilingual datasets.

The refactor introduces a languages_to_dataset_ids() method with a default 1:1 mapping that preserves backward compatibility for existing tasks. Tasks that require more complex dataset structures can override this method to return custom identifiers. A new get_dataset_language() method maps datasets back to their language for proper per-language result aggregation, returning None for cross-lingual or multilingual datasets.
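As a rough illustration of the two hooks described above, here is a minimal sketch. It assumes the default mapping uses the language code as the dataset ID; the `Language` members, the stripped-down `Task` class, and `CrossLingualTask` are hypothetical stand-ins, not the code in this PR.

```python
from enum import Enum
from typing import Optional


class Language(Enum):
    # Stand-in for the project's Language enum; members are illustrative.
    EN = "en"
    ES = "es"


class Task:
    """Hypothetical base class showing only the two new hooks."""

    def languages_to_dataset_ids(self, languages: list[Language]) -> list[str]:
        # Assumed default 1:1 mapping: one dataset per language,
        # identified by the language code.
        return [lang.value for lang in languages]

    def get_dataset_language(self, dataset_id: str) -> Optional[Language]:
        # Under the assumed default mapping the dataset ID is a language code.
        # Anything that is not a known code is treated as cross-lingual or
        # multilingual and yields None.
        try:
            return Language(dataset_id)
        except ValueError:
            return None


class CrossLingualTask(Task):
    """Hypothetical task adding one cross-lingual dataset per ordered pair."""

    def languages_to_dataset_ids(self, languages: list[Language]) -> list[str]:
        mono = [lang.value for lang in languages]
        cross = [
            f"{a.value}-{b.value}"
            for a in languages
            for b in languages
            if a != b
        ]
        return mono + cross
```

With this sketch, `CrossLingualTask().languages_to_dataset_ids([Language.EN, Language.ES])` yields the two monolingual IDs followed by the pair IDs, and `get_dataset_language` returns `None` for the pair IDs so they stay out of per-language aggregation.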

Changes:

  • Rename lang_datasets: dict[Language, Dataset] to datasets: dict[str, Dataset] in Task base class
  • Add languages_to_dataset_ids(languages) -> list[str] method with default backward-compatible mapping
  • Rename load_monolingual_data(language, split) to load_dataset(dataset_id, split) across all tasks
  • Add get_dataset_language(dataset_id) -> Language | None method for per-language aggregation
  • Add language field to MetricsResult to track dataset language
  • Update _aggregate_per_language() to group by the language field, skipping datasets marked as cross-lingual or multilingual
  • Update all task implementations to use the new method signature
  • Add unit test for multi-dataset task scenarios
  • Fix minor issues in some files in examples/
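
The per-language aggregation change listed above could look roughly like the sketch below. It is not the PR's implementation: the trimmed-down `MetricsResult`, its `score` field, the free function `aggregate_per_language`, and the choice of the arithmetic mean are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class MetricsResult:
    # Hypothetical, trimmed-down stand-in for the project's MetricsResult.
    dataset_id: str
    score: float
    # None marks a cross-lingual or multilingual dataset.
    language: Optional[str] = None


def aggregate_per_language(results: list[MetricsResult]) -> dict[str, float]:
    """Average scores per language, skipping results without a language."""
    grouped: dict[str, list[float]] = defaultdict(list)
    for result in results:
        if result.language is None:
            # Cross-lingual and multilingual datasets are excluded here.
            continue
        grouped[result.language].append(result.score)
    return {lang: mean(scores) for lang, scores in grouped.items()}
```

Grouping on the `language` field rather than on the dataset ID is what lets several datasets for the same language contribute to one per-language figure.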

All tests pass, and examples/run_multiple_models.py produces results consistent with the main branch.

Checklist

  • Added new tests for new functionality
  • Tested locally with example tasks
  • Code follows project style guidelines
  • Documentation updated
  • No new warnings introduced
