
feat: add new metric: Semantic R-precision #3

Open
shamira-venturini wants to merge 12 commits into uclanlp:main from shamira-venturini:main

Conversation

@shamira-venturini

This pull request introduces Semantic R-Precision (SemR-p), a novel keyphrase evaluation metric designed to jointly assess semantic relevance and ranking quality. The metric, its motivation, and comprehensive evaluation are detailed in our paper: https://www.researchgate.net/publication/391552955_Meaning_in_Order_Order_in_Meaning_Semantic_R-precision_for_Keyphrase_Evaluation.

Summary of Changes:

  • Implemented Semantic R-Precision (SemR-p):
    • Added the core logic for SemR-p calculation within metrics/semantic_matching_metric.py.
    • SemR-p builds on the R-Precision framework and on SemP, SemR, and SemF1, combining exact stem matching with semantic similarity scoring (using Sentence Transformers and averaging over the top_k references).
    • It is calculated when the semantic_matching metric group is run and uses the new top_k parameter (defaulting to 3) in the .gin config for SemanticMatchingMetric.
  • Added Data Retrieval Utility:
    • Included a new script, doc_retriever.py, to facilitate loading the source, target, and prediction data for a specific document example by specifying the dataset, model, and doc_id, aiding qualitative analysis.
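The exact SemR-p formula is defined in the paper; purely as an illustrative sketch, the computation described above (cut the ranked predictions at R, score them by semantic similarity, average over top_k references) might look roughly like this. The function name, the precomputed-embedding interface, and the top-k aggregation choice are assumptions here, not KPEval's actual API:

```python
import numpy as np

def cosine_sim(a, b):
    """Row-wise cosine similarity between two embedding matrices."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def semantic_r_precision(pred_emb, ref_emb, top_k=3):
    """Hypothetical sketch: score the top-R ranked predictions
    (R = number of references) by the mean similarity to each
    prediction's top_k closest references."""
    R = ref_emb.shape[0]
    top_preds = pred_emb[:R]               # keep only the top-R predictions
    sims = cosine_sim(top_preds, ref_emb)  # (n_preds, n_refs) similarity matrix
    k = min(top_k, R)
    # for each prediction, average its k highest reference similarities
    topk_mean = np.sort(sims, axis=1)[:, -k:].mean(axis=1)
    return float(topk_mean.mean())
```

In the actual implementation the embeddings would come from the shared Sentence Transformer model rather than being passed in precomputed.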

Design Rationale for SemR-p Integration:

SemR-p was integrated into metrics/semantic_matching_metric.py to efficiently reuse the existing Sentence Transformer embedding model infrastructure and to align with KPEval's current approach of grouping metrics that share the same underlying semantic similarity calculation.

How to Use SemR-p (in this fork):

  • When running run_evaluation.py with metric_id='semantic_matching', SemR-p scores are output under the key semantic_r_precision.
  • The top_k parameter for SemR-p can be configured in the .gin file via the SemanticMatchingMetric.top_k setting.
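For example, assuming the fork keeps gin's standard `Class.parameter` binding syntax, the configuration line would look like:

```gin
# Hypothetical excerpt; the exact file layout follows the fork's .gin config.
SemanticMatchingMetric.top_k = 3
```

With this binding in place, the metric averages semantic similarity over the 3 closest references when computing semantic_r_precision.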

We believe these additions will be valuable to the KPEval toolkit and the broader keyphrase evaluation community.

