RhizoNymph/kpe-bench

kpe-bench

This is a benchmark for keyphrase extraction using LLMs. It uses the Krapivin dataset from here, which was obtained from the Hugging Face dataset midas/krapivin.

It is still in development and runs are in progress. It will only be run locally and on cheap models, because benchmarking is expensive when the input dataset is 27M tokens across 2.6k messages.

The benchmark computes the pairwise similarity of the generated and ground-truth keyphrases. It embeds the keyphrases, then uses Hungarian optimal matching on cosine similarity to pair each generated keyphrase with a ground-truth keyphrase, because what I care about is semantic similarity, not exact matches.
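As a rough illustration of the matching step described above, here is a minimal sketch, not the repository's actual code. It assumes the keyphrases have already been embedded into vectors by some model (the embedding step is omitted), and it uses SciPy's `linear_sum_assignment`, which solves the same optimal assignment problem the Hungarian algorithm does. The function names `cosine_similarity_matrix` and `matched_similarity`, and the choice to report the mean similarity over matched pairs, are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def cosine_similarity_matrix(pred_emb, gold_emb):
    """Return the (n_pred, n_gold) matrix of cosine similarities."""
    pred = np.asarray(pred_emb, dtype=float)
    gold = np.asarray(gold_emb, dtype=float)
    # Normalize each row to unit length so the dot product is the cosine.
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    gold = gold / np.linalg.norm(gold, axis=1, keepdims=True)
    return pred @ gold.T


def matched_similarity(pred_emb, gold_emb):
    """Pair predictions with gold keyphrases one-to-one, maximizing total similarity.

    Returns the matched (pred_index, gold_index) pairs and the mean
    similarity over those pairs (hypothetical aggregation for this sketch).
    """
    sim = cosine_similarity_matrix(pred_emb, gold_emb)
    # Rectangular inputs are allowed: only min(n_pred, n_gold) pairs are matched.
    rows, cols = linear_sum_assignment(sim, maximize=True)
    pairs = [(int(r), int(c)) for r, c in zip(rows, cols)]
    return pairs, float(sim[rows, cols].mean())


# Toy 2-D vectors standing in for real keyphrase embeddings.
pred = [[1.0, 0.0], [0.0, 1.0]]
gold = [[0.0, 2.0], [3.0, 0.0], [1.0, 1.0]]
pairs, score = matched_similarity(pred, gold)
```

Because the matching is one-to-one, a generated keyphrase cannot earn credit by being close to several ground-truth keyphrases at once. In this sketch, unmatched keyphrases on the longer side are simply ignored; a real scorer might penalize them instead.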

About

Benchmark for keyphrase extraction using LLMs, plus an Unsloth GRPO trainer for optimizing on the benchmark.
