
Cannot interpret fed results #4

@Youmna-H

Description


Hello,
I've run the example provided in fed_demo.py and I am having difficulty interpreting the results. The scores I got are:

{'interesting': -0.28983132044474313, 'engaging': -0.40840943654378226, 'specific': -0.22960980733235647, 'relevant': 7.028880596160889, 'correct': 7.123448371887207, 'semantically appropriate': 0.2597320079803467, 'understandable': 0.21886169910430908, 'fluent': 0.23042782147725394, 'coherent': 7.030221144358317, 'error recovery': 6.849422454833984, 'consistent': 7.3398823738098145, 'diverse': 7.251625696818034, 'depth': 7.140579700469971, 'likeable': -0.23120896021525006, 'understand': 7.056127548217773, 'flexible': -0.09475564956665039, 'informative': -0.16989962259928415, 'inquisitive': -0.34922027587890625}

In the paper, all the scores are in the ranges 1-3, 0-1, or 1-5. For the scores above, I don't know what the upper and lower bounds are. Also, in fed.py the score is calculated by: scores[metric] = (low_score - high_score)
Does this mean that a negative score is a good thing? Please advise on how to interpret these scores.
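Since scores[metric] = (low_score - high_score) is an unbounded difference of model likelihoods, one workaround (an assumption on my part, not something from the paper) is to treat the raw values as only relatively meaningful and min-max normalize each metric across several dialogs, so every metric lands on a common 0-1 scale. The helper below is a hypothetical sketch of that idea; normalize_per_metric is not part of fed.py:

```python
# Hypothetical helper (not part of fed.py): min-max normalize FED-style
# scores to [0, 1] per metric across a set of dialogs. This assumes that,
# within a single metric, a larger (low_score - high_score) difference
# means a better rating; the raw values have no fixed bounds.

def normalize_per_metric(score_dicts):
    """Min-max normalize each metric across several dialogs' score dicts."""
    metrics = score_dicts[0].keys()
    normalized = [dict(d) for d in score_dicts]
    for m in metrics:
        vals = [d[m] for d in score_dicts]
        lo, hi = min(vals), max(vals)
        span = (hi - lo) or 1.0  # avoid division by zero if all values are equal
        for d in normalized:
            d[m] = (d[m] - lo) / span
    return normalized

# Example with two dialogs' (truncated) score dicts:
dialog_a = {"interesting": -0.29, "relevant": 7.03}
dialog_b = {"interesting": 0.10, "relevant": 6.50}
norm_a, norm_b = normalize_per_metric([dialog_a, dialog_b])
print(norm_a)  # {'interesting': 0.0, 'relevant': 1.0}
```

With this view, a negative raw score is not necessarily bad on its own; what matters is how it compares to the same metric computed on other dialogs.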
