I know that the following can calculate the loss.

However, why should `labels` be `input_ids`? After reading the paper, I think the code should be:
def calculatePerplexity(sentence, model1, model2, tokenizer):
    """
    exp(loss)
    """
    input_ids = torch.tensor(tokenizer.encode(sentence)).unsqueeze(0)
    input_ids = input_ids.to(device)
    output_ids = model2.generate(input_ids, **gen_kwargs).to(device)
    with torch.no_grad():
        outputs = model1(input_ids, labels=output_ids)
    loss, logits = outputs[:2]
    return torch.exp(loss)
This would test whether the outputs of the two models are the same. If they differ, maybe one of the models has memorized the training data.
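For context on the original `labels=input_ids` pattern: Hugging Face causal LMs shift the labels internally by one position, so passing the input as its own labels yields the average next-token cross-entropy, whose exponential is the perplexity. The labels are not meant to come from a second model's `generate` output (which would generally have a different length and break the loss computation). A minimal sketch of the internal shift, with random logits standing in for a real model's output:

```python
import torch
import torch.nn.functional as F

# Sketch of what model(input_ids, labels=input_ids) does internally in a
# Hugging Face causal LM: shift logits/labels by one position and compute
# next-token cross-entropy. Shapes and values here are illustrative.
torch.manual_seed(0)
vocab_size, seq_len = 50, 8
input_ids = torch.randint(vocab_size, (1, seq_len))
logits = torch.randn(1, seq_len, vocab_size)  # stand-in for model output

# Position t predicts token t+1, so drop the last logit and the first label.
shift_logits = logits[:, :-1, :]
shift_labels = input_ids[:, 1:]
loss = F.cross_entropy(
    shift_logits.reshape(-1, vocab_size), shift_labels.reshape(-1)
)
perplexity = torch.exp(loss)  # exp(loss), as in calculatePerplexity
```

This is why `labels=input_ids` is the standard way to score a sentence under a language model: it measures how well the model predicts each token of the sentence itself.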