
A question about calculatePerplexity #5

@Ethan-Chen-plus

I know that the following code can calculate the loss:
[screenshot of the original calculatePerplexity implementation, which calls model(input_ids, labels=input_ids)]
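For reference, the screenshot presumably shows something along the following lines. This is a sketch reconstructed from the proposed variant below, not the verbatim original, and it assumes device is defined elsewhere in the script.

import torch

def calculatePerplexity(sentence, model, tokenizer):
    """
    exp(loss)
    """
    # Encode the sentence and add a batch dimension
    input_ids = torch.tensor(tokenizer.encode(sentence)).unsqueeze(0)
    input_ids = input_ids.to(device)  # device assumed to be defined elsewhere
    with torch.no_grad():
        # labels=input_ids: the model shifts the labels internally, so this
        # computes the next-token cross-entropy on the sentence itself
        outputs = model(input_ids, labels=input_ids)
    loss, logits = outputs[:2]
    return torch.exp(loss)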
However, why are the labels set to input_ids? After reading the paper, I think the code should perhaps be:

def calculatePerplexity(sentence, model1, model2, tokenizer):
    """
    exp(loss)
    """
    input_ids = torch.tensor(tokenizer.encode(sentence)).unsqueeze(0)
    input_ids = input_ids.to(device)
    # Generate a continuation with model2 to use as the target sequence
    output_ids = model2.generate(input_ids, **gen_kwargs).to(device)
    with torch.no_grad():
        # Score model1 against model2's output instead of the input itself
        outputs = model1(input_ids, labels=output_ids)
    loss, logits = outputs[:2]
    return torch.exp(loss)

This would let us test whether the outputs of the two models are the same; if they differ, one of the models may have memorized the training data.
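As a rough illustration of that comparison, here is a minimal sketch that greedily generates from both models on the same prompt and checks whether the continuations match. The model names, prompt, and generation settings are placeholders, not the repository's actual setup.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder checkpoints; substitute the two models under comparison
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model1 = AutoModelForCausalLM.from_pretrained("gpt2").to(device)
model2 = AutoModelForCausalLM.from_pretrained("gpt2-medium").to(device)

prompt = "Example prompt from the candidate training data"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

gen_kwargs = {"max_new_tokens": 32, "do_sample": False}  # greedy decoding

with torch.no_grad():
    out1 = model1.generate(input_ids, **gen_kwargs)
    out2 = model2.generate(input_ids, **gen_kwargs)

# If the continuations differ, one model may be reproducing memorized text
same = out1.shape == out2.shape and torch.equal(out1, out2)
print("identical continuations:", same)
print("model1:", tokenizer.decode(out1[0], skip_special_tokens=True))
print("model2:", tokenizer.decode(out2[0], skip_special_tokens=True))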
