@ggerganov
Member

rel #19305

Based on my analysis in #19305 (comment), I am just restoring the old chunking logic. This seems to resolve the reported issue.

Note that I am blindly restoring the old code and still haven't understood in detail what this logic actually does. An extra look, logprobs tests, and validation are needed before we merge this. Draft for now.

@github-actions (bot) added the "model" (Model specific) label on Feb 4, 2026
@ggerganov
Member Author

Superseded by #19324

@ggerganov closed this on Feb 4, 2026
ggml_row_size(core_attn_out->type, S_v),
ggml_row_size(core_attn_out->type, S_v * chunk_size * n_chunks),
ggml_row_size(core_attn_out->type, S_v * chunk_size * n_chunks * H_v), 0);
output_tokens = ggml_cont(ctx0, output_tokens);
Member Author

@ngxson Btw, this `ggml_cont` still seems redundant.

Anyway, not very important. I think there is a lot to improve in this graph - will take a look in the next few days.
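
For readers following the stride arguments in the snippet above, here is a minimal sketch of how one could check whether a view with those strides is already densely packed. It is not ggml code: `View4D`, `make_output_view`, and `is_packed` are hypothetical names, and the view shape `(S_v, n_tokens, H_v, n_seqs)` is an assumption, since the visible lines only show the stride arguments. It also assumes a non-quantized element type, so a row of `n` elements takes `n * elem_size` bytes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 4D tensor view (not the ggml struct):
// ne[i] is the element count and nb[i] the byte stride of dimension i.
struct View4D {
    std::array<int64_t, 4> ne;
    std::array<size_t, 4>  nb;
};

// Build byte strides in the same pattern as the snippet above:
//   nb1 = row of S_v elements
//   nb2 = row of S_v * chunk_size * n_chunks elements
//   nb3 = row of S_v * chunk_size * n_chunks * H_v elements
// The shape (S_v, n_tokens, H_v, n_seqs) is assumed for illustration.
View4D make_output_view(int64_t S_v, int64_t n_tokens, int64_t H_v, int64_t n_seqs,
                        int64_t chunk_size, int64_t n_chunks, size_t elem_size) {
    const size_t padded_tokens = static_cast<size_t>(chunk_size * n_chunks);
    const size_t nb1 = elem_size * static_cast<size_t>(S_v);
    const size_t nb2 = nb1 * padded_tokens;
    const size_t nb3 = nb2 * static_cast<size_t>(H_v);

    View4D v;
    v.ne = {S_v, n_tokens, H_v, n_seqs};
    v.nb = {elem_size, nb1, nb2, nb3};
    return v;
}

// A view is densely packed when each stride equals the previous stride
// multiplied by the previous dimension's element count.
bool is_packed(const View4D & v, size_t elem_size) {
    size_t expected = elem_size;
    for (int i = 0; i < 4; ++i) {
        if (v.nb[i] != expected) {
            return false;
        }
        expected *= static_cast<size_t>(v.ne[i]);
    }
    return true;
}
```

Under these assumptions the view is packed only when `n_tokens == chunk_size * n_chunks`, i.e. when no padding tokens are trimmed by the view. Whether a copy is actually required in the trimmed case depends on what the downstream ops accept, which this sketch does not model.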
