Skip to content

add conversation history summarization#2730

Open
blublinsky wants to merge 1 commit intoopenshift:mainfrom
blublinsky:history-retrival
Open

add conversation history summarization#2730
blublinsky wants to merge 1 commit intoopenshift:mainfrom
blublinsky:history-retrival

Conversation

@blublinsky
Copy link
Contributor

@blublinsky blublinsky commented Feb 3, 2026

Description

This PR is implemented as 2 commits

  1. prepares for the implementation of conversation summarization
  2. Actual summarization implementation
  3. Summarization e2e test

What this implementation is missing:

  1. Configuration for entries_to_keep - can be exposed in configuration. Is it important? now its 5
  2. Asynchronous summarization - performance optimization, but can be quite complex

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

@openshift-ci openshift-ci bot requested review from raptorsun and xrajesh February 3, 2026 13:33
@openshift-ci
Copy link

openshift-ci bot commented Feb 3, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign xrajesh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@blublinsky blublinsky force-pushed the history-retrival branch 3 times, most recently from ceb314a to 9b9a793 Compare February 3, 2026 15:03
Copy link
Contributor

@onmete onmete left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of how the summarization should work is:

  1. Retrieve full conversation history (as we do today)
  2. At prompt preparation time (in _prepare_prompt(), where limit_conversation_history() is called), check if history fits in available tokens
  3. If it doesn't fit: summarize ALL messages via an LLM call, inject the summary into the system prompt
  4. Store the summary in cache, replacing the original history
  5. Next request: summary is retrieved as the "history" - it's small, always fits
  6. This is a simple "summarize everything when needed" approach - no need to be clever about which messages to keep.

Essentially, what we are looking for is to replace this line https://github.com/openshift/lightspeed-service/blob/main/ols/src/query_helpers/docs_summarizer.py#L236, with summarization feature.

The PR's approach tries to optimize before fetching (limit what we retrieve), but with summarization, this optimization becomes unnecessary.
Once we summarize, the history is replaced with a compact summary. There's no scenario where we have "too many messages to fetch" because either:

  • history hasn't been summarized yet (small enough to fetch)
  • history was summarized (only summary exists)

Is my understanding reasonable? Is there a scenario that discards this? Can we try to not add more responsibilities to (already bloated) docs summarizer? :)

@blublinsky
Copy link
Contributor Author

My understanding of how the summarization should work is:

  1. Retrieve full conversation history (as we do today)
    Not quite. We retrieve only partial, based on the token budget. When retrieving, we check whether the history is too large, which we will use as a signal for summarization.
  2. At prompt preparation time (in _prepare_prompt(), where limit_conversation_history() is called), check if history fits in available tokens
    It is checked immediately after retrieval
  3. If it doesn't fit: summarize ALL messages via an LLM call, inject the summary into the system prompt
    This is next step
  4. Store the summary in cache, replacing the original history
    next step
  5. Next request: summary is retrieved as the "history" - it's small, always fits
  6. This is a simple "summarize everything when needed" approach - no need to be clever about which messages to keep.

Essentially, what we are looking for is to replace this line https://github.com/openshift/lightspeed-service/blob/main/ols/src/query_helpers/docs_summarizer.py#L236, with summarization feature.

The PR's approach tries to optimize before fetching (limit what we retrieve), but with summarization, this optimization becomes unnecessary. Once we summarize, the history is replaced with a compact summary. There's no scenario where we have "too many messages to fetch" because either:
It actually is - its our defence mechanism

  • history hasn't been summarized yet (small enough to fetch)
  • history was summarized (only summary exists)

Is my understanding reasonable? Is there a scenario that discards this? Can we try to not add more responsibilities to (already bloated) docs summarizer? :)

summary.
What is done in this Pr:

  1. Optimization of read and getting the signal that the history is too large, instead of checking its size every time. it is moved to the doc summarizer, because it is a place where we compute the available token budget.
  2. The actual summarization is a simple async function that is trivial to implement based on this
  3. Split these 2 to make PR smaller

@blublinsky
Copy link
Contributor Author

/retest

@blublinsky blublinsky changed the title Refactor history retrieval in preparation to summarization add conversation history summarization Feb 4, 2026
Conversation history:
{full_conversation}

Summary:"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please is probably a waste of tokens :P

I found this prompt somewhere:

You are an expert conversation summarizer. Your job is to create detailed, comprehensive summaries of chat conversations.

Your summary should include:
- What were the main subjects covered?
- Any agreements, choices, or conclusions made
- Revealed preferences, likes, dislikes, or constraints
- Significant Q&A exchanges
- Tasks mentioned or to be completed

Be comprehensive but concise. Focus on information that would be valuable for continuing the conversation later. Write in a natural, narrative style that another AI can easily understand and use as context.

Do not include:
- Pleasantries or greetings unless they reveal something important
- Repetitive information

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 4, 2026
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 4, 2026
@blublinsky blublinsky force-pushed the history-retrival branch 4 times, most recently from 928ce14 to 760470a Compare February 6, 2026 09:15
@blublinsky
Copy link
Contributor Author

/retest

1 similar comment
@blublinsky
Copy link
Contributor Author

/retest

@blublinsky blublinsky force-pushed the history-retrival branch 4 times, most recently from 36c48ac to 6dd787b Compare February 10, 2026 13:37
@blublinsky
Copy link
Contributor Author

/retest

2 similar comments
@blublinsky
Copy link
Contributor Author

/retest

@blublinsky
Copy link
Contributor Author

/retest

@blublinsky
Copy link
Contributor Author

/override "ci/prow/ols-evaluation"
Granite issue

@openshift-ci
Copy link

openshift-ci bot commented Feb 13, 2026

@blublinsky: Overrode contexts on behalf of blublinsky: ci/prow/ols-evaluation

Details

In response to this:

/override "ci/prow/ols-evaluation"
Granite issue

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci
Copy link

openshift-ci bot commented Feb 13, 2026

@blublinsky: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants