add conversation history summarization #2730
blublinsky wants to merge 1 commit into openshift:main
onmete
left a comment
My understanding of how the summarization should work is:
- Retrieve full conversation history (as we do today)
- At prompt preparation time (in _prepare_prompt(), where limit_conversation_history() is called), check if history fits in available tokens
- If it doesn't fit: summarize ALL messages via an LLM call, inject the summary into the system prompt
- Store the summary in cache, replacing the original history
- Next request: summary is retrieved as the "history" - it's small, always fits
- This is a simple "summarize everything when needed" approach - no need to be clever about which messages to keep.
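The steps above can be sketched roughly as follows. This is only an illustration of the "summarize everything when needed" flow, not the PR's code: `prepare_history`, `count_tokens`, the dict-based cache, and the `summarize` callback are all hypothetical names, and the whitespace token count stands in for a real tokenizer.

```python
from typing import Callable, Dict, List


def count_tokens(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def prepare_history(
    conversation_id: str,
    cache: Dict[str, List[str]],
    available_tokens: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Return history that fits the budget, summarizing everything if it does not.

    `cache` is a plain dict standing in for the conversation cache, and
    `summarize` stands in for the LLM call; both are assumptions.
    """
    # Step 1: retrieve the full conversation history.
    history = cache.get(conversation_id, [])

    # Step 2: check whether the history fits in the available tokens.
    used = sum(count_tokens(message) for message in history)
    if used <= available_tokens:
        return history

    # Step 3: it does not fit, so summarize ALL messages in one call.
    summary = summarize(history)

    # Step 4: store the summary in the cache, replacing the original history.
    cache[conversation_id] = [summary]

    # Step 5: on the next request the summary is retrieved as the "history".
    return [summary]
```

In this sketch the caller decides where the returned summary goes (for example, into the system prompt); the function only handles the fit check and the cache replacement.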
Essentially, what we are looking for is to replace this line, https://github.com/openshift/lightspeed-service/blob/main/ols/src/query_helpers/docs_summarizer.py#L236, with a summarization feature.
The PR's approach tries to optimize before fetching (limit what we retrieve), but with summarization, this optimization becomes unnecessary.
Once we summarize, the history is replaced with a compact summary. There's no scenario where we have "too many messages to fetch" because either:
- history hasn't been summarized yet (small enough to fetch)
- history was summarized (only summary exists)
Is my understanding reasonable? Is there a scenario that invalidates it? Can we try not to add more responsibilities to the (already bloated) docs summarizer? :)
    Conversation history:
    {full_conversation}

    Summary:"""
The word "Please" in the prompt is probably a waste of tokens :P
I found this prompt somewhere:
You are an expert conversation summarizer. Your job is to create detailed, comprehensive summaries of chat conversations.
Your summary should include:
- What were the main subjects covered?
- Any agreements, choices, or conclusions made
- Revealed preferences, likes, dislikes, or constraints
- Significant Q&A exchanges
- Tasks mentioned or to be completed
Be comprehensive but concise. Focus on information that would be valuable for continuing the conversation later. Write in a natural, narrative style that another AI can easily understand and use as context.
Do not include:
- Pleasantries or greetings unless they reveal something important
- Repetitive information
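As a rough sketch, the suggested wording could be wired into a template like the one below. Joining it with the `Conversation history:` / `{full_conversation}` / `Summary:` tail from the diff is an assumption, and `SUMMARY_PROMPT_TEMPLATE` and `build_summary_prompt` are hypothetical names, not identifiers from the PR.

```python
from typing import List

# Hypothetical template: the reviewer's suggested wording followed by the
# "Conversation history: ... Summary:" tail shown in the diff.
SUMMARY_PROMPT_TEMPLATE = """You are an expert conversation summarizer. Your job is to create detailed, comprehensive summaries of chat conversations.

Your summary should include:
- What were the main subjects covered?
- Any agreements, choices, or conclusions made
- Revealed preferences, likes, dislikes, or constraints
- Significant Q&A exchanges
- Tasks mentioned or to be completed

Be comprehensive but concise. Focus on information that would be valuable for continuing the conversation later. Write in a natural, narrative style that another AI can easily understand and use as context.

Do not include:
- Pleasantries or greetings unless they reveal something important
- Repetitive information

Conversation history:
{full_conversation}

Summary:"""


def build_summary_prompt(messages: List[str]) -> str:
    """Fill the template with the conversation, one message per line."""
    full_conversation = "\n".join(messages)
    # str.format only parses the template, so braces inside messages are safe.
    return SUMMARY_PROMPT_TEMPLATE.format(full_conversation=full_conversation)
```

Keeping the template as a module-level constant makes it easy to compare token cost between wordings, which is the concern raised above.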
Description
This PR is implemented as 2 commits
What this implementation is missing:
Type of change
Related Tickets & Documents
OLS-2500
Checklist before requesting a review
Testing