The noise filter only contained English patterns, causing Chinese greetings, denials, and meta-questions to pass through unfiltered and pollute the memory database.

Add Chinese patterns for all three categories:
- Denial: 我不记得, 找不到相关记忆, 我没有相关信息, etc.
- Meta-question: 你还记得吗, 我之前提到过, etc.
- Boilerplate: 你好, 早上好, 好的, 谢谢, etc.

Use a length-gated SHORT_BOILERPLATE_PATTERNS strategy to prevent false positives: acknowledgment prefixes (好的, 谢谢) are only filtered when total text ≤ 10 chars, so "谢谢你的帮助" is noise but "谢谢分享,我觉得这个思路很好" is kept.

Includes test with 52 cases (38 noise + 8 false-positive guards + 6 integration).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PR #49 Review Summary
Tested on Gateway 2026.3.2 + claude-opus-4-6. Unit tests 52/52 pass, runtime memory_store verified on Discord. One issue to fix before merge:
Everything else looks good. The SHORT_BOILERPLATE + length-gate design is clever and well-tested. Approve after the regex fix.
Closes #48
Problem
noise-filter.ts only contains English patterns. Chinese greetings (你好), denials (我不记得), meta-questions (你还记得吗), and acknowledgments (好的, 谢谢) pass through unfiltered and get stored as memories, polluting retrieval results over time.
This is especially impactful with autoCapture enabled, as every 好的 and 谢谢 becomes a permanent memory entry.
Solution
Add Chinese patterns to all three filter categories:
False-positive prevention
Chinese acknowledgment words (好的, 谢谢) often appear at the start of meaningful sentences:
- 好的方案是使用Redis做缓存层 ("A good approach is to use Redis as the cache layer"): should NOT be filtered
- 谢谢分享,我觉得这个思路很好 ("Thanks for sharing, I think this approach is great"): should NOT be filtered
Solution: length-gated SHORT_BOILERPLATE_PATTERNS. Prefix matches like 好的 / 谢谢 are only treated as noise when total text length ≤ 10 characters. Short = filler, long = real content.
Test Results
52/52 tests pass:
Changes
- src/noise-filter.ts: Add Chinese patterns + SHORT_BOILERPLATE_PATTERNS with length gate
- test/noise-filter-chinese.mjs: New test with 52 cases covering all categories + false-positive protection
No new dependencies. Existing npm test passes.
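For reviewers who want the gist without opening the diff, the length gate described under False-positive prevention can be sketched roughly as below. This is a minimal illustration and not the code in src/noise-filter.ts: only the name SHORT_BOILERPLATE_PATTERNS and the 10-character threshold come from this PR, while BOILERPLATE_PATTERNS, SHORT_BOILERPLATE_MAX_LENGTH, isBoilerplate, and the abbreviated regexes are assumptions made for the example.

```typescript
// Hypothetical sketch of the length-gated boilerplate check.
// Pattern lists are abbreviated; the real filter covers more phrases.

// Whole-text matches: the pattern must cover the entire message,
// so these are treated as noise regardless of any length gate.
const BOILERPLATE_PATTERNS: RegExp[] = [/^(你好|早上好)[!!。.]*$/];

// Prefix matches: acknowledgment words that can also begin real content.
const SHORT_BOILERPLATE_PATTERNS: RegExp[] = [/^好的/, /^谢谢/];

// Threshold stated in the PR: total text length <= 10 characters.
const SHORT_BOILERPLATE_MAX_LENGTH = 10;

function isBoilerplate(text: string): boolean {
  const trimmed = text.trim();

  if (BOILERPLATE_PATTERNS.some((pattern) => pattern.test(trimmed))) {
    return true;
  }

  // Assumption: length is measured in UTF-16 code units, which equals the
  // character count for the common CJK characters used in these examples.
  const isShort = trimmed.length <= SHORT_BOILERPLATE_MAX_LENGTH;
  return isShort && SHORT_BOILERPLATE_PATTERNS.some((pattern) => pattern.test(trimmed));
}
```

Under these assumptions, isBoilerplate("谢谢你的帮助") returns true because the text is 6 characters long, while isBoilerplate("谢谢分享,我觉得这个思路很好") and isBoilerplate("好的方案是使用Redis做缓存层") return false because both exceed the 10-character gate even though they start with an acknowledgment word.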