feat: added auto duplicated issue and pr detector #402
Abhinandankaushik wants to merge 9 commits into apache:main
- Removed all .md doc files (DUPLICATE_DETECTION, HINDI_SUMMARY, etc.)
- Removed test scripts (test-local.sh, test-local.ps1)
- Simplified CONTRIBUTING.md duplicate detection section
- Keep only essential files for detection functionality
Pull request overview
This PR introduces GitHub Actions-based automation that detects potentially duplicate issues and pull requests, then labels and comments on them (and optionally closes "exact matches") to reduce duplicate reports in the repo.
Changes:
- Added a new CI workflow plus a Python-based duplicate detection script and YAML configuration.
- Added contributor documentation noting the duplicate detection behavior.
- Added/updated various site/i18n content files and a .gitattributes file for line-ending normalization.
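The label/comment/optional-close behavior described in the overview can be sketched as a small decision function. This is a minimal illustration, not the code in this PR: the `Thresholds` fields, their default values, and the action names are hypothetical, and the real script and config may be organized differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical settings; the PR's YAML config may use other names and values."""
    possible: float = 0.80     # assumed score at which an item is flagged
    exact: float = 0.98        # assumed score treated as an "exact match"
    close_exact: bool = False  # closing is optional, so it is off by default here


def decide_actions(score: float, t: Thresholds = Thresholds()) -> list[str]:
    """Map a similarity score in [0, 1] to the actions the bot would take."""
    if score < t.possible:
        return []  # not similar enough: leave the issue or PR alone
    actions = ["label", "comment"]
    if t.close_exact and score >= t.exact:
        actions.append("close")  # only when closing is explicitly enabled
    return actions
```

Keeping the close step behind an explicit flag means a false positive costs at most a label and a comment unless maintainers opt in to closing.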
Reviewed changes
Copilot reviewed 11 out of 15 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| .github/workflows/duplicate-detector.yml | New workflow to run duplicate detection on issue/PR open/reopen events. |
| .github/scripts/detect-duplicates.py | Implements similarity-based duplicate detection and applies labels/comments (optional close). |
| .github/scripts/requirements.txt | Python dependencies for the workflow. |
| .github/duplicate-detector-config.yml | Configures thresholds, labels, and other behavior for the detector. |
| CONTRIBUTING.md | Documents that automated duplicate detection is used. |
| .gitattributes | Enforces consistent line endings and marks binaries. |
| i18n/en-US/docusaurus-theme-classic/navbar.json | Adds English i18n strings for navbar. |
| i18n/en-US/docusaurus-theme-classic/footer.json | Adds English i18n strings for footer. |
| i18n/en-US/docusaurus-plugin-content-docs/current.json | Adds docs i18n strings for the current version. |
| i18n/en-US/docusaurus-plugin-content-blog/options.json | Adds blog i18n strings. |
| i18n/en-US/docusaurus-plugin-content-blog/authors.yml | Adds blog author metadata. |
| i18n/en-US/docusaurus-plugin-content-docs/current/.keep | Adds placeholder file to keep directory in git. |
| src/pages/home/css/tailwind.css | Formatting/normalization changes. |
| src/hooks/useAOS.tsx | Formatting/normalization changes. |
| src/constants/index.ts | Formatting/normalization changes. |
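For readers unfamiliar with similarity-based detection, the idea behind a script like detect-duplicates.py can be sketched with the standard library alone. This is an assumption-laden illustration: the PR's actual algorithm, dependencies, and thresholds are not shown on this page, and the function names and the 0.8 threshold below are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means identical after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_candidates(new_title: str, existing: dict[int, str],
                    threshold: float = 0.8) -> list[tuple[int, float]]:
    """Return (issue number, score) pairs at or above the threshold, best first.

    `existing` maps issue/PR numbers to titles; the threshold is an assumed value.
    """
    scored = [(number, similarity(new_title, title))
              for number, title in existing.items()]
    hits = [(number, score) for number, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

A character-level ratio like this is cheap but sensitive to wording; a real detector might compare issue bodies as well or use token-based measures.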
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 11 out of 15 changed files in this pull request and generated 5 comments.
hey @chaokunyang
Summary
This PR adds an automated mechanism to detect and handle duplicate issues and pull requests in the repository through the CI/CD workflow.
Problem
Currently, duplicate issues and pull requests are frequently created, which leads to:
There was no automated system in place to proactively detect and flag such duplicates.
What this PR does
This PR introduces an automated duplicate detection mechanism that:
possible-duplicate label to suspected duplicates.
Alternatives considered
The following alternatives were considered but rejected:
Additional context
Checklist
Fixed issue: #400