data-duplication

Here are 4 public repositories matching this topic...

Data-Duplication-Removal-using-Machine-Learning
Data-Duplication-Remover-ML

A machine-learning-based tool for detecting, analyzing, and removing duplicates in CSV datasets. It includes text similarity detection, numeric near-duplicate clustering, ML classification, visual analytics, and data cleaning, and provides both Streamlit and Flask apps with ngrok support for deployment.

  • Updated Dec 27, 2025
  • Jupyter Notebook
