A Python toolkit for analyzing machine learning models and datasets.
Updated Sep 8, 2023 · Python
Final-year project on deleting duplicated data using machine learning, with source code and report.
Data quality analysis of the DermaMNIST (MedMNIST), HAM10000, and Fitzpatrick17k datasets.
A machine-learning-based tool for detecting, analyzing, and removing duplicates in CSV datasets. Includes text similarity detection, numeric near-duplicate clustering, ML classification, visual analytics, and data cleaning. Ships both Streamlit and Flask apps, with ngrok support for easy deployment.
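As a rough illustration of what text-similarity duplicate detection on CSV rows can look like, here is a minimal sketch using only the Python standard library. It is not taken from the repository above: the function name `find_duplicates`, the `threshold` parameter, and the sample data are all hypothetical, and `difflib.SequenceMatcher` stands in for whatever similarity measure the tool actually uses.

```python
import csv
import io
from difflib import SequenceMatcher


def find_duplicates(rows, threshold=0.9):
    """Return (exact, near) lists of row-index pairs (i, j) with i < j.

    Hypothetical helper for illustration; `threshold` is an assumed cutoff.
    """
    # Normalize each row into a single lowercase string for comparison.
    keys = [" ".join(cell.strip().lower() for cell in row) for row in rows]
    exact, near = [], []
    # Pairwise comparison is O(n^2), which is fine for a small sketch only.
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            if keys[i] == keys[j]:
                exact.append((i, j))
            elif SequenceMatcher(None, keys[i], keys[j]).ratio() >= threshold:
                near.append((i, j))
    return exact, near


# Made-up sample data, read the same way a CSV file would be.
sample = io.StringIO(
    "name,city\n"
    "Alice Smith,Berlin\n"
    "alice smith,Berlin\n"
    "Alice Smyth,Berlin\n"
    "Bob Jones,Paris\n"
)
reader = csv.reader(sample)
header = next(reader)  # skip the header row
rows = list(reader)

exact, near = find_duplicates(rows)
print("exact duplicates:", exact)
print("near duplicates:", near)
```

Rows that match after normalization are reported as exact duplicates, and rows whose similarity ratio reaches the threshold are reported as near duplicates. A real tool would likely need blocking or clustering to avoid comparing every pair on large files.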