BaseMax/qt-dataset-explorer

Qt Dataset Explorer

An interactive desktop GUI application for exploring datasets and performing statistical analysis, built with Python, PyQt5, and scientific libraries (pandas, matplotlib, seaborn, scipy).

Features

  • Dataset Loading: Support for CSV and Excel files
  • Dataset Preview: Interactive table view with up to 1000 rows displayed
  • Data Filtering: Apply filters using various operators (==, !=, >, <, >=, <=, contains, startswith, endswith)
  • Descriptive Statistics: Comprehensive statistical analysis including:
    • Mean, median, standard deviation, min, max
    • Variance, skewness, kurtosis
    • Missing value counts
    • Categorical value distributions
  • Visualization: Multiple plot types:
    • Histograms
    • Box plots
    • Scatter plots
    • Correlation heatmaps
    • Bar plots
  • Hypothesis Testing: Statistical tests including:
    • Independent T-Test
    • Paired T-Test
    • Chi-Square Test
    • ANOVA
    • Correlation Test (Pearson & Spearman)
  • Export Tools:
    • Export filtered data to CSV/Excel
    • Export statistics to text file
    • Save plots as PNG/PDF
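
The filter operators listed under Data Filtering map naturally onto pandas expressions. The following is a minimal sketch of that idea, not the application's actual code: the `OPERATORS` table, the `apply_filter` helper, and the example rows are all hypothetical names and data introduced for illustration.

```python
import pandas as pd

# Hypothetical mapping from the operator labels in the feature list to
# pandas expressions; the application's own implementation may differ.
OPERATORS = {
    "==": lambda s, v: s == v,
    "!=": lambda s, v: s != v,
    ">": lambda s, v: s > v,
    "<": lambda s, v: s < v,
    ">=": lambda s, v: s >= v,
    "<=": lambda s, v: s <= v,
    # Text operators compare against the string form of each cell.
    "contains": lambda s, v: s.astype(str).str.contains(str(v), regex=False),
    "startswith": lambda s, v: s.astype(str).str.startswith(str(v)),
    "endswith": lambda s, v: s.astype(str).str.endswith(str(v)),
}


def apply_filter(df, column, op, value):
    """Return the rows of df for which `column <op> value` holds."""
    if op not in OPERATORS:
        raise ValueError(f"Unsupported operator: {op}")
    mask = OPERATORS[op](df[column], value)
    return df[mask]


# Invented rows for illustration only; not taken from sample_data.csv.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carla"],
    "Age": [30, 45, 27],
    "Department": ["Sales", "Engineering", "Sales"],
})

older = apply_filter(df, "Age", ">=", 30)
sales = apply_filter(df, "Department", "startswith", "Sal")
print(older)
print(sales)
```

A value typed into a GUI text field arrives as a string, so a real implementation would likely need to convert it to a number before applying the comparison operators to a numeric column.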

Installation

  1. Clone the repository:
git clone https://github.com/BaseMax/qt-dataset-explorer.git
cd qt-dataset-explorer
  2. Install required dependencies:
pip install -r requirements.txt

Usage

Run the application:

python dataset_explorer.py

Quick Start Guide

  1. Load a Dataset:

    • Click "Load Dataset" button
    • Select a CSV or Excel file
    • The dataset will be loaded and displayed in the Preview tab
  2. Filter Data:

    • Go to the "Preview & Filter" tab
    • Select a column from the dropdown
    • Choose an operator (==, !=, >, <, contains, etc.)
    • Enter a value to filter by
    • Click "Apply Filter"
    • Click "Reset Filter" to show all data again
  3. View Statistics:

    • Go to the "Statistics" tab
    • Statistics are automatically calculated when you load data
    • Click "Refresh Statistics" to update after filtering
  4. Generate Plots:

    • Go to the "Plots" tab
    • Select a plot type (Histogram, Box Plot, Scatter Plot, etc.)
    • Choose columns for X and Y axes (if applicable)
    • Click "Generate Plot"
    • Click "Save Plot" to export as PNG or PDF
  5. Run Hypothesis Tests:

    • Go to the "Hypothesis Testing" tab
    • Select a test type
    • Choose columns to test
    • Click "Run Test"
    • View detailed results including p-values and conclusions
  6. Export Data:

    • Click "Export Data" to save filtered dataset
    • Click "Export Statistics" to save statistical summary
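
For readers who want to see what step 5 amounts to outside the GUI, here is a rough sketch using scipy directly. It assumes a 0.05 significance level and uses invented data; the `conclusion` helper and the way the groups are built are illustrative choices, not the application's documented behavior.

```python
import pandas as pd
from scipy import stats

# Invented data for illustration only.
df = pd.DataFrame({
    "Gender": ["F", "M", "F", "M", "F", "M", "F", "M"],
    "Salary": [50000, 72000, 48000, 61000, 53000, 69000, 47000, 75000],
    "Experience": [5, 20, 3, 10, 6, 15, 2, 22],
})

ALPHA = 0.05  # assumed significance level


def conclusion(p_value, alpha=ALPHA):
    """Turn a p-value into a short textual conclusion."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Independent t-test: compare mean salary between the two groups.
group_f = df.loc[df["Gender"] == "F", "Salary"]
group_m = df.loc[df["Gender"] == "M", "Salary"]
t_stat, t_p = stats.ttest_ind(group_f, group_m)

# Pearson correlation test between two numeric columns.
r, r_p = stats.pearsonr(df["Salary"], df["Experience"])

print(f"t-test: t={t_stat:.3f}, p={t_p:.4f} -> {conclusion(t_p)}")
print(f"Pearson: r={r:.3f}, p={r_p:.4f} -> {conclusion(r_p)}")
```

The other listed tests have scipy counterparts as well (`stats.ttest_rel`, `stats.chi2_contingency`, `stats.f_oneway`, `stats.spearmanr`), though which functions the application actually calls is not stated here.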

Sample Data

A sample dataset (sample_data.csv) is included for testing the application. It contains employee information with columns: Name, Age, Gender, Salary, Department, and Experience.
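
To give a feel for the statistics the application reports on a dataset shaped like this one, the sketch below computes them with pandas. The rows are invented and only reuse the column names listed above; they are not the contents of sample_data.csv, and the `summary` layout is an assumption rather than the application's output format.

```python
import pandas as pd

# Invented rows using the documented column names; not the real sample data.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carla", "Dan"],
    "Age": [30, 45, 27, None],  # one missing value on purpose
    "Gender": ["F", "M", "F", "M"],
    "Salary": [50000, 72000, 48000, 61000],
    "Department": ["Sales", "Engineering", "Sales", "HR"],
    "Experience": [5, 20, 3, 10],
})

# Restrict to numeric columns before computing numeric statistics.
numeric = df.select_dtypes(include="number")

summary = pd.DataFrame({
    "mean": numeric.mean(),
    "median": numeric.median(),
    "std": numeric.std(),
    "min": numeric.min(),
    "max": numeric.max(),
    "variance": numeric.var(),
    "skewness": numeric.skew(),
    "kurtosis": numeric.kurt(),
})

# Missing value counts per column and a categorical distribution.
missing = df.isna().sum()
dept_counts = df["Department"].value_counts()

print(summary)
print(missing)
print(dept_counts)
```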

Requirements

  • Python 3.9+ (the pinned scipy 1.11.2 does not support earlier versions)
  • PyQt5 5.15.10
  • pandas 2.0.3
  • matplotlib 3.7.2
  • seaborn 0.12.2
  • scipy 1.11.2
  • numpy 1.24.3
  • openpyxl 3.1.2
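
If the application fails to start, one possible cause is an installed package that does not match the pinned versions above. The following sketch compares them using only the standard library; the `PINNED` table is copied from the list above, and the `check_versions` helper is a hypothetical name, not part of the repository.

```python
from importlib import metadata

# Version pins copied from the requirements list above.
PINNED = {
    "PyQt5": "5.15.10",
    "pandas": "2.0.3",
    "matplotlib": "3.7.2",
    "seaborn": "0.12.2",
    "scipy": "1.11.2",
    "numpy": "1.24.3",
    "openpyxl": "3.1.2",
}


def check_versions(pins):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    problems = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            problems[name] = (expected, installed)
    return problems


for name, (expected, installed) in check_versions(PINNED).items():
    print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

A newer version than the pin is reported as a mismatch too, which does not necessarily mean the application will break.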

License

See LICENSE file for details.

Author

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
