Interactive desktop GUI application for exploring datasets and performing statistical analysis. Built using Python, PyQt5, and scientific libraries (pandas, matplotlib, seaborn, scipy).
- Dataset Loading: Support for CSV and Excel files
- Dataset Preview: Interactive table view with up to 1000 rows displayed
- Data Filtering: Apply filters using various operators (==, !=, >, <, >=, <=, contains, startswith, endswith)
- Descriptive Statistics: Comprehensive statistical analysis including:
  - Mean, median, standard deviation, min, max
  - Variance, skewness, kurtosis
  - Missing value counts
  - Categorical value distributions
- Visualization: Multiple plot types:
  - Histograms
  - Box plots
  - Scatter plots
  - Correlation heatmaps
  - Bar plots
- Hypothesis Testing: Statistical tests including:
  - Independent T-Test
  - Paired T-Test
  - Chi-Square Test
  - ANOVA
  - Correlation Test (Pearson & Spearman)
- Export Tools:
  - Export filtered data to CSV/Excel
  - Export statistics to text file
  - Save plots as PNG/PDF
- Clone the repository:
git clone https://github.com/BaseMax/qt-dataset-explorer.git
cd qt-dataset-explorer
- Install required dependencies:
pip install -r requirements.txt
- Run the application:
python dataset_explorer.py
- Load a Dataset:
  - Click the "Load Dataset" button
  - Select a CSV or Excel file
  - The dataset is loaded and displayed in the "Preview & Filter" tab
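The loading step can be approximated outside the GUI with pandas. This is a minimal sketch, not the application's actual code: the helper name `load_dataset` and the constant `PREVIEW_ROWS` are hypothetical, and choosing the reader by file extension is an assumption about how CSV and Excel files are told apart.

```python
import tempfile
from pathlib import Path

import pandas as pd

PREVIEW_ROWS = 1000  # the preview table displays up to 1000 rows


def load_dataset(path):
    """Load a CSV or Excel file into a DataFrame, chosen by file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in (".xlsx", ".xls"):
        # .xlsx is read through openpyxl (listed in the requirements);
        # legacy .xls files need the separate xlrd package.
        return pd.read_excel(path)
    raise ValueError(f"Unsupported file type: {suffix}")


# Demo with a throwaway CSV file (made-up rows, for illustration only).
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "demo.csv"
    csv_path.write_text("Name,Age\nA,30\nB,41\n")
    df = load_dataset(csv_path)
    preview = df.head(PREVIEW_ROWS)  # only the first rows are shown
```

Capping the preview with `head` keeps the table responsive on large files while the full DataFrame stays available for statistics and tests.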
- Filter Data:
  - Go to the "Preview & Filter" tab
  - Select a column from the dropdown
  - Choose an operator (==, !=, >, <, contains, etc.)
  - Enter a value to filter by
  - Click "Apply Filter"
  - Click "Reset Filter" to show all data again
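The filter operators map naturally onto pandas boolean masks. The sketch below is one possible way to express them, assuming the typed-in value arrives as text. `apply_filter`, `COMPARISONS`, and the sample rows are hypothetical and not taken from the application.

```python
import operator

import pandas as pd

# Comparison operators, each producing a boolean mask over a column.
COMPARISONS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}
STRING_OPERATORS = ("contains", "startswith", "endswith")


def apply_filter(df, column, op, value):
    """Return the rows of df where `column <op> value` holds."""
    series = df[column]
    if op in COMPARISONS:
        # Assumption: text input is converted to a number for numeric columns.
        if pd.api.types.is_numeric_dtype(series):
            value = float(value)
        return df[COMPARISONS[op](series, value)]
    if op in STRING_OPERATORS:
        text = series.astype(str).str
        if op == "contains":
            mask = text.contains(str(value), regex=False)  # literal match
        else:
            mask = getattr(text, op)(str(value))
        return df[mask]
    raise ValueError(f"Unknown operator: {op}")


# Made-up rows for illustration only.
df = pd.DataFrame({
    "Name": ["Ana", "Ben", "Cy"],
    "Age": [25, 40, 33],
    "Department": ["Sales", "Engineering", "Sales"],
})
older = apply_filter(df, "Age", ">", "30")
sales = apply_filter(df, "Department", "startswith", "Sal")
```

Because each call returns a new DataFrame, "Reset Filter" amounts to going back to the original, unfiltered `df`.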
- View Statistics:
  - Go to the "Statistics" tab
  - Statistics are automatically calculated when you load data
  - Click "Refresh Statistics" to update after filtering
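The listed statistics can be reproduced with plain pandas calls. This is a sketch under assumptions: `describe_numeric` is a hypothetical helper, the rows are made up, and the application may use different conventions (pandas reports the sample standard deviation and variance with `ddof=1`, and excess kurtosis).

```python
import pandas as pd


def describe_numeric(df):
    """Build a per-column summary table for the numeric columns of df."""
    numeric = df.select_dtypes(include="number")
    return pd.DataFrame({
        "mean": numeric.mean(),
        "median": numeric.median(),
        "std": numeric.std(),        # sample standard deviation (ddof=1)
        "min": numeric.min(),
        "max": numeric.max(),
        "variance": numeric.var(),   # sample variance (ddof=1)
        "skewness": numeric.skew(),
        "kurtosis": numeric.kurt(),  # excess kurtosis
        "missing": numeric.isna().sum(),
    })


# Made-up rows for illustration only; one Age value is missing.
df = pd.DataFrame({
    "Age": [25, 40, 33, None, 52],
    "Salary": [3000, 5200, 4100, 4800, 6100],
    "Department": ["Sales", "Engineering", "Sales", "HR", "Engineering"],
})
summary = describe_numeric(df)

# Distribution of a categorical column.
department_counts = df["Department"].value_counts()
```

Running the same helper on a filtered DataFrame corresponds to refreshing the statistics after a filter is applied.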
- Generate Plots:
  - Go to the "Plots" tab
  - Select a plot type (Histogram, Box Plot, Scatter Plot, etc.)
  - Choose columns for X and Y axes (if applicable)
  - Click "Generate Plot"
  - Click "Save Plot" to export as PNG or PDF
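For reference, a histogram can be produced and saved in both formats with matplotlib alone. This sketch assumes the off-screen `Agg` backend so it runs without a display, whereas the application renders inside its Qt window. The data is made up, and the figure is written to in-memory buffers rather than to files.

```python
import io

import matplotlib

matplotlib.use("Agg")  # off-screen backend; an assumption of this sketch
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values for illustration only.
df = pd.DataFrame({"Age": [25, 40, 33, 52, 29, 47, 38]})

fig, ax = plt.subplots()
ax.hist(df["Age"].dropna(), bins=5)
ax.set_xlabel("Age")
ax.set_ylabel("Count")
ax.set_title("Histogram of Age")

# savefig accepts a path or a file-like object; `format` selects PNG or PDF.
png_buffer = io.BytesIO()
fig.savefig(png_buffer, format="png")
pdf_buffer = io.BytesIO()
fig.savefig(pdf_buffer, format="pdf")
plt.close(fig)  # release the figure's memory
```

When a file path is passed instead of a buffer, matplotlib infers the format from the extension, so `plot.png` and `plot.pdf` need no `format` argument.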
- Run Hypothesis Tests:
  - Go to the "Hypothesis Testing" tab
  - Select a test type
  - Choose columns to test
  - Click "Run Test"
  - View detailed results including p-values and conclusions
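Several of the listed tests are available in `scipy.stats`. The sketch below shows three of them on made-up rows. The 0.05 significance level and the `conclude` helper are assumptions for illustration, since the threshold the application uses for its conclusions is not stated here.

```python
import pandas as pd
from scipy import stats

ALPHA = 0.05  # assumed significance level


def conclude(p_value, alpha=ALPHA):
    """Turn a p-value into a short textual conclusion."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Made-up rows for illustration only.
df = pd.DataFrame({
    "Gender": ["F", "M", "F", "M", "F", "M"],
    "Department": ["Sales", "Sales", "Eng", "Eng", "Sales", "Eng"],
    "Salary": [4100, 3900, 5200, 5600, 4300, 5900],
    "Age": [25, 31, 38, 44, 29, 50],
    "Experience": [3, 9, 16, 22, 7, 28],
})

# Independent t-test: compare a numeric column between two groups.
salary_f = df.loc[df["Gender"] == "F", "Salary"]
salary_m = df.loc[df["Gender"] == "M", "Salary"]
t_stat, p_ttest = stats.ttest_ind(salary_f, salary_m)

# Chi-square test of independence on a contingency table.
table = pd.crosstab(df["Gender"], df["Department"])
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

# Pearson correlation between two numeric columns.
r, p_corr = stats.pearsonr(df["Age"], df["Experience"])

ttest_summary = f"t = {t_stat:.3f}, p = {p_ttest:.3f}: {conclude(p_ttest)}"
```

The remaining tests follow the same pattern with `stats.ttest_rel` (paired t-test), `stats.f_oneway` (ANOVA), and `stats.spearmanr` (Spearman correlation).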
- Export Data:
  - Click "Export Data" to save the filtered dataset
  - Click "Export Statistics" to save the statistical summary
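Both exports have direct pandas equivalents. This is a sketch with hypothetical file names and made-up rows, written to a temporary directory. The layout of the statistics text file is an assumption, and `DataFrame.describe` is used here only as a stand-in summary.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up filtered data for illustration only.
filtered = pd.DataFrame({
    "Name": ["Ben", "Cy"],
    "Age": [40, 33],
    "Salary": [5200, 4100],
})

with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)

    # Export the filtered rows to CSV without the index column.
    csv_path = out_dir / "filtered.csv"
    filtered.to_csv(csv_path, index=False)
    # Excel export works the same way through DataFrame.to_excel,
    # which relies on openpyxl for .xlsx files.

    # Export a plain-text statistical summary.
    stats_path = out_dir / "statistics.txt"
    stats_path.write_text(filtered.describe().to_string())

    # Read both files back to confirm what was written.
    round_trip = pd.read_csv(csv_path)
    stats_text = stats_path.read_text()
```

Passing `index=False` keeps the row index out of the exported file, so reloading it yields the same columns as the original.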
A sample dataset (sample_data.csv) is included for testing the application. It contains employee information with columns: Name, Age, Gender, Salary, Department, and Experience.
- Python 3.9+ (the pinned scipy 1.11.2 does not support older interpreters)
- PyQt5 5.15.10
- pandas 2.0.3
- matplotlib 3.7.2
- seaborn 0.12.2
- scipy 1.11.2
- numpy 1.24.3
- openpyxl 3.1.2
See LICENSE file for details.
- GitHub: @BaseMax
Contributions are welcome! Please feel free to submit a Pull Request.


