2 changes: 2 additions & 0 deletions .gitignore
@@ -6,3 +6,5 @@ notebooks/.env
notebooks/.DS_Store
.DS_Store
*.in
.virtual_documents/
anaconda_projects/
98 changes: 34 additions & 64 deletions README.md
@@ -1,77 +1,47 @@
# Project overview
...
## 💤 Sleep Health & Lifestyle Analysis
#### Business Case: Predicting Sleep Disorders

# Installation

1. **Clone the repository**:
### 📌 Project Overview
This project analyzes a Sleep Health & Lifestyle dataset to identify key factors associated with sleep disorders (Insomnia and Sleep Apnea).
The goal is to understand how lifestyle, physiological metrics, and stress levels contribute to sleep disorder risk and to support early intervention strategies.

```bash
git clone https://github.com/YourUsername/repository_name.git
```

2. **Install UV**
### 🎯 Business Problem
Sleep disorders increase medical costs, stress, and reduce quality of life.
Identifying high-risk individuals early enables:
- Preventive healthcare
- Reduced diagnosis costs
- Targeted wellbeing programs

If you're a MacOS/Linux user type:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
### ❓ Research Questions
- Which lifestyle and physiological factors correlate with sleep disorders?
- Can stress, BMI, activity, and sleep patterns predict disorder presence?
- What differentiates insomnia from sleep apnea?

If you're a Windows user open an Anaconda Powershell Prompt and type :

```bash
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```
### 🧪 Hypotheses
#### Primary Hypothesis (H1)
Individuals with high stress, high BMI, low sleep duration, and poor sleep quality are significantly more likely to have a sleep disorder.
**H0:** Sleep disorder presence is independent of these factors.

3. **Create an environment**
#### Secondary Hypotheses
- **H1a:** Obesity increases likelihood of sleep apnea.
- **H1b:** Higher stress correlates with insomnia.
- **H1c:** Sleeping <6 hours increases disorder risk.
- **H1d:** Low physical activity (<40 min/day) increases disorder prevalence.
- **H1e:** High heart rate / BP increases apnea risk.
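
The thresholds named in H1a, H1c, and H1d can be written as simple per-row checks. The following is a minimal sketch, not the analysis used in this project: `hypothesis_flags` is a hypothetical helper, the column names are taken from the cleaned CSV in this change, and treating `bmi_category == "Obese"` as the obesity criterion is an assumption. H1b and H1e are left out because no cutoffs are stated for stress, heart rate, or blood pressure.

```python
def hypothesis_flags(row: dict) -> list:
    """Return the secondary hypotheses whose risk condition a row meets.

    Hypothetical helper for illustration; thresholds come from the
    hypothesis list (H1c: <6 hours of sleep, H1d: <40 min/day activity).
    """
    flags = []
    if row["bmi_category"] == "Obese":  # H1a (assumed obesity criterion)
        flags.append("H1a")
    if float(row["sleep_duration"]) < 6:  # H1c
        flags.append("H1c")
    if float(row["physical_activity_level"]) < 40:  # H1d
        flags.append("H1d")
    return flags


# Values copied from person_id 4, 2, and 7 in the cleaned CSV.
person_4 = {"bmi_category": "Obese", "sleep_duration": "5.9", "physical_activity_level": "30"}
person_2 = {"bmi_category": "Normal", "sleep_duration": "6.2", "physical_activity_level": "60"}
person_7 = {"bmi_category": "Obese", "sleep_duration": "6.3", "physical_activity_level": "40"}

print(hypothesis_flags(person_4))
print(hypothesis_flags(person_2))
print(hypothesis_flags(person_7))
```

Flags like these only mark which rows fall into each risk group; whether the groups actually differ in disorder prevalence still needs a statistical test against H0.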

```bash
uv venv
```

3. **Activate the environment**
### 🧹 Data Cleaning Summary
- Checked for missing values, incorrect data types, and duplicates.
- Standardized column names and trimmed string formatting.
- Normalized inconsistent categories (e.g., "Normal" vs "Normal Weight").
- Split Blood Pressure into numeric Systolic and Diastolic columns.
- Converted all relevant columns to numeric types.
- Filled missing Sleep Disorder values with "No Disorder".
- Removed duplicate rows (242 duplicates dropped).

If you're a MacOS/Linux user type (if you're using a bash shell):
Final result: a clean, consistent dataset ready for analysis.
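
The cleaning steps listed above can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions, not the notebook code from this project: `standardize_name` and `clean_rows` are hypothetical helpers, the `RAW` rows are made up for the example, the raw header names are assumed, and treating rows as duplicates when every column except `person_id` matches is also an assumption.

```python
import csv
import io

# Made-up sample rows (NOT taken from the real dataset); raw header names are assumed.
RAW = """Person ID,Gender,BMI Category,Blood Pressure,Sleep Disorder
1,Male,Normal Weight,126/83,
2,Male,Normal Weight,126/83,
3,Female,Obese,140/90,Sleep Apnea
"""


def standardize_name(name: str) -> str:
    """Turn a header such as 'Blood Pressure' into 'blood_pressure'."""
    return name.strip().lower().replace(" ", "_")


def clean_rows(raw_rows):
    """Apply the listed cleaning steps to an iterable of dict rows."""
    seen = set()
    cleaned = []
    for raw in raw_rows:
        # Standardize column names and trim string values.
        row = {standardize_name(k): (v or "").strip() for k, v in raw.items()}
        # Normalize inconsistent categories.
        if row["bmi_category"] == "Normal Weight":
            row["bmi_category"] = "Normal"
        # Fill missing sleep disorder values.
        if not row["sleep_disorder"]:
            row["sleep_disorder"] = "No Disorder"
        # Split blood pressure into numeric systolic and diastolic columns.
        systolic, diastolic = row["blood_pressure"].split("/")
        row["systolic"] = int(systolic)
        row["diastolic"] = int(diastolic)
        # Drop duplicates, ignoring person_id (assumed duplicate definition).
        key = tuple(v for k, v in row.items() if k != "person_id")
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


cleaned = clean_rows(csv.DictReader(io.StringIO(RAW)))
for row in cleaned:
    print(row)
```

In this sample the second row is dropped because it matches the first in every column except `person_id`, and the first row ends up with `bmi_category` set to `Normal` and `sleep_disorder` set to `No Disorder`.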

```bash
source ./venv/bin/activate
```

If you're a MacOS/Linux user type (if you're using a csh/tcsh shell):

```bash
source ./venv/bin/activate.csh
```

If you're a Windows user type:

```bash
.\venv\Scripts\activate
```

4. **Install dependencies**:

```bash
uv pip install -r requirements.txt
```

# Questions
...

# Dataset
...

## Main dataset issues

- ...
- ...
- ...

## Solutions for the dataset issues
...

# Conclussions
...

# Next steps
...
Binary file added archive.zip
Binary file not shown.
4 changes: 2 additions & 2 deletions config.yaml
@@ -1,5 +1,5 @@
input_data:
file: "../data/raw/raw_data_file.csv"
file: "../data/raw/sleep_health_and_lifestyle_dataset.csv"

output_data:
file: "../data/clean/cleaned_data_file.csv"
file: "../data/clean/sleep_health_project_clean.csv"
Empty file removed data/clean/cleaned_data_file.csv
Empty file.
133 changes: 133 additions & 0 deletions data/clean/sleep_health_project_clean.csv
@@ -0,0 +1,133 @@
person_id,gender,age,occupation,sleep_duration,quality_of_sleep,physical_activity_level,stress_level,bmi_category,blood_pressure,heart_rate,daily_steps,sleep_disorder,systolic,diastolic
1,Male,27,Software Engineer,6.1,6,42,6,Overweight,126/83,77,4200,No Disorder,126,83
2,Male,28,Doctor,6.2,6,60,8,Normal,125/80,75,10000,No Disorder,125,80
4,Male,28,Sales Representative,5.9,4,30,8,Obese,140/90,85,3000,Sleep Apnea,140,90
6,Male,28,Software Engineer,5.9,4,30,8,Obese,140/90,85,3000,Insomnia,140,90
7,Male,29,Teacher,6.3,6,40,7,Obese,140/90,82,3500,Insomnia,140,90
8,Male,29,Doctor,7.8,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
11,Male,29,Doctor,6.1,6,30,8,Normal,120/80,70,8000,No Disorder,120,80
14,Male,29,Doctor,6.0,6,30,8,Normal,120/80,70,8000,No Disorder,120,80
17,Female,29,Nurse,6.5,5,40,7,Normal,132/87,80,4000,Sleep Apnea,132,87
18,Male,29,Doctor,6.0,6,30,8,Normal,120/80,70,8000,Sleep Apnea,120,80
19,Female,29,Nurse,6.5,5,40,7,Normal,132/87,80,4000,Insomnia,132,87
20,Male,30,Doctor,7.6,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
21,Male,30,Doctor,7.7,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
25,Male,30,Doctor,7.8,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
26,Male,30,Doctor,7.9,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
31,Female,30,Nurse,6.4,5,35,7,Normal,130/86,78,4100,Sleep Apnea,130,86
32,Female,30,Nurse,6.4,5,35,7,Normal,130/86,78,4100,Insomnia,130,86
33,Female,31,Nurse,7.9,8,75,4,Normal,117/76,69,6800,No Disorder,117,76
34,Male,31,Doctor,6.1,6,30,8,Normal,125/80,72,5000,No Disorder,125,80
35,Male,31,Doctor,7.7,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
38,Male,31,Doctor,7.6,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
44,Male,31,Doctor,7.8,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
50,Male,31,Doctor,7.7,7,75,6,Normal,120/80,70,8000,Sleep Apnea,120,80
51,Male,32,Engineer,7.5,8,45,3,Normal,120/80,70,8000,No Disorder,120,80
53,Male,32,Doctor,6.0,6,30,8,Normal,125/80,72,5000,No Disorder,125,80
54,Male,32,Doctor,7.6,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
57,Male,32,Doctor,7.7,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
63,Male,32,Doctor,6.2,6,30,8,Normal,125/80,72,5000,No Disorder,125,80
67,Male,32,Accountant,7.2,8,50,6,Normal,118/76,68,7000,No Disorder,118,76
68,Male,33,Doctor,6.0,6,30,8,Normal,125/80,72,5000,Insomnia,125,80
69,Female,33,Scientist,6.2,6,50,6,Overweight,128/85,76,5500,No Disorder,128,85
71,Male,33,Doctor,6.1,6,30,8,Normal,125/80,72,5000,No Disorder,125,80
75,Male,33,Doctor,6.0,6,30,8,Normal,125/80,72,5000,No Disorder,125,80
81,Female,34,Scientist,5.8,4,32,8,Overweight,131/86,81,5200,Sleep Apnea,131,86
83,Male,35,Teacher,6.7,7,40,5,Overweight,128/84,70,5600,No Disorder,128,84
85,Male,35,Software Engineer,7.5,8,60,5,Normal,120/80,70,8000,No Disorder,120,80
86,Female,35,Accountant,7.2,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
87,Male,35,Engineer,7.2,8,60,4,Normal,125/80,65,5000,No Disorder,125,80
89,Male,35,Engineer,7.3,8,60,4,Normal,125/80,65,5000,No Disorder,125,80
94,Male,35,Lawyer,7.4,7,60,5,Obese,135/88,84,3300,Sleep Apnea,135,88
95,Female,36,Accountant,7.2,8,60,4,Normal,115/75,68,7000,Insomnia,115,75
96,Female,36,Accountant,7.1,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
97,Female,36,Accountant,7.2,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
99,Female,36,Teacher,7.1,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
101,Female,36,Teacher,7.2,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
104,Male,36,Teacher,6.6,5,35,7,Overweight,129/84,74,4800,Sleep Apnea,129,84
105,Female,36,Teacher,7.2,8,60,4,Normal,115/75,68,7000,Sleep Apnea,115,75
106,Male,36,Teacher,6.6,5,35,7,Overweight,129/84,74,4800,Insomnia,129,84
107,Female,37,Nurse,6.1,6,42,6,Overweight,126/83,77,4200,No Disorder,126,83
108,Male,37,Engineer,7.8,8,70,4,Normal,120/80,68,7000,No Disorder,120,80
110,Male,37,Lawyer,7.4,8,60,5,Normal,130/85,68,8000,No Disorder,130,85
111,Female,37,Accountant,7.2,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
126,Female,37,Nurse,7.5,8,60,4,Normal,120/80,70,8000,No Disorder,120,80
127,Male,38,Lawyer,7.3,8,60,5,Normal,130/85,68,8000,No Disorder,130,85
128,Female,38,Accountant,7.1,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
138,Male,38,Lawyer,7.1,8,60,5,Normal,130/85,68,8000,No Disorder,130,85
145,Male,38,Lawyer,7.1,8,60,5,Normal,130/85,68,8000,Sleep Apnea,130,85
146,Female,38,Lawyer,7.4,7,60,5,Obese,135/88,84,3300,Sleep Apnea,135,88
147,Male,39,Lawyer,7.2,8,60,5,Normal,130/85,68,8000,Insomnia,130,85
148,Male,39,Engineer,6.5,5,40,7,Overweight,132/87,80,4000,Insomnia,132,87
149,Female,39,Lawyer,6.9,7,50,6,Normal,128/85,75,5500,No Disorder,128,85
150,Female,39,Accountant,8.0,9,80,3,Normal,115/78,67,7500,No Disorder,115,78
152,Male,39,Lawyer,7.2,8,60,5,Normal,130/85,68,8000,No Disorder,130,85
162,Female,40,Accountant,7.2,8,55,6,Normal,119/77,73,7300,No Disorder,119,77
164,Male,40,Lawyer,7.9,8,90,5,Normal,130/85,68,8000,No Disorder,130,85
166,Male,41,Lawyer,7.6,8,90,5,Normal,130/85,70,8000,Insomnia,130,85
167,Male,41,Engineer,7.3,8,70,6,Normal,121/79,72,6200,No Disorder,121,79
168,Male,41,Lawyer,7.1,7,55,6,Overweight,125/82,72,6000,No Disorder,125,82
170,Male,41,Lawyer,7.7,8,90,5,Normal,130/85,70,8000,No Disorder,130,85
175,Male,41,Lawyer,7.6,8,90,5,Normal,130/85,70,8000,No Disorder,130,85
178,Male,42,Salesperson,6.5,6,45,7,Overweight,130/85,72,6000,Insomnia,130,85
179,Male,42,Lawyer,7.8,8,90,5,Normal,130/85,70,8000,No Disorder,130,85
185,Female,42,Teacher,6.8,6,45,7,Overweight,130/85,78,5000,Sleep Apnea,130,85
187,Female,43,Teacher,6.7,7,45,4,Overweight,135/90,65,6000,Insomnia,135,90
188,Male,43,Salesperson,6.3,6,45,7,Overweight,130/85,72,6000,Insomnia,130,85
190,Male,43,Salesperson,6.5,6,45,7,Overweight,130/85,72,6000,Insomnia,130,85
192,Male,43,Salesperson,6.4,6,45,7,Overweight,130/85,72,6000,Insomnia,130,85
202,Male,43,Engineer,7.8,8,90,5,Normal,130/85,70,8000,Insomnia,130,85
204,Male,43,Engineer,6.9,6,47,7,Normal,117/76,69,6800,No Disorder,117,76
205,Male,43,Engineer,7.6,8,75,4,Overweight,122/80,68,6800,No Disorder,122,80
206,Male,43,Engineer,7.7,8,90,5,Normal,130/85,70,8000,No Disorder,130,85
210,Male,43,Engineer,7.8,8,90,5,Normal,130/85,70,8000,No Disorder,130,85
219,Male,43,Engineer,7.8,8,90,5,Normal,130/85,70,8000,Sleep Apnea,130,85
220,Male,43,Salesperson,6.5,6,45,7,Overweight,130/85,72,6000,Sleep Apnea,130,85
221,Female,44,Teacher,6.6,7,45,4,Overweight,135/90,65,6000,Insomnia,135,90
222,Male,44,Salesperson,6.4,6,45,7,Overweight,130/85,72,6000,Insomnia,130,85
223,Male,44,Salesperson,6.3,6,45,7,Overweight,130/85,72,6000,Insomnia,130,85
238,Female,44,Teacher,6.5,7,45,4,Overweight,135/90,65,6000,Insomnia,135,90
248,Male,44,Engineer,6.8,7,45,7,Overweight,130/85,78,5000,Insomnia,130,85
249,Male,44,Salesperson,6.4,6,45,7,Overweight,130/85,72,6000,No Disorder,130,85
250,Male,44,Salesperson,6.5,6,45,7,Overweight,130/85,72,6000,No Disorder,130,85
251,Female,45,Teacher,6.8,7,30,6,Overweight,135/90,65,6000,Insomnia,135,90
253,Female,45,Teacher,6.5,7,45,4,Overweight,135/90,65,6000,Insomnia,135,90
257,Female,45,Teacher,6.6,7,45,4,Overweight,135/90,65,6000,Insomnia,135,90
262,Female,45,Teacher,6.6,7,45,4,Overweight,135/90,65,6000,No Disorder,135,90
264,Female,45,Manager,6.9,7,55,5,Overweight,125/82,75,5500,No Disorder,125,82
265,Male,48,Doctor,7.3,7,65,5,Obese,142/92,83,3500,Insomnia,142,92
266,Female,48,Nurse,5.9,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
268,Female,49,Nurse,6.2,6,90,8,Overweight,140/95,75,10000,No Disorder,140,95
269,Female,49,Nurse,6.0,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
270,Female,49,Nurse,6.1,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
274,Female,49,Nurse,6.2,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
277,Male,49,Doctor,8.1,9,85,3,Obese,139/91,86,3700,Sleep Apnea,139,91
279,Female,50,Nurse,6.1,6,90,8,Overweight,140/95,75,10000,Insomnia,140,95
280,Female,50,Engineer,8.3,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
281,Female,50,Nurse,6.0,6,90,8,Overweight,140/95,75,10000,No Disorder,140,95
282,Female,50,Nurse,6.1,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
283,Female,50,Nurse,6.0,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
299,Female,51,Engineer,8.5,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
303,Female,51,Nurse,7.1,7,55,6,Normal,125/82,72,6000,No Disorder,125,82
304,Female,51,Nurse,6.0,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
305,Female,51,Nurse,6.1,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
307,Female,52,Accountant,6.5,7,45,7,Overweight,130/85,72,6000,Insomnia,130,85
309,Female,52,Accountant,6.6,7,45,7,Overweight,130/85,72,6000,Insomnia,130,85
313,Female,52,Engineer,8.4,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
316,Female,53,Engineer,8.3,9,30,3,Normal,125/80,65,5000,Insomnia,125,80
317,Female,53,Engineer,8.5,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
319,Female,53,Engineer,8.4,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
325,Female,53,Engineer,8.3,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
333,Female,54,Engineer,8.4,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
339,Female,54,Engineer,8.5,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
340,Female,55,Nurse,8.1,9,75,4,Overweight,140/95,72,5000,Sleep Apnea,140,95
342,Female,56,Doctor,8.2,9,90,3,Normal,118/75,65,10000,No Disorder,118,75
344,Female,57,Nurse,8.1,9,75,3,Overweight,140/95,68,7000,No Disorder,140,95
345,Female,57,Nurse,8.2,9,75,3,Overweight,140/95,68,7000,Sleep Apnea,140,95
350,Female,57,Nurse,8.1,9,75,3,Overweight,140/95,68,7000,Sleep Apnea,140,95
353,Female,58,Nurse,8.0,9,75,3,Overweight,140/95,68,7000,Sleep Apnea,140,95
359,Female,59,Nurse,8.0,9,75,3,Overweight,140/95,68,7000,No Disorder,140,95
360,Female,59,Nurse,8.1,9,75,3,Overweight,140/95,68,7000,No Disorder,140,95
361,Female,59,Nurse,8.2,9,75,3,Overweight,140/95,68,7000,Sleep Apnea,140,95
365,Female,59,Nurse,8.0,9,75,3,Overweight,140/95,68,7000,Sleep Apnea,140,95
367,Female,59,Nurse,8.1,9,75,3,Overweight,140/95,68,7000,Sleep Apnea,140,95