28 commits
3ecb812
Add restaurant and income data for Barcelona
Jul 7, 2025
eddf2ed
Reddit + Offical Data loaded
levinX7 Jul 7, 2025
a7e3b2a
Merge pull request #1 from levinX7/branch_levin
levinX7 Jul 7, 2025
8f37652
Day_1_work
Jul 7, 2025
0849978
Merge pull request #2 from levinX7/Brenda
Brenvillag Jul 7, 2025
78653dc
Day1
Jul 7, 2025
d90aba9
Merge pull request #3 from levinX7/viktoria_branch
ViktoriaGluhovskya Jul 8, 2025
3a633a9
Changes in file structure
levinX7 Jul 8, 2025
db2dbac
Merge pull request #4 from levinX7/branch_levin
levinX7 Jul 8, 2025
3852b1d
Added python-dotenv library
Jul 8, 2025
7c7e201
Solved conflict
Jul 8, 2025
9f7645b
loading hotel data + analysis
levinX7 Jul 8, 2025
1aeb269
Merge pull request #5 from levinX7/branch_levin
levinX7 Jul 8, 2025
b52cc4d
Day_2
Jul 8, 2025
ff59eb8
Merge pull request #6 from levinX7/Brenda
Brenvillag Jul 8, 2025
e3b5df5
Added charts and filtered restaurant data
Jul 8, 2025
a50d6c0
Added charts and filtered restaurant data
Jul 8, 2025
0915019
Merge pull request #7 from levinX7/viktoria_branch
ViktoriaGluhovskya Jul 8, 2025
9add4ff
changed hotels notebook
levinX7 Jul 9, 2025
10f5e4b
Merge pull request #8 from levinX7/branch_levin
levinX7 Jul 9, 2025
4e815e5
Add cleaned restaurant data and top 10 restaurant types table
Jul 9, 2025
0cd67e9
Day3
levinX7 Jul 9, 2025
c224e4e
Merge pull request #9 from levinX7/branch_levin
levinX7 Jul 9, 2025
c66c23c
Merge pull request #10 from levinX7/viktoria_branch
ViktoriaGluhovskya Jul 9, 2025
97416b6
Day_3_work
Jul 9, 2025
5f22850
Merge pull request #11 from levinX7/Brenda
Brenvillag Jul 9, 2025
4256565
Day4_work
Jul 10, 2025
00a29a8
Delete notebooks/reddit_other_load.ipynb
levinX7 Jul 23, 2025
1 change: 1 addition & 0 deletions .python-version
@@ -0,0 +1 @@
3.12
187 changes: 110 additions & 77 deletions README.md
@@ -1,77 +1,110 @@
# Project overview
...

# Installation

1. **Clone the repository**:

```bash
git clone https://github.com/YourUsername/repository_name.git
```

2. **Install UV**

If you're a MacOS/Linux user type:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

If you're a Windows user open an Anaconda Powershell Prompt and type :

```bash
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

3. **Create an environment**

```bash
uv venv
```

3. **Activate the environment**

If you're a MacOS/Linux user type (if you're using a bash shell):

```bash
source ./venv/bin/activate
```

If you're a MacOS/Linux user type (if you're using a csh/tcsh shell):

```bash
source ./venv/bin/activate.csh
```

If you're a Windows user type:

```bash
.\venv\Scripts\activate
```

4. **Install dependencies**:

```bash
uv pip install -r requirements.txt
```

# Questions
...

# Dataset
...

## Main dataset issues

- ...
- ...
- ...

## Solutions for the dataset issues
...

# Conclussions
...

# Next steps
...
Project Title: BARCELONA RESTAURANT OPPORTUNITY

# Project overview
Barcelona is known for its dynamic culinary scene and diverse neighborhoods, each with distinct characteristics. However, launching a restaurant in this competitive environment requires more than intuition: it demands a clear understanding of local income, tourist activity, and potential market saturation.
This project aims to identify the most strategic district in Barcelona to open a new restaurant, using a data-driven approach that balances economic indicators and tourism potential. Choosing the right district can provide the right mix of foot traffic, affluent residents, and manageable competition.

# Installation

1. **Clone the repository**:

```bash
git clone https://github.com/YourUsername/repository_name.git
```

2. **Install UV**

If you're a macOS/Linux user, type:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

If you're a Windows user, open an Anaconda PowerShell Prompt and type:

```bash
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

3. **Create an environment**

```bash
uv venv
```

4. **Activate the environment**

If you're a macOS/Linux user, type (if you're using a bash shell):

```bash
source .venv/bin/activate
```

If you're a macOS/Linux user, type (if you're using a csh/tcsh shell):

```bash
source .venv/bin/activate.csh
```

If you're a Windows user, type:

```bash
.venv\Scripts\activate
```

5. **Install dependencies**:

```bash
uv pip install -r requirements.txt
```

# Dataset

We downloaded:

1. "Hotels in the city of Barcelona" (2020): https://opendata-ajuntament.barcelona.cat/data/en/dataset/allotjaments-hotels
2. "Demographic indicators. Population density (inhabitants / ha) of the city of Barcelona" (2022): https://opendata-ajuntament.barcelona.cat/data/en/dataset/est-densitat
3. "Disposable income of households per capita (€) in the city of Barcelona" (2021): https://opendata-ajuntament.barcelona.cat/data/en/dataset/renda-disponible-llars-bcn

We used APIs for:

- Reddit mentions of different food types in r/Food (steak, pizza, burger, pasta, ramen, sushi, paella, Korean BBQ, tapas)
- Google Reviews of the top 55 restaurants in Barcelona (name, rating, address, longitude, latitude, type)
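
The Reddit step boils down to counting how often each food type is mentioned. The following is a minimal sketch of that idea, assuming the post titles have already been fetched into a list of strings; the function name `count_mentions` and the sample titles are hypothetical and are not taken from the project notebooks.

```python
import re
from collections import Counter

# Food types listed in this README.
FOOD_TYPES = ["steak", "pizza", "burger", "pasta", "ramen",
              "sushi", "paella", "korean bbq", "tapas"]


def count_mentions(texts, food_types=FOOD_TYPES):
    """Count how many texts mention each food type.

    Matching is case-insensitive, on whole words, and tolerates a
    trailing plural "s" (e.g. "burgers").
    """
    patterns = {
        food: re.compile(r"\b" + re.escape(food) + r"s?\b", re.IGNORECASE)
        for food in food_types
    }
    counts = Counter({food: 0 for food in food_types})
    for text in texts:
        for food, pattern in patterns.items():
            if pattern.search(text):
                counts[food] += 1
    return counts


# Hypothetical post titles, for illustration only (not project data).
sample_titles = [
    "Homemade pizza with fresh basil",
    "First attempt at sushi rolls",
    "Pizza night again",
    "Smash burgers on the griddle",
]
mentions = count_mentions(sample_titles)
print(mentions.most_common(3))
```

Each text is counted at most once per food type, so a post that repeats "pizza" several times still contributes a single mention.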

## Main dataset issues
- Paywalls for APIs. We repeatedly ran into APIs that were only accessible with significant payments.
- API restrictions. Because of API limits, we could load only 55 restaurants.
- District names. In the Google Reviews data frame, the district was embedded in the address column and had to be extracted before merging with the other data frames. District names were also worded slightly differently across sources and had to be aligned manually.

## Solutions for the dataset issues
- To merge the data frames, we extracted the district name from the address column of the Google Reviews data frame and manually aligned the district names whose wording differed between sources.
- To work around the API limits, we also extracted information from Reddit to add more variety to the data.
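
The district extraction described above could be sketched as follows. This is an illustration under stated assumptions, not the code used in the notebooks: it assumes the district appears somewhere in the address string, and the helper names `normalize` and `extract_district` are hypothetical. The list contains Barcelona's ten official districts.

```python
import unicodedata

# Canonical names of Barcelona's ten districts.
DISTRICTS = [
    "Ciutat Vella", "Eixample", "Sants-Montjuïc", "Les Corts",
    "Sarrià-Sant Gervasi", "Gràcia", "Horta-Guinardó",
    "Nou Barris", "Sant Andreu", "Sant Martí",
]


def normalize(text):
    """Lowercase, strip accents, and treat hyphens as spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(no_accents.lower().replace("-", " ").split())


# Map each normalized spelling back to its canonical name.
_LOOKUP = {normalize(name): name for name in DISTRICTS}


def extract_district(address):
    """Return the canonical district found in the address, or None."""
    normalized = normalize(address)
    for key, canonical in _LOOKUP.items():
        if key in normalized:
            return canonical
    return None


# Hypothetical address strings, for illustration only.
print(extract_district("Carrer de Mallorca, 401, L'Eixample, 08013 Barcelona"))
print(extract_district("Av. Diagonal 640, Sarria - Sant Gervasi, 08017 Barcelona"))
```

With pandas, the function can be applied to a column, for example `df["address"].map(extract_district)` (the data frame and column name are assumptions). Substring matching is deliberately simple and can misfire on street names that contain a district name, such as Passeig de Gràcia, so the results still need a manual check.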

## Strengths and Weaknesses

Strengths:

- The dataset is well-structured and tabular, making it easy to load and manipulate.
- It contains several numeric columns, such as rating, reviews, income, and density, which support statistical analysis.
- Latitude and longitude enable spatial techniques such as clustering or heatmap creation.
- It includes categorical and textual data, such as district, neighbourhood, and types, allowing for grouping and classification.

Weaknesses:

- The sample size is small, which limits the reliability of any statistical conclusions.
- The rating variable is subjective and may not fully capture business performance.
- The types column is unstructured, with mixed labels and formatting, and requires significant cleaning.
- The dataset lacks a time dimension, which prevents any time series or trend analysis.

# Question
What are the ideal locations and the most suitable food type for opening a premium restaurant in Barcelona?

# Methodology
1) Loading data: APIs and downloaded datasets
2) Cleaning data, merging data frames, and handling null values and duplicates: numpy and pandas
3) Charts, plots, and correlation table: matplotlib, seaborn
4) Heatmap: folium ...
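
Steps 2 and 3 can be sketched with pandas as shown below. This is a minimal illustration, not the notebooks' actual code: the column names, the restaurant names, and all values are made up for the example.

```python
import pandas as pd

# Hypothetical rows for illustration only; values are not project data.
restaurants = pd.DataFrame({
    "name": ["A", "A", "B", "C", "D"],
    "district": ["Eixample", "Eixample", "Gràcia", None, "Les Corts"],
    "rating": [4.5, 4.5, 4.2, 4.8, 4.6],
})
income = pd.DataFrame({
    "district": ["Eixample", "Gràcia", "Les Corts"],
    "income": [100.0, 80.0, 120.0],
})

# Step 2: drop exact duplicate rows and rows with no district,
# then attach the district-level income to each restaurant.
clean = restaurants.drop_duplicates().dropna(subset=["district"])
merged = clean.merge(income, on="district", how="left")

# Step 3: pairwise Pearson correlation between the numeric columns.
corr = merged[["rating", "income"]].corr()

print(merged)
print(corr)
```

A left merge keeps every restaurant even when a district has no match in the income table, which makes unmatched district names show up as missing values instead of silently disappearing.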

# Conclusions
Our analysis identified three standout districts, each with strategic potential depending on the target market: Eixample, Sarrià-Sant Gervasi, and Les Corts.

Ultimately, Eixample emerged as the ideal location. It offers the best balance of year-round demand, high tourist flow (it has the highest hotel concentration), and strong rating potential. Les Corts and Sarrià-Sant Gervasi share similar income levels and quality indicators, but their lower population density suggests reduced foot traffic, which makes Eixample the most compelling choice for visibility, volume, and long-term success.

## Further questions
- What pricing strategy fits the income level and tourist profile of the area?
- When is demand highest in this area?


Presentation: https://docs.google.com/presentation/d/1Ny8zftGfeMyTNozQgHw9vF9VJ0c4j-FLqNYIiMX6Q2w/edit?slide=id.g36e2651165f_2_245#slide=id.g36e2651165f_2_245
Extra sources: https://trello.com/b/h3utlBtt/firstproject
Binary file added data/clean/Location map.png
Empty file removed data/clean/cleaned_data_file.csv