A database of every Portuguese student who's ever attended university.
FindYourFriendUniversity started as a proof of concept (POC) to scrape and navigate the data on the DGES website. After it succeeded, I decided to build an Elixir Phoenix app that, alongside the initial Python script, indexes all the applications and placements of every student who has ever applied to a Portuguese public university.
In short, a Python script scrapes the data from DGES and saves it to a JSON file. A Phoenix app then reads the JSON file, populates the database with the data, and starts a web server where you can search for students, courses, universities and applications.
This was my first "big" Phoenix project, so I decided to leave a walkthrough of what I did.
Here you can see a demo over the real data from DGES (only showing me), followed by the complete website frontend with fake data. You can also see a screenshot of the website's responsiveness on mobile phones.
fake-data-demo.mp4
Below you will find the instructions to run the project in development and deployment mode. Choose the one that suits you best.
Start by installing the following tools:
- Git - Version Control System
- Python - Programming Language
- Docker - Containerization Platform
- asdf - Version Manager (optional, but recommended)
- Elixir - Functional Programming Language (install with asdf)
Now clone the repository to your local machine. You can do this using Git:
$ git clone git@github.com:darguima/FindYourFriendUniversity.git
# or
$ git clone https://github.com/darguima/FindYourFriendUniversity.git
Let's start by getting the seeds to populate the database. You have two options: generate fake data or scrape the real data from DGES.
Both options have a Python script, and both scripts write their output to the ./seeds/ folder.
Every file in the ./seeds/ folder that matches the naming pattern will be used, so remember to keep this folder clean. Application files should be named like applications_*.json.
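To check which files in the folder match that pattern, a minimal Python sketch like the one below can help. It only mirrors the applications_*.json rule stated above; how the Phoenix app actually selects seed files is not shown here, and find_application_seeds is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def find_application_seeds(seeds_dir="./seeds"):
    """Return the sorted paths in seeds_dir that match applications_*.json.

    Hypothetical helper: it mirrors the naming pattern described in the
    README, not the actual loading code of the Phoenix app.
    """
    folder = Path(seeds_dir)
    if not folder.is_dir():
        # No seeds folder yet, so nothing would be picked up.
        return []
    return sorted(p for p in folder.glob("applications_*.json") if p.is_file())


if __name__ == "__main__":
    for path in find_application_seeds():
        print(path)
```

Running it before seeding is a quick way to spot leftover files from a previous run.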
If you are thinking about scraping the data, keep in mind that this is the slowest option to set up: you will need to install the beautifulsoup4 package, and scraping real data can go against the GDPR.
To generate fake data:
$ python faker_seeds.py
To scrape the official data from DGES:
# Preparing the environment
$ python -m venv .venv # Just run once
$ source .venv/bin/activate # Run in each new terminal session
$ pip install -r requirements.txt # Just run once
# Scraping the data
$ python applications_scraper.py
If you haven't already, install asdf. Then you can install Elixir and Erlang:
$ asdf install
Then you need to create the database. I like to use Docker for that:
$ docker run --name fyfu_db -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres
# Or, instead of the command above, also start the container at computer startup
$ docker run --restart=always --name fyfu_db -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres
Now you can run the setup script and start the Phoenix server:
$ mix setup
$ mix phx.server
The server will then be running at localhost:4000.
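If something does not come up, a small Python sketch such as the following can tell you whether anything is listening on the two ports used above (5432 for PostgreSQL, 4000 for Phoenix). It is only a connectivity check under those assumptions: an open port does not prove the service behind it is healthy, and is_port_open is a hypothetical helper, not part of the project.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the commands above: 5432 for Postgres, 4000 for Phoenix.
    for name, port in [("PostgreSQL", 5432), ("Phoenix", 4000)]:
        status = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on localhost:{port} is {status}")
```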
To deploy the server, I prepared a Docker container so you can easily run it anywhere. Edit the ./docker/.env file with your environment variables and run the following command:
docker compose -f docker/docker-compose.yml up --build
Now you can access the server at localhost on the port you defined in the .env file.
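If you are unsure which variables your .env file currently defines, a rough Python sketch like this one lists them. It assumes simple KEY=VALUE lines and is a simplification of how Docker Compose reads the file; parse_env_file is a hypothetical helper, and the variable names the project actually expects are not reproduced here.

```python
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop optional surrounding quotes around the value.
            values[key.strip()] = value.strip().strip("'\"")
    return values


if __name__ == "__main__":
    env_path = Path("./docker/.env")
    if env_path.is_file():
        # Print only the variable names, so secrets are not echoed to the terminal.
        for key in sorted(parse_env_file(env_path)):
            print(key)
    else:
        print(f"{env_path} not found")
```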
Then you need to populate the database.
# Clean the seeds folder (just if you want to remove the previously added seeds)
docker exec -it fyfu_app rm /app/seeds/ -rf
# Copy seeds available now at seeds folder
docker cp ./seeds/ fyfu_app:/app/seeds/
# Run the seeds
docker exec -it fyfu_app mix ecto.setup
To completely remove the container and the volumes, run:
docker compose -f docker/docker-compose.yml down -v
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
With this Python script you will be able to scrape real personal data from the DGES website. However, this is illegal under the GDPR in Europe. Be careful when dealing with other people's personal information online. This was just a case study.
