Final-assignment

Content
  1. About The Project
  2. Languages and Tools
  3. Contact
  4. Acknowledgements
  5. Support

About The Project

Introduction to Data Science Final Project 2020

In this project we use the tools we learned during the course. The first part deals with solving probability exercises. The second part focuses on applying the tools we learned for pandas, NumPy and visualization. The last part focuses on machine learning, covering both classification and regression: taking a data set, analyzing it, and choosing appropriate models to predict values from it.

"The world is just one Big Data problem 😄"


First part: Probability

In this part we were asked to solve probability questions. Some of them were solved with the help of Bayes' theorem, random variables, inference, and reading information from graphs.
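As a small illustration of the kind of Bayes' theorem exercise mentioned above, here is a minimal sketch. The numbers (a 1% prior, a 90% detection rate and a 5% false-positive rate) and the function name `bayes_posterior` are made up for the example and are not taken from the course questions.

```python
from fractions import Fraction


def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) from P(H), P(E | H) and P(E | not H)."""
    # Law of total probability: P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    # Bayes' theorem: P(H|E) = P(E|H) P(H) / P(E)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of items have the property H, the test flags
# 90% of those items and also 5% of the items without it.
posterior = bayes_posterior(Fraction(1, 100), Fraction(9, 10), Fraction(1, 20))
print(posterior, float(posterior))
```

Using `Fraction` keeps the arithmetic exact, which makes it easier to compare the result with a hand calculation.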


Second part: Programming

In this section we were asked to answer a number of questions that test how well we understood and internalized the material taught throughout the course on pandas, NumPy, etc.


Third part: Machine Learning

In this part you will see how machine learning lets us analyze information and make numerical predictions that can be useful to people, and even to the whole globe. In the following lines I describe the data sets I worked on.

So, grab the coffee and cookies and tighten the belt before takeoff!

In each notebook (Classification, Regression) we follow these steps to make sure we work by the book.

  • Step 1 - Import Libraries and load the data.
  • Step 2 - Data Cleaning, checking for null values and filling them.
  • Step 3 - Variable Descriptions, Discover and Visualize the Data to Gain Insights.
  • Step 4 - Split the data into train and test.
  • Step 5 - Applying Machine Learning Models.
  • Step 6 - Evaluating each Model and fine-tuning it in order to achieve the best results.
  • Step 7 - Determining which Model is optimal for our data set.
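
The steps above can be sketched roughly as follows. This is only an illustrative outline, not the code from the notebooks: it assumes scikit-learn is installed, uses a synthetic data set in place of the real CSV files, and the two candidate models and their parameter grids are arbitrary choices made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Step 1: import libraries and load the data (synthetic stand-in here).
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
df = pd.DataFrame(X, columns=[f"f{i}" for i in range(6)])
df["target"] = y
# Inject a few missing values so the cleaning step has something to do.
df.iloc[::25, df.columns.get_loc("f0")] = np.nan

# Step 2: check for nulls and fill them with the column median.
# (A stricter setup would compute the median on the training split only.)
print(df.isnull().sum())
df = df.fillna(df.median(numeric_only=True))

# Step 3: describe the variables (plots would go here in a notebook).
print(df.describe())

# Step 4: split the data into train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="target"),
    df["target"],
    test_size=0.2,
    random_state=0,
    stratify=df["target"],
)

# Steps 5 and 6: fit each model, tune it with a small grid search,
# and evaluate the tuned model on the held-out test set.
candidates = {
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"C": [0.1, 1.0, 10.0]},
    ),
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [2, 4, 8]},
    ),
}
scores = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5)
    search.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, search.predict(X_test))

# Step 7: pick the model with the best test accuracy.
best = max(scores, key=scores.get)
print(scores, "->", best)
```

For a regression notebook the same outline would apply with regressors and a regression metric (for example mean squared error) in place of the classifiers and accuracy.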

Classification

Who does not like mushrooms 🍄🍄🍄 in their pizza or pasta?!

I chose to work on a dataset to predict whether a mushroom is edible or poisonous!

The mushrooms.csv dataset includes descriptions of hypothetical samples corresponding to 23 species of gilled mushrooms in the Agaricus and Lepiota family, drawn from The Audubon Society Field Guide to North American Mushrooms (1981). Each species is identified as definitely edible, definitely poisonous, or of unknown edibility and not recommended. This latter class was combined with the poisonous one.

  • Time period: Donated to UCI ML 27 April 1987
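
Because the mushroom samples are described by categorical attributes, they usually need to be encoded as numbers before a model can use them. The sketch below shows one common option, one-hot encoding with `pandas.get_dummies`, on a tiny hand-made table. The column names and letter codes in `toy` are assumptions for illustration and are not read from mushrooms.csv; the notebook may encode the data differently.

```python
import pandas as pd

# Hypothetical miniature sample; column names and letter codes are
# assumptions for this example, not loaded from the real file.
toy = pd.DataFrame(
    {
        "class": ["e", "p", "e", "p"],  # e = edible, p = poisonous
        "odor": ["n", "f", "a", "f"],
        "cap-shape": ["x", "x", "b", "f"],
    }
)

# Binary target: 1 = poisonous, 0 = edible.
y = (toy["class"] == "p").astype(int)

# One-hot encode every categorical feature column.
X = pd.get_dummies(toy.drop(columns="class"))

print(X.columns.tolist())
print(y.tolist())
```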

I'm pretty sure this makes you a little curious, so what are you waiting for?

Get in quickly to explore my work! 😀


Regression

Who does not like to drink a good wine 🍷🍷🍷 with their pizza or pasta?!

I chose to work on a dataset to predict the quality of wine (to make sure you always drink the best)!

The winequality-red.csv dataset is related to the red variant of the Portuguese "Vinho Verde" wine. Due to privacy and logistical issues, only physicochemical (input) and sensory (output) variables are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).

This dataset can be viewed as either a classification or a regression task. The classes are ordered and not balanced (e.g. there are many more normal wines than excellent or poor ones). Outlier detection algorithms could be used to detect the few excellent or poor wines. Also, we are not sure that all input variables are relevant, so it could be interesting to test feature selection methods.
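
To make the regression-versus-classification point and the class imbalance concrete, here is a minimal sketch. The `quality` list is made-up illustrative data, not values from winequality-red.csv, and the cut-off of 7 for a "good" wine is an assumption chosen for the example.

```python
from collections import Counter

# Hypothetical quality scores; NOT taken from winequality-red.csv.
quality = [5, 6, 5, 6, 7, 5, 6, 4, 5, 6, 8, 5, 6, 5, 3, 6]

# Regression view: the target is the numeric score itself.
counts = Counter(quality)
shares = {score: counts[score] / len(quality) for score in sorted(counts)}
for score, share in shares.items():
    print(f"quality {score}: {share:.0%}")

# Classification view: bin the score into classes.
# The threshold of 7 is an assumption made for this example.
labels = ["good" if q >= 7 else "not good" for q in quality]
label_counts = Counter(labels)
print(label_counts)
```

Most of the made-up samples sit in the middle scores, so the "good" class ends up much smaller than the other one. With an imbalance like this, a stratified train/test split and metrics other than plain accuracy are worth considering.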


Languages and Tools

Acknowledgements

Contact

Gal - koazgal@gmail.com

Project Link: https://github.com/GalKoaz/Final-assignment

Support

Give a ⭐️ if this project helped you in some way (For The Good Karma 😇)!
