secil-carver/Multiple-Linear-Regression


Multiple Linear Regression Analysis

Research Question and Purpose of Analysis

Can we predict the Tenure duration of our customers with the variables we have?

The Goal of the Data Analysis

The goal of this analysis is to predict customer Tenure from the information we hold about each customer: how they use the services we offer (such as Bandwidth, Device Protection, and Streaming Movies) and the type of contract they have signed.

Technique Justification

The method used in this data analysis is Multiple Linear Regression. Multiple Linear Regression models the relationship between two or more predictor variables and a target variable by fitting a linear equation to the observed data.
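For reference, the linear equation mentioned above is usually written as follows. The notation is the standard textbook one and is not taken from the repository: y is the target (here Tenure), the x terms are the p predictors, the β terms are the coefficients to be estimated, and ε is the error term. Ordinary least squares is assumed as the fitting criterion.

```latex
\[
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i,
\qquad i = 1, \dots, n
\]
\[
\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n}
\left( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \right)^{2}
\]
```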

The Assumptions of the Multiple Regression Model are:

a) There is a linear relationship between the dependent and the independent variables

b) The independent variables are not highly correlated with each other (no multicollinearity)

c) The residuals are normally distributed

d) The observations are independent of each other
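Assumption (b) can be checked numerically. The sketch below computes variance inflation factors (VIF) with NumPy only; it is one common diagnostic, not necessarily the check used in this repository. The function name and the synthetic columns are hypothetical.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns plus an intercept.
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        # VIF = 1 / (1 - R^2); a perfect fit means infinite inflation.
        vifs.append(np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return np.array(vifs)

# Synthetic predictors (made up for illustration): c is nearly a copy of a.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.1 * rng.normal(size=500)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
print(vifs)
```

A frequently quoted rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic multicollinearity. In this example the first and third columns should be flagged while the second should not.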

The analysis was carried out in Python, whose packages support visualizations, quick statistical summaries, feature selection methods, and linear regression models. Regression analysis was an appropriate technique for my research question because the dependent variable (Tenure) is continuous, with a bimodal distribution, and the independent variables are a mix of numeric and categorical variables with varying numbers of values (such as Bandwidth, StreamingTV, InternetServices, etc.). I also wanted not only to measure the correlation and magnitude of the relationship between the predictor and target variables, but also to use the model for future predictions.
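Mixing numeric and categorical predictors in one linear model can be sketched as below. This is a minimal NumPy-only illustration, not the repository's code: the data are synthetic, the coefficients used to generate the target are made up, and `one_hot_drop_first` is a hypothetical helper. The categorical column is dummy-encoded with its first level dropped as the baseline, so the dummy columns are not collinear with the intercept.

```python
import numpy as np

def one_hot_drop_first(labels):
    """Dummy-encode a categorical column, dropping the first level as baseline."""
    labels = np.asarray(labels)
    levels = sorted(set(labels.tolist()))
    kept = levels[1:]
    columns = [(labels == level).astype(float) for level in kept]
    return np.column_stack(columns), kept

# Synthetic stand-in data; every value here is made up for illustration.
rng = np.random.default_rng(42)
n = 300
bandwidth = rng.uniform(100, 7000, size=n)
contract = rng.choice(["Month-to-month", "One year", "Two Year"], size=n)
contract_dummies, dummy_names = one_hot_drop_first(contract)

# Made-up relationship used only to generate a target to fit against.
tenure = (2.0 + 0.01 * bandwidth
          + contract_dummies @ np.array([5.0, 10.0])
          + rng.normal(0.0, 1.0, size=n))

# Design matrix: intercept, numeric predictor, dummy columns.
X = np.column_stack([np.ones(n), bandwidth, contract_dummies])
beta, *_ = np.linalg.lstsq(X, tenure, rcond=None)

# Predict for a hypothetical customer: bandwidth 3000, "Two Year" contract.
new_customer = np.array([1.0, 3000.0, 0.0, 1.0])
predicted_tenure = float(new_customer @ beta)
print(dummy_names, beta, predicted_tenure)
```

Each dummy coefficient is read as the expected difference in Tenure relative to the baseline level, with the other predictors held fixed.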

Heatmap

The heatmap gives a quick visual summary of the strength of the relationship between each pair of variables, laid out across two axes.
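A heatmap of this kind is typically drawn from a correlation matrix. The sketch below assumes Pearson correlation and uses synthetic columns; the function names are hypothetical, and the repository's own plot may have been produced differently. Matplotlib is imported inside the plotting function so the numeric part runs without it.

```python
import numpy as np

def correlation_matrix(columns):
    """Pearson correlation matrix for a dict of name -> 1-D numeric array."""
    names = list(columns)
    data = np.vstack([np.asarray(columns[name], dtype=float) for name in names])
    # np.corrcoef treats each row as one variable by default.
    return names, np.corrcoef(data)

def plot_heatmap(names, corr, path="heatmap.png"):
    """Save the matrix as a heatmap image; requires matplotlib."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    image = ax.imshow(corr, vmin=-1, vmax=1, cmap="coolwarm")
    ax.set_xticks(range(len(names)))
    ax.set_xticklabels(names, rotation=45, ha="right")
    ax.set_yticks(range(len(names)))
    ax.set_yticklabels(names)
    fig.colorbar(image, ax=ax, label="Pearson r")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)

# Synthetic columns (made up): Bandwidth tracks Tenure closely, Noise does not.
rng = np.random.default_rng(7)
tenure = rng.uniform(1, 72, size=400)
columns = {
    "Tenure": tenure,
    "Bandwidth": 80.0 * tenure + rng.normal(0.0, 100.0, size=400),
    "Noise": rng.normal(size=400),
}
names, corr = correlation_matrix(columns)
print(names)
print(np.round(corr, 2))
```

Calling `plot_heatmap(names, corr)` writes the image to `heatmap.png`. Fixing the color scale to the range -1 to 1 keeps the colors comparable between plots.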

image

Q-Q Plot

Q-Q plots (quantile-quantile plots) compare the quantiles of a sample distribution against those of a theoretical distribution. They help us determine whether a dataset follows a particular probability distribution, such as the normal or the exponential. In regression, they are commonly used to check whether the residuals are normally distributed.
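The points of a normal Q-Q plot can be computed by hand, as in the sketch below. It assumes the plotting positions (i - 0.5) / n, which is one of several conventions in use, and works on synthetic samples; the function name is hypothetical. The correlation between theoretical and sample quantiles gives a rough numeric summary of how straight the plot is.

```python
import numpy as np
from statistics import NormalDist

def normal_qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    observed = np.sort(np.asarray(sample, dtype=float))
    n = observed.size
    # Plotting positions (i - 0.5) / n avoid probabilities of exactly 0 or 1.
    probs = (np.arange(1, n + 1) - 0.5) / n
    std_normal = NormalDist()
    theoretical = np.array([std_normal.inv_cdf(float(p)) for p in probs])
    return theoretical, observed

# Synthetic samples (made up): one normal, one clearly skewed.
rng = np.random.default_rng(1)
normal_sample = rng.normal(loc=0.0, scale=1.0, size=1000)
skewed_sample = rng.exponential(scale=1.0, size=1000)

theo_n, obs_n = normal_qq_points(normal_sample)
theo_s, obs_s = normal_qq_points(skewed_sample)

# Points close to a straight line give a correlation close to 1.
r_normal = float(np.corrcoef(theo_n, obs_n)[0, 1])
r_skewed = float(np.corrcoef(theo_s, obs_s)[0, 1])
print(r_normal, r_skewed)
```

Plotting `theo_n` against `obs_n` as a scatter plot gives the usual picture. The normal sample should lie near a straight line, while the skewed sample should bend away from it in the tails.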

image

Model Metrics

R-squared

F-statistic

Residual standard error (RSE)
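The three metrics listed above can be computed directly from the residuals of a fitted model. The sketch below uses the standard definitions for a model with an intercept and p predictors; the data are synthetic and the function name is hypothetical, so the numbers are not the repository's results.

```python
import numpy as np

def regression_metrics(X, y):
    """Fit OLS with an intercept and return R-squared, F-statistic and RSE."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta

    ss_res = float(resid @ resid)                 # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())   # total variation
    df_resid = n - p - 1                          # residual degrees of freedom

    r_squared = 1.0 - ss_res / ss_tot
    # F compares explained variation per predictor with residual variance.
    f_statistic = ((ss_tot - ss_res) / p) / (ss_res / df_resid)
    # RSE estimates the standard deviation of the error term.
    rse = float(np.sqrt(ss_res / df_resid))
    return {"r_squared": r_squared, "f_statistic": f_statistic, "rse": rse}

# Synthetic data (made up): two predictors, noise standard deviation 2.0.
rng = np.random.default_rng(3)
n = 200
X = rng.normal(size=(n, 2))
y = 1.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 2.0, size=n)

metrics = regression_metrics(X, y)
print(metrics)
```

R-squared is the share of variance in the target that the model explains. The F-statistic tests whether the predictors jointly explain more than an intercept-only model. RSE is expressed in the units of the target, so here it should land near the noise level of 2.0 used to generate the data.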
