Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that predicts the outcome of a response variable by combining several explanatory variables. The aim of multiple linear regression is to model the linear relationship between the explanatory (independent) variables and the response (dependent) variable. In essence, multiple regression is an extension of ordinary least-squares (OLS) regression in that it involves more than one explanatory variable.
Key Points in Multiple Linear Regression (MLR)
Assumptions of multiple linear regression
Multiple linear regression makes all of the same assumptions as simple linear regression:
Homogeneity of variance (homoscedasticity): The size of the error in our prediction does not vary significantly across values of the independent variables.
Independence of observations: The dataset's observations were gathered using statistically valid sampling methods, and there are no hidden relationships between variables.
Normality: A normal distribution can be inferred from the data.
Linearity: The line of best fit through the data points is a straight line, rather than a curve or some sort of grouping factor.
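These assumptions can be checked empirically once a model has been fit. The following is a minimal diagnostic sketch, assuming synthetic data and the statsmodels/scipy libraries (neither is prescribed by this section), that tests homoscedasticity with a Breusch-Pagan test and residual normality with a Shapiro-Wilk test.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from scipy import stats

# Synthetic data standing in for a real dataset (assumption for illustration).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))                  # two explanatory variables
y = 1.5 + 2.0 * X[:, 0] - 0.7 * X[:, 1] + rng.normal(scale=1.0, size=n)

model = sm.OLS(y, sm.add_constant(X)).fit()

# Homoscedasticity: Breusch-Pagan test on the residuals.
# A small p-value suggests the error variance changes with the predictors.
_, bp_pvalue, _, _ = het_breuschpagan(model.resid, model.model.exog)
print(f"Breusch-Pagan p-value: {bp_pvalue:.3f}")

# Normality: Shapiro-Wilk test on the residuals.
# A small p-value suggests the residuals are not normally distributed.
_, shapiro_pvalue = stats.shapiro(model.resid)
print(f"Shapiro-Wilk p-value: {shapiro_pvalue:.3f}")
```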
In multiple linear regression, it is possible that some of the independent variables are correlated with one another, so it is important to check for such correlations before developing the regression model. If two independent variables are too highly correlated (r² > ~0.6), then only one of them should be used in the regression model.
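One simple way to perform this check is to look at the squared pairwise correlations between the candidate predictors before fitting anything. The sketch below is illustrative only, with made-up variable names and synthetic data, and assumes pandas/numpy.

```python
import numpy as np
import pandas as pd

# Synthetic predictors; x2 is deliberately built to be highly correlated with x1.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# Squared pairwise correlations between the explanatory variables.
r2 = X.corr() ** 2

# Flag any pair with r^2 above roughly 0.6; only one variable from each
# flagged pair should go into the regression model.
cols = list(r2.columns)
flagged = [(a, b, round(r2.loc[a, b], 3))
           for i, a in enumerate(cols)
           for b in cols[i + 1:]
           if r2.loc[a, b] > 0.6]
print(flagged)   # expect something like [('x1', 'x2', 0.98)]
```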
The formula for a multiple linear regression is:
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \epsilon$$

where, for $i = 1, \dots, n$ observations:

$y_i$ = dependent variable
$x_{i1}, \dots, x_{ip}$ = explanatory variables
$\beta_0$ = $y$-intercept (constant term)
$\beta_1, \dots, \beta_p$ = slope coefficients for each explanatory variable
$\epsilon$ = the model's error term (also known as the residuals)
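To connect the formula to practice, here is a minimal fitting sketch, assuming synthetic data and statsmodels OLS (the section itself does not prescribe a library): the fitted parameters line up with $\beta_0, \beta_1, \dots, \beta_p$, and the residuals estimate the error term $\epsilon$.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic data with known coefficients (assumption for illustration).
rng = np.random.default_rng(2)
n, p = 100, 3
X = rng.normal(size=(n, p))
true_slopes = np.array([2.0, -1.0, 0.5])
y = 4.0 + X @ true_slopes + rng.normal(scale=0.3, size=n)

# Fit by ordinary least squares; add_constant supplies the intercept column.
results = sm.OLS(y, sm.add_constant(X)).fit()

print(results.params)     # first entry estimates beta_0, the rest beta_1..beta_p
print(results.resid[:5])  # residuals: the estimated error terms
print(results.summary())  # full coefficient table with standard errors and R^2
```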