New Discovery In Regression For Machine Learning, Mathematics And Statistics
Improvement has been the basic theme of overall human development, from the Stone Age to the modern scientific world. Machine learning is part of that same story of improvement over the years. Today we are going to look at one such improvement, one that leads to knowledge and a better understanding of our models, following the Perfection-Precision-Prediction-Probable-Programmable philosophy.
“Humans involve to highly evolve — Ankit Shende”
Let us consider a problem to understand the entire concept in depth. Suppose you want to buy a house within your budget. A bunch of brokers contact you to present sites, house features, and prices. You will therefore analyze the options to find the best buy for your money.
Simple Mean Prediction
The first analysis involves a basic average prediction without using any formula. This is generally the first thing we do, whether buying a house, a car, or clothes: look at the locality, check the going rates, and immediately compute the average to get a rough idea. Two situations roll together in this process, known as prediction error.
In one, we overestimate the price and end up spending more money, eventually leading to a loss. In the other, we underestimate the price and lose the opportunity to buy as the deal slips away. This situation is handled through the mean squared error. In simple language, it is the sum of the squared differences between predicted and actual values, divided by the number of observations taken into account.
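As a quick illustration, the simple mean prediction and its mean squared error can be computed in a few lines. The prices below are made-up numbers used only for the sketch:

```python
import numpy as np

# hypothetical house prices observed in the locality (illustrative data)
prices = np.array([45, 50, 38, 62, 55])

# simple mean prediction: predict the average price for every house
mean_prediction = prices.mean()

# mean squared error: squared differences between actual and predicted,
# summed and divided by the number of observations
mse = np.mean((prices - mean_prediction) ** 2)

print("mean prediction:", mean_prediction)
print("mean squared error:", mse)
```

A large MSE here simply says the rough average is a poor guide for individual houses, which is the motivation for moving to regression.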
To further increase the accuracy of the decision, many techniques evolved, of which regression is one.
What is Regression?
Regression is a modeling technique that evolved in statistics to derive results by estimating the relationship between dependent and independent variables. A broad classification of regression covers 15 different techniques, but I will cover only three of the most widely used. The reason for not covering all 15 is time, and the main focus here is on the new technique I arrived at.
Linear Regression
Whenever there is a linear relationship between the independent and dependent variables, we call it linear regression.
With one independent variable, the model is Y = β0 + β1X, where β0 represents the intercept and β1 the slope of the line.
The line of best fit exists to compare predicted and actual values: the closer the points lie to the line, the smaller the error. Points far away are treated as noise and discarded. The errors obtained from the differences between actual and predicted values are known as residuals.
# import packages and classes
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
# provide data
x = np.array([7, 4, 18, 22, 36, 45, 56, 50]).reshape(-1, 1)
y = np.array([5, 20, 14, 32, 22, 38, 13, 25])
# create model and fit
model = LinearRegression()
model.fit(x, y)
# get results
r_sq = model.score(x, y)
print('Coefficient of Determination:', r_sq)
print('Intercept:', model.intercept_)
print('Slope:', model.coef_)
# predict response
y_pred = model.predict(x)
print('Predicted response:', y_pred)
# show the data in a scatter plot
fig_rect, axs = plt.subplots()
axs.scatter(x, y)
plt.show()
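The residuals described earlier can be inspected directly from the fitted model. A short sketch using the same `x` and `y` arrays as above:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([7, 4, 18, 22, 36, 45, 56, 50]).reshape(-1, 1)
y = np.array([5, 20, 14, 32, 22, 38, 13, 25])

model = LinearRegression().fit(x, y)

# residual = actual value minus predicted value for each observation
residuals = y - model.predict(x)

# for ordinary least squares with an intercept, residuals sum to ~zero
print("residuals:", residuals)
print("sum of residuals:", residuals.sum())
```

Plotting these residuals against `x` is a common way to judge whether a straight line is actually a good fit.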
Multiple Regression
Multiple (multivariate) regression comes into play when there are two or more independent variables, which will become clear through the example below.
# step 1: import packages and classes
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
# step 2: provide data
x = [[2, 6, 5], [5, 1, 7], [15, 2, 6], [25, 5, 8], [35, 11, 10], [45, 15, 13], [55, 34, 17], [60, 35, 19]]
y = [4, 5, 20, 14, 32, 22, 38, 43]
x, y = np.array(x), np.array(y)
# step 3: create a model and fit it
model = LinearRegression().fit(x, y)
# step 4: get results
r_sq = model.score(x, y)
print('Coefficient of Determination:', r_sq)
print('Intercept:', model.intercept_)
print('Slope:', model.coef_)
# step 5: predict response
y_pred = model.predict(x)
print('Predicted response:', y_pred)
# step 6: visualize the feature matrix as a heat map
figure_circle, axs = plt.subplots()
imfig = axs.imshow(x)
plt.show()
Polynomial Regression
When a polynomial relationship exists between the dependent and independent variables, it becomes polynomial regression. For example,
# step 1: import packages and classes
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
# step 2: provide data
x = [22, 26, 20, 33, 48, 60]
y = [20, 25, 30, 15, 10, 40]
# step 3: convert data into arrays
x = np.array(x).reshape(-1, 1)
y = np.array(y)
# step 4: transform the input data (add the squared term)
transformer = PolynomialFeatures(degree=2, include_bias=False)
x_ = transformer.fit_transform(x)
# step 5: create a model and fit it
model = LinearRegression().fit(x_, y)
# step 6: get results
r_sq = model.score(x_, y)
print('Coefficient of Determination:', r_sq)
print('Intercept:', model.intercept_)
print('Coefficients:', model.coef_)
Derivative Regression
The entire system can be improved by shedding new statistical light on regression. So far we have seen three techniques, and the remaining twelve remain to be covered. This is a scientific and justifiable practice for more precision and better results. I propose that there are three variables, not two, in any regression formula:
- Dependent
- Independent
- Interdependent
Y (Dependent variable) = (β0 + β1X1) [the Interdependent term] * Independent variable
Here the Independent variable is the derivative that has a relationship with both the Dependent and Interdependent variables, having influence on them, but the reverse is not true. The Independent variable acts as a bridge between the Dependent and Interdependent variables.
For example, consider the movement of prices in the stock market, where we are trying to predict the price of a particular index or share. Price becomes the dependent variable, affected by several interdependent factors such as buyers, sellers, trading volume, news, or the company's profit margin. Finally comes the independent variable, which has a significant effect on both the dependent and interdependent variables if plotted, but remains completely free of both of them in its own running cycle. Here the independent variable is a derivative of time, though in other situations a different independent variable may increase efficiency and accuracy. Time can also be plotted against another factor that is interdependent on it; for instance, the matching of buyer and seller quantities is a function of time in the real-time stock market. I calculated the values and observed the experimental results; even though the idea is at a nascent stage, sharing it felt important.
Another example is a car moving on the road. The moving car is the dependent variable, relying on interdependent variables (engine, wheels) and an independent variable (petrol or diesel). Here too, the movement of the car stops along with the engine if the fuel runs out or the fuel supply is cut off, while the engine and the movement have zero influence on the fuel. Remember that the independent-variable derivative should be multiplied, a point supported by the accuracy example in Richard Feynman's lectures involving Achilles and the tortoise.
Another example: a shopkeeper wants to sell (dependent variable), the customer's desire to buy and the cost of the product are interdependent, and the product itself is the independent variable. The product can be manufactured and sold by another shopkeeper and bought by some other customer.
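As a purely exploratory sketch of the Derivative Regression formula above: if Y = (β0 + β1X1) * Z, then wherever Z is nonzero we can rearrange to Y/Z = β0 + β1X1 and fit an ordinary linear regression on the ratio. The data below is invented for illustration, and this rearrangement is my own assumption about how the formula could be estimated, not an established method:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# invented data generated exactly from Y = (b0 + b1 * x1) * z
# with b0 = 2 and b1 = 3, so the fit should recover those values
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # interdependent variable
z = np.array([2.0, 1.5, 3.0, 2.5, 4.0])    # independent variable (the "derivative")
y = (2 + 3 * x1) * z                       # dependent variable

# rearrange Y = (b0 + b1*X1) * Z  ->  Y/Z = b0 + b1*X1, valid only for Z != 0
model = LinearRegression().fit(x1.reshape(-1, 1), y / z)

print("estimated b0:", model.intercept_)
print("estimated b1:", model.coef_[0])
```

On real data the ratio Y/Z would carry noise unevenly, so this sketch is only a starting point for the experiments described above.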
Key Takeaway
Regression analysis is simply a move from uncertain ground toward certainty. The more your model leads you toward accuracy, the more reliable it is. Mathematicians and statisticians can implement this idea and check the accuracy for themselves, and such a change could bring good results in regression analysis. Ahead, I will cover more machine learning topics on my blog; stay tuned.