normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Normal Equation

Given a matrix equation, the normal equation is one which minimizes the sum of the square differences between the left and right sides

Basics of Machine Learning Series

Introduction

Gradient descent is an algorithm which is used to reach an optimal solution iteratively using the gradient of the loss function or the cost function. In contrast, normal equation is a method that helps solve for the parameters analytically i.e. instead of reaching the solution iteratively, solution for the parameter \(\theta\) is reached at directly by solving the normal equation.

Intuition

Consider a one-dimensional equation for the cost function given by,

According to calculus, one can find the minimum of this function by calculating the derivative and solving the equation by setting derivative equal to zero, i.e.

Similarly, extending (1) to multi-dimensional setup, the cost function is given by,

And similar to (2), the minimum of (3) can be found by taking partial derivatives w.r.t. individual \(\theta_i \forall i \in (0, 1, 2, \cdots, n) \) and solving the equations by setting them to zero, i.e.

Through derivation one can find that \(\theta\) is given by,

Feature scaling is not necessary for the normal equation method. Reason being, the feature scaling was implemented to prevent any skewness in the contour plot of the cost function which affects the gradient descent but the analytical solution using normal equation does not suffer from the same drawback.

Comparison between Gradient Descent and Normal Equation

Given m training examples, and n features

Gradient DescentNormal Equation
Proper choice of \(\alpha\) is important\(\alpha\) is not needed
Iterative MethodDirect Solution
Works well with large n. Complexity of algorithm is O(\(kn^2\))Slow for large n. Need to compute \((X^TX)^<-1>\). Generally the cost for computing the inverse is O(\(n^3\))

Generally if the number of features is less than 10000, one can use normal equation to get the solution beyond which the order of growth of the algorithm will make the computation very slow.

Non-invertibility

Matrices that do not have an inverse are called singular or degenerate.

Reasons for non-invertibility:

Calculating psuedo-inverse instead of inverse can also solve the issue of non-invertibility.

Implementation

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Derivation of Normal Equation

Given the hypothesis,

Let X be the design matrix wherein each row corresponds to the features in \(i^

\) sample of the m samples. Similarly, y is the vector with all the target values for all the m training samples. The cost function for the hypothesis (6) is given by (3). The cost function can be vectorized as follows for replacing the sigma operation with the sum over terms for matrix multiplication,

Since \(X\theta\) and \(y\) both are vectors, \((X\theta)^Ty = y^T(X\theta)\). So (7) can be further simplified as,

Π˜ΡΡ‚ΠΎΡ‡Π½ΠΈΠΊ

Normal Equation in Linear Regression

Author(s): Saniya Parveez

Machine Learning

Gradient descent is a very popular and first-order iterative optimization algorithm for finding a local minimum over a differential function. Similarly, the Normal Equation is another way of doing minimization. It does minimization without restoring to an iterative algorithm. Normal Equation method minimizes J by explicitly taking its derivatives concerning theta j and setting them to zero.

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Below is a data-set to predict house price:

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Gradient Descent Vs Normal Equation

Gradient Descent

Normal Equation

Linear Regression with Normal Equation

Load the Portland data

Visualize The Area against the Price:

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Visualize the Number of Rooms against the Price of the House:

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Here, the relationship between the Number of Rooms, and the Price of the House, appears to be Linear.

Define Feature Matrix, and Outcome/Target Vector:

Visualize Cost Function:

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Split Data

Normal Equation

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Prediction using Normal Equation theta value

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Prediction using Linear Regression

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Here, the predictions from the Normal Equation and Linear Equation are the same.

Normal Equation Non-Invertibility

A squared matrix that does not have an inverse a matrix is singular if and only if it is determined is zero.

The inverse of Matrix:

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅Error

Problem due to Non-Invertibility:

How to solve if there are too many features?

Conclusion

Gradient Descent gives one way to minimizing J. Normal Equation is another way of doing minimization. It does minimization without restoring to an iterative algorithm. But, Normal Equation is very slow if the data-set size is very large

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Normal Equation in Linear Regression was originally published in Towards AI β€” Multidisciplinary Science Journal on Medium, where people are continuing the conversation by highlighting and responding to this story.

Π˜ΡΡ‚ΠΎΡ‡Π½ΠΈΠΊ

ML | Normal Equation in Linear Regression

Normal Equation is an analytical approach to Linear Regression with a Least Square Cost Function. We can directly find out the value of ΞΈ without using Gradient Descent. Following this approach is an effective and time-saving option when are working with a dataset with small features.
Normal Equation is a follows :

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Attention reader! Don’t stop learning now. Get hold of all the important Machine Learning Concepts with the Machine Learning Foundation Course at a student-friendly price and become industry ready.

In the above equation,
ΞΈ: hypothesis parameters that define it the best.
X: Input feature value of each instance.
Y: Output value of each instance.

Maths Behind the equation –

Given the hypothesis function

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

where,
n: the no. of features in the data set.
x0: 1 (for vector multiplication)
Notice that this is a dot product between ΞΈ and x values. So for the convenience to solve we can write it as :

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

The motive in Linear Regression is to minimize the cost function :

where,
x i : the input value of i ih training example.
m: no. of training instances
n: no. of data-set features
y i : the expected result of i th instance
Let us representing the cost function in a vector form.

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

we have ignored 1/2m here as it will not make any difference in the working. It was used for mathematical convenience while calculation gradient descent. But it is no more needed here.

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

x i j: value of j ih feature in i ih training example.
This can further be reduced to

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

But each residual value is squared. We cannot simply square the above expression. As the square of a vector/matrix is not equal to the square of each of its values. So to get the squared value, multiply the vector/matrix with its transpose. So, the final equation derived is

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Therefore, the cost function is

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

So, now getting the value of ΞΈ using derivative

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

So, this is the finally derived Normal Equation with ΞΈ giving the minimum cost value.

Π˜ΡΡ‚ΠΎΡ‡Π½ΠΈΠΊ

РусскиС Π‘Π»ΠΎΠ³ΠΈ

[МашинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅ ΠŸΡ€ΠΈΠΌΠ΅Ρ‡Π°Π½ΠΈΡ 1.1] РСшСниС Π½ΠΎΡ€ΠΌΠ°Π»ΡŒΠ½Ρ‹Ρ… ΡƒΡ€Π°Π²Π½Π΅Π½ΠΈΠΉ Π»ΠΈΠ½Π΅ΠΉΠ½ΠΎΠΉ рСгрСссии

ΠžΠ±Π·ΠΎΡ€ Π»ΠΈΠ½Π΅ΠΉΠ½ΠΎΠΉ рСгрСссии

Π”Π°Π²Π°ΠΉΡ‚Π΅ сначала рассмотрим ΠΏΡ€ΠΎΡΡ‚Π΅ΠΉΡˆΠΈΠΉ случай, Ρ‚. Π•. ΠšΠΎΠ»ΠΈΡ‡Π΅ΡΡ‚Π²ΠΎ Π²Ρ…ΠΎΠ΄Π½Ρ‹Ρ… Π°Ρ‚Ρ€ΠΈΠ±ΡƒΡ‚ΠΎΠ² Ρ‚ΠΎΠ»ΡŒΠΊΠΎ ΠΎΠ΄Π½ΠΎ, ΠΈ линСйная рСгрСссия пытаСтся Π½Π°ΡƒΡ‡ΠΈΡ‚ΡŒΡΡ [1].

Π’Π΅ΠΏΠ΅Ρ€ΡŒ запрос E ( w β†’ ) » role=»presentation» style=»position: relative;»> E ( w β†’ ) МинимальноС Π·Π½Π°Ρ‡Π΅Π½ΠΈΠ΅ E ( w β†’ ) » role=»presentation» style=»position: relative;»> E ( w β†’ ) Π’Π΅Ρ€Π½Ρ‹ΠΉ w β†’ » role=»presentation» style=»position: relative;»> w β†’ ΠΏΡ€ΠΎΠΈΠ·Π²ΠΎΠ΄Π½Ρ‹ΠΉ

ΠŸΡ€ΠΈΠΌΠ΅Ρ€ ΠΊΠΎΠ΄Π°

Как ΡΡƒΠ΄ΠΈΡ‚ΡŒ ΠΎ качСствС ΠΌΠΎΠ΄Π΅Π»ΠΈ

ΠŸΠΎΡ‡Ρ‚ΠΈ любой Π½Π°Π±ΠΎΡ€ Π΄Π°Π½Π½Ρ‹Ρ… ΠΌΠΎΠΆΠ΅Ρ‚ Π±Ρ‹Ρ‚ΡŒ смодСлирован с ΠΏΠΎΠΌΠΎΡ‰ΡŒΡŽ Π²Ρ‹ΡˆΠ΅ΡƒΠΊΠ°Π·Π°Π½Π½ΠΎΠ³ΠΎ ΠΌΠ΅Ρ‚ΠΎΠ΄Π°, Ρ‚Π°ΠΊ ΠΊΠ°ΠΊ ΠΎΡ†Π΅Π½ΠΈΡ‚ΡŒ качСство этих ΠΌΠΎΠ΄Π΅Π»Π΅ΠΉ? [2] Π‘Ρ€Π°Π²Π½ΠΈΡ‚Π΅ Π΄Π²Π° ΠΏΠΎΠ΄Π³Ρ€Π°Ρ„Π° Π½Π° рисункС Π½ΠΈΠΆΠ΅.Если Π²Ρ‹ Π²Ρ‹ΠΏΠΎΠ»Π½ΠΈΡ‚Π΅ Π»ΠΈΠ½Π΅ΠΉΠ½ΡƒΡŽ Ρ€Π΅Π³Ρ€Π΅ΡΡΠΈΡŽ для Π΄Π²ΡƒΡ… Π½Π°Π±ΠΎΡ€ΠΎΠ² Π΄Π°Π½Π½Ρ‹Ρ…, Π²Ρ‹ ΠΏΠΎΠ»ΡƒΡ‡ΠΈΡ‚Π΅ Ρ‚ΠΎΡ‡Π½ΠΎ Ρ‚Π°ΠΊΡƒΡŽ ​​ТС модСль (ΠΏΠΎΠ΄Π³ΠΎΠ½ΠΊΠ° ΠΏΠΎ прямой Π»ΠΈΠ½ΠΈΠΈ). ΠžΡ‡Π΅Π²ΠΈΠ΄Π½ΠΎ, Ρ‡Ρ‚ΠΎ эти Π΄Π°Π½Π½Ρ‹Π΅ Ρ€Π°Π·Π½Ρ‹Π΅, Ρ‚Π°ΠΊ насколько эффСктивны ΠΌΠΎΠ΄Π΅Π»ΠΈ Π½Π° этих Π΄Π²ΡƒΡ…? Как ΠΌΡ‹ Π΄ΠΎΠ»ΠΆΠ½Ρ‹ ΡΡ€Π°Π²Π½ΠΈΠ²Π°Ρ‚ΡŒ эти эффСкты? БущСствуСт способ Π²Ρ‹Ρ‡ΠΈΡΠ»ΠΈΡ‚ΡŒ ΡΡ‚Π΅ΠΏΠ΅Π½ΡŒ соотвСтствия ΠΌΠ΅ΠΆΠ΄Ρƒ прСдсказанным Π·Π½Π°Ρ‡Π΅Π½ΠΈΠ΅ΠΌ ΠΏΠΎΡΠ»Π΅Π΄ΠΎΠ²Π°Ρ‚Π΅Π»ΡŒΠ½ΠΎΡΡ‚ΠΈ yHat ΠΈ истинным Π·Π½Π°Ρ‡Π΅Π½ΠΈΠ΅ΠΌ ΠΏΠΎΡΠ»Π΅Π΄ΠΎΠ²Π°Ρ‚Π΅Π»ΡŒΠ½ΠΎΡΡ‚ΠΈ y, Ρ‚ΠΎ Π΅ΡΡ‚ΡŒ Π²Ρ‹Ρ‡ΠΈΡΠ»ΠΈΡ‚ΡŒ коэффициСнт коррСляции Π΄Π²ΡƒΡ… ΠΏΠΎΡΠ»Π΅Π΄ΠΎΠ²Π°Ρ‚Π΅Π»ΡŒΠ½ΠΎΡΡ‚Π΅ΠΉ.
normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Π Π΅ΡˆΠΈΡ‚Π΅ ΠΌΠ°Ρ‚Ρ€ΠΈΡ†Ρƒ Π²Ρ€Π΅ΠΌΠ΅Π½ΠΈ с ΠΏΠΎΠΌΠΎΡ‰ΡŒΡŽ Π½ΠΎΡ€ΠΌΠ°Π»ΡŒΠ½Ρ‹Ρ… ΡƒΡ€Π°Π²Π½Π΅Π½ΠΈΠΉ X T X » role=»presentation» style=»position: relative;»> X T X НСобратимоС Ρ€Π΅ΡˆΠ΅Π½ΠΈΠ΅

Π§Ρ‚ΠΎ касаСтся Π½Π΅ΠΎΠ±Ρ€Π°Ρ‚ΠΈΠΌΠΎΠΉ ΠΌΠ°Ρ‚Ρ€ΠΈΡ†Ρ‹, ΠΌΡ‹ Ρ‚Π°ΠΊΠΆΠ΅ Π½Π°Π·Ρ‹Π²Π°Π΅ΠΌ Π΅Π΅ особой ΠΈΠ»ΠΈ Π²Ρ‹Ρ€ΠΎΠΆΠ΄Π΅Π½Π½ΠΎΠΉ ΠΌΠ°Ρ‚Ρ€ΠΈΡ†Π΅ΠΉ. НСобратимая ΠΌΠ°Ρ‚Ρ€ΠΈΡ†Π° ΠΎΠ±Ρ‹Ρ‡Π½ΠΎ ΠΈΠΌΠ΅Π΅Ρ‚ ΡΠ»Π΅Π΄ΡƒΡŽΡ‰Π΅Π΅ [3-4.7]:

ΠšΡ€ΠΎΠΌΠ΅ Ρ‚ΠΎΠ³ΠΎ, ΠΌΠ΅Ρ‚ΠΎΠ΄ Π³Ρ€Π°Π΄ΠΈΠ΅Π½Ρ‚Π½ΠΎΠ³ΠΎ спуска Ρ‚Π°ΠΊΠΆΠ΅ ΠΌΠΎΠΆΠ΅Ρ‚ Π±Ρ‹Ρ‚ΡŒ использован для Ρ€Π΅ΡˆΠ΅Π½ΠΈΡ ΠΎΠΏΡ‚ΠΈΠΌΠ°Π»ΡŒΠ½ΠΎΠ³ΠΎ Ρ€Π΅ΡˆΠ΅Π½ΠΈΡ, ΠΊΠΎΠ³Π΄Π° ΠΌΠ°Ρ‚Ρ€ΠΈΡ†Π° Π½Π΅ΠΎΠ±Ρ€Π°Ρ‚ΠΈΠΌΠ° (me: Ρ€Π΅Π°Π»ΡŒΠ½ΠΎΠ΅ Ρ€Π΅ΡˆΠ΅Π½ΠΈΠ΅, ΠΏΠΎΠ»ΡƒΡ‡Π΅Π½Π½ΠΎΠ΅ с ΠΏΠΎΠΌΠΎΡ‰ΡŒΡŽ Π½ΠΎΡ€ΠΌΠ°Π»ΡŒΠ½ΠΎΠ³ΠΎ уравнСния, ΠΎΠΏΡ‚ΠΈΠΌΠ°Π»ΡŒΠ½ΠΎΠ΅ Ρ€Π΅ΡˆΠ΅Π½ΠΈΠ΅, ΠΏΠΎΠ»ΡƒΡ‡Π΅Π½Π½ΠΎΠ΅ с ΠΏΠΎΠΌΠΎΡ‰ΡŒΡŽ Π³Ρ€Π°Π΄ΠΈΠ΅Π½Ρ‚Π½ΠΎΠ³ΠΎ спуска). Π‘Ρ€Π°Π²Π½Π΅Π½ΠΈΠ΅ Π³Ρ€Π°Π΄ΠΈΠ΅Π½Ρ‚Π½ΠΎΠ³ΠΎ спуска ΠΈ Π½ΠΎΡ€ΠΌΠ°Π»ΡŒΠ½ΠΎΠ³ΠΎ уравнСния ΠΏΠΎΠΊΠ°Π·Π°Π½ΠΎ Π² ΡΠ»Π΅Π΄ΡƒΡŽΡ‰Π΅ΠΉ Ρ‚Π°Π±Π»ΠΈΡ†Π΅: [3-4.6]

Π˜ΡΡ‚ΠΎΡ‡Π½ΠΈΠΊ

Normal Equation in Python: The Closed-Form Solution for Linear Regression

Machine Learning from scratch: Part 3

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Mar 23 Β· 5 min read

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

In this article, we will implement the Normal Equation which is the closed-form solution for the Linear Regression algorithm where we can find the optimal value of theta in just one step without using the Gradient Descent algorithm.

We will first recap with Gradient Descent Algorithm, then talk about calculating theta using a formula called Normal Equation and finally, see the Normal Equation in Action and plot predictions for our randomly generated data.

Machine Learning from scratch series β€”

Linear Regression from scratch in Python

Machine Learning from Scratch: Part 1

Locally Weighted Linear Regression in Python

Machine Learning from Scratch: Part 2

Gradient Descent Recap

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Gradient Descent Algorithmβ€”

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

First, we initialize the parameter theta randomly or with all zeros. Then,

Normal Equation

Gradien t Descent is an iterative algorithm meaning that you need to take multiple steps to get to the Global optimum (to find the optimal parameters) but it turns out that for the special case of Linear Regression, there is a way to solve for the optimal values of the parameter theta to just jump in one step to the Global optimum without needing to use an iterative algorithm and this algorithm is called the Normal Equation. It works only for Linear Regression and not any other algorithm.

Normal Equation is the Closed-form solution for the Linear Regression algorithm which means that we can obtain the optimal parameters by just using a formula that includes a few matrix multiplications and inversions.

This is the Normal Equation β€”

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

If you know about the matrix derivatives along with a few properties of matrices, you should be able to derive the Normal Equation for yourself.

You might think what if X is a non-invertible matrix, which usually happens if you have redundant features i.e your features are linearly dependent, probably because you have the same features repeated twice. One thing you can do is go and find out which features are repeated and fix them or you can use the np.pinv function in NumPy which will also give you the right answer.

The Algorithm

Check the shapes of X and y so that the equation matches up.

Normal Equation in Action

Let’s take the following randomly generated data as a motivating example to understand the Normal Equation.

Here, n =1 which means the matrix X has only 1 column and m =500 means X has 500 rows. X is a (500×1) matrix and y is a vector of length 500.

normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ Ρ„ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π‘ΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ ΠΊΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΡƒ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. ΠšΠ°Ρ€Ρ‚ΠΈΠ½ΠΊΠ° ΠΏΡ€ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅. Π€ΠΎΡ‚ΠΎ normal equation машинноС ΠΎΠ±ΡƒΡ‡Π΅Π½ΠΈΠ΅

Find Theta Function

Let’s write the code to calculate theta using the Normal Equation.

Π˜ΡΡ‚ΠΎΡ‡Π½ΠΈΠΊ

Π”ΠΎΠ±Π°Π²ΠΈΡ‚ΡŒ ΠΊΠΎΠΌΠΌΠ΅Π½Ρ‚Π°Ρ€ΠΈΠΉ

Π’Π°Ρˆ адрСс email Π½Π΅ Π±ΡƒΠ΄Π΅Ρ‚ ΠΎΠΏΡƒΠ±Π»ΠΈΠΊΠΎΠ²Π°Π½. ΠžΠ±ΡΠ·Π°Ρ‚Π΅Π»ΡŒΠ½Ρ‹Π΅ поля ΠΏΠΎΠΌΠ΅Ρ‡Π΅Π½Ρ‹ *