R Squared Value Explained – For Regression Models

The R Squared value is a useful metric for interpreting the results of a regression model. However, it is often used without a clear understanding of its underlying principles.

The ordinary least squares method finds the best-fitting line for a simple linear regression model. For any candidate line, you take the sum of all the squared differences between the actual values and the predicted values on that line. The line with the smallest sum becomes the regression model, because it is the best-fitting line. The minimized sum itself is referred to as the sum of squares of residuals (SSres).
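Using the standard notation, with y_i as the actual value of the i-th observation and ŷ_i as the corresponding prediction on the line, this sum can be written as:

\[ SS_{res} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \]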

Figure: the sum of squares of residuals is the sum of all the squared differences between the actual values and the predicted values on the regression line.

Now, consider the average line. In the salary vs. experience example, the average line represents the average salary. If you take the sum of the squared differences between the actual observation points and the corresponding points on the average line, then you have what is called the total sum of squares (SStot). Once you have both sums, you can find the R Squared value.
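With ȳ denoting the average of the actual values, the total sum of squares is:

\[ SS_{tot} = \sum_{i=1}^{n} (y_i - \bar{y})^2 \]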

R Squared Value
Understanding how the R Squared value is calculated gives insight into what its value means.
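The calculation combines the two sums defined above:

\[ R^2 = 1 - \frac{SS_{res}}{SS_{tot}} \]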

An R Squared Value Close to 1 Is Good

Since the ordinary least squares method minimizes SSres, a smaller SSres value results in an R Squared value closer to 1. An R Squared value closer to 1 indicates a better regression line, and it can also indicate that your regression model will make better predictions on test data.

In words: the R Squared value is one minus the sum of squares of residuals, divided by the total sum of squares.
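As a minimal sketch of the whole calculation in Python (assuming NumPy is available, and using made-up salary vs. experience numbers purely for illustration):

```python
import numpy as np

# Hypothetical salary (y) vs. experience (x) data, for illustration only.
experience = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
salary = np.array([45000.0, 50000.0, 60000.0, 65000.0, 75000.0])

# Fit the simple linear regression line by ordinary least squares.
slope, intercept = np.polyfit(experience, salary, deg=1)
predicted = slope * experience + intercept

# Sum of squares of residuals: actual values vs. points on the regression line.
ss_res = np.sum((salary - predicted) ** 2)

# Total sum of squares: actual values vs. points on the average line.
ss_tot = np.sum((salary - salary.mean()) ** 2)

r_squared = 1 - ss_res / ss_tot
print(f"R Squared: {r_squared:.4f}")
```

Because this made-up data is nearly linear, the printed R Squared comes out close to 1 (about 0.99).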
