Last updated on 13 September 2025
The least square method is used in statistical analysis to find the best fit line for a data set. It has wide applications in fields like regression analysis, data analytics, and machine learning. In this topic, we will discuss the least square method and its applications.
Least square method refers to a statistical technique used to understand relationships between two variables, make predictions, and summarize data. The technique is implemented by finding the best-fitting line through a set of data points. Here, the best fit line is the line drawn across the scatter plot to show the relationship between the variables.
The method can be visualized on a scatter plot. Imagine a graph with data points in x and y. The process begins by analyzing each data set to find the residual value, which is the difference between the actual y value and the predicted y value. After finding the residual, we need to square them and add them all up. We try to make the sum as small as possible to find the best fit line. It is commonly used in regression analysis to find the relationship between the dependent variable and independent variables.
The least square method formula to find the slope and the intercept is given below:
Slope (m) = \(n\sum xy \space - \space (\sum x)(\sum y) \over n\sum x^2 \space - \space (\sum x)^2\)
Intercept (c) = ȳ - mx̄ where \(x̄ = {\sum x \over n}\) and \(ȳ = {\sum y \over n}\)
\(c = {\sum y \space- \space m (\sum x) \over n}\)
Here,
We follow a certain method to calculate the least squares. Here, we shall analyze the method step-by-step:
Step 1: Consider the independent variable values as xi and the dependent variable as yi
Step 2: Finding the average values of xi and yi as x̄ and ȳ.
Step 3: Let’s say the line of best fit is y = mx + c. Here, c is the intercept of the line on the y-axis and m is the slope
Step 4: So, the slope \(m ={ \sum {[(x_i - \bar x)(y_i - \bar y)]} \over {\sum (x_i - x̄)^2}}\)
Step 5: Then the intercept(c) = ȳ - mx̄
So, the best fit line is y = mx + c
The least square method works by minimizing the differences between the actual data and the predicted value on the line. Now let’s see how the least square method graph looks like:
The data points are marked in red points. The x-axis represents the independent variable, and the y-axis represents the dependent variable. The line of best fit should minimize the vertical distances (residuals) from all data points to the line. This shows the method can be used to obtain the equation of the best fit line.
The least square method is considered as the best way to find the line of best fit, but also it has some disadvantages. Here are some of the pros and cons of the least square method.
Pros |
Cons |
It is easy to understand and use |
Although easy to use, it is only applicable for two variables |
As it is only applicable for two variables, it highlights the best relationship between them |
The method is not effective when there are outliers, as they may distort the final result. |
It helps predict stock market trend, and can make other economic-related predictions |
Since the method assumes a linear relationship, it may not be useful for all datasets |
The least square method is used in various fields. It is mostly used to predict stock prices and analyze scientific data. Here, we'll be looking at some real-life applications of the least square method:
In this section, let’s discuss a few common mistakes students tend to make. Here are a few common mistakes and the ways to avoid them.
A teacher records the number of hours students study and their exam scores (tables shown below). Using the least square method, the regression equation obtained is: y = 5x + 45. What is the predicted exam score for a student who studies for 6 hours?
The predicted exam score for a student who studies for 6 hours is 75.
Table:
Study Hours (x) |
Exam Score (y) |
1 |
50 |
2 |
55 |
3 |
65 |
4 |
70 |
5 |
75 |
Given: The regression equation is: y = 5x + 45
Here, y is the exam score
x is the hours studied, so x = 6
y = (5 × 6) + 45 = 75
Therefore, the student studying for 6 hours can score 75 on the exam.
A store tracks how advertising spend affects sales revenue. The least square regression equation derived is: y = 5x + 5. What is the predicted sales revenue if the store spends $6000 on ads? (Use the given table)
The predicted sales revenue is $35000.
Table:
Ads Spend ($1000s) |
Sales Revenue ($1000s) |
1 |
10 |
2 |
15 |
3 |
20 |
4 |
25 |
5 |
30 |
Given: The regression equation is: y = 5x + 5
Here, y is the sale revenue
x is the ads spend,
For $6000 ad spend, so x = 6
y = (5 × 6) + 5 = 35
As y is in $1000s, so the predicted sales revenue is 35 × 1000 = $35,000
A shop observes that ice cream sales depend on temperature. The regression equation obtained is: y = 20x - 200. Follow the table given below and find the predicted ice cream sales at 27°C?
The predicted ice cream sales at 27℃ are $340.
Table:
Temperature (℃) |
Ice Cream Sales ($) |
20 |
200 |
22 |
250 |
25 |
300 |
28 |
350 |
30 |
400 |
Given: The regression equation is: y = 20x - 200
Here, y is the ice-cream sale
x is the temperature, so x = 27
y = (20 × 27) - 200 = 340
The predicted ice cream sales at 27℃ are $340.
A company examines the relationship between experience and salary. The regression equation derived is: y = 5x + 25. What is the predicted salary for an employee with 8 years of experience?
The predicted salary for 8 years of experience is $65000.
Table:
Experience (years) |
Salary ($1000s) |
1 |
30 |
3 |
40 |
5 |
50 |
7 |
60 |
10 |
75 |
Given: The regression equation is: y = 5x + 25
Here, y is the salary
x is the experience, so x = 8
y = (5 × 8) + 25 = 65
The predicted salary is 65 × 1000 = $65000
A fitness coach studies how exercise affects weight loss. The regression equation obtained is: y = 1x - 1. What is the predicted weight loss after 12 hours of exercise?
The predicted weight loss is 11 kg.
Table:
Exercise (hours) |
Weight loss (kg) |
2 |
1 |
4 |
3 |
6 |
5 |
8 |
7 |
10 |
9 |
Given: The regression equation is: y = 1x - 1
Here, y is the weight loss
x is the hours of exercise, so x = 12
y = (1 × 12) - 1 = 11
The predicted weight loss is 11 kg.
Jaipreet Kour Wazir is a data wizard with over 5 years of expertise in simplifying complex data concepts. From crunching numbers to crafting insightful visualizations, she turns raw data into compelling stories. Her journey from analytics to education ref
: She compares datasets to puzzle games—the more you play with them, the clearer the picture becomes!