Table Of Contents

How to Identify an Outlier in a Dataset

Identifying Outlier Using Box Plot

Identifying Outlier Using Scatter Plot

Identifying Outlier Using Isolation Forest Algorithm

How to Calculate Outliers

Calculating Outlier by Sorting Method

Calculating Outlier by Interquartile Range (IQR) Method

Tips and Tricks to Master Outliers

Common Mistakes and How to Avoid Them in Outlier

Real-Life Applications of Outlier

Solved examples for Outlier

FAQs of Outlier

Last updated on November 22, 2025

Outlier

Q: What is the 1.5 IQR rule for outliers?

The $1.5 × IQR$ rule is used to find outliers in a dataset. A value is flagged as an outlier if it lies more than 1.5 times the IQR below Q1 or above Q3, hence the formulas lower bound = $Q1 - 1.5 × IQR$ and upper bound = $Q3 + 1.5 × IQR$.

Q: How many deviations is an outlier?

A common rule of thumb treats a data point as an outlier when it lies more than 3 standard deviations away from the mean.

Q: What does IQR stand for?

IQR stands for interquartile range, which is the spread of the middle 50% of the data, that is, $IQR = Q3 - Q1$, where Q1 is the middle value of the lower half and Q3 is the middle value of the upper half.

Q: Is the z-score an outlier?

No, a z-score is not itself an outlier. It measures how many standard deviations a value lies from the mean. If the z-score is greater than +3 or less than –3, the value is treated as an outlier.

Q: How to eliminate outliers?

An outlier can be eliminated by identifying it and removing it from the data set, or by replacing it with the mean, median, or mode of the data set computed without the outlier.

Outliers are extreme values and an essential part of a dataset. Outliers provide valuable insights into data and can significantly impact the results of analysis. Let us now learn more about an outlier.

What is an Outlier?

Outliers are data points that stand out because they are much higher or lower than the rest of the data. Because statistical measures such as the mean and standard deviation, as well as regression models, are sensitive to extreme values, outliers can disproportionately affect them, skewing results and leading to misguided conclusions.

For example,
A teacher records the marks of 10 students in a math test. $45, 48, 50, 47, 49, 46, 50, 48, 47, 95$.
Here, all the scores are around 45 – 50, but 95 is much higher than the rest.
So, 95 is an outlier because it is unusually high compared to the other data points.
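To see how much a single extreme value can pull a summary statistic, the short sketch below compares the mean and the median of the marks with and without 95. It uses only Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

marks = [45, 48, 50, 47, 49, 46, 50, 48, 47, 95]
without_95 = [m for m in marks if m != 95]   # same marks, outlier dropped

# The mean shifts noticeably when 95 is included; the median does not.
print("mean with 95:     ", mean(marks))
print("mean without 95:  ", round(mean(without_95), 2))
print("median with 95:   ", median(marks))
print("median without 95:", median(without_95))
```

Here the mean drops from 52.5 to about 47.78 once 95 is removed, while the median stays at 48 in both cases.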

How to Identify an Outlier in a Dataset

Finding outliers is an integral part of data analysis because unusual values can heavily influence the results. Outliers can be identified using two main approaches: visualization techniques and statistical methods.

Using Visualization Methods
Visualization involves presenting data in graphical form, making it easier to spot patterns and detect unusual values. Two commonly used visual tools are:

1. Box Plot
A box plot displays the minimum, first quartile (Q₁), median, third quartile (Q₃), and maximum.
Any point that lies outside the “whiskers” (below $Q_1 - 1.5 \times \mathrm{IQR}$ or above $Q_3 + 1.5 \times \mathrm{IQR}$) is considered an outlier.
Example:
Data: 12, 14, 15, 16, 17, 18, 40
In the box plot, 40 appears far outside the upper whisker, making it an outlier.

2. Scatter Plot
A scatter plot displays individual data points on a graph. Outliers appear as points that lie far away from the general cluster.
Example:
If you plot students’ study hours vs. exam scores and most points form a cluster, but one point (for example, 10 hours of study and only 5 marks) lies far away, that point is an outlier.

Identifying Outlier Using Box Plot

A box plot is a statistical chart that provides a visual summary of how data is distributed. Outliers in a box plot can be identified using these steps.

Step 1: Sort the data in ascending order and determine the median.

Step 2: Calculate the Interquartile Range (IQR), which represents the central 50% of the dataset.

Step 3: Determine the lower and upper bounds (also called fences) using the IQR.

Step 4: Any data point that lies below the lower bound or above the upper bound is considered an outlier.
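The four steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming the quartiles are taken as the medians of the lower and upper halves of the sorted data (the convention this article uses); the function names `quartiles` and `iqr_fences` are made up for the example, and plotting software may use a different quartile convention.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention.

    Needs at least two values.
    """
    data = sorted(values)          # Step 1: sort in ascending order
    half = len(data) // 2
    lower = data[:half]            # values before the median position
    upper = data[-half:]           # values after the median position
    return median(lower), median(data), median(upper)

def iqr_fences(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1                  # Step 2: interquartile range
    return q1 - k * iqr, q3 + k * iqr   # Step 3: fences

data = [12, 14, 15, 16, 17, 18, 40]
low, high = iqr_fences(data)
outliers = [x for x in data if x < low or x > high]   # Step 4
print(low, high, outliers)
```

For the data from the earlier box plot example, Q1 is 14 and Q3 is 18, so the fences are 8 and 24 and only 40 falls outside them.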

Identifying Outlier Using Scatter Plot

A scatter plot is used to visualize the relationship between two continuous variables, with each point represented as a dot. In this plot, any points far from the central cluster of data are considered outliers.

Using Statistical Methods
To detect the outliers numerically, statistical techniques are used. Some commonly used methods include Z-score, DBSCAN, and the Isolation Forest algorithm.

Identifying Outliers Using the Z-Score
The Z-score method measures how many standard deviations a data point is from the mean. It is calculated using the formula:

$Z = \frac{X - \mu}{\sigma}$

Where:

X = the data point

μ = mean of the dataset

σ = standard deviation of the dataset

A data point is considered an outlier if its Z-score is greater than +3 or less than –3.
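As a rough sketch of this rule, the function below computes a Z-score for every value and keeps those beyond ±3. It assumes the population standard deviation (`statistics.pstdev`); using the sample standard deviation instead is an equally common choice. The name `z_score_outliers` and the second, longer dataset are hypothetical and only serve the illustration.

```python
from statistics import mean, pstdev

def z_score_outliers(values, threshold=3.0):
    """Return the values whose |Z-score| exceeds `threshold`."""
    mu = mean(values)
    sigma = pstdev(values)         # assumption: population standard deviation
    if sigma == 0:                 # all values identical, nothing stands out
        return []
    return [x for x in values if abs((x - mu) / sigma) > threshold]

# The ten test marks from the introduction.
marks = [45, 48, 50, 47, 49, 46, 50, 48, 47, 95]
print(z_score_outliers(marks))

# Hypothetical larger class: the nine typical marks twice, plus one 95.
more_marks = [45, 48, 50, 47, 49, 46, 50, 48, 47] * 2 + [95]
print(z_score_outliers(more_marks))
```

With only ten marks, 95 has a Z-score of about 2.98, so the strict ±3 rule does not flag it, because the outlier itself inflates the standard deviation. In the larger hypothetical dataset its Z-score is about 4.2 and it is flagged. This is one reason the IQR method is often preferred for small datasets.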

Identifying Outlier Using Isolation Forest Algorithm

The Isolation Forest algorithm is an anomaly detection technique that uses decision trees to separate data points. It works by randomly partitioning the dataset. Points that are isolated in fewer steps are considered outliers because unusual values stand out and are easier to separate from the rest.

For example,
Consider the dataset representing daily sales (in units):
50, 52, 55, 53, 58, 60, 54, 300

Most values are close to each other, between 50 and 60. But 300 is hugely different.

When the Isolation Forest algorithm creates random partitions:

The values around 50–60 need many splits to isolate because they are very similar.
The value 300 gets isolated very quickly because it stands out from the rest.

Since 300 requires fewer splits, the algorithm marks it as an outlier.
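The idea of "fewer splits" can be illustrated with a toy one-dimensional simulation. This is not the full Isolation Forest algorithm (which builds many trees on random subsamples and converts path lengths into an anomaly score); it is a simplified sketch that only counts how many random splits it takes to isolate one value. The function names, the number of trials, and the seed are all assumptions made for the example.

```python
import random

def isolation_depth(values, target, rng):
    """Count random splits until `target` sits alone in its partition."""
    current = list(values)
    depth = 0
    # Stop when the target is alone or only identical values remain.
    while len(current) > 1 and min(current) != max(current):
        split = rng.uniform(min(current), max(current))
        # Keep only the side of the split that contains the target.
        if target < split:
            current = [v for v in current if v < split]
        else:
            current = [v for v in current if v >= split]
        depth += 1
    return depth

def average_depth(values, target, trials=200, seed=0):
    """Average isolation depth over many random partitionings."""
    rng = random.Random(seed)
    total = sum(isolation_depth(values, target, rng) for _ in range(trials))
    return total / trials

sales = [50, 52, 55, 53, 58, 60, 54, 300]
for value in (54, 300):
    print(value, average_depth(sales, value))
```

On this data, 300 is usually isolated by the very first split, while 54 always needs at least two splits (one on each side of it), so its average depth is higher.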

How to Calculate Outliers

Outliers can be identified using different methods depending on the complexity of the data, the time available, and the level of accuracy needed. Here are four commonly used methods, along with the steps:

Sorting Method
Step 1: Arrange the data in ascending order.

Step 2: Look for values that appear unusually high or low compared to the rest.

Step 3: Verify by checking whether these extreme points are significantly different from the dataset's general pattern.

Step 4: Mark these extreme values as outliers.
For example,
Data: 10, 12, 13, 14, 15, 80
80 is clearly separated from the rest, so it is the outlier.

Using Visualization
Step 1: Plot the data using a scatter plot or box plot.

Step 2: Observe the data distribution.

Step 3: Identify points that are distant from the central cluster (scatter plot) or outside the whiskers (box plot).

Step 4: Mark these distant points as outliers.
For example,
In a scatter plot, if one dot lies far away from the cluster, it is an outlier.

Statistical Outlier Detection (Z-Score Method)
Step 1: Calculate the mean (μ) and standard deviation (σ) of the dataset.

Step 2: Use the formula,

$Z = \frac{X - \mu}{\sigma}$

Step 3: Compute the Z-score for each data point.

Step 4: If a Z-score is greater than +3 or less than -3, that data point is an outlier.

For example,
If a score has Z = 3.2 → it is an outlier.

Interquartile Range (IQR) Method
Step 1: Arrange the data in ascending order.

Step 2: Find Q₁ (first quartile) and Q₃ (third quartile).

Step 3: Calculate the $ IQR = Q₃ – Q₁$.

Step 4: Compute the outlier fences:

Lower Fence = $Q₁ – 1.5 × IQR$
Upper Fence = $Q₃ + 1.5 × IQR$
Any data point below the lower fence or above the upper fence is classified as an outlier.
For example,
If the upper fence is 40, and one value is 55 → 55 is an outlier.

Calculating Outlier by Sorting Method

In this method, the data is arranged in ascending order and then scanned visually for extreme values.

Step 1: Arrange the data in ascending order, that is, from smallest to largest.

Step 2: Any value that is much higher or much lower than the other values is considered an outlier.
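One way to make the visual scan a little more concrete is to sort the data and look at the gaps between neighbouring values. The helper below is only an illustrative aid, not a formal test; the name `largest_gap` is made up for this sketch, and the data is the example from the Sorting Method section.

```python
def largest_gap(values):
    """Return (gap, value_before, value_after) for the widest gap
    between neighbouring values in the sorted data (needs 2+ values)."""
    data = sorted(values)
    gaps = [(b - a, a, b) for a, b in zip(data, data[1:])]
    return max(gaps)

gap, before, after = largest_gap([10, 12, 13, 14, 15, 80])
print(f"Largest gap is {gap}, between {before} and {after}")
```

Here the neighbouring values differ by 1 or 2 until the jump of 65 from 15 to 80, which is what makes 80 stand out when scanning the sorted list.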

Calculating Outlier by Statistical Outlier Detection (Z-score Method)

The z-score is calculated by using the formula, $z = \frac{X - \mu}{\sigma}$,

Here, X is the data point

μ is the mean of the data set

σ (sigma) is the standard deviation.

If the z-score is greater than +3 or less than –3, then the data point is an outlier. That is, an outlier lies more than 3 standard deviations away from the mean.

Calculating Outlier by Interquartile Range (IQR) Method

The interquartile range is the spread of the middle 50% of the data set, that is, the difference between the third quartile and the first quartile. In this method, we find the outlier by following these steps:

Step 1: Arrange the data in ascending order, that is, from low to high.

Step 2: Find the values of Q1 and Q3. Q1 is the middle value of the lower half and Q3 is the middle value of the upper half.

Step 3: Calculate the value of IQR. So, $IQR = Q3 - Q1$.

Step 4: Find the lower bound and the upper bound. Here the lower bound = $Q1 - 1.5 × IQR$ and the upper bound = $Q3 + 1.5 × IQR$. Any value below the lower bound or above the upper bound is an outlier.
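Software does not always compute Q1 and Q3 as the middle values of the two halves, so the bounds reported by a tool can differ slightly from a hand calculation. As a hedged illustration, Python's standard `statistics.quantiles` function offers two interpolation methods; the sketch below applies both to a small set of heights (60, 61, 62, 63, 64, 65, 90). The helper name `bounds` is made up for the example.

```python
from statistics import quantiles

heights = [60, 61, 62, 63, 64, 65, 90]

def bounds(values, method):
    """Return (Q1, Q3, lower bound, upper bound) for a quantile method."""
    q1, _, q3 = quantiles(values, n=4, method=method)
    iqr = q3 - q1
    return q1, q3, q1 - 1.5 * iqr, q3 + 1.5 * iqr

for method in ("exclusive", "inclusive"):
    print(method, bounds(heights, method))
```

For this dataset the "exclusive" method gives Q1 = 61 and Q3 = 65 (the same as taking the middle of each half), with bounds 55 and 71, while the "inclusive" method gives Q1 = 61.5 and Q3 = 64.5, with bounds 57 and 69. Both flag 90 as an outlier, but borderline values could be classified differently, so it helps to state which convention is being used.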

Tips and Tricks to Master Outliers

Some tips and tricks for mastering outliers are listed below.

Outliers can be used to spot unusual credit card transactions by comparing them against a customer’s typical spending behavior.
In healthcare, outliers help identify abnormal vital signs in a patient’s records, which may indicate conditions that require immediate attention.
Outliers are useful in detecting unexpected spikes in website traffic from certain locations or unusual user behavior, which could signal a bot attack or viral activity.
In cybersecurity, outliers help flag suspicious login attempts, irregular access times, or abnormal data transfers that may point to hacking or malware.
In manufacturing, outliers help identify defective products or irregular machine performance, ensuring consistent production quality.
Parents can encourage children to notice patterns and identify values that “don’t belong,” helping them build intuitive understanding.
Teachers can use box plots and scatter plots during lessons to visually demonstrate how outliers appear in a dataset.

Common Mistakes and How to Avoid Them in Outlier

Now let’s look at a few common mistakes that students tend to repeat when working with outliers. By learning to avoid them, students can master the topic.

Mistake 1

Confusing extreme values with outliers

Students tend to think that any extreme value automatically qualifies as an outlier. In skewed and non-normal data, the extreme values are the natural tail of the distribution. So try to understand the difference between an extreme value and a statistically significant outlier.

Mistake 2

Not using the IQR method

The IQR method finds outliers using the middle 50% of the data. Skipping it and judging outliers by eye can cause errors. So students should use the IQR method to find the outlier, that is, find Q1 and Q3, then calculate the IQR and the bounds.

Mistake 3

Calculation errors in the mean and standard deviation approach

Calculation errors in the mean or the standard deviation are common among students, and they lead to incorrect z-scores and wrongly identified outliers. So, double-check each calculation and verify that the formula has been applied correctly.

Mistake 4

Thinking that an outlier must be a single number

Thinking that an outlier can only be a single number is wrong; there can be multiple outliers or even clusters of outliers in the data set. To avoid this mistake, analyze the data both numerically and visually, and try to understand the pattern.

Mistake 5

Not considering the order in the data set

Ordering the data set is important because unordered data can lead to misinterpretation of the trends or of the nature of the outliers. So when working with sequential data, keep the data in order and view it using a time-series plot or line graph.

Real-Life Applications of Outlier

Outlier detection is used in different fields such as finance, environmental monitoring, cybersecurity, and so on. Let’s learn a few real-life applications of outliers.

In finance, outliers are used for fraud detection: unusual credit card transactions can be identified by analyzing the spending patterns of the customer.
In health monitoring, outliers are used to spot abnormal vital signs in a patient's medical records. This is helpful because they could indicate health issues that need immediate attention.
On websites, outliers are used to identify unusual spikes in traffic from a particular location or unusual user behavior.

Cybersecurity Threat Detection: Outliers help in detecting unusual login attempts, irregular access times, or abnormal data transfers in networks, which could indicate hacking or malware attacks.

Manufacturing and Quality Control: In production, outliers can highlight defective items or abnormal machine behavior, helping industries quickly address errors and maintain product quality.

Solved examples for Outlier

Problem 1

A teacher records the ages of students in a class: 12, 13, 14, 15, 12, 13, 14, 12, 13, 27. Find the outlier in the dataset.

The outlier is 27.

Explanation

Arranging the data: 12, 12, 12, 13, 13, 13, 14, 14, 15, 27

The data set has 10 numbers

Here, Q1 is 12

Q3 is 14

So,$ IQR = Q3 - Q1 = 14 - 12 = 2$

Lower bound =$ Q1 - 1.5 × IQR = 12 - 1.5 × 2 = 9$

Upper bound = $Q3 + 1.5 × IQR = 14 + 1.5 × 2 = 17$

Any value below 9 or above 17 is an outlier. Here the outlier is 27, since $27 > 17$.

Problem 2

A runner records his daily running distance (in miles) over 7 days: 3, 4, 3.5, 3.8, 4.2, 3.9, 10. Identify the outlier.

The outlier is 10.

Explanation

Sorting the data: 3, 3.5, 3.8, 3.9, 4, 4.2, 10

Here the median is 4th value: 3.9

The lower half is 3, 3.5, 3.8. So, $Q1 = 3.5$

The upper half is 4, 4.2, 10. So, $Q3 = 4.2$

So, $IQR = Q3 - Q1 = 4.2 - 3.5 = 0.7$

Finding the lower bound,

Lower bound = $Q1 - 1.5 × IQR $

= $3.5 - 1.5 × 0.7 = 2.45$

Finding the upper bound,

Upper bound = $Q3 + 1.5 × IQR$

= $4.2 + 1.5 × 0.7 = 5.25 $

Any number below 2.45 or above 5.25 is an outlier

Here the outlier is 10.

Problem 3

A bakery records daily cupcake sales: 25, 30, 28, 35, 27, 500, 32. Find the outlier.

The outlier is 500.

Explanation

Sorting the data: 25, 27, 28, 30, 32, 35, 500

The 4th value is the median, so the median is 30

The lower half is 25, 27, 28. So, Q1 is 27

The upper half is 32, 35, 500. So, Q3 is 35

$IQR = Q3 - Q1 $

So, $IQR = 35 - 27 = 8$

Now let’s find the lower bound,

Lower bound = $Q1 -1.5 × IQR = 27 - 1.5 × 8 = 15$

Upper bound = $Q3 +1.5 × IQR = 35 + 1.5 × 8 = 47$

Here, any value below 15 or above 47 is an outlier, so the outlier is 500.

Problem 4

A group of friends records their heights in inches: 60, 61, 62, 63, 64, 65, 90. Identify the outlier.

The outlier here is 90.

Explanation

Sorting the data in ascending order: 60, 61, 62, 63, 64, 65, 90

Here the median is the 4th value, which is 63

Therefore, the lower half is 60, 61, 62. So, Q1 is 61

The upper half is 64, 65, 90. So, Q3 is 65

$IQR = Q3 - Q1 = 65 - 61 = 4$

Lower bound = $Q1 - 1.5 × IQR$

$= 61 - 1.5 × 4 = 61 - 6 = 55$

Upper bound = $Q3 + 1.5 × IQR$

= $65 + 1.5 × 4 = 65 + 6 = 71$

Any value below 55 or above 71 is an outlier. As $90 > 71$, it is the outlier.

Problem 5

A company records the number of employees working overtime each week: 5, 7, 6, 8, 6, 50, 7. Identify the outlier.

The outlier here is 50.

Explanation

Sorting the data in ascending order: 5, 6, 6, 7, 7, 8, 50

Here the median is the 4th value, which is 7

Therefore, the lower half is 5, 6, 6. So Q1 is 6

The upper half is 7, 8, 50. So, Q3 is 8

$IQR = Q3 - Q1 = 8 - 6 = 2$

Lower bound = $Q1 - 1.5 × IQR$

$= 6 - 1.5 × 2 = 3$

Upper bound = $Q3 + 1.5 × IQR$

= $8 + 1.5 × 2 = 11$

Any value above 11 or below 3 is an outlier. As $50 > 11$, it is the outlier.

FAQs of Outlier

1. What is the 1.5 IQR rule for outliers?

2. How many deviations is an outlier?

3. What does IQR stand for?

4. Is the z-score an outlier?

5. How to eliminate outliers?

Jaipreet Kour Wazir

About the Author

Jaipreet Kour Wazir is a data wizard with over 5 years of expertise in simplifying complex data concepts. From crunching numbers to crafting insightful visualizations, she turns raw data into compelling stories. Her journey from analytics to education ref