What is an Outlier in Math?
Discover methods for identifying the numbers that stand out
Published:
Jan 2025
Key takeaways
- An outlier is a piece of data that differs sharply from the rest of the data in a set.
- There are three types of outliers: global, contextual, and collective. Each represents a variety of outliers you may encounter in your data, whether a single number or a group of numbers stands out.
- Outliers can occur for several reasons. They may happen naturally during data gathering or an experiment, but they are usually caused by an error.
Definition of an Outlier
Anytime you’re working with data, you might end up with an outlier.
An outlier is a piece of data with an extreme value that differs sharply from the rest of the data in a set. Think of a grocery list of apples, oranges, carrots, and pears: carrots would be the outlier because they are the only vegetable among fruits.
In statistics, you will often encounter outliers. You must be able to identify an outlier so it doesn’t skew your results, such as the average.
Types of Outliers
There are three types of outliers: global, contextual, and collective. Each represents a variety of outliers you may encounter in your data, whether a single number stands out or a group.
Global Outliers
You’re most likely to find global outliers. A global outlier is a single data point that stands far apart from the rest of the data set. Within global outliers, you’ll have two types: univariate and multivariate.
Univariate Outliers
A univariate outlier is an extreme value in a data set that tracks only one variable.
Example: If you had a list of people’s ages that included 75, 77, 76, 70, and 8, the 8 would be your univariate outlier.
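To see how a single outlier can skew a summary of the data, here is a minimal Python sketch that compares the mean (average) and median (middle number) of the ages in this example, with and without the 8. The variable names are illustrative and are not part of the original example.

```python
from statistics import mean, median

# Ages from the example above; 8 is the univariate outlier.
ages = [75, 77, 76, 70, 8]

# The same list with the outlier left out, for comparison.
ages_without_outlier = [age for age in ages if age != 8]

print(mean(ages))                  # 61.2 (pulled down by the 8)
print(median(ages))                # 75 (barely affected by the 8)
print(mean(ages_without_outlier))  # 74.5
```

The mean drops by more than 13 years because of one value, while the median stays close to the typical age in the list.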
Multivariate Outliers
Unlike univariate outliers, multivariate outliers are pieces of data representing more than one variable. Instead of a chart with ages, you could have ages and the number of children.
Example: With two variables, you might have:
Ages: 66, 67, 68, 70, 71, 45
Number of Children: 1, 2, 2, 1, 2, 6
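Assuming each age lines up with the number of children in the same position, the person who is 45 with 6 children would be the outlier, because both values differ from the rest of the group.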
Contextual (Conditional) Outliers
A contextual outlier is a data point that is very different from the others in general but makes sense in its specific context.
Example: If a store recorded its sales and had a huge increase on Black Friday, that number would be a contextual outlier.
In this situation, the large increase in sales is abnormal compared to average sales but could be considered normal for a busy shopping season.
Collective Outliers
When several pieces of data differ from the overall data, you have a group of collective outliers. These data points make sense together but don’t necessarily make sense with all the other points.
Example: If a video creator has a video go viral, the viewer data from the day it goes viral and several days after might be significantly higher than the rest of their views. However, the high view counts for those days might not look like outliers if you compare those days only with each other.
Causes of Outliers
Outliers can occur for several reasons. They may have happened naturally during the information gathering or experiment, but they are usually caused by an error.
Data Entry Error
You may enter the wrong information when entering the data points into a document. Since the experiment didn’t cause it, this is known as a data entry error.
For example, you may accidentally type “43” instead of “34” for the height of a plant you’re measuring daily.
Experimental Errors
Outliers are experimental errors if they happen because something in the experiment goes wrong.
An example would be if a measurement tool was incorrectly used or the environment surrounding the experiment changed. So, if you record the outside temperature every day for two weeks at 3:00 PM but check the temperature at 9:00 PM one day, you may have an experimental error.
Natural Variability
Sometimes, an outlier isn’t an error at all and is just a naturally occurring situation. Outliers can represent these natural shifts, especially in variables that change quickly, like weather patterns.
For instance, a series of data points on daily winter temperatures in Fahrenheit leading up to a cold front could read 32, 34, 36, 29, 30, -2, -6, and 30. The natural variability outliers are -2 and -6.
Detecting Outliers
There are various ways to detect outliers. Some rely on charts and others on calculations, and each suits different types of data.
Visual Methods
These methods show your data visually so you can easily see the outliers.
- Box Plot: A box plot is a graph with a box showing where the middle half of the numerical data lies. It has lines (or whiskers) extending toward the highest and lowest numbers, and many box plots draw outliers as separate points beyond the whiskers.
- Histogram: A histogram is a bar chart that shows how often numeric values fall within each range. Outliers usually appear as short bars sitting far away from the rest of the bars.
- Scatter Plot: A scatter plot is a chart with each data point represented by a dot. These dots appear scattered across the chart, often in clumps. Outliers are usually by themselves, away from the groups of dots.
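As a rough illustration of the visual approach, the sketch below prints a simple text histogram using only Python's standard library. The plant heights are made-up values for illustration (with one reading of 43 standing in for an outlier), the function name `text_histogram` is hypothetical, and the approach assumes whole-number data.

```python
from collections import Counter

def text_histogram(values):
    """Return one text row per whole number from min to max, one '*' per data point."""
    counts = Counter(values)
    rows = []
    for value in range(min(values), max(values) + 1):
        # Counter returns 0 for values that never occur, so gaps show as empty rows.
        rows.append(f"{value:>3} | {'*' * counts[value]}")
    return rows

# Hypothetical daily plant heights in centimeters; 43 is the suspicious reading.
heights_cm = [34, 35, 33, 36, 34, 35, 37, 33, 36, 35,
              34, 35, 36, 34, 35, 33, 36, 35, 34, 43]

for row in text_histogram(heights_cm):
    print(row)
```

In the printed chart, the rows for 33 through 37 form one clump, the rows for 38 through 42 are empty, and the single star at 43 sits by itself. That gap is what you look for when spotting an outlier visually.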
Statistical Methods
Finding outliers with statistics is especially helpful if you have a lot of data or want to study a data pattern.
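- Dixon’s Q Test: Best for small numerical data sets. You use the Q test to check whether a single suspect value is an outlier.
- Grubbs’ Test: Used for univariate data sets that roughly follow a normal distribution. It finds (and removes) one outlier at a time until there are no more outliers.
- Interquartile Range (IQR): A way of finding outliers by dividing the sorted data into quartiles, or quarters. You subtract the first quartile (the median of the lower half) from the third quartile (the median of the upper half) to get the IQR. A common rule flags any value more than 1.5 times the IQR below the first quartile or above the third quartile.
- Z-Score: Shows how many standard deviations a data point is from the mean, or average. Values with a z-score far from zero (a common cutoff is beyond 3 or -3) are treated as outliers.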
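The IQR and z-score methods can be sketched in a few lines of Python with the standard `statistics` module. This is a minimal sketch under some assumptions: it uses the common conventions of 1.5 times the IQR and a z-score cutoff of 3, it uses the sample standard deviation, and the plant heights are made-up values for illustration. The function names are hypothetical.

```python
from statistics import mean, quantiles, stdev

def iqr_outliers(values, k=1.5):
    """Return values below Q1 - k*IQR or above Q3 + k*IQR."""
    # quantiles(..., n=4) returns the three quartile cut points (Q1, median, Q3).
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

def z_scores(values):
    """Return how many sample standard deviations each value is from the mean."""
    mu = mean(values)
    sigma = stdev(values)  # needs at least two values that are not all equal
    return [(v - mu) / sigma for v in values]

def z_score_outliers(values, threshold=3.0):
    """Return values whose z-score is further than `threshold` from zero."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]

# Hypothetical daily plant heights in centimeters; 43 is the suspicious reading.
heights_cm = [34, 35, 33, 36, 34, 35, 37, 33, 36, 35,
              34, 35, 36, 34, 35, 33, 36, 35, 34, 43]

print(iqr_outliers(heights_cm))      # [43]
print(z_score_outliers(heights_cm))  # [43]
```

Here the first and third quartiles are 34 and 36, so the IQR is 2 and anything below 31 or above 39 is flagged. Keep in mind that in very small data sets a z-score can never get as large as 3, so the z-score cutoff works best when you have a lot of data.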
Handling Outliers
When outliers affect your data sample, you may need to deal with them.
One simple way to do that is to remove them if you have determined they were errors in your data. Another way is to replace the errors with the mean (average) or median (middle number) of your other data.
You could also calculate a trimmed mean. In that case, you remove the highest and lowest values and take the average of the remaining numbers.
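A trimmed mean could be computed along these lines in Python. This is a minimal sketch that assumes you drop the same number of values from each end. The function name `trimmed_mean` and the `trim_count` parameter are hypothetical, and the ages come from the univariate example earlier in this article.

```python
from statistics import mean

def trimmed_mean(values, trim_count=1):
    """Average the values left after dropping the trim_count lowest and highest."""
    if len(values) <= 2 * trim_count:
        raise ValueError("not enough values left after trimming")
    ordered = sorted(values)
    kept = ordered[trim_count:len(ordered) - trim_count]
    return mean(kept)

ages = [75, 77, 76, 70, 8]

print(mean(ages))          # 61.2, pulled down by the 8
print(trimmed_mean(ages))  # average of 70, 75, and 76
```

Dropping one value from each end removes both the 8 and the 77, so the trimmed mean of about 73.7 reflects the typical ages much better than the plain mean.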
Real-World Examples of Outliers
Outliers happen in the real world all the time. Sometimes, they even end up in places like the Guinness Book of World Records. Here are some examples of those real-life outliers:
- The tallest living man in the world is Sultan Kösen at 8’ 2 ⅘”, and the average height of a man on Earth is 5’ 7 ½”. This is a univariate outlier.
- In New York City, the highest annual snowfall was 75 1/3” from 1995 to 1996. However, the city’s average annual snowfall is 29 1/5”. This is a contextual outlier since a blizzard caused this record-breaking snowfall.
Conclusion
Anytime you have a dataset, it’s possible to have an outlier. Outliers can happen because of mistakes or because the data naturally shifts. No matter why an outlier exists in your work, it’s vital to know how to find it and handle it so your results are more accurate.
FAQs about Outliers
There are several ways to find outliers. You can plot the data in a chart to see which values are out of place. You could also use a formula like z-score or the interquartile range to find them.
An outlier is a piece of data much higher or lower than the average of the other data points.
An outlier can be the minimum or maximum of a data set, depending on how far it is from your other data points. If the smallest or largest value sits far away from the rest of the data, it is very likely an outlier.
An outlier can also be negative, depending on what your data represents. For example, if you’re measuring temperatures in winter, you may have negative numbers as outliers.