Measuring Central Tendency
The Statsville Health Club prides itself on its ability to find the perfect class for everyone. Whether you want to learn how to swim, practice martial arts, or get your body into shape, they have just the right class for you. The staff at the health club have noticed that their customers seem happiest when they’re in a class with people their own age, and happy customers always come back for more. It seems that the key to success for the health club is to work out what a typical age is for each of their classes, and one way of doing this is to calculate the average. The average gives a representative age for each class, which the health club can use to help their customers pick the right class.
The mean is the sum of all the values divided by the number of values.
The mean is represented by the Greek letter μ (mu, pronounced "mew").
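As a rough illustration of the calculation, here is a minimal Python sketch. The ages are made up for the example and are not the health club's actual data, and the function name `mean` is just a convenient label.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


# Hypothetical ages for one class (illustrative only).
ages = [19, 20, 20, 21, 20]

mu = mean(ages)  # (19 + 20 + 20 + 21 + 20) / 5
print(mu)  # 20.0
```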
Sometimes the mean is pulled, or distorted, by outliers: extreme high or low values that stand out from the rest of the data. When outliers pull the data to the left or right, the data is said to be skewed.
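To see how much a single outlier can move the mean, here is a small sketch. The ages, including the one extreme value, are invented for illustration.

```python
# Hypothetical class ages (illustrative only).
typical_ages = [19, 20, 20, 21, 20]

# The same class after one much older person joins: an outlier.
ages_with_outlier = typical_ages + [100]

mean_without = sum(typical_ages) / len(typical_ages)
mean_with = sum(ages_with_outlier) / len(ages_with_outlier)

print(mean_without)         # 20.0
print(round(mean_with, 2))  # 33.33
```

One extra person pulls the mean from 20 up to about 33, an age that nobody in this made-up class actually is.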
If the mean becomes misleading because of skewed data and outliers, then we need some other way of saying what a typical value is. We can do this by, quite literally, taking the middle value. This is a different sort of average, and it’s called the median. To find the median of the Kung Fu class, line up all the ages in ascending order, and then pick the middle value.
If you have an even number of values, just take the mean of the two middle numbers (add them together, and divide by 2), and that’s your median.
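The two cases (odd and even counts) can be sketched in Python as follows. The ages are hypothetical, and the function is a plain illustration of the steps described above rather than anything specific to the health club.

```python
def median(values):
    """Return the middle value of the sorted data.

    With an even number of values, return the mean of the two middle values.
    """
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)  # line the values up in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: a single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle pair


# Hypothetical ages (illustrative only).
odd_ages = [19, 20, 20, 21, 100]
even_ages = [19, 20, 21, 100]

print(median(odd_ages))   # 20
print(median(even_ages))  # 20.5
```

Note that the outlier of 100 barely affects the median, which is why the median is useful for skewed data.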
The mean and median for the class are both 17, even though there are no 17-year-olds in the class!
But what if there had been an odd number of people in the class? Both the mean and median would still have been misleading.
Additional Problem is
Why do you think the mean and median both failed for this data? Why are they misleading?
Both the mean and median are misleading for this set of data because neither fully represents the typical ages of people in the class. The mean suggests that teenagers go to the class, when in fact there are none. The median also has this problem, but it can fluctuate wildly if other people join the class.
If you had to pick one age to represent this class, what would it be? Why?
It’s not really possible to pick a single age that fully represents the ages in the class. The class is really made up of two sets of ages, those of the children and those of the parents. You can’t really represent both of these groups with a single number.
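The sketch below reproduces this kind of situation with made-up numbers: a class of young children and their parents, chosen so that the mean and median both come out at 17. It uses `mean` and `median` from Python's standard `statistics` module; the ages themselves are an assumption for illustration, not the class's real data.

```python
from statistics import mean, median

# Hypothetical class: three young children and three parents.
ages = [2, 2, 2, 32, 32, 32]

class_mean = mean(ages)      # 102 / 6
class_median = median(ages)  # (2 + 32) / 2

print(class_mean == 17, class_median == 17)  # True True
print(17 in ages)                            # False

# With an odd number of people, the median jumps to one group or the other.
median_if_parent_joins = median(ages + [32])
median_if_child_joins = median(ages + [2])

print(median_if_parent_joins)  # 32
print(median_if_child_joins)   # 2
```

A single extra person swings the median from 2 to 32, and neither value describes the whole class.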
The mode of a set of data is the most popular value: the value with the highest frequency. Unlike the mean and median, the mode always has to be a value in the data set.
If a set of data has two modes, then we call the data bimodal. The mode doesn’t just work with numeric data; it works with categorical data, too. When you’re dealing with categorical data, the mode is the most frequently occurring category.
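Here is one way to find the mode (or modes) by counting frequencies, as a minimal sketch. The helper name `modes`, the ages, and the class sign-ups are all hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


# Hypothetical bimodal ages: two values tie for most frequent.
ages = [2, 2, 2, 32, 32, 32]
print(modes(ages))  # [2, 32]

# Hypothetical categorical data: which class each customer signed up for.
signups = ["swimming", "kung fu", "swimming", "yoga"]
print(modes(signups))  # ['swimming']
```

Because the function only counts occurrences, it works the same way for numbers and for categories.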
When do you think the mode is most useful?
When the data set has a low number of modes, or when the data is categorical instead of numerical. Neither the mean nor the median can be used with categorical data.
When is the mode least useful?
When there are many modes.
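For instance, when every value occurs exactly once, every value is a mode, so the mode tells you nothing about what is typical. A quick sketch with invented ages, using `multimode` from Python's standard `statistics` module (available from Python 3.8):

```python
from statistics import multimode

# Hypothetical ages where no value repeats (illustrative only).
ages = [19, 20, 21, 22, 23]

all_modes = multimode(ages)
print(all_modes)  # [19, 20, 21, 22, 23]
```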
Which to use when