The arithmetic mean, or simply the mean, is the calculated value that is the most equidistant from all other values in a dataset. That is, when all values are considered in relation to each other, the mean is the value that balances the distances among them: the signed distances from each value to the mean add up to zero. The mean is a parametric measure of central tendency and the appropriate average for statistical analysis at the interval or ratio level.
Interpreting a mean
The mean is usually interpreted simply as the average value in a dataset, although, literally, it is a calculated value that is the most equidistant from all others in the dataset.
For example, if we have the set of values 1, 2, 3, their mean is 2. The distances between the values and the mean are -1, 0, +1. When added up, they result in 0. Thus, the mean 2 is the "middle" value that best represents all other values in the dataset. If we have the set of values 1, 2, 9, their mean is 4. The distances between the values and the mean are -3, -2, +5. When added up, they also result in 0. Again, the mean 4 is the "middle" value that best represents all other values in the dataset.
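The two examples above can be checked with a short Python sketch. The helper name `deviations` is only an illustrative choice, not part of any library; it uses `statistics.mean` from the standard library and returns the mean together with each value's signed distance from it.

```python
from statistics import mean


def deviations(values):
    """Return the mean and each value's signed distance from it.

    Hypothetical helper written for this illustration only.
    """
    m = mean(values)
    return m, [v - m for v in values]


# The two small datasets used in the text.
for data in ([1, 2, 3], [1, 2, 9]):
    m, devs = deviations(data)
    # The signed distances always cancel out, so their sum is 0.
    print(data, "mean:", m, "distances:", devs, "sum:", sum(devs))
```

For both datasets the distances add up to 0, which is the sense in which the mean sits in the "middle" of the values.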
Notice that the mean does not need to be one of the "real" values in the sample. The mean is simply the result of a formula. This is why you can read things like "the mean population growth of a country is 1.3 children per couple per year", even though nobody can actually have 1.3 or 0.3 of a child.
When used in a quasi-inferential manner, the mean indicates the most equidistant value to expect in a future dataset.
For example, imagine that you need to fly to the same destination as above in the near future and you want to save money for the ticket. You need to manage your money well: you don't want to save too little and find yourself with no money for the airfare, nor do you want to save too much and go hungry. How much money should you save? If you have no information other than your past experience, then you can consider the mean the most accurate value you can rely on and, thus, expect your future airfare to be $150. This is the amount of money that represents the best compromise for your financial goals and, thus, the amount that you should save for your future ticket.
- The mean is probably the most popular measure of central tendency. This is why you often find results such as a population growth of 1.3 children per couple per year. We may understand the meaning, but the literal interpretation is nonetheless unrealistic.
- The mean is used with interval and ratio variables. It can be used with ordinal variables, although the result is more of a median than a mean proper. However, it does not make sense to calculate the mean of nominal or categorical variables.
- Because the mean is the most equidistant value from all other values in the sample, it is highly sensitive to extreme values, to the point that results can become ridiculous.
For example, imagine that you do a survey at work and collect data from a sample of 5 workers regarding monthly pay at your company. The dataset is something like the following (in thousands of $): 4, 2, 3, 3, 3; mean = 3 ($3,000/month). The mean seems to represent well the average monthly pay in the sample and, probably, in the organization. However, if by chance you get a big earner (an extreme case) in your sample, the results may be quite misleading, e.g. 39, 2, 3, 3, 3; mean = 10 ($10,000/month). In this second case, the extreme value has such an effect that the mean increases by $7,000, which is definitely not a representative average. Ways of correcting the effect of extreme cases are, for example, to eliminate such cases or to use a truncated mean.
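The effect of the extreme case, and one possible correction, can be sketched in Python. The function `truncated_mean` and its parameter `k` are assumptions made for this illustration: the sketch simply drops the `k` smallest and `k` largest values before averaging, which is one common way of defining a truncated mean and not necessarily the only one.

```python
from statistics import mean


def truncated_mean(values, k=1):
    """Mean after dropping the k smallest and k largest values.

    Hypothetical helper for illustration; k=1 is an arbitrary choice.
    """
    if len(values) <= 2 * k:
        raise ValueError("not enough values left after trimming")
    ordered = sorted(values)
    return mean(ordered[k:len(ordered) - k])


# Monthly pay in thousands of $, as in the survey example.
typical = [4, 2, 3, 3, 3]
with_big_earner = [39, 2, 3, 3, 3]

print(mean(typical))                    # plain mean of the first sample
print(mean(with_big_earner))            # plain mean pulled up by the extreme value
print(truncated_mean(with_big_earner))  # mean after trimming one value at each end
```

With one value trimmed from each end, the second sample is reduced to 3, 3, 3, so the truncated mean returns to 3 ($3,000/month) while the plain mean stays at 10.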
Want to know more?
- Wikipedia - Arithmetic mean
- You can learn a bit more about the (arithmetic) mean in this Wikipedia page.
- Khan Academy - Mean, median and mode
- Khan teaches about central tendency measures (mean, median and mode) in this video.
Khan Academy (undated; embedded from YouTube on 28 April 2012)