Vocabulary words: 8
Statistics are unavoidable, so a basic understanding of how they work will make you at least an informed consumer, and maybe you’ll carry that knowledge into further education. So, let’s get a couple of definitions out of the way.
Data consists of information coming from observations, counts, measurements, or responses.
Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.
The first concepts we’ll talk about are populations and samples. Where a population includes everyone of interest, a sample is just a subset of the population.
A population is the collection of all outcomes, responses, measurements, or counts that are of interest.
A sample is a subset, or part, of a population.
Getting data from an entire population, say the height of every person living in Manville, would be difficult. So instead, you sample the population (yes, it’s a verb, too) to give you an idea of what the entire population looks like.
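As a rough illustration of sampling as a verb, the Python sketch below builds a made-up population of heights and draws a random sample from it. The town size of 5,000, the height distribution, and the sample size of 50 are all assumptions for the example, not real data about Manville.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (in inches) of 5,000 residents.
# These values are simulated, not measured.
population = [random.gauss(67, 4) for _ in range(5000)]

# Sampling the population: pick 50 residents at random, without repeats.
sample = random.sample(population, 50)

print(len(population))  # 5000
print(len(sample))      # 50
```

Every height in `sample` also appears in `population`, which is all that "subset" means here.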
In our made-up scenario just above, we were getting the heights of everyone in Manville, after which we could find the average height of the town’s residents. This brings us to our next two definitions.
A parameter is a numerical description of a population characteristic.
A statistic is a numerical description of a sample characteristic.
So, if we wanted the average height of everyone in this class, the class would be the population and the average height would be a parameter. Think of it as a cold, hard fact.
But if we took a sample of people living in town and came up with an average, it’s a statistic instead. As a description of the whole town, it’s not a fact, more of an educated guess.
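To make the distinction concrete, here is a small Python sketch that uses simulated heights rather than real measurements. The mean of the whole list plays the role of the parameter, and the mean of each random sample is a statistic. The names `population_heights` and `sample_means`, the population size, and the sample size of 40 are all hypothetical.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical population of 5,000 heights in inches (simulated data).
population_heights = [random.gauss(67, 4) for _ in range(5000)]

# Parameter: computed from every member of the population.
parameter = statistics.mean(population_heights)
print(f"parameter (population mean): {parameter:.2f}")

# Statistic: computed from a sample. Three separate samples are drawn
# to show that the statistic changes while the parameter does not.
sample_means = []
for trial in range(3):
    sample_heights = random.sample(population_heights, 40)
    sample_means.append(statistics.mean(sample_heights))
    print(f"statistic from sample {trial + 1}:     {sample_means[-1]:.2f}")
```

The three sample means should land near the population mean without matching it exactly, which is the sense in which a statistic is an educated guess about a parameter.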
Lastly, we’re going to distinguish two branches of statistics.
Descriptive statistics is the branch of statistics that involves the organization, summarization, and display of data.
Inferential statistics is the branch of statistics that involves using a sample to draw conclusions about a population. A basic tool in the study of inferential statistics is probability.
New scenario. A large sample of men, aged 48, was studied for 18 years. For unmarried men, approximately 70% were alive at age 65. For married men, 90% were alive at age 65. Those last two statements are descriptive statistics. They are summaries of the data found in the study.
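The summarizing step can be sketched in Python. The study’s raw records are not given here, so the ten men per group below are invented purely so that the percentages come out to 70% and 90%; only those two percentages come from the scenario. The helper `percent_alive` is a hypothetical name for this example.

```python
# Hypothetical records: (marital status, alive at 65?).
# The counts are made up so the summaries match the scenario's 70% and 90%.
records = (
    [("unmarried", True)] * 7 + [("unmarried", False)] * 3
    + [("married", True)] * 9 + [("married", False)] * 1
)

def percent_alive(records, status):
    """Return the percentage of men with the given status who were alive at 65."""
    outcomes = [alive for s, alive in records if s == status]
    return 100 * sum(outcomes) / len(outcomes)

print(percent_alive(records, "unmarried"))  # 70.0
print(percent_alive(records, "married"))    # 90.0
```

Nothing in this code reaches beyond the records themselves, which is what keeps it descriptive.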
Inferential statistics would be taking that information and claiming that married men, in general, live longer. We used information gathered from a sample to draw a conclusion about a population. This is the trickier part of statistics, because how do you “prove” that claim is correct?