Page 98 - DLIS401_METHODOLOGY_OF_RESEARCH_AND_STATISTICAL_TECHNIQUES
Unit 6: Sampling Techniques
6.1 Population Definition
Successful statistical practice is based on focused problem definition. In sampling, this includes
defining the population from which our sample is drawn. A population can be defined as
including all people or items with the characteristic one wishes to understand. Because there
is very rarely enough time or money to gather information from everyone or everything in a
population, the goal becomes finding a representative sample (or subset) of that population.
Sometimes that which defines a population is obvious. For example, a manufacturer needs to
decide whether a batch of material from production is of high enough quality to be released
to the customer, or should be sentenced for scrap or rework due to poor quality. In this case,
the batch is the population.
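The batch example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the batch size, the number of defective items, the sample size and the acceptance rule (`max_defects`) are all invented here and do not come from the text.

```python
import random

def inspect_batch(batch, sample_size, max_defects, seed=None):
    """Draw a simple random sample (without replacement) from the batch
    and apply a hypothetical accept/reject rule."""
    rng = random.Random(seed)
    sample = rng.sample(batch, sample_size)
    defects = sum(sample)  # items are coded 1 = defective, 0 = good
    return defects, defects <= max_defects

# Hypothetical population: one batch of 1000 items, 30 of them defective.
batch = [1] * 30 + [0] * 970

defects, accepted = inspect_batch(batch, sample_size=50, max_defects=2, seed=1)
print("defects in sample:", defects, "| batch accepted:", accepted)
```

Because every item in the batch can be listed, every item has the same chance of entering the sample. That is what makes this the obvious case.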
Although the population of interest often consists of physical objects, sometimes we need to
sample over time, space, or some combination of these dimensions. For instance, an investigation
of supermarket staffing could examine checkout line length at various times, or a study on
endangered penguins might aim to understand their usage of various hunting grounds over
time. For the time dimension, the focus may be on periods or discrete occasions.
In other cases, our ‘population’ may be even less tangible. For example, Joseph Jagger studied
the behaviour of roulette wheels at a casino in Monte Carlo, and used this to identify a biased
wheel. In this case, the ‘population’ Jagger wanted to investigate was the overall behaviour of
the wheel (i.e., the probability distribution of its results over infinitely many trials), while his
‘sample’ was formed from observed results from that wheel. Similar considerations arise when
taking repeated measurements of some physical characteristic such as the electrical conductivity
of copper.
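The roulette example can be made concrete with a small simulation. The sketch below is not a reconstruction of Jagger's actual method; it assumes a single-zero wheel with 37 pockets, an invented bias (pocket 7 twice as likely as any other) and a chi-square goodness-of-fit statistic as one possible way to compare observed spins with a fair wheel.

```python
import random
from collections import Counter

POCKETS = 37  # assumed single-zero wheel, pockets 0..36

def spin(weights, n_spins, seed):
    """Observe n_spins results: a finite sample from the wheel's distribution."""
    rng = random.Random(seed)
    return Counter(rng.choices(range(POCKETS), weights=weights, k=n_spins))

def chi_square_vs_fair(counts, n_spins):
    """Chi-square statistic of the observed counts against a fair wheel."""
    expected = n_spins / POCKETS
    return sum((counts[p] - expected) ** 2 / expected for p in range(POCKETS))

n = 37_000
fair = [1] * POCKETS
biased = [1] * POCKETS
biased[7] = 2  # hypothetical bias: pocket 7 comes up twice as often

fair_stat = chi_square_vs_fair(spin(fair, n, seed=0), n)
biased_stat = chi_square_vs_fair(spin(biased, n, seed=0), n)
print(round(fair_stat, 1), round(biased_stat, 1))
```

For a fair wheel the statistic averages about 36 (the degrees of freedom, 37 - 1), so a much larger value suggests that the observed sample did not come from a uniform distribution.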
This situation often arises when we seek knowledge about the cause system of which the
observed population is an outcome. In such cases, sampling theory may treat the observed
population as a sample from a larger ‘superpopulation’. For example, a researcher might
study the success rate of a new ‘quit smoking’ program on a test group of 100 patients, in
order to predict the effects of the program if it were made available nationwide. Here the
superpopulation is “everybody in the country, given access to this treatment”, a group which
does not yet exist, since the program isn’t yet available to all.
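As a rough illustration of this kind of inference, the sketch below estimates a success rate and a normal-approximation 95% interval from a test group of 100. The count of 27 successes is invented for the example. The interval says something about the superpopulation only under the assumption that the test group behaves like a random sample from it.

```python
import math

def proportion_with_interval(successes, n, z=1.96):
    """Sample proportion with a normal-approximation (Wald) interval.
    z = 1.96 corresponds to roughly 95% coverage."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p_hat
    return p_hat, p_hat - z * se, p_hat + z * se

# Hypothetical result: 27 of the 100 patients in the test group quit smoking.
p_hat, low, high = proportion_with_interval(successes=27, n=100)
print(f"estimated success rate {p_hat:.2f}, interval ({low:.3f}, {high:.3f})")
```

The width of the interval reflects sampling variation only. It does not account for differences between the test group and the people who would eventually use the program.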
Note also that the population from which the sample is drawn may not be the same as the
population about which we actually want information. Often there is a large but incomplete
overlap between these two groups, for example because of frame issues. Sometimes they may be entirely
separate; for instance, we might study rats in order to get a better understanding of human
health, or we might study records from people born in 2008 in order to make predictions
about people born in 2009.
Time spent in making the sampled population and population of concern precise is often well
spent, because it raises many issues, ambiguities and questions that would otherwise have
been overlooked at this stage.
6.2 Sampling Frame
In the most straightforward case, such as the sentencing of a batch of material from production
(acceptance sampling by lots), it is possible to identify and measure every single item in the
population and to include any one of them in our sample. However, in the more general case
this is not possible. There is no way to identify all rats in the set of all rats. There is no way
to identify every voter at a forthcoming election (in advance of the election).
LOVELY PROFESSIONAL UNIVERSITY