Page 98 - DLIS401_METHODOLOGY_OF_RESEARCH_AND_STATISTICAL_TECHNIQUES

Unit 6: Sampling Techniques




          6.1    Population Definition


          Successful statistical practice is based on focused problem definition. In sampling, this includes
          defining the population from which our sample is drawn. A population can be defined as
          including all people or items with the characteristic one wishes to understand. Because there
          is very rarely enough time or money to gather information from everyone or everything in a
          population, the goal becomes finding a representative sample (or subset) of that population.
          Sometimes that which defines a population is obvious. For example, a manufacturer needs to
          decide whether a batch of material from production is of high enough quality to be released
          to the customer, or should be sentenced for scrap or rework due to poor quality. In this case,
          the batch is the population.
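
          The batch example can be sketched in code. This is a minimal illustration, not a
          prescribed acceptance-sampling plan: the batch contents, the sample size of 50, and the
          helper names sample_batch and defect_rate are all hypothetical.

```python
import random


def sample_batch(batch, n, seed=None):
    """Draw a simple random sample of n items from the batch, without replacement."""
    rng = random.Random(seed)
    return rng.sample(batch, n)


def defect_rate(sample, is_defective):
    """Share of sampled items that the supplied check flags as defective."""
    return sum(1 for item in sample if is_defective(item)) / len(sample)


# Hypothetical batch (the population): 1000 items, some marked defective.
batch = [{"id": i, "defective": (i % 25 == 0)} for i in range(1000)]

# Inspect only 50 items instead of the whole batch.
sample = sample_batch(batch, 50, seed=1)
rate = defect_rate(sample, lambda item: item["defective"])
```

          The observed rate in the sample is then used to decide about the whole batch; how
          large the sample must be for that decision to be reliable is a separate question.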
          Although the population of interest often consists of physical objects, sometimes we need to
          sample over time, space, or some combination of these dimensions. For instance, an investigation
          of supermarket staffing could examine checkout line length at various times, or a study on
          endangered penguins might aim to understand their usage of various hunting grounds over
          time. For the time dimension, the focus may be on periods or discrete occasions.
          In other cases, our ‘population’ may be even less tangible. For example, Joseph Jagger studied
          the behaviour of roulette wheels at a casino in Monte Carlo, and used this to identify a biased
          wheel. In this case, the ‘population’ Jagger wanted to investigate was the overall behaviour of
          the wheel (i.e., the probability distribution of its results over infinitely many trials), while his
          ‘sample’ was formed from observed results from that wheel. Similar considerations arise when
          taking repeated measurements of some physical characteristic such as the electrical conductivity
          of copper.
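
          The idea of a wheel's long-run behaviour as the 'population' and the observed spins
          as the 'sample' can be illustrated with a small simulation. Everything here is an
          assumption made for illustration: the 37-pocket layout, the size of the bias, the
          biased pocket, and the number of spins are not taken from Jagger's actual records.

```python
import random
from collections import Counter

POCKETS = list(range(37))  # assumed single-zero wheel: pockets 0..36


def spin_many(weights, n_spins, seed=None):
    """Simulate n_spins results; weights define the wheel's true (unseen) distribution."""
    rng = random.Random(seed)
    return rng.choices(POCKETS, weights=weights, k=n_spins)


def relative_frequencies(spins):
    """Observed share of spins landing in each pocket."""
    counts = Counter(spins)
    n = len(spins)
    return {pocket: counts.get(pocket, 0) / n for pocket in POCKETS}


# Hypothetical biased wheel: pocket 7 is 1.5 times as likely as any other pocket.
weights = [1.0] * 37
weights[7] = 1.5

spins = spin_many(weights, 10_000, seed=42)   # the 'sample'
freqs = relative_frequencies(spins)           # estimate of the 'population'
fair_share = 1 / 37                           # what an unbiased wheel would give
```

          Comparing each entry of freqs with fair_share shows how a finite record of spins
          can point to a pocket that comes up more often than an unbiased wheel would allow.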
          This situation often arises when we seek knowledge about the cause system of which the
          observed population is an outcome. In such cases, sampling theory may treat the observed
          population as a sample from a larger ‘superpopulation’. For example, a researcher might
          study the success rate of a new ‘quit smoking’ program on a test group of 100 patients, in
          order to predict the effects of the program if it were made available nationwide. Here the
          superpopulation is “everybody in the country, given access to this treatment”, a group which
          does not yet exist, since the program isn’t yet available to all.
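
          A rough sketch of how such a prediction might be quantified follows. The count of 30
          successes is a made-up figure, not a result from any study, and the normal-approximation
          interval is one common choice among several. The interval is only meaningful if the test
          group can be treated as a random sample from the superpopulation.

```python
import math


def proportion_interval(successes, n, z=1.96):
    """Point estimate and normal-approximation interval for a proportion.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clamp to [0, 1] because a proportion cannot fall outside that range.
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical outcome: 30 of the 100 patients in the test group quit smoking.
p_hat, low, high = proportion_interval(successes=30, n=100)
```

          The width of the interval reflects sampling variability only; it says nothing about
          whether the test group differs systematically from the nationwide population.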
          Note also that the population from which the sample is drawn may not be the same as the
          population about which we actually want information. Often there is a large but not complete
          overlap between these two groups because of frame issues and similar problems. Sometimes they
          may be entirely separate; for instance, we might study rats in order to get a better understanding of human
          health, or we might study records from people born in 2008 in order to make predictions
          about people born in 2009.
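
          The overlap between the two populations can be made concrete with sets. This is only
          an illustrative sketch: the member names and the helper coverage_summary are hypothetical,
          and real populations can rarely be enumerated this neatly.

```python
def coverage_summary(target, sampled):
    """Compare the population of concern with the population actually sampled."""
    target, sampled = set(target), set(sampled)
    if not target:
        raise ValueError("target population must not be empty")
    overlap = target & sampled
    return {
        "covered_share": len(overlap) / len(target),  # part of the target we can reach
        "undercoverage": target - sampled,            # of interest, but cannot be sampled
        "overcoverage": sampled - target,             # can be sampled, but not of interest
    }


# Hypothetical members of each population.
target_population = {"ann", "bob", "cho", "dev", "eli"}
sampled_population = {"bob", "cho", "dev", "fay"}

summary = coverage_summary(target_population, sampled_population)
```

          When the two populations are entirely separate, as in the rat example, covered_share
          is zero and any conclusion about the target rests on assumptions beyond the sample.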
          Time spent in making the sampled population and population of concern precise is often well
          spent, because it raises many issues, ambiguities and questions that would otherwise have
          been overlooked at this stage.

          6.2    Sampling Frame


          In the most straightforward case, such as the sentencing of a batch of material from production
          (acceptance sampling by lots), it is possible to identify and measure every single item in the
          population and to include any one of them in our sample. However, in the more general case
          this is not possible. There is no way to identify all rats in the set of all rats. There is no way
          to identify every voter at a forthcoming election (in advance of the election).





                                           LOVELY PROFESSIONAL UNIVERSITY                                    93