Chapter 9 - More About Telephone Surveys
It's dinner time. The telephone rings. Another telemarketer?
No, a survey interviewer this time.
Your initial reaction is not to cooperate.
The interviewer explains that your household was carefully selected and that obtaining information from you is important to the success of the survey. How would you respond?
Certainly, many questions are raised by calls like this, including the following:
- How did the interviewer get your unlisted telephone number?
- Why won't the interviewer take a polite refusal as final?
So how did the interviewer get your number? And why did the interviewer say it was so important that your household be in the survey? If you have been reading the other chapters in this "What Is a Survey" booklet, you know the answer to the second question, but what about the first?
Generally, it is estimated that 96 percent, or even more, of all U.S. households have at least one telephone. For many topics studied in market research or opinion polling the differences between telephone and non-telephone households are relatively small.
When exactly are telephone households "representative" of all households? Households without a telephone are more common in the South, in rural areas, and on Indian reservations. Somewhat more often they have African-American members, low incomes, and either only one person or six or more persons. Children under age 14 and unemployed adults are also slightly more likely to live in households without telephones.
If the survey topic is related to these characteristics, omitting households without telephones will lead to a bias in the survey results. An example where this bias could be important is in studying crime victimization.
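The size of this bias can be sketched with a little arithmetic. The overall value for all households is a weighted average of the telephone and non-telephone values, so the error from using telephone households alone equals the non-telephone share times the difference between the two groups. The short Python sketch below assumes the 96 percent coverage figure cited above; the two victimization rates are hypothetical numbers chosen only for illustration, and the function name is ours.

```python
def coverage_bias(coverage, mean_covered, mean_not_covered):
    """Bias of a telephone-only estimate relative to the all-household value.

    The all-household value is
        coverage * mean_covered + (1 - coverage) * mean_not_covered,
    so the telephone-only estimate misses it by
        (1 - coverage) * (mean_covered - mean_not_covered).
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    return (1.0 - coverage) * (mean_covered - mean_not_covered)


# Hypothetical rates: 5 percent of telephone households and 10 percent of
# non-telephone households report being victimized.
bias = coverage_bias(coverage=0.96, mean_covered=0.05, mean_not_covered=0.10)
print(f"bias = {bias:+.4f}")
```

With these assumed inputs the telephone-only estimate understates the rate by about two-tenths of a percentage point. The bias grows if either the non-telephone share or the gap between the two groups is larger.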
The decision to use a survey of telephone households to obtain data on a specific topic is not based entirely on the expected level of bias or error that may occur when non-telephone households are not included in the sample. The cost, timeliness, and overall quality of findings are also major considerations.
Telephone surveys are timelier and less expensive than those done face to face. Interviewer effects can be better controlled in telephone surveying. Self-administered mail surveys are less costly to conduct than telephone surveys but generally take more calendar time. See the chapter, More About Mail Surveys, for additional comparisons.
The U.S. Telephone System
Once you decide to conduct a telephone survey, an important issue is where to obtain a sample of telephone households. Most people are familiar with the 10-digit system of telephone numbers (a 3-digit area code, a 3-digit prefix, and a 4-digit suffix). Lately there have been many changes, such as the increase in the number of area codes as existing ones are split. Until recently, area codes did not cross state lines. The introduction of number portability across geographic areas is causing some disruption to this system. For the most part, knowing the area code for a number still tells you in what state the number is located and sometimes in what part of the state.
Prefixes are assigned within area codes to an "exchange." Exchanges are geographic areas set by public service commissions within each state. Exchange boundaries seldom correspond to political boundaries.
Metropolitan areas usually have more than one prefix, rural exchanges often just one. Rural exchanges are typically the same size geographically as urban exchanges, even though they have much smaller populations and lower service needs. A single prefix of 10,000 numbers is more than adequate to meet rural requirements. For most such exchanges only a small share of the 10,000 available numbers are being used for residential or commercial service.
Because most of the geographic area of the United States is rural, most exchanges have only a single prefix; on the average, those rural exchange prefixes have a very low density of numbers currently in use.
Using Telephone Directories
An obvious source for sampling residential numbers would seem to be telephone directories. Approximately 5,000 are published in the United States each year. Not all working residential numbers appear in directories. Excluded are new numbers that were added since the directories were published, plus households choosing not to appear in telephone directories.
As a result, roughly 30 percent or more of all telephone households are not found in directories, although this varies quite a bit across states. What really matters is that unlisted telephone households are different. They are more likely to have lower (not higher!) incomes, to be single-person households, and to be concentrated in metropolitan locations, particularly central cities.
Using Completely Random Telephoning
If directories will not work, then why not simply generate telephone numbers randomly and call them? After all, for each 6-digit area code/prefix combination, one can create a full "telephone number" by appending a randomly generated 4-digit number.
This approach avoids bias but it requires you to call many, many nonworking telephone numbers to obtain the sample you want. The extra numbers called make completely random telephone surveys quite expensive to run, especially in rural areas.
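As a rough illustration of the idea, the Python sketch below builds "telephone numbers" by appending a random 4-digit suffix to an area code/prefix combination. The list of area code/prefix pairs is hypothetical, and the function name is ours; a real survey would start from a complete list of combinations in use.

```python
import random


def random_phone_number(area_code, prefix, rng):
    """Append a random 4-digit suffix (0000 to 9999) to an area code/prefix."""
    suffix = rng.randrange(10_000)
    return f"{area_code}-{prefix}-{suffix:04d}"


# Hypothetical area code/prefix combinations, for illustration only.
combinations = [("734", "555"), ("212", "555")]

rng = random.Random(2024)  # seeded so the sketch is repeatable
sample = [
    random_phone_number(*rng.choice(combinations), rng)
    for _ in range(5)
]
print(sample)
```

Every suffix is equally likely to be generated, whether or not it is a working number, which is why the method is unbiased but wasteful.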
In urban locations, a telephone number that is not in service is often (but not always) attached to a system that alerts the caller by a "tri-tone" followed by a message that the number is not in service. Many rural systems do not have such a recording but instead are attached to a recording of a ringing telephone. Screening randomly generated rural telephone numbers is very expensive because of this feature.
Exactly how bad is this problem of "ring no answers"? If only a small percentage of telephone numbers did not have tri-tones but were connected to ringing recordings, the cost of screening would be low enough so that randomly generated numbers could be used in a survey. Unfortunately, the presence of so many rural exchanges with a single prefix, only partially used, leads to perhaps 75 to 80 percent of the randomly generated numbers being unusable, a rate that makes it simply too expensive to randomly generate numbers and then just call them.
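To see what such a rate means for the workload, note that if a fraction p of the generated numbers is usable, then on average 1/p numbers must be dialed for each usable one found. The Python sketch below applies this to the 75 to 80 percent unusable figures above (that is, 20 to 25 percent usable). The target of 1,000 usable numbers is an assumption made only for illustration, and the count covers dialing to find usable numbers, not completing interviews.

```python
def expected_dials(usable_numbers_needed, hit_rate):
    """Average number of generated numbers to dial to find the needed usable ones."""
    if not 0.0 < hit_rate <= 1.0:
        raise ValueError("hit_rate must be greater than 0 and at most 1")
    return usable_numbers_needed / hit_rate


target = 1000  # assumed target, for illustration only
for hit_rate in (0.20, 0.25):
    dials = expected_dials(target, hit_rate)
    print(f"hit rate {hit_rate:.0%}: about {dials:,.0f} numbers dialed")
```

Under these assumptions, four to five numbers must be dialed for every usable one.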
If we could determine the location of working residential telephone numbers within a given area code/prefix, telephone sampling would be straightforward. Working residential numbers are known to be clustered, but the location of these clusters is not known.
Calling a local telephone company would be time consuming and costly; moreover, they usually will not give out this information.
A Clever Idea
A statistician then working for CBS News, Warren Mitofsky, developed a method based on the clustering of telephone numbers. His method greatly improved telephone surveying, making it economically feasible on a large scale. The approach was two-phased. In phase one, he generated a relatively small sample of completely random telephone numbers by appending random 4-digit suffixes to known area code/prefix combinations and had interviewers call those numbers. Only 25 percent turned out to be working residential numbers.
In phase two, he had interviewers call additional numbers "close" to those that turned out to be working residential. He defined "close" to be numbers that were in the same "100-bank," a set of numbers that have the same first two digits of the suffix. For example, suppose that the randomly generated telephone number 734-555-6789 was a working residential telephone number. Mitofsky would have interviewers dial other numbers selected at random in the sequence from 734-555-6700 to 734-555-6799. When he did this, 65 percent of the numbers dialed within those 100-banks were working residential, a big improvement over the 25 percent working residential in the first stage of the sampling. This two-stage design greatly increases the "hit rate" of working residential numbers in the second stage and considerably improves the efficiency of telephone sampling operations.
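The 100-bank idea can be sketched in a few lines of Python. The code below lists the 100 numbers that share all but the last two digits with a working residential number found in phase one, and then draws a few other numbers at random from that bank. It is a simplified illustration under our own assumptions: the function names are ours, and the rule the actual method uses to decide how many numbers to dial in each bank is not shown.

```python
import random


def hundred_bank(number):
    """Return the 100 numbers that share all but the last two digits."""
    stem = number[:-2]
    return [f"{stem}{last_two:02d}" for last_two in range(100)]


def second_stage_sample(seed_number, how_many, rng):
    """Draw other numbers at random from the seed number's 100-bank."""
    candidates = [n for n in hundred_bank(seed_number) if n != seed_number]
    return rng.sample(candidates, how_many)


seed = "734-555-6789"  # the working residential number from the example above
bank = hundred_bank(seed)
print(bank[0], "to", bank[-1])

picks = second_stage_sample(seed, 5, random.Random(1))
print(picks)
```

The first print statement shows the bank running from 734-555-6700 to 734-555-6799, matching the example in the text.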
Mitofsky was unsure of some of the statistical properties of his approach, so he asked a colleague at Westat, Joseph Waksberg, to optimize it. Waksberg found several useful properties of the design. The design became known as the two-stage Mitofsky-Waksberg method.
Their method rapidly became standard for selecting telephone samples of households (and, in a few instances, even of business firms). It was inexpensive to obtain a list of all area codes and prefixes, generate numbers randomly for the first stage, and then call them to find out which were working and residential. In the second stage, within a "working residential 100-bank," the higher hit rate reduced the amount of dialing that had to be done by interviewers.
Another Clever Idea
The Mitofsky-Waksberg method was not without a few problems, and researchers continued to look for other ways to select samples more efficiently. They eventually went back to the telephone directories and augmented them in a way that incorporated Mitofsky's essential insight and reduced costs still further. This method, known as "list-assisted," employs a commercial list as the starting point for sampling.
Commercial firms that mail advertisements to households need lists of households with complete addresses, including zip codes. There is no master list of households in the United States available from public sources, so a commercial firm, MetroMail, Inc., developed such a list from telephone directories.
Their list is updated continuously as telephone directories are published throughout the year. Approximately 65 million U.S. telephone households are maintained on the file. The list is supplemented with lists of automobile registrations from more than 30 states that sell these lists. The resulting file contains more than 75 million households.
A second firm, R.H. Donnelley, Inc., utilizes a computer program that matches addresses to zip codes and assigns a zip code to each entry on the MetroMail file. They also assign data from the most recent Census of Population and Housing to each household. However, the census data is limited to information about the block or the census tract where the household is located.
Even after supplementation, the combined commercial list is almost entirely made up of listed telephone numbers. A sample from it would yield selections subject to the same kinds of concerns that are raised for directory-based samples. The commercial list does contain valuable information about the location of telephone numbers within area code/prefix combinations and a mailing address that can be used to do follow-up mailings to nonrespondents.
If sorted by telephone number, the commercial list provides a way to screen out 100-banks that did not have any listed numbers without having to do a first stage of sample selection. This allows telephone survey organizations to drop 100-banks that did not have any listed numbers and draw samples at random from within the remaining 100-banks. This design became known as "list-assisted" because the random selections were "assisted" by preliminary screening based on listed working residential numbers.
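A minimal Python sketch of this screening step follows. It counts listed numbers by 100-bank, keeps only the banks that meet a minimum count, and then generates random numbers within the remaining banks. The listed numbers are made up, the function names and the min_listed parameter are our own, and for simplicity each eligible bank is given the same chance of selection and numbers are drawn with replacement.

```python
import random
from collections import Counter


def bank_of(number):
    """Identify a number's 100-bank by dropping its last two digits."""
    return number[:-2]


def eligible_banks(listed_numbers, min_listed=1):
    """Keep the 100-banks that contain at least min_listed listed numbers."""
    counts = Counter(bank_of(n) for n in listed_numbers)
    return sorted(bank for bank, count in counts.items() if count >= min_listed)


def list_assisted_sample(listed_numbers, sample_size, rng, min_listed=1):
    """Generate random numbers within the 100-banks that pass the screen."""
    banks = eligible_banks(listed_numbers, min_listed)
    return [
        f"{rng.choice(banks)}{rng.randrange(100):02d}"
        for _ in range(sample_size)
    ]


# Made-up listed numbers, for illustration only.
listed = ["734-555-6701", "734-555-6789", "734-555-1234"]
banks = eligible_banks(listed)
sample = list_assisted_sample(listed, 6, random.Random(7))
print(banks)
print(sample)
```

Because the final two digits are generated at random, unlisted numbers in a screened-in bank can still be selected; only banks with no listed numbers at all are left out.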
The list-assisted method has become a popular alternative to the Mitofsky-Waksberg design. "Hit rates" among randomly generated numbers drawn from 100-banks with one or more listed numbers were initially around 55 percent, a drop from the 65 percent of the second stage of the Mitofsky-Waksberg method. But the list-assisted method proved to be easier to administer and had slightly better properties in terms of the reliability of estimates derived from its samples.
Several commercial firms began to purchase the counts of listed numbers by 100-bank from Donnelley. These firms selected samples from those 100-banks with one (or sometimes two) or more listed numbers and sold the samples to various market research and public opinion survey organizations. Now, a survey organization no longer had to generate its own sample. It could simply buy it!
Over time, samples have become increasingly sophisticated. Sampling firms link information about the geography of the exchange, or even the prefix, or 100-bank to each sample number and sell "targeted" samples that would have higher proportions of households with specific characteristics. For example, a researcher may want a sample that would have a higher yield of households with annual incomes above a certain level. Telephone samples based on income information linked to the bank, or prefix, for the number are readily available.
The telephone system continues to change as new services and, with deregulation, new providers enter the market. Consider three challenges:
Cell Phones
Currently, there are nearly 70 million cell-phone subscribers in the United States. Most can still be contacted via a traditional (land line) telephone in a household. Because of this, cell-phone numbers can be, and typically are, excluded from sampling at the outset, since they can be identified by their NXX codes. Conceivably, cell-phone subscribers may begin using their cell telephones for residential purposes, requiring that samples of such numbers be taken. In that case, households with both traditional and cell phones would have a higher chance of being selected than households without cell phones. To deal with this "overrepresentation," we could correct the probabilities of selection, just as is now being done for households with multiple telephone lines.
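One common way to make such a correction is to weight each responding household by the inverse of the number of telephone numbers through which it could have been selected. The Python sketch below illustrates that idea under simplifying assumptions: every number has the same chance of selection, and the household records and survey values are invented for illustration.

```python
def selection_weight(numbers_reaching_household):
    """Weight a household by the inverse of its count of selectable numbers."""
    if numbers_reaching_household < 1:
        raise ValueError("a sampled household must have at least one number")
    return 1.0 / numbers_reaching_household


def weighted_mean(households):
    """Average the survey value y using the selection weights."""
    weights = [selection_weight(h["numbers"]) for h in households]
    total = sum(w * h["y"] for w, h in zip(weights, households))
    return total / sum(weights)


# Invented records: household B can be reached through two numbers,
# so it had twice the chance of selection and gets half the weight.
households = [
    {"id": "A", "numbers": 1, "y": 10.0},
    {"id": "B", "numbers": 2, "y": 20.0},
]
estimate = weighted_mean(households)
print(f"weighted mean = {estimate:.2f}")
```

Without the weights the mean of these two records would be 15; with them, household B counts half as much and the estimate moves toward household A.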
Answering Machines and Caller ID
Answering machines and caller-ID services pose a growing challenge to telephone survey organizations. Recent data show that as many as 55 percent of all telephone households report that they use an answering machine to screen calls most of the time, or always. Organizations conducting telephone surveys often leave messages on answering machines with toll-free numbers for households to call. A surprising number of households using an answering machine to screen calls can eventually be reached through toll-free numbers or repeated attempts to reach the household when the answering machine or caller ID is not being used for screening.
Falling Response Rates
As survey researchers learn more and more about features of the telephone system, they continue to modify telephone sampling procedures to make them more efficient. One challenge that they have not yet fully addressed is the near-saturation calling conducted by telemarketers and the effect this has had on lowering survey cooperation rates. Survey researchers must work to reverse this trend in order to maintain the scientific validity of telephone surveys. Otherwise, telephone surveys, as we know them, could disappear within the next five years.
Where Can I Get More Information?
Since this chapter was written, response rates have continued to fall in virtually all surveys, but especially in telephone surveys. Even so, fortunately, the statement that "telephone surveys, as we know them, could disappear within the next five years" has proved premature.
Efforts to improve the modeling and measurement of nonresponse biases have gone hand in hand with greater use of mixed-mode surveys that combine mail surveys, which remain relatively cheap, with Internet surveys, which can be cheaper still but have their own nonresponse problems.
The explosion of cell phone use bears watching, as does the use of the "No Call" list, which, although aimed at telemarketers, also affects telephone survey response rates in ways that are yet to be determined.
Two listservs where news on changes in this data collection mode regularly appears are SRMSNET and AAPORNET, sponsored respectively by the ASA Section on Survey Research Methods and the American Association for Public Opinion Research.