Analytic
The assignment due date is January 24th. Please read the instructions carefully.
The assignment must be in APA format, and in-text citations and research references must be included.
2 Chapter 1 | The Where, Why, and How of Data Collection
Business Statistics
A collection of procedures and techniques that are used to convert data into meaningful information in a business environment.
1.1
Chapter 18 provides an overview of business analytics and introduces you to Microsoft analytics software called Microsoft Power BI. People working in this field are referred to as “data scientists.” Doing an Internet search on data mining will yield a large number of sites that describe the field.
In today’s workplace, you can have an immediate competitive edge over other new employees, and even those with more experience, by applying statistical analysis skills to real-world decision making. The purpose of this text is to assist in your learning and to complement your instructor’s efforts in conveying how to apply a variety of important statistical procedures.
Cell phone companies such as Apple, Samsung, and LG maintain databases with information on production, quality, customer satisfaction, and much more. Amazon collects data on customers’ online purchases and uses the data to suggest additional items the customer may be interested in purchasing. Walmart collects and manages massive amounts of data related to the operation of its stores throughout the world. Its highly sophisticated database systems contain sales data, detailed customer data, employee satisfaction data, and much more. Governmental agencies amass extensive data on such things as unemployment, interest rates, incomes, and education. However, access to data is not limited to large companies. The relatively low cost of computer hard drives with massive data storage capacities makes it possible for small firms and even individuals to store vast amounts of data on desktop computers. But without some way to transform the data into useful information, the data these companies have gathered are of little value.
Transforming data into information is where business statistics comes in—the statistical procedures introduced in this text are those that are used to help transform data into information. This text focuses on the practical application of statistics; we do not develop the theory you would find in a mathematical statistics course. Will you need to use math in this course? Yes, but mainly the concepts covered in your college algebra course.
Statistics does have its own terminology. You will need to learn various terms that have special statistical meaning. You will also learn certain dos and don’ts related to statistics. But most importantly, you will learn specific methods to effectively convert data into information. Don’t try to memorize the concepts; rather, go to the next level of learning called understanding. Once you understand the underlying concepts, you will be able to think statistically.
Because data are the starting point for any statistical analysis, Chapter 1 is devoted to discussing various aspects of data, from how to collect data to the different types of data that you will be analyzing. You need to gain an understanding of the where, why, and how of data and data collection, because the remaining chapters deal with the techniques for transforming data into useful information.
What Is Business Statistics?
Articles in your local newspaper and on the Internet, news stories on television, and national publications such as The Wall Street Journal and Fortune discuss stock prices, crime rates, government-agency budgets, and company sales and profit figures. These values are statistics, but they are just a small part of the discipline called business statistics, which provides a wide variety of methods to assist in data analysis and decision making.
Business statistics can be segmented into two general categories. The first category involves the procedures and techniques designed to describe data, such as charts, graphs, and numerical measures. The second category includes tools and techniques that help decision makers draw inferences from a set of data. Inferential procedures include estimation and hypothesis testing. A brief discussion of these techniques follows.
Excel 2016 Instructions:
1. Open File: Independent Textbook.xlsx.
BUSINESS APPLICATION Describing Data
Independent Textbook Publishing, Inc. Independent Textbook Publishing, Inc. publishes 15 college-level texts in the business and social sciences areas. Figure 1.1 shows an Excel spreadsheet containing data for each of these 15 textbooks. Each column in the spreadsheet corresponds to a different factor for which data were collected. Each row corresponds to a different textbook. Many statistical procedures might help the owners describe these textbook data, including descriptive techniques such as charts, graphs, and numerical measures.
FIGURE 1.1 Excel 2016 Spreadsheet of Independent Textbook Publishing, Inc.
Charts and Graphs Chapter 2 will discuss many different charts and graphs—such as the one shown in Figure 1.2, called a histogram. This graph displays the shape and spread of the distribution of number of copies sold. The bar chart shown in Figure 1.3 shows the total number of textbooks sold broken down by the two markets, business and social sciences.
Bar charts and histograms are only two of the techniques that can be used to graphically analyze the data for the textbook publisher. In Chapter 2, you will learn more about these and other techniques.
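The grouping behind a histogram like Figure 1.2 can be sketched in a few lines of Python. This is only an illustration: the copies-sold values below are hypothetical (the actual spreadsheet values are not reproduced in the text), and the class limits are taken from the axis labels of Figure 1.2.

```python
from collections import Counter

# Hypothetical copies-sold figures for 15 textbooks (illustrative only;
# these are NOT the values in Independent Textbook.xlsx).
copies_sold = [
    12_500, 33_000, 48_200, 51_000, 64_300,
    72_800, 88_100, 95_400, 102_000, 118_500,
    126_700, 139_900, 152_300, 171_000, 189_600,
]

CLASS_WIDTH = 50_000
CLASS_LABELS = [
    "Under 50,000",
    "50,000 < 100,000",
    "100,000 < 150,000",
    "150,000 < 200,000",
]


def class_label(value):
    """Return the histogram class that contains value."""
    index = value // CLASS_WIDTH
    if not 0 <= index < len(CLASS_LABELS):
        raise ValueError(f"{value} falls outside the defined classes")
    return CLASS_LABELS[index]


# Count how many books fall in each class.
frequencies = Counter(class_label(v) for v in copies_sold)

# Print a simple text histogram, one row per class.
for label in CLASS_LABELS:
    count = frequencies[label]
    print(f"{label:<20} {'#' * count} ({count})")
```

Each class includes its lower limit and excludes its upper limit, which matches labels such as "50,000 < 100,000".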
Descriptive Statistics
FIGURE 1.2 Histogram Showing the Copies Sold Distribution
[Chart title: Independent Textbook Publishing, Inc. Distribution of Copies Sold. Horizontal axis: Number of Copies Sold, in classes Under 50,000; 50,000 < 100,000; 100,000 < 150,000; 150,000 < 200,000. Vertical axis: Number of Books, 0 to 8.]
FIGURE 1.3 Bar Chart Showing Copies Sold by Sales Category
[Chart title: Total Copies Sold by Market Class. Category axis: Market Classification (Social Sciences, Business). Value axis: Total Copies Sold, 0 to 800,000 in steps of 100,000.]
Statistical Inference Procedures
Procedures that allow a decision maker to reach a conclusion about a set of data based on a subset of that data.
In addition to preparing appropriate graphs, you will compute a variety of numerical measures. Chapter 3 introduces the most important measures that are used along with graphs, charts, and tables to describe data.
Inferential Procedures
Advertisers pay for television ads based on the audience level, so knowing how many viewers watch a particular program is important; millions of dollars are at stake. Clearly, the networks don’t check with everyone in the country to see if they watch a particular program. Instead, they pay a fee to the Nielsen company (www.nielsen.com/), which uses statistical inference procedures to estimate the number of viewers who watch a particular television program.
There are two primary categories of statistical inference procedures: estimation and hypothesis testing. These procedures are closely related but serve very different purposes.
Estimation In situations in which we would like to know about all the data in a large data set but it is impractical to work with all the data, decision makers can use techniques to estimate what the larger data set looks like. These techniques arrive at estimates by looking closely at a subset of the larger data set.
For example, energy-boosting drinks such as Red Bull, Rockstar, Monster, and Full Throttle have become very popular among college students and young professionals. But how do the companies that make these products determine whether they will sell enough to warrant the product introduction? A typical approach is to do market research by introducing the product into one or more test markets. People in the targeted age, income, and educational categories (target market) are asked to sample the product and indicate the likelihood that they would purchase the product. The percentage of people who say that they will buy forms the basis for an estimate of the true percentage of all people in the target market who will buy. If that estimate is high enough, the company will introduce the product.
In Chapter 8, we will discuss the estimating techniques that companies use in new product development and many other applications.
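As a preview of the test-market idea, the following minimal sketch turns a sample result into an estimate of the target-market percentage. The counts are hypothetical, and the normal-approximation interval (with z = 1.96 for roughly 95% confidence) is one common textbook approach assumed here, not necessarily the method any particular company uses.

```python
import math


def estimate_proportion(successes, sample_size, z=1.96):
    """Return the sample proportion and an approximate interval around it.

    Uses the normal approximation; z=1.96 corresponds to roughly 95% confidence.
    """
    p_hat = successes / sample_size
    margin = z * math.sqrt(p_hat * (1 - p_hat) / sample_size)
    # Clamp the interval to the valid range for a proportion.
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical test-market result: 132 of 400 sampled people say they would buy.
p_hat, lower, upper = estimate_proportion(132, 400)
print(f"Estimated share who will buy: {p_hat:.1%}")
print(f"Approximate 95% interval: {lower:.1%} to {upper:.1%}")
```

The point estimate is simply the sample percentage; the interval conveys how far the true target-market percentage might plausibly be from it.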
Hypothesis Testing Media advertising is full of product claims. For example, we might hear that “Goodyear tires will last at least 60,000 miles” or that “more doctors recommend Bayer Aspirin than any other brand.” Other claims might include statements like “General Electric light bulbs last longer than any other brand” or “customers prefer McDonald’s over Burger King.” Are these just idle boasts, or are they based on actual data? Probably some of both! However, consumer research organizations such as Consumers Union, publisher of Consumer Reports, regularly test these types of claims. For example, in the hamburger case, Consumer Reports might select a sample of customers who would be asked to blind taste test Burger King’s and McDonald’s hamburgers, under the hypothesis that there is no difference in customer preferences between the two restaurants. If the sample data show a substantial difference in preferences, then the hypothesis of no difference would be rejected. If only a slight difference in preferences was detected, then Consumer Reports could not reject the hypothesis. Chapters 9 and 10 introduce basic hypothesis-testing techniques that are used to test claims about products and services using information taken from samples.
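The taste-test logic can be illustrated with an exact binomial test of the "no difference" hypothesis, under which each taster is equally likely to prefer either hamburger. The sample results and the 0.05 cutoff below are assumptions made for this sketch; the text does not say which test or cutoff Consumer Reports would use.

```python
from math import comb


def two_sided_binomial_p_value(successes, n, p_null=0.5):
    """Exact two-sided p-value for a binomial test.

    Adds up the probability of every outcome that is no more likely
    than the observed one when the null hypothesis is true.
    """
    def pmf(k):
        return comb(n, k) * p_null**k * (1 - p_null) ** (n - k)

    observed = pmf(successes)
    # Small tolerance so outcomes tied with the observed one are included.
    return sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9))


ALPHA = 0.05  # assumed cutoff for "substantial" evidence

# Hypothetical samples of 100 tasters each.
p_substantial = two_sided_binomial_p_value(62, 100)  # 62 prefer one brand
p_slight = two_sided_binomial_p_value(53, 100)       # 53 prefer one brand

for label, p in [("62 of 100", p_substantial), ("53 of 100", p_slight)]:
    decision = "reject" if p < ALPHA else "cannot reject"
    print(f"{label}: p-value = {p:.3f}, {decision} the no-difference hypothesis")
```

A 62-to-38 split is unlikely if preferences are truly even, so the hypothesis is rejected; a 53-to-47 split is well within what chance alone produces.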
1.2 Procedures for Collecting Data | Chapter 1
1.1 EXERCISES
Skill Development
1-1. For the following situation, indicate whether the statistical application is primarily descriptive or inferential.
“The manager of Anna’s Fabric Shop has collected data for 10 years on the quantity of each type of dress fabric that has been sold at the store. She is interested in making a presentation that will illustrate these data effectively.”
1-2. Consider the following graph that appeared in a company annual report. What type of graph is this? Explain.
[Graph: Food Store Sales. Vertical axis: Monthly Sales, $0 to $45,000 in $5,000 steps. Horizontal axis: Department, with categories Fruit and Vegetables, Meat and Poultry, Canned Goods, Cereal and Dry Goods, and Other.]
1-3. Review Figures 1.2 and 1.3 and discuss any differences you see between the histogram and the bar chart.
1-4. Think of yourself as working for an advertising firm. Provide an example of how hypothesis testing can be used to evaluate a product claim.
Business Applications
1-5. Describe how statistics could be used by a business to determine if the dishwasher parts it produces last longer than a competitor’s brand.
1-6. Locate a business periodical such as Fortune or Forbes or a business newspaper such as The Wall Street Journal. Find three examples of the use of a graph to display data. For each graph,
a. Give the name, date, and page number of the periodical in which the graph appeared.
b. Describe the main point made by the graph.
c. Analyze the effectiveness of the graphs.
1-7. The human resources manager of an automotive supply store has collected the following data showing the number of employees in each of five categories by the number of days missed due to illness or injury during the past year.
Missed Days    Employees
0–2 days       159
3–5 days       67
6–8 days       32
8–10 days      10
Construct the appropriate chart for these data. Be sure to use labels and to add a title to your chart.
1-8. Suppose Fortune would like to determine the average age and income of its subscribers. How could statistics be of use in determining these values?
1-9. Locate an example from a business periodical or newspaper in which estimation has been used.
a. What specifically was estimated?
b. What conclusion was reached using the estimation?
c. Describe how the data were extracted and how they were used to produce the estimation.
d. Keeping in mind the goal of the estimation, discuss whether you believe that the estimation was successful and why.
e. Describe what inferences were drawn as a result of the estimation.
1-10. Locate one of the online job websites and pick several job listings. For each job type, discuss one or more situations in which statistical analyses would be used. Base your answer on research (Internet, business periodicals, personal interviews, etc.). Indicate whether the situations you are describing involve descriptive statistics or inferential statistics or a combination of both.
1.2
outcome 1
Procedures for Collecting Data
We have defined business statistics as a set of procedures that analysts use to transform data into information. Before you learn how to use statistical procedures, it is important that you become familiar with different types of data collection methods.
Primary Data Collection Methods
Many methods and procedures are available for collecting data. The following are considered some of the most useful and frequently used data collection methods:
● Experiments
● Telephone surveys
● Written questionnaires and online surveys
● Direct observation and personal interviews
BUSINESS APPLICATION Experiments
Experiment
A process that produces a single outcome whose result cannot be predicted with certainty.
Experimental Design
A plan for performing an experiment in which the variable of interest is defined. One or more factors are identified to be manipulated, changed, or observed so that the impact (or influence) on the variable of interest can be measured or observed.
FIGURE 1.4 Data Layout for the French Fry Experiment
Food Processing A company often must conduct a specific experiment or set of experiments to get the data managers need to make informed decisions. For example, Con-Agra Foods, Inc., McCain Foods from Canada, and the J. R. Simplot Company are the primary suppliers of french fries to McDonald’s in North America. These companies have testing facilities where they conduct experiments on their potato manufacturing processes. McDonald’s has strict standards on the quality of the french fries it buys. One important attribute is the color of the fries after cooking. They should be uniformly “golden brown,” not too light or too dark.
French fries are made from potatoes that are peeled, sliced into strips, blanched, partially cooked, and then freeze-dried—not a simple process. Because potatoes differ in many ways (such as sugar content and moisture), blanching time, cooking temperature, and other factors vary from batch to batch.
Company employees start their experiments by grouping the raw potatoes into batches with similar characteristics. They run some of the potatoes through the line with blanch time and temperature settings at specific levels defined by an experimental design. After measuring one or more output variables for that run, employees change the settings and run another batch, again measuring the output variables.
Figure 1.4 shows a typical data collection form. The output variable (for example, percentage of fries without dark spots) for each combination of potato category, blanch time, and temperature is recorded in the appropriate cell in the table. Chapter 12 introduces the fundamental concepts related to experimental design and analysis.
Blanch Time    Blanch Temperature    Potato Category
                                     1     2     3     4
10 minutes     100
               110
               120
15 minutes     100
               110
               120
20 minutes     100
               110
               120
25 minutes     100
               110
               120
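The data layout for the french fry experiment is a full crossing of the three factors, which can be sketched as follows. The factor levels come from Figure 1.4; the dictionary structure, the record function, and the sample measurement are hypothetical illustrations, not the suppliers' actual data sheet.

```python
from itertools import product

# Factor levels as listed in Figure 1.4.
potato_categories = [1, 2, 3, 4]
blanch_times_min = [10, 15, 20, 25]
blanch_temperatures = [100, 110, 120]  # units are not stated in the figure

# One empty cell per combination; the measured output is filled in after each run.
data_sheet = {
    (category, minutes, temperature): None
    for category, minutes, temperature in product(
        potato_categories, blanch_times_min, blanch_temperatures
    )
}


def record(category, minutes, temperature, pct_without_dark_spots):
    """Store the output variable for one run of the experiment."""
    key = (category, minutes, temperature)
    if key not in data_sheet:
        raise KeyError(f"{key} is not a cell in the design")
    data_sheet[key] = pct_without_dark_spots


record(2, 15, 110, 93.5)  # hypothetical measurement for one run

remaining = sum(1 for value in data_sheet.values() if value is None)
print(f"{len(data_sheet)} cells in the design, {remaining} still to be run")
```

With 4 potato categories, 4 blanch times, and 3 temperatures, the form has 4 × 4 × 3 = 48 cells.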
BUSINESS APPLICATION Telephone Surveys
Public Issues Chances are that you have been on the receiving end of a telephone call that begins something like: “Hello. My name is Mary Jane and I represent the XYZ organization. I am conducting a survey on . . .” Political groups use telephone surveys to poll people about candidates and issues. Marketing research companies use phone surveys to learn likes and dislikes of potential customers.
Telephone surveys are a relatively inexpensive and efficient data collection procedure. Of course, some people will refuse to respond to a survey, others are not home when the calls come, and some people do not have home phones—they only have a cell phone—or cannot be reached by phone for one reason or another. Figure 1.5 shows the major steps in conducting a telephone survey. This example survey was run a number of years ago by a Seattle television station to determine public support for using tax dollars to build a new football stadium for the National Football League’s Seattle Seahawks. The survey was aimed at property tax payers only.
FIGURE 1.5 Major Steps for a Telephone Survey
1. Define the Issue: Do taxpayers favor a special bond to build a new football stadium for the Seahawks? If so, should the Seahawks’ owners share the cost?
2. Define the Population of Interest: Population is all residential property tax payers in King County, Washington. The survey will be conducted among this group only.
3. Develop Survey Questions: Limit the number of questions to keep the survey short. Ask important questions first. Provide specific response options when possible. Establish eligibility: “Do you own a residence in King County?” Add demographic questions at the end: age, income, etc. Introduction should explain purpose of survey and who is conducting it; stress that answers are anonymous.
4. Pretest the Survey: Try the survey out on a small group from the population. Check for length, clarity, and ease of conducting. Have we forgotten anything? Make changes if needed.
5. Determine Sample Size and Sampling Method: Sample size is dependent on how confident we want to be of our results, how precise we want the results to be, and how much opinions differ among the population members. Chapter 7 will show how sample sizes are computed. Various sampling methods are available. These are reviewed later in Chapter 1.
6. Select Sample and Make Calls: Get phone numbers from a computer-generated or “current” list. Develop “callback” rule for no answers. Callers should be trained to ask questions fairly. Do not lead the respondent. Record responses on data sheet.
Closed-End Questions
Questions that require the respondent to select from a short list of defined choices.
Demographic Questions
Questions relating to the respondents’ characteristics, backgrounds, and attributes.
Because most people will not stay on the line very long, the phone survey must be short, usually one to three minutes. The questions are generally what are called closed-end questions. For example, a closed-end question might be, “To which political party do you belong? Republican? Democrat? Or other?”
The survey instrument should have a short statement at the beginning explaining the purpose of the survey and reassuring the respondent that his or her responses will remain confidential. The initial section of the survey should contain questions relating to the central issue of the survey. The last part of the survey should contain demographic questions (such as gender, income level, education level) that will allow researchers to break down the responses and look deeper into the survey results.
A researcher must also consider the survey budget. For example, if you have $3,000 to spend on calls and each call costs $10 to make, you obviously are limited to making 300 calls. However, keep in mind that 300 calls may not result in 300 usable responses.
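The budget arithmetic above can be written out as a short calculation. The $3,000 budget and $10 cost per call come from the text; the 40% usable-response rate is a purely hypothetical figure chosen to show why 300 calls may yield far fewer than 300 usable responses.

```python
def max_calls(budget, cost_per_call):
    """Largest whole number of calls the budget can cover."""
    return int(budget // cost_per_call)


def expected_usable_responses(calls, usable_rate):
    """Expected number of usable responses for a given usable-response rate."""
    return calls * usable_rate


calls = max_calls(3_000, 10)
usable = expected_usable_responses(calls, 0.40)  # 40% is an assumed rate

print(f"Calls affordable: {calls}")
print(f"Expected usable responses at a 40% rate: {usable:.0f}")
```

In practice the usable-response rate would have to be estimated from a pretest or from past surveys of the same population.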
The phone survey should be conducted in a short time period. Typically, the prime calling time for a voter survey is between 7:00 p.m. and 9:00 p.m. However, some people are not home in the evening and will be excluded from the survey unless there is a plan for conducting callbacks.
Telephone surveys are becoming more problematic as more and more households drop their landlines in favor of cell phones, which makes it difficult to reach prospective survey responders. Additionally, many people refuse to answer if the caller ID is not a number they recognize.
Written Questionnaires and Surveys The most frequently used method to collect opinions and factual data from people is a written questionnaire. In some instances, the questionnaires are mailed to the respondents. In others, they are administered directly to the potential respondents. Written questionnaires are generally the least expensive means of collecting survey data. If they are mailed, the major costs include postage to and from the respondents, questionnaire development and printing costs, and data analysis. Online surveys are being used more frequently for written surveys now that software packages such as Survey Monkey are readily available. This technology eliminates postage costs and makes it easier to format the data for statistical analysis. Figure 1.6 shows the major steps in conducting a written survey. Note how written surveys are similar to telephone surveys; however, written surveys can be slightly more involved and, therefore, take more time to complete than those used for a telephone survey. You still must be careful to construct a questionnaire that can be easily completed without requiring too much time.
FIGURE 1.6 Written Survey Steps
1. Define the Issue: Clearly state the purpose of the survey. Define the objectives. What do you want to learn from the survey? Make sure there is agreement before you proceed.
2. Define the Population of Interest: Define the overall group of people to be potentially included in the survey and obtain a list of names and addresses or e-mail addresses of those individuals in this group.
3. Design the Survey Instrument: Limit the number of questions to keep the survey short. Ask important questions first. Provide specific response options when possible. Add demographic questions at the end: age, income, etc. Introduction should explain purpose of survey and who is conducting it; stress that answers are anonymous. Layout of the survey must be clear and attractive. Provide location for responses.
4. Pretest the Survey: Try the survey out on a small group from the population. Check for length, clarity, and ease of conducting. Have we forgotten anything? Make changes if needed.
5. Determine Sample Size and Sampling Method: Sample size is dependent on how confident we want to be of our results, how precise we want the results to be, and how much opinions differ among the population members. Chapter 7 will show how sample sizes are computed. Various sampling methods are available. These are reviewed later in Chapter 1.
6. Select Sample and Send Surveys: Send survey to a subset of the larger group. Include an introductory message explaining the purpose of the survey. If the survey is mailed, include a stamped return envelope for returning the survey.
Open-End Questions
Questions that allow respondents the freedom to respond with any value, words, or statements of their own choosing.
A written survey can contain both closed-end and open-end questions. Open-end questions provide the respondent with greater flexibility in answering a question; however, the responses can be difficult to analyze. Note that telephone surveys can use open-end questions, too. However, the caller may have to transcribe a potentially long response, and there is risk that the interviewees’ comments may be misinterpreted.
Written surveys also should be formatted to make it easy for the respondent to provide accurate and reliable data. This means that proper space must be provided for the responses, and the directions must be clear about how the survey is to be completed. A written survey needs to be pleasing to the eye. How it looks will affect the response rate, so it must look professional.
You also must decide whether to manually enter or scan the data gathered from your written survey. The approach you take will affect the survey design. If you are administering a large number of surveys, scanning is preferred. It cuts down on data entry errors and speeds up the data gathering process. However, you may be limited in the form of responses that are possible if you use scanning.
If the survey is administered directly to the desired respondents, you can expect a high response rate. For example, you probably have been on the receiving end of a written survey many times in your college career, when you were asked to fill out a course evaluation form right in the classroom. In this case, most students will complete the form. On the other hand, if a survey is administered through the mail or online, you can expect a low response rate, typically 5% to 10% for mailed surveys. Although there are mixed findings about online survey response rates, some authors suggest that online response rates tend to be lower than rates for mailed surveys. (See A. Bryman, Social Research Methods, Fifth Edition, Oxford University Press, 2015.) Therefore, if you want 200 responses and expect a 5% response rate, you might need to distribute as many as 4,000 questionnaires (200 ÷ 0.05).
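The relationship between the desired number of responses and the number of questionnaires to distribute can be sketched as a small function. The 200-response target and the 5% to 10% range come from the text; the function name is an illustrative choice, and the calculation assumes the response rate is known in advance, which in practice it is not.

```python
import math
from fractions import Fraction


def questionnaires_needed(target_responses, response_rate):
    """Number of questionnaires to distribute to expect target_responses back.

    response_rate is a decimal fraction such as 0.05 for 5%.
    """
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be greater than 0 and at most 1")
    # Convert through str() so that 0.05 is treated as exactly 1/20,
    # which avoids floating-point surprises when rounding up.
    rate = Fraction(str(response_rate))
    return math.ceil(Fraction(target_responses) / rate)


for rate in (0.05, 0.10):
    needed = questionnaires_needed(200, rate)
    print(f"Response rate {rate:.0%}: distribute {needed:,} questionnaires")
```

At the low end of the range (5%), 4,000 questionnaires are needed; at 10%, the number drops to 2,000.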
Overall, written surveys can be a low-cost, effective means of collecting data if you can overcome the problems of low response. Be careful to pretest the survey and spend extra time on the format and look of the survey instrument.
Developing a good written questionnaire or telephone survey instrument is a major challenge. Among the potential problems are the following:
● Leading questions
Example: “Do you agree with most other reasonably minded people that the city should spend more money on neighborhood parks?”
Issue: In this case, the phrase “Do you agree” may suggest that you should agree. Also, since the question suggests that “most reasonably minded people” already agree, the respondent might be compelled to agree so that he or she can also be considered “reasonably minded.”
Improvement: “In your opinion, should the city increase spending on neighborhood parks?”
Example: “To what extent would you support paying a small increase in your property taxes if it would allow poor and disadvantaged children to have food and shelter?”
Issue: The question is rife with emotional feeling and may imply that if you don’t support additional taxes, you don’t care about poor children.
Improvement: “Should property taxes be increased to provide additional funding for social services?”
● Poorly worded questions
Example: “How much money do you make at your current job?”
Issue: The responses are likely to be inconsistent. When answering, does the respondent state the answer as an hourly figure or as a weekly or monthly total? Also, many people refuse to answer questions regarding their income.
Improvement: “Which of the following categories best reflects your weekly income from your current job? _____Under $500 _____$500–$1,000 _____Over $1,000”
Example: “After trying the new product, please provide a rating from 1 to 10 to indicate how you like its taste and freshness.”
Issue: First, is a low number or a high number on the rating scale considered a positive response? Second, the respondent is being asked to rate two factors, taste and freshness, in a single rating. What if the product is fresh but does not taste good?
Improvement: “After trying the new product, please rate its taste on a 1 to 10 scale with 1 being best. Also rate the product’s freshness using the same 1 to 10 scale. _____Taste _____Freshness”
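One practical benefit of the improved income question is that every answer falls into exactly one predefined category, which makes tabulation straightforward. The sketch below shows that coding step; the category labels come from the improved question, while the treatment of the boundary values ($500 and $1,000 both fall in the middle category) and the sample responses are assumptions made for illustration.

```python
from collections import Counter

CATEGORIES = ["Under $500", "$500–$1,000", "Over $1,000"]


def weekly_income_category(weekly_income):
    """Map a weekly income in dollars to one of the closed-end categories."""
    if weekly_income < 0:
        raise ValueError("income cannot be negative")
    if weekly_income < 500:
        return CATEGORIES[0]
    if weekly_income <= 1_000:  # assumed: both endpoints belong to the middle class
        return CATEGORIES[1]
    return CATEGORIES[2]


# Hypothetical weekly incomes reported by eight respondents.
responses = [420, 500, 780, 1_000, 1_250, 310, 960, 1_800]
tally = Counter(weekly_income_category(r) for r in responses)

for category in CATEGORIES:
    print(f"{category:<12} {tally[category]}")
```

With the original open wording, the same eight people might have answered in hourly, weekly, or monthly terms, and no such tally would be possible without guessing.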
The way a question is worded can influence the responses. Consider an example that occurred in 2008 that resulted from the sub-prime mortgage crisis and bursting of the real estate bubble. The bubble occurred because home prices were driven up due to increased demand by individuals who were lured into buying homes they could not afford. Many financial organizations used low initial interest rates and little or no credit screening to attract customers who later found they could not make the monthly payments. As a result, many buyers defaulted on their loans and the banks were left with abandoned homes and no way of collecting the money they had loaned out. Three surveys were conducted on the same basic issue. The following questions were asked:
“Do you approve or disapprove of the steps the Federal Reserve and Treasury Department have taken to try to deal with the current situation involving the stock market and major financial institutions?” (Dan Balz and Jon Cohen, “Economic fears give Obama clear lead over McCain in poll,” www.washingtonpost.com, Sep. 24, 2008) 44% Approve, 42% Disapprove, 14% Unsure
Structured Interviews
Interviews in which the questions are scripted.
Unstructured Interviews
Interviews that begin with one or more broadly stated questions, with further questions being based on the responses.
“Do you think the government should use taxpayers’ dollars to rescue ailing private financial firms whose collapse could have adverse effects on the economy and market, or is it not the government’s responsibility to bail out private companies with taxpayer dollars?” (Doyle McManus, “Americans reluctant to bail out Wall Street,” Los Angeles Times/Bloomberg Poll, Sep. 24, 2008) 31% Use Tax Payers’ Dollars, 55% Not Government’s Responsibility, 14% Unsure
“As you may know, the government is potentially investing billions to try and keep financial institutions and markets secure. Do you think this is the right thing or the wrong thing for the government to be doing?” (PewResearchCenter, www.people-press.org, Sep. 23, 2008) 57% Right Thing—30% Wrong Thing—13% Unsure
Note the responses to each of these questions. The way the question is worded can affect the responses.
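To make the comparison concrete, the three sets of results can be placed side by side and summarized with a simple net-support figure (favorable minus unfavorable). The percentages are those quoted above; the labels "favorable" and "unfavorable" and the net-support measure are simplifications introduced here, since each survey used its own response wording.

```python
# Results quoted in the text; "favorable" means the response supporting
# government action under that survey's own wording.
polls = [
    {"source": "washingtonpost.com, Sep. 24, 2008",
     "favorable": 44, "unfavorable": 42, "unsure": 14},
    {"source": "Los Angeles Times/Bloomberg Poll, Sep. 24, 2008",
     "favorable": 31, "unfavorable": 55, "unsure": 14},
    {"source": "Pew Research Center, Sep. 23, 2008",
     "favorable": 57, "unfavorable": 30, "unsure": 13},
]

# Net support in percentage points for each wording.
for poll in polls:
    poll["net"] = poll["favorable"] - poll["unfavorable"]
    print(f"{poll['source']:<50} net support: {poll['net']:+d} points")

nets = [poll["net"] for poll in polls]
spread = max(nets) - min(nets)
print(f"Spread in net support across the three wordings: {spread} points")
```

Surveys taken within a day of one another on the same basic issue differ by about 50 points in net support, which is far more than sampling variation alone would normally explain.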
Direct Observation and Personal Interviews
Direct observation is another procedure that is often used to collect data. As implied by the name, this technique requires researchers to actually observe the data collection process and then record the data based on what takes place in the process.
Possibly the most basic way to gather data on human behavior is to watch people. If you are trying to decide whether a new method of displaying your product at the supermarket will be more pleasing to customers, change a few displays and watch customers’ reactions. If, as a member of a state’s transportation department, you want to determine how well motorists are complying with the state’s seat belt laws, place observers at key spots throughout the state to monitor people’s seat belt habits. A movie producer, seeking information on whether a new movie will be a success, holds a preview showing and observes the reactions and comments of the movie patrons as they exit the screening. The major constraints when collecting observations are the amount of time and money required. For observations to be effective, trained observers must be used, which increases the cost. Personal observation is also time-consuming. Finally, personal perception is subjective. There is no guarantee that different observers will see a situation in the same way, much less report it the same way.
Personal interviews are often used to gather data from people. Interviews can be either structured or unstructured, depending on the objectives, and they can utilize either open-end or closed-end questions.
Regardless of the procedure used for data collection, care must be taken that the data collected are accurate and reliable and that they are the right data for the purpose at hand.
Other Data Collection Methods
Data collection methods that take advantage of new technologies are becoming more prevalent all the time. For example, many people believe that Walmart is one of the best companies in the world at collecting and using data about the buying habits of its customers. Most of the data are collected automatically as checkout clerks scan the UPC bar codes on the products customers purchase. Not only are Walmart’s inventory records automatically updated, but information about the buying habits of customers is also recorded. This allows Walmart to use analytics and data mining to drill deep into the data to help with its decision making about many things, including how to organize its stores to increase sales. For instance, Walmart apparently decided to locate beer and disposable diapers close together when it discovered that many male customers also purchase beer when they go to the store for diapers.
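The kind of pattern described in the beer-and-diapers story can be illustrated with two simple measures computed from scanned transactions: support (how often items appear together) and confidence (how often the second item appears given the first). The transactions below are invented for illustration, and this is a generic market-basket calculation, not a description of Walmart's actual analytics.

```python
# Hypothetical scanned transactions, each a set of purchased items.
transactions = [
    {"diapers", "beer", "wipes"},
    {"diapers", "beer"},
    {"milk", "bread"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"diapers", "beer", "chips"},
    {"bread", "eggs"},
    {"diapers", "wipes"},
]


def support(items):
    """Share of all transactions that contain every item in items."""
    items = set(items)
    matches = sum(1 for basket in transactions if items <= basket)
    return matches / len(transactions)


def confidence(antecedent, consequent):
    """Share of baskets containing antecedent that also contain consequent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


both = support({"diapers", "beer"})
beer_given_diapers = confidence({"diapers"}, {"beer"})
beer_overall = support({"beer"})

print(f"Baskets with diapers and beer: {both:.1%}")
print(f"Beer purchased when diapers are purchased: {beer_given_diapers:.1%}")
print(f"Beer purchased overall: {beer_overall:.1%}")
```

When beer shows up more often in baskets that contain diapers than it does overall, the two products are candidates for placement near each other.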
Bar code scanning is used in many different data collection applications. In a DRAM (dynamic random-access memory) wafer fabrication plant, batches of silicon wafers have bar codes. As the batch travels through the plant’s workstations, its progress and quality are tracked through the data that are automatically obtained by scanning.
Every time you use your credit card, data are automatically collected by the retailer and the bank. Computer information systems are developed to store the data and to provide decision makers with procedures to access the data. For example, a number of years ago Target executives wanted to try marketing to pregnant women in their second trimester, which is when most expectant mothers begin buying products like prenatal vitamins and maternity clothing. (See Charles Duhigg, “How companies learn your secrets,” The New York Times Magazine, Feb. 16, 2012.) If Target could attract women to buy these products, then once the baby was born, the women would be likely to buy many other products as well. But Target needed a way to know when a woman was in her second trimester. Analysts observed that women on their baby registry tended to buy certain products in larger amounts early on in their pregnancy and other products later in the pregnancy. They found that pregnant women also tended to purchase certain types of products such as washcloths closer to their delivery date. By applying statistical analytics to the data they collect on their customers, Target was able to identify about 25 products that, when analyzed together, allowed them to assign each shopper a “pregnancy prediction” score. They could also estimate her due date to within a small window, so Target could send coupons timed to very specific stages of her pregnancy.
In many instances, your data collection method will require you to use physical measurement. For example, the Andersen Window Company has quality analysts physically measure the width and height of its windows to assure that they meet customer specifications, and a state Department of Weights and Measures physically tests meat and produce scales to determine that customers are being properly charged for their purchases.
Data Collection Issues
Data Accuracy When you need data to make a decision, we suggest that you first see if appropriate data have already been collected, because it is usually faster and less expensive to use existing data than to collect data yourself. However, before you rely on data that were collected by someone else for another purpose, you need to check out the source to make sure that the data were collected and recorded properly.
Such organizations as Bloomberg, Value Line, and Fortune have built their reputations on providing quality data. Although data errors are occasionally encountered, they are few and far between. You really need to be concerned with data that come from sources with which you are not familiar. This is an issue for many sources on the World Wide Web. Any organization or any individual can post data to the web. Just because the data are there doesn’t mean they are accurate. Be careful.
Bias
An effect that alters a statistical result by systematically distorting it; different from a random error, which may distort on any one occasion but balances out on the average.
Interviewer Bias There are other general issues associated with data collection. One of these is the potential for bias in the data collection. There are many types of bias. For example, in a personal interview, the interviewer can interject bias (either accidentally or on purpose) by the way she asks the questions, by the tone of her voice, or by the way she looks at the subject being interviewed. We recently allowed ourselves to be interviewed at a trade show. The interviewer began by telling us that he would only get credit for the interview if we answered all of the questions. Next, he asked us to indicate our satisfaction with a particular display. He wasn’t satisfied with our less-than-enthusiastic rating and kept asking us if we really meant what we said. He even asked us if we would consider upgrading our rating! How reliable do you think these data will be?
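The difference between bias and random error can be illustrated with a small simulation. The sketch below is only an illustration under assumed numbers: the true rating of 6.0, the noise level, and the one-point upward shift (standing in for an interviewer who nudges ratings higher) are all hypothetical and do not come from the trade show example.

```python
import random
import statistics

def simulate_ratings(true_value, n, noise_sd, systematic_shift, seed):
    """Return n simulated ratings of true_value.

    noise_sd: spread of the random error, which tends to average out
    systematic_shift: constant distortion added to every rating (the bias)
    """
    rng = random.Random(seed)
    return [true_value + systematic_shift + rng.gauss(0, noise_sd)
            for _ in range(n)]

# Hypothetical true average satisfaction on a 10-point scale
TRUE_RATING = 6.0

# Ratings affected by random error only
unbiased = simulate_ratings(TRUE_RATING, 10_000, noise_sd=1.5,
                            systematic_shift=0.0, seed=1)

# Ratings affected by the same random error plus an assumed +1.0 push
biased = simulate_ratings(TRUE_RATING, 10_000, noise_sd=1.5,
                          systematic_shift=1.0, seed=2)

unbiased_mean = statistics.mean(unbiased)
biased_mean = statistics.mean(biased)

print(f"True rating:               {TRUE_RATING:.2f}")
print(f"Mean with random error:    {unbiased_mean:.2f}")
print(f"Mean with systematic bias: {biased_mean:.2f}")
```

With many ratings, the random errors largely cancel and the first mean lands close to the true value, while the second mean stays shifted no matter how many ratings are collected.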
Nonresponse Bias Another type of bias that can be interjected into a survey data collection process is called nonresponse bias. We stated earlier that mail surveys suffer from a high percentage of unreturned surveys. Phone calls don’t always get through, people refuse to answer, or e-mail surveys are deleted. Subjects of personal interviews may refuse to be interviewed. There is a potential problem with nonresponse. Those who respond may provide data that are quite different from the data that would be supplied by those who choose not to respond. If you aren’t careful, the responses may be heavily weighted by people who feel strongly one way or another on an issue.
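A rough sketch of how nonresponse can weight results toward strong opinions is given below. Everything in it is an assumption made for illustration: the 5-point opinion scale, the evenly spread population, and the response rates in RESPONSE_RATE are hypothetical and not taken from any real survey.

```python
import random

rng = random.Random(42)

# Hypothetical population: opinions from 1 (strongly oppose) to 5 (strongly favor)
population = [rng.choice([1, 2, 3, 4, 5]) for _ in range(20_000)]

# Assumed response rates: people with strong opinions return the survey more often
RESPONSE_RATE = {1: 0.60, 2: 0.20, 3: 0.10, 4: 0.20, 5: 0.60}

# Each person responds with the probability assigned to his or her opinion
respondents = [score for score in population
               if rng.random() < RESPONSE_RATE[score]]

def share_extreme(scores):
    """Fraction of scores that are strongly opposed (1) or strongly in favor (5)."""
    return sum(1 for s in scores if s in (1, 5)) / len(scores)

population_share = share_extreme(population)
respondent_share = share_extreme(respondents)

print(f"Surveys returned: {len(respondents)} of {len(population)}")
print(f"Strong opinions in population:   {population_share:.1%}")
print(f"Strong opinions among responses: {respondent_share:.1%}")
```

Under these assumed rates, strong opinions make up a noticeably larger share of the returned surveys than of the population, which is the weighting problem described above.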
Selection Bias Bias can be interjected through the way subjects are selected for data collection. This is referred to as selection bias. A study on the virtues of increasing the student athletic fee at your university might not be best served by collecting data from students attending a football game. Sometimes, the problem is more subtle. If we do a telephone survey during the evening hours, we will miss all of the people who work nights. Do they share the same views, incomes, education levels, and so on as people who work days? If not, the data are biased.
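The evening telephone survey can be sketched as a simulation. The numbers are hypothetical assumptions chosen only to make the effect visible: 8,000 day workers and 2,000 night workers, with average incomes of 60 and 45 (in thousands of dollars), and an evening survey that reaches day workers only.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population of (shift, income in thousands of dollars)
day_workers = [("day", rng.gauss(60, 10)) for _ in range(8_000)]
night_workers = [("night", rng.gauss(45, 10)) for _ in range(2_000)]
population = day_workers + night_workers

population_mean = statistics.mean(income for _, income in population)

# Evening telephone survey: assume only day workers are home to answer
reachable = [person for person in population if person[0] == "day"]
evening_sample = rng.sample(reachable, 500)
evening_mean = statistics.mean(income for _, income in evening_sample)

# Simple random sample drawn from everyone, for comparison
random_sample = rng.sample(population, 500)
random_mean = statistics.mean(income for _, income in random_sample)

print(f"Population mean income:     {population_mean:.1f}")
print(f"Evening phone survey mean:  {evening_mean:.1f}")
print(f"Simple random sample mean:  {random_mean:.1f}")
```

Because night workers are never reached, the evening survey overstates average income under these assumptions, while the sample drawn from the whole population stays close to the population mean.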
QUESTION 1
Briefly explain what is meant by an experiment and an experimental design.
QUESTION 2
In your capacity as assistant sales manager for a large office products retailer, you have been assigned the task of interviewing purchasing managers for medium and large companies in the San Francisco Bay area. The objective of the interview is to determine the office product buying plans of the company in the coming year. Develop a personal interview form that asks both issue-related questions and demographic questions.
QUESTION 3
A company has 18,000 employees. The file containing the names is ordered by employee number from 1 to 18,000. If a sample of 100 employees is to be selected from the 18,000 using systematic random sampling, from what range of employee numbers will the first employee selected come?
QUESTION 4
Mount Hillsdale Hospital has 4,000 patient files listed alphabetically in its computer system. The office manager wants to survey a statistical sample of these patients to determine how satisfied they were with service provided by the hospital. She plans to use a telephone survey of 100 patients.
1. Describe how you would attach identification numbers to the patient files; for example, how many digits (and which digits) would you use to indicate the first patient file?
2. Describe how you would obtain the first random number to begin a simple random sample method.
3. How many random digits would you need for each random number you selected?
4. Use Excel to generate the list of patients to be surveyed.
RUBRIC