Assignment #2

PPol 603
Due: Thursday, 13 September 2012, 9:30 a.m.

Type up your answers. Read the section in the syllabus on Academic Honesty and Plagiarism (here) to make sure you are giving proper credit to those you work with and/or the text(s).

Solve the following problems. Show all of your work, but keep your answers concise. Highlight your final answer to distinguish it from your other numbers and text. Include a copy of your input (e.g., a do file) or output (e.g., a log file) when it is an appropriate way to show your work. However, do not include unnecessary output (i.e., no data dumps), and format any output so that it is easily readable. An appropriate time to include output is when you put your results in a table: if your results are wrong, graders cannot see how you reached your conclusions (and so cannot give partial credit) unless you provide some output. Your explanations should be both statistical and substantive: write so that a statistical layperson can understand them and so that a statistical analyst will see your erudition.

  1. {5 points} Do Problem 2.12 part a. in Stock and Watson
  2. {25} [from Agresti and Finlay 2009] "Lake Wobegon Junior College admits students only if they score above 400 on a standardized achievement test. Applicants from group A have a mean of 500 and a standard deviation of 100 on this test, and applicants from group B have a mean of 450 and a standard deviation of 100. Both distributions are approximately normal, and both groups have the same size.
    a. Find the proportion not admitted for each group.
    b. Of the students who are not admitted, what proportion are from group B?
    c. A state legislator proposes that the college lower the cutoff point for admission to 300, thinking that the proportion of the students who are not admitted who are from group B would decrease. If this policy is implemented, determine the effect on the answer to b., and comment."
  3. {20} [from Agresti and Finlay 2009] "The distribution of family size in a particular tribal society is skewed to the right, with E(Y) = 5.2 and sd(Y) = 3.0. These values are unknown to an anthropologist, who takes a sample of families in this society to estimate mean family size." Let y-bar denote the sample mean family size she obtains, for a random sample of 36 families.
    a. "Identify the sampling distribution of y-bar. State its mean and standard error and explain what it describes."
    b. "Find the probability that her sample mean falls within 0.5 of the population mean."
  4. {50} Case Study: Furniture Fire [from McClave, Benson, and Sincich 1998] "A wholesale furniture retailer stores in-stock items at a large warehouse located in Tampa, Florida. In early 1992, a fire destroyed the warehouse and all the furniture in it. After determining the fire was an accident, the retailer sought to recover costs by submitting a claim to its insurance company."
    "As is typical in a fire insurance policy of this type, the furniture retailer must provide the insurance company with an estimate of 'lost' profit for the destroyed items. Retailers calculate profit margin in percentage form using the Gross Profit Factor (GPF). By definition, the GPF for a single sold item is the ratio of the profit to the item's selling price measured as a percentage, i.e.
    Item GPF = (Profit/Sales price) x 100
    Of interest to both the retailer and the insurance company is the average GPF for all of the items in the warehouse. Since these furniture pieces were all destroyed, their eventual selling prices and profit values are obviously unknown."
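    As a small illustration of the GPF definition above, the following Python sketch computes item-level GPFs and their mean. The function name `item_gpf` and the three (profit, selling price) pairs are hypothetical and are not taken from the case data.

```python
def item_gpf(profit, sales_price):
    """Gross Profit Factor for one sold item: profit as a percentage of selling price."""
    if sales_price <= 0:
        raise ValueError("sales_price must be positive")
    return profit / sales_price * 100


# Hypothetical (profit, selling price) pairs in dollars; NOT from the case data.
items = [(250.0, 500.0), (120.0, 300.0), (90.0, 150.0)]

# GPF of each item, then the mean of the item-level GPFs.
gpfs = [item_gpf(profit, price) for profit, price in items]
mean_gpf = sum(gpfs) / len(gpfs)

print(gpfs)
print(mean_gpf)
```

    Note that the mean of item-level GPFs, which is the quantity the case refers to, generally differs from total profit divided by total sales, because the latter weights expensive items more heavily.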
    "One way to estimate the mean GPF of the destroyed items is to use the mean GPF of similar, recently sold items. The retailer sold 3,005 furniture items in 1991 (the year prior to the fire) and kept paper invoices on all sales. Rather than calculate the mean GPF for all 3,005 items (the data were not computerized), the retailer sampled a total of 253 of the invoices and computed the mean GPF for these items. The 253 items were obtained by first selecting a sample of 134 items and then augmenting this sample with a second sample of 119 items. The mean GPFs for the two subsamples were calculated to be 50.6% and 51.0%, respectively, yielding an overall average GPF of 50.8%. This average GPF can be applied to the costs of the furniture items destroyed in the fire to obtain an estimate of the 'lost' profit."
    "According to experienced claims adjusters at the insurance company, the GPF for sale items of the type destroyed in the fire rarely exceeds 48%. Consequently, the estimate of 50.8% appeared to be unusually high. (A 1% increase in GPF for items of this type equates to, approximately, an additional $16,000 in profit.) When the insurance company questioned the retailer on this issue, the retailer responded, 'Our estimate was based on selecting two independent, random samples from the population of 3,005 invoices in 1991. Since the samples were selected randomly and the total sample size is large, the mean GPF of 50.8% is valid.'"
    "A dispute arose between the furniture retailer and the insurance company, and a lawsuit was filed. In one portion of the suit, the insurance company accused the retailer of fraudulently representing their sampling methodology. Rather than selecting the samples randomly, the retailer was accused of selecting an unusual number of 'high profit' items from the population in order to increase the average GPF of the overall sample."
    "To support their claim of fraud, the insurance company hired a CPA firm to independently assess the retailer's 1991 Gross Profit Factor. Through the discovery process, the CPA firm legally obtained the paper invoices for the entire population of 3,005 items sold and input the information into a computer. The selling price, profit, profit margin, and month sold for these 3,005 furniture items are available" here (in Stata format, you may need to right-click the link), and are described here.
    "Your objective in this case is to use these data to determine the likelihood of fraud. Is it likely that a random sample of 253 items selected from the population of 3,005 items would yield a mean GPF of at least 50.8%? Or, is it likely that two independent, random samples of size 134 and 119 will yield mean GPFs of at least 50.6% and 51.0%, respectively? (These were the questions posed to a statistician retained by the CPA firm.) Use the ideas of probability and sampling distributions to guide your analysis."
    "Prepare a professional document which presents the results of your analysis and gives your opinion regarding fraud. Be sure to describe the assumptions and methodologies used to arrive at your findings." Assume that your readers have only a vague familiarity with statistics (i.e., they are laypersons).

    More information on how to write a professional document (or memo) about data analysis is found here. Key points: your professional memo should be only one page (with standard margins and fonts), and a professionally formatted appendix should follow the memo to support its conclusions. For this class, you also need a grader appendix showing your work (such as a Stata log file).
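    For the sampling question in the case study, one possible way to think about "how likely is a sample mean at least this large?" is simulation: draw many simple random samples without replacement from the full population and record how often the sample mean reaches the threshold. The Python sketch below is only an illustration under stated assumptions. The function name, the synthetic normal population, the population and sample sizes, and the threshold are all hypothetical and are not the 1991 invoice data; the same steps can be carried out in Stata, and an analytic approach based on the sampling distribution is equally valid.

```python
import random
import statistics


def share_of_sample_means_at_least(population, n, threshold, draws=2000, seed=0):
    """Estimate P(sample mean >= threshold) for simple random samples of
    size n drawn without replacement from a finite population."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        sample = rng.sample(population, n)  # sampling without replacement
        if statistics.fmean(sample) >= threshold:
            hits += 1
    return hits / draws


# Synthetic stand-in population (an assumption for illustration only).
pop_rng = random.Random(1)
population = [pop_rng.gauss(50.0, 10.0) for _ in range(1000)]
pop_mean = statistics.fmean(population)

# Share of samples of size 50 whose mean is at least 2 points above the population mean.
p = share_of_sample_means_at_least(population, n=50, threshold=pop_mean + 2.0)
print(p)
```

    A small estimated share means that a sample mean that high would be unusual under genuinely random sampling; how small is "small enough" is a judgment you should state and defend in your memo.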

