Assignment #4

Political Science 328

This assignment will be due in hard copy form in the department dropbox (outside 745 SWKT) AND uploaded on Learning Suite before 1:30 pm, Thursday, February 7. Turn in the assignment electronically on Learning Suite (separately for each part of the assignment), and on paper (in four separate documents) in the Political Science dropbox. Remember that no late assignments will be accepted.

Type your answers in a regular font (e.g. Times Roman 12). (As noted later, Stata .do files and .log files are displayed in Courier 8.)

This assignment is divided into four parts. You must submit your answers to each part separately, as we will have a different TA grade each part. Make sure that your name, section number, and the problem set and part number (e.g., Assignment 4, Part 1) are clearly listed on each part. Students who fail to do so may be penalized on the assignment.

If necessary, re-read the section in the syllabus on group work in Academic Honesty and Plagiarism (here) to make sure you are giving proper credit to those you work with and/or the text(s) you use for each problem. As a reminder, you are in violation of this course's policies as well as the Honor Code if you are sharing electronic portions of your assignment with other people. That includes emailing other people code (even snippets of code), .do files, Word files, or anything else related to a problem set. Your assignment must represent your own work. Please work together: We encourage you to do so! But remember that when working together you should produce your own independent work product.

Solve the following problems. Show all of your work, but keep your answers concise. Include a copy of your input and output: your .do file and your .log file. However, do not include unnecessary output (i.e. no data dumps), and format any output so that it is easily readable. Convert Stata output (logs and do-files) to Courier 8 with single-spacing. Explanations should be both statistical and substantive (explain so that a statistical layperson can understand, and so that a statistical analyst will see your erudition). Highlight your answer.

{15 points} [from Agresti and Finlay] (This question is similar to a question you might see on the Testing Center part of the exams: "Here is a study/survey/table. Where does it show this? Interpret that." See how much you can answer without using any other resources. Then, for this assignment, you may use a computer to make calculations, but to get full credit, you need to write out formulas and fill in the necessary information.) A study compared substance use, delinquency, psychological well-being, and social support among various family types, for a sample of urban African-American adolescent males. The sample contained 108 subjects from single-mother households and 44 from households with both biological parents. The youths responded to a battery of questions that provides a measure of perceived parental support. This measure had sample means of 46 (s = 9) for the single-mother households and 42 (s = 10) for the households with both biological parents. Consider the conclusion, "The mean parental support was 4 units higher for the single-mother households. If the true means were equal, a difference of this size could be expected only 2% of the time. For samples of this size, 95% of the time one would expect this difference to be within 3.4 of the true value."

a. Explain how this conclusion refers to the results of (i) a confidence interval, (ii) a hypothesis test. From the information provided, replicate (i.e. show the calculations yourself for) the confidence interval and hypothesis test.

b. Explain the results of the study to someone who has not studied inferential statistics.
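As a way to check hand calculations for a problem like this one, the following is a minimal Python sketch (Python is used only for illustration; it does not replace the written formulas or the Stata files required above). It assumes the two groups are independent samples and uses a large-sample normal approximation; the function name `two_sample_summary` and its return keys are hypothetical, chosen for this example.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_summary(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Difference in means from summary statistics (independent samples).

    Assumption: samples are large enough for the normal approximation.
    """
    std_normal = NormalDist()
    diff = mean1 - mean2
    # Standard error of the difference, without pooling the variances.
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    # Two-sided critical value for the requested confidence level.
    z_crit = std_normal.inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * se
    z = diff / se
    p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))
    return {
        "diff": diff,
        "se": se,
        "margin": margin,
        "ci": (diff - margin, diff + margin),
        "z": z,
        "p_two_sided": p_two_sided,
    }


# Summary statistics quoted in the problem: single-mother households first.
result = two_sample_summary(46, 9, 108, 42, 10, 44)
for key, value in result.items():
    print(key, value)
```

Compare the printed margin of error and p-value with the "3.4" and "2%" in the quoted conclusion, then write out the same formulas by hand for credit.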

{30} [adapted from Stock and Watson] (You may use a computer to make calculations, but to get full credit, you need to write out formulas and fill in the necessary information. Although LSAT scores are rounded to the nearest integer, for this assignment, round to the third decimal place.) Scores on the LSAT are known to have a mean of 150 for students in the United States. The LSAT is administered to 65 randomly selected students at BYU. In this sample the mean is 154 and the standard deviation is 10.

a. Construct a 95% confidence interval for the average LSAT score for BYU students.

b. Conduct a hypothesis test of whether the mean LSAT score of BYU students is different from that of other students in the United States. Include all assumptions, the hypotheses, test statistic, and p-value, and interpret the result (including a conclusion). If you need to make an assumption to conduct a test, state the assumption, and conduct the test.

c. Use both part (a) and part (b) to answer the question: Is there statistically significant evidence that BYU students perform differently on the LSAT than other students in the United States? Discuss how the hypothesis test and confidence interval are related or not.

d. Another 73 students are selected at random from BYU. They are given a 3-hour preparation course before the LSAT is administered. Their average score is 157 with a standard deviation of 9. Construct a 95% confidence interval for the change in average LSAT score associated with the prep course.

e. Conduct a hypothesis test of whether the prep course improves LSAT scores (among BYU students). Include all assumptions, the hypotheses, test statistic, and p-value, and interpret the result (including a conclusion). If you need to make an assumption to conduct a test, state the assumption, and conduct the test.

f. Use both part (d) and part (e) to answer the question: Is there statistically significant evidence that the prep course helped BYU students on the LSAT? Discuss how the hypothesis test and confidence interval are related or not.

g. The original 65 students are given the prep course and then are asked to take the LSAT a second time. The average change in their scores is +2 points, and the standard deviation of the change is 7 points. Construct a 95% confidence interval for the change in average LSAT scores.

h. Conduct a hypothesis test of whether BYU students will perform better on their second attempt on the LSAT after taking the prep course. Include all assumptions, the hypotheses, test statistic, and p-value, and interpret the result (including a conclusion). If you need to make an assumption to conduct a test, state the assumption, and conduct the test.

i. Use both part (g) and part (h) to answer the question: Is there statistically significant evidence that BYU students will perform better on their second attempt on the LSAT after taking the prep course? Discuss how the hypothesis test and confidence interval are related or not.

j. Why do the statistical conclusions differ (or not) in parts (f) and (i) given they are both about change in average LSAT scores, considering the different changes and sample sizes?
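The three designs in this problem (one sample, two independent samples, and paired before/after scores) differ mainly in how the standard error is formed. The Python sketch below illustrates that point only; it is not a substitute for the written formulas or the Stata files. It assumes a normal approximation (a t-distribution gives slightly wider intervals), treats the 73 and 65 students as independent samples, and uses a hypothetical helper named `z_summary`. Whether a one-sided or two-sided p-value is appropriate for each part is left to you.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_summary(estimate, se, null_value=0.0, confidence=0.95):
    """Confidence interval and z test for an estimate with standard error se.

    Assumption: the sampling distribution is approximately normal.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)
    z = (estimate - null_value) / se
    return {
        "se": se,
        "ci": (estimate - z_crit * se, estimate + z_crit * se),
        "z": z,
        "p_two_sided": 2 * (1 - STD_NORMAL.cdf(abs(z))),
        "p_upper": 1 - STD_NORMAL.cdf(z),  # one-sided, alternative "greater"
    }


# One sample: 65 BYU students compared with the national mean of 150.
one_sample = z_summary(154, 10 / sqrt(65), null_value=150)

# Two independent samples: 73 prepped students versus the original 65.
two_sample = z_summary(157 - 154, sqrt(9**2 / 73 + 10**2 / 65))

# Paired design: the same 65 students, using the sd of the score changes.
paired = z_summary(2, 7 / sqrt(65))

for label, res in [("one sample", one_sample),
                   ("two samples", two_sample),
                   ("paired", paired)]:
    print(label, {k: v for k, v in res.items()})
```

Comparing the `se` values across the three calls is a useful starting point for part (j).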

{20} Do Problem E13.1, parts (a) and (d) in Stock and Watson. [NOTE: The first Empirical Exercise in Chapter 13, pp. 517-518. If you do not have the updated 3rd edition, please check to see if your edition has the correct numbers.]
- For part (a), follow the instructions in the book.
- For part (d), use experience and employment holes to examine possible non-random assignment (i.e. a randomization check).
- Additional part: Some employers claim to be "equal opportunity employers" (eoe). Do these businesses appear to respond to black applicants at higher rates than employers who do not claim to be equal opportunity employers?

{34} Case Study: Sampling Scallops [from Barnett 1995 and McClave, Benson and Sincich 1998] "The US Fisheries and Wildlife Service requires that in any given 'harvest,' the average meat per scallop weigh at least 1/36 of a pound. The requirement is aimed at protecting baby scallops, though less to guarantee them happy childhoods than to preserve enough adult scallops so that the species does not disappear.

"The vessel arrived at a Massachusetts port with 11,000 bags of scallops, from which the harbormaster randomly selected 18 bags for weighing. From each such bag, his agents took a large scoopful of scallops; then, to estimate the bag's average meat per scallop, they divided the total weight of meat in the scoopful by the number of scallops it contained. Based on the 18 statistics thus generated, the harbormaster estimated that each of the ship's scallops possessed on average 1/39 of a pound of meat (that is, they were about seven percent lighter than the minimum requirement). Viewing this outcome as conclusive evidence that the weight standard had been violated, federal authorities at once confiscated 95 percent of the catch (which they then sold in an auction). The fishing voyage was thus transformed into a financial catastrophe for its participants.

"The ship's owner was as displeased with the US government as Captain Ahab had been with Moby Dick. He declared that the vessel had fully complied with the weight standard and saw lunacy in the assertion that sampling 18 bags out of 11,000 could yield a reliable estimate of the mean weight of all the ship's scallops. He filed a lawsuit against the government and arranged for a Boston law firm to represent him."

The law firm would like you to evaluate whether the ship’s owner has cause to file a lawsuit against the federal government. Included below are the actual scallop weight measurements for each of the 18 sampled bags. For ease of understanding, each number is expressed as a multiple of 1/36 of a pound, the minimum permissible average weight per scallop. Consequently, numbers below one indicate individual bags that do not meet the standard:
```
        0.93    0.88    0.85    0.91    0.91    0.84    0.90    0.98    0.88
        0.89    0.98    0.87    0.91    0.92    0.99    1.14    1.06    0.93
```
Among the questions you should answer in the main text of the memo are:
- Can a reliable estimate of the mean weight of all the scallops be obtained from a sample size of 18? If not, how big a sample would give a reliable estimate?
- Are there any statistical flaws in the government’s decision rule to confiscate a scallop catch if the mean weight of the scallops is less than 1/36 of a pound?
- Is there another procedure for determining whether a ship is in violation of the minimum weight restriction? Apply your procedure to the data, and draw a conclusion about the ship in question.

Do not worry about whether the government tried to pick up smaller scallops in the bags. The government is not smart enough to pull it off. Consider each bag/scoop to be randomly drawn.
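One possible starting point for the statistical appendix is to summarize the 18 bag measurements and compare their mean with the standard of 1.0. The Python sketch below is illustrative only (your graded work still needs Stata .do and .log files). It assumes the bag averages are a random sample and roughly normal, and it takes the two-sided 95% critical value for a t-distribution with 17 degrees of freedom (2.110) from a standard t table rather than computing it.

```python
from math import sqrt
from statistics import mean, stdev

# The 18 bag averages listed above, in multiples of 1/36 of a pound.
weights = [
    0.93, 0.88, 0.85, 0.91, 0.91, 0.84, 0.90, 0.98, 0.88,
    0.89, 0.98, 0.87, 0.91, 0.92, 0.99, 1.14, 1.06, 0.93,
]

STANDARD = 1.0     # legal minimum, in the same units as the data
T_CRIT_95 = 2.110  # two-sided 95% critical value, t with 17 df (t table)

n = len(weights)
xbar = mean(weights)
s = stdev(weights)               # sample sd, n - 1 in the denominator
se = s / sqrt(n)                 # standard error of the sample mean
t_stat = (xbar - STANDARD) / se  # distance from the standard, in se units
ci = (xbar - T_CRIT_95 * se, xbar + T_CRIT_95 * se)

print(f"n = {n}, mean = {xbar:.3f}, sd = {s:.3f}, se = {se:.3f}")
print(f"t = {t_stat:.3f} on {n - 1} df")
print(f"95% CI for the mean: ({ci[0]:.3f}, {ci[1]:.3f})")
```

How to turn these numbers into a decision rule, and whether a one-sided test is the better framing, is part of what the memo should argue.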

Prepare a professional memo that presents the results of your analysis and gives your opinion regarding the case. Be sure to describe the assumptions and methodologies used to arrive at your findings. Assume that your readers have only a vague familiarity with statistics (i.e., they are laypersons).

Remember that your professional memo should only be one page (with standard margins and fonts), and you should have a professionally formatted statistical appendix following the report to support the conclusions of the report. Although a layperson should be able to understand the report, the appendix can be more technical. A professionally formatted statistical appendix means that you should not cut and paste Stata output into the appendix. Place your Stata .do file and .log file in a grader's appendix to show your work. The grader's appendix is separate from a professional appendix. If you were producing this work for a job, you would include the report and the professional appendix (but not the grader's appendix).

{1} Complete the Time Spent Survey. State your survey completion code at the top of your Part 4 packet (next to your name, section, etc.).