Assignment #5

PPol 603
Due: Thursday, 4 October 2012, 9:30 a.m.

Type up your answers. Read the section in the syllabus on Academic Honesty and Plagiarism (here) to make sure you are giving proper credit to those you work with and/or the text(s).

Solve the following problems. Show all of your work, but keep your answers concise. Highlight your (final) answer to distinguish it from your other numbers and text. Include a copy of your input (e.g. do file) or output (e.g. log file) when it is an appropriate way to show your work. However, do not include unnecessary output (i.e. no data dumps), and format any output so that it is easily readable. One appropriate time to include output is when you report your results in a table: if your results are wrong, graders cannot tell how you reached your conclusions (and so cannot give partial credit) unless you provide some output. Explanations should be both statistical and substantive: explain so that a statistical layperson can understand, and so that a statistical analyst will see your erudition.

  1. {15 points} Do Problem 5.1 in Stock and Watson. Add the following parts:
    e. Do smaller classes improve test scores? By how much? Is the effect large? Explain.
    f. Do you think that the regression errors plausibly are homoskedastic? Explain.
    g. The standard error of the slope was computed using Equation (5.3). Suppose that the regression errors were homoskedastic: Would this affect the validity of the confidence intervals constructed in part a.? Explain.
  2. {15} Do Problem E5.1 in Stock and Watson.
  3. {5} Show that you can get the same regression results of E5.1a in SPSS (except for the robust standard errors). The data (in SPSS format) is here. To open a data file in SPSS, click on File-->Open-->Data... To run a regression, click on Analyze-->Regression-->Linear... Click on the various buttons to find the things you need, then click OK. In addition to the usual output, get the confidence intervals of the regression coefficients, as well as the predicted values, residuals, and the mean and individual prediction intervals. (Note the actual command that is being run in the Output Log. This is the syntax of SPSS programming.) Attach a copy of the output, annotated with your comments on where to find the relevant statistics. (Annotations by hand are fine. The first page of predicted values, etc., is sufficient.)
  4. {5} Show that you can get the same regression results of E5.1a in Excel (except for the robust standard errors). (The Excel data is available on the Stock and Watson website in the same place you get the Stata data.) To run a regression, you may first need to add in the Analysis ToolPak. Click on File, then Options, then Add-Ins, then Go.... Click on Analysis ToolPak and click OK. Now click on Data, then on Data Analysis. Click on Regression, then click OK. In addition to the usual output, obtain the residuals (which will also generate fitted values). Attach a copy of the output (or a part of it, if necessary), annotated with your comments on where to find the relevant statistics.
  5. {30} This problem follows up on E4.4 from the previous assignment. Use all observations.
    a. Estimate a regression of Growth on TradeShare, using the “robust” option.
    b. Is the slope coefficient statistically significantly different from zero at the 5% significance level? Show how you reach this conclusion.
    c. Report the 95% confidence interval for the slope of the population regression line.
    d. What is the R² of this regression? What does this mean?
    e. Compute the correlation coefficient between Growth and TradeShare, and compare its square to the R². How are the correlation coefficient and the R² related?
    f. What is the value of the standard error of the regression? What does this mean?
    g. Based on your graph from E4.4a. (a scatterplot), does the regression error appear to be homoskedastic or heteroskedastic? In addition, do the errors appear to be normally distributed?
    h. Run the regression again without the “robust” option. Compare the results to what you obtained with the “robust” option. What is different?
    i. Create the plot of residuals vs. predicted values. Do there appear to be any problems with the regression? How would you fix them?
    j. What is the predicted growth rate with a trade share of 1.0 (also found in E4.4c.)? Using the robust results, what is the confidence interval of the predicted average growth rate at this trade share (i.e. the confidence interval of the mean prediction)? Discuss your results. Create a plot that has the confidence interval of the mean prediction overlaid with the data.
    k. Using the non-robust results, what is the confidence interval of the predicted growth of a specific country at this trade share (i.e. the confidence interval of the individual prediction)? Discuss your results.
    l. Construct a new variable, lowrgdp60, which equals one if the country’s GDP is in the bottom quartile of GDP for 1960 and equals zero otherwise. Estimate a regression of Growth on lowrgdp60, using the “robust” option. What is the coefficient on lowrgdp60? Explain in words what this means. Is the numerical value of your estimate large or small in a real-world sense?
    m. Test the hypothesis that the mean growth rate from 1960-1995 is the same for countries with lowrgdp60 = 1 as it is for countries with lowrgdp60 = 0, against the alternative that they differ, at the 5% significance level.
    n. Using the “summarize” command, compute the sample average of Growth for countries with lowrgdp60 = 1 and then again for countries with lowrgdp60 = 0; from this compute the difference in mean GDP growth rates for the two groups and construct the differences-of-means t-statistic testing the hypothesis that the mean growth rates are the same, assuming equal and then unequal variances (ttest ..., ... unequal).
    o. Reestimate the regression of Growth on lowrgdp60, without the “robust” option. How does the t-statistic computed in n. compare to the t-statistic on the slope coefficient in the regression of Growth on lowrgdp60 obtained with and without the “robust” option? Explain.
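
    As a reminder of the arithmetic behind the differences-of-means t-statistic in part n., here is a minimal sketch in Python (standard library only). The numbers are made up for illustration and are not the assignment data; the function name diff_means_t and the lists low and other are hypothetical. It computes the statistic once with a pooled (equal) variance and once with separate (unequal) variances.

```python
from math import sqrt
from statistics import mean, variance


def diff_means_t(a, b, equal_var=False):
    """t-statistic for H0: mean(a) == mean(b) (hypothetical helper)."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 divisor)
    if equal_var:
        # Pooled variance: weight each group's variance by its degrees of freedom
        pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
        se = sqrt(pooled * (1 / na + 1 / nb))
    else:
        # Unequal variances: each group contributes its own variance / n
        se = sqrt(va / na + vb / nb)
    return (mean(a) - mean(b)) / se


# Made-up growth rates, NOT the assignment data
low = [1.0, 2.0, 3.0]         # stand-in for countries with lowrgdp60 = 1
other = [2.0, 4.0, 6.0, 8.0]  # stand-in for countries with lowrgdp60 = 0

t_pooled = diff_means_t(low, other, equal_var=True)
t_unequal = diff_means_t(low, other, equal_var=False)
print(f"equal variances:   t = {t_pooled:.4f}")
print(f"unequal variances: t = {t_unequal:.4f}")
```

    The two versions differ only in the standard error in the denominator, so they share the same numerator (the difference in sample means). Your Stata output from summarize and ttest should let you reproduce both by hand in the same way.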
  6. {30} Research Project Data Summary:
    Turn in a one-page, double-spaced document (standard font and margins) that offers details about the data set that you have obtained. Briefly review (or update) the research question you will address, and your dependent variable and independent variables. The summary should include summary statistics and any relevant figures that help describe the data. (Figures and tables do not count toward the page limit.) Some suggestions on writing a research paper (by Stock and Watson) can be found here.
