The error term has zero mean, variance equal to $\sigma^2 / X_i^2$, and $E(u_i u_j) = 0$ for $i \neq j$. You are given a sample of observations $\{(Y_i, X_i)\}_{i=1}^{n}$. You may treat $X_i$ as being non-stochastic. Clearly annotating your answers:

(a) (5 marks) Derive the OLS estimator of $\beta$. In the presence of heteroskedasticity, the OLS estimator remains unbiased (you are not asked to show this). Derive the variance of the OLS estimator of $\beta$.

(b) (3 marks) Discuss how you can obtain the Best Linear Unbiased Estimator (BLUE) of $\beta$ given the heteroskedasticity.
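The error structure in this question (zero mean, variance read as $\sigma^2 / X_i^2$, no correlation across observations) can be illustrated with a small simulation. This is only a sketch: the normal distribution, the value of `sigma`, and the regressor values in `x_values` are assumptions made for illustration and do not come from the question.

```python
import random
import statistics

def draw_errors(x_values, sigma, rng):
    """Draw one error per observation with Var(u_i) = sigma^2 / x_i^2."""
    # The standard deviation of u_i is sigma / |x_i|. Draws are independent,
    # so E(u_i u_j) = 0 for i != j holds by construction.
    return [rng.gauss(0.0, sigma / abs(x)) for x in x_values]

rng = random.Random(0)
sigma = 2.0                       # hypothetical value of sigma
x_values = [0.5, 1.0, 2.0, 4.0]   # hypothetical fixed (non-stochastic) regressor values

# Repeat the draw many times to compare empirical and theoretical variances.
draws = [draw_errors(x_values, sigma, rng) for _ in range(20000)]
empirical_var = [
    statistics.pvariance([d[i] for d in draws]) for i in range(len(x_values))
]
theoretical_var = [sigma ** 2 / x ** 2 for x in x_values]

for x, emp, theo in zip(x_values, empirical_var, theoretical_var):
    print(f"x = {x}: empirical variance {emp:.4f}, theoretical {theo:.4f}")
```

The printed variances shrink as the regressor value grows, which is the pattern of heteroskedasticity the question describes.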

2. Consider the simple linear regression model

$$Y_i = \alpha + \beta X_i + u_i, \quad i = 1, \ldots, n$$

in the presence of correlation between the error and the regressor. The regressor exhibits variability in the sample, i.e., $\sum_{i=1}^{n} (X_i - \bar{X})^2 \neq 0$. Under the assumptions of homoskedasticity and the absence of autocorrelation, the IV estimator of $\beta$ that uses the instrument $Z$ has the following (asymptotic) variance (no need to prove this statement):

$$\operatorname{Var}(\hat{\beta}_{IV}) = \frac{\sigma_u^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \cdot \frac{1}{r_{XZ}^2},$$

where $r_{XZ} \neq 0$ is the sample correlation between $X$ and $Z$, and $\sigma_u^2$ is the variance of the disturbance term $u$.
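To make the variance expression concrete, the sketch below evaluates it for given data, reading the formula as $\sigma_u^2$ divided by the sum of squared deviations of $X$, times $1 / r_{XZ}^2$. The function names and all data values are hypothetical, and $\sigma_u^2$ is treated as known purely for illustration.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_correlation(x, z):
    """Sample correlation coefficient between two equal-length sequences."""
    x_bar, z_bar = mean(x), mean(z)
    s_xz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    return s_xz / math.sqrt(s_xx * s_zz)

def iv_variance(x, z, sigma_u_sq):
    """Evaluate sigma_u^2 / sum((X_i - X_bar)^2) * 1 / r_XZ^2."""
    x_bar = mean(x)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    r_xz = sample_correlation(x, z)
    if s_xx == 0 or r_xz == 0:
        raise ValueError("the formula requires variation in X and r_XZ != 0")
    return sigma_u_sq / s_xx / r_xz ** 2

# Hypothetical data: one instrument closely related to X, one weakly related.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
z_strong = [1.1, 1.9, 3.2, 3.8, 5.1]
z_weak = [2.0, 1.0, 3.0, 1.5, 2.5]
sigma_u_sq = 1.0  # assumed known here; in practice it must be estimated

print("variance with z_strong:", iv_variance(x, z_strong, sigma_u_sq))
print("variance with z_weak:  ", iv_variance(x, z_weak, sigma_u_sq))
```

Evaluating the function with different inputs shows how each term of the expression moves the variance.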

(a) (1 mark) Give the formula for $\hat{\beta}_{IV}$ (you are not asked to derive it).

(b) (4 marks) Provide at least three factors that will help obtain more precise IV parameter estimates for $\beta$. In your answer, explain why the precision of parameter estimates is important.

(c) (3 marks) Discuss the following statement: “If $X$ is not correlated with $u$, the best choice of instrument is using the regressor itself.”

You are interested in the extent to which removing the option of paid childcare has affected the share of time mothers, compared to fathers, spend with their children. The economic and health response to the pandemic has caused job losses, increased telecommuting, closed day care and schools, and placed state restrictions on in-home paid childcare, all of which may have affected the bargaining dynamic between parents in deciding how much time to spend caring for their children. The American Time Use Survey provides nationally representative estimates of how and with whom Americans spend their time, including hours spent on paid work and childcare. It is an annual cross-sectional survey, administered during weeks when school is typically in session, that is linked to the Current Population Survey (demographic questions were asked several months before the Time Use questions); thus you can observe demographic information and family relationships.

You have data from 2015-2019 and will have data in 2020 next year. You are crafting your econometric specification. Prior to 2020, all states allowed paid in-home care and had schools open. As of the time of the survey in 2020, all states had closed schools (shutdown of physical buildings and in-person instruction) but there was state variation in allowing paid in-home care for children less than 6 years old. That is, some, but not all, states deemed in-home childcare essential when stay-at-home orders went into effect in 2020 such that the availability of paid in-home care depended on the state you lived in. You want to learn the effect of disallowing paid childcare and closing schools on spousal allocation of childcare hours as measured by y, the ratio of hours mother spent on childcare to hours father spent on childcare. For example, y = 1 when mothers and fathers spent the same amount of time on childcare and y = 1.5 when mothers spent 50% more than fathers.

You restrict your sample to opposite sex, married couples with children where both parents worked in the prior year and both parents report positive hours of childcare.^{1} You can observe the following variables for each couple:

^{1} For simplicity, we exclude all couples with first responders, like health care workers, for whom special rules applied in the state stay-at-home orders.

y_{it}= hours mother spent on childcare divided by hours father spent in couple i in year t

x_{1}_{it}=1 if wife in couple i in year t is currently working for pay, 0 otherwise

x_{2}_{it}=wife’s paid work hours last week reported in year t for couple i

x_{3}_{it}=1 if husband in couple i in year t is currently working for pay, 0 otherwise

x_{4}_{it}=husband’s paid work hours last week reported in year t for couple i

x_{5}_{it}=1 if couple i has a child under 6 years old in year t, 0 otherwise

x_{6}_{it}=1 if couple i’s youngest child is 6-17 years old in year t, 0 otherwise

x_{7}_{i}=1 if wife in couple i has years of education = husband’s, 0 otherwise

x_{8}_{i}=1 if wife in couple i has years of education > husband’s, 0 otherwise

x_{9}_{i}=age of husband – age of wife

x_{10}_{i}=1 if wife’s occupation is teaching, 0 otherwise

x_{11}_{i}=1 if husband’s occupation is teaching, 0 otherwise

pandemic_{t}=1 if year t is after the pandemic hit (2020 or later), 0 otherwise

notavail_{it}=1 if paid in-home childcare was NOT available for couple i in year t, 0 otherwise
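As a rough illustration of how the outcome and the two policy indicators defined above could be built from raw survey records, consider the sketch below. The record layout (`CoupleYear`), its field names, and the state names in `RESTRICTING_STATES_2020` are hypothetical; the source does not say which states restricted paid in-home care.

```python
from dataclasses import dataclass

# Hypothetical set of states that did NOT deem paid in-home childcare
# essential in 2020; the source does not list them.
RESTRICTING_STATES_2020 = {"State A", "State B"}

@dataclass
class CoupleYear:
    state: str
    year: int
    mother_childcare_hours: float
    father_childcare_hours: float

def outcome_ratio(obs):
    """y_it: mother's childcare hours divided by father's childcare hours."""
    # The sample restriction requires positive hours for both parents.
    if obs.mother_childcare_hours <= 0 or obs.father_childcare_hours <= 0:
        raise ValueError("both parents must report positive childcare hours")
    return obs.mother_childcare_hours / obs.father_childcare_hours

def pandemic(obs):
    """pandemic_t: 1 if the year is 2020 or later, 0 otherwise."""
    return 1 if obs.year >= 2020 else 0

def notavail(obs):
    """notavail_it: 1 if paid in-home childcare was not available, 0 otherwise."""
    # Before 2020 every state allowed paid in-home care, so the indicator is 0.
    return 1 if obs.year >= 2020 and obs.state in RESTRICTING_STATES_2020 else 0

example = CoupleYear("State A", 2020, mother_childcare_hours=15.0,
                     father_childcare_hours=10.0)
print(outcome_ratio(example), pandemic(example), notavail(example))
```

The example couple has a mother spending 50% more hours than the father, which matches the y = 1.5 case described in the setup.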

Let’s simply refer to the effect of removing the option of paid childcare on the ratio of mother’s to father’s hours of childcare as the treatment effect of interest: TE.

Since some but not all states allow paid in-home childcare in 2020 at the time of the survey, you could estimate TE with $\hat{\alpha}_{3}$ by running ordinary least squares using only 2020 data with the following specification:

6 points If you ran an OLS regression based on the specification above, what type of estimator is $\hat{\beta}_{3}$? What assumptions are necessary for $\hat{\beta}_{3}$ to be unbiased?

4 points Compare the advantages and disadvantages of the two potential ordinary least squares estimators for TE: $\hat{\beta}_{3}$ from the specification above and $\hat{\gamma}_{3}$ from assuming

4 points The treatment effect of removing the option of paid childcare may be different for families where one of the parents is themselves a teacher. How would you suggest modifying your specification and why?

20 points To complete the Master of Science in Economics at a University, students must complete the core Econometrics course and one of the four advanced econometrics courses, among other requirements. Students receiving a B- or better as their final course grade receive credit for the course, while students with a C+ or below do not receive credit. Instructors of the core Econometrics course calculate a final numerical score for the course, with cutoff values for assigning the letter grades. Failing to receive credit for a course that you attended and paid tuition for can be discouraging, possibly affecting your enthusiasm for the subject matter. However, successfully retaking a course you struggled in may boost your confidence and enthusiasm. Suppose that you are interested in estimating the treatment effect of receiving a passing grade on the number of advanced econometrics courses taken.

Let x_{0} be the cutoff value for receiving a B- in a core Econometrics course. You observe for each student:

y_{i}=number of advanced econometrics courses student i enrolled in

x_{1}_{i}=core Econometrics course score for student i

x_{2}_{i}=fraction of students failed by student i’s core Econometrics instructor

x_{3}_{i}=student i’s mean GPA (excluding econometrics courses)

x_{4}_{i}=1 if student i received an A in their Statistics class, 0 otherwise

x_{5}_{i}=1 if student i took Mathematical Methods for Economists, 0 otherwise
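The grading rule described above, where credit depends on whether the course score reaches the cutoff x_{0}, can be written as a small function. This is a sketch only: the cutoff value of 70, the example scores, and the convention that a score exactly at the cutoff earns a B- are assumptions, not details given in the question.

```python
CUTOFF = 70.0  # hypothetical value of x_0; the source does not state it

def received_credit(score, cutoff=CUTOFF):
    """Return 1 if the score is at or above the B- cutoff, 0 otherwise."""
    # Assumption: a score exactly equal to the cutoff counts as a B-.
    return 1 if score >= cutoff else 0

def centered_score(score, cutoff=CUTOFF):
    """Distance of the score from the cutoff; zero exactly at x_0."""
    return score - cutoff

scores = [62.5, 69.0, 70.0, 71.5, 88.0]  # hypothetical x_1i values
credit = [received_credit(s) for s in scores]
distance = [centered_score(s) for s in scores]

for s, c, d in zip(scores, credit, distance):
    print(f"score {s}: credit = {c}, distance from cutoff = {d}")
```

Centering the score at the cutoff is a common convenience when working with cutoff-based rules, since the sign of the distance then matches the credit indicator.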

4 points To estimate the average treatment effect of receiving a passing grade on number of advanced econometrics courses taken, would you use a sharp or fuzzy Regression Discontinuity design? Explain.

4 points Write down your model specification to estimate the average treatment effect.

6 points Describe two graphical tests that would be important to perform and explain why.

6 points How would you design your falsification test? Explain why you chose it.

24 points Suppose that your goal was to estimate the effect of education on weekly hours worked for individuals approaching retirement. Assume that education is exogenous. The data set includes men over the age of 50 but less than 60 years old.

Suppose the data set includes a categorical variable that equals 1 if the man works 0 hours (that is, does not work), 2 if he works more than 0 but less than 35 hours per week, and 3 if he works 35 or more hours per week. There are no missing values.
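The coding of this categorical variable can be sketched as a function of weekly hours. The function name is hypothetical, and placing exactly 35 hours in category 3 is an assumption of this sketch.

```python
def hours_category(weekly_hours):
    """Map weekly hours worked to the three-level category described above."""
    if weekly_hours < 0:
        raise ValueError("weekly hours cannot be negative")
    if weekly_hours == 0:
        return 1  # does not work
    if weekly_hours < 35:
        return 2  # works more than 0 but less than 35 hours
    return 3      # works 35 hours or more (boundary handling is an assumption)

sample_hours = [0, 12, 34.5, 35, 48]  # hypothetical observations
categories = [hours_category(h) for h in sample_hours]
print(list(zip(sample_hours, categories)))
```

The mapping discards the exact number of hours and keeps only an ordered three-level outcome, which is all the analyst observes in this version of the data.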

4 points Which is the appropriate econometric model?

4 points Explain why you chose this model.

4 points Write the log likelihood function implied by your model choice.

Instead, suppose that you had the actual number of hours worked per week (e.g., 0, 1, 2, ...) and you observe that a substantial share of men don’t work at all (have 0 hours of work).

4 points Which is the appropriate econometric model?

4 points Explain why you chose this model.

4 points Write the log likelihood function implied by your model choice.