📘 ECON 320 Lab Problem Set 3

  • Name : [Your Name]

  • Lab Section: [Your Lab Section Here]

Please submit the exercise on Canvas as an HTML or PDF file.


This assignment builds on:

  • Week 6: Multicollinearity
  • Week 7: Omitted Variable Bias (OVB)
  • Data: J.M. Wooldridge (2019) Introductory Econometrics: A Modern Approach, Cengage Learning, 7th ed.

📝 Grading (Total = 10 points)

  • Q1: Detect the problem — 2 pts
  • Q2: Fix the problem — 4 pts
  • Q3: OVB — 4 pts

🔧 Setup

⚠️ Please run the cells in this section before starting the problem set!

📦 Import required libraries
In [ ]:
# Install quietly if needed
!pip install numpy pandas statsmodels wooldridge --quiet

import numpy as np
import pandas as pd
import statsmodels.api as sm
import wooldridge as wr

📥 Load the dataset
In [ ]:
# Use CARD dataset only
card = wr.data('card').copy()

# Keep only the variables used in this problem set (those present in the data)
keep = [c for c in ['lwage','educ','age','exper'] if c in card.columns]
card = card[keep].dropna().copy()

card.head()

Q1. Detect the problem

In the Card dataset, potential experience is constructed as: $$ \text{exper} = \text{age} - \text{educ} - 6. $$
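To see what an exact linear identity of this kind does to a regressor matrix, here is a minimal sketch on synthetic data (not the Card data). The sample size, seed, and value ranges are arbitrary assumptions; only the construction of `exper` mirrors the formula above.

```python
import numpy as np

# Synthetic stand-ins for the Card variables (assumed ranges, for illustration only)
rng = np.random.default_rng(0)
n = 200
educ = rng.integers(8, 19, size=n).astype(float)
age = rng.integers(24, 35, size=n).astype(float)
exper = age - educ - 6  # same identity as in the text

# Design matrix with a constant, educ, age, and exper
X = np.column_stack([np.ones(n), educ, age, exper])

# If one column is an exact linear combination of the others,
# the rank is smaller than the number of columns
rank = np.linalg.matrix_rank(X)
print("number of columns:", X.shape[1])
print("matrix rank:      ", rank)
```

Comparing the rank with the number of columns is one quick way to check whether a set of regressors carries redundant information.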

Tasks:

  1. Verify this identity by computing exper - (age - educ - 6) (run the code cell provided below).
  2. Based on your results, identify the problem:
    • What type of collinearity is present?
    • What issue will OLS run into when we include age, educ, and exper in the same regression?
In [ ]:
# Q1 Task 1: verify the identity (please run this code cell without changes)
card['diff'] = card['exper'] - (card['age'] - card['educ'] - 6)
print("\nAre all values zero (up to rounding)?", np.allclose(card['diff'], 0, atol=1e-8))

Put your answer to Q1 Task 2 here:


Q2. Fix the problem

To fix perfect multicollinearity, we need to drop one redundant variable.

Estimate two models:

  1. Drop exper:
    $$ \text{lwage} = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{age} + u $$
  2. Drop age:
    $$ \text{lwage} = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{exper} + u $$

Tasks:

  1. Report the educ coefficient and R² for both models.
  2. Compare the two results. Are the educ coefficients the same or different? Are the R² values the same or different? Explain why.
In [ ]:
# Put your answer to Q2 Task 1 here:

Put your answer to Q2 Task 2 here:


Q3. Omitted Variable Bias (OVB)

Now intentionally omit a relevant variable to see OVB.

Compare the following two models:

Short model (omitting exper): $$ \text{lwage} = \alpha_0 + \alpha_1 \text{educ} + e $$

Long model (including exper): $$ \text{lwage} = \beta_0 + \beta_1 \text{educ} + \beta_2 \text{exper} + u $$

Tasks:

  1. Report the educ coefficient and R² for both models.
  2. Which educ estimate is larger? What does this suggest about the direction of the bias in the short model?
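The short and long estimates are linked by an exact in-sample identity: the short-model slope equals the long-model slope plus the omitted variable's coefficient times the slope from regressing the omitted variable on the included one. The sketch below checks this on synthetic data using plain NumPy least squares. The variables `x`, `z`, `y`, the helper `ols`, and all coefficient values are assumptions chosen for illustration; `x` plays the role of educ and `z` the role of exper.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
x = rng.normal(size=n)                  # included regressor
z = 0.6 * x + rng.normal(size=n)        # omitted regressor, correlated with x
y = 1.0 + 2.0 * x + 1.5 * z + rng.normal(size=n)

def ols(y, *cols):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

alpha = ols(y, x)       # short model: y on x
beta = ols(y, x, z)     # long model: y on x and z
delta = ols(z, x)       # auxiliary regression: z on x

# Identity: alpha_1 = beta_1 + beta_2 * delta_1
implied_short = beta[1] + beta[2] * delta[1]
gap = alpha[1] - implied_short
print("short-model slope:        ", alpha[1])
print("long slope + beta2*delta1:", implied_short)
print("difference:               ", gap)
```

In this made-up setup the omitted regressor is positively related to both `y` and `x` by construction. The direction in the Card data has to be read from the actual estimates.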
In [ ]:
# Put your answer to Q3 Task 1 here:

Put your answer to Q3 Task 2 here:


End of Problem Set.