📘 ECON 320 Lab Problem Set 2¶

  • Name : [Your Name]

  • Lab Section: [Your Lab Section Here]

Please submit the exercise on Canvas in the form of an HTML/PDF file.¶


This assignment builds on:¶

  • Week 3: OLS Estimator for Simple Linear Regression
  • Week 4: OLS Estimator for Multiple Linear Regression
  • Week 5: Incorporating Qualitative Data
  • data: J.M. Wooldridge (2019) Introductory Econometrics: A Modern Approach, Cengage Learning, 7th edition.

🎯 Learning Objectives¶

By the end of this assignment, you should be able to:

  1. Practice OLS in simple and multiple regression.
  2. Interpret estimated slopes (marginal effects).
  3. Incorporate qualitative data (dummy variables) in regression.

📝 Grading (Total = 10 points)¶

  • Q1: Data prep & quick summary — 3 pts
  • Q2: SLR — 2 pts
  • Q3: MLR & dummy variables — 5 pts

🔧 Setup¶

⚠️ Please run the cells in this section before starting the problem set!¶

📦 Download and import required libraries (please feel free to add any other libraries you may need)¶
In [ ]:
# Install quietly (run once if needed)
%pip install -q wooldridge pandas numpy matplotlib seaborn statsmodels nbconvert
In [ ]:
# Imports
import wooldridge as wr
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
import statsmodels.formula.api as smf

plt.rcParams['figure.figsize'] = (6,4)
print("✅ Libraries ready.")

📥 Load the dataset¶
  • In this problem set, we will use the econmath dataset from the wooldridge package. (same as PS1)

  • It contains information on students from a large college course in introductory microeconomics.

In [ ]:
# Show dataset description (variables, source, sample)
wr.data("econmath", description=True)

# Load DataFrame for analysis
df_raw = wr.data("econmath").copy()
df_raw.head()

❓ Q1 — Data prep & quick summary (3 pts, 3 sub-questions)¶

  1. Create a working copy df = df_raw.copy() that keeps only the variables of interest for this exercise (score, hsgpa, calculus), and report the number of rows and columns.
In [ ]:
# Put your answer here
  2. Report .describe() for the working copy df.
In [ ]:
# Put your answer here
  3. Report the count of missing values in each column of df.
In [ ]:
# Put your answer here
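The pandas operations Q1 asks for can be sketched on a small toy DataFrame (the column names `a`, `b`, `c` below are made up for illustration and are not part of the econmath data):

```python
# Toy example (not the econmath data): subsetting columns,
# summarizing, and counting missing values with pandas.
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, 4.0],
    "b": [10, 20, 30, 40],
    "c": ["x", "y", "z", "w"],
})

sub = toy[["a", "b"]].copy()  # keep only the columns of interest
print(sub.shape)              # (rows, columns) -> (4, 2)
print(sub.describe())         # summary statistics for numeric columns
print(sub.isna().sum())       # count of missing values per column
```

The same pattern (column subset, `.shape`, `.describe()`, `.isna().sum()`) applies directly to the working copy `df`.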

❓ Q2 — Simple Linear Regression (2 pts, 2 sub-questions)¶

Estimate the simple regression of score on hsgpa using OLS:

$$ \text{score}_i = \beta_0 + \beta_1\,\text{hsgpa}_i + u_i. $$

  1. Report the estimated intercept and slope (from smf.ols("score ~ hsgpa", data=df).fit()).
In [ ]:
# Put your answer here
  2. Interpret the slope in plain English (units of score per 1-point increase in hsgpa).

✍️ Interpret the slope here (two sentences max):

In [ ]:
# Plot scatter + fitted line (provided)
sns.regplot(x='hsgpa', y='score', data=df, scatter_kws={'alpha':0.5})
plt.title("SLR: score ~ hsgpa")
plt.show()
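As a hint for reading off the intercept and slope, here is a sketch on synthetic data (the variables `y` and `x` and the true coefficients are made up; they are not the econmath variables):

```python
# Illustrative sketch on synthetic data: fitting a simple OLS with the
# statsmodels formula API and reading off the estimated coefficients.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
x = rng.uniform(2.0, 4.0, size=500)
y = 5.0 + 10.0 * x + rng.normal(0.0, 2.0, size=500)  # true slope = 10
toy = pd.DataFrame({"y": y, "x": x})

res = smf.ols("y ~ x", data=toy).fit()
print(res.params)       # Series with "Intercept" and "x" entries
print(res.params["x"])  # the slope estimate, near the true value of 10
```

`res.params` is a pandas Series keyed by term name, so the intercept and slope can be pulled out by label rather than by position.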

❓ Q3 — Multiple Linear Regression & dummy variables (5 pts, 4 sub-questions)¶

Estimate the multiple regression: $$ \text{score}_i = \beta_0 + \beta_1\,\text{hsgpa}_i + \beta_2\,\text{calculus}_i + u_i. $$

  1. Report the estimated coefficients (from smf.ols("score ~ hsgpa + calculus", data=df).fit()).
In [ ]:
# Put your answer here
  2. Interpret $\beta_1$: the partial effect of hsgpa holding calculus fixed.

✍️ Interpret the partial effect here (two sentences max):

  3. Produce a plot that fixes calculus=1 and plots predicted score against hsgpa. (The line should be straight, since the slope does not vary with calculus in this specification.) [2 pts]
In [ ]:
# Put your answer here
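One common approach for this kind of plot is to build a grid over the continuous regressor with the dummy held fixed, then call `.predict()` on it. The sketch below uses synthetic variables `y`, `x`, and dummy `d` (made up for illustration, not the econmath names):

```python
# Illustrative sketch on synthetic data: predicting from a fitted model
# with the dummy held fixed at 1, then plotting the prediction line.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 300
x = rng.uniform(2.0, 4.0, size=n)
d = rng.integers(0, 2, size=n)
y = 3.0 + 8.0 * x + 5.0 * d + rng.normal(0.0, 2.0, size=n)
toy = pd.DataFrame({"y": y, "x": x, "d": d})

res = smf.ols("y ~ x + d", data=toy).fit()

# Grid of x values with the dummy held fixed at 1
grid = pd.DataFrame({"x": np.linspace(2.0, 4.0, 50), "d": 1})
grid["yhat"] = res.predict(grid)

plt.plot(grid["x"], grid["yhat"])
plt.xlabel("x")
plt.ylabel("predicted y")
plt.title("Predicted y vs. x at d = 1")
plt.show()
```

The same grid-then-predict pattern carries over with `hsgpa` in place of `x` and `calculus` in place of `d`.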
  4. Compare the slope on hsgpa from Q2 and Q3. Which one is larger? In 1–2 sentences, explain why they might differ.

✍️ Compare the slopes here (two sentences max):


End of Problem Set.