%pip install wooldridge matplotlib seaborn

import wooldridge as wr
import matplotlib.pyplot as plt
import seaborn as sns

print("✅ Libraries ready.")

# Show dataset description (variables, source, sample) for the "econmath" dataset
wr.data("econmath", description=True)

# Load DataFrame for analysis
df_raw = wr.data("econmath")
df_raw.head()

# Put your answer here

# Put your answer here

# Put your answer here

# Solution is provided for this step

df_clean = df_raw.dropna(subset=['actmth', 'acteng'])
print(f"Rows remaining after dropping missing ACT scores: {df_clean.shape[0]}")

# Put your answer here

# Put your answer here
# 1) Histogram of score

# Put your answer here
# 2) Boxplot of score

# Put your answer here
# 3) Scatter: score vs actmth

# Put your answer here
# 4) Scatter: score vs acteng

📘 ECON 320 Lab Problem Set 1

**Please submit the exercise on Canvas as an HTML or PDF file.**

This assignment builds on:

🎯 Learning Objectives

📝 Grading (Total = 10 points)

🔧 Setup

⚠️ Please run the cells in this section before starting the problem set!

📦 Download and import required libraries

📥 Load the dataset

❓ Q1 — Summary Statistics (2 pts, 2 sub-questions)

1. Use `.describe()` to summarize the variables.

2. Report mean, std, min, max for `score`, `actmth`, and `acteng`.
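As a hedged illustration of the two Q1 steps, the sketch below runs `.describe()` on a small made-up DataFrame and then pulls out only the requested rows and columns. The DataFrame `toy` and its values are invented for illustration and are not the `econmath` data; with the real data you would call the same methods on `df_raw`.

```python
import pandas as pd

# Toy stand-in for df_raw; every value here is invented for illustration
toy = pd.DataFrame({
    "score":  [72.5, 81.0, 64.3, 90.2, 77.8],
    "actmth": [23, 27, 19, 30, 25],
    "acteng": [22, 26, 20, 29, None],  # one missing value on purpose
})

# Step 1: summary statistics for every numeric column
summary = toy.describe()

# Step 2: keep only the requested statistics for the three variables
report = summary.loc[["mean", "std", "min", "max"], ["score", "actmth", "acteng"]]
print(report.round(2))
```

Note that `.describe()` skips missing values, so the `count` row can differ across columns.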

❓ Q2 — Cleaning data (2 pts, 3 sub-questions)

1. Check for missing values.

2. For this assignment, assume the missing ACT scores are missing at random.

3. Create a new DataFrame (name it `df_analysis`) based on `df_clean` to keep only the following relevant variables for the rest of the assignment:
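The Q2 workflow (count missing values, drop rows with missing ACT scores, keep a subset of columns) might look like the sketch below on a made-up DataFrame. The data, the `extra` column, and the `keep_cols` list are all hypothetical; the assignment's own variable list is not reproduced here, so substitute it when working with `df_clean`.

```python
import pandas as pd

# Toy stand-in for df_raw; every value here is invented for illustration
toy = pd.DataFrame({
    "score":  [72.5, 81.0, 64.3, 90.2, 77.8],
    "actmth": [23, None, 19, 30, 25],
    "acteng": [22, 26, 20, 29, None],
    "extra":  ["a", "b", "c", "d", "e"],  # hypothetical column we do not need
})

# 1) Count missing values per column
missing_counts = toy.isna().sum()
print(missing_counts)

# 2) Drop rows where either ACT score is missing (same call as the provided solution)
toy_clean = toy.dropna(subset=["actmth", "acteng"])

# 3) Keep only the columns needed later; this list is a placeholder assumption
keep_cols = ["score", "actmth", "acteng"]
toy_analysis = toy_clean[keep_cols].copy()  # .copy() avoids modifying a view of toy_clean
print(toy_analysis.shape)
```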

❓ Q3 — Visualizing the Data (6 pts, 5 sub-questions)

1. Histogram of `score`. (1 pt)

2. Boxplot of `score`. (1 pt)

3. Scatter plot of `score` vs `actmth` (with best-fit line). (1 pt)

4. Scatter plot of `score` vs `acteng` (with best-fit line). (1 pt)
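One possible way to produce the three plot types listed above is sketched here with plain matplotlib, fitting the best-fit line via `numpy.polyfit`. The DataFrame `toy` is invented (and deliberately exactly linear so the fit is easy to check); with the real data you would use `df_analysis` and repeat the scatter plot for `acteng`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Toy stand-in for df_analysis; values are invented and exactly linear
toy = pd.DataFrame({
    "score":  [60.0, 65.0, 70.0, 75.0, 80.0, 85.0],
    "actmth": [18, 20, 22, 24, 26, 28],
})

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

# 1) Histogram of score
axes[0].hist(toy["score"], bins=5, edgecolor="black")
axes[0].set(title="Histogram of score", xlabel="score", ylabel="count")

# 2) Boxplot of score
axes[1].boxplot(toy["score"])
axes[1].set(title="Boxplot of score", ylabel="score")

# 3) Scatter of score vs actmth with a least-squares line (degree-1 polynomial)
slope, intercept = np.polyfit(toy["actmth"], toy["score"], deg=1)
axes[2].scatter(toy["actmth"], toy["score"])
axes[2].plot(toy["actmth"], intercept + slope * toy["actmth"], color="red")
axes[2].set(title="score vs actmth", xlabel="actmth", ylabel="score")

fig.tight_layout()
```

Since the notebook already imports seaborn, `sns.regplot(x="actmth", y="score", data=toy)` is an alternative that draws the scatter and the fitted line in one call.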
