Lecture 4: OLS Estimator for Multiple Linear Regression (MLR) (Completed version)¶

Overview

  • Warm up: recall OLS in 2D (line through a scatter)

  • Extend to multiple regression: visualize as a plane in 3D

  • See how residuals look in 2D and 3D (red segments)

  • Slice the plane to understand partial effects (hold one variable fixed)


🌱 Warm-up – Connecting to Last Week¶

Recall: In simple linear regression (SLR) we had dots and a line.

  • What does the OLS line do? It is the line that best fits the scatter, where “best” means it minimizes the squared vertical distances between the data points and the line.

Formally, the SLR model is:

$$ y = \beta_0 + \beta_1 x + u $$

  • $\beta_0$: intercept
  • $\beta_1$: slope (effect of $x$ on $y$ in the model)
  • $u$: error term (everything else not captured by $x$)

OLS chooses $\hat\beta_0, \hat\beta_1$ to minimize the sum of squared errors (SSE):

$$ \min_{\beta_0, \beta_1} \sum_{i=1}^n \big(y_i - (\beta_0 + \beta_1 x_i)\big)^2 $$

⚠️ Caution: This is a statistical effect in the model, not automatically a causal effect — causality needs stronger assumptions.
Example: 🍕 Pizza consumption vs. group project grades → Teams that meet more often might eat more pizza and do better on projects, but the driver is time spent collaborating. Pizza itself isn’t boosting grades.


What happens if we have more regressors?
For today's class, we will focus on the case of two regressors. Do we still have dots and a line?

❌ Short answer: No!
✅ Instead: we have dots in 3D space and OLS fits a plane through them.

👉 Today’s goal: see that line → plane → slice gives us intuition for multiple regression.


📦 Required libraries¶

We’ll use a few standard Python libraries in this lab:

  • numpy : generate data and do calculations.
  • statsmodels : run OLS regressions.
  • matplotlib : make 2D and 3D plots.
In [ ]:
# Let's install and import the required libraries together!
!pip install numpy statsmodels matplotlib --quiet

# Core libraries
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

# For 3D plotting
from mpl_toolkits.mplot3d import Axes3D  

# For animations in Jupyter
from matplotlib.animation import FuncAnimation
from IPython.display import HTML

🎯 Tiny Toy Dataset (5 points) — used in all sections¶

We’ll use this small dataset for all the visuals in today’s lab:

$$ \begin{array}{c|c|c|c} \text{Individual }(i) & \text{Education }(x_{1i}) & \text{Experience }(x_{2i}) & \text{Wage }(y_i) \\\hline 1 & 10 & 1 & 20 \\ 2 & 12 & 4 & 24 \\ 3 & 14 & 5 & 27 \\ 4 & 16 & 9 & 31 \\ 5 & 18 & 7 & 34 \\ \end{array} $$


Big picture.

  • In simple linear regression (SLR) with one regressor (e.g., education), OLS fits a line through a scatter of points.
  • In multiple linear regression (MLR) with two regressors (education and experience), each observation is a point in 3D space, and OLS fits a plane.

Partial effect (what a coefficient means).

  • The coefficient on a regressor (say, education) is the slope of the plane in that direction, holding the other regressor fixed (experience).
  • A slice of the plane at a fixed value of the other variable is a line whose slope equals the partial effect.
In [ ]:
# Create the dataset

educ = np.array([10, 12, 14, 16, 18])   # x1
exper = np.array([ 1,  4,  5,  9,  7])   # x2
wage  = np.array([20, 24, 27, 31, 34])   # y

1) Simple OLS in 2D: draw a line through a scatter of points¶

  • Goal. Find a line
    $$ \hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i $$
    that best summarizes the relation between $y$ and $x$.

  • OLS idea. Choose $\hat\beta_0, \hat\beta_1$ to minimize the mean squared error (MSE); dividing by $n$ does not change the minimizers, so this is equivalent to minimizing the SSE from the warm-up:

    $$ \min_{\beta_0,\beta_1} \; \frac{1}{n}\sum_i \big(y_i - (\beta_0 + \beta_1 x_i)\big)^2 $$


  • Residual. For a point $(x_i,y_i)$, the residual (the sample counterpart of the error term $u_i$) is the red vertical segment between the actual point and the fitted line:

    $$ e_i = y_i - \hat y_i $$

    OLS makes these red segments as short as possible on average (squared).


  • In our toy dataset, let’s regress wage on education only:

    $$ \text{wage}_i = \beta_0 + \beta_1 \,\text{educ}_i + u_i $$

    where:

    • $\text{wage}_i$: hourly wage (in dollars) for person $i$
    • $\text{educ}_i$: years of education for person $i$
    • $u_i$: error term (everything else affecting wage not captured by education)

Here’s the subset of data we are using:

$$ \begin{array}{c|c|c} \text{Individual }(i) & \text{Education }(x_i) & \text{Wage }(y_i) \\\hline 1 & 10 & 20 \\ 2 & 12 & 24 \\ 3 & 14 & 27 \\ 4 & 16 & 31 \\ 5 & 18 & 34 \\ \end{array} $$

⚠️ Note: The full dataset also has other variables (like experience).
By running SLR, we are only incorporating part of the data.
Later, in MLR, we’ll see how to bring in those extra columns.

In [ ]:
# SLR: wage on education only

# Step 1. Raw scatter plot
plt.scatter(educ, wage)
plt.title("Toy dataset: Wage vs Education (raw scatter)")
plt.xlabel("Education"); plt.ylabel("Wage")
plt.show()
In [ ]:
# Step 2. Fit OLS line

# Add constant to X matrix (for intercept)
X_slr = sm.add_constant(educ)
# Fit the model
m_slr = sm.OLS(wage, X_slr).fit()
# Get coefficients
b0_slr, b1_slr = m_slr.params

# Step 3. Add the fitted line

# Create line for plotting
xs = np.linspace(educ.min()-0.5, educ.max()+0.5, 200) # synthetic x values for line: from educ.min()-0.5 to educ.max()+0.5, 200 points.
ys = b0_slr + b1_slr*xs # corresponding y values on line (based on estimated coefficients)

plt.scatter(educ, wage, label="Data")
plt.plot(xs, ys, linewidth=2, label="OLS line")

plt.title("SLR: Wage vs Education (with residuals)")
plt.xlabel("Education"); plt.ylabel("Wage"); plt.legend(); plt.show()
In [ ]:
# Step 4. Plot residuals (vertical red lines) on top of the graph we just made

# The graph we made above
plt.scatter(educ, wage, label="Data")
plt.plot(xs, ys, linewidth=2, label="OLS line")

# Predicted values and residuals
yhat_wage_slr = m_slr.predict(X_slr)
resid_slr = wage - yhat_wage_slr

mse_slr = np.mean(resid_slr**2)

# draw residuals
for i in range(len(educ)):
    plt.vlines(educ[i], yhat_wage_slr[i], wage[i], color="red", linewidth=1.5)

plt.title("SLR: Wage vs Education (with residuals)")
plt.xlabel("Education"); plt.ylabel("Wage"); plt.legend(); plt.show()
In [ ]:
# Let's inspect more closely the estimated coefficients and the MSE
print(f"Intercept (β0): {b0_slr:.2f}")
print(f"Slope (β1): {b1_slr:.2f}")
print(f"MSE: {mse_slr:.2f}")

📊 Interpretation of coefficients¶

  • Estimated OLS line:
    $$ \hat{y}_i = 2.70 + 1.75 \, \text{educ}_i $$

  • Intercept ($\hat\beta_0 = 2.70$): Predicted hourly wage when education = 0.
    (Not very meaningful in practice, but that’s the baseline.)

  • Slope ($\hat\beta_1 = 1.75$): Each extra year of education is associated with $1.75 higher hourly wage on average.

  • MSE (0.06): The model’s average squared error is very small, meaning the line fits these 5 data points almost perfectly.

Predicted values and residuals:

$$ \begin{array}{c|c|c|c|c} \text{Obs} & \text{Education }(x_i) & \text{Actual Wage }(y_i) & \text{Predicted Wage }(\hat{y}_i) & \text{Residual }(e_i) \\\hline 1 & 10 & 20 & 20.2 & -0.2 \\ 2 & 12 & 24 & 23.7 & \;\;0.3 \\ 3 & 14 & 27 & 27.2 & -0.2 \\ 4 & 16 & 31 & 30.7 & \;\;0.3 \\ 5 & 18 & 34 & 34.2 & -0.2 \\ \end{array} $$

These residuals ($e_i$) are exactly the red vertical lines drawn in the plot.
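If you want to reproduce this table from the fitted model, here is a minimal sketch reusing the yhat_wage_slr and resid_slr arrays computed in the residual-plot cell above:

In [ ]:
# Reproduce the predicted-value / residual table from the fitted SLR model
for i in range(len(educ)):
    print(f"Obs {i+1}: educ = {educ[i]:>2}, wage = {wage[i]:>2}, "
          f"predicted = {yhat_wage_slr[i]:5.1f}, residual = {resid_slr[i]:5.1f}")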


2) Multiple regression in 3D: fit a plane through points in space¶

  • Setup (intuition). In simple regression, each individual $i$ is a point $(x_i, y_i)$ in 2D (education, wage).
    With two regressors, each individual becomes a point in 3D space:

    $$ (x_{1i}, x_{2i}, y_i) = (\text{educ}_i,\ \text{exper}_i,\ \text{wage}_i) $$

    For example:

    • Individual 1: $(10,\ 1,\ 20)$ — one point in 3D.
    • Individual 2: $(12,\ 4,\ 24)$ — another point.

    Plotting all 5 individuals gives a cloud of dots in 3D.
    OLS chooses the plane (instead of a line) that best fits these dots by minimizing the sum of vertical (in the $y$ direction) squared distances.


Here is our toy dataset with two regressors (education, experience) and outcome (wage):

$$ \begin{array}{c|c|c|c} \text{Individual }(i) & \text{Education }(x_{1i}) & \text{Experience }(x_{2i}) & \text{Wage }(y_i) \\\hline 1 & 10 & 1 & 20 \\ 2 & 12 & 4 & 24 \\ 3 & 14 & 5 & 27 \\ 4 & 16 & 9 & 31 \\ 5 & 18 & 7 & 34 \\ \end{array} $$

  • Each row is one individual.
  • Each individual becomes a point in 3D space: $(\text{educ}_i, \text{exper}_i, \text{wage}_i)$.
In [ ]:
# Intuition: manually enter one point at a time with students
fig = plt.figure(figsize=(10,8))  # create a new figure
ax = fig.add_subplot(1, 1, 1, projection="3d")  # add a 3D subplot

# Individual  1
ax.scatter(10, 1, 20, color="royalblue", s=120, edgecolor="black")
ax.text(10+0.3, 1+0.3, 20+0.3, "Individual 1", color="royalblue")

# Individual  2
ax.scatter(12, 4, 24, color="seagreen", s=120, edgecolor="black")
ax.text(12+0.3, 4+0.3, 24+0.3, "Individual 2", color="seagreen")

# Individual  3
ax.scatter(14, 5, 27, color="darkorange", s=120, edgecolor="black")
ax.text(14+0.3, 5+0.3, 27+0.3, "Individual 3", color="darkorange")

# Individual  4
ax.scatter(16, 9, 31, color="purple", s=120, edgecolor="black")
ax.text(16+0.3, 9+0.3, 31+0.3, "Individual 4", color="purple")

# Individual  5
ax.scatter(18, 7, 34, color="crimson", s=120, edgecolor="black")
ax.text(18+0.3, 7+0.3, 34+0.3, "Individual 5", color="crimson")

# Labels and style
ax.set_xlabel("Education", labelpad=15)
ax.set_ylabel("Experience", labelpad=15)
ax.text(
    x=min(educ)-1, 
    y=min(exper)-1, 
    z=max(wage)+1,
    s="Wage", fontsize=11, rotation=90, color="black"
)

ax.view_init(elev=20, azim=120)

# Keep axis ranges consistent with padding
ax.set_xlim(min(educ)-1, max(educ)+1)
ax.set_ylim(min(exper)-1, max(exper)+1)
ax.set_zlim(min(wage)-1, max(wage)+1)

plt.subplots_adjust(left=0.1, right=0.95, top=0.9, bottom=0.1)
plt.show()
In [ ]:
# Auto-rotate the exact 3D graph you built
import numpy as np
from matplotlib.animation import FuncAnimation
from IPython.display import HTML

fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(1, 1, 1, projection="3d")

# --- Same points/labels as your static plot ---
ax.scatter(10, 1, 20, color="royalblue", s=120, edgecolor="black"); ax.text(10+0.3, 1+0.3, 20+0.3, "Individual 1", color="royalblue")
ax.scatter(12, 4, 24, color="seagreen", s=120, edgecolor="black");  ax.text(12+0.3, 4+0.3, 24+0.3, "Individual 2", color="seagreen")
ax.scatter(14, 5, 27, color="darkorange", s=120, edgecolor="black"); ax.text(14+0.3, 5+0.3, 27+0.3, "Individual 3", color="darkorange")
ax.scatter(16, 9, 31, color="purple", s=120, edgecolor="black");     ax.text(16+0.3, 9+0.3, 31+0.3, "Individual 4", color="purple")
ax.scatter(18, 7, 34, color="crimson", s=120, edgecolor="black");    ax.text(18+0.3, 7+0.3, 34+0.3, "Individual 5", color="crimson")

# Axes labels (with manual z-label placement)
ax.set_xlabel("Education", labelpad=15)
ax.set_ylabel("Experience", labelpad=15)
ax.text(x=min(educ)-1, y=min(exper)-1, z=max(wage)+1, s="Wage", fontsize=11, rotation=90, color="black")

# Ranges & layout exactly as before
ax.set_xlim(min(educ)-1, max(educ)+1)
ax.set_ylim(min(exper)-1, max(exper)+1)
ax.set_zlim(min(wage)-1, max(wage)+1)
ax.view_init(elev=20, azim=120)
plt.subplots_adjust(left=0.1, right=0.95, top=0.9, bottom=0.1)

# --- Animation: spin around azimuth ---
def update(angle):
    ax.view_init(elev=20, azim=angle)
    return ()

anim = FuncAnimation(fig, update, frames=np.linspace(0, 360, 181), interval=50, blit=False)

plt.close(fig) 
HTML(anim.to_jshtml())  # shows the animation inline in Jupyter

Now that we’ve placed all 5 individuals in 3D, notice something important:

  • In 2D (SLR), a line summarizes how $y$ changes with $x$.

  • In 3D (MLR with two regressors), a single line is no longer enough.

    Each point is $(\text{educ}_i, \text{exper}_i, \text{wage}_i)$ in 3D, and OLS fits a plane to summarize how wage depends on both regressors.

  • Each slope describes change along one axis:

    • $\hat\beta_1$: slope in the education direction (holding experience fixed).

    • $\hat\beta_2$: slope in the experience direction (holding education fixed).

  • If we slice the plane at a fixed level of one regressor, the slice is a line in 2D whose slope equals the partial effect of the other regressor.


OLS in 3D:

  • The fitted surface is:
    $$ \hat y_i = \hat\beta_0 + \hat\beta_1 \,\text{educ}_i + \hat\beta_2 \,\text{exper}_i $$
  • OLS chooses $(\hat\beta_0, \hat\beta_1, \hat\beta_2)$ to minimize the sum of squared vertical distances between actual points and the plane:
    $$ \min_{\beta_0,\beta_1,\beta_2} \sum_i \Big(y_i - (\beta_0 + \beta_1 \,\text{educ}_i + \beta_2 \,\text{exper}_i)\Big)^2 $$

👉 Same idea as before: in 2D OLS found the best-fitting line, in 3D it finds the best-fitting plane.
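Before running the statsmodels version in the next cell, here is an optional minimal sketch that solves the same least-squares problem directly with numpy (np.linalg.lstsq); it should reproduce the plane coefficients that statsmodels reports below.

In [ ]:
# Solve the plane-fitting problem directly with numpy (optional check)
# Columns: constant (intercept), education, experience
X_manual = np.column_stack([np.ones_like(educ, dtype=float), educ, exper])

# lstsq minimizes the sum of squared vertical distances between wage and X_manual @ beta
beta_hat, _, _, _ = np.linalg.lstsq(X_manual, wage, rcond=None)
print("(β̂0, β̂1, β̂2) =", np.round(beta_hat, 2))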

In [ ]:
# MLR: wage on education and experience (3D plane + residual segments)
X_mlr = sm.add_constant(np.column_stack([educ, exper]))  # add constant & combine regressors
m_mlr = sm.OLS(wage, X_mlr).fit()
b0, b1, b2 = m_mlr.params

fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(1, 1, 1, projection="3d")

# Data points
ax.scatter(educ, exper, wage, label="Data")

# Mesh for the fitted plane
E, K = np.meshgrid(
    np.linspace(educ.min()-0.5, educ.max()+0.5, 20),
    np.linspace(exper.min()-0.5, exper.max()+0.5, 20)
)
W = b0 + b1*E + b2*K
ax.plot_surface(E, K, W, alpha=0.35)

# Residual "vertical" segments
yhat3 = b0 + b1*educ + b2*exper
for i in range(len(educ)):
    ax.plot([educ[i], educ[i]], [exper[i], exper[i]], [yhat3[i], wage[i]], linewidth=1.5, color="red")

# Labels, ranges, view
ax.set_title("MLR: Fit a plane to 5 points (with residual segments)")
ax.set_xlabel("Education", labelpad=12)
ax.set_ylabel("Experience", labelpad=12)
ax.text(
    x=min(educ)-1, 
    y=min(exper)-1, 
    z=max(wage)+1,
    s="Wage", fontsize=11, rotation=90, color="black"
)
ax.set_xlim(educ.min()-1, educ.max()+1)
ax.set_ylim(exper.min()-1, exper.max()+1)
ax.set_zlim(wage.min()-1, wage.max()+1)
ax.view_init(elev=20, azim=120)

plt.show()

# Pretty print the estimated coefficients
print(f"Intercept (β̂0): {b0:.2f}")
print(f"Educ slope (β̂1): {b1:.2f}")
print(f"Exper slope (β̂2): {b2:.2f}")

📊 Interpretation of the MLR fit (5-point dataset)¶

  • Estimated OLS plane:
    $$ \hat{y}_i \;\approx\; 3.80 \;+\; 1.61\,\text{educ}_i \;+\; 0.16\,\text{exper}_i $$

  • Intercept ($\hat\beta_0 = 3.80$): Predicted wage when education = 0 and experience = 0 (baseline; not always meaningful).

  • Education slope ($\hat\beta_1 = 1.61$): Holding experience fixed, +1 year of education is associated with about $1.61 higher wage on average.

  • Experience slope ($\hat\beta_2 = 0.16$): Holding education fixed, +1 year of experience is associated with about $0.16 higher wage on average.

  • Residuals: The red vertical segments in the 3D plot show
    $$ e_i = y_i - \hat{y}_i $$ OLS chooses the plane to make these vertical gaps as small as possible in the squared sense.
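A quick numerical check of these residuals, and of how the fit compares with the SLR from Section 1, can be done with the coefficients just estimated (a sketch reusing b0, b1, b2, and the SLR mse_slr from earlier):

In [ ]:
# Residuals and MSE of the MLR fit, compared with the SLR fit
yhat_mlr = b0 + b1*educ + b2*exper      # fitted values on the plane
resid_mlr = wage - yhat_mlr             # vertical gaps (the red segments)
mse_mlr = np.mean(resid_mlr**2)

print("MLR residuals:", np.round(resid_mlr, 2))
print(f"MLR MSE: {mse_mlr:.2f}   (SLR MSE was {mse_slr:.2f})")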

In [ ]:
# Rotate the fitted plane + points to view from all angles (with ground projections)
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(1, 1, 1, projection="3d")

# Re-draw the same plot
ax.scatter(educ, exper, wage, label="Data")
E, K = np.meshgrid(
    np.linspace(educ.min()-0.5, educ.max()+0.5, 20),
    np.linspace(exper.min()-0.5, exper.max()+0.5, 20)
)
W = b0 + b1*E + b2*K
ax.plot_surface(E, K, W, alpha=0.35, shade=False)  # shade=False to avoid warnings

yhat3 = b0 + b1*educ + b2*exper
for i in range(len(educ)):
    ax.plot([educ[i], educ[i]], [exper[i], exper[i]], [yhat3[i], wage[i]], linewidth=1.5)

# --- Prediction at (educ*, exper*) and ground projections ---
educ_star  = 15    # <-- change live in class
exper_star = 6     # <-- change live in class
yhat_star  = b0 + b1*educ_star + b2*exper_star

# point on the plane
ax.scatter([educ_star], [exper_star], [yhat_star],
           s=120, color="crimson", edgecolor="black", label="Predicted ŷ on plane")

# choose a ground z (bottom of z-limits) for projections
z0 = wage.min() - 1

# vertical dotted line from ground up to the plane
ax.plot([educ_star, educ_star], [exper_star, exper_star],
        [z0, yhat_star], linestyle=":", linewidth=2, color="black")

# ground projection lines along axes (z = z0)
xmin, xmax = ax.get_xlim()
ymin, ymax = ax.get_ylim()

# from experience axis to (educ*, exper*) on ground (move along education at fixed exper=exper*)
ax.plot([xmax, educ_star], [exper_star, exper_star],
        [z0, z0], linestyle=":", linewidth=1.8, color="gray")

# from education axis to (educ*, exper*) on ground (move along experience at fixed educ=educ*)
ax.plot([educ_star, educ_star], [ymax, exper_star],
        [z0, z0], linestyle=":", linewidth=1.8, color="gray")

# optional: mark the ground base point
ax.scatter([educ_star], [exper_star], [z0], s=50, color="gray")

# annotate predicted value
ax.text(educ_star+0.4, exper_star+0.4, yhat_star+0.4,
        f"ŷ = {yhat_star:.2f}", color="crimson")

# Labels, ranges, view
ax.set_xlabel("Education", labelpad=12)
ax.set_ylabel("Experience", labelpad=12)
ax.set_zlabel("Wage", labelpad=18)
ax.set_xlim(educ.min()-1, educ.max()+1)
ax.set_ylim(exper.min()-1, exper.max()+1)
ax.set_zlim(wage.min()-1, wage.max()+1)
ax.view_init(elev=20, azim=120)

def update(angle):
    ax.view_init(elev=20, azim=angle)
    return ()

anim = FuncAnimation(fig, update, frames=np.linspace(0, 360, 181), interval=50, blit=False)

plt.close(fig)  # hide static frame
HTML(anim.to_jshtml())

🔮 Reading a prediction from the regression plane¶

Suppose we want the predicted wage for an individual with

  • Education = 15 years
  • Experience = 6 years.

OLS gives us a predicted value:
$$ \hat y^* = \hat\beta_0 + \hat\beta_1 \cdot 15 + \hat\beta_2 \cdot 6 . $$

In the graph:

  • The red point is $(15,\;6,\;\hat y^*)$ sitting on the regression plane.
  • The black dotted line shows how we project up from the ground to the plane to find $\hat y^*$.
  • The gray dotted lines on the ground connect back to the education and experience axes, so you can read the input values directly from the axis ticks.

👉 This shows how the regression plane lets us compute and visualize predicted outcomes for any combination of regressors.
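As a quick check, we can compute $\hat y^*$ both by plugging into the fitted plane and with statsmodels' predict method; a minimal sketch (note that the row passed to predict must include the constant, matching the columns of X_mlr):

In [ ]:
# Predicted wage at (educ = 15, exper = 6), computed two ways
educ_star, exper_star = 15, 6

# (a) plug directly into the fitted plane
yhat_star = b0 + b1*educ_star + b2*exper_star

# (b) use statsmodels' predict; the row is [constant, educ, exper]
yhat_star_sm = m_mlr.predict([[1, educ_star, exper_star]])[0]

print(f"ŷ* by formula:    {yhat_star:.2f}")
print(f"ŷ* by .predict(): {yhat_star_sm:.2f}")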


3) Slicing the plane ⇒ a line: reading a partial effect¶

  • Holding constant.
    To isolate the effect of education, we “slice” the regression plane at a fixed level of experience (in the code below, 6 years).

  • What we see.
    That slice is a line in the (education, wage) plane.
    Its slope equals the estimated coefficient on education, $\hat\beta_1$.

  • Interpretation.

    If education increases by 1 year, while holding experience fixed,
    wage is predicted to change by $\hat\beta_1$ on average.

👉 This is what we mean by a partial effect: the slope of the plane in one direction when the other regressor is kept constant.

In [ ]:
# Slice the plane at exper = 6, and rotate to view from all angles

fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(1, 1, 1, projection="3d")

# Data and fitted plane
ax.scatter(educ, exper, wage, label="Data")
E, K = np.meshgrid(
    np.linspace(educ.min()-0.5, educ.max()+0.5, 20),
    np.linspace(exper.min()-0.5, exper.max()+0.5, 20)
)
W = b0 + b1*E + b2*K
ax.plot_surface(E, K, W, alpha=0.35, shade=False)

# --- Slice settings ---
exper_slice = 6.0  # ← choose the fixed experience level for the slice (e.g., median or 6)
z0, z1 = wage.min() - 1, wage.max() + 1
x0, x1 = educ.min() - 1, educ.max() + 1

# Draw a translucent vertical panel at exper = exper_slice (to visualize "holding exper fixed")
Xp, Zp = np.meshgrid(np.linspace(x0, x1, 2), np.linspace(z0, z1, 2))
Yp = np.full_like(Xp, exper_slice)
ax.plot_surface(Xp, Yp, Zp, alpha=0.10, color="gray", edgecolor="none")  # slice panel

# Intersection line: where the OLS plane meets the slice panel
x_line = np.linspace(educ.min()-0.5, educ.max()+0.5, 200)
y_line = np.full_like(x_line, exper_slice)
z_line = b0 + b1*x_line + b2*exper_slice
ax.plot(x_line, y_line, z_line, linewidth=3, color="crimson", label=f"Slice @ exper={exper_slice:g}")

# Optional: label the slice line
ax.text(x_line[-1], exper_slice, z_line[-1], "  slice (partial effect of educ)", color="crimson")

# Residuals (keep or comment out if you want cleaner view)
yhat3 = b0 + b1*educ + b2*exper
for i in range(len(educ)):
    ax.plot([educ[i], educ[i]], [exper[i], exper[i]], [yhat3[i], wage[i]], linewidth=1.2, color="black")

# Axes, limits, view
ax.set_xlabel("Education", labelpad=12)
ax.set_ylabel("Experience", labelpad=12)
ax.set_zlabel("Wage", labelpad=18)
ax.set_xlim(educ.min()-1, educ.max()+1)
ax.set_ylim(exper.min()-1, exper.max()+1)
ax.set_zlim(wage.min()-1, wage.max()+1)
ax.view_init(elev=20, azim=120)

def update(angle):
    ax.view_init(elev=20, azim=angle)
    return ()

anim = FuncAnimation(fig, update, frames=np.linspace(0, 360, 181), interval=50, blit=False)

plt.close(fig)  # hide static frame
HTML(anim.to_jshtml())
In [ ]:
# Slice the plane at a fixed experience level, show the implied line
exper_slice = 6.0
xs2 = np.linspace(educ.min()-0.5, educ.max()+0.5, 200)
w_slice = b0 + b1*xs2 + b2*exper_slice

plt.scatter(educ, wage, label="Data", color="blue")
plt.plot(xs2, w_slice, color="crimson", linewidth=2,
         label=f"Slice @ exper = {exper_slice:g}")

plt.title("Slice of the plane ⇒ line (partial effect of education)")
plt.xlabel("Education")
plt.ylabel("Wage")
plt.legend()
plt.show()

print("Partial effect of education (β̂1) =", round(b1, 6))

💡 Key point:

  • The slope with respect to education ($\hat\beta_1$) is the same across all slices.
  • The intercept of each slice changes with the fixed value of experience (shifted by $\hat\beta_2 \times \text{exper}$).

👉 So when we change the slice, we move the line up/down, but the slope stays constant.
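To see this concretely, here is a small sketch that draws two slices at illustrative experience levels (3 and 8 years): both lines have slope $\hat\beta_1$, so they are parallel, and only the intercept $\hat\beta_0 + \hat\beta_2\,\text{exper}$ shifts.

In [ ]:
# Two slices of the plane at different experience levels: parallel lines
xs3 = np.linspace(educ.min()-0.5, educ.max()+0.5, 200)

for exper_fixed in (3.0, 8.0):                  # illustrative experience levels
    intercept = b0 + b2*exper_fixed             # slice intercept shifts with exper
    plt.plot(xs3, intercept + b1*xs3,
             label=f"Slice @ exper = {exper_fixed:g} (slope = {b1:.2f})")

plt.scatter(educ, wage, color="blue", label="Data")
plt.title("Parallel slices: same slope, different intercepts")
plt.xlabel("Education"); plt.ylabel("Wage"); plt.legend(); plt.show()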


🔎 Comparing MLR vs SLR¶

SLR (wage ~ educ): $$ \hat{y}_i \;\approx\; 2.70 \;+\; 1.75 \,\text{educ}_i $$

  • Slope ($\hat\beta_1 = 1.75$): each extra year of education is associated with $1.75 higher wage on average.
  • But ⚠️ this ignores experience, so part of experience’s effect is bundled into education’s slope.

MLR (wage ~ educ + exper): $$ \hat{y}_i \;\approx\; 3.80 \;+\; 1.61 \,\text{educ}_i \;+\; 0.16 \,\text{exper}_i $$

  • Education effect drops to $1.61 once we account for experience.
  • Experience itself has a small positive slope ($0.16 per year).

Takeaway:

  • In SLR, the slope on education was a bit too high because experience was omitted and is positively correlated with education.
  • In MLR, the regression “splits” the variation correctly: education still matters a lot, but some of the wage differences are explained by experience.
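A quick numerical check of this bundling uses the standard omitted-variable algebra for the two-regressor case: the SLR slope on education equals the MLR slope on education plus the MLR slope on experience times $\hat\delta_1$, the slope from regressing experience on education. A minimal sketch:

In [ ]:
# Omitted-variable check: SLR slope = MLR educ slope + MLR exper slope × δ̂1,
# where δ̂1 is the slope from regressing exper on educ.
delta_hat = sm.OLS(exper, sm.add_constant(educ)).fit().params[1]

print(f"SLR slope on educ:           {b1_slr:.2f}")
print(f"MLR slope + spillover term:  {b1 + b2*delta_hat:.2f}")
print(f"(δ̂1 = slope of exper on educ = {delta_hat:.2f})")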

References & Acknowledgments¶

  • This teaching material was prepared with the assistance of OpenAI's ChatGPT (GPT-5).

End of lecture notebook.