ECON 320 Lab Exercise: Week 1 --- Does a Higher Average Mean Better Teaching? 🎓¶

  • Name : [Your Name]

  • Lab Section: [Your Lab Section Here]

  • Code for week 1's attendance: [Attendance code here]

Please submit the exercise on Canvas as an HTML or PDF file.¶


Background¶

  • You’re given test scores from two lab sections, A and B, taught by instructors Alice and Bob, respectively.

  • Students come in with different prep levels (“High” vs “Low”).

Instructions¶

  • Before answering the questions below, run the code cell to generate the dataset.

  • The cell below creates a sample dataset for you to work with. Run the cell to generate the data, BUT DO NOT MODIFY IT.

In [ ]:
# **************** You don't need to modify anything in this cell.*****************
# **************** Just run it to generate the dataset. *****************

import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

n_A, n_B = 120, 120

prep_A = rng.choice(["High", "Low"], size=n_A, p=[0.8, 0.2])
prep_B = rng.choice(["High", "Low"], size=n_B, p=[0.3, 0.7])

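def gen_scores(prep, mean_high_A=85, mean_low_A=72, mean_high_B=88, mean_low_B=74, sd=6, instructor="Alice"):
    # Pick each student's expected score from their prep level and instructor,
    # then draw the actual score from a normal distribution around that mean.
    means = np.where(prep=="High",
                     mean_high_A if instructor=="Alice" else mean_high_B,
                     mean_low_A if instructor=="Alice" else mean_low_B)
    return rng.normal(loc=means, scale=sd, size=len(prep))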

scores_A = gen_scores(prep_A, instructor="Alice")
scores_B = gen_scores(prep_B, instructor="Bob")

df = pd.DataFrame({
    "instructor": np.r_[np.repeat("Alice", n_A), np.repeat("Bob", n_B)],
    "prep":    np.r_[prep_A, prep_B],
    "score":   np.r_[scores_A, scores_B]
})
  • Take a look at the dataset by running the following code cell.
In [ ]:
# **************** You don't need to modify anything in this cell.*****************
# **************** Just run it to preview the first five rows of the dataset. *****************

df.head(5)

Task (15-20 minutes)¶

Q1. Find the number of students in each lab section (A and B)¶

Your task is to calculate how many students are in each lab section (A and B). Store the results in the variables n_A and n_B.

💡 Hints:

  • You can filter the dataframe by instructor, e.g.:
    df[df["instructor"] == "Alice"]
    df[df["instructor"] == "Bob"]
    
  • After filtering, you can count the rows using any of these:
    • len(df_filtered) → counts rows directly
    • df_filtered.shape[0] → returns the number of rows
    • df_filtered["score"].count() → counts non-missing scores only
  • Pick whichever method you prefer!
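
To illustrate the counting methods in the hints, here is a minimal sketch on a small made-up DataFrame. The name toy and its values are hypothetical and are not the lab data. It shows that len() and .shape[0] count every row, while .count() skips missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not the lab dataset); one score is missing on purpose.
toy = pd.DataFrame({
    "instructor": ["Alice", "Alice", "Bob", "Bob", "Bob"],
    "score": [90.0, 80.0, 70.0, np.nan, 60.0],
})

# Keep only the rows where the instructor is Bob.
toy_bob = toy[toy["instructor"] == "Bob"]

rows_len = len(toy_bob)                     # counts all rows
rows_shape = toy_bob.shape[0]               # same result as len()
rows_nonmissing = toy_bob["score"].count()  # ignores the NaN score

print(rows_len, rows_shape, rows_nonmissing)  # prints: 3 3 2
```

The three methods agree whenever the column has no missing values.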
In [ ]:
# Put your answer for Q1 here:





# Print the answers (NO NEED TO MODIFY)
print("The number of students in lab A is:", n_A)
print("The number of students in lab B is:", n_B)

You should expect to see the following output after running the code cell above:

The number of students in lab A is: 120
The number of students in lab B is: 120

Q2. Compute the mean test score for each lab section (A and B)¶

Using the dataset provided, compute the mean test score separately for:

  • Instructor Alice → Lab Section A
  • Instructor Bob → Lab Section B

💡 Hint:
Use the groupby() function in combination with mean() in pandas.
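
As a rough sketch of how groupby() and mean() fit together, the example below uses a small made-up DataFrame. The names toy and toy_means and all values are hypothetical, not the lab data.

```python
import pandas as pd

# Hypothetical toy data (not the lab dataset).
toy = pd.DataFrame({
    "instructor": ["Alice", "Alice", "Bob", "Bob"],
    "score": [90.0, 80.0, 70.0, 60.0],
})

# Split the rows by instructor, select the score column,
# then average the scores within each group.
toy_means = toy.groupby("instructor")["score"].mean()
print(toy_means)

# The result is a Series indexed by group label, so one mean can be looked up by name.
alice_mean = toy_means["Alice"]
print(alice_mean)  # prints: 85.0
```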

In [ ]:
# Put your answer for Q2 here:

Q3. Calculate the average test score by prep level and instructor¶

Now, let’s find the mean test score for each prep level (High vs. Low) within each instructor’s lab section.

💡 Hint:
Use the groupby() function in combination with mean() in pandas, for example:

df.groupby(["prep", "instructor"])["score"].mean().reset_index()
  • The first argument ["prep", "instructor"] groups the data by both columns.
  • The ["score"].mean() part calculates the average score for each group.
  • The reset_index() part is optional but helps to convert the result back to a DataFrame.
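
To show what reset_index() changes, here is a minimal sketch on made-up data. The names toy, grouped, flat, and wide and all values are hypothetical. The unstack() step is an optional extra, not part of the hint, that lays the same means out as a prep-by-instructor table.

```python
import pandas as pd

# Hypothetical toy data (not the lab dataset): two students per group.
toy = pd.DataFrame({
    "instructor": ["Alice"] * 4 + ["Bob"] * 4,
    "prep": ["High", "High", "Low", "Low"] * 2,
    "score": [90.0, 80.0, 70.0, 60.0, 95.0, 85.0, 75.0, 65.0],
})

# Grouping by two columns returns a Series indexed by (prep, instructor) pairs.
grouped = toy.groupby(["prep", "instructor"])["score"].mean()

# reset_index() turns the index levels back into ordinary columns.
flat = grouped.reset_index()
print(flat)

# Optional: pivot the instructor level into columns for side-by-side comparison.
wide = grouped.unstack("instructor")
print(wide)
```

The wide layout puts each instructor in its own column, which makes it easy to compare them within the same prep level.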
In [ ]:
# Put your answer for Q3 here:

Q4. Count how many high-prep students are in each instructor's lab section¶

Let's see if the number of high-prep students differs between Instructor Alice's and Instructor Bob's lab sections.

Count the number of students with prep == "High" in each lab section.

💡 Hint:
Use groupby() with a filter:

df[df["prep"] == "High"].groupby("instructor")["prep"].count().reset_index()

This filters for students with prep == "High" and counts them by instructor.
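
As a sketch of the filter-then-count pattern, the example below runs it on made-up data and compares it with pd.crosstab, which is an alternative not mentioned in the hint. The names toy, high_counts, table, and shares and all values are hypothetical.

```python
import pandas as pd

# Hypothetical toy data (not the lab dataset).
toy = pd.DataFrame({
    "instructor": ["Alice", "Alice", "Alice", "Bob", "Bob"],
    "prep": ["High", "High", "Low", "High", "Low"],
})

# Filter to high-prep rows first, then count the remaining rows per instructor.
high_counts = toy[toy["prep"] == "High"].groupby("instructor")["prep"].count()
print(high_counts)

# Alternative: cross-tabulate instructor against prep to see every count at once.
table = pd.crosstab(toy["instructor"], toy["prep"])
print(table)

# Row proportions: the share of each prep level within each instructor's section.
shares = pd.crosstab(toy["instructor"], toy["prep"], normalize="index")
print(shares)
```

One difference worth noting: with the filter approach, an instructor with no high-prep students would be missing from the result, whereas the cross-tabulation would show a 0 for that instructor.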

In [ ]:
# Put your answer for Q4 here:

Q5. From the results above, do you think Instructor Alice is a better teacher than Instructor Bob? Why or why not?¶

Put your answer for Q5 here:¶

End of Lab Exercise.