Lecture 6 — Reporting Guidelines & Exam Preparation

Author

Johnny van Doorn

Published

June 17, 2026

By the end of this lecture, you will be able to:

Apply the van Doorn et al. (2021) reporting checklist to a Bayesian analysis
Use calibrated language when describing BFs and posteriors
Identify common errors in written Bayesian reports
Demonstrate the full analysis pipeline on exam-style questions

Reading: van Doorn et al. (2021), Psychonomic Bulletin & Review. doi: 10.3758/s13423-020-01798-5

Exam: Friday, June 19, 09:00–11:00, IWO 4.04C (Blauw). Open-book ANS exam with R (no internet); the textbook is provided as a PDF and you may bring one A4 cheat sheet (both sides). Covers all lecture material (L1–L5).

Four Common Exam Mistakes

Mistake 1: Misreading the Credible Interval

Wrong: “There is a 95% probability that the true \(\theta\) is between 0.51 and 0.78 in repeated sampling.”

Correct: “Given the data and the prior, there is a 95% probability that \(\theta\) is between 0.51 and 0.78.”

The frequentist “repeated sampling” clause belongs to the confidence interval, not the Bayesian credible interval.

Mistake 2: BF = Posterior Probability

Wrong: “\(BF_{10} = 9\) means there is a 90% chance that \(H_1\) is true.”

Correct: “\(BF_{10} = 9\) means the data are 9 times more likely under \(H_1\) than under \(H_0\). Converting to posterior probability requires specifying the prior odds.”

\[P(H_1 \mid y) = \frac{BF_{10} \times P(H_1)}{BF_{10} \times P(H_1) + P(H_0)}\]
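The formula above can be sketched in R to show how strongly the answer depends on the prior. This is a minimal illustration; the helper name `posterior_prob_h1` and the three prior probabilities are assumptions chosen for the example, not part of the lecture material.

```r
# Convert a Bayes factor into a posterior probability for H1.
# posterior_prob_h1() is a hypothetical helper name used only for illustration.
posterior_prob_h1 <- function(bf10, prior_h1) {
  prior_h0 <- 1 - prior_h1
  (bf10 * prior_h1) / (bf10 * prior_h1 + prior_h0)
}

prior_h1 <- c(0.1, 0.5, 0.9)   # illustrative prior probabilities for H1
post_h1  <- posterior_prob_h1(bf10 = 9, prior_h1 = prior_h1)
print(round(post_h1, 3))
```

Only with equal prior probabilities does \(BF_{10} = 9\) correspond to a posterior probability of 0.90; with \(P(H_1) = 0.1\) the same Bayes factor gives 0.50.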

Mistake 3: No Evidence ≠ Evidence for the Null

Wrong: “\(BF_{10} = 1.2\), there is no effect.”

Correct: “\(BF_{10} = 1.2\) is anecdotal. The data were uninformative: they did not substantially shift beliefs in either direction.”

Contrast with \(BF_{01} = 14\): this is strong evidence for \(H_0\).

Mistake 4: Using the CI to Test a Hypothesis

Wrong: “The 95% CI excludes 0.5, so \(H_0: \theta = 0.5\) is rejected.”

Correct: Credible intervals are for estimation. Use the Bayes factor for hypothesis testing. The CI and the BF answer different questions.

Question	Tool
Does an effect exist?	Bayes factor
How large is the effect?	Posterior + CI
What data should we expect next?	Posterior predictive
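To make the division of labour concrete, the sketch below computes both tools from the same posterior. The data (28 successes in 40 trials) and the uniform Beta(1, 1) prior are made up for illustration; the test of \(H_0: \theta = 0.5\) uses the Savage-Dickey density ratio, assuming the Beta(1, 1) prior is also the prior under \(H_1\).

```r
# Hypothetical data: 28 successes in 40 trials, uniform Beta(1, 1) prior on theta.
a0 <- 1; b0 <- 1; y <- 28; n <- 40
a1 <- a0 + y; b1 <- b0 + n - y            # conjugate posterior Beta(29, 13)

# Estimation question: how large is theta? -> 95% equal-tailed credible interval
ci <- qbeta(c(0.025, 0.975), a1, b1)

# Testing question: is theta = 0.5? -> Savage-Dickey density ratio at 0.5
bf01 <- dbeta(0.5, a1, b1) / dbeta(0.5, a0, b0)
bf10 <- 1 / bf01

cat("95% CI:", round(ci, 3), "\n")
cat("BF10  :", round(bf10, 2), "\n")
```

The interval is reported as an estimate of \(\theta\); the claim about \(H_0\) rests on the Bayes factor alone.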

Error-Spotting Exercises

Report 1: Mindfulness & Memory

We tested a mindfulness intervention on working memory (\(n = 15\)). \(BF_{10} = 4.3\), so we conclude the null is false. The CI [39.8, 42.6] proves the intervention is effective. No robustness analysis was needed since results are clear.

Find 3 errors in this report.

“Null is false”: \(BF_{10} = 4.3\) is moderate evidence; it cannot falsify \(H_0\). Report as “moderate evidence in favour of \(H_1\).”
“CI proves effectiveness”: credible intervals summarise uncertainty, they cannot “prove” anything. We would also need to compare the CI to a meaningful null value.
“No robustness needed”: robustness checks are especially important when making strong claims, not an optional extra.

Report 2: Facial Feedback

One-sided Bayesian \(t\)-test (\(n_1 = 53\), \(n_2 = 57\)). Informed prior: \(t(0.35, 0.102, 3)\). Result: \(BF_{0-} = 11.5\). This means \(H_0\) is 11.5× more probable than \(H_-\).

Find 1 error in this report.

“\(H_0\) is 11.5× more probable”: \(BF_{0-}\) is a likelihood ratio, not a probability ratio. It says the data are 11.5× more likely under \(H_0\) than under \(H_-\). Converting to a ratio of posterior probabilities requires specifying the prior odds.

Mock Exam 1

A full practice paper (16.5 pts) in the style of last year’s exam. It is deliberately longer than the real exam (10 pts, 3 questions) to give you more to work through, and it covers Lectures 3–5 plus a model-checking simulation. (Lectures 1–2 are covered in Mock Exam 2 below.)

Q1–Q3 are by-hand computational questions; Q4 is a with-R simulation question.

Attempt each question under exam conditions before opening its solution.

Mock Q1: Conjugacy & Prior Choice (4.5 pts)

(Lecture 3 / Ch. 5; cf. Beta–Binomial exercises)

A food-delivery app wants to estimate \(\theta\), the proportion of customers who tip their driver. Based on a pilot study, they adopt a \(\text{Beta}(5, 5)\) prior. In the new cohort, 42 out of 60 customers tip.

a. [1] Identify the data model and conjugate prior family. Write the posterior update rule and state the resulting posterior.

b. [1] Compute the prior mean, observed proportion, and posterior mean. Explain the influence of the prior vs data.

c. [0.75] Compute the 95% equal-tailed credible interval and interpret it.

d. [0.75] Compute \(P(\theta > 0.70 \mid y)\).

e. [1] (Describe, max 90 words.) Describe how the posterior has shifted relative to the prior — in terms of its mean, mode, and standard deviation, and explain what this shift indicates about how the data updated your beliefs about \(\theta\).

Attempt before looking at the solution.

a. \(Y \mid \theta \sim \text{Binomial}(60, \theta)\); conjugate prior \(\theta \sim \text{Beta}(\alpha,\beta)\). Update: \(\theta \mid y \sim \text{Beta}(\alpha + y,\ \beta + n - y)\).

Code

alpha <- 5; beta <- 5; y <- 42; n <- 60
ap <- alpha + y; bp <- beta + n - y
cat("Posterior: Beta(", ap, ",", bp, ")\n")

Posterior: Beta( 47 , 23 )

Code

cat("Prior mean:     ", round(alpha/(alpha+beta), 3), "\n")

Prior mean:      0.5

Code

cat("Observed prop:  ", round(y/n, 3), "\n")

Observed prop:   0.7

Code

cat("Posterior mean: ", round(ap/(ap+bp), 3), "\n")

Posterior mean:  0.671

Code

cat("Prior ESS:      ", alpha + beta, "\n")

Prior ESS:       10

The posterior mean (0.671) is a weighted average of the prior mean (0.50) and the observed proportion (0.70). With prior ESS = 10 vs n = 60, the data carry 6× the weight of the prior (60/70 vs 10/70), so the posterior sits close to the observed proportion \(y/n\), only slightly pulled toward the prior.
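The weighted-average reading can be checked numerically. In this short sketch the variable names `w_prior` and `w_data` are illustrative; the weights are the prior ESS and the sample size, each divided by their sum.

```r
alpha <- 5; beta <- 5; y <- 42; n <- 60

w_prior <- (alpha + beta) / (alpha + beta + n)   # prior ESS / (prior ESS + n)
w_data  <- n / (alpha + beta + n)                # n / (prior ESS + n)

weighted <- w_prior * alpha / (alpha + beta) + w_data * y / n
direct   <- (alpha + y) / (alpha + beta + n)     # mean of Beta(47, 23)

cat("Weights (prior, data):", round(c(w_prior, w_data), 3), "\n")
cat("Weighted average:", round(weighted, 3), "  Direct:", round(direct, 3), "\n")
```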

Code

round(qbeta(c(0.025, 0.975), ap, bp), 3)

[1] 0.558 0.776

Given the data and prior, there is a 95% probability that \(\theta\) lies in this interval.

Code

pbeta(0.70, ap, bp, lower.tail = FALSE)

[1] 0.3134553

About a 31% posterior probability that more than 70% of customers tip.

Code

prior_mode <- (alpha - 1) / (alpha + beta - 2)
post_mode  <- (ap - 1) / (ap + bp - 2)
prior_sd <- sqrt(alpha*beta / ((alpha+beta)^2 * (alpha+beta+1)))
post_sd  <- sqrt(ap*bp / ((ap+bp)^2 * (ap+bp+1)))
cat("Prior:     mean", round(alpha/(alpha+beta),3), " mode", round(prior_mode,3), " sd", round(prior_sd,3), "\n")

Prior:     mean 0.5  mode 0.5  sd 0.151

Code

cat("Posterior: mean", round(ap/(ap+bp),3),         " mode", round(post_mode,3),  " sd", round(post_sd,3),  "\n")

Posterior: mean 0.671  mode 0.676  sd 0.056

The posterior has shifted upward: both the mean (0.50 → 0.671) and the mode (0.50 → 0.676) move about 0.17 toward the observed proportion of 0.70. It has also tightened sharply, with the SD falling roughly threefold (0.151 → 0.056). The 60 observations have refined a fairly diffuse, symmetric Beta(5, 5) prior into a much more precise, slightly right-leaning posterior: our belief about the tipping rate has moved from “around half, very uncertain” to “about two-thirds, fairly precisely.”

Marking: upward shift in mean and mode (0.25); reduced SD / increased precision (0.25); state the actual prior and posterior values (0.25); interpret the shift as the data sharpening a diffuse prior toward the observed proportion (0.25).

Mock Q2: Grid Approximation & MCMC (3 pts)

(Lecture 4 / Ch. 6–7; cf. grid + Metropolis exercises)

A researcher models \(\theta\), the probability of a new drug being effective. They use a \(\text{Beta}(2, 2)\) prior truncated to \([0.3, 0.7]\). In a trial of 20 patients, 12 show improvement.

a. [0.5] Explain why this truncated prior has no closed-form conjugate posterior.

b. [1] The researcher writes the following recipe on the whiteboard:

Build a fine grid \(\theta_1, \ldots, \theta_K\) on \([0, 1]\).

At each \(\theta_i\), compute \(u_i = f(\theta_i)\, L(y \mid \theta_i)\) using the truncated Beta prior and Binomial likelihood.

Normalise: \(w_i = u_i / \sum_k u_k\).

Report posterior summaries as weighted averages over the grid.

(i) [0.5] Which numerical method does this describe? Pick one and justify in one sentence: (A) Metropolis–Hastings MCMC · (B) Grid approximation · (C) Conjugate Beta–Binomial update · (D) Posterior-predictive Monte Carlo.
(ii) [0.5] Applying the recipe gives a posterior mean of \(\bar\theta \approx 0.562\). The untruncated conjugate posterior \(\text{Beta}(2 + 12,\; 2 + 8) = \text{Beta}(14, 10)\) has mean \(\approx 0.583\). What does the comparison tell you about the truncation here?

c. [1] A random-walk Metropolis sampler on the untruncated model is at \(\theta_c = 0.55\) and proposes \(\theta_p = 0.60\). Using the unnormalised posterior, compute the Metropolis acceptance probability \(\alpha\). Is the move accepted with certainty?

d. [0.5] You rerun the sampler with three step sizes. One trace plot is a flat caterpillar around the mode; one drifts slowly and never settles; one gets “stuck” on long plateaus. Match each pattern to the diagnosis: good mixing, step size too small, step size too large.

Attempt before looking at the solution.

a. The Beta–Binomial conjugate update needs a proper Beta prior on \([0, 1]\). Truncating to \([0.3, 0.7]\) takes the prior out of the Beta family, so the posterior is no longer Beta and has no closed form.

b. (i) (B) Grid approximation. No accept/reject step (so not MCMC), no analytical update (so not the conjugate Beta–Binomial), and the recipe produces normalised posterior weights for \(\theta\) on a grid rather than simulated predictions for future \(y\) (so not posterior-predictive Monte Carlo).

(ii) The two means differ by about 0.02 (0.562 vs 0.583). With only \(n = 20\), the conjugate posterior \(\text{Beta}(14, 10)\) still has noticeable mass above 0.7, which the truncation cuts off, and that pulls the truncated posterior mean slightly downward. The truncation is doing real work here, unlike in cases with larger \(n\) where the data already concentrate \(\theta\) well inside the truncation bounds.
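The whiteboard recipe from part (b) can be sketched in a few lines of R. The grid size of 10,001 points is an arbitrary choice, and the truncated prior is left unnormalised here on the assumption that its constant cancels in the normalisation step.

```r
y <- 12; n <- 20

# Step 1: fine grid on [0, 1] (grid size is an arbitrary choice)
theta <- seq(0, 1, length.out = 10001)

# Step 2: unnormalised posterior = truncated Beta(2, 2) prior x Binomial likelihood
prior <- ifelse(theta >= 0.3 & theta <= 0.7, dbeta(theta, 2, 2), 0)
u     <- prior * dbinom(y, n, theta)

# Step 3: normalise to weights that sum to 1
w <- u / sum(u)

# Step 4: posterior summaries as weighted averages over the grid
post_mean <- sum(theta * w)
cat("Grid posterior mean:", round(post_mean, 3), "\n")
cat("Untruncated Beta(14, 10) mean:", round(14 / 24, 3), "\n")
```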

Code

y <- 12; n <- 20
logpost <- function(t) dbeta(t, 2, 2, log = TRUE) + dbinom(y, n, t, log = TRUE)
cat("Acceptance probability:", round(min(1, exp(logpost(0.60) - logpost(0.55))), 3), "\n")

Acceptance probability: 1

\(\alpha = 1\): the proposed value \(\theta_p = 0.60\) is more plausible than the current \(\theta_c = 0.55\) (the MLE sits exactly at \(12/20 = 0.60\)), so the unnormalised posterior ratio exceeds 1 and the move is accepted with certainty.

d. Flat caterpillar around the mode = good mixing. Slow drift that never settles = step size too small (high autocorrelation, the chain barely explores). Long stuck plateaus = step size too large (proposals overshoot into low-probability regions and are repeatedly rejected).
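The three trace-plot patterns can be reproduced with a small random-walk Metropolis sampler on the untruncated model. This is a sketch under stated assumptions: the function name `rw_metropolis`, the three step sizes, the starting value 0.5, and the chain length of 5,000 are all illustrative choices, not the settings used in the lecture.

```r
set.seed(1)
y <- 12; n <- 20

# Unnormalised log posterior for the untruncated Beta(2, 2) prior.
# Values outside (0, 1) get log density -Inf, so such proposals are always rejected.
logpost <- function(t) {
  if (t <= 0 || t >= 1) return(-Inf)
  dbeta(t, 2, 2, log = TRUE) + dbinom(y, n, t, log = TRUE)
}

# Hypothetical helper: random-walk Metropolis with a Normal proposal
rw_metropolis <- function(n_iter, step_sd, init = 0.5) {
  chain <- numeric(n_iter)
  chain[1] <- init
  n_accept <- 0
  for (i in 2:n_iter) {
    proposal  <- rnorm(1, mean = chain[i - 1], sd = step_sd)
    log_alpha <- logpost(proposal) - logpost(chain[i - 1])
    if (log(runif(1)) < log_alpha) {      # accept with probability min(1, ratio)
      chain[i] <- proposal
      n_accept <- n_accept + 1
    } else {                              # reject: stay at the current value
      chain[i] <- chain[i - 1]
    }
  }
  list(chain = chain, accept_rate = n_accept / (n_iter - 1))
}

step_sds <- c(small = 0.005, medium = 0.1, large = 2)   # illustrative step sizes
runs <- lapply(step_sds, function(s) rw_metropolis(5000, s))
accept_rates <- sapply(runs, function(r) r$accept_rate)
print(round(accept_rates, 2))
```

Plotting each chain, for example with `plot(runs$small$chain, type = "l")`, should show the slow drift for the small step, and long flat plateaus for the large step, where almost every proposal is rejected.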

Mock Q3: Inference, Prediction & Testing (5 pts)

(Lecture 5 / Ch. 8; cf. posterior-summary, prediction & Bayes-factor exercises)

A clinic models \(\lambda\), the mean number of emergency calls per hour, with a \(\text{Gamma}(3, 1)\) prior. Over 8 hours they record 15 calls.

a. [0.75] Give the posterior distribution, its mean, and a 95% equal-tailed credible interval.

b. [1] Predict the number of calls next hour. Compute \(E(Y' \mid y)\) and \(P(Y' \ge 2 \mid y)\) using the posterior-predictive (Negative-Binomial) distribution. Why is this wider than plugging the posterior mean into a single Poisson?

c. [1] Test \(H_+: \lambda > 2\) vs \(H_-: \lambda < 2\). Compute \(BF_{+-}\) from the prior and posterior odds.

d. [0.75] Test \(H_0: \lambda = 2\) using the Savage–Dickey density ratio. What does \(BF_{01}\) say about the value \(\lambda = 2\)?

e. [1.5] (Interpret, max 70 words.) State in words what the Bayes factor from (c) tells you about the call rate, using calibrated language. What does it license you to conclude, and what does it not?

Attempt before looking at the solution.

a. \(Y_i\mid\lambda \overset{iid}{\sim}\text{Poisson}(\lambda)\); posterior \(\lambda\mid y \sim \text{Gamma}(s+\sum y,\ r+n)\).

Code

s <- 3; r <- 1; sumy <- 15; nobs <- 8
sp <- s + sumy; rp <- r + nobs
cat("Posterior: Gamma(", sp, ",", rp, "),  mean =", round(sp/rp, 3), "\n")

Posterior: Gamma( 18 , 9 ),  mean = 2

Code

cat("95% ETI:", round(qgamma(c(0.025, 0.975), sp, rp), 3), "\n")

95% ETI: 1.185 3.024

Code

cat("E(Y' | y) =",       round(sp/rp, 3), "\n")

E(Y' | y) = 2

Code

cat("P(Y' >= 2 | y) =",  round(1 - pnbinom(1, size = sp, prob = rp/(rp+1)), 3), "\n")

P(Y' >= 2 | y) = 0.58

The posterior-predictive marginalises over the full posterior uncertainty in \(\lambda\). Plugging the posterior mean into a single \(\text{Poisson}(\hat\lambda)\) fixes \(\lambda\) at one value and ignores that uncertainty, so it is too narrow and over-confident; the Negative-Binomial is correctly wider.
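The extra width can be quantified directly. The sketch below compares the plug-in Poisson with the Negative-Binomial predictive, using the same `size` and `prob` parameterisation as the code above; the tail cut-off of 6 calls is an arbitrary choice for illustration.

```r
sp <- 18; rp <- 9
p  <- rp / (rp + 1)                      # Negative-Binomial success probability

pois_var <- sp / rp                      # plug-in Poisson: variance = mean = 2
nb_mean  <- sp * (1 - p) / p             # predictive mean
nb_var   <- sp * (1 - p) / p^2           # predictive variance

# Probability of 6 or more calls next hour under each distribution
pois_tail <- ppois(5, lambda = sp / rp, lower.tail = FALSE)
nb_tail   <- pnbinom(5, size = sp, prob = p, lower.tail = FALSE)

cat("Mean:     Poisson", sp / rp, "  NB", nb_mean, "\n")
cat("Variance: Poisson", pois_var, "  NB", round(nb_var, 3), "\n")
cat("P(Y' >= 6): Poisson", round(pois_tail, 4), "  NB", round(nb_tail, 4), "\n")
```

Both distributions share the same mean, but the predictive has the larger variance and the heavier upper tail.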

Code

prior_above <- pgamma(2, s, r, lower.tail = FALSE)    # P(lambda > 2)
post_above  <- pgamma(2, sp, rp, lower.tail = FALSE)   # P(lambda > 2 | y)
BF_plusminus <- (post_above/(1 - post_above)) / (prior_above/(1 - prior_above))
cat("Prior P(l>2):", round(prior_above,3), " Posterior P(l>2):", round(post_above,3), "\n")

Prior P(l>2): 0.677  Posterior P(l>2): 0.469

Code

cat("BF(+, -) =", round(BF_plusminus, 3),
    "  (equivalently BF(-, +) =", round(1/BF_plusminus, 2), ")\n")

BF(+, -) = 0.421   (equivalently BF(-, +) = 2.37 )

\(BF_{+-} \approx 0.42 < 1\), so the data favour \(H_-\) (rate below 2) over \(H_+\). Equivalently \(BF_{-+} \approx 2.4\), which is only anecdotal evidence for \(\lambda < 2\) (a Bayes factor between 1 and 3, “not worth more than a bare mention” on the Kass & Raftery scale). The data nudged \(P(\lambda > 2)\) down from 0.68 to 0.47, but not decisively.

Code

cat("BF01 =", round(dgamma(2, sp, rp) / dgamma(2, s, r), 3), "\n")

BF01 = 3.112

Code

cat("BF10 =", round(dgamma(2, s, r) / dgamma(2, sp, rp), 3), "\n")

BF10 = 0.321

\(BF_{01} \approx 3.1\): the posterior density at \(\lambda = 2\) is about three times the prior density there, so the data are ~3× more consistent with \(\lambda = 2\) than the diffuse alternative expected — moderate evidence for the value 2, not against it. Note that (c) and (d) answer different questions: a one-sided direction test versus a point test.

e. Model answer (≤70 words). The Bayes factor \(BF_{+-} \approx 0.42\) means the observed data are about 2.4 times more likely under \(H_-\) (\(\lambda < 2\)) than under \(H_+\) (\(\lambda > 2\)) — anecdotal, not compelling, evidence for a call rate below 2 per hour. It licenses a tentative lean toward the lower rate; it does not prove \(\lambda < 2\), nor give the probability that \(H_-\) is true (that would need the prior odds).

Marking: correct direction — evidence favours \(H_-\) (0.5); translates 0.42 into “~2.4× more likely under \(H_-\)” (0.5); uses calibrated/probabilistic language (not “reject”/“significant”) and notes BF \(\neq\) posterior probability (0.5).

Mock Q4: Posterior Prediction by Simulation (4 pts)

(Lecture 5 / Ch. 8; cf. posterior-prediction exercises, with-R question)

A marine ecologist models \(\lambda\), the mean number of whale sightings per one-hour boat survey, with a \(\text{Gamma}(2, 1)\) prior. Over 10 surveys the counts are

\[y = (2,\; 0,\; 3,\; 1,\; 2,\; 4,\; 1,\; 0,\; 3,\; 2).\]

a. [0.5] Derive the posterior analytically.

b. [1.5] Write R code to simulate 50,000 draws from the posterior-predictive distribution for a single future survey. Plot it, and compute \(E(Y' \mid y)\) and \(P(Y' = 0 \mid y)\). Confirm \(P(Y'=0\mid y)\) against the analytic Negative-Binomial value.

c. [1] The team plans a block of 5 future surveys. Simulate the total number of sightings across the block and estimate \(E(Y_\text{total} \mid y)\) and \(P(Y_\text{total} \ge 10 \mid y)\).

d. [1] (Interpret, max 70 words.) A colleague wants to report only the predictive mean from (b) as “the” forecast for the next survey, with no uncertainty. What is wrong with this, and what should they report instead?

Attempt before looking at the solution.

a. \(Y_i \mid \lambda \overset{iid}{\sim} \text{Poisson}(\lambda)\); conjugate update \(\lambda \mid y \sim \text{Gamma}(s + \sum y_i,\ r + n)\).

Code

y  <- c(2, 0, 3, 1, 2, 4, 1, 0, 3, 2)
s0 <- 2; r0 <- 1                          # prior Gamma(2, 1)
sp <- s0 + sum(y); rp <- r0 + length(y)
cat("Posterior: Gamma(", sp, ",", rp, "),  mean =", round(sp/rp, 3), "\n")

Posterior: Gamma( 20 , 11 ),  mean = 1.818

\(\sum y_i = 18\), \(n = 10\), so \(\lambda \mid y \sim \text{Gamma}(20, 11)\) with mean \(20/11 \approx 1.82\).

b. Draw \(\lambda^{(s)}\) from the posterior, then \(Y'^{(s)} \sim \text{Poisson}(\lambda^{(s)})\) — this propagates posterior uncertainty into the prediction.

Code

set.seed(2026)
N <- 50000
lambda_post <- rgamma(N, shape = sp, rate = rp)
y_next      <- rpois(N, lambda = lambda_post)

cat("E(Y' | y)     =", round(mean(y_next), 3), "\n")

E(Y' | y)     = 1.82

Code

cat("P(Y' = 0 | y) =", round(mean(y_next == 0), 3),
    "  (analytic NB:", round(dnbinom(0, size = sp, prob = rp/(rp + 1)), 3), ")\n")

P(Y' = 0 | y) = 0.174   (analytic NB: 0.175 )

Code

hist(y_next, breaks = -0.5:(max(y_next) + 0.5), probability = TRUE,
     main = "Posterior predictive: next survey",
     xlab = "Whale sightings", col = "steelblue", border = "white")

The predictive distribution is right-skewed over 0–6 sightings. \(E(Y' \mid y) \approx 1.82\) (it equals the posterior mean of \(\lambda\)), and \(P(Y' = 0 \mid y) \approx 0.17\) — the simulation matches the analytic Negative-Binomial value (0.175).

Code

y_block <- replicate(N, sum(rpois(5, lambda = rgamma(1, sp, rp))))
cat("E(Y_total | y)       =", round(mean(y_block), 2), "\n")

E(Y_total | y)       = 9.09

Code

cat("P(Y_total >= 10 | y) =", round(mean(y_block >= 10), 3), "\n")

P(Y_total >= 10 | y) = 0.42

About 9.1 sightings expected over the 5-survey block (\(5 \times\) the per-survey mean), with roughly a 42% chance of 10 or more.
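The simulated block total can be cross-checked analytically. This sketch assumes the standard Gamma-Poisson result extended to a block: the sum of \(k\) independent \(\text{Poisson}(\lambda)\) counts is \(\text{Poisson}(k\lambda)\), so with \(\lambda \mid y \sim \text{Gamma}(s', r')\) the total is Negative-Binomial with `size = sp` and `prob = rp / (rp + k)`.

```r
sp <- 20; rp <- 11; k <- 5               # posterior Gamma(20, 11), block of 5 surveys
p_block <- rp / (rp + k)                 # NB success probability for the block total

e_total <- sp * (1 - p_block) / p_block  # analytic E(Y_total | y)
p_ge_10 <- pnbinom(9, size = sp, prob = p_block, lower.tail = FALSE)

cat("Analytic E(Y_total | y)       =", round(e_total, 2), "\n")
cat("Analytic P(Y_total >= 10 | y) =", round(p_ge_10, 3), "\n")
```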

d. Model answer (≤70 words). Reporting only the mean (≈1.8) discards all predictive uncertainty — and 1.8 is not even an observable count. They should report the full posterior-predictive distribution, or at least a predictive interval (here a 90% interval is roughly \([0, 4]\) sightings) plus informative summaries such as \(P(Y' = 0 \mid y)\). A bare point estimate cannot distinguish a confident forecast from a highly uncertain one.

Marking: identifies that a point estimate discards predictive uncertainty (0.5); proposes reporting the predictive distribution or a predictive interval / tail probabilities instead (0.5).
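The interval quoted in the model answer for (d) can be read off the analytic Negative-Binomial predictive. This is a sketch; taking the 5% and 95% quantiles is one way to form an equal-tailed 90% predictive interval, and a simulation-based `quantile()` on the predictive draws would be an alternative.

```r
sp <- 20; rp <- 11
p  <- rp / (rp + 1)                      # single-survey predictive: NB(size = sp, prob = p)

pred_int <- qnbinom(c(0.05, 0.95), size = sp, prob = p)   # 90% equal-tailed interval
p_zero   <- dnbinom(0, size = sp, prob = p)               # chance of no sightings

cat("90% predictive interval:", pred_int, "\n")
cat("P(Y' = 0 | y) =", round(p_zero, 3), "\n")
```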

Mock Exam 2 (Last Year’s Resit)

Last year’s digital resit (8 July 2025), reproduced as a practice paper. Six exercises, 18 points, spanning Lectures 1–5. Same open-book ANS-with-R format as your exam.

Q1: Bayes’ Rule (1 pt)

(Lecture 1 / Ch. 2)

Sophie is trying to guess where a friend spent a particular summer day. She remembers that 40% of her friends stayed in the Netherlands and 60% went on holiday to southern Europe, but not who went where. The chance of a great beach day is only 0.10 in the Netherlands, but 0.90 in southern Europe. One day the friend texts: “Today was a perfect beach day!”

Given the friend had a great beach day, what is the probability they spent the day in the Netherlands?

Attempt before looking at the solution.

By Bayes’ rule, with \(B\) = “stayed in the Netherlands” and \(A\) = “great beach day”:

\[P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A \mid B)\,P(B) + P(A \mid B^c)\,P(B^c)}.\]

Code

(0.10 * 0.40) / (0.10 * 0.40 + 0.90 * 0.60)

[1] 0.06896552

So \(P(\text{Netherlands} \mid \text{beach day}) \approx 0.069\): because great beach days are far more likely abroad (0.90 vs 0.10), and more friends went abroad to begin with, hearing about a great beach day makes the Netherlands quite unlikely.

Q2: The Beta-Binomial Model (4.25 pts)

(Lecture 2 / Ch. 3)

A researcher models a success probability with a \(\text{Beta}(3, 10)\) prior and collects \(n = 100\) trials. The Beta-Binomial update gives the posterior \(\text{Beta}(24, 89)\).

a. [0.25] How many successes and how many failures were observed?

b. [0.5] Write the summarize_beta_binomial() call that produces this prior-to-posterior summary.

c. [0.5] Write the code that plots the prior and posterior in one figure, without the scaled likelihood.

d. [3] (Describe, max 130 words.) Describe how the posterior has shifted relative to the prior in terms of its mean, mode, and standard deviation, and what this shift says about how the data updated your beliefs.

Attempt before looking at the solution.

a. The update is \(\text{Beta}(\alpha+y,\ \beta+n-y) = \text{Beta}(3+y,\ 10+n-y) = \text{Beta}(24, 89)\), so \(y = 21\) successes and \(n - y = 79\) failures.

Code

summarize_beta_binomial(alpha = 3, beta = 10, y = 21, n = 100)

Code

plot_beta_binomial(alpha = 3, beta = 10, y = 21, n = 100, likelihood = FALSE)

d. The mean falls slightly (0.23 to 0.21) while the mode rises slightly (0.18 to 0.21). The two move in opposite directions because the prior is right-skewed, and updating pulls the mean and mode together. The posterior is also far more concentrated: the standard deviation drops from about 0.11 to about 0.04, reflecting much greater certainty. In short, the data broadly agreed with the prior; rather than overturning our beliefs, they sharpened them, leaving the estimate near 0.21 but with much less uncertainty.

Marking: mean shifts down and mode shifts up, with the opposite directions attributed to the prior’s skewness; SD shrinks (greater precision); actual prior/posterior values stated; data seen to reinforce rather than overturn the prior.
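The values quoted in the answer to (d) can be reproduced in base R. The helper name `beta_summary` is illustrative; it uses the standard Beta mean, mode, and SD formulas rather than the `bayesrules` function from part (b).

```r
# Hypothetical helper: mean, mode and SD of a Beta(a, b) distribution
beta_summary <- function(a, b) {
  c(mean = a / (a + b),
    mode = (a - 1) / (a + b - 2),                      # valid when a > 1 and b > 1
    sd   = sqrt(a * b / ((a + b)^2 * (a + b + 1))))
}

summ <- rbind(prior     = beta_summary(3, 10),
              posterior = beta_summary(24, 89))
print(round(summ, 3))
```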

Q3: Match the Beta Posteriors (2.75 pts)

(Lecture 2 / Ch. 3)

You are modelling a success probability. In each case you start with a Beta prior and observe some successes and failures:

Case	Prior	Observed data
A	Beta(1, 1)	3 successes, 6 failures
B	Beta(2, 2)	6 successes, 4 failures
C	Beta(1, 4)	4 successes, 2 failures

a. [0.75] Determine the posterior distribution for each case.

b. [1.5] Compute the standard deviation of each posterior.

c. [0.5] Rank the posteriors from highest to lowest confidence about the success probability.

Attempt before looking at the solution.

a. Update each with \(\text{Beta}(\alpha + s,\ \beta + f)\):

A: \(\text{Beta}(1+3,\ 1+6) = \text{Beta}(4, 7)\)
B: \(\text{Beta}(2+6,\ 2+4) = \text{Beta}(8, 6)\)
C: \(\text{Beta}(1+4,\ 4+2) = \text{Beta}(5, 6)\)

Code

post_sd <- function(a, b) sqrt(a * b / ((a + b)^2 * (a + b + 1)))
round(c(A = post_sd(4, 7), B = post_sd(8, 6), C = post_sd(5, 6)), 3)

    A     B     C 
0.139 0.128 0.144

c. Higher confidence means a smaller posterior SD, so the ranking is B (SD 0.128, highest confidence), then A (SD 0.139), then C (SD 0.144, lowest confidence).

Q4: Sequential Updating with the Bechdel Data (2.5 pts)

(Lecture 2 / Ch. 4)

The bechdel data in bayesrules records whether films pass the Bechdel test. John models \(\pi\), the proportion of films that pass, with a symmetric \(\text{Beta}(2, 2)\) prior and analyses one year at a time.

a. [0.5] John analyses the 1995 films. Give the posterior, its mean, and its mode.

b. [0.5] The next day he analyses the 2005 films, building on the previous day’s posterior. Give the posterior, mean, and mode.

c. [0.5] On the third day he analyses the 2013 films, building on the previous two analyses. Give the posterior, mean, and mode.

d. [1] (Explain, max 70 words.) Jenna instead analyses 1995, 2005 and 2013 jointly, all at once. Under what conditions will her posterior be identical to John’s?

Attempt before looking at the solution.

a.–c. Each year adds its passes and failures, and yesterday’s posterior becomes today’s prior. Starting from \(\text{Beta}(2, 2)\):

Code

data(bechdel, package = "bayesrules")
a <- 2; b <- 2
for (yr in c(1995, 2005, 2013)) {
  d <- bechdel[bechdel$year == yr, ]
  y <- sum(d$binary == "PASS"); n <- nrow(d)
  a <- a + y; b <- b + (n - y)
  cat(yr, ": Beta(", a, ",", b, ")  mean =", round(a/(a+b), 3),
      " mode =", round((a-1)/(a+b-2), 3), "\n")
}

1995 : Beta( 20 , 20 )  mean = 0.5  mode = 0.5 
2005 : Beta( 74 , 66 )  mean = 0.529  mode = 0.529 
2013 : Beta( 120 , 119 )  mean = 0.502  mode = 0.502

So 1995 gives \(\text{Beta}(20, 20)\) (mean and mode 0.50), 2005 gives \(\text{Beta}(74, 66)\) (mean 0.529), and 2013 gives \(\text{Beta}(120, 119)\) (mean 0.502).

d. Model answer (≤70 words). If Jenna uses the same \(\text{Beta}(2, 2)\) prior and the same combined data, the two posteriors are identical. In Bayesian updating, processing the data sequentially or all at once gives the same posterior, as long as the prior and the total data are the same.

Marking: states the conditions (same prior and same data) and recognises that sequential and batch updating coincide.
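The equivalence in (d) can be checked without loading the data. This sketch uses the pass/fail counts implied by the posteriors printed above (18/18 for 1995, 54/46 for 2005, 46/53 for 2013); these counts are inferred from that output rather than read from `bechdel` directly.

```r
# Pass/fail counts per year, inferred from the printed posteriors (an assumption)
passes <- c(`1995` = 18, `2005` = 54, `2013` = 46)
fails  <- c(`1995` = 18, `2005` = 46, `2013` = 53)

# John: update one year at a time, yesterday's posterior is today's prior
a <- 2; b <- 2
for (i in seq_along(passes)) {
  a <- a + passes[[i]]
  b <- b + fails[[i]]
}
sequential <- c(a, b)

# Jenna: one update with all three years pooled
batch <- c(2 + sum(passes), 2 + sum(fails))

cat("Sequential: Beta(", sequential[1], ",", sequential[2], ")\n")
cat("Batch:      Beta(", batch[1], ",", batch[2], ")\n")
```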

Q5: Tuning a Gamma Prior (2 pts)

(Lecture 3 / Ch. 5)

Analysts model \(\lambda\), the average number of three-pointers made per NBA game.

a. [1] Based on past seasons they expect a mean of 30 with a variance of 60. Construct a Gamma prior for \(\lambda\).

b. [1] Using the current season (284 games, 858 three-pointers in total), compute the posterior mean.

Attempt before looking at the solution.

a. Moment-matching for \(\text{Gamma}(s, r)\) uses mean \(= s/r\) and variance \(= s/r^2\), so \(r = \text{mean}/\text{variance} = 30/60 = 0.5\) and \(s = \text{mean} \times r = 15\). The prior is \(\text{Gamma}(15, 0.5)\).
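The moment-matching step can be wrapped in a small function and verified by plugging the result back into the Gamma mean and variance formulas. The function name `gamma_from_moments` is hypothetical, introduced only for this sketch.

```r
# Hypothetical helper: Gamma(shape, rate) parameters from a target mean and variance
gamma_from_moments <- function(m, v) {
  rate  <- m / v          # from mean = s / r and variance = s / r^2
  shape <- m * rate
  c(shape = shape, rate = rate)
}

prior <- gamma_from_moments(m = 30, v = 60)
print(prior)

# Check: recover the mean and variance from the fitted parameters
check_mean <- prior[["shape"]] / prior[["rate"]]
check_var  <- prior[["shape"]] / prior[["rate"]]^2
cat("mean =", check_mean, "  variance =", check_var, "\n")
```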

b. The Gamma-Poisson update is \(\text{Gamma}(s + \sum y_i,\ r + n)\) with \(\sum y_i = 858\) and \(n = 284\) games:

Code

s <- 15; r <- 0.5
sp <- s + 858; rp <- r + 284
cat("Posterior: Gamma(", sp, ",", rp, "),  mean =", round(sp/rp, 4), "\n")

Posterior: Gamma( 873 , 284.5 ),  mean = 3.0685

The posterior mean is about 3.07.

Q6: Customer Complaints (5.5 pts)

(Lecture 5 / Ch. 8)

A call centre evaluates its complaint rate \(\lambda\) (average complaints per hour) using a \(\text{Gamma}(2, 1)\) prior (shape-rate). After observing \(y = 12\) complaints in \(t = 4\) hours, they update to the posterior \(\text{Gamma}(14, 5)\).

a. [1] The centre aims to keep the rate below 3 per hour. Compute \(P(\lambda < 3 \mid y)\).

b. [1] Compute the prior odds and posterior odds for \(H_0: \lambda \geq 3\) versus \(H_1: \lambda < 3\).

c. [1] Use part (b) to compute \(BF_{10}\).

d. [1.5] (Interpret, max 70 words.) What does this Bayes factor say about the strength of evidence in the data?

e. [1] Test the point hypothesis \(H_0: \lambda = 3\) versus \(H_1: \lambda \neq 3\) with the Savage-Dickey density ratio. Compute \(BF_{01}\).

Attempt before looking at the solution.

Code

pgamma(3, shape = 14, rate = 5)

[1] 0.6367822

\(P(\lambda < 3 \mid y) \approx 0.637\).

Code

prior_p <- pgamma(3, 2, 1)    # P(lambda < 3) under the prior
post_p  <- pgamma(3, 14, 5)   # P(lambda < 3 | y)
cat("Prior odds  (H1/H0):", round(prior_p/(1 - prior_p), 3), "\n")

Prior odds  (H1/H0): 4.021

Code

cat("Posterior odds (H1/H0):", round(post_p/(1 - post_p), 3), "\n")

Posterior odds (H1/H0): 1.753

Prior odds of \(H_1\) over \(H_0\) are about 4.02; posterior odds about 1.75. (Equivalently, \(H_0\) over \(H_1\): 0.25 and 0.57.)

Code

BF10 <- (post_p/(1 - post_p)) / (prior_p/(1 - prior_p))
cat("BF10 =", round(BF10, 3), "\n")

BF10 = 0.436

d. Model answer (≤70 words). \(BF_{10} \approx 0.44\) is below 1, so the data favour \(H_0\) (\(\lambda \geq 3\)) over \(H_1\) (\(\lambda < 3\)): the observed data are about \(1/0.44 \approx 2.3\) times more likely under \(H_0\) than under \(H_1\). The observed rate of 3 per hour pulled belief toward higher rates, so this is only weak evidence, and it points away from the “below 3” hypothesis.

Marking: states that the evidence favours \(H_0\); translates 0.44 into “about 2.3 times more likely under \(H_0\)”; uses probabilistic language rather than “reject” or “significant”.

Code

cat("BF01 (Savage-Dickey) =", round(dgamma(3, 14, 5) / dgamma(3, 2, 1), 3), "\n")

BF01 (Savage-Dickey) = 3.201

\(BF_{01} \approx 3.20\): the posterior density at \(\lambda = 3\) is about three times the prior density there, so the data are about three times more consistent with exactly \(\lambda = 3\) than the diffuse alternative expected. Moderate evidence for the point value 3.
