where \(w_\text{prior} = \frac{\sigma^2}{n\tau^2 + \sigma^2}\) and \(w_\text{data} = \frac{n\tau^2}{n\tau^2 + \sigma^2}\).
As \(n\) grows, \(w_\text{data} \to 1\) and the posterior concentrates on \(\bar{y}\).
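To see how quickly the data weight takes over, the sketch below evaluates both weights at a few sample sizes. The values \(\sigma = 50\) and \(\tau = 30\) are illustrative assumptions, and `normal_weights` is a hypothetical helper name, not a function from any package.

```r
# Illustrative values (assumptions): known data SD sigma and prior SD tau
sigma <- 50
tau   <- 30

# Hypothetical helper: prior and data weights in the Normal-Normal posterior mean
normal_weights <- function(n, sigma, tau) {
  denom <- n * tau^2 + sigma^2
  c(w_prior = sigma^2 / denom, w_data = n * tau^2 / denom)
}

for (n in c(1, 5, 25, 100)) {
  w <- normal_weights(n, sigma, tau)
  cat("n =", n,
      " w_prior =", round(w[["w_prior"]], 3),
      " w_data =", round(w[["w_data"]], 3), "\n")
}
```

The two weights always sum to 1, so any gain in \(w_\text{data}\) comes directly out of \(w_\text{prior}\).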
Notation convention. Throughout this lecture and in Formula 5.12 of the book, \(\mathcal{N}(\cdot, \cdot)\) uses (mean, variance). The textbook occasionally writes the second slot as the SD instead; when it does, square it before plugging it into the update formulas. (R is the opposite: dnorm and the bayesrules functions take sd =, not variance.)
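A quick way to see why the convention matters is to evaluate the same density both ways. This is a minimal sketch; the prior \(\mathcal{N}(80, 900)\) is an illustrative assumption chosen so that the SD is 30.

```r
# Illustrative prior N(mean = 80, variance = 900); values are assumptions
prior_mean <- 80
prior_var  <- 900

# dnorm() expects a standard deviation, so take the square root of the variance
dens_correct <- dnorm(80, mean = prior_mean, sd = sqrt(prior_var))

# Passing the variance as sd runs without error but gives a far flatter density
dens_wrong <- dnorm(80, mean = prior_mean, sd = prior_var)

cat("sd = sqrt(variance):", signif(dens_correct, 3), "\n")
cat("sd = variance (wrong):", signif(dens_wrong, 3), "\n")
```

R raises no warning in the second call, so this mistake only shows up as a prior that is much vaguer than intended.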
Simulate what the prior predicts before seeing data. If the simulated data look implausible, revise the prior.
Draw \(\theta^{(i)}\) from the prior
Simulate \(y^{(i)}\) from a Binomial\((n, \theta^{(i)})\) distribution for each \(\theta^{(i)}\)
Plot: does this look like realistic study data?
Prior Predictive Check in R
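# Prior Beta(12, 8), planning n = 20
theta_prior  <- rbeta(1e4, shape1 = 12, shape2 = 8)          # draw theta from the prior
y_prior_pred <- rbinom(1e4, size = 20, prob = theta_prior)   # simulate y for each theta

# Breaks centered on the integers so that each count 0..20 gets its own bar
# (breaks = 0:20 would merge the counts 0 and 1 into one bin)
hist(y_prior_pred, breaks = seq(-0.5, 20.5, by = 1),
     main = "Prior Predictive Distribution (Beta(12,8), n=20)",
     xlab = "Number of successes (out of 20)",
     col = "steelblue", border = "white")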
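Alongside the histogram, a numeric summary can make the "is this realistic?" question easier to answer. The sketch below repeats the simulation so that it runs on its own; the seed is an arbitrary assumption used only for reproducibility.

```r
set.seed(1)  # arbitrary seed (assumption), only for reproducibility

# Same prior predictive simulation: Beta(12, 8) prior, n = 20 trials
theta_prior  <- rbeta(1e4, shape1 = 12, shape2 = 8)
y_prior_pred <- rbinom(1e4, size = 20, prob = theta_prior)

# Mean and central 95% range of the simulated success counts
pp_mean     <- mean(y_prior_pred)
pp_interval <- quantile(y_prior_pred, probs = c(0.025, 0.975))

cat("Prior predictive mean:", round(pp_mean, 2), "\n")
cat("Central 95% range:", pp_interval[[1]], "to", pp_interval[[2]], "\n")
```

The simulated mean should sit near \(20 \times 12/20 = 12\). If the 95% range includes counts that would be implausible in the planned study, that is a signal to revisit the prior.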
Prior Sensitivity Analysis
# Same data: y = 14, n = 20. Three different priors.
priors <- list(
  "Beta(12,8)" = c(12, 8),
  "Beta(2,2)"  = c(2, 2),
  "Beta(1,1)"  = c(1, 1)
)
for (nm in names(priors)) {
  a <- priors[[nm]][1]
  b <- priors[[nm]][2]
  post_mean <- (a + 14) / (a + b + 20)
  cat(nm, "→ posterior mean:", round(post_mean, 3), "\n")
}
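Worked by hand, the posterior mean \((a + y)/(a + b + n)\) with \(y = 14\), \(n = 20\) gives:

\[
\begin{aligned}
\text{Beta}(12,8):&\quad \frac{12 + 14}{12 + 8 + 20} = \frac{26}{40} = 0.650 \\
\text{Beta}(2,2):&\quad \frac{2 + 14}{2 + 2 + 20} = \frac{16}{24} \approx 0.667 \\
\text{Beta}(1,1):&\quad \frac{1 + 14}{1 + 1 + 20} = \frac{15}{22} \approx 0.682
\end{aligned}
\]

The sample proportion is \(14/20 = 0.70\), so the weaker the prior, the closer the posterior mean sits to the data.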
No prior is “objective.” Beta(1,1) has ESS = 2 — it asserts that any success probability is equally plausible, from 0.001 to 0.999. That is itself an assumption, not a neutral stance.
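One way to make the ESS concrete is to compute how much of the posterior mean each prior controls. The sketch below assumes the \(n = 20\) design from the sensitivity analysis and uses the Beta-Binomial fact that the prior mean receives weight \(\text{ESS}/(\text{ESS} + n)\).

```r
# Prior ESS for Beta(a, b) is a + b; its share of the posterior mean is ESS / (ESS + n)
n <- 20  # sample size from the sensitivity analysis
priors <- list(
  "Beta(12,8)" = c(12, 8),
  "Beta(2,2)"  = c(2, 2),
  "Beta(1,1)"  = c(1, 1)
)

for (nm in names(priors)) {
  ess <- sum(priors[[nm]])
  cat(nm, " ESS =", ess, " prior weight =", round(ess / (ess + n), 3), "\n")
}
```

Even the flat Beta(1,1) carries a nonzero weight, which is the sense in which it is still an assumption.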
In-Class Exercises
Exercise 1: Identify the Family
For each scenario below, identify the appropriate model (likelihood + prior family) and write the posterior update formula.
# 5. Sensitivity: Beta(1,1)
summarize_beta_binomial(alpha = 1, beta = 1, y = 18, n = 25)
pbeta(0.60, 19, 8, lower.tail = FALSE)
[1] 0.8784429
Interpretation: Under Beta(6,4), posterior mean ≈ 0.69, with strong evidence \(\theta > 0.60\). Under the flat Beta(1,1), the posterior shifts only slightly (the data dominate). Conclusions are robust.
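The comparison in this interpretation can be reproduced side by side in base R. This is a minimal sketch that relies only on the conjugate update \(\text{Beta}(a + y,\; b + n - y)\) and the exercise data \(y = 18\), \(n = 25\).

```r
# Data from the exercise: y = 18 successes in n = 25 trials
y <- 18
n <- 25
priors <- list("Beta(6,4)" = c(6, 4), "Beta(1,1)" = c(1, 1))

for (nm in names(priors)) {
  a_post <- priors[[nm]][1] + y       # posterior shape1 = a + y
  b_post <- priors[[nm]][2] + n - y   # posterior shape2 = b + n - y
  post_mean <- a_post / (a_post + b_post)
  p_gt_60   <- pbeta(0.60, a_post, b_post, lower.tail = FALSE)
  cat(nm, " posterior mean =", round(post_mean, 3),
      " P(theta > 0.60) =", round(p_gt_60, 3), "\n")
}
```

If the two rows lead to the same practical decision about \(\theta > 0.60\), the conclusion does not hinge on the choice of prior.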
Exercise 3: Choosing a Prior via Moment Matching
A cognitive psychologist expects Stroop interference to add about 80 ms to reaction time, with considerable uncertainty (SD ≈ 30 ms). Measurement error is known to be \(\sigma = 50\) ms.
Interpretation: With a tight prior (SD = 30), the posterior is pulled toward 730 ms. With a vague prior (SD = 100), it follows the data much more closely. The prior’s ESS (\(\approx \sigma^2/\tau^2\)) drives this difference.
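The ESS comparison can be checked numerically. In the sketch below, \(\sigma = 50\) and the two prior SDs come from the exercise; the sample size `n` is a hypothetical value chosen only to illustrate the resulting prior weight.

```r
sigma     <- 50                        # known measurement SD from the exercise
prior_sds <- c(tight = 30, vague = 100)
n         <- 10                        # hypothetical sample size (assumption)

for (nm in names(prior_sds)) {
  tau     <- prior_sds[[nm]]
  ess     <- sigma^2 / tau^2           # prior ESS, in units of observations
  w_prior <- ess / (ess + n)           # equals sigma^2 / (n * tau^2 + sigma^2)
  cat(nm, " tau =", tau,
      " ESS =", round(ess, 2),
      " prior weight =", round(w_prior, 3), "\n")
}
```

The tight prior is worth roughly 2.8 observations and the vague prior about a quarter of one, so for any given sample size the tight prior keeps noticeably more influence.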
Next lecture (June 10): When conjugacy doesn’t deliver a closed-form posterior, we approximate it instead — grid approximation for low-dimensional problems and MCMC for everything else.