Results

Data from Davey et al. (2003). The authors investigated how mood and stop rules impact checking thoughts.

Bayesian ANOVA

Model Comparison
Models                              P(M)   P(M|data)  BFM    BF10   error %
mood + stop_rule + mood ✻ stop_rule  0.200  0.479      3.671  1.000
Null model                          0.200  0.279      1.550  0.583  2.597
stop_rule                           0.200  0.154      0.730  0.323  2.597
mood                                0.200  0.057      0.242  0.119  2.597
mood + stop_rule                    0.200  0.031      0.127  0.064  2.961

The full model (two main effects plus their interaction) is the model best supported by the data, although the null model comes second, so the effects are not very pronounced. The BF10 column compares each model to the best model (the full model). The first row compares the full model to itself, so that is a BF of 1. The second row indicates that the data are 0.58 times as likely under the null model as under the full model (or, equivalently, about 1.7 times as likely under the full model as under the null model), which is pretty weak evidence.
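The BF10 column can be checked by hand from the posterior model probabilities. The sketch below is only an illustration, not how JASP computes its output: it takes the rounded P(M) and P(M|data) values from the table, writes the interaction as the label "mood*stop_rule", and divides posterior odds by prior odds. The names `posterior`, `prior` and `bf_against` are made up for this example, and the results agree with the table only up to rounding.

```python
# Posterior model probabilities P(M|data), copied (rounded) from the table above.
# "mood*stop_rule" is just a label chosen here for the interaction term.
posterior = {
    "mood + stop_rule + mood*stop_rule": 0.479,
    "Null model": 0.279,
    "stop_rule": 0.154,
    "mood": 0.057,
    "mood + stop_rule": 0.031,
}

# Uniform prior: each of the five models gets P(M) = 0.200.
prior = {model: 0.200 for model in posterior}

# The reference model is the one with the highest posterior probability.
best = max(posterior, key=posterior.get)


def bf_against(model, reference):
    """Bayes factor for `model` over `reference`: posterior odds / prior odds."""
    posterior_odds = posterior[model] / posterior[reference]
    prior_odds = prior[model] / prior[reference]
    return posterior_odds / prior_odds


bf10 = {model: bf_against(model, best) for model in posterior}

for model, bf in bf10.items():
    # 1 / BF10 expresses the same evidence in favor of the best model.
    print(f"{model:35s} BF10 = {bf:.3f}   1/BF10 = {1 / bf:.2f}")
```

Because the prior model probabilities are equal here, each Bayes factor reduces to a simple ratio of posterior probabilities; taking the reciprocal of the null model's value gives the "about 1.7 times" figure in favor of the full model.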


The effects table below gives the inclusion Bayes factors for each predictor (these compare all models with that predictor to all models without that predictor). For the two main effects, the Bayes factors are not very informative: both are close to 1, so the data are about as likely under models with that predictor as under models without it. The interaction effect does seem to have some evidence in its favor, but a Bayes factor of about 3.7 is not very high. In combination with the small normality problem below, I would not risk a paper on these research findings (but we could collect more data, because we are Bayesians now! Usually increasing the sample size gives more decisive Bayes factors).

Analysis of Effects - checks
Effects           P(incl)  P(excl)  P(incl|data)  P(excl|data)  BFincl
mood              0.600    0.400    0.566         0.434         0.871
stop_rule         0.600    0.400    0.664         0.336         1.316
mood ✻ stop_rule   0.200    0.800    0.479         0.521         3.671
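The inclusion Bayes factors in the table above can be approximated from the model comparison table. The sketch below assumes the "across all models" definition described in the text: sum the prior and posterior probabilities of every model that contains a term, convert both sums to odds, and take the ratio. The `models` list and the `inclusion_bf` helper are illustrative names, the interaction is labeled "mood*stop_rule", and the inputs are the rounded table values, so the last digit can differ from the JASP output.

```python
# Each entry: (terms in the model, prior P(M), posterior P(M|data)),
# copied (rounded) from the Model Comparison table.
models = [
    ({"mood", "stop_rule", "mood*stop_rule"}, 0.200, 0.479),
    (set(), 0.200, 0.279),                     # null model
    ({"stop_rule"}, 0.200, 0.154),
    ({"mood"}, 0.200, 0.057),
    ({"mood", "stop_rule"}, 0.200, 0.031),
]


def inclusion_bf(term):
    """Inclusion Bayes factor: posterior inclusion odds / prior inclusion odds."""
    prior_incl = sum(prior for terms, prior, _ in models if term in terms)
    post_incl = sum(post for terms, _, post in models if term in terms)
    prior_odds = prior_incl / (1 - prior_incl)
    post_odds = post_incl / (1 - post_incl)
    return post_odds / prior_odds


for term in ("mood", "stop_rule", "mood*stop_rule"):
    print(f"{term:16s} BFincl = {inclusion_bf(term):.3f}")
```

Values near 1 (as for the two main effects) mean the data barely shift the odds of including the term, while the interaction's value above 3 reflects the posterior weight of the full model.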

Model Averaged Q-Q Plot

Our friend the Q-Q plot also returns in Bayesian ANOVA. Due to how the Bayesian framework works, we can even express uncertainty (credible intervals) in the Q-Q plot, to be even more nuanced about the normality assumption. Here there does seem to be some deviation from normality (multiple credible intervals do not overlap with the red diagonal) and 60 participants is not that many (with 2 predictors), so we need to be a bit cautious in interpreting the results.

Descriptives

Descriptives plots