Below are the results of several Bayesian binomial tests on the "Emily Rosa" data set. We have one categorical variable, "Outcome", which records whether each response given by the therapists was correct or incorrect. In this analysis, we would like to test the hypothesis that more than 50% of responses are correct, since this would indicate that the therapists perform above chance level and can therefore detect energy fields (i.e., if they were merely guessing, we would expect about 50% correct responses).
JASP shows the binomial tests for the proportion of both "Correct" and "Incorrect". The output below shows both, but we are primarily interested in the proportion of "Correct". Note that for the two-sided test this choice does not affect the Bayes factor, since any deviation from 0.5 counts as evidence against H0. Only when we specify a one-sided alternative hypothesis does it matter whether we look at "Correct" or "Incorrect" (you can see this by comparing BF+0 for "Correct" to BF-0 for "Incorrect": these will be the same).
(2) We start with the table below, which shows the descriptive statistics (observed proportion) as well as the (two-sided) Bayes factor. The data set contains 70 correct responses out of 150, which corresponds to a proportion of 0.467. To assess whether that observed proportion provides evidence for or against H0, we can look at the Bayes factor (see the next analysis block for one-sided tests). For the two-sided alternative hypothesis, using a uniform prior (a = b = 1), we observe a Bayes factor of BF01 = 7.051 in favor of H0. This means that the observed data are about 7 times more likely under the null hypothesis than under the alternative hypothesis. This is exactly equivalent to saying that we observe a Bayes factor of BF10 = 0.142 in favor of H1: the data are 0.142 times as likely under the alternative hypothesis as under the null hypothesis. The two Bayes factors are reciprocals: BF10 = 1 / BF01, so 1 / 7.051 = 0.142.
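The two-sided Bayes factor described above can be checked by hand. The following is a minimal sketch, not JASP's own code: it assumes the standard beta-binomial result that the marginal likelihood under H1 is B(k + a, n - k + b) / B(a, b) times the binomial coefficient, which cancels against the same coefficient under H0. The function names `log_beta` and `bf10_binomial` are illustrative choices.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bf10_binomial(k, n, a=1.0, b=1.0, theta0=0.5):
    """Two-sided Bayes factor: H1 (theta ~ Beta(a, b)) versus H0 (theta = theta0)."""
    # Marginal likelihood under H1; the binomial coefficient cancels against H0.
    log_m1 = log_beta(k + a, n - k + b) - log_beta(a, b)
    # Likelihood under the point null hypothesis.
    log_m0 = k * log(theta0) + (n - k) * log(1.0 - theta0)
    return exp(log_m1 - log_m0)


bf10 = bf10_binomial(k=70, n=150)  # 70 correct out of 150, uniform prior
bf01 = 1.0 / bf10                  # evidence for H0 is the reciprocal
print(f"BF10 = {bf10:.3f}, BF01 = {bf01:.3f}")
```

Under these assumptions the printed values should land close to the 0.142 and 7.051 reported above. Calling `bf10_binomial(k=80, n=150)` for "Incorrect" gives the same number, which is why the two-sided Bayes factor does not depend on which level we look at.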
(3) Additionally, we can do parameter estimation, where we estimate the population proportion theta. The Prior/Posterior plot below also shows estimation information: the 95% credible interval runs from 0.389 to 0.546, and the posterior median is 0.467. This means that under our current model (two-sided H1), there is a 95% probability that the population proportion lies between 0.389 and 0.546. Since 0.5 lies in this interval, we can already expect that the data will not provide strong evidence against the null hypothesis. Note that when we do parameter estimation by looking at the credible interval, we always do this with a two-sided model. If we use a one-sided model, our interval estimates are distorted by the sidedness of the model (for an illustration, compare the credible intervals in the one-sided prior/posterior plots further below to the two-sided credible interval).
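The estimation results above follow from the posterior distribution of theta. As a rough sketch, assuming the uniform Beta(1, 1) prior is updated to a Beta(1 + 70, 1 + 80) posterior, the median and the central 95% credible interval can be found by inverting the beta CDF. The helpers `beta_cdf_int` and `beta_quantile_int` are hypothetical names; they rely on the binomial-tail identity for the beta CDF, which only holds for integer shape parameters.

```python
from math import comb


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x, for positive integer a and b.

    Uses the identity I_x(a, b) = P(Bin(a + b - 1, x) >= a).
    """
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1.0 - x) ** (m - j) for j in range(a, m + 1))


def beta_quantile_int(p, a, b, tol=1e-9):
    """Quantile of Beta(a, b), found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if beta_cdf_int(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


k, n = 70, 150
a_post, b_post = 1 + k, 1 + (n - k)  # uniform prior updated with the data: Beta(71, 81)

lower = beta_quantile_int(0.025, a_post, b_post)
median = beta_quantile_int(0.5, a_post, b_post)
upper = beta_quantile_int(0.975, a_post, b_post)
print(f"median = {median:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

If the assumptions hold, the output should be close to the median of 0.467 and the interval from 0.389 to 0.546 described above, with 0.5 inside the interval.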
Bayesian Binomial Test

| | Level | Counts | Total | Proportion | BF₁₀ |
|---|---|---|---|---|---|
| Outcome | Correct | 70 | 150 | 0.467 | 0.142 |
| | Incorrect | 80 | 150 | 0.533 | 0.142 |

Note. Proportions tested against value: 0.5.
(4) In order to test whether the therapists can detect energy fields, we do a one-sided test, to see whether the proportion is greater than 0.5.
To do so, we tick the box "> Test Value" under "Alt. Hypothesis" in the input menu. This adds a note under the table below, stating that we now test the alternative hypothesis that "the proportion is greater than 0.5". The table no longer reports BF10 but BF+0, to indicate that we are performing a one-sided hypothesis test on the proportions.
(5) Again, we are interested in this test for the proportion of "Correct". The BF+0 here is 0.059, which means that the data are 0.059 times as likely under the alternative hypothesis as under the null hypothesis. Again, this is equivalent to saying that the data are BF0+ = 1 / BF+0 = 16.955 times more likely under the null hypothesis than under the alternative hypothesis. We can interpret this as strong evidence that the therapists do not have the ability to detect energy fields.
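The one-sided Bayes factor can be obtained from the two-sided one. The sketch below assumes that the one-sided model is the same beta prior truncated to theta > 0.5, in which case BF+0 equals BF10 multiplied by the ratio of posterior to prior mass above 0.5. All function names are illustrative, and `beta_cdf_int` only works for integer shape parameters.

```python
from math import comb, exp, lgamma, log


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integer a, b (binomial-tail identity)."""
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1.0 - x) ** (m - j) for j in range(a, m + 1))


def bf10_binomial(k, n, a=1, b=1, theta0=0.5):
    """Two-sided Bayes factor: theta ~ Beta(a, b) versus theta = theta0."""
    log_m1 = log_beta(k + a, n - k + b) - log_beta(a, b)
    log_m0 = k * log(theta0) + (n - k) * log(1.0 - theta0)
    return exp(log_m1 - log_m0)


def bf_one_sided(k, n, a=1, b=1, theta0=0.5, direction="greater"):
    """BF+0 (direction='greater') or BF-0 (direction='less') against theta = theta0."""
    prior_below = beta_cdf_int(theta0, a, b)
    post_below = beta_cdf_int(theta0, a + k, b + n - k)
    if direction == "greater":
        mass_ratio = (1.0 - post_below) / (1.0 - prior_below)
    else:
        mass_ratio = post_below / prior_below
    return bf10_binomial(k, n, a, b, theta0) * mass_ratio


bf_plus0_correct = bf_one_sided(70, 150)                      # "Correct", H+: theta > 0.5
bf_plus0_incorrect = bf_one_sided(80, 150)                    # "Incorrect", H+: theta > 0.5
bf_minus0_incorrect = bf_one_sided(80, 150, direction="less") # "Incorrect", H-: theta < 0.5

print(f"BF+0 (Correct)   = {bf_plus0_correct:.3f}")
print(f"BF0+ (Correct)   = {1.0 / bf_plus0_correct:.3f}")
print(f"BF+0 (Incorrect) = {bf_plus0_incorrect:.3f}")
print(f"BF-0 (Incorrect) = {bf_minus0_incorrect:.3f}")
```

Under these assumptions, BF+0 for "Correct" should come out near 0.059, and BF-0 for "Incorrect" should match it, since testing whether "Correct" exceeds 0.5 is the same question as testing whether "Incorrect" falls below 0.5.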
(6) When we look at the sequential analysis, where we can see the Bayes factor develop as the data accumulate, we can see that it gradually progressed to the current Bayes factor of BF0+ = 16.955. At one point, after roughly the first 10 participants, there was some evidence in favor of the alternative hypothesis (a BF+0 of around 3). However, the Bayes factor quickly declined after that point and moved towards evidence against the alternative hypothesis.
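A sequential analysis of this kind recomputes the Bayes factor after every new observation. The sketch below illustrates the idea with a uniform prior and the same truncation assumption for the one-sided model. The ordering of the responses in `outcomes` is HYPOTHETICAL: the trial-level order is not given here, so the intermediate values will not match the JASP plot. Only the end point, which depends on the totals of 70 correct and 80 incorrect, is comparable.

```python
from math import comb, exp, lgamma, log


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integer a, b (binomial-tail identity)."""
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1.0 - x) ** (m - j) for j in range(a, m + 1))


def bf_plus0_uniform(k, n, theta0=0.5):
    """BF+0 for k successes in n trials, uniform prior truncated to theta > theta0."""
    log_m1 = log_beta(k + 1, n - k + 1)
    log_m0 = k * log(theta0) + (n - k) * log(1.0 - theta0)
    bf10 = exp(log_m1 - log_m0)
    post_above = 1.0 - beta_cdf_int(theta0, k + 1, n - k + 1)
    prior_above = 1.0 - theta0  # mass of the uniform prior above theta0
    return bf10 * post_above / prior_above


def bf_trajectory(outcomes):
    """Return BF+0 after each observation (1 = correct, 0 = incorrect)."""
    trajectory = []
    k = 0
    for n, y in enumerate(outcomes, start=1):
        k += y
        trajectory.append(bf_plus0_uniform(k, n))
    return trajectory


# Hypothetical ordering with the observed totals: 70 correct, 80 incorrect.
outcomes = [1, 0] * 70 + [0] * 10
trajectory = bf_trajectory(outcomes)
print(f"BF+0 after 1 trial:    {trajectory[0]:.3f}")
print(f"BF+0 after all trials: {trajectory[-1]:.3f}")
```

Whatever the ordering, the final value depends only on the counts, so it should end near the BF+0 of 0.059 discussed above.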
Bayesian Binomial Test

| | Level | Counts | Total | Proportion | BF₊₀ |
|---|---|---|---|---|---|
| Outcome | Correct | 70 | 150 | 0.467 | 0.059 |
| | Incorrect | 80 | 150 | 0.533 | 0.225 |

Note. For all tests, the alternative hypothesis specifies that the proportion is greater than 0.5.
(7) Lastly, we turn to different prior specifications. So far, we have used the "uninformative" prior, which in this case is a uniform prior (i.e., a beta prior with a = b = 1). Such priors are generally recommended, since they let the data speak for themselves.
However, it could be the case that we have some prior information or a stronger belief that we want to be reflected in the prior distribution. For instance, there could have been a previous study that we can incorporate. Let's say that this previous study observed 10 participants, 8 of whom gave the Correct response. For the binomial test, we can easily add this to the prior distribution by setting the two "shape parameters" of the beta distribution to the previously observed successes and failures. Specifically, we set a = 8 and b = 2. This yields a beta distribution that is situated mostly on the right side of the x-axis: it reflects the prior belief that the proportion is close to 0.8 (we have obtained this belief through our previous experiment). However, this prior belief predicted the data poorly: where the prior predicts values of theta that are close to 0.8 (and generally predicts values of theta greater than 0.5), the data have a proportion of 0.467, which is on the other side of the test value (i.e., 0.5)! If we look at the one-sided Bayes factor, we see that now BF0+ = 88.9, which corresponds to the BF+0 of 0.011 in the table below (the previous BF0+, based on the uninformative prior, was 16.955).
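The effect of the informed prior can be reproduced with the same machinery. This is a sketch under the assumption that the informed one-sided model is the Beta(8, 2) prior truncated to theta > 0.5, so that BF+0 is the two-sided BF10 under Beta(8, 2) multiplied by the ratio of posterior to prior mass above 0.5. The function names are illustrative, and `beta_cdf_int` requires integer shape parameters.

```python
from math import comb, exp, lgamma, log


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integer a, b (binomial-tail identity)."""
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1.0 - x) ** (m - j) for j in range(a, m + 1))


def bf_plus0(k, n, a, b, theta0=0.5):
    """BF+0: Beta(a, b) prior truncated to theta > theta0, versus theta = theta0."""
    log_m1 = log_beta(k + a, n - k + b) - log_beta(a, b)
    log_m0 = k * log(theta0) + (n - k) * log(1.0 - theta0)
    bf10 = exp(log_m1 - log_m0)
    prior_above = 1.0 - beta_cdf_int(theta0, a, b)
    post_above = 1.0 - beta_cdf_int(theta0, a + k, b + n - k)
    return bf10 * post_above / prior_above


k, n = 70, 150
bf_uniform = bf_plus0(k, n, a=1, b=1)   # uninformative prior
bf_informed = bf_plus0(k, n, a=8, b=2)  # prior from the previous study (8 of 10 correct)
prior_mass_above = 1.0 - beta_cdf_int(0.5, 8, 2)

print(f"Prior mass above 0.5 under Beta(8, 2): {prior_mass_above:.3f}")
print(f"Uniform prior:  BF+0 = {bf_uniform:.3f}, BF0+ = {1.0 / bf_uniform:.1f}")
print(f"Informed prior: BF+0 = {bf_informed:.3f}, BF0+ = {1.0 / bf_informed:.1f}")
```

Under these assumptions, the informed prior should give a BF0+ close to the 88.9 reported above, clearly larger than the 16.955 obtained with the uniform prior. The Beta(8, 2) prior already places almost all of its mass above 0.5, so the extra penalty comes mainly from concentrating that mass near 0.8, far from the observed proportion.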
So, what we see here is that if we use a model for theta that is more informative (i.e., a prior distribution that is not a uniform distribution), we make a more specific prediction about the data. If the data are in line with this prediction, this model gets rewarded in terms of statistical evidence (i.e., a higher Bayes factor in its favor). However, if the data are not in line with this prediction, this model gets penalized. Because our current informative one-sided hypothesis/model predicted the data poorly, it gets even more penalized than the non-informative one-sided hypothesis/model (from the previous analysis block).
More generally, this illustrates how we draw up a model that makes predictions. By specifying a prior distribution that only assigns mass to values greater than 0.5, our model only predicts values greater than 0.5. The height of the prior density reflects how much "money" we bet on those specific values, which in turn determines our "pay-off". The prior distribution in this analysis only bet money on values of theta greater than 0.5, and bet most of its money on values of theta close to 0.8 (see the dashed line in the prior/posterior plot below). The prior distribution from the previous analysis also bet money only on values of theta greater than 0.5, but spread its money equally across all values between 0.5 and 1. Lastly, the null hypothesis bets all of its money on theta = 0.5. Since the observed proportion is close to 0.5, the null hypothesis is the big winner here.
Bayesian Binomial Test

| | Level | Counts | Total | Proportion | BF₊₀ |
|---|---|---|---|---|---|
| Outcome | Correct | 70 | 150 | 0.467 | 0.011 |
| | Incorrect | 80 | 150 | 0.533 | 0.057 |

Note. For all tests, the alternative hypothesis specifies that the proportion is greater than 0.5.