Abstract
In order to quantify the relationship between multiple variables, researchers often carry out a mediation analysis. In such an analysis, a mediator (e.g., knowledge of a healthy diet) transmits the effect from an independent variable (e.g., classroom instruction on a healthy diet) to a dependent variable (e.g., consumption of fruits and vegetables). Almost all mediation analyses in psychology use frequentist estimation and hypothesis-testing techniques. A recent exception is Yuan and MacKinnon (Psychological Methods, 14, 301–322, 2009), who outlined a Bayesian parameter estimation procedure for mediation analysis. Here we complete the Bayesian alternative to frequentist mediation analysis by specifying a default Bayesian hypothesis test based on the Jeffreys–Zellner–Siow approach. We further extend this default Bayesian test by allowing a comparison to directional or one-sided alternatives, using Markov chain Monte Carlo techniques implemented in JAGS. All Bayesian tests are implemented in the R package BayesMed (Nuijten, Wetzels, Matzke, Dolan, & Wagenmakers, 2014).
Notes
1. We generated data that covaried exactly according to the input covariance matrix. Because the covariances of the data were equal to the covariances of the population, there was no need to control for random sampling, and we simulated only one experiment per scenario. The full simulation code is available in the supplemental materials.
2. The approximation can be made arbitrarily close by increasing the number of MCMC samples.
3. We thank an anonymous reviewer for pointing this out to us.
4. We compared the fit of four distributions: a nonstandardized t-distribution, a normal distribution, a nonparametric distribution estimated with the spline interpolation function splinefun in R, and a nonparametric distribution estimated with the R function logspline, which also uses splines to estimate the log density. All four distributions fitted reasonably well: The Bayes factors of the analytical test and the SD method were similar for all four posterior distributions. All four distributions are therefore included in the R package BayesMed and can be used when applying the SD method.
References
Armstrong, A. M., & Dienes, Z. (2013). Subliminal understanding of negation: Unconscious control by subliminal processing of word pairs. Consciousness and Cognition, 22, 1022–1040.
Berger, J. O. (2006). Bayes factors. In S. Kotz, N. Balakrishnan, C. Read, B. Vidakovic, & N. L. Johnson (Eds.), Encyclopedia of statistical sciences, vol. 1 (2nd ed., pp. 378–386). Hoboken, NJ: Wiley.
Berger, J. O., & Delampady, M. (1987). Testing precise hypotheses. Statistical Science, 2, 317–352.
Berger, J. O., & Wolpert, R. L. (1988). The likelihood principle (2nd ed.). Hayward, CA: Institute of Mathematical Statistics.
Consonni, G., Forster, J. J., & La Rocca, L. (2013). The whetstone and the alum block: Balanced objective Bayesian comparison of nested models for discrete data. Statistical Science, 28, 398–423.
Dickey, J. M., & Lientz, B. P. (1970). The weighted likelihood ratio, sharp hypotheses about chances, the order of a Markov chain. Annals of Mathematical Statistics, 41, 214–226.
Dienes, Z. (2008). Understanding psychology as a science: An introduction to scientific and statistical inference. New York: Palgrave MacMillan.
Dienes, Z. (2011). Bayesian versus orthodox statistics: Which side are you on? Perspectives on Psychological Science, 6, 274–290.
Edwards, W., Lindman, H., & Savage, L. J. (1963). Bayesian statistical inference for psychological research. Psychological Review, 70, 193–242.
Elliot, D. L., Goldberg, L., Kuehl, K. S., Moe, E. L., Breger, R. K., & Pickering, M. A. (2007). The phlame (promoting healthy lifestyles: Alternative models’ effects) firefighter study: Outcomes of two models of behavior change. Journal of Occupational and Environmental Medicine, 49(2), 204–213.
Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82, 711–732.
Guo, X., Li, F., Yang, Z., & Dienes, Z. (2013). Bidirectional transfer between metaphorical related domains in implicit learning of form-meaning connections. PLoS ONE, 8, e68100.
Hoijtink, H., Klugkist, I., & Boelen, P. (2008). Bayesian evaluation of informative hypotheses. New York: Springer.
Iverson, G. J., Wagenmakers, E. J., & Lee, M. D. (2010). A model averaging approach to replication: The case of p_rep. Psychological Methods, 15, 172–181.
Jeffreys, H. (1961). Theory of probability (3rd ed.). Oxford, UK: Oxford University Press.
Kass, R. E., & Wasserman, L. (1995). A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion. Journal of the American Statistical Association, 90, 928–934.
Klugkist, I., Laudy, O., & Hoijtink, H. (2005). Inequality constrained analysis of variance: A Bayesian approach. Psychological Methods, 10, 477.
Kruschke, J. K. (2010). Doing Bayesian data analysis: A tutorial introduction with R and BUGS. Burlington, MA: Academic Press.
Lee, M. D., & Wagenmakers, E. J. (2013). Bayesian modeling for cognitive science: A practical course. Cambridge, UK: Cambridge University Press.
Lewis, S. M., & Raftery, A. E. (1997). Estimating Bayes factors via posterior simulation with the Laplace–Metropolis estimator. Journal of the American Statistical Association, 92, 648–655.
Liang, F., Paulo, R., Molina, G., Clyde, M. A., & Berger, J. O. (2008). Mixtures of g priors for Bayesian variable selection. Journal of the American Statistical Association, 103, 410–423.
Lindley, D. V. (1957). A statistical paradox. Biometrika, 44, 187–192.
MacKinnon, D. P., Fairchild, A., & Fritz, M. (2007). Mediation analysis. Annual Review of Psychology, 58, 593.
MacKinnon, D. P., Lockwood, C. M., & Hoffman, J. (1998). A new method to test for mediation. Paper presented at the annual meeting of the Society for Prevention Research, Park City, UT.
MacKinnon, D. P., Lockwood, C., Hoffman, J., West, S., & Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83–104.
MacKinnon, D. P., Lockwood, C. M., & Williams, J. (2004). Confidence limits for the indirect effect: Distribution of the product and resampling methods. Multivariate Behavioral Research, 39, 99–128.
MacKinnon, D. P., Warsi, G., & Dwyer, J. H. (1995). A simulation study of mediated effect measures. Multivariate Behavioral Research, 30, 41–62.
Morey, R. D., & Rouder, J. N. (2011). Bayes factor approaches for testing interval null hypotheses. Psychological Methods, 16, 406–419.
Morey, R. D., & Wagenmakers, E. J. (2014). Simple relation between one-sided and two-sided Bayesian point-null hypothesis tests. Manuscript submitted for publication.
Myung, I. J., & Pitt, M. A. (1997). Applying Occam’s razor in modeling cognition: A Bayesian approach. Psychonomic Bulletin & Review, 4, 79–95.
Nuijten, M. B., Wetzels, R., Matzke, D., Dolan, C. V., & Wagenmakers, E. J. (2014). BayesMed: Default Bayesian hypothesis tests for correlation, partial correlation, and mediation. R package version 1.0. http://CRAN.R-project.org/package=BayesMed
O’Hagan, A., & Forster, J. (2004). Kendall’s advanced theory of statistics vol. 2B: Bayesian inference (2nd ed.). London: Arnold.
Overstall, A. M., & Forster, J. J. (2010). Default Bayesian model determination methods for generalised linear mixed models. Computational Statistics & Data Analysis, 54, 3269–3288.
Pericchi, L. R., Liu, G., & Torres, D. (2008). Objective Bayes factors for informative hypotheses: “Completing” the informative hypothesis and “splitting” the Bayes factor. In H. Hoijtink, I. Klugkist, & P. A. Boelen (Eds.), Bayesian evaluation of informative hypotheses (pp. 131–154). New York: Springer Verlag.
Plummer, M. (2009). JAGS version 1.0.3 manual. http://www-ice.iarc.fr/~martyn/software/jags/jags_user_manual.pdf
R Core Team. (2012). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0. http://www.R-project.org/
Rouder, J. N., & Morey, R. D. (2012). Default Bayes factors for model selection in regression. Multivariate Behavioral Research, 47, 877–903.
Rouder, J. N., Morey, R. D., Speckman, P. L., & Province, J. M. (2012). Default Bayes factors for ANOVA designs. Journal of Mathematical Psychology, 56, 356–374.
Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review, 16, 225–237.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.
Sellke, T., Bayarri, M. J., & Berger, J. O. (2001). Calibration of p values for testing precise null hypotheses. The American Statistician, 55, 62–71.
Semmens-Wheeler, R., Dienes, Z., & Duka, T. (2013). Alcohol increases hypnotic susceptibility. Consciousness and Cognition, 22(3), 1082–1091.
Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. Sociological Methodology, 13, 290–312.
Vandekerckhove, J., Matzke, D., & Wagenmakers, E. J. (in press). Model comparison and the principle of parsimony. In J. Busemeyer, J. Townsend, Z. J. Wang, & A. Eidels (Eds.), Oxford handbook of computational and mathematical psychology. Oxford University Press.
Venzon, D., & Moolgavkar, S. (1988). A method for computing profile-likelihood-based confidence intervals. Applied Statistics, 37(1), 87–94.
Verhagen, J., & Wagenmakers, E. J. (in press). A Bayesian test to quantify the success or failure of a replication attempt. Journal of Experimental Psychology: General.
Wagenmakers, E. J. (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14, 779–804.
Wagenmakers, E. J., & Grünwald, P. (2006). A Bayesian perspective on hypothesis testing. Psychological Science, 17, 641–642.
Wagenmakers, E. J., Lodewyckx, T., Kuriyal, H., & Grasman, R. (2010). Bayesian hypothesis testing for psychologists: A tutorial on the Savage–Dickey method. Cognitive Psychology, 60, 158–189.
Wagenmakers, E. J., Wetzels, R., Borsboom, D., & van der Maas, H. L. J. (2011). Why psychologists must change the way they analyze their data: The case of psi. Journal of Personality and Social Psychology, 100, 426–432.
Wetzels, R., Grasman, R. P. P. P., & Wagenmakers, E. J. (2010). An encompassing prior generalization of the Savage–Dickey density ratio test. Computational Statistics & Data Analysis, 54, 2094–2102.
Wetzels, R., Grasman, R. P. P. P., & Wagenmakers, E. J. (2012). A default Bayesian hypothesis test for ANOVA designs. The American Statistician, 66, 104–111.
Wetzels, R., Matzke, D., Lee, M. D., Rouder, J. N., Iverson, G. J., & Wagenmakers, E. J. (2011). Statistical evidence in experimental psychology: An empirical comparison using 855 t tests. Perspectives on Psychological Science, 6, 291–298.
Wetzels, R., Raaijmakers, J. G. W., Jakab, E., & Wagenmakers, E. J. (2009). How to quantify support for and against the null hypothesis: A flexible WinBUGS implementation of a default Bayesian t test. Psychonomic Bulletin & Review, 16, 752–760.
Wetzels, R., & Wagenmakers, E. J. (2012). A default Bayesian hypothesis test for correlations and partial correlations. Psychonomic Bulletin & Review, 19, 1057–1064.
Yuan, Y., & MacKinnon, D. P. (2009). Bayesian mediation analysis. Psychological Methods, 14, 301–322.
Zellner, A., & Siow, A. (1980). Posterior odds ratios for selected regression hypotheses. In J. M. Bernardo, M. H. DeGroot, D. V. Lindley, & A. F. M. Smith (Eds.), Bayesian statistics (pp. 585–603). Valencia: University Press.
Acknowledgements
This research was supported by an ERC grant from the European Research Council. Conor V. Dolan is supported by the European Research Council (Genetics of Mental Illness; grant number: ERC–230374). Ruud Wetzels is supported by the Dutch national program COMMIT.
Appendixes
Appendix 1. JAGS code
JAGS code for correlation

JAGS code for partial correlation

Appendix 2. Testing the correctness of our JAGS implementation
To assess the correctness of our JAGS implementation, we compared the analytical results for the two-sided Bayes factor against the Savage–Dickey (SD) density ratio results based on the MCMC samples from JAGS. The distribution that fit the posterior samples best (see Note 4) is the nonstandardized t-distribution with the following density:

$$p(x \mid \nu, \mu, \sigma) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi\nu}\,\sigma}\left(1 + \frac{1}{\nu}\left(\frac{x - \mu}{\sigma}\right)^{2}\right)^{-\frac{\nu + 1}{2}},$$

with ν degrees of freedom, location parameter μ, and scale parameter σ. From the samples of the parameter of interest, we can estimate ν, μ, and σ and, thus, the shape of the distribution and its height at the point of interest.
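As a minimal sketch of how the height of a nonstandardized t-distribution at a point of interest could be evaluated, the standard density with ν degrees of freedom, location μ, and scale σ can be coded directly. The function name and the parameter values below are illustrative assumptions and are not taken from BayesMed; the sketch is written in Python rather than R or JAGS.

```python
import math


def nonstandardized_t_pdf(x, nu, mu, sigma):
    """Density of a nonstandardized t-distribution evaluated at x.

    nu: degrees of freedom, mu: location, sigma: scale (hypothetical names).
    """
    if nu <= 0 or sigma <= 0:
        raise ValueError("nu and sigma must be positive")
    z = (x - mu) / sigma
    # Work on the log scale so that large nu does not overflow the gamma terms.
    log_norm = (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
    )
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(z * z / nu))


# Height at the point of interest (here 0) for made-up parameter values.
height_at_zero = nonstandardized_t_pdf(0.0, nu=10.0, mu=0.3, sigma=0.15)
```

In an SD-type calculation, a height obtained this way would serve as the numerator of the density ratio, with the prior height at the same point as the denominator.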
We checked the fit of this distribution and the performance of the SD method in a small simulation study. We considered the following sample sizes: N = 20, 40, 80, or 160. We simulated correlational data by drawing N values for X from a standard normal distribution, and conditional on X, we simulated values for Y according to the following equation:
where the subscript i denotes subject i and τ represents the relation between X and Y. For each of the four sample sizes, we generated 100 data sets, in each of which τ was drawn from a standard uniform distribution.
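The simulation design described above can be sketched as follows. The generating equation is not reproduced in this text, so the sketch assumes the form Y_i = τ X_i + ε_i with standard normal noise ε_i; that noise model, the function names, and the seed are assumptions made for illustration only, and the authors' actual simulation code is in their supplemental materials.

```python
import random


def simulate_correlation_data(n, tau, rng):
    """Draw X ~ N(0, 1) and Y_i = tau * X_i + e_i, with e_i ~ N(0, 1) assumed."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [tau * xi + rng.gauss(0.0, 1.0) for xi in x]
    return x, y


def simulate_study(sample_sizes=(20, 40, 80, 160), n_datasets=100, seed=1):
    """Generate n_datasets data sets per sample size, each with its own tau."""
    rng = random.Random(seed)
    study = {}
    for n in sample_sizes:
        datasets = []
        for _ in range(n_datasets):
            tau = rng.random()  # standard uniform draw on [0, 1)
            x, y = simulate_correlation_data(n, tau, rng)
            datasets.append({"tau": tau, "x": x, "y": y})
        study[n] = datasets
    return study


study = simulate_study()
```

Each entry of `study` holds 100 data sets for one sample size, which could then be passed to both the analytical test and the SD method for comparison.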
Next, we tested the correlation in each data set with both the analytical Bayesian correlation test and the SD method with the nonstandardized t-distribution and compared the results. The results are shown in Fig. 3. The figure shows that the proposed SD method performs well: The Bayes factors of the analytical test and the SD method are similar for all sample sizes and correlations.
Fig. 3. Natural logarithm of the Bayes factors for correlation obtained with analytical calculations (x axis) or obtained with the SD method based on a nonstandardized t-distribution (y axis) for different sample sizes (N). The graphs show fewer points as the samples grow larger, because in these situations, there are more extreme Bayes factors that fall outside the axis limits. We restricted the graphs, since it is most important that the lower Bayes factors lie on the diagonal; it is not important whether a Bayes factor is 2,000 or 3,000, since it is overwhelming evidence in any case.
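The logic of this check, comparing an analytical Bayes factor with an SD estimate computed from posterior samples, can be illustrated on a toy problem where both are available in closed form. The sketch below does not use the JZS correlation test. It assumes a conjugate normal model (data with known unit variance and a standard normal prior on the mean θ), draws samples directly from the known posterior in place of MCMC output, and fits a normal distribution to them, which is one of the four candidate distributions mentioned in the Notes. All names and values are hypothetical.

```python
import math
import random
import statistics


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def analytic_log_bf01(ybar, n):
    """log BF01 for H0: theta = 0 versus H1: theta ~ N(0, 1), y_i ~ N(theta, 1)."""
    # The sample mean is N(0, 1/n) under H0 and N(0, 1 + 1/n) under H1.
    m0 = normal_pdf(ybar, 0.0, math.sqrt(1.0 / n))
    m1 = normal_pdf(ybar, 0.0, math.sqrt(1.0 + 1.0 / n))
    return math.log(m0) - math.log(m1)


def savage_dickey_log_bf01(posterior_samples, prior_height_at_zero):
    """log BF01 = log posterior height at 0 minus log prior height at 0."""
    # Normal fit to the samples; a t or spline fit could be swapped in here.
    mean = statistics.fmean(posterior_samples)
    sd = statistics.stdev(posterior_samples)
    return math.log(normal_pdf(0.0, mean, sd)) - math.log(prior_height_at_zero)


rng = random.Random(2014)
n, ybar = 40, 0.25  # made-up sample size and sample mean

# Conjugate posterior for theta: N(n * ybar / (n + 1), 1 / (n + 1)).
post_mean = n * ybar / (n + 1)
post_sd = math.sqrt(1.0 / (n + 1))
samples = [rng.gauss(post_mean, post_sd) for _ in range(50_000)]

log_bf_sd = savage_dickey_log_bf01(samples, normal_pdf(0.0, 0.0, 1.0))
log_bf_exact = analytic_log_bf01(ybar, n)
```

In this toy model the two quantities agree up to Monte Carlo error, and the agreement tightens as the number of samples grows, in line with Note 2.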
Nuijten, M.B., Wetzels, R., Matzke, D. et al. A default Bayesian hypothesis test for mediation. Behav Res 47, 85–97 (2015). https://doi.org/10.3758/s13428-014-0470-2
DOI: https://doi.org/10.3758/s13428-014-0470-2