In this blog post, I give a very short primer on the importance of multilevel (aka 3-level) meta-analyses and walk you through the R code needed to conduct such analyses. This post is heavily inspired by Mike Cheung's paper A Guide to Conducting a Meta-Analysis with Non-Independent Effect Sizes.
If you want to dive deeper into multilevel modeling for meta-analyses, I highly recommend that article, his 2014 article, and, if you want to dive into the really deep waters, his book Meta-Analysis: A Structural Equation Modeling Approach.
The Problem: Effect Size Dependency due to Nestedness
Mike Cheung outlines the problem of effect size dependency in the context of meta-analyses (Cheung 2019) as follows:
When effect sizes are not independent, conclusions based on these conventional procedures can be misleading or even wrong. Traditional approaches, such as averaging the effect sizes and selecting one effect size per study, are usually used to avoid the dependence of the effect sizes. These ad-hoc approaches, however, may lead to missed opportunities to utilize all available data to address the relevant research questions.
To avoid such misleading results or missed opportunities, he suggests:
Multilevel meta-analyses to address nested data. This is the case when multiple effect sizes are reported per study or, more generally, when effect sizes within a cluster are more similar to each other than effect sizes across clusters.
Multivariate meta-analyses to address studies containing effect sizes with correlated sampling errors (which is the case when the same participants are used to calculate multiple effect sizes).
If such dependencies are not addressed, the standard error of the average effect is generally underestimated, which leads to narrower confidence intervals and, in turn, an inflated risk of Type I errors.
The Solution: 3-Level Meta-Analysis
Benefits of a 3-level meta-analysis:
- Results based on a conventional meta-analysis and a three-level meta-analysis are identical only when the level-2 heterogeneity variance is zero.
- The possibility to study the level-2 and level-3 heterogeneity variances and their I2 counterparts.
- The possibility to investigate how level-2 and level-3 moderators explain the heterogeneity using R2 at level 2 and level 3. These additional statistical analyses allow researchers to study the heterogeneity at different levels (see Cheung, 2014).
There are three sources of variability that cause effect size estimates to deviate from the true effect size:
- level-1 = sampling error of individual studies.
- level-2 = variability at the “within study” level. This is the variability due to effects that are nested within a study.
- level-3 = variability at the “between study” level. The true effect size of a study itself is only part of an overarching distribution of true effect sizes, from which the individual true effect size is drawn.
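The three levels above can also be written as one model equation. The following is a standard formulation of the three-level model (the notation here is mine, not from the sources cited in this post):

```latex
y_{ij} = \mu + u_{(3)j} + u_{(2)ij} + e_{ij}, \qquad
u_{(3)j} \sim \mathcal{N}\big(0, \tau^2_{(3)}\big), \quad
u_{(2)ij} \sim \mathcal{N}\big(0, \tau^2_{(2)}\big), \quad
e_{ij} \sim \mathcal{N}\big(0, v_{ij}\big)
```

Here, y_ij is the i-th observed effect size in study j, mu is the overall average true effect, tau^2_(3) and tau^2_(2) are the between-study (level-3) and within-study (level-2) heterogeneity variances, and v_ij is the known sampling variance (level 1).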
You can find a detailed explanation about each level, including nice visualizations, here: Doing Meta-Analysis in R.
How To: 3-Level Meta-Analyses with metafor
Below, we create a data set with 10 effect sizes, which are nested in 5 studies.
library(tidyverse)  # for tribble() and the pipe
library(metafor)    # for rma.uni() and rma.mv()

data <- tribble(
  ~study_name, ~effect_size, ~variance,
  "A",  0.408, 0.033,
  "A",  0.472, 0.020,
  "B", -0.198, 0.020,
  "C",  0.480, 0.019,
  "C",  0.330, 0.030,
  "C",  0.370, 0.027,
  "D",  0.862, 0.119,
  "E",  0.430, 0.033,
  "E",  0.116, 0.032,
  "E",  0.193, 0.034
)
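For contrast, the ad-hoc "averaging" approach Cheung warns about could be sketched like this (the data set is recreated inside the snippet so it runs on its own; the simple variance formula assumes independent effects within a study, which is exactly the assumption that is violated):

```r
# Sketch of the ad-hoc approach: collapse the multiple effect sizes of
# each study into a single (naive) average.
library(dplyr)
library(tibble)

data <- tribble(
  ~study_name, ~effect_size, ~variance,
  "A",  0.408, 0.033,
  "A",  0.472, 0.020,
  "B", -0.198, 0.020,
  "C",  0.480, 0.019,
  "C",  0.330, 0.030,
  "C",  0.370, 0.027,
  "D",  0.862, 0.119,
  "E",  0.430, 0.033,
  "E",  0.116, 0.032,
  "E",  0.193, 0.034
)

data_avg <- data %>%
  group_by(study_name) %>%
  summarise(effect_size = mean(effect_size),
            variance = mean(variance) / n())  # naive: assumes independence

data_avg  # 5 rows instead of 10 -- information is thrown away
```

This is exactly the "missed opportunity" from the quote above: we go from 10 effect sizes down to 5 and can no longer study the within-study variability at all.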
If we just calculate a univariate random effects meta-analysis, the code looks like this:
uni_metafor <- rma.uni(data = data,
                       yi = effect_size,
                       vi = variance,
                       method = "REML",    # suggested estimator
                       slab = study_name)  # study label
uni_metafor
Random-Effects Model (k = 10; tau^2 estimator: REML)

tau^2 (estimated amount of total heterogeneity): 0.0344 (SE = 0.0302)
tau (square root of estimated tau^2 value):      0.1854
I^2 (total heterogeneity / total variability):   54.46%
H^2 (total variability / sampling variability):  2.20

Test for Heterogeneity:
Q(df = 9) = 20.7105, p-val = 0.0140

Model Results:

estimate      se    zval    pval   ci.lb   ci.ub
  0.3120  0.0808  3.8624  0.0001  0.1537  0.4704  ***

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now, before we can use the rma.mv() function to calculate the three-level model, we need an ID for each study to indicate level 3 and an effect size ID to indicate level 2. Here is an example of how this could be done using a tidyverse approach:
data_3l <- data %>%
  mutate(es_id = row_number()) %>%     # unique effect size id
  group_by(study_name) %>%             # group by study name/authors
  mutate(study_id = group_indices())   # add study id; cur_group_id() is the modern dplyr equivalent
data_3l
# A tibble: 10 x 5
# Groups:   study_name [5]
   study_name effect_size variance es_id study_id
   <chr>            <dbl>    <dbl> <int>    <int>
 1 A                0.408    0.033     1        1
 2 A                0.472    0.02      2        1
 3 B               -0.198    0.02      3        2
 4 C                0.48     0.019     4        3
 5 C                0.33     0.03      5        3
 6 C                0.37     0.027     6        3
 7 D                0.862    0.119     7        4
 8 E                0.43     0.033     8        5
 9 E                0.116    0.032     9        5
10 E                0.193    0.034    10        5
three_metafor <- rma.mv(yi = effect_size,
                        V = variance,                   # sampling variances
                        random = list(~ 1 | es_id,      # indicate level 2
                                      ~ 1 | study_id),  # indicate level 3
                        tdist = TRUE,                   # Knapp-Hartung adjustment
                        data = data_3l,
                        method = "REML",
                        slab = study_name)
three_metafor
Multivariate Meta-Analysis Model (k = 10; method: REML)

Variance Components:

            estim    sqrt  nlvls  fixed    factor
sigma^2.1  0.0000  0.0000     10     no     es_id
sigma^2.2  0.0773  0.2781      5     no  study_id

Test for Heterogeneity:
Q(df = 9) = 20.7105, p-val = 0.0140

Model Results:

estimate      se    tval    pval    ci.lb   ci.ub
  0.3004  0.1421  2.1138  0.0637  -0.0211  0.6218  .

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now, you can see under nlvls how many effect sizes and studies are included in the 3-level structure: 10 effect sizes at level 2 (sigma^2.1) and 5 studies at level 3 (sigma^2.2). One thing immediately stands out: the summary effect is very similar, yet the confidence intervals are not! The three-level model has a wider CI, which now includes the null. If we did not address the effect size dependency, we would underestimate the uncertainty and might wrongly conclude that there is an effect when there really isn't one.
One of the great benefits of 3-level models is that we can now look at the heterogeneity at each level. The mlm.variance.distribution() function from Mathias Harrer's dmetar package nicely shows how the total heterogeneity is distributed across the levels.
heterogeneity <- mlm.variance.distribution(x = three_metafor)
        % of total variance    I2
Level 1        2.710371e+01   ---
Level 2        1.362725e-09     0
Level 3        7.289629e+01  72.9

Total I2: 72.9%
You can see that the total heterogeneity is quite high at 72.9%, with 0% attributable to level 2 (within-cluster heterogeneity) and 72.9% to level 3 (between-cluster heterogeneity). So, in this fictitious example, all heterogeneity can be attributed to differences between the studies!
As a side note, confidence intervals for the variance components themselves can be obtained with confint(three_metafor):

           estimate   ci.lb   ci.ub
sigma^2.1    0.0000  0.0000  0.0862
sigma.1      0.0000  0.0000  0.2936

           estimate   ci.lb   ci.ub
sigma^2.2    0.0773  0.0000  0.6817
sigma.2      0.2781  0.0000  0.8256

The formula used to calculate Higgins' I2 can be found here; in our case, the code looks like this:
W <- diag(1 / data_3l$variance)
X <- model.matrix(three_metafor)
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
I2 <- 100 * sum(three_metafor$sigma2) /
  (sum(three_metafor$sigma2) + (three_metafor$k - three_metafor$p) / sum(diag(P)))
I2  # Higgins' I2 for the three-level model
Calculating how much of the total variance can be attributed to between- and within-cluster heterogeneity separately:
100 * three_metafor$sigma2 / (sum(three_metafor$sigma2) + (three_metafor$k-three_metafor$p)/sum(diag(P)))
[1] 1.362725e-09 7.289629e+01
Comparing the fit
Now one question remains: does our 3-level model capture the heterogeneity of our data better than the 2-level model (for a nice explanation of why every meta-analysis is a multilevel model, see here)?
We can easily check this by removing each level in turn and comparing the model fits with the anova() function:
removed_l2 <- rma.mv(yi = effect_size,
                     V = variance,
                     random = list(~ 1 | es_id,
                                   ~ 1 | study_id),
                     tdist = TRUE,
                     data = data_3l,
                     method = "REML",
                     slab = study_name,
                     sigma2 = c(0, NA))  # remove level-2 variance

removed_l3 <- rma.mv(yi = effect_size,
                     V = variance,
                     random = list(~ 1 | es_id,
                                   ~ 1 | study_id),
                     tdist = TRUE,
                     data = data_3l,
                     method = "REML",
                     slab = study_name,
                     sigma2 = c(NA, 0))  # remove level-3 variance
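The comparison tables below were presumably produced with metafor's anova() method. A self-contained sketch (recreating the data so the chunk runs on its own, and wrapping the repeated rma.mv() call in a small helper) could look like this:

```r
library(metafor)
library(dplyr)
library(tibble)

# Recreate the 3-level data set from above so this chunk is standalone.
data_3l <- tribble(
  ~study_name, ~effect_size, ~variance,
  "A",  0.408, 0.033,
  "A",  0.472, 0.020,
  "B", -0.198, 0.020,
  "C",  0.480, 0.019,
  "C",  0.330, 0.030,
  "C",  0.370, 0.027,
  "D",  0.862, 0.119,
  "E",  0.430, 0.033,
  "E",  0.116, 0.032,
  "E",  0.193, 0.034
) %>%
  mutate(es_id = row_number()) %>%
  group_by(study_name) %>%
  mutate(study_id = cur_group_id()) %>%
  ungroup()

# Helper: fit the 3-level model, optionally fixing variance components
# to zero via the sigma2 argument (NA = estimate the component).
fit_model <- function(sigma2 = c(NA, NA)) {
  rma.mv(yi = effect_size, V = variance,
         random = list(~ 1 | es_id, ~ 1 | study_id),
         tdist = TRUE, data = data_3l, method = "REML",
         sigma2 = sigma2)
}

full       <- fit_model()
removed_l2 <- fit_model(sigma2 = c(0, NA))  # level-2 variance fixed to 0
removed_l3 <- fit_model(sigma2 = c(NA, 0))  # level-3 variance fixed to 0

anova(full, removed_l2)  # full vs. model without level 2
anova(full, removed_l3)  # full vs. model without level 3
```

Note that such REML-based comparisons are valid here because the fixed-effects part is identical across the models.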
Full model vs model without level-2
        df    AIC    BIC   AICc  logLik     LRT    pval       QE
Full     3 4.7638 5.3555 9.5638  0.6181                  20.7105
Reduced  2 2.7638 3.1583 4.7638  0.6181  0.0000  1.0000  20.7105
Full model vs model without level-3
        df    AIC    BIC   AICc  logLik     LRT    pval       QE
Full     3 4.7638 5.3555 9.5638  0.6181                  20.7105
Reduced  2 5.2619 5.6564 7.2619 -0.6310  2.4981  0.1140  20.7105
We can see that, first, the full model does not fit better than the model without level 2, as indicated by its higher AIC and BIC values. But the full model does fit better than the model without level 3 (lower AIC and BIC), so it was a good idea to multi-level up!
Concluding remarks and further resources
Addressing the nested structure is both important and easy to do (with R)!
Additional code examples on how to conduct such analyses with the metaSEM package can be found here.
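To give a flavor, a minimal sketch of the same model in metaSEM might look like this. It assumes the package's meta3() function (renamed meta3L() in recent versions) with arguments y, v, and cluster; treat the exact interface as an assumption and check the package documentation:

```r
# Hypothetical sketch: the three-level model from this post in metaSEM.
# meta3() takes the effect sizes (y), sampling variances (v), and a
# cluster indicator marking level 3.
library(metaSEM)
library(dplyr)
library(tibble)

data_3l <- tribble(
  ~study_name, ~effect_size, ~variance,
  "A",  0.408, 0.033,
  "A",  0.472, 0.020,
  "B", -0.198, 0.020,
  "C",  0.480, 0.019,
  "C",  0.330, 0.030,
  "C",  0.370, 0.027,
  "D",  0.862, 0.119,
  "E",  0.430, 0.033,
  "E",  0.116, 0.032,
  "E",  0.193, 0.034
) %>%
  group_by(study_name) %>%
  mutate(study_id = cur_group_id()) %>%
  ungroup()

fit_sem <- meta3(y = effect_size, v = variance,
                 cluster = study_id, data = data_3l)
summary(fit_sem)  # Tau2_2 / Tau2_3 play the role of sigma^2.1 / sigma^2.2
```

Note that metaSEM uses maximum likelihood rather than REML by default, so the variance component estimates may differ slightly from the metafor results above.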