
A very short primer on multi-level meta-analyses

How to conduct a three-level meta-analysis in R.


Intro

In this blog post, I give a very short primer on the importance of multilevel (aka three-level) meta-analyses and walk you through the R code needed to conduct such analyses. This post is heavily inspired by Mike Cheung's article A Guide to Conducting a Meta-Analysis with Non-Independent Effect Sizes.
If you want to dive deeper into multilevel modeling for meta-analyses, I highly recommend this article, his 2014 article, and, if you want to dive into the really deep waters, his book Meta-Analysis: A Structural Equation Modeling Approach.


The Problem: Effect Size Dependency due to Nestedness

Mike Cheung outlines the problem of effect size dependency in the context of meta-analyses (Cheung 2019) as follows:

When effect sizes are not independent, conclusions based on these conventional procedures can be misleading or even wrong. Traditional approaches, such as averaging the effect sizes and selecting one effect size per study, are usually used to avoid the dependence of the effect sizes. These ad-hoc approaches, however, may lead to missed opportunities to utilize all available data to address the relevant research questions.

To avoid such misleading results or missed opportunities, he suggests:

  1. Multilevel meta-analyses to address nested data. This is the case when multiple effect sizes are reported per study or, more generally, when effect sizes within a cluster are more similar to each other than effect sizes across clusters.

  2. Multivariate meta-analyses to address studies containing effect sizes with correlated sampling errors (which is the case when the same participants are used for calculating multiple effect sizes).

If such dependencies are not addressed, the standard error of the average effect is generally underestimated, which leads to narrower CIs, which in turn inflates the risk of Type I errors.
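To see why, consider the simplified fixed-effect weighting scheme, where the pooled standard error is one over the square root of the summed inverse variances. Treating dependent effect sizes as independent is, in effect, counting the same information twice, which yields spurious precision. A minimal base-R sketch with invented numbers:

```r
# four effect size variances from (hypothetical) independent studies
vi <- c(0.04, 0.04, 0.04, 0.04)

# fixed-effect pooled standard error: 1 / sqrt(sum of inverse variances)
se_indep <- sqrt(1 / sum(1 / vi))        # 4 independent effects -> 0.1

# treating each effect as if it appeared twice (dependent duplicates)
se_dup <- sqrt(1 / sum(1 / rep(vi, 2)))  # 8 "effects" -> ~0.071

# the duplicated analysis claims ~29% more precision than the data warrant
se_indep
se_dup
```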


The Solution: 3-Level Meta-Analysis

Benefits of 3-level meta-analysis:

  1. Results based on a conventional meta-analysis and a three-level meta-analysis are identical only when the level-2 heterogeneity variance is zero.

  2. Possibility to study the level-2 and level-3 heterogeneity variances and their I2 counterparts.

  3. Possibility to investigate how level-2 and level-3 moderators explain the heterogeneity, using R2 at level-2 and level-3. These additional statistical analyses allow researchers to study the heterogeneity at different levels (see Cheung, 2014).

Levels

There are three sources of variability that cause effect size estimates to deviate from the true effect size:

  • level-1 = sampling error of individual studies.
  • level-2 = variability at the “within study” level. This is the variability due to effects that are nested within a study.
  • level-3 = variability at the “between study” level. The true effect size of a study itself is only part of an overarching distribution of true effect sizes, from which the individual true effect size is drawn.
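
Putting the three levels together, the underlying model can be written compactly (in one common notation, loosely following Cheung, 2014) as:

  y_ij = mu + u_(3)j + u_(2)ij + e_ij

where y_ij is the i-th effect size in study j, e_ij is the level-1 sampling error with known variance v_ij, u_(2)ij ~ N(0, tau²_(2)) is the level-2 within-study deviation, and u_(3)j ~ N(0, tau²_(3)) is the level-3 between-study deviation.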

You can find a detailed explanation about each level, including nice visualizations, here: Doing Meta-Analysis in R.


How To: 3-Level Meta-Analyses with metafor

There are two packages suited to conducting this kind of multilevel meta-analysis: metaSEM and metafor. Below, I walk you through the R code with metafor to keep things short and simple.

Below, we create a data set with 10 effect sizes, which are nested in 5 studies.

library(tidyverse) # for tribble() and the data wrangling below
library(metafor)   # for rma.uni() and rma.mv()

data <- tribble(
  ~study_name, ~effect_size, ~variance,
  "A",                0.408,      .033,
  "A",                0.472,      .020,
  "B",               -0.198,      .020,
  "C",                0.48,       .019,
  "C",                0.33,       .030,
  "C",                0.37,       .027,
  "D",                0.862,      .119,
  "E",                0.43,       .033,
  "E",                0.116,      .032,
  "E",                0.193,      .034
)


Univariate Meta-Analysis

If we just calculate a univariate random effects meta-analysis, the code looks like this:

uni_metafor <- rma.uni(data = data, 
                       yi = effect_size,
                       vi = variance,      
                       method = "REML",   # suggested estimator
                       slab = study_name) # study label

uni_metafor

Random-Effects Model (k = 10; tau^2 estimator: REML)

tau^2 (estimated amount of total heterogeneity): 0.0344 (SE = 0.0302)
tau (square root of estimated tau^2 value):      0.1854
I^2 (total heterogeneity / total variability):   54.46%
H^2 (total variability / sampling variability):  2.20

Test for Heterogeneity:
Q(df = 9) = 20.7105, p-val = 0.0140

Model Results:

estimate      se    zval    pval   ci.lb   ci.ub 
  0.3120  0.0808  3.8624  0.0001  0.1537  0.4704  *** 

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


3-Level Meta-Analysis

Data preparation

Now, before we can use the rma.mv() function to fit the three-level model, we need an ID for each study to indicate level 3, and an effect size ID to indicate level 2. Here is an example of how this can be done using a tidyverse approach:

data_3l <- data %>% 
  mutate(es_id = row_number()) %>%     # unique effect size id (level 2)
  group_by(study_name) %>%             # grouping by study name/authors
  mutate(study_id = cur_group_id())    # study id (level 3)
data_3l
# A tibble: 10 x 5
# Groups:   study_name [5]
   study_name effect_size variance es_id study_id
   <chr>            <dbl>    <dbl> <int>    <int>
 1 A                0.408    0.033     1        1
 2 A                0.472    0.02      2        1
 3 B               -0.198    0.02      3        2
 4 C                0.48     0.019     4        3
 5 C                0.33     0.03      5        3
 6 C                0.37     0.027     6        3
 7 D                0.862    0.119     7        4
 8 E                0.43     0.033     8        5
 9 E                0.116    0.032     9        5
10 E                0.193    0.034    10        5

Fitting the model

With the effect size and study IDs in place, we can fit the three-level model:

three_metafor <- rma.mv(yi = effect_size, 
                        V = variance,                  # sampling variances
                        random = list(~ 1 | es_id,     # indicate level 2
                                      ~ 1 | study_id), # indicate level 3
                        tdist = TRUE,                  # t-distribution tests (Knapp-Hartung-type adjustment)
                        data = data_3l,
                        method = "REML",
                        slab = study_name)

three_metafor

Multivariate Meta-Analysis Model (k = 10; method: REML)

Variance Components:

            estim    sqrt  nlvls  fixed    factor 
sigma^2.1  0.0000  0.0000     10     no     es_id 
sigma^2.2  0.0773  0.2781      5     no  study_id 

Test for Heterogeneity:
Q(df = 9) = 20.7105, p-val = 0.0140

Model Results:

estimate      se    tval    pval    ci.lb   ci.ub 
  0.3004  0.1421  2.1138  0.0637  -0.0211  0.6218  . 

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Now, you can see under nlvls how many effect sizes and studies are included in the 3-level structure: 10 effect sizes at level 2 (sigma^2.1) and 5 studies at level 3 (sigma^2.2). One thing immediately stands out: the summary effect is very similar, yet the confidence intervals are not! The three-level model has wider CIs, now including the null. If we did not address the effect size dependency, we would underestimate the uncertainty and might wrongly conclude there is an effect when there really isn't one.
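The comparison can also be pulled out programmatically from the two fitted objects (this assumes uni_metafor and three_metafor from above are in the workspace):

```r
# side-by-side summary estimates and CIs of the two models fitted above
comparison <- data.frame(
  model = c("two-level (rma.uni)", "three-level (rma.mv)"),
  est   = c(as.numeric(uni_metafor$b), as.numeric(three_metafor$b)),
  ci.lb = c(uni_metafor$ci.lb, three_metafor$ci.lb),
  ci.ub = c(uni_metafor$ci.ub, three_metafor$ci.ub)
)
comparison  # similar estimates, clearly different CI widths
```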


Heterogeneity

One of the great benefits of 3-level models is that we can now look at the heterogeneity at each level. The mlm.variance.distribution() function from Mathias Harrer's dmetar package nicely shows how the total heterogeneity is distributed across the levels.

heterogeneity <- mlm.variance.distribution(x = three_metafor)
heterogeneity

        % of total variance   I2
Level 1        2.710371e+01  ---
Level 2        1.362725e-09    0
Level 3        7.289629e+01 72.9
Total I2: 72.9% 

You can see that the total heterogeneity is quite high at 72.9%, with 0% attributable to level 2 (within-cluster heterogeneity) and 72.9% to level 3 (between-cluster heterogeneity). So, in this fictitious example, all heterogeneity can be attributed to differences between the studies!


We can also compute confidence intervals for the two variance components with confint():

confint(three_metafor)

          estimate  ci.lb  ci.ub 
sigma^2.1   0.0000 0.0000 0.0862 
sigma.1     0.0000 0.0000 0.2936 

          estimate  ci.lb  ci.ub 
sigma^2.2   0.0773 0.0000 0.6817 
sigma.2     0.2781 0.0000 0.8256 

The formula used to calculate Higgins' I2 for multilevel models can be found here; in our case, the code looks like this:

W <- diag(1/data_3l$variance)    # diagonal matrix of inverse sampling variances
X <- model.matrix(three_metafor) # model (design) matrix
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W

I2 <- 100 * sum(three_metafor$sigma2) / (sum(three_metafor$sigma2) + (three_metafor$k - three_metafor$p)/sum(diag(P)))

I2 # Higgins' I2 for the three-level model
[1] 72.89629

Similarly, we can calculate how much of the total variance can be attributed to within- and between-cluster heterogeneity separately:

100 * three_metafor$sigma2 / (sum(three_metafor$sigma2) + (three_metafor$k-three_metafor$p)/sum(diag(P)))
[1] 1.362725e-09 7.289629e+01
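Since most of the heterogeneity sits at level 3, a natural next step would be a moderator analysis (benefit 3 above). As a sketch, assuming we had a study-level moderator (pub_year below is an invented column purely for illustration), we could pass it to the mods argument and compute Cheung's (2014) pseudo-R2 as the proportional reduction in the level-3 variance:

```r
# invented study-level moderator, constant within each study
data_mod <- data_3l %>% 
  mutate(pub_year = c(2001, 2001, 2005, 2010, 2010, 2010, 2012, 2015, 2015, 2015))

three_mod <- rma.mv(yi = effect_size, 
                    V = variance,
                    mods = ~ pub_year,             # hypothetical level-3 moderator
                    random = list(~ 1 | es_id,
                                  ~ 1 | study_id),
                    data = data_mod,
                    method = "REML")

# pseudo-R2 at level 3: proportional reduction in sigma^2.2 (study-level variance)
R2_level3 <- max(0, (three_metafor$sigma2[2] - three_mod$sigma2[2]) / three_metafor$sigma2[2])
R2_level3
```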


Comparing the fit

Now one question remains: does our 3-level model capture the heterogeneity of our data better than the 2-level model? (For a nice explanation of why every meta-analysis is in fact a multilevel model, see here.)

We can easily check this by removing each level and comparing the fit with the Full model:

removed_l2 <- rma.mv(yi = effect_size, 
                     V = variance,                 
                     random = list(~ 1 | es_id,     
                                   ~ 1 | study_id), 
                     tdist = TRUE,                  
                     data = data_3l,
                     method = "REML",
                     slab = study_name,
                     sigma2 = c(0, NA)) # fix level-2 variance at zero

removed_l3 <- rma.mv(yi = effect_size, 
                     V = variance,                 
                     random = list(~ 1 | es_id,     
                                   ~ 1 | study_id), 
                     tdist = TRUE,                  
                     data = data_3l,
                     method = "REML",
                     slab = study_name,
                     sigma2 = c(NA, 0)) # fix level-3 variance at zero


Comparing fit: Full model vs model without level-2

anova(three_metafor, removed_l2)

        df    AIC    BIC   AICc logLik    LRT   pval      QE 
Full     3 4.7638 5.3555 9.5638 0.6181               20.7105 
Reduced  2 2.7638 3.1583 4.7638 0.6181 0.0000 1.0000 20.7105 


Comparing fit: Full model vs model without level-3

anova(three_metafor, removed_l3) 

        df    AIC    BIC   AICc  logLik    LRT   pval      QE 
Full     3 4.7638 5.3555 9.5638  0.6181               20.7105 
Reduced  2 5.2619 5.6564 7.2619 -0.6310 2.4981 0.1140 20.7105 

Comparing the full model to the model without level 2, the full model does not fit better, as indicated by its higher AIC and BIC values. This is not surprising, since the level-2 variance was estimated to be essentially zero. Comparing the full model to the model without level 3, however, the full model does fit better (lower AIC and BIC), so it was a good idea to multi-level up!


Concluding remarks and further resources

Addressing the nested structure is both important and easy to do (with R)!

Additional code examples on how to conduct such analyses with the metaSEM package can be found here.
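For completeness, here is a minimal sketch of the equivalent model in metaSEM, assuming the data_3l data frame from above; meta3() (called meta3L() in newer versions of the package) takes the effect sizes, their sampling variances, and a cluster indicator:

```r
library(metaSEM)

# three-level model: level 2 = effect sizes, level 3 = cluster (study)
fit_metasem <- meta3(y = effect_size,
                     v = variance,
                     cluster = study_id,
                     data = data_3l)

summary(fit_metasem)  # Tau2_2 and Tau2_3 correspond to sigma^2.1 and sigma^2.2 above
```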