## Intro

In this blog post, I give a short primer on the importance of multilevel (aka three-level) meta-analyses and walk you through the R code needed to conduct such analyses. This post is heavily inspired by Mike Cheung's article *A Guide to Conducting a Meta-Analysis with Non-Independent Effect Sizes*.

If you want to dive deeper into multilevel modeling for meta-analyses, I highly recommend that article, his 2014 article, and, if you want to dive into the really deep waters, his book *Meta-Analysis: A Structural Equation Modeling Approach*.

## The Problem: Effect Size Dependency due to Nestedness

Mike Cheung outlines the problem of effect size dependency in the context of meta-analyses (Cheung 2019) as follows:

> When effect sizes are not independent, conclusions based on these conventional procedures can be misleading or even wrong. Traditional approaches, such as averaging the effect sizes and selecting one effect size per study, are usually used to avoid the dependence of the effect sizes. These ad-hoc approaches, however, may lead to missed opportunities to utilize all available data to address the relevant research questions.

To avoid such misleading results or missed opportunities, he suggests:

- **Multilevel meta-analyses** to address nested data. This is the case when multiple effect sizes are reported per study or, more generally, when effect sizes within a cluster are more similar to each other than effect sizes across clusters.
- **Multivariate meta-analyses** to address studies containing effect sizes with correlated sampling errors (which is the case when the same participants are used to calculate multiple effect sizes).

If such dependencies are not addressed, the estimated standard error of the average effect is generally underestimated, which leads to narrower *CI*s, which in turn increases the risk of Type I errors.
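To see why, consider a minimal, hypothetical numerical sketch (the variances and the correlation below are made up for illustration): if two effect sizes come from the same participants, their sampling errors are correlated, and averaging them while treating them as independent understates the variance of the mean.

```
# Hypothetical values: two effect sizes calculated on the same participants
v1 <- 0.04; v2 <- 0.04 # sampling variances of the two effect sizes
r  <- 0.5              # assumed correlation between the sampling errors

# Variance of the simple average of the two effect sizes
var_naive   <- (v1 + v2) / 4                          # wrongly assumes independence
var_correct <- (v1 + v2 + 2 * r * sqrt(v1 * v2)) / 4  # accounts for the correlation

c(se_naive = sqrt(var_naive), se_correct = sqrt(var_correct))
# se_naive ~0.141 vs. se_correct ~0.173: the naive SE is too small
```

The naive standard error is noticeably smaller, which is exactly the mechanism behind the overly narrow confidence intervals described above.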

## The Solution: 3-Level Meta-Analysis

### Benefits of 3-level meta-analysis:

- Results based on a conventional meta-analysis and a three-level meta-analysis are identical only when the level-2 heterogeneity variance is zero.
- Possibility to study the **heterogeneity variances** and their I^{2} counterparts at level 2 and level 3.
- Possibility to investigate how level-2 and level-3 **moderators** explain the heterogeneity using R^{2} at level 2 and level 3.

These additional statistical analyses allow researchers to study the heterogeneity at different levels (see Cheung, 2014).

### Levels

There are three sources of variability that cause effect size estimates to deviate from the true effect size:

- level-1 = sampling error of individual studies.
- level-2 = variability at the "within study" level, i.e., variability due to effects that are nested within a study.
- level-3 = variability at the "between study" level. The true effect size of a study is itself only part of an overarching distribution of true effect sizes, from which the individual true effect size is drawn.
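These three levels correspond to the standard three-level model formulation (following Cheung, 2014). As a sketch in formula form, each observed effect size can be decomposed as

$$\hat{y}_{ij} = \mu + u_{(3)j} + u_{(2)ij} + e_{ij}$$

where, for effect size i in study j, u_{(3)j} is the between-study deviation (level 3) with variance τ^{2}_{(3)}, u_{(2)ij} is the within-study deviation (level 2) with variance τ^{2}_{(2)}, and e_{ij} is the sampling error (level 1) with known variance v_{ij}.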

You can find a detailed explanation about each level, including nice visualizations, here: Doing Meta-Analysis in R.

## How To: 3-Level Meta-Analyses with `metafor`

There are two packages suited to conducting this kind of multilevel meta-analysis: `metaSEM` and `metafor`. Below, I walk you through the R code with `metafor` to keep things short and simple.

Below, we create a data set with 10 effect sizes, which are nested in 5 studies.

```
library(tidyverse) # for tribble() and the pipe
library(metafor)

data <- tribble(
  ~study_name, ~effect_size, ~variance,
  "A",  0.408, .033,
  "A",  0.472, .020,
  "B", -0.198, .020,
  "C",  0.48,  .019,
  "C",  0.33,  .030,
  "C",  0.37,  .027,
  "D",  0.862, .119,
  "E",  0.43,  .033,
  "E",  0.116, .032,
  "E",  0.193, .034,
)
```

### Univariate Meta-Analysis

If we just calculate a univariate random effects meta-analysis, the code looks like this:

```
uni_metafor <- rma.uni(data = data,
                       yi = effect_size,
                       vi = variance,
                       method = "REML", # suggested estimator
                       slab = study_name) # study label
uni_metafor
```

```
Random-Effects Model (k = 10; tau^2 estimator: REML)
tau^2 (estimated amount of total heterogeneity): 0.0344 (SE = 0.0302)
tau (square root of estimated tau^2 value): 0.1854
I^2 (total heterogeneity / total variability): 54.46%
H^2 (total variability / sampling variability): 2.20
Test for Heterogeneity:
Q(df = 9) = 20.7105, p-val = 0.0140
Model Results:
estimate se zval pval ci.lb ci.ub
0.3120 0.0808 3.8624 0.0001 0.1537 0.4704 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

### 3-level meta-analysis

#### Data preparation

Now, before we can use the `rma.mv()` function to calculate the three-level model, we need an ID for each study to indicate level 3, and an effect size ID to indicate level 2. Here is an example of how this could be done using a tidyverse approach:

```
data_3l <- data %>%
  mutate(es_id = row_number()) %>% # adding a unique effect size id
  group_by(study_name) %>% # grouping by study name/authors
  mutate(study_id = group_indices()) # adding a study id
data_3l
```

```
# A tibble: 10 x 5
# Groups: study_name [5]
study_name effect_size variance es_id study_id
<chr> <dbl> <dbl> <int> <int>
1 A 0.408 0.033 1 1
2 A 0.472 0.02 2 1
3 B -0.198 0.02 3 2
4 C 0.48 0.019 4 3
5 C 0.33 0.03 5 3
6 C 0.37 0.027 6 3
7 D 0.862 0.119 7 4
8 E 0.43 0.033 8 5
9 E 0.116 0.032 9 5
10 E 0.193 0.034 10 5
```
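A quick note on the tidyverse code: in recent dplyr versions, `group_indices()` is superseded; `cur_group_id()` produces the same study IDs. A sketch of the equivalent code, assuming dplyr ≥ 1.0:

```
data_3l <- data %>%
  mutate(es_id = row_number()) %>%
  group_by(study_name) %>%
  mutate(study_id = cur_group_id()) %>% # replaces the superseded group_indices()
  ungroup()
```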

Now we can fit the three-level model with `rma.mv()`:

```
three_metafor <- rma.mv(yi = effect_size,
                        V = variance, # sampling variances
                        random = list(~ 1 | es_id, # indicate level 2
                                      ~ 1 | study_id), # indicate level 3
                        tdist = TRUE, # Knapp-Hartung adjustment
                        data = data_3l,
                        method = "REML",
                        slab = study_name)
three_metafor
```

```
Multivariate Meta-Analysis Model (k = 10; method: REML)
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.0000 0.0000 10 no es_id
sigma^2.2 0.0773 0.2781 5 no study_id
Test for Heterogeneity:
Q(df = 9) = 20.7105, p-val = 0.0140
Model Results:
estimate se tval pval ci.lb ci.ub
0.3004 0.1421 2.1138 0.0637 -0.0211 0.6218 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

Now, under nlvls you can see how many effect sizes and studies are included in the 3-level structure: 10 effect sizes at level 2 (sigma^2.1) and 5 studies at level 3 (sigma^2.2). One thing immediately stands out: the summary effect is very similar, yet the confidence intervals are not! The three-level model has wider *CI*s, now including the null. If we did not address the effect size dependency, we would underestimate the heterogeneity and might wrongly conclude that there is an effect when there really isn't one.

### Heterogeneity

One of the great benefits of 3-level models is that we can now look at the heterogeneity at each level. The `mlm.variance.distribution()` function from Mathias Harrer's `dmetar` package nicely visualizes how the total heterogeneity is composed.

`heterogeneity <- mlm.variance.distribution(x = three_metafor)`

`heterogeneity`

```
        % of total variance   I2
Level 1        2.710371e+01  ---
Level 2        1.362725e-09    0
Level 3        7.289629e+01 72.9
Total I2: 72.9%
```

You can see that the total heterogeneity is quite high at 72.9%: 0% can be attributed to level 2 (within-cluster heterogeneity) and 72.9% to level 3 (between-cluster heterogeneity). So, in this fictitious example, all heterogeneity can be attributed to differences between the studies!

We can also inspect confidence intervals for the level-2 and level-3 variance components with `confint()`:

`confint(three_metafor)`

```
estimate ci.lb ci.ub
sigma^2.1 0.0000 0.0000 0.0862
sigma.1 0.0000 0.0000 0.2936
estimate ci.lb ci.ub
sigma^2.2 0.0773 0.0000 0.6817
sigma.2 0.2781 0.0000 0.8256
```

The formula used to calculate Higgins' I^{2} for multilevel models can be found here; in our case, the code looks like this:

```
W <- diag(1/data_3l$variance) # inverse sampling variances as weights
X <- model.matrix(three_metafor) # model matrix
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W # weight matrix adjusted for the model coefficients
I2 <- 100 * sum(three_metafor$sigma2) / (sum(three_metafor$sigma2) + (three_metafor$k-three_metafor$p)/sum(diag(P)))
I2 # This value is the Higgins I2 for the three-level model
```

`[1] 72.89629`
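In formula terms, the code above computes the multilevel I^{2} as

$$I^2 = 100\% \times \frac{\hat{\sigma}^2_{(2)} + \hat{\sigma}^2_{(3)}}{\hat{\sigma}^2_{(2)} + \hat{\sigma}^2_{(3)} + \tilde{v}}, \qquad \tilde{v} = \frac{k - p}{\operatorname{tr}(P)},$$

where \tilde{v} is a "typical" sampling variance, k is the number of effect sizes, and p is the number of model coefficients.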

We can also calculate how much of the total variance can be attributed to within-cluster (level 2) and between-cluster (level 3) heterogeneity separately:

`100 * three_metafor$sigma2 / (sum(three_metafor$sigma2) + (three_metafor$k-three_metafor$p)/sum(diag(P)))`

`[1] 1.362725e-09 7.289629e+01`

### Comparing the fit

Now one question remains: does our 3-level model capture the heterogeneity of our data better than the 2-level model? (For a nice explanation of why every meta-analysis is a multilevel model, see here.)

We can easily check this by removing each level in turn and comparing the fit with the `Full` model:

```
removed_l2 <- rma.mv(yi = effect_size,
                     V = variance,
                     random = list(~ 1 | es_id,
                                   ~ 1 | study_id),
                     tdist = TRUE,
                     data = data_3l,
                     method = "REML",
                     slab = study_name,
                     sigma2 = c(0, NA)) # fix level-2 variance to 0

removed_l3 <- rma.mv(yi = effect_size,
                     V = variance,
                     random = list(~ 1 | es_id,
                                   ~ 1 | study_id),
                     tdist = TRUE,
                     data = data_3l,
                     method = "REML",
                     slab = study_name,
                     sigma2 = c(NA, 0)) # fix level-3 variance to 0
```

#### Comparing fit: `Full` model vs. model without level 2

`anova(three_metafor, removed_l2)`

```
df AIC BIC AICc logLik LRT pval QE
Full 3 4.7638 5.3555 9.5638 0.6181 20.7105
Reduced 2 2.7638 3.1583 4.7638 0.6181 0.0000 1.0000 20.7105
```

#### Comparing fit: `Full` model vs. model without level 3

`anova(three_metafor, removed_l3) `

```
df AIC BIC AICc logLik LRT pval QE
Full 3 4.7638 5.3555 9.5638 0.6181 20.7105
Reduced 2 5.2619 5.6564 7.2619 -0.6310 2.4981 0.1140 20.7105
```

We can see that the `Full` model does not fit better than the model without level 2, as indicated by its higher `AIC` and `BIC` values, which is unsurprising given that the level-2 variance was estimated at zero. The `Full` model does, however, fit better than the model without level 3, so it was a good idea to multi-level up!

## Concluding remarks and further resources

Addressing the nested structure is both important and easy to do (with R)!

Additional code examples on how to conduct such analyses with the `metaSEM` package can be found here.