
Recalculating Effect Sizes for Luo et al. 2020


Intro

This blogpost is inspired by a recent tweet on a meta-analysis with extremely large effect sizes.



Let's see what happens when we recalculate these effect sizes, and what might have led to such large effect size estimates.

But first, let's recreate the original analysis.


Recreating Luo et al. 2020

library(tidyverse)  # for tribble() and mutate()
library(metafor)    # for escalc() and rma()
library(metaviz)    # for viz_forest()

luo_2020 <- tribble(
  ~study,        ~ecbt_mean, ~ecbt_sd, ~ecbt_n, ~cbt_mean, ~cbt_sd, ~cbt_n,
  "Andersson 2013",   10,         1.9,      33,       7,         1.6,     36,
  "Choi 2014",        10.6,       0.2,      56,       14.1,       0.2,    63,
  "Glueckauf 2012",   9.4,        3.5,      7,        2.8,        3.5,     7,
  "Himmelhoch 2013",  6.3,        2.5,     16,        8.7,        2.1,    18, 
  "Kalapatapu 2014",  9.4,        1.5,     50,        10,         1.1,    53,
  "Kay-Lambkin 2009", 11.5,       2.8,     32,        18.3,       2.4,    35,
  "Luxton 2016",      13,         2.2,     62,        18,         2.2,    59,
  "Mohr 2012",        9.25,       0.7,     163,       10.32,      0.8,   162,
  "Nelson 2003",      7.7,        2.9,     14,        1.9,        3.9,    14, 
  "Poppelaars 2016",  4.7,        2.4,     51,        4,          2.4,    50,
  "Sethi 2010",       0.7,        3.4,      9,        11.8,       1.9,    10,
  "Sethi 2013",       5.5,        1.2,     23,        13.6,       1.3,    21,
  "Wagner 2014",      10.6,       2.1,     32,        11.1,       2.1,    30,
  "Wright 2005",      17.5,       2.3,     15,        14.7,       1.9,    15
)


Calculating Hedges' g

luo_2020 <- escalc(data = luo_2020,
              measure = "SMD", 
              m1i = ecbt_mean,
              m2i = cbt_mean,
              sd1i = ecbt_sd, 
              sd2i = cbt_sd,
              n1i = ecbt_n,
              n2i = cbt_n)
luo_2020
              study ecbt_mean ecbt_sd ecbt_n cbt_mean cbt_sd cbt_n       yi 
1    Andersson 2013     10.00     1.9     33     7.00    1.6    36   1.6953 
2         Choi 2014     10.60     0.2     56    14.10    0.2    63 -17.3875 
3    Glueckauf 2012      9.40     3.5      7     2.80    3.5     7   1.7649 
4   Himmelhoch 2013      6.30     2.5     16     8.70    2.1    18  -1.0205 
5   Kalapatapu 2014      9.40     1.5     50    10.00    1.1    53  -0.4548 
6  Kay-Lambkin 2009     11.50     2.8     32    18.30    2.4    35  -2.5866 
7       Luxton 2016     13.00     2.2     62    18.00    2.2    59  -2.2584 
8         Mohr 2012      9.25     0.7    163    10.32    0.8   162  -1.4205 
9       Nelson 2003      7.70     2.9     14     1.90    3.9    14   1.6385 
10  Poppelaars 2016      4.70     2.4     51     4.00    2.4    50   0.2895 
11       Sethi 2010      0.70     3.4      9    11.80    1.9    10  -3.9102 
12       Sethi 2013      5.50     1.2     23    13.60    1.3    21  -6.3705 
13      Wagner 2014     10.60     2.1     32    11.10    2.1    30  -0.2351 
14      Wright 2005     17.50     2.3     15    14.70    1.9    15   1.2914 
       vi 
1  0.0789 
2  1.3040 
3  0.3970 
4  0.1334 
5  0.0399 
6  0.1098 
7  0.0542 
8  0.0154 
9  0.1908 
10 0.0400 
11 0.6135 
12 0.5523 
13 0.0650 
14 0.1611 
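As a sanity check, we can reproduce metafor's yi and vi for a single study by hand. The sketch below uses the standard formulas for Hedges' g (Cohen's d with a pooled SD and the small-sample correction J) and its large-sample sampling variance, applied to the Andersson 2013 row:

```r
# Hand calculation of Hedges' g for Andersson 2013
m1 <- 10;  sd1 <- 1.9; n1 <- 33   # eCBT arm
m2 <- 7;   sd2 <- 1.6; n2 <- 36   # face-to-face CBT arm

sp <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))  # pooled SD
d  <- (m1 - m2) / sp                                               # Cohen's d
J  <- 1 - 3 / (4 * (n1 + n2 - 2) - 1)                              # small-sample correction
g  <- J * d                                                        # Hedges' g
vi <- 1 / n1 + 1 / n2 + g^2 / (2 * (n1 + n2))                      # sampling variance

round(c(g = g, vi = vi), 4)
#>      g     vi
#> 1.6953 0.0789
```

This matches the first row of the escalc() output above.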


Calculating the random effects meta-analysis with a REML estimator

luo_reml <- rma(luo_2020, method = "REML", slab = study)
luo_reml

Random-Effects Model (k = 14; tau^2 estimator: REML)

tau^2 (estimated amount of total heterogeneity): 23.1247 (SE = 9.1728)
tau (square root of estimated tau^2 value):      4.8088
I^2 (total heterogeneity / total variability):   99.66%
H^2 (total variability / sampling variability):  291.91

Test for Heterogeneity:
Q(df = 13) = 567.8960, p-val < .0001

Model Results:

estimate      se     zval    pval    ci.lb   ci.ub 
 -2.0097  1.2925  -1.5549  0.1200  -4.5430  0.5235    

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Recreated forest plot

luo <- viz_forest(luo_reml, 
           study_labels = luo_2020$study,
           annotate_CI = TRUE,
           variant = "rain")
luo


Forest plot from paper


We can see some deviations from the original paper, both in the summary estimate and in the CIs. This is likely due to a different tau² estimator: note that zero is included in the CI when we use the REML estimator, but not when using the DerSimonian-Laird estimator that RevMan applies.
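With metafor, the RevMan-style analysis is simply rma(yi, vi, data = luo_2020, method = "DL"). As a sketch of what DerSimonian-Laird actually computes, here it is by hand, using the (rounded) yi and vi printed above; results will differ slightly from a full-precision run:

```r
# DerSimonian-Laird (DL) random-effects pooling, step by step,
# using the rounded yi/vi from the escalc() output above
yi <- c(1.6953, -17.3875, 1.7649, -1.0205, -0.4548, -2.5866, -2.2584,
        -1.4205, 1.6385, 0.2895, -3.9102, -6.3705, -0.2351, 1.2914)
vi <- c(0.0789, 1.3040, 0.3970, 0.1334, 0.0399, 0.1098, 0.0542,
        0.0154, 0.1908, 0.0400, 0.6135, 0.5523, 0.0650, 0.1611)

w    <- 1 / vi                                      # fixed-effect weights
Q    <- sum(w * (yi - sum(w * yi) / sum(w))^2)      # Cochran's Q
tau2 <- max(0, (Q - (length(yi) - 1)) /
               (sum(w) - sum(w^2) / sum(w)))        # DL estimate of tau^2
wr   <- 1 / (vi + tau2)                             # random-effects weights
est  <- sum(wr * yi) / sum(wr)                      # pooled Hedges' g
se   <- sqrt(1 / sum(wr))
c(estimate = est, ci.lb = est - 1.96 * se, ci.ub = est + 1.96 * se)
```

With these data the DL tau² comes out much smaller than the REML estimate, the pooled effect is more negative, and the 95% CI excludes zero, which is consistent with the CI reported in the paper not covering zero.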


Deep dive

Unfortunately, there are some major issues with the data. I recalculated all reported effect sizes greater than 2 and could not reproduce any of them. Based on the values I coded, the effect goes in the opposite direction, indicating greater efficacy for face-to-face CBT.

Here are the sources for the means and SDs for my recalculated effect sizes:


Choi 2014

Values taken from Table 3 for the 12-week follow-up.


Sethi 2010

Values taken from Table 1.


Sethi 2013


Kay-Lambkin


Luxton


Here I replace the coded values with those extracted from the tables above:

deep_dive_cyp <- tribble(
  ~study,        ~ecbt_mean, ~ecbt_sd, ~ecbt_n, ~cbt_mean, ~cbt_sd, ~cbt_n,  ~ecbt_se, ~cbt_se,
  "Andersson 2013",     10,         1.9,      33,       7,         1.6,    36,  NA ,   NA , 
  "Choi 2014 12 weeks", 13.68,       NA,      56,       14.08,     NA,     63,  1.00,  0.94,
  "Glueckauf 2012",     9.4,        3.5,      7,        2.8,       3.5,     7,   NA , NA ,
  "Himmelhoch 2013",    6.3,        2.5,      16,       8.7,       2.1,    18,   NA , NA ,
  "Kalapatapu 2014",    9.4,        1.5,      50,       10,        1.1,    53,   NA , NA ,
  "Kay-Lambkin 2009",   17.09,      12.14,    32,       13.04,    10.51,   35,   NA , NA ,
  "Luxton 2016",        13.82,      12.02,    45,       11.74,   12.08,    42,   NA , NA ,
  "Mohr 2012",          9.25,       0.7,      163,      10.32,     0.8,   162,   NA , NA ,
  "Nelson 2003",        7.7,        2.9,      14,       1.9,       3.9,    14,   NA , NA ,
  "Poppelaars 2016",    4.7,        2.4,      51,       4,         2.4,    50,   NA , NA ,
  "Sethi 2010",         15.7,       4.2,      9,        7.2,       3.1,    10,   NA , NA ,
  "Sethi 2013",         12.17,      3.12,     23,       7.8,      3.09,    21,   NA , NA ,
  "Wagner 2014",        10.6,       2.1,      32,       11.1,      2.1,    30,   NA , NA ,
  "Wright 2005",        17.5,       2.3,      15,       14.7,      1.9,    15,   NA , NA 
)


Calculating SDs from SEs

We need to calculate some SDs from the reported standard errors (SD = SE × √n).

deep_dive_cyp <- deep_dive_cyp %>%
  mutate(ecbt_sd = ifelse(is.na(ecbt_sd), ecbt_se * sqrt(ecbt_n), ecbt_sd)) %>% 
  mutate(cbt_sd = ifelse(is.na(cbt_sd), cbt_se * sqrt(cbt_n), cbt_sd))

deep_dive_cyp
# A tibble: 14 x 9
   study           ecbt_mean ecbt_sd ecbt_n cbt_mean cbt_sd cbt_n ecbt_se cbt_se
   <chr>               <dbl>   <dbl>  <dbl>    <dbl>  <dbl> <dbl>   <dbl>  <dbl>
 1 Andersson 2013      10       1.9      33      7     1.6     36      NA  NA   
 2 Choi 2014 12 w…     13.7     7.48     56     14.1   7.46    63       1   0.94
 3 Glueckauf 2012       9.4     3.5       7      2.8   3.5      7      NA  NA   
 4 Himmelhoch 2013      6.3     2.5      16      8.7   2.1     18      NA  NA   
 5 Kalapatapu 2014      9.4     1.5      50     10     1.1     53      NA  NA   
 6 Kay-Lambkin 20…     17.1    12.1      32     13.0  10.5     35      NA  NA   
 7 Luxton 2016         13.8    12.0      45     11.7  12.1     42      NA  NA   
 8 Mohr 2012            9.25    0.7     163     10.3   0.8    162      NA  NA   
 9 Nelson 2003          7.7     2.9      14      1.9   3.9     14      NA  NA   
10 Poppelaars 2016      4.7     2.4      51      4     2.4     50      NA  NA   
11 Sethi 2010          15.7     4.2       9      7.2   3.1     10      NA  NA   
12 Sethi 2013          12.2     3.12     23      7.8   3.09    21      NA  NA   
13 Wagner 2014         10.6     2.1      32     11.1   2.1     30      NA  NA   
14 Wright 2005         17.5     2.3      15     14.7   1.9     15      NA  NA   
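A quick hand check of the SE-to-SD conversion for the one study where it applies, Choi 2014:

```r
# SD = SE * sqrt(n), applied to the Choi 2014 standard errors
ecbt_sd <- 1.00 * sqrt(56)   # eCBT arm
cbt_sd  <- 0.94 * sqrt(63)   # face-to-face CBT arm
round(c(ecbt_sd, cbt_sd), 2)
#> [1] 7.48 7.46
```

These match the converted SDs in the tibble above.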


Calculating the effect sizes

deep_dive_cyp <- escalc(data = deep_dive_cyp,
              measure = "SMD", 
              m1i = ecbt_mean,
              m2i = cbt_mean,
              sd1i = ecbt_sd, 
              sd2i = cbt_sd,
              n1i = ecbt_n,
              n2i = cbt_n)


Calculating random effects meta-analysis

deep_dive_cyp_reml <- rma(deep_dive_cyp, method = "REML")
deep_dive_cyp_reml

Random-Effects Model (k = 14; tau^2 estimator: REML)

tau^2 (estimated amount of total heterogeneity): 1.1101 (SE = 0.4786)
tau (square root of estimated tau^2 value):      1.0536
I^2 (total heterogeneity / total variability):   94.78%
H^2 (total variability / sampling variability):  19.14

Test for Heterogeneity:
Q(df = 13) = 243.6762, p-val < .0001

Model Results:

estimate      se    zval    pval    ci.lb   ci.ub 
  0.4814  0.2956  1.6284  0.1034  -0.0980  1.0608    

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Forest plot with recalculated effect sizes

deep_dive <- viz_forest(deep_dive_cyp_reml, 
           study_labels = deep_dive_cyp$study,
           annotate_CI = TRUE,
           variant = "rain")

deep_dive

Now the summary effect size goes in the opposite direction and indicates greater efficacy for face-to-face CBT. Compare this to the forest plot with the data as reported in the original paper:

luo


Summary

Unfortunately, I could not reproduce the very large effect sizes reported in the original study, and I do not understand how they were calculated.

The summary effect changed from g = -2.01 [-4.54, 0.52] to g = 0.48 [-0.10, 1.06] after recoding all effect sizes > 2 (k = 5).