Using a delay-adjusted case fatality ratio to estimate under-reporting

Status: in-progress | First online: 22-03-2020 | Last update: 01-04-2020

This study has not yet been peer reviewed.

Aim

To estimate the percentage of symptomatic COVID-19 cases reported in different countries, using case fatality ratio estimates based on data from the ECDC and correcting for delays between confirmation and death.

Methods Summary

  • In real time, dividing deaths-to-date by cases-to-date leads to a biased estimate of the case fatality ratio (CFR), because this calculation accounts neither for the delay from confirmation of a case to death nor for under-reporting of cases.

  • Using the distribution of the delay from hospitalisation-to-death for cases that are fatal, we can estimate how many cases so far are expected to have known outcomes (i.e. death or recovery), and hence adjust the naive estimates of CFR to account for these delays.

  • The adjusted CFR does not account for under-reporting. However, the best available estimates of CFR (adjusting or controlling for under-reporting) are in the 1% - 1.5% range. We assume a baseline CFR, taken from a large study in China, of 1.38% (95% crI: 1.23–1.53%)[1]. If a country has an adjusted CFR that is higher (e.g. 20%), it suggests that only a fraction of cases have been reported (in this case, approximately \(\frac{1.38\%}{20\%} = 6.9\%\) of cases).

Current estimates for percentage of symptomatic cases reported for countries with greater than ten deaths

Figure

Figure 1: Estimates of the proportion of symptomatic cases reported in different countries, calculated using cCFR estimates. Blue shading shows the 95% confidence interval (2.5% - 97.5%). Note that there is a mean delay of 13 days between confirmation and death, so these estimates reflect the percentage of cases being reported as of around two weeks ago.

Table

Country | Percentage of cases reported (95% CI) | Total cases | Total deaths
Albania | 9.1% (5.2% - 18%) | 243 | 15
Algeria | 6.8% (4.6% - 11%) | 584 | 35
Andorra | 15% (7.8% - 31%) | 376 | 12
Argentina | 19% (11% - 32%) | 966 | 24
Australia | 100% (69% - 100%) | 4707 | 20
Austria | 42% (32% - 56%) | 10182 | 128
Belgium | 7.4% (6.2% - 8.8%) | 12775 | 705
Bosnia and Herzegovina | 13% (7.1% - 28%) | 413 | 12
Brazil | 11% (8.9% - 14%) | 5717 | 201
Burkina Faso | 8.7% (4.7% - 18%) | 246 | 12
Canada | 31% (23% - 42%) | 8536 | 96
Chile | 89% (46% - 100%) | 2738 | 12
China | 33% (29% - 38%) | 82295 | 3310
Colombia | 24% (13% - 46%) | 906 | 16
Czech Republic | 53% (33% - 86%) | 3308 | 31
Denmark | 20% (15% - 27%) | 2860 | 90
Dominican Republic | 7.1% (5% - 10%) | 1109 | 51
Ecuador | 14% (9.9% - 19%) | 2302 | 79
Egypt | 9.7% (6.6% - 15%) | 656 | 41
Finland | 48% (27% - 92%) | 1384 | 17
France | 6.9% (6% - 7.9%) | 52128 | 3523
Germany | 47% (39% - 56%) | 67366 | 732
Greece | 16% (11% - 24%) | 1314 | 49
Hungary | 15% (8.6% - 29%) | 492 | 16
India | 17% (11% - 27%) | 1397 | 35
Indonesia | 5.3% (4.1% - 6.9%) | 1528 | 136
Iran | 9.9% (8.5% - 11%) | 44606 | 2898
Iraq | 7% (4.9% - 10%) | 694 | 50
Ireland | 20% (15% - 29%) | 3235 | 71
Israel | 100% (60% - 100%) | 4916 | 20
Italy | 5.8% (5.1% - 6.5%) | 105792 | 12430
Japan | 26% (18% - 38%) | 1953 | 56
Lebanon | 26% (14% - 55%) | 463 | 12
Luxembourg | 48% (29% - 84%) | 2178 | 23
Malaysia | 46% (30% - 72%) | 2626 | 37
Mexico | 17% (11% - 27%) | 1215 | 29
Morocco | 6% (4.1% - 9.2%) | 617 | 36
Netherlands | 5.9% (5% - 6.9%) | 12595 | 1039
Norway | 100% (63% - 100%) | 4447 | 28
Pakistan | 37% (22% - 62%) | 2039 | 26
Panama | 17% (11% - 28%) | 1181 | 30
Peru | 16% (10% - 26%) | 1065 | 30
Philippines | 7.4% (5.5% - 10%) | 2084 | 88
Poland | 30% (19% - 48%) | 2311 | 33
Portugal | 18% (14% - 23%) | 7443 | 160
Romania | 12% (8.9% - 17%) | 2245 | 69
Russia | 40% (23% - 76%) | 2337 | 17
San Marino | 7.7% (4.9% - 13%) | 230 | 26
Serbia | 14% (8.8% - 25%) | 900 | 23
Slovenia | 40% (21% - 83%) | 814 | 13
South Korea | 69% (53% - 90%) | 9786 | 163
Spain | 5.4% (4.7% - 6.1%) | 94417 | 8189
Sweden | 14% (11% - 18%) | 4435 | 180
Switzerland | 24% (20% - 30%) | 16108 | 373
Turkey | 15% (12% - 19%) | 15422 | 214
Ukraine | 11% (5.9% - 22%) | 549 | 13
United Kingdom | 5.4% (4.6% - 6.2%) | 25150 | 1789
United States of America | 16% (14% - 19%) | 189618 | 4079

Table 1: Estimates of the proportion of symptomatic cases reported in different countries, calculated using cCFR estimates based on case and death time-series data from the ECDC. Total cases and total deaths in each country are also shown. Confidence intervals were calculated using an exact binomial test at the 95% confidence level.
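The exact binomial interval mentioned in the caption can be illustrated with a short sketch. It assumes, as an illustration only, that the interval is computed for the cCFR by treating deaths as binomial successes out of the cases with known outcomes; the function names and the example counts are hypothetical and are not taken from the analysis code.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                    + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)

def bisect_root(func, lo=0.0, hi=1.0, iters=100):
    """Root of a function that increases from negative to positive on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def clopper_pearson(k, n, level=0.95):
    """Exact (Clopper-Pearson) interval for a binomial proportion k / n."""
    alpha = 1.0 - level
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else bisect_root(
        lambda p: (1.0 - binom_cdf(k - 1, n, p)) - alpha / 2.0)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else bisect_root(
        lambda p: alpha / 2.0 - binom_cdf(k, n, p))
    return lower, upper

# Hypothetical counts: 20 deaths among 100 cases with known outcomes.
ccfr_lower, ccfr_upper = clopper_pearson(20, 100)
print(f"cCFR 95% interval: {ccfr_lower:.3f} - {ccfr_upper:.3f}")
```

Under the same assumption, dividing the baseline CFR by the upper and lower cCFR bounds would give the lower and upper bounds, respectively, on the fraction of cases reported.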

Adjusting for outcome delay in CFR estimates

During an outbreak, the naive CFR (nCFR), i.e. the ratio of reported deaths to date to reported cases to date, will underestimate the true CFR because the outcome (recovery or death) is not known for all cases [2]. We can therefore estimate the true denominator for the CFR (i.e. the number of cases with known outcomes) by accounting for the delay from confirmation to death [4].

We assumed that the delay from confirmation to death followed the same distribution as the estimated delay from hospitalisation to death, based on data from the COVID-19 outbreak in Wuhan, China, between 17 December 2019 and 22 January 2020, accounting for right-censoring in the data as a result of as-yet-unknown disease outcomes (Figure 1, panels A and B in [5]). The distribution used is a lognormal fit with a mean delay of 13 days and a standard deviation of 12.7 days [5].
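As a minimal sketch of how such a delay distribution might be turned into daily probabilities, the code below converts the stated mean and standard deviation into lognormal parameters and discretises the distribution by day. The discretisation convention (the probability mass on the interval from day j to day j + 1), the 60-day cut-off and all function names are assumptions made for illustration, not details from the analysis.

```python
import math

def lognormal_params(mean, sd):
    """Convert a mean and standard deviation on the natural scale to (mu, sigma)."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma_sq
    return mu, math.sqrt(sigma_sq)

def lognormal_cdf(x, mu, sigma):
    """Cumulative distribution function of a lognormal distribution."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

def daily_delay_pmf(mean, sd, max_days):
    """Probability that the delay falls in [j, j + 1) days, for j = 0..max_days - 1.

    The daily binning is an illustrative assumption; mass beyond max_days is dropped.
    """
    mu, sigma = lognormal_params(mean, sd)
    return [lognormal_cdf(j + 1, mu, sigma) - lognormal_cdf(j, mu, sigma)
            for j in range(max_days)]

mu, sigma = lognormal_params(13.0, 12.7)
f = daily_delay_pmf(13.0, 12.7, max_days=60)
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}, mass within 60 days = {sum(f):.3f}")
```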

To correct the CFR, we use the case and death incidence data to estimate the number of cases with known outcomes [3,4]:

\[ u_{t} = \frac{\sum_{i = 0}^t \sum_{j = 0}^{\infty} c_{i-j} f_j}{\sum_{i = 0}^t c_i}, \]

where \(u_t\) represents the underestimation of the known outcomes [2–4], i.e. the proportion of cumulative cases whose outcomes are expected to be known by time \(t\), and is used to scale the cumulative number of cases in the denominator of the cCFR calculation; \(c_{t}\) is the daily case incidence at time \(t\) (with \(c_t = 0\) for \(t < 0\)); and \(f_t\) is the proportion of cases with a delay of \(t\) days between confirmation and death.
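The calculation of \(u_t\) and its use in the cCFR can be sketched as follows. This is an illustrative implementation of the formula above rather than the analysis code; the function names, the toy delay distribution and the toy case counts are hypothetical.

```python
def known_outcome_proportion(daily_cases, delay_pmf):
    """Compute u_t: the expected share of cumulative cases with known outcomes.

    daily_cases[i] is the case incidence c_i on day i and delay_pmf[j] is f_j.
    Incidence before day 0 is treated as zero, so the inner sum stops at j = i.
    """
    total_cases = sum(daily_cases)
    if total_cases == 0:
        raise ValueError("no cases reported")
    known = 0.0
    for i in range(len(daily_cases)):
        for j in range(min(i + 1, len(delay_pmf))):
            known += daily_cases[i - j] * delay_pmf[j]
    return known / total_cases

def corrected_cfr(daily_cases, total_deaths, delay_pmf):
    """Delay-adjusted CFR: deaths divided by u_t times cumulative cases."""
    u_t = known_outcome_proportion(daily_cases, delay_pmf)
    if u_t == 0.0:
        raise ValueError("no cases with known outcomes yet")
    return total_deaths / (u_t * sum(daily_cases))

# Toy inputs (hypothetical): three days of cases, delays of one or two days.
cases = [10, 20, 30]
toy_delay_pmf = [0.0, 0.5, 0.5]
deaths = 2

u_t = known_outcome_proportion(cases, toy_delay_pmf)
naive = deaths / sum(cases)
adjusted = corrected_cfr(cases, deaths, toy_delay_pmf)
print(f"u_t = {u_t:.3f}, nCFR = {naive:.3f}, cCFR = {adjusted:.3f}")
```

In this toy example only a third of the cases are expected to have a known outcome, so the adjusted estimate is three times the naive one.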

Approximating the proportion of symptomatic cases reported

At this stage, raw estimates of the CFR of COVID-19 correcting for delay to outcome, but not under-reporting, have been calculated. These estimates range between 1% and 1.5% [1,4,6]. We assume a CFR of 1.38% (95% crI 1.23% - 1.53%), taken from a recent large study [1], as a baseline CFR, and use it to approximate the potential level of under-reporting in each country. Specifically, we calculate \(\frac{1.38\%}{\text{cCFR}}\) for each country to approximate the fraction of cases reported.
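A minimal sketch of this final step is given below. The cap at 100% is an assumption inferred from the table above, where no estimate exceeds 100%; the function and constant names are hypothetical.

```python
BASELINE_CFR = 0.0138  # assumed baseline CFR of 1.38% [1]

def fraction_reported(ccfr, baseline_cfr=BASELINE_CFR):
    """Approximate fraction of symptomatic cases reported: baseline CFR / cCFR.

    Capped at 1.0 (an assumption): a cCFR below the baseline would otherwise
    imply that more than 100% of cases had been reported.
    """
    if ccfr <= 0.0:
        raise ValueError("cCFR must be positive")
    return min(1.0, baseline_cfr / ccfr)

# A country with a delay-adjusted CFR of 20%, as in the Methods Summary example.
print(f"{fraction_reported(0.20):.1%}")
```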

Limitations

Assuming that the fraction of cases reported in a given country is \(\frac{1.38\%}{\text{cCFR}}\) implicitly attributes any deviation from the assumed 1.38% CFR entirely to under-reporting. In reality, the burden on the healthcare system is likely to contribute to CFR estimates higher than 1.38%, along with many other country-specific factors.

The following is a list of the other prominent assumptions made in our analysis:

  • We assume that people get tested upon hospitalisation. Germany and South Korea are examples where this is not the case, as people can get tested earlier.

  • We assume that the hospitalisation-to-death delay from early Wuhan is representative of all other countries (by using the distribution parameterised using early Wuhan data) and that all countries have the same risk and age profile as Wuhan.

  • Severity of COVID-19 is known to increase with age. Therefore, countries with older populations will naturally see higher death rates. We are extending this analysis to adjust for the age distribution for countries with more than five reported deaths and where age distribution data is available.

  • All results depend on, and are biased by, the baseline CFR, assumed to be 1.38% [1].

  • The under-reporting estimate is very sensitive to the baseline CFR, meaning that small errors in it lead to large errors in the estimate for under-reporting.
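As a rough illustration of this dependence, suppose the ends of the credible interval for the baseline CFR are used as alternative baselines (an illustrative choice, not a calculation taken from the analysis). For a country with a cCFR of 20%, the estimated fraction of cases reported would be

\[ \frac{1.23\%}{20\%} = 6.15\%, \qquad \frac{1.38\%}{20\%} = 6.9\%, \qquad \frac{1.53\%}{20\%} = 7.65\%, \]

so the estimate scales in direct proportion to the baseline, and any relative error in the baseline carries over in full to the estimated fraction of cases reported.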

Code and data availability

The code is publicly available at https://github.com/thimotei/CFR_calculation. The data required for this analysis is a time-series for both cases and deaths, along with the corresponding delay distribution. The data is taken from the ECDC, using the NCoVUtils package [7].

References

1 Verity R, Okell LC, Dorigatti I et al. Estimates of the severity of COVID-19 disease. medRxiv 2020.

2 Kucharski AJ, Edmunds WJ. Case fatality rate for Ebola virus disease in West Africa. The Lancet 2014;384:1260.

3 Nishiura H, Klinkenberg D, Roberts M et al. Early epidemiological assessment of the virulence of emerging infectious diseases: A case study of an influenza pandemic. PLoS One 2009;4.

4 Russell TW, Hellewell J, Jarvis CI et al. Estimating the infection and case fatality ratio for COVID-19 using age-adjusted data from the outbreak on the Diamond Princess cruise ship. medRxiv 2020.

5 Linton NM, Kobayashi T, Yang Y et al. Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: A statistical analysis of publicly available case data. Journal of Clinical Medicine 2020;9:538.

6 Guan W-j, Ni Z-y, Hu Y et al. Clinical characteristics of coronavirus disease 2019 in China. New England Journal of Medicine 2020.

7 Abbott S MJ Hellewell J. NCoVUtils: Utility functions for the 2019-nCoV outbreak. doi:10.5281/zenodo.3635417 2020.