Temporal variation in transmission during the COVID-19 outbreak

Status: In Progress | First online: 02-03-2020 | Last update: 04-04-2020

This study has not yet been peer reviewed.

* This analysis is now archived, please visit the updated version.

updated: 2020-04-04

Note: this is a preliminary analysis that has not yet been peer-reviewed and is updated daily as new data become available. This work is licensed under a Creative Commons Attribution 4.0 International License. A summary of this report can be downloaded here.

Summary

Aim: To identify changes in the reproduction number, rate of spread, and doubling time during the course of the COVID-19 outbreak whilst accounting for potential biases due to delays in case reporting.

Latest estimates as of the 2020-03-19

Global map


Figure 1: Global map of the expected change in daily cases based on data from the 2020-03-19. Note: only country level estimates are shown.

Summary of latest reproduction number and case count estimates


Figure 2: Cases with date of onset on the day of report generation and the time-varying estimate of the effective reproduction number (bar = 95% credible interval) based on data from the 2020-03-19. Countries/Regions are ordered by the number of expected daily cases and shaded based on the expected change in daily cases. The dotted line indicates the target value of 1 for the effective reproduction no. required for control and a single case required for elimination.

Reproduction numbers over time in the six countries with the most cases currently


Figure 3: Time-varying estimate of the effective reproduction number (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-19 in the countries/regions expected to have the highest number of incident cases. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence. The dotted line indicates the target value of 1 for the effective reproduction no. required for control.

Latest estimates summary table

Country/Region | Cases with date of onset on the day of report generation | Expected change in daily cases | Effective reproduction no. | Doubling time (days)
United States | 1200 – 8956 | Increasing | 1.9 – 5.9 | 1.2 – Cases decreasing
Italy | 1042 – 7984 | Increasing | 1 – 1.8 | 3.4 – Cases decreasing
Spain | 723 – 5247 | Increasing | 1 – 2.4 | 1.8 – Cases decreasing
France | 397 – 2886 | Increasing | 1 – 2.1 | 2.1 – Cases decreasing
Iran | 290 – 2325 | Unsure | 0.8 – 1.4 | 3.3 – Cases decreasing
Germany | 451 – 2056 | Increasing | 1.3 – 2.5 | 2.6 – Cases decreasing
United Kingdom | 200 – 1468 | Increasing | 1.2 – 2.8 | 1.6 – Cases decreasing
Belgium | 151 – 995 | Increasing | 1.1 – 3.2 | 0.2 – Cases decreasing
Switzerland | 105 – 837 | Likely increasing | 0.9 – 2 | 1.5 – Cases decreasing
Netherlands | 107 – 784 | Increasing | 1 – 2.4 | 0.49 – Cases decreasing
Austria | 84 – 702 | Increasing | 1 – 2.4 | 1.5 – Cases decreasing
Portugal | 82 – 547 | Increasing | 1.3 – 4.1 | 0.2 – Cases decreasing
Israel | 51 – 346 | Increasing | 1.2 – 3.7 | 0.19 – Cases decreasing
Australia | 37 – 332 | Increasing | 1.3 – 3.6 | 1.4 – Cases decreasing
Canada | 49 – 328 | Increasing | 1.5 – 3.3 | 2.1 – Cases decreasing
Norway | 27 – 262 | Unsure | 0.8 – 1.6 | 1.6 – Cases decreasing
Malaysia | 42 – 259 | Increasing | 1.3 – 2.9 | 0.57 – Cases decreasing
Sweden | 28 – 250 | Unsure | 0.7 – 1.4 | 2 – Cases decreasing
Czechia | 27 – 222 | Increasing | 1 – 2.5 | 0.2 – Cases decreasing
South Korea | 34 – 191 | Unsure | 0.6 – 1.3 | 8.5 – Cases decreasing
Romania | 23 – 190 | Increasing | 1.1 – 3.2 | 0.17 – Cases decreasing
Ireland | 19 – 185 | Increasing | 1.2 – 3.5 | 0.19 – Cases decreasing
Denmark | 20 – 176 | Decreasing | 0.5 – 0.9 | 0.15 – Cases decreasing
Brazil | 18 – 163 | Increasing | 1 – 2.4 | 0.2 – Cases decreasing
Philippines | 22 – 151 | Increasing | 1.2 – 4.1 | 0.15 – Cases decreasing
Poland | 12 – 124 | Increasing | 1 – 3 | 0.2 – Cases decreasing
China | 23 – 123 | Unsure | 0.8 – 1.5 | 3.3 – Cases decreasing
Finland | 10 – 119 | Likely increasing | 0.8 – 2.1 | 0.23 – Cases decreasing
Estonia | 14 – 115 | Likely increasing | 0.9 – 2.8 | 0.15 – Cases decreasing
Japan | 14 – 113 | Likely decreasing | 0.6 – 1.1 | 5.8 – Cases decreasing
Greece | 9 – 100 | Likely increasing | 0.8 – 2.1 | 0.23 – Cases decreasing
Singapore | 19 – 98 | Increasing | 1.3 – 2.5 | 2.3 – Cases decreasing
Iceland | 8 – 92 | Increasing | 1 – 2.9 | 0.19 – Cases decreasing
Bahrain | 4 – 71 | Unsure | 0.5 – 1.3 | 0.16 – Cases decreasing
China Excluding Hubei | 5 – 71 | Increasing | 1 – 2.2 | 2.2 – Cases decreasing
Hubei | 2 – 50 | Decreasing | 0.1 – 0.4 | 4.6 – Cases decreasing
Slovenia | 3 – 48 | Unsure | 0.7 – 1.7 | 0.19 – Cases decreasing
Hong Kong | 1 – 41 | Likely increasing | 0.9 – 2.7 | 0.24 – Cases decreasing
Qatar | 2 – 34 | Unsure | 0.7 – 1.6 | 0.16 – Cases decreasing


Table 1: Latest estimates of the number of cases by date of onset, the effective reproduction number, and the doubling time for the 2020-03-19 in each region included in the analysis. Based on the last 7 days of data. The 95% credible interval is shown for each numeric estimate. China excludes Hubei.

Methods

Summary

  • Case counts by date, stratified by import status (local or imported), were constructed using the World Health Organization (WHO) situation reports and partial line-lists for each region [1,2].
  • Case onset dates were estimated using case counts by date of report and a distribution of reporting delays fitted to partial line-lists from each region considered where available.
  • Censoring of cases was adjusted for by assuming that the number of cases is drawn from a binomial distribution.
  • Time-varying effective reproduction estimates were made with a 7-day sliding window using EpiEstim [5,6] adjusted for imported cases and assuming an uncertain serial interval with a mean of 4.7 days (95% CrI: 3.7, 6.0) and a standard deviation of 2.9 days (95% CrI: 1.9, 4.9) [7].
  • Time-varying estimates of the doubling time were made with a 7-day sliding window by iteratively fitting an exponential regression model.
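The sliding-window reproduction number estimate summarised in the list above can be illustrated with a minimal sketch of the renewal-equation estimator that EpiEstim is built on. This is not the EpiEstim code itself, and it makes several simplifying assumptions: the serial interval is fixed at its mean values rather than treated as uncertain, imported cases are not separated from local ones, and the serial interval is discretised by evaluating the gamma density at whole days. The function names are hypothetical.

```python
import math


def serial_interval_weights(mean, sd, max_days=30):
    """Discretise a gamma serial interval onto days 1..max_days.

    Simplification: the gamma density is evaluated at whole days and
    normalised, which is not EpiEstim's own discretisation.
    """
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    raw = [
        math.exp((shape - 1) * math.log(k) - k / scale
                 - math.lgamma(shape) - shape * math.log(scale))
        for k in range(1, max_days + 1)
    ]
    total = sum(raw)
    return [w / total for w in raw]


def posterior_mean_r(incidence, weights, window=7, prior_mean=2.6, prior_sd=2.0):
    """Posterior mean of R for each day t, using the `window` days ending at t."""
    # Gamma prior on R expressed as shape and scale.
    prior_shape = (prior_mean / prior_sd) ** 2
    prior_scale = prior_sd ** 2 / prior_mean

    def infectiousness(s):
        # Past cases weighted by the serial interval distribution.
        return sum(incidence[s - k] * weights[k - 1]
                   for k in range(1, min(s, len(weights)) + 1))

    estimates = {}
    for t in range(window, len(incidence)):
        days = range(t - window + 1, t + 1)
        cases = sum(incidence[s] for s in days)
        pressure = sum(infectiousness(s) for s in days)
        estimates[t] = (prior_shape + cases) / (1 / prior_scale + pressure)
    return estimates


weights = serial_interval_weights(mean=4.7, sd=2.9)
flat = posterior_mean_r([100] * 60, weights)
growing = posterior_mean_r([10 * 1.1 ** d for d in range(60)], weights)
```

With constant incidence the estimate settles close to 1, and with steadily growing incidence it sits above 1, which is the behaviour the control threshold in the figures relies on.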

Limitations

  • All data used are at a national/regional level taken from WHO situation reports; diagnostic capabilities and testing protocols may vary in different parts of each country/region, adding uncertainty to the reported numbers. The true number of infections reflected in a given number of confirmed cases probably varies substantially geographically.
  • The estimated onset dates are based on available data for the delay from symptom onset to confirmation, which mostly stems from the early days of the outbreak. These data may not be representative of the underlying delay distribution.
  • The estimate of not-yet-confirmed cases used to scale up recent numbers is uncertain and relies on the observed delays to confirmation remaining constant over the course of the outbreak.
  • Trends identified using our approach are robust to under-reporting assuming it is constant but absolute values may be biased by reporting rates. Pronounced changes in reporting rates may also impact the trends identified.
  • The reporting delay could not be estimated from line-list data for all regions. Region-specific details are given in the individual regional reports.
  • Data on imported cases were only partially available, and even where available may not be fully complete. This may bias estimates upwards when overall case counts are low.
  • As our estimates are made at the date of symptom onset, any changes in the time-varying parameters will be delayed by the incubation period.

Detail

Data

We used partial line-lists from each region that contained the date of symptom onset, date of confirmation and import status (imported or local) for each case [3] where available. The regional reports give details of the steps taken where these data were not available. Daily case counts by date of report were extracted from the World Health Organization (WHO) situation reports for every location considered [1,2]. The case counts (and partial line-lists where available) were used to assemble the daily number of local and imported cases. Where the partial line-lists and case counts disagreed, we assumed that the partial line-lists were correct and adjusted the WHO case counts so that the overall number of cases remained the same, with the number of local cases adjusted as needed.

Adjusting for reporting delays

Reporting delays for each country were estimated using the corresponding partial line-list of cases. The reporting delay could not be estimated from line-list data for all regions. Region-specific details are given in the individual regional reports. The estimated reporting delay was assumed to remain constant over time in each location. We fitted an exponential distribution adjusted for censoring [8] to the observed delays using Stan [9]. We then took 1000 samples from the posterior distribution of the rate parameter for the exponential delay distribution and constructed a distribution of possible onset dates for each case based on its reporting date. To prevent spuriously long reporting delays, we re-sampled delays that were greater than the maximum delay in the observed data.
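As a rough illustration of the onset-date step described above, the sketch below draws one delay per posterior rate sample and re-draws any delay longer than the maximum observed delay. The rate samples here are made-up numbers standing in for the Stan posterior, and the function name, the 30-day maximum and the rounding of delays down to whole days are assumptions made only for this example.

```python
import random
from datetime import date, timedelta


def sample_onset_dates(report_date, rate_samples, max_delay_days, rng):
    """Return one possible onset date per posterior sample of the delay rate."""
    onsets = []
    for rate in rate_samples:
        delay = rng.expovariate(rate)  # exponential delay in days
        # Re-sample delays longer than the maximum observed delay.
        while delay > max_delay_days:
            delay = rng.expovariate(rate)
        # Assumption: delays are rounded down to whole days.
        onsets.append(report_date - timedelta(days=int(delay)))
    return onsets


rng = random.Random(1)
# Hypothetical stand-in for 1000 posterior samples of the rate parameter.
rate_samples = [rng.uniform(0.15, 0.25) for _ in range(1000)]
onsets = sample_onset_dates(date(2020, 3, 19), rate_samples, 30, rng)
```

Repeating this for every reported case gives a set of plausible onset-date time series rather than a single reconstructed curve.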

To account for censoring, i.e. cases that have not yet been confirmed but will show up in the data at a later time, we randomly sampled the true number of cases (including those not yet confirmed) by assuming that the reported number of cases is drawn from a binomial distribution in which each case has an independent probability \(p_i\) of having been confirmed. Here \(i\) is the number of days between symptom onset and the date of the report, and \(p_i\) is the cumulative proportion of cases that are confirmed by day \(i\) after they develop symptoms. We did not account for potential reporting biases that might occur due to changes in the growth rate of the outbreak over time.
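The binomial upscaling above can be sketched as follows. The text does not say how the true count is sampled, so this example assumes a flat prior on the true number of cases. Under that assumption the number of not-yet-confirmed cases follows a negative binomial distribution. The confirmation probability is taken from the exponential delay distribution, and both function names are hypothetical.

```python
import math
import random


def confirmation_probability(days_since_onset, rate):
    """p_i: probability that a case is confirmed within i days of onset."""
    return 1.0 - math.exp(-rate * days_since_onset)


def sample_true_cases(reported, p, rng):
    """Sample the true count N given reported ~ Binomial(N, p).

    Assumption: flat prior on N, so N - reported is negative binomial
    with (reported + 1) successes and success probability p.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if p == 1.0:
        return reported
    unconfirmed = 0
    for _ in range(reported + 1):
        # Geometric draw by inversion: failures before one success.
        u = 1.0 - rng.random()  # uniform on (0, 1]
        unconfirmed += int(math.log(u) / math.log(1.0 - p))
    return reported + unconfirmed


rng = random.Random(42)
p_two_days = confirmation_probability(2, rate=0.2)
draws = [sample_true_cases(50, 0.5, rng) for _ in range(2000)]
```

The smaller \(p_i\) is, the wider the spread of sampled true counts. This is consistent with the reduced confidence the report shows for the most recent days.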

Statistical analysis

We used the inferred number of cases to estimate the reproduction number on each day using the EpiEstim R package [5]. This uses a combination of the serial interval distribution and the number of observed cases to estimate the reproduction number at each time point [11,12]. These estimates were then smoothed using a 7-day time window. We assumed that the serial interval was uncertain with a mean of 4.7 days (95% CrI: 3.7, 6.0) and a standard deviation of 2.9 days (95% CrI: 1.9, 4.9) [7]. We used a common prior for the reproduction number with mean 2.6 and a standard deviation of 2 (inflated from 0.5 found in the reference) [13]. Where data were available, we used EpiEstim to adjust for imported cases [6]. The expected change in daily cases was defined using the proportion of samples with a reproduction number less than 1 (subcritical). It was assumed that if fewer than 5% of samples were subcritical then an increase in cases was definite, if fewer than 20% of samples were subcritical then an increase in cases was likely, if more than 80% of samples were subcritical then a decrease in cases was likely and if more than 95% of samples were subcritical then a decrease in cases was definite. For countries/regions with between 20% and 80% of samples being subcritical we could not make a statement about the likely change in cases (defined as unsure).
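The thresholds for the expected change in daily cases can be written as a small classifier. This is a sketch only: the function name is hypothetical, the labels follow those used in the summary table, and the treatment of values that fall exactly on a threshold is an assumption.

```python
def expected_change(r_samples):
    """Label the expected change in daily cases from samples of R."""
    subcritical = sum(r < 1 for r in r_samples) / len(r_samples)
    if subcritical < 0.05:
        return "Increasing"
    if subcritical < 0.20:
        return "Likely increasing"
    if subcritical > 0.95:
        return "Decreasing"
    if subcritical > 0.80:
        return "Likely decreasing"
    return "Unsure"


# 10% of samples below 1, so an increase is likely but not definite.
label = expected_change([0.9] * 10 + [1.1] * 90)
```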

We estimated the rate of spread (\(r\)) using linear regression with time as the only exposure and logged cases as the outcome for the overall course of the outbreak [14]. The adjusted R^2 value was then used to assess the goodness of fit. In order to account for potential changes in the rate of spread over the course of the outbreak we used a 7-day sliding window to produce time-varying estimates of the rate of spread and the adjusted R^2. The doubling time was then estimated using \(\frac{\ln(2)}{r}\) for each estimate of the rate of spread.
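A minimal sketch of the rate-of-spread and doubling-time calculation, using ordinary least squares on logged case counts over a 7-day sliding window. The function names are hypothetical. The sketch assumes strictly positive case counts, since the text does not say how zero counts are handled. It also treats a non-positive rate of spread as an infinite doubling time, following the figure captions.

```python
import math


def growth_rate(cases):
    """Least-squares slope of log(cases) against day index."""
    days = list(range(len(cases)))
    logs = [math.log(c) for c in cases]  # assumes every count is > 0
    mean_x = sum(days) / len(days)
    mean_y = sum(logs) / len(logs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    sxx = sum((x - mean_x) ** 2 for x in days)
    return sxy / sxx


def doubling_time(r):
    """ln(2) / r, or infinity when cases are not growing."""
    return math.log(2) / r if r > 0 else math.inf


def sliding_doubling_times(cases, window=7):
    """Doubling time for each consecutive `window`-day stretch."""
    return [
        doubling_time(growth_rate(cases[i:i + window]))
        for i in range(len(cases) - window + 1)
    ]


# Synthetic series that doubles every 3 days.
cases = [100 * 2 ** (d / 3) for d in range(10)]
times = sliding_doubling_times(cases)
```

On this synthetic series every window recovers a doubling time of 3 days; on real data the estimates move as the rate of spread changes.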

We report the 95% credible intervals for all measures using the 2.5% and 97.5% quantiles. The analysis was conducted independently for all regions and is updated daily as new data become available. Confidence in our estimates is shown using the proportion of data that were derived using binomial upscaling. Code and results from this analysis can be found here and here.

Regional reports

United States

Summary


Figure 4: A.) Cases by date of report (bars) and estimated cases by date of onset. B.) Time-varying estimate of the effective reproduction number. Light grey ribbon = 95% credible interval. Dark grey ribbon = the interquartile range. Based on data from the 2020-03-19. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.

Measure | Estimate
Cases with date of onset on the day of report generation | 1200 – 8956
Expected change in daily cases | Increasing
Effective reproduction no. | 1.9 – 5.9
Rate of spread | -0.21 – 0.57
Doubling time (days) | 1.2 – Cases decreasing
Adjusted R-squared | -0.17 – 0.92


Table 4: Latest estimates of the number of cases by date of onset, the expected change in daily cases, the effective reproduction number, the rate of spread, the doubling time, and the adjusted R-squared of the exponential fit for the 2020-03-19. Based on the last 7 days of data. The 95% credible interval is shown for each numeric estimate.

Time-varying rate of spread and doubling time


Figure 5: A.) Time-varying estimate of the rate of spread, B.) Time-varying estimate of the doubling time in days (note that when the rate of spread is negative the doubling time is assumed to be infinite), C.) The adjusted R-squared estimates indicating the goodness of fit of the exponential regression model (with values closer to 1 indicating a better fit). Based on data from the 2020-03-19. Light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.


Italy

Summary


Figure 7: A.) Cases by date of report (bars) and estimated cases by date of onset. B.) Time-varying estimate of the effective reproduction number. Light grey ribbon = 95% credible interval. Dark grey ribbon = the interquartile range. Based on data from the 2020-03-19. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.

Measure | Estimate
Cases with date of onset on the day of report generation | 1042 – 7984
Expected change in daily cases | Increasing
Effective reproduction no. | 1 – 1.8
Rate of spread | -0.042 – 0.2
Doubling time (days) | 3.4 – Cases decreasing
Adjusted R-squared | -0.14 – 0.99


Table 5: Latest estimates of the number of cases by date of onset, the expected change in daily cases, the effective reproduction number, the rate of spread, the doubling time, and the adjusted R-squared of the exponential fit for the 2020-03-19. Based on the last 7 days of data. The 95% credible interval is shown for each numeric estimate.

Time-varying rate of spread and doubling time


Figure 8: A.) Time-varying estimate of the rate of spread, B.) Time-varying estimate of the doubling time in days (note that when the rate of spread is negative the doubling time is assumed to be infinite), C.) The adjusted R-squared estimates indicating the goodness of fit of the exponential regression model (with values closer to 1 indicating a better fit). Based on data from the 2020-03-19. Light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.


Spain

Summary