Primary author: Olivia Boyd

Olivia Boyd, Lily Geidelberg, David Jorgensen, Manon Ragonnet, Igor Siveroni, Erik Volz and the Imperial College COVID-19 Response Team

Report prepared on 2020-05-01

Key points

  • This is a preliminary analysis based on a small number of genetic sequences sampled in Auvergne-Rhône-Alpes (ARA) and uploaded to GISAID before 2020-05-05.
  • Effective sample sizes for several parameters are very small, and results could change with further analysis.
  • We estimate a number of key epidemiological parameters based on the genetic diversity of these samples alongside a set of closely related sequences from elsewhere, which act as a global reservoir.
  • We estimate a high initial reproduction number (R0) of 4.25, which falls to 3.57 [95% CrI 0.0331-4.66] by 2020-03-24. Note that the lower bound of the credible interval falls below 1 by this point. As only a small number of sequences are available, it is difficult to estimate changes in R(t) over time, and these estimates are likely to change with more sequence data.
  • We estimate that approximately 0.5% of the ARA population had been infected by the date of the last sample, 2020-03-24.

This analysis is based on:

  • 49 whole genomes sampled from within ARA
  • 70 whole genomes sampled from outside of ARA
  • Samples within ARA were collected between 2020-03-03 and 2020-03-24

These numbers will differ from the number of sequences uploaded to GISAID, as we remove sequences with likely sequencing errors or significant gaps. Sequences were deduplicated prior to analysis. Figure 1 shows the distribution over time of the sequences included in the analysis, both endogenous (gold) and exogenous (grey). Because little metadata is linked to the sequences, the sampling strategies used to collect them (e.g. contact tracing, non-random sampling in hospitals) may have introduced bias that we are unaware of. Samples could therefore be more closely related than would be expected from a random sample of the population, which could bias our estimates.

Using a phylodynamic model, we estimate epidemiological parameters by fitting SARS-CoV-2 sequence data from ARA together with a background set of sequences sampled from the larger international viral population. The model is explained in detail here. Daily reported case numbers from ARA are extracted from Sante Publique France, the Ministry of Health, which hosts a site for official daily statistics on COVID-19 testing. These data are used solely for comparison and do not affect any aspect of the analysis described.


Figure 1: Sampling distributions over time of number of sequences included within the region versus sequences included from the international reservoir.

Estimated infections

In this preliminary analysis, we estimate 41850 [10028-332484] (median [95% CrI]) cumulative infections at the date of the last sample (2020-03-24) by fitting SARS-CoV-2 sequence data to the phylodynamic model described above. The cumulative number of confirmed infections reported from testing was 1991 on the same date. Our median estimate corresponds to about 0.5% of the population of ARA having been infected by 2020-03-24.
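The percentage quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the report does not state the population figure it used, so the value of roughly 8 million for ARA is an assumption introduced here, and the function name is hypothetical.

```python
# Estimates quoted in the report (cumulative infections at 2020-03-24).
MEDIAN_INFECTIONS = 41_850
LOWER_INFECTIONS, UPPER_INFECTIONS = 10_028, 332_484

# ASSUMPTION: the regional population is not stated in the report;
# roughly 8 million is used here purely for illustration.
ASSUMED_POPULATION = 8_000_000


def percent_infected(infections: int, population: int) -> float:
    """Return cumulative infections as a percentage of the population."""
    return 100.0 * infections / population


for label, value in [("median", MEDIAN_INFECTIONS),
                     ("lower", LOWER_INFECTIONS),
                     ("upper", UPPER_INFECTIONS)]:
    print(f"{label}: {percent_infected(value, ASSUMED_POPULATION):.2f}%")
```

Under this assumed population the median rounds to the 0.5% quoted above, while the interval bounds span well over an order of magnitude.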

Figures 2 and 3 show cumulative and daily estimated infections, respectively. Towards the end of the curve, the day-on-day increase in the number of daily infections begins to stabilise. This may be due to the closure of all schools on 2020-03-16 and the implementation of lockdown on 2020-03-17 across all of France, including the ARA region. However, further analyses are required to assess the impact of non-pharmaceutical interventions on transmission in this region.

Figure 2: Estimated cumulative infections through time, on linear and log scales, represented by solid black line (median) and 95% CrI (ribbon). Black points represent reported cases in ARA. The dashed line indicates the date of last sample in ARA in this analysis.

Figure 3: Estimated daily infections through time, on linear and log scales, represented by solid black line (median) and 95% CrI (ribbon). Black points represent reported cases in ARA. The dashed line indicates the date of last sample in ARA in this analysis.

Figure 4 shows estimates of cumulative reporting rate over time. We estimate reporting rate for ARA over time by comparing our model estimates to reported case data. These estimates have high uncertainty due to the small number of local sequences used in this analysis.


Figure 4: Estimated percentage of daily cases reported in ARA. Error bars represent the 95% credible interval.
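As a rough illustration of how a reporting rate follows from comparing reported cases with model estimates, the sketch below uses only the cumulative figures quoted in this report for 2020-03-24 (1991 reported cases; 41850 [10028-332484] estimated infections). Dividing by the interval bounds gives a crude range rather than a proper credible interval, and the function name is hypothetical; this is not the calculation used to produce Figure 4.

```python
# Cumulative figures quoted in the report for the date of last sample.
REPORTED_CASES = 1_991
ESTIMATED_INFECTIONS = {"median": 41_850, "lower": 10_028, "upper": 332_484}


def reporting_percent(reported: int, estimated: int) -> float:
    """Return reported cases as a percentage of estimated infections."""
    return 100.0 * reported / estimated


for label, estimate in ESTIMATED_INFECTIONS.items():
    # A larger infection estimate implies a smaller reporting rate.
    print(f"{label} estimate: {reporting_percent(REPORTED_CASES, estimate):.2f}% reported")
```

With the median estimate, slightly under 5% of cumulative infections would have been reported by the date of last sample.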

We estimate that the reproduction number remained relatively stable before non-pharmaceutical interventions were put in place. Following the introduction of lockdown in ARA, the median estimate drops below 1 by 2020-04-10, suggesting that R is likely to have decreased substantially by this point. As we use only 49 local genome sequences, we expect the credible intervals to be wide. We do expect R to decrease following the introduction of lockdown in the region; however, more sequence data (more than 49 sequences, sampled later than one week after lockdown) are likely needed to estimate the relationship between lockdown and the reduction in transmission more accurately.


Figure 5: Reproduction number through time. The black vertical dashed line indicates the date of last sample in ARA in this analysis. Orange and red dashed lines indicate dates of school closure and general lockdown in ARA, respectively.

Reproduction number at last sample (2020-03-24): 3.57 [0.0331-4.66] median [95% CrI]

How quickly has the epidemic in ARA grown?

Quantile   Reproduction number   Growth rate (per day)   Doubling time (days)
50%        4.25                  0.270                   2.57
2.5%       3.55                  0.224                   1.73
97.5%      6.63                  0.401                   3.09

Table 1. Reproduction number, growth rate and doubling times
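The growth rates and doubling times in Table 1 appear consistent with the standard relationship for exponential growth, doubling time = ln(2) / r. The sketch below checks this using the tabulated growth rates; the function name is hypothetical, and the report does not state that this formula was used.

```python
import math


def doubling_time(growth_rate_per_day: float) -> float:
    """Doubling time in days under exponential growth at rate r per day."""
    return math.log(2) / growth_rate_per_day


# Growth rates (per day) from Table 1: 2.5%, 50% and 97.5% quantiles.
for quantile, rate in [("2.5%", 0.224), ("50%", 0.270), ("97.5%", 0.401)]:
    print(f"growth rate {rate:.3f} ({quantile}) -> {doubling_time(rate):.2f} days")
```

Because faster growth means a shorter doubling time, the 2.5% quantile of the growth rate corresponds to the 97.5% quantile of the doubling time and vice versa, which is why each column of Table 1 is ordered by its own quantiles.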

How has SARS-CoV-2 evolved in ARA?

We present here a time-scaled phylogeny of SARS-CoV-2 in ARA and the included international reservoir (Figure 6). ARA sequences tend to cluster closely together, suggesting that the majority of sequenced infections result from transmission within the region rather than from introductions from elsewhere. This could reflect significant local transmission, or a local sampling strategy targeted at interconnected individuals.


Figure 6: Time-scaled phylogeny co-estimated with epidemiological parameters. The colour of the tips corresponds to sampling location: red tips were sampled from within ARA, blue tips were sampled from outside the region.

Molecular clock rate of evolution: 0.00102 [0.000867-0.00119] median [95% CrI]
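To give a sense of scale, the clock rate can be converted into expected substitutions per genome. The sketch below rests on two assumptions that are not stated in this report: that the rate is expressed in substitutions per site per year, and that the genome length is about 29,903 nucleotides (the length of the Wuhan-Hu-1 reference genome). The function name is hypothetical.

```python
# ASSUMPTION: clock rate units are substitutions per site per year
# (the report does not state the units).
CLOCK_RATE_MEDIAN = 0.00102
CLOCK_RATE_LOWER, CLOCK_RATE_UPPER = 0.000867, 0.00119

# ASSUMPTION: genome length of the Wuhan-Hu-1 reference, in nucleotides.
GENOME_LENGTH = 29_903


def subs_per_genome_per_year(rate_per_site: float,
                             genome_length: int = GENOME_LENGTH) -> float:
    """Expected substitutions per genome per year for a per-site rate."""
    return rate_per_site * genome_length


for label, rate in [("median", CLOCK_RATE_MEDIAN),
                    ("lower", CLOCK_RATE_LOWER),
                    ("upper", CLOCK_RATE_UPPER)]:
    per_year = subs_per_genome_per_year(rate)
    print(f"{label}: {per_year:.1f} per year, {per_year / 12:.1f} per month")
```

Under these assumptions the median rate corresponds to roughly 30 substitutions per genome per year, or between two and three per month.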

Methods summary

Details on methods and priors can be found here.

Statistic             mean        ESS
posterior             -42778      26
likelihood            -42730      139
prior                 -47.96      25
treeLikelihood.algn   -42730      139
TreeHeight            0.3145      104
clockRate             0.001021    165
kappa                 6.789       11831
PhydynSEIR            -3.53       26
seir.E                11.55       108
seir.S                1867540     21
seir.b                29.03       26
seir.exog             0.00175     98
seir.exogGrowthRate   32.06       27
seir.importRate       8.466       124
seir.p_h              0.2449      41
seir.tau              74.79       302
freqParameter.1       0.2979      4195
freqParameter.2       0.1827      4762
freqParameter.3       0.1948      4601
freqParameter.4       0.3246      3970

Table 2. Effective sample size of model parameters
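One way to read Table 2 is to flag the parameters whose effective sample size falls below a chosen threshold. The sketch below uses a threshold of 200, a common rule of thumb in Bayesian phylogenetics; the report itself does not state a threshold, so this value and the function name are assumptions made for illustration.

```python
# ESS values copied from Table 2.
ESS = {
    "posterior": 26, "likelihood": 139, "prior": 25,
    "treeLikelihood.algn": 139, "TreeHeight": 104, "clockRate": 165,
    "kappa": 11831, "PhydynSEIR": 26, "seir.E": 108, "seir.S": 21,
    "seir.b": 26, "seir.exog": 98, "seir.exogGrowthRate": 27,
    "seir.importRate": 124, "seir.p_h": 41, "seir.tau": 302,
    "freqParameter.1": 4195, "freqParameter.2": 4762,
    "freqParameter.3": 4601, "freqParameter.4": 3970,
}

# ASSUMPTION: 200 is a conventional rule of thumb, not a value from the report.
ESS_THRESHOLD = 200


def low_ess(ess: dict[str, int], threshold: int = ESS_THRESHOLD) -> list[tuple[str, int]]:
    """Return (statistic, ESS) pairs below the threshold, lowest ESS first."""
    flagged = [(name, value) for name, value in ess.items() if value < threshold]
    return sorted(flagged, key=lambda pair: pair[1])


for name, value in low_ess(ESS):
    print(f"{name}: ESS {value}")
```

With this threshold, most of the epidemiological parameters are flagged, in line with the caution in the key points that effective sample sizes for several parameters are very small.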

Model version: seijr0.1.0

Report version: 20200501-163346-17284927


This work was supported by the MRC Centre for Global Infectious Disease Analysis at Imperial College London.

Sequence data were provided by GISAID and these laboratories.