The course of a pandemic – epidemiological statistics in times of (describing) a crisis, pt. 2

Modelling approaches

This is the second blog entry on existing key indicators used in the current pandemic. See here for the first one, which deals with non-modelling approaches.

Time-dependent case reproduction number R(t)

A few weeks into the pandemic, the RKI switched its main reporting indicator from absolute case numbers to the time-dependent case reproduction number R(t). R(t) represents the average number of people who are infected by one index case at time t. It is the time-dependent counterpart of the basic reproduction number R0, which is the expected number of secondary cases per primary case in a completely susceptible population (1).

R(t) > 1 means that one index case infects more than one other person on average, so the virus spreads, whereas R(t) < 1 means that the virus is being contained, as one case infects fewer than one other person on average.
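To make the threshold at 1 concrete, the following minimal sketch projects case numbers over successive generations of infection. It assumes, purely for illustration, that R(t) stays constant; the function name and the starting value of 100 cases are hypothetical.

```python
def cases_per_generation(initial_cases, r, generations):
    """Expected new cases in each generation if every case
    infects r other people on average (r assumed constant)."""
    return [initial_cases * r ** n for n in range(generations + 1)]


# Hypothetical starting point of 100 cases, followed over three generations.
growing = cases_per_generation(100, 1.5, 3)    # [100.0, 150.0, 225.0, 337.5]
shrinking = cases_per_generation(100, 0.5, 3)  # [100.0, 50.0, 25.0, 12.5]
```

With a value above 1 the case numbers grow from generation to generation, with a value below 1 they shrink.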

A few assumptions have to be made in order to calculate R(t):

The time from infection to the first signs of symptoms,
The time from infection to being infectious, and
The time from infection to infecting other people.

The RKI estimates the time from infection to first symptoms at around 5 days, the time from infection to being infectious at around 3 days (which leaves about two days during which an infected person is infectious but might not yet suspect being sick), and the time from infection to infecting other people at around 4 days (generation time). With those assumptions (the last of which may change over time, as it can be influenced by political restrictions), the RKI calculates the time-dependent case reproduction number R(t). R(t) is therefore not a measure of one single day: it covers a four-day period and can only be calculated in hindsight (2).
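One way to read the four-day period is as a ratio of case counts in two consecutive four-day windows. The sketch below assumes this ratio-of-sums form and a generation time of 4 days. It is a simplified illustration, not the RKI's implementation: the function name and the input series are hypothetical, and raw daily counts stand in for the corrected counts the RKI works with.

```python
def reproduction_number(cases, generation_time=4):
    """Ratio of the case sum in the latest window of `generation_time`
    days to the case sum in the window directly before it.
    Returns None for days on which two full windows are not yet available."""
    estimates = []
    for t in range(len(cases)):
        if t < 2 * generation_time - 1:
            estimates.append(None)  # not enough history yet
            continue
        current = sum(cases[t - generation_time + 1 : t + 1])
        previous = sum(cases[t - 2 * generation_time + 1 : t - generation_time + 1])
        estimates.append(current / previous if previous > 0 else None)
    return estimates


# Hypothetical daily counts of new cases: the level doubles after four days.
daily_cases = [1, 1, 1, 1, 2, 2, 2, 2]
r_values = reproduction_number(daily_cases)  # last entry is 8 / 4 = 2.0
```

The first seven entries are None because each estimate needs eight days of data, which illustrates why the value for a given day is only available in hindsight.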

There is more than one method to calculate R(t), and the differences between them are small. However, the method used by the RKI has an inherent bias, which might become relevant when R(t) rises above 1 after restrictions are lifted (1). Another obstacle in calculating R(t) is the time lag in case reporting by local health authorities. In a pandemic, real-time surveillance is needed, particularly for evaluating the effects of interventions put in place. Hence, reported case counts should ideally be adjusted for occurred-but-not-yet-reported events.

Therefore, the RKI uses a methodology called nowcasting (2), which is based on a model originally developed for the outbreak of EHEC in Germany in 2011 (3, 4).

Nowcasting

As Günther et al. put it, the “basic idea of nowcasting is to estimate the reporting delay based on observations where both, the symptom onset and the reporting date are known” (5). Given the distribution of this delay, the actual number of cases for a specific day can be inferred from the number of cases reported for that day. To account for changes in this distribution over time, the reported cases are considered in the context of cases whose disease onset lies within a time frame of 7 days. The RKI chose 7 days for SARS-CoV-2 because this time frame yields stable results and smooths out the differences between weekdays.

Nowcasting proceeds in two steps. First, imputation methods are used to estimate missing dates of disease onset (assuming the data are missing at random). Second, the reporting date (which is available for all cases) is combined with the date of disease onset (partly observed, partly imputed) to estimate the reporting delay.
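The second step can be illustrated with a deliberately simplified sketch: estimate how large a share of cases is reported within a given number of days after onset, then scale up the recent, still incomplete counts accordingly. This is not the Bayesian model cited above. It skips the imputation step, ignores changes of the delay over time, and assumes the delays come from older onset days that are already fully reported. All names and numbers are hypothetical.

```python
from collections import Counter


def delay_cdf(delays, max_delay):
    """Empirical share of cases reported within d days of onset,
    for d = 0..max_delay. Longer delays are counted at max_delay."""
    counts = Counter(min(d, max_delay) for d in delays)
    cdf, running = [], 0
    for d in range(max_delay + 1):
        running += counts[d]
        cdf.append(running / len(delays))
    return cdf


def nowcast(reported_by_onset, cdf):
    """Scale up the counts per onset day by the share expected to be
    reported so far. The last entry is the most recent onset day."""
    n = len(reported_by_onset)
    estimates = []
    for i, count in enumerate(reported_by_onset):
        days_elapsed = n - 1 - i
        share = cdf[min(days_elapsed, len(cdf) - 1)]
        estimates.append(count / share if share > 0 else None)
    return estimates


# Hypothetical delays (in days) between onset and report for four past cases.
cdf = delay_cdf([0, 1, 1, 2], max_delay=2)  # [0.25, 0.75, 1.0]
# Hypothetical cases reported so far, by onset day (oldest first).
estimated = nowcast([40, 30, 10], cdf)      # [40.0, 40.0, 40.0]
```

In this toy example the apparent decline from 40 to 10 cases is entirely due to reporting delay: after correction, each onset day is estimated at 40 cases.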

Excess mortality

One indirect way of estimating how a disease affects mortality in a population is the indicator of excess mortality. For this, the observed number of deaths is compared with the expected number of deaths, based on the background (i.e. pre-COVID-19) mortality risks in the respective population. This indicator is somewhat broader than the above-mentioned indicators, as it is not cause-specific.

For example, excess mortality would also include deaths that are not directly related to SARS-CoV-2 but rather indirectly, such as possibly higher suicide rates because of social distancing restrictions. Then again, with the current restrictions in place, one could also expect fewer deaths in other areas, such as traffic, which might dilute the results and complicate their interpretation. Furthermore, there are currently not many data sets at the country level with which excess mortality could be reliably calculated.
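The comparison of observed and expected deaths can be sketched in a few lines. The example assumes that the expected number of deaths for a given week is simply the average of the same week in earlier years, which is only one of several possible baselines. The function name and all numbers are hypothetical.

```python
from statistics import mean


def excess_deaths(observed, baseline_years):
    """Observed minus expected deaths per week.
    `baseline_years` holds one list of weekly deaths per earlier year;
    the expected value for a week is the mean across those years."""
    expected = [mean(week) for week in zip(*baseline_years)]
    return [obs - exp for obs, exp in zip(observed, expected)]


# Hypothetical weekly deaths in two earlier years and in the current year.
baseline = [[100, 110], [120, 90]]
current = [130, 95]
excess = excess_deaths(current, baseline)  # [20, -5]
```

A negative value, as in the second week, corresponds to fewer deaths than expected. Because the indicator is not cause-specific, it cannot show whether such a value reflects fewer traffic deaths, fewer deaths from other causes, or chance.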

However, the concept of excess mortality can also be applied to political interventions, as conducted by researchers from the UK who showed how many deaths can be prevented/expected under various circumstances (6).

Where does that leave us?

As unsatisfying as it might sound: most of the epidemiological indicators available to describe the severity of the pandemic can only be calculated with satisfactory validity and reliability in hindsight. From a political perspective, it is nevertheless reasonable to rely on some of these measures already during the current acute phase of the pandemic, as interventions (such as physical distancing or wearing masks) cannot be evaluated otherwise.

However, these indicators should be communicated together with a description of their inherent limitations, as they might otherwise trigger either false hope or hysteria.

References

(1) Höhle M and an der Heiden M. Effective reproduction number estimation. 2020. Available at: https://staff.math.su.se/hoehle/blog/2020/04/15/effectiveR0.html
(2) An der Heiden M and Hamouda O. Schätzung der aktuellen Entwicklung der SARS-CoV-2-Epidemie in Deutschland – Nowcasting. Epidemiologisches Bulletin. 2020;2020(17):10–5.
(3) LMU. Näher an der Wahrheit. 2020.
(4) Höhle M and an der Heiden M. Bayesian nowcasting during the STEC O104:H4 outbreak in Germany, 2011. Biometrics. 2014;70(4):993–1002.
(5) Günther F, Bender A, Katz K, Küchenhoff H and Höhle M. Nowcasting COVID19. 2020.
(6) Banerjee A, Pasea L, Harris S, Gonzalez-Izquierdo A, Torralbo A, Shallcross L, et al. Estimating excess 1-year mortality associated with the COVID-19 pandemic according to underlying conditions and age: a population-based cohort study. The Lancet.