In this Jupyter notebook, I try to estimate how much more infectious Delta is than the 'original' COVID-19 strain. Here, I define 'original' as the ensemble of viruses that spread from 2020 through June 2021.
The approach will be as follows:
pywt package to do this.
import sys
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import datetime as dt
sys.path.append('./utils')
# see https://github.com/dangeles/dangeles.github.io/blob/master/jupyter/utils/{covid_utils, cfr}.py
import covid_utils as cv
import cfr
df = cfr.fetch_data()
smooth = cfr.smooth_active(df, level=3) # wavelet reconstruction, per state
pca = cfr.pca_plot(smooth, 'CasesPerPeriod', n_comps=4, plot=False)
groups = cfr.partition(pca, smooth) # leiden clustering
fig, ax = cfr.plot_partitions(smooth, 'CasesPerPeriod', groups,
                              pivot_kwargs={'transform': lambda x: x})
fig.suptitle('Weekly Recorded Cases', fontsize=20)
ax[0].set_ylabel('Daily Cases Per Million Inhabitants')
plt.tight_layout()
pca = cfr.pca_plot(smooth, 'DeathsPerPeriod', n_comps=4, plot=False)
groups = cfr.partition(pca, smooth)
fig, ax = cfr.plot_partitions(smooth, 'DeathsPerPeriod', groups,
                              pivot_kwargs={'transform': lambda x: x})
fig.suptitle('Weekly Recorded Deaths', fontsize=20)
ax[0].set_ylabel('Daily Deaths Per Million Inhabitants')
plt.tight_layout()
All states, plotted jointly. Color indicates whether the virus is actively spreading or not.
# Reff values need to be smoothed out a bit, especially because at early times
# testing led to overly high estimates of R0 due to a lack of randomized tests
smooth.Reff = smooth.Reff.replace(np.inf, np.nan)
reff = smooth.Reff.values.copy()
reff[reff > 5] = 5
reff[reff < .6] = 0.6
smooth['Reff'] = reff
color = {True: 'tab:red', False: 'tab:blue'}
smooth['Spreading'] = smooth.Reff > 1
# pass x and y as keywords; positional use is deprecated in seaborn
sns.scatterplot(x='date', y='Reff', hue='Spreading', data=smooth.reset_index(), units='state',
                palette=color, s=5, estimator=None, alpha=0.3)
_ = plt.xticks(rotation=45)
# only keep dates after the NYC / Boston waves:
recent = smooth.loc[smooth.index > dt.datetime(2020, 5, 1)]
pca = cfr.pca_plot(recent, 'Reff', n_comps=8, plot=False)
groups = cfr.partition(pca, recent)
fig, ax = cfr.plot_partitions(recent, 'Reff', groups,
                              pivot_kwargs={'transform': lambda x: x})
fig.suptitle('Weekly Recorded Reff', fontsize=30)
for ai in ax:
    ai.axhline(1, color='black', ls='--')
    ai.set_ylim(0, 3)
plt.tight_layout()
colors = {0: 'red', 1: 'blue', 2: 'black', 3: 'yellow', 4: 'orange'}
dp = []
for s, g in smooth.groupby('state'):
    group = groups.loc[s]
    # peak Reff during the Delta wave (after May 1, 2021):
    reff_delta = g.Reff.loc[g.index > dt.datetime(2021, 5, 1)].max()
    # peak Reff for the original strains (July 1, 2020 to May 1, 2021):
    original = ((g.index < dt.datetime(2021, 5, 1)) &
                (g.index > dt.datetime(2020, 7, 1)))
    reff_cov19 = g.Reff.loc[original].max()
    # store one row per state and strain:
    dp.append((group, s, reff_delta, 'delta'))
    dp.append((group, s, reff_cov19, 'cov19'))
delta = pd.DataFrame(dp, columns=['group', 'state', 'Reff', 'strain'])
delta.group = delta.group.astype('category')
sns.boxplot(x='Reff', y='strain', data=delta, color='gray')
sns.stripplot(x='Reff', y='strain', data=delta, hue='group',
              jitter=False, dodge=True, s=10)
plt.legend(loc=(1, .2))
plt.xlabel('Maximal Recorded Reff')
plt.xlim(1, 5.1)
mean_Reff = delta.groupby(['strain', 'group']).Reff.median().groupby('strain').apply(np.mean)
print(mean_Reff)
strain
cov19    1.667626
delta    1.901461
Name: Reff, dtype: float64
We can compare the raw $R_{eff}$ numbers:
worse = (mean_Reff.loc['delta'] / mean_Reff.loc['cov19'] - 1) * 100
m = 'Delta is {0:.3g}% more infectious than the original COV-19: R for Delta is {1:.2g}, R for SARS-Cov-2 is: {2:.2g} '
print(m.format(worse, mean_Reff['delta'], mean_Reff['cov19']))
Delta is 14% more infectious than the original COV-19: R for Delta is 1.9, R for SARS-Cov-2 is: 1.7
However, this would be naive. Let's pretend that $R_{eff} = R_0 \cdot S \cdot X$, where $R_0$ is the viral reproductive number at the beginning of the pandemic, in the absence of all social distancing measures; $S$ is the fraction of the total population susceptible to the virus; and $X$ represents the dampening effect of governmental policies on viral transmission. Clearly, $S \sim 1$ when the first COVID strains swept the US. By July 2021, however, most of the US had been immunized either through vaccines or infection. It is still unclear just how protective natural immunity is, and it appears that a 2-dose mRNA regimen (absent booster doses) may be less than ideal against Delta. The US has recorded about 40 million cases, and COVID cases are likely under-diagnosed, so 40 million is a lower bound on the true number of infections. At most, everyone may have had COVID once already. Taking the geometric mean of these two bounds suggests that ~30-40% of the US has already been infected. Around July 2021, something close to 40% of all Americans had also been vaccinated.
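The geometric-mean estimate in the paragraph above can be checked with a few lines of arithmetic. This is only a rough sketch: the US population of 330 million is my assumption (it is not stated in the notebook), and `geometric_mean` is a helper name introduced here for illustration.

```python
import math

US_POPULATION = 330e6   # assumption: approximate US population
RECORDED_CASES = 40e6   # lower bound from the text: recorded cases


def geometric_mean(low, high):
    """Geometric mean of two positive bounds."""
    return math.sqrt(low * high)


# Lower bound: only recorded cases. Upper bound: everyone infected once.
infected = geometric_mean(RECORDED_CASES, US_POPULATION)
infected_fraction = infected / US_POPULATION
print('Estimated infected fraction: {0:.0%}'.format(infected_fraction))
```

Under that population assumption, the estimate lands inside the 30-40% range quoted above.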
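If these two groups did not overlap at all, then up to 80% of the US population was immunized before Delta came around. If infection and vaccination were instead independent of each other, the combined fraction would be closer to 60%. Let's assume the true number is around 50-60%.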
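How much the infected and vaccinated groups overlap determines the immunized fraction. The sketch below compares two simplifying assumptions that I am introducing for illustration: no overlap at all (the fractions add) and statistical independence (the fractions combine as $1 - (1 - a)(1 - b)$). The inputs are the rough estimates from the text, and both function names are hypothetical.

```python
def immunized_disjoint(infected, vaccinated):
    """Upper bound: nobody is both infected and vaccinated."""
    return min(1.0, infected + vaccinated)


def immunized_independent(infected, vaccinated):
    """Infection and vaccination are statistically independent."""
    return 1.0 - (1.0 - infected) * (1.0 - vaccinated)


vaccinated = 0.40  # rough vaccinated fraction around July 2021 (from the text)
for infected in (0.30, 0.40):  # rough infected range (from the text)
    print('infected={0:.0%}: disjoint={1:.0%}, independent={2:.0%}'.format(
        infected,
        immunized_disjoint(infected, vaccinated),
        immunized_independent(infected, vaccinated)))
```

Neither assumption is likely to hold exactly, which is one reason to treat 50-60% as a working range rather than a measurement.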
In this case, rather than compare $R_{eff}$, we really should compare $R_0 \cdot X = R_{eff} / S$, assuming that governmental policies have not changed between the two timepoints we are comparing. Assuming the susceptible fraction was 100% for the original COVID-19 strain, and 50% (or 60%) for Delta, then:
f = .5
worse = (mean_Reff.loc['delta'] / f / mean_Reff.loc['cov19'] - 1) * 100
print(m.format(worse, mean_Reff['delta'] / f, mean_Reff['cov19']))
f = .6
worse = (mean_Reff.loc['delta'] / f / mean_Reff.loc['cov19'] - 1) * 100
print(m.format(worse, mean_Reff['delta'] / f, mean_Reff['cov19']))
Delta is 128% more infectious than the original COV-19: R for Delta is 3.8, R for SARS-Cov-2 is: 1.7
Delta is 90% more infectious than the original COV-19: R for Delta is 3.2, R for SARS-Cov-2 is: 1.7
Oh wow. If this is true, then the $R_0$ for Delta is about twice that of the original SARS-CoV-2 virus that first landed in the US. That seems like a lot.
Finally, we know that governmental and social interventions cut transmission of the original SARS-CoV-2 by about 50-70% (the original virus is estimated to have a true $R_0$ of ~3, and interventions brought $R_{eff}$ down to $\sim 1$). Therefore, $R_0$ for Delta in the absence of governmental interventions would be:
# f is still 0.6 from the previous cell; interventions are assumed to cut transmission by 60%
delta_R0 = mean_Reff.loc['delta'] / f / (1 - .6)
print('Delta R_0: {0:.2g}'.format(delta_R0))
Delta R_0: 7.9
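Because this number rests on two rough assumptions, the susceptible fraction and the strength of interventions, it may help to see how sensitive it is to both. The sketch below hard-codes the Delta $R_{eff}$ printed earlier in this notebook (1.901461) and sweeps the ranges mentioned in the text. `implied_r0` is a hypothetical helper, not part of `cfr`.

```python
REFF_DELTA = 1.901461  # Delta Reff printed earlier in this notebook


def implied_r0(reff, susceptible, reduction):
    """Invert Reff = R0 * S * X, taking X = 1 - reduction."""
    if not 0 < susceptible <= 1:
        raise ValueError('susceptible must be in (0, 1]')
    if not 0 <= reduction < 1:
        raise ValueError('reduction must be in [0, 1)')
    return reff / susceptible / (1.0 - reduction)


for susceptible in (0.5, 0.6):          # susceptible fractions used above
    for reduction in (0.5, 0.6, 0.7):   # 50-70% cut from interventions
        r0 = implied_r0(REFF_DELTA, susceptible, reduction)
        print('S={0:.1f}, reduction={1:.0%}: R_0 = {2:.2g}'.format(
            susceptible, reduction, r0))
```

Across these ranges the implied $R_0$ spans roughly 6 to 13, so the 7.9 above should be read as one point in a wide band rather than a precise estimate.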