Wavelets, Delta and COVID-19: an exercise in time series analysis

In this Jupyter notebook, I try to estimate how much more infectious the Delta variant is than the 'original' COVID-19 strain. Here, I define 'original' as the ensemble of viruses that circulated from 2020 through June 2021.

The approach will be as follows:

  1. Download the NYT database and normalize the COVID-19 case counts by each state's population.
  2. For each state, find the new cases reported each day. This series oscillates wildly and needs to be smoothed. Most analyses these days take a rolling average, but another approach is to use wavelets to identify the sources of variation and smooth them out. I used the pywt package (PyWavelets) to do this.
  3. Calculate a poor man's $R_{eff}$ by dividing the total cases reported in week $t$ by the total cases reported in week $t-1$.
  4. Cluster the states according to their $R_{eff}$ behavior by finding the k-nearest neighbors of each state, then applying Leiden clustering.
  5. For each cluster, identify the peak $R_{eff}$ for the original strain (before July 2021) and the peak $R_{eff}$ for Delta (from July 2021 onward), then compare and contrast.
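
Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not necessarily the exact procedure used later in the notebook: the function names `wavelet_smooth` and `weekly_ratio`, the `db4` wavelet, and the choice to drop the three finest detail levels (scales of roughly 2, 4 and 8 days, which cover the weekly reporting cycle) are all assumptions made for the example.

```python
import numpy as np
import pywt


def wavelet_smooth(daily_cases, wavelet="db4", levels_to_drop=3):
    """Smooth a daily series by discarding its finest wavelet detail levels.

    The wavelet family and the number of dropped levels are illustrative
    assumptions, not values taken from the notebook.
    """
    x = np.asarray(daily_cases, dtype=float)
    # wavedec returns [cA_n, cD_n, ..., cD_1]: one approximation array
    # followed by detail arrays from coarsest to finest.
    coeffs = pywt.wavedec(x, wavelet, level=levels_to_drop)
    # Zero every detail array so only the coarse approximation survives.
    coeffs[1:] = [np.zeros_like(c) for c in coeffs[1:]]
    # waverec can return one extra sample for odd-length input, so trim.
    smooth = pywt.waverec(coeffs, wavelet)[: len(x)]
    # Case counts cannot be negative.
    return np.clip(smooth, 0.0, None)


def weekly_ratio(smoothed, window=7):
    """Poor man's R_eff: cases in the last 7 days / cases in the 7 days before."""
    s = np.asarray(smoothed, dtype=float)
    # Rolling 7-day totals; entry i covers days i .. i + window - 1.
    weekly = np.convolve(s, np.ones(window), mode="valid")
    prev, curr = weekly[:-window], weekly[window:]
    with np.errstate(divide="ignore", invalid="ignore"):
        # Undefined (NaN) wherever the previous week had no cases.
        return np.where(prev > 0, curr / prev, np.nan)
```

Calling `weekly_ratio(wavelet_smooth(new_cases))` on one state's daily new cases yields one ratio per day, starting 13 days into the series. Values above 1 mean the weekly case count is growing, values below 1 mean it is shrinking.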

Smoothed case estimates

Smoothed death estimates