Correcting pandemic data analysis through environmental fluid dynamics

It is well established that the data reported for the daily number of infected cases during the first wave of the COVID-19 pandemic were inaccurate, primarily due to insufficient tracing across the populations. Because the uncertain first wave data are mixed with the second wave data, the general conclusions drawn from them could be misleading. We present an uncertainty quantification model for the infected cases of the pandemic's first wave based on fluid dynamics simulations of the weather effects. The model is physics-based and uses the more adequate second wave data to rectify the inadequacy of the first wave data in a pandemic curve. The proposed approach combines the environmental seasonality-driven virus transmission rate with pandemic multiwave phenomena to improve the accuracy of the data underlying statistical predictions. For illustration purposes, we apply the new physics-based model to New York City data.


I. INTRODUCTION
Modeling and analysis of global epidemiology are very challenging. 1 The COVID-19 pandemic has spread widely due to airborne virus transmission. 2-5 The coronavirus pandemic has impacted the economy and the environment on a global scale 6,7 since March 2020.
The daily number of infected cases reported during the first wave of the pandemic was inaccurate due to several factors:
• Insufficient tracing across the population.
• Inaccuracy of testing equipment. 8
• Lack of reproducible confirmation tests.
• Inaccuracy of management by public and private health institutions.
Many online digital libraries, platforms, and institutions continue to use the inaccurate data from the first wave. For the reasons mentioned above, statistical comparisons between the first and second waves could therefore lead to misleading conclusions.
The number of deaths is a more reliable source of information than the number of infected cases. This has been a topic of recent debate. 9,10 Ioannidis et al. 9 investigated the second vs the first wave of COVID-19 deaths. From available data on the age vs the number of deaths, they correlated the second vs the first wave of COVID-19 deaths to the shifts in age distribution and nursing home fatalities. Graichen 10 investigated the difference between the first and the second (and third) waves of COVID-19 to form a German and European perspective. Other research works follow similar lines. Yue et al. 11 tried to estimate the actual size of a pandemic from surveillance systems. Stamatakis et al. 12 and Huang 13 investigated pandemic data reported in the United Kingdom to examine associations between lifestyle risk factors and mortality outcomes. Such studies are motivated by an unknown lack of representativeness that can affect the magnitude and direction of effect estimates.
We recently showed that two pandemic outbreaks would be inevitable due to environmental weather seasonality. 14 Our findings were based on high-fidelity, multiphase fluid dynamics, and heat and mass transfer simulations of airborne virus transmission. 15 We combined the simulation results with epidemiological modeling enhanced by a new airborne infection rate index (AIR) and meteorological data. 14 It is generally believed that the high-temperature and high-humidity environment is conducive to reducing the transmission rate of the new coronavirus. In our previous research, 14,15 we showed a complex mechanism associated with the effects of weather conditions on virus transmission. It incorporates combinations of three major parameters: relative humidity, temperature, and wind speed.
The published data for the daily infected cases during the first wave are inaccurate and thus can be misleading. We develop a new uncertainty quantification model using environmental-climate data that corrects the daily number of infected cases during the first wave, thus improving the comparative analysis of the first and second waves. For illustration purposes, we apply the new model to correct the first wave data for the daily number of infections reported for New York City (NYC) between March 2020 and March 2021.

II. MATHEMATICAL MODEL DEVELOPMENT
The pandemic's second wave data constitute a more reliable source of information than the first wave data recorded from early March 2020 onward.
In March 2021, one year into the COVID-19 pandemic, we know the following:
• The order of magnitude of the total number of deaths from COVID-19 did not change significantly between the first and the second wave periods. 16
• The order of magnitude of the number of patients in intensive care units (ICUs) did not change significantly between the first and the second wave periods. 16
Given the above, one can define an accurate mortality rate, $W$, in an infected population [Eq. (1)]. Moreover, $W'(n)$ can be considered an inaccurate representation of $W(n)$, which is per unit time and depends on the wave number ($n$) of the pandemic [Eq. (2)]. Note that the inaccuracy of $W'$ is placed only in $I'_c$. Here, $n = 1$ and $n = 2$ denote the first and second waves of the pandemic; $I_c$ and $D_c$ are the stationary cumulative numbers of infected cases and of deaths due to the infection; and $\beta$ is the transmission rate of infection per unit time inside a general population $N_p$. In other words, $\beta$ represents the probability per unit time that a susceptible individual becomes infected.
Liu et al. 17 shed light on the role of seasonality in the spread of the COVID-19 pandemic. Dbouk and Drikakis 14 quantified the relationship between the weather seasonality and the transmission rate $\beta$ through extensive fluid mechanics simulations for crucial weather parameters such as temperature, relative humidity, and wind speed. They quantitatively showed how a weather-dependent $\beta$ could produce two pandemic waves during one year. Our analysis and modeling approaches are based on the following premises:
(1) The weather seasonality is a major driving force behind the two pandemic waves occurring annually.
(2) The lethality of the virus did not change significantly between two similar seasons of the first and second waves of the pandemic, i.e., winter 2020 vs winter 2021.
(3) The social behaviors and global restriction strategies did not change significantly between the two waves' periods (i.e., masks, social distancing, lockdowns, etc.).
(4) The age pyramid of a country did not change significantly between the two pandemic waves.

FIG. 1. Weather-dependent transmission rate ($\beta$) in NYC between March 2020 and March 2021 (one-year period). A maximum transmission rate of 0.5 per day means that the probability is $P = 1$ (100%) for a susceptible individual to be infected in two days ($1/0.5$) due to the weather conditions (wind speed, temperature, and relative humidity) in a region. 14 (a) NYC weather history data with the hat symbol denoting daily weather data averaged per month. (b) Weather-dependent transmission rate [airborne infection rate index ($\mathrm{AIR} = \beta$)] showing three trends denoted high, medium, and low, separated by the respective threshold values 0.4 and 0.3.
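The banding described in the Fig. 1(b) caption can be illustrated with a minimal Python sketch. The thresholds 0.4 and 0.3 and the 1/0.5 = 2 day example come from the caption; the function names and the treatment of values lying exactly on a threshold are assumptions made here for illustration only.

```python
# Thresholds quoted in the Fig. 1(b) caption (units: per day).
HIGH_THRESHOLD = 0.4
MEDIUM_THRESHOLD = 0.3


def classify_transmission_rate(beta):
    """Return 'high', 'medium', or 'low' for a transmission rate beta.

    Assumption: a value exactly on a threshold falls in the upper band.
    """
    if beta >= HIGH_THRESHOLD:
        return "high"
    if beta >= MEDIUM_THRESHOLD:
        return "medium"
    return "low"


def characteristic_time_days(beta):
    """Time 1/beta in days, as in the caption's example 1/0.5 = 2 days."""
    return 1.0 / beta


for beta in (0.5, 0.35, 0.1):
    print(beta, classify_transmission_rate(beta), characteristic_time_days(beta))
```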

Physics of Fluids
ARTICLE scitation.org/journal/phf
Using the above, the mortality rate among the stationary cumulative number of infected individuals should be approximately the same in two consecutive waves of the pandemic [Eq. (3)], where $\Delta t_n = t_n^f - t_n^i$ denotes the $n$th-wave period of a pandemic. The final time $t_n^f$ is determined by a condition [Eq. (4)] in which $\varepsilon$ is a positive infinitesimal value. It is worth noting that $|\partial \beta(n)/\partial t| = |\partial W(n)/\partial t|$.
It is well established that the non-cumulative number of infected cases $I'(1)$ reported during the first pandemic wave is inaccurate, primarily due to insufficient tracing across the population. Therefore, we aim to correct the inaccurate (old) data, $I'(1)$, to obtain the more accurate (new) data, $I(1)$ [Eq. (5)]. Substituting $n = 1$ in Eq. (3) and using Eqs. (1) and (5) yields the correction for the first wave data.

III. NYC CASE
Figure 1 shows the weather-dependent transmission rate ($\beta$) in NYC between March 2020 and March 2021. A maximum transmission rate of 0.5 day$^{-1}$, related to the coronavirus airborne concentration rate, means that the probability is $P = 1$ (100%) for a susceptible individual to be infected in two days due to the weather conditions (wind speed, temperature, and relative humidity). 14 Figure 1(a) shows the NYC weather history data with the hat symbol denoting daily weather data averaged per month. Figure 1 Fig. 2(c)] driven by the force of weather seasonality in NYC. 14 Figure 3 shows the mortality rate in the infected population of NYC between 03 March 2020 and 03 March 2021, computed using Eq. (1) with $n = 1$, and Eq. (2) with $n = 2$. If the data of both waves were accurate, we should have $W(1) \approx W(2)$. $W'(1)$ is, however, different from $W(2)$. Thus, any asymmetry between $W'(1)$ and $W(2)$ is a measure of the uncertainty associated with the first wave. This prompts a correction of the inaccurate daily number of infected cases $I'(1)$ reported during the first pandemic wave.
The corrected first wave data for the daily number of infected cases in NYC are shown in Fig. 4. One can see that the correction results in increasing the infected cases fourfold.
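The idea behind the correction, namely that the first wave mortality rate should match the second wave one, can be illustrated with a deliberately simplified Python sketch. It assumes that the mortality rate is the plain ratio of cumulative deaths to cumulative infected cases and that the correction scales every daily value by the same factor; it is not the authors' formulation and ignores the role of the transmission rate. All function names and numbers below are hypothetical and are not NYC data.

```python
def correction_factor(cum_cases_1_reported, cum_deaths_1, cum_cases_2, cum_deaths_2):
    """Ratio W'(1) / W(2) under the simplifying assumption W = D_c / I_c.

    If W(1) is forced to equal W(2), the corrected cumulative first wave
    cases are cum_deaths_1 / W(2), i.e. the reported cases times this factor.
    """
    w1_reported = cum_deaths_1 / cum_cases_1_reported  # inaccurate first wave ratio
    w2 = cum_deaths_2 / cum_cases_2                    # trusted second wave ratio
    return w1_reported / w2


def correct_daily_cases(daily_cases_1_reported, factor):
    """Scale each reported daily value uniformly (an assumption of this sketch)."""
    return [factor * cases for cases in daily_cases_1_reported]


# Hypothetical, made-up numbers used only to exercise the functions.
daily_reported = [100.0, 200.0, 100.0]
factor = correction_factor(sum(daily_reported), 40.0, 4000.0, 200.0)
daily_corrected = correct_daily_cases(daily_reported, factor)
print(factor, daily_corrected)
```

With these made-up inputs, the reported first wave ratio is twice the second wave ratio, so the sketch doubles every daily value.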

IV. CONCLUSIONS AND PERSPECTIVES
The coronavirus pandemic data for the daily number of infections vs the number of deaths during the first wave were incomplete. They lead to misleading conclusions if considered as a data reference. The data inaccuracy for the first wave was primarily due to insufficient tracing across the population. Unfortunately, various online digital libraries and platforms continue to adopt, host, and diffuse these inaccurate data from the first wave, followed by more accurate data of the daily number of infections from the second and subsequent waves.
We proposed a new physics-based uncertainty quantification and correction model, informed by fluid dynamics, that rectifies the first wave data's inadequacy. As an illustration, we applied the new model to correct the pandemic's first wave data for the daily number of infected cases reported in NYC, USA. The proposed model is limited to regions that witnessed more than one pandemic wave. It can be used to correct their first wave data reported for the daily number of infected cases.
Environmental temperature and humidity affect the ability of the virus to infect, but they are not themselves the decisive factor in preventing the spread of the virus. We cannot rely on seasonal temperature rises to suppress the epidemic. Instead, we should focus more on the formulation and implementation of active epidemic prevention control policies. Social protective measures such as social distancing and face masks will remain important during a pandemic.

ACKNOWLEDGMENTS
The authors would like to thank the Editor-in-Chief and Physics of Fluids staff for their assistance during the peer-review and publication of the manuscript.

APPENDIX A: MATHEMATICAL DERIVATION
The reader can find the computational models and the associated assumptions in previous studies published by the authors. 3,14 For the carrier bulk multiphase fluid mixture, we have employed the compressible multiphase mixture Reynolds-averaged Navier-Stokes equations in conjunction with the $k$-$\omega$ turbulence model in the shear-stress-transport formulation. The governing equations are detailed in many textbooks. 19,20 The models accounted for the concentration variation in saliva and predicted airborne virus concentration in expelled saliva droplets under different environmental conditions. From hundreds of computational fluid dynamics simulations, we developed a reduced-order model (ROM) as a new virus airborne infection rate (AIR) index that is directly proportional to the virus concentration rate (CR). AIR is employed to quantify the potential of airborne coronavirus survival under different climate conditions (average temperature, relative humidity, and wind speed) in several worldwide cities.

Airborne virus particles in saliva: Concentration Rate
For an initial uniform distribution of virus particles, the concentration, $C$, decreases in each airborne saliva droplet as a function of time at different proportions and different rates such that
$$\mathrm{CR} = \frac{\partial (C/C_0)}{\partial t} + 0.5. \tag{A2}$$
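As a rough numerical illustration of Eq. (A2), the sketch below estimates CR from a sampled concentration history using forward differences. The function name, the sampling times, and the concentration values are hypothetical, and the time unit is left generic because it is not stated here.

```python
def concentration_rate(times, concentrations, c0):
    """Forward-difference estimate of CR = d(C/C0)/dt + 0.5 on each interval.

    times and concentrations are equally long sequences; c0 is the initial
    concentration used for normalization.
    """
    ratios = [c / c0 for c in concentrations]
    rates = []
    for k in range(len(times) - 1):
        slope = (ratios[k + 1] - ratios[k]) / (times[k + 1] - times[k])
        rates.append(slope + 0.5)
    return rates


# Hypothetical samples: C/C0 drops from 1.0 to 0.8, then to 0.7.
print(concentration_rate([0.0, 1.0, 2.0], [10.0, 8.0, 7.0], 10.0))
```

A faster decay of the normalized concentration gives a more negative slope and therefore a smaller CR, which is consistent with CR being tied to virus survivability.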

Airborne Infection Rate
The CR is directly proportional to the virus survivability. It provides an appropriate indicator for airborne transmission, which is defined as an "Airborne Infection Rate (AIR)." The CR values between 0 and 0.5 are bounded between 0 and 1 using the operator $\langle * \rangle$, which transforms a dimensional physical variable $\xi$ into a dimensionless one denoted by $\xi^*$ such that
$$\xi^* = \frac{\xi - \min(\xi)}{\max(\xi) - \min(\xi)}, \tag{A4}$$
where min and max are the minimum and maximum values of $\xi$, respectively. Many results are obtained for $\mathrm{CR}^*$ at different weather conditions from several advanced computational fluid dynamics multiphase simulations. 14 Then, all the data points are well fitted as a function of the relative humidity (RH), the wind speed ($U$), and the temperature ($T$) through a fitting function $F$. Note that $\mathrm{CR}^*$ can be transformed back into CR using Eq. (A4) with $\min(\mathrm{CR}) = 0$ and $\max(\mathrm{CR}) = 0.5$.
In the next subsection, we show how CR can be incorporated into epidemiology models, e.g., through a weather-dependent transmission rate $\beta$.
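The min-max scaling of CR and its inverse can be sketched in a few lines of Python. The bounds 0 and 0.5 are the ones quoted in the text; the function names are hypothetical.

```python
CR_MIN = 0.0  # minimum CR quoted in the text
CR_MAX = 0.5  # maximum CR quoted in the text


def normalize(value, lower, upper):
    """Map a dimensional value in [lower, upper] onto [0, 1]."""
    return (value - lower) / (upper - lower)


def denormalize(value_star, lower, upper):
    """Inverse mapping from [0, 1] back to [lower, upper]."""
    return lower + value_star * (upper - lower)


cr_star = normalize(0.25, CR_MIN, CR_MAX)
cr_back = denormalize(cr_star, CR_MIN, CR_MAX)
print(cr_star, cr_back)
```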

Weather-dependent epidemiological model
The extensive high-fidelity simulations led to $\mathrm{CR} = \mathrm{AIR}$ as a function of $T$, RH, and $U$. We consider AIR a good indicator for airborne virus transmission and proposed it as a flow-physics-relevant parameter in epidemiological models. 14 As a physics-based simulation model, we considered a standard SIR model 21 given by
$$\frac{dS}{dt} = -\frac{\beta S I}{N}, \qquad \frac{dI}{dt} = \frac{\beta S I}{N} - \gamma I, \qquad \frac{dR}{dt} = \gamma I,$$
where $\beta = \mathrm{AIR}$ is a physics-based, weather-dependent transmission rate, and $\gamma$ is the recovery rate coefficient that depends on the individual's health and immune system. The parameters $\beta$ and $\gamma$ represent, respectively, the probability per unit time that a susceptible individual becomes infected and the probability per unit time that an infected person becomes recovered and immune. $t$ is time, and $N$ is the population number. $S$, $I$, and $R$ are the numbers of susceptible, infected, and recovered individuals, respectively.
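To make the role of a time-varying transmission rate concrete, the sketch below integrates a standard SIR model (susceptibles decrease at rate βSI/N, infected recover at rate γI) with a forward Euler scheme. The seasonal function used for β is a hypothetical stand-in for the AIR index, not the fitted model of Ref. 14, and the recovery rate, population, time step, and function names are likewise assumptions chosen only for illustration.

```python
import math


def seasonal_beta(day):
    """Hypothetical transmission rate (per day) oscillating between 0.1 and 0.5.

    It peaks twice per year and merely stands in for a weather-dependent AIR.
    """
    return 0.3 + 0.2 * math.cos(4.0 * math.pi * day / 365.0)


def simulate_sir(beta_of_t, gamma, population, infected0, days, dt=0.1):
    """Forward Euler integration of the standard SIR model.

    Returns a list of (time, S, I, R) tuples, including the initial state.
    """
    s = float(population - infected0)
    i = float(infected0)
    r = 0.0
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(steps):
        beta = beta_of_t(k * dt)
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append(((k + 1) * dt, s, i, r))
    return history


# Assumed values: gamma = 0.1 per day, 1000 individuals, one initial case.
history = simulate_sir(seasonal_beta, 0.1, 1000.0, 1.0, 365)
final_time, s_end, i_end, r_end = history[-1]
print(final_time, s_end, i_end, r_end)
```

A forward Euler step keeps the example short; a higher-order integrator would be a natural replacement for quantitative work.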

DATA AVAILABILITY
The data that support the findings of this study are available on request from the authors.