Modeling COVID-19 for Lifting Non-Pharmaceutical Interventions

As a result of the COVID-19 worldwide pandemic, the United States instituted various non-pharmaceutical interventions (NPIs) in an effort to slow the spread of the disease. Although necessary for public safety, these NPIs can also have deleterious effects on the economy of a nation. State and federal leaders need tools that provide insight into which combination of NPIs will have the greatest impact on slowing the disease and at what point in time it is reasonably safe to start lifting these restrictions to everyday life. In the present work, we outline a modeling process that incorporates the parameters of the disease, the effects of NPIs, and the characteristics of individual communities to offer insight into when and to what degree certain NPIs should be instituted or lifted based on the progression of a given outbreak of COVID-19. We apply the model to the 24 county-equivalents of Maryland and illustrate that different NPI strategies can be employed in different parts of the state. Our objective is to outline a modeling process that combines the critical disease factors and factors relevant to decision-makers who must balance the health of the population with the health of the economy.


Introduction
In December of , a cluster of pneumonia cases of unknown origin was identified in Wuhan, China. An investigation into the cases commenced in early January that led to the discovery of a novel coronavirus now designated SARS-CoV-2. The virus causes an infectious disease now known as Coronavirus Disease this result for the SARS outbreak of , noting that outbreaks of SARS occurring nearly simultaneously in different parts of Canada had vastly different progressions. In order to have a practically useful model of disease progression, we must therefore estimate the degree distribution of the contact network of each community. Fortunately, the U.S. Census conducts an extensive annual survey known as the American Community Survey (ACS) (Census ). The ACS provides county-level information on characteristics of the population such as households, household size, school enrollment, occupation, and age distribution. Meyers et al. ( ) outlined a process for taking comparable data (their study focused on Vancouver, Canada) and inferring a network structure based on a few simplifying assumptions. Here we design a similar algorithm using the ACS data to create county-level social contact networks that are likely to be representative of human-to-human contact before the outbreak of COVID-19 and the associated NPIs.
Meyers et al. ( ) assumed that individuals living in the same household would have physical contact with probability $p_h = 1.0$. That is, every vertex in a household has an edge with every other vertex in the same household. Once the household subgraphs are created, the population is then assigned to schools based on their age group and school enrollment data, and to workplaces based on occupation data. Individuals who attend the same school have physical contact with probability $p_s = 0.3$ and those who work together have contact with probability $p_o = 0.03$. Finally, people have friends, go to restaurants and stores, and otherwise interact socially. These public contacts occur with probability $p_p = 0.003$ and can involve any two individuals in the community. These parameters were also taken from Meyers et al. ( ) and we found empirically that they work well for the counties we studied. The full algorithmic statement is shown in Figure , but the basic premise is straightforward. Using the necessary values from the ACS, the contact network for each county-equivalent is built incrementally as follows:

1. Create a vector of vertices distributed according to the ACS age brackets
2. Group the vertices into households according to the ACS data
3. Assign edges between members of a household with probability $p_h$
4. For each school level, assign edges between pairs of appropriate vertices with probability $p_s$
5. For each occupation, assign edges between pairs of appropriate vertices with probability $p_o$
6. For all vertices, assign edges between pairs with probability $p_p$

Input: Data values taken from American Census Survey 2020
Repeat for each county-equivalent:
$a = \{0-5, 5-9, \ldots, 80-84, 85+\}$ defines the age brackets
$a_{>65}$ = the number of people over 65 living alone
$a_{18-65}$ = the number of people 18 to 65 living alone
$\mathbf{N}_a$ = vector of vertices distributed by age
$H$ = the total number of households
$HS$ = average household size
$mc$ = the number of married couples
$mc_{18}$ = the number of married couples with 1 or more children under 18
$sg_{18}$ = the number of single adults with 1 or more children under 18
$\mathbf{S}$ = school enrollment by level {Nursery, Elementary, High, College}
$\mathbf{O}$ = occupation numbers by industry sector
$p_h$ = probability of contact between 2 people in same household
$p_s$ = probability of contact between 2 people in same school level
$p_o$ = probability of contact between 2 people in same occupation
$p_p$ = probability of contact between 2 randomly chosen people in public
(Figure: algorithm body only partially recoverable) $\mathbf{e} \leftarrow$ random edge $\mathbf{e} = (e_1, e_2)$ selected from [...] where $p \in \mathbf{P}_s$ and every $p$ is used at least once [...] $n_{1,i} \in \mathbf{N}_1$ and $n_{2,i} \in \mathbf{N}_2$


The Agent-based model
The ABM was built in the NetLogo multi-agent programmable modeling environment version . . (Wilensky ). The basic model components are:
• the disease progression model
• the population of agents and their social contact structure
• the environment consisting of schools, homes, hospitals, workplaces, and public venues, and
• the logic of NPIs, testing, and contact tracing.
The disease progression model was designed to follow the dynamics of COVID-19 as they were understood as of March of . Based on the classic Susceptible, Exposed, Infected, Recovered (SEIR) model, the agents move through discrete states as the virus runs its course. The disease states include Susceptible, Exposed, Mild, Severe, Critical, Deceased, and Recovered. When the model is initiated, most agents are instantiated as Susceptible except for a small, user-specified number of agents who are initiated as Exposed in order to generate the outbreak. These few exposed agents progress to either Mild or Severe after the parameterized time duration of the Exposed state, at which point they can infect other agents with whom they come into contact. The duration of each disease state is a parameter that can be overridden by the user, which facilitates flexibility and response to changes driven by emerging research in COVID-19 disease progression. The default dynamics are outlined in . We used a combination of published studies and pre-publication data on medRxiv to establish the disease transmission parameters, but it should be noted that studies of COVID-19 are ongoing and these parameters should be updated at the time the model is being used for decision-making. For our purposes, these parameters worked well when fitting to the empirical data available at the time of our study. The dynamics of COVID-19 progression are both time-based and stochastic. That is, different states of the disease have well-documented time frames, but the chance of an agent progressing through a mild or severe case is stochastic. When agents are co-located, there is some chance that the disease will be transferred from one agent to another. At each time step agents that are contagious (those in the Mild, Severe, or Critical states) will look for another agent on the same patch (a small piece of the environment).
If there is at least one other agent on the patch, contagious agents will attempt to pass the virus to one other agent. Successful passing of the virus is a function of a user-specified probability of successfully getting the virus on the other agent, a user-specified mitigation probability, and the health status of the target agent (if they are already sick, they will not get sicker). The first two probability parameters allow the user to simulate situations such as wearing masks, maintaining social distance, and hand-washing.
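A minimal sketch of one transmission attempt is given below. The text does not state how the two user-specified probabilities are combined, so the multiplicative form used here is an assumption, and the function and parameter names are hypothetical rather than taken from the NetLogo model.

```python
import random


def attempt_transmission(target_state, p_transmit, p_mitigation, rng):
    """Return True if a contagious agent infects the target on this time step."""
    # Only Susceptible agents can be infected; agents in any other state
    # are unaffected ("if they are already sick, they will not get sicker").
    if target_state != "Susceptible":
        return False
    # ASSUMPTION: mitigation (masks, distancing, hand-washing) scales the
    # base transmission probability multiplicatively.
    p_effective = p_transmit * (1.0 - p_mitigation)
    return rng.random() < p_effective


rng = random.Random(42)
trials = 10_000
hits = sum(attempt_transmission("Susceptible", 0.2, 0.5, rng) for _ in range(trials))
print(hits / trials)  # an empirical rate near 0.2 * (1 - 0.5)
```

Under this assumption, raising the mitigation probability toward 1 drives the effective transmission probability toward 0, which is one way to represent population-wide mask use without changing the disease parameters.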
Once an agent becomes Exposed, after five days they will move to either the Mild state ( . chance) or Severe state ( . chance). If the agent transitions to the Severe state, then after four days they will transition to either the Critical state ( . chance) or the Recovered state ( . chance). If, on the other hand, the agent transitions to the Mild state from the Exposed state, after six days they will transition to either the Severe state ( . chance) or the Recovered state (. chance). Once in the Recovered state agents are no longer able to contract the disease. Agents in the Critical state are assumed to need breathing assistance and significant medical support; therefore, if the agent is unable to go to the hospital within three days of becoming critical they will die with probability . . Agents are contagious when they are in the Mild, Severe, or Critical states. The Mild state includes both symptomatic and asymptomatic agents and each agent has an attribute indicating which category they are in. It is assumed that agents in the Exposed state are not contagious. That is, the Exposed state represents an incubation period where the viral load in the agent is too low to infect others. This state is not the same as an asymptomatic state. Once agents progress to the Mild state they may be asymptomatic or symptomatic.
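The state transitions described above can be sketched as a small timed state machine. The durations (five, six, and four days) come from the text; every branching probability in the sketch is a hypothetical placeholder, and the Critical-state hospital logic is omitted for brevity.

```python
import random

# Durations in days are taken from the text.
DURATION = {"Exposed": 5, "Mild": 6, "Severe": 4}

# HYPOTHETICAL placeholder probabilities: these are NOT the model's values.
NEXT_STATE = {
    "Exposed": [("Mild", 0.8), ("Severe", 0.2)],
    "Mild": [("Severe", 0.1), ("Recovered", 0.9)],
    "Severe": [("Critical", 0.3), ("Recovered", 0.7)],
}

CONTAGIOUS = {"Mild", "Severe", "Critical"}  # Exposed agents are not contagious


def step(state, days_in_state, rng):
    """Advance an agent by one day and return (new_state, days_in_new_state)."""
    if state not in DURATION:
        # Susceptible, Critical, Recovered, and Deceased are not timed here.
        return state, days_in_state + 1
    if days_in_state + 1 < DURATION[state]:
        return state, days_in_state + 1
    # The state's duration has elapsed: draw the next state stochastically.
    options, weights = zip(*NEXT_STATE[state])
    return rng.choices(options, weights=weights)[0], 0
```

Keeping durations and branching probabilities in separate tables mirrors the point made in the text that each can be overridden as new research becomes available.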
The next component of the ABM is the environment, which includes both physical spaces and the initial social contact structure of the population. The model is instantiated with the inferred contact network of a U.S. county scaled to be approximately , agents. These contact structures are generated using U.S. Census data and the algorithm described in the previous section. The size of the ABM environment is then adjusted to approximate the population density of the county in question. The physical space is created from the aforementioned networks. Agents are assigned to schools, workplaces, public venues, and homes. The locations are then placed in the modeling environment. They can be mixed, as one might see in an urban area where individuals live, work, shop, and learn in close proximity, or home locations can be more separated, as one might see in rural or suburban counties.
The NPI component of the ABM is integrated into these physical locations and the agent behaviors. As the simulation runs, each agent spends one -hour time step at home and the next -hour time step at work, school, or a public venue as long as going to that venue is not prohibited by an NPI. The available NPIs include closing schools, closing workplaces, closing public venues, imposing social distancing requirements (i.e., stay-at-home orders), and isolating individuals. Individual mitigation steps can also be modeled using the probabilities associated with disease transmission, as mentioned earlier.
The final model component is the logic of testing and contact tracing. There are two alternative testing strategies incorporated into the ABM: random selection of everyone not in the hospital, or targeted testing of a specific percentage of symptomatic and asymptomatic agents. For the first alternative, a user-specified number of agents are randomly chosen with replacement to be tested whenever testing takes place. For the second alternative, at a given point in time a number of tests are made available. This number is divided between symptomatic and asymptomatic agents. A set of agents equal to or less than the number of allocated tests is then randomly chosen and tested. Testing accuracy includes a user-defined false-negative rate, but false positives are not modeled. Thus an agent who receives a positive test has COVID-19, but an agent receiving a negative test result is not guaranteed to be free of the disease. Symptomatic agents are defined as those with Mild, Severe, or Critical disease states. Agents in states Susceptible, Exposed, or Recovered are considered asymptomatic. All agents who are tested are placed in isolation for the duration of the testing period, which is a user-defined time parameter. Those who test positive are placed in isolation for days and, if applicable, contact tracing is performed.
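The asymmetric test accuracy described above (false negatives possible, false positives impossible) can be illustrated with a short sketch. The function names are hypothetical, and whether a given disease state counts as "having the disease" is left to the caller.

```python
import random


def run_test(has_disease, false_negative_rate, rng):
    """Return True for a positive result; false positives are never produced."""
    if not has_disease:
        return False
    # A diseased agent is missed with probability false_negative_rate.
    return rng.random() >= false_negative_rate


def random_testing(agent_ids, n_tests, rng):
    """First strategy: choose agents to test at random, with replacement."""
    return rng.choices(agent_ids, k=n_tests)


rng = random.Random(7)
selected = random_testing(list(range(100)), 10, rng)
print(len(selected))  # 10 (the same agent may appear more than once)
```

Because sampling is with replacement, the number of distinct agents tested can be smaller than the number of tests administered.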
When agents are initialized they 'decide' to participate in contact tracing (opt in) or not. This is done via a random draw against a user-specified parameter. Contact tracing can be triggered in two basic ways. The first is when a symptomatic individual arrives at the hospital and the second is through testing. If an agent who has opted in tests positive, then contact tracing from that agent will commence.
Each agent collects data on the other agents it comes into contact with. The model assumes that agents co-located on a patch that is . km on a side are likely enough to be in significant contact with each other to warrant being considered in the contact trace. At the beginning of each time step, agents update their health status and then collect contact data in a first-in-first-out queue that is elements long ( days). Agents from the contact list deemed to have been in contact with an infected agent are told to isolate for days.
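The fixed-length first-in-first-out contact queue can be sketched with a bounded deque. The class, attribute, and function names below are hypothetical, the queue length is left as a parameter because the value is not stated here, and the filter on contacts who opted in is an assumption about how participation is applied.

```python
from collections import deque


class Agent:
    def __init__(self, agent_id, opted_in, queue_length):
        self.agent_id = agent_id
        self.opted_in = opted_in
        # A deque with maxlen drops its oldest entry automatically (FIFO).
        self.contacts = deque(maxlen=queue_length)
        self.isolating = False

    def record_contacts(self, agents_on_patch):
        """Store the ids of agents co-located on the patch this time step."""
        self.contacts.append(
            {other.agent_id for other in agents_on_patch if other is not self}
        )


def trace_contacts(index_case, agents_by_id):
    """One-step tracing from a positive agent; returns the ids told to isolate."""
    if not index_case.opted_in:
        return set()
    recent = set().union(*index_case.contacts)
    # ASSUMPTION: only contacts who also opted in are traced and isolated.
    traced = {i for i in recent if agents_by_id[i].opted_in}
    for agent_id in traced:
        agents_by_id[agent_id].isolating = True
    return traced
```

With a bounded queue, contacts older than the queue length are forgotten without any explicit cleanup step.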

Verification and validation
Verification and Validation (V&V) is an important aspect of using simulations for decision-making. The two concepts are often conflated, so we based our verification and validation efforts on the established literature. Sargent ( ) defined verification as "ensuring that the computer program of the computerized model and its implementation are correct." That is, did the modelers build the model correctly? Sargent defined validation as "substantiation that a model within its domain of applicability possesses a satisfactory range of accuracy consistent with the intended application of the model." Stated another way, is the model useful for decision-making in the given domain? In order to determine if the model is built correctly we establish expectations for how the model will behave as it runs. But those expectations are driven by the domain being modeled. In this section we show a selection of results from our V&V experiments.
In the context of the ABM and COVID-19, the original form of the model is an extension of the canonical SEIR model where the assumptions of continuity are relaxed in favor of discrete agents and the concepts of behavioral changes induced by NPIs are incorporated. The final form of the model is a NetLogo representation of that logic. Verification begins with ensuring that the logic was written correctly for the intended results. Beyond reviewing the code for errors, much of the verification process involves running the model under extreme settings to ensure the logic responds appropriately. For example, if no agents are initially set to the Exposed state, then no outbreak will ever occur and all the agents should remain in the Susceptible state for the duration of that run. Additionally, when we run multiple replications with different random number seeds we would expect the average of the time series from each disease state to become smoother with increased replications. Thus the averaged curves would behave similarly to the standard SEIR models that are dominated by the decline in the Susceptible population and the increase in the Recovered and Deceased populations. Each curve from a single replication would be more rugged than one would expect from the canonical SEIR models due to the heterogeneous mixing facilitated by the contact network. When we sum the Mild, Severe, and Critical states to make a single Infected curve, this familiar set of smooth curves is indeed reproduced qualitatively by the ABM as illustrated in Figure . Note that the dark lines in the figure represent the mean of simulation replications and the shaded areas represent one standard deviation above and below that mean.
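The aggregation used for this check can be sketched in a few lines: the Mild, Severe, and Critical series are summed into one Infected curve per replication, and the mean and standard deviation are then taken across replications at each time step. The function names and the toy numbers are hypothetical.

```python
import statistics


def infected_curve(mild, severe, critical):
    """Sum the three contagious states into one Infected time series."""
    return [m + s + c for m, s, c in zip(mild, severe, critical)]


def summarize_replications(replications):
    """Return a (mean, standard deviation) pair for every time step."""
    return [
        (statistics.mean(values), statistics.stdev(values))
        for values in zip(*replications)
    ]


# Two toy replications of an Infected curve over three time steps.
band = summarize_replications([[0, 2, 4], [2, 4, 6]])
print([mean for mean, _ in band])  # [1, 3, 5]
```

The mean gives the dark line and mean ± one standard deviation gives the shaded band described above.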
For the run generating Figure we incorporate the NPIs of social distancing, testing, and tracing during the model run and note that the peak is lower than the baseline no-NPI run and the curve is rougher due to the more heterogeneous nature of social contact, which is what social distancing is designed to induce.

Finally, we compared results from the simulation with the actual case counts reported in Maryland. This is a good exercise for ensuring model results are realistic, but it should be noted that precise statistical matches are not expected. There is uncertainty due to testing, the timing of NPIs, the actual adherence to NPIs, and the relative scale of our , person simulations and the actual populations of a given county. Nevertheless, the results shown in Figure indicate that our model results reasonably represent the counties they are intended to model. As currently implemented the model contains a number of parameters. Most of the parameters are used to define the population and its structure. There are also a number of other parameters associated with the use of NPIs and a number of parameters directly associated with the spread of the disease.
In our validation exercise the simulation demonstrated sensitivities one would expect to find in a model of this type. Disease spread was highly correlated to transmission probability and mitigation probability. Furthermore, testing made a significant difference when it was coupled with a population that complied with isolation orders. No unexpected sensitivities were uncovered, but it is important to note that SARS-CoV-2 is a novel virus and studies are producing new insights on a near daily basis. Parameterizing the model should thus remain an evolving exercise.

Experiment Design
To illustrate the utility of our modeling approach, we chose to model the counties of Maryland, including the independent city Baltimore. The experiment is designed to answer the question: what is the impact of a given percent of the population being tested and a given level of participation in contact tracing if NPIs are lifted days after the onset of the pandemic? This is approximately the time frame that Maryland followed when lifting NPIs in reality.
Maryland includes a mix of rural and urban counties. Baltimore and the suburbs of Washington, D.C. are the most populated areas with over Million people, while Kent County has only around , inhabitants (Census ). Using the U.S. Census ACS data and the algorithm described earlier, we constructed the contact structure graphs to represent the pre-NPI state for initializing each simulation. The parameters used in constructing the graphs are shown in Table . We chose the number of agents for computational efficiency and the probabilities were taken from Meyers et al. ( ). These graphs were ingested into the ABM as the initial contact structure.
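As a rough illustration of how such a graph might be assembled and summarized, the sketch below connects pairs within each household, school, and workplace group, adds public contacts across the whole population, and reports the mean and standard deviation of the degree distribution. The default probabilities are the ones quoted from Meyers et al.; the function names and the way groups are supplied (plain lists of vertex ids) are assumptions, and the steps that derive those groups from ACS tables are not shown.

```python
import itertools
import random
import statistics


def add_edges(edges, members, p, rng):
    """Connect each pair of members independently with probability p."""
    for u, v in itertools.combinations(members, 2):
        if rng.random() < p:
            edges.add((min(u, v), max(u, v)))


def build_contact_network(n, households, schools, workplaces, rng,
                          p_h=1.0, p_s=0.3, p_o=0.03, p_p=0.003):
    """Build the edge set incrementally: household, school, work, public."""
    edges = set()
    for group in households:
        add_edges(edges, group, p_h, rng)
    for group in schools:
        add_edges(edges, group, p_s, rng)
    for group in workplaces:
        add_edges(edges, group, p_o, rng)
    add_edges(edges, range(n), p_p, rng)  # public contacts between any pair
    return edges


def degree_summary(n, edges):
    """Mean and standard deviation of vertex degree."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return statistics.mean(degree), statistics.stdev(degree)
```

The all-pairs loop for public contacts is quadratic in the number of agents, so a faster sampling scheme would likely be needed at the population sizes used in the study.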
The degree distributions of each county graph are summarized by the means and standard deviations listed in Table and the kernel distribution plots shown in Figure .

JASSS, ( ) , http://jasss.soc.surrey.ac.uk/ / / .html Doi: . /jasss.

The full experiment consisted of three sets of simulation runs. The first was a baseline run of replications for each county with no NPIs or interventions. This scenario essentially represents the baseline course of the pandemic if no action were taken to slow the spread of disease. The second scenario instituted multiple NPIs days after the start of each run and then partially lifted those NPIs days after the start of each run. The NPIs were the closing of school and workplace venues, but not public venues. Social distancing was also enforced starting on day at % and then reduced to % after days. When the NPIs were lifted a strategy of testing and contact tracing was instituted. For this particular set of experiments we employed one-step contact tracing. That is, agents who came in direct contact with an infectious agent and were participating in the program were traced, but agents who came in contact with an agent who was traced were not in turn traced. In this scenario only symptomatic people were tested and contact tracing commenced for those participating cases that tested positive. Five levels of testing and five levels of agreeing to isolate after contact tracing were used for a full design of experiment consisting of settings and replications per county. The third scenario uses the same NPIs and timing as the second scenario, but employs a random testing strategy, which includes agents that are in the Susceptible, Exposed, Mild, or Recovered states. The same five levels of contact tracing participation were used, but the percent tested was varied across a set of higher testing levels.
In our exploratory experimentation we found that higher percentages of random testing are required to achieve similar impact because the same quantity of tests uncovers fewer positive cases. This is because the symptomatic agents in the ABM only have COVID-19. Testing symptomatic cases is therefore the same as testing agents with the disease. Table outlines the parameter settings for each of the three scenarios. Note that by setting social distancing to a high percentage but leaving venues open, we simulate the minimal interaction that occurs at essential businesses such as grocery stores.
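A full factorial design of this kind (five testing levels crossed with five opt-in levels, giving 5 × 5 = 25 design points per county) can be enumerated as sketched below. The level values, the county names in the test, the `testRate` key, and the use of the replication index as a seed are all hypothetical placeholders; only `optInRate` echoes a name that appears in this paper.

```python
import itertools

# HYPOTHETICAL levels for illustration only; not the values used in the study.
TEST_LEVELS = [0.01, 0.02, 0.03, 0.04, 0.05]
OPT_IN_LEVELS = [0.1, 0.3, 0.5, 0.7, 0.9]


def full_factorial(test_levels, opt_in_levels, counties, replications):
    """Yield one run configuration per county, design point, and replication."""
    for county, test, opt_in in itertools.product(
        counties, test_levels, opt_in_levels
    ):
        for rep in range(replications):
            yield {
                "county": county,
                "testRate": test,
                "optInRate": opt_in,
                "seed": rep,  # a distinct random seed for each replication
            }
```

Generating the run list up front makes it straightforward to distribute replications across machines and to confirm that every design point is covered.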

Results
In this section we review and compare the results of the different scenarios. It is important to note that these results do not represent a forecast of case counts or death rates. The purpose of this modeling approach is to provide insight into the relative impact that different NPI, testing, and tracing strategies will have when applied to areas of a particular state that have considerably different social contact structures.
Table shows the impact of population density and social contact structure. Each box-whisker plot represents the distribution of the maximum number of infections across the replications of the baseline scenario for a given county. Recall that the disease parameters are the same for each county and the populations are normalized to $10^4$. The only two variables that change from county to county are the social contact structure driven by the U.S. Census data, and the population density controlled by the size of the environment the agents have to move around in. The key insight is that any strategy the state of Maryland adopts will need to treat of the counties differently. It is interesting to note that Harford County has roughly % fewer people than Montgomery County and Montgomery County is roughly % less dense than Baltimore City. Yet all three of these locations have an average maximum infected that is approximately four times larger than of the counties in the rest of the state.

Next we analyzed the impact of a -day NPI strategy followed by a regime of symptomatic testing and contact tracing. From a broad perspective we can see in Figure that the NPIs and testing and tracing combine to reduce the total cumulative cases considerably. Here we can see that if each county has the ability to test % of its population daily then the overall number of cases can be drastically reduced. Also note that, for this level of testing and contact tracing, the dynamics in Baltimore City separate from the other counties, most likely due to its extremely high density. There is still a difference from one county to the next due to the differences in density and contact structure. We selected three of the counties to look at in greater detail that are notionally representative of high, medium, and low density locations.

Figure : Cumulative cases for the baseline and symptomatic testing scenario ( . test, . optInRate)
Baltimore City has the highest population density out of the locations. Harford is seventh out of in terms of density and Worcester is near the bottom. Figure and Tables , , and show the mean peak infections for each of the design points of the symptomatic testing scenario, along with the baseline mean peak infections for each of these counties. The combined impact of the NPIs, testing, and tracing is evident, but the variation in the impact is greatly affected by the density and social contact structure of the county. It is less obvious from these plots whether different levels of testing or contact tracing have a significant effect on the mean peak infections. Indeed, using a two-sample Kolmogorov-Smirnov test with α = 0.05 we can only reject the null hypotheses that the mean peak infections are the same when testing the baseline against the COVID-19 testing level of . and comparing that level of testing with any other level of testing for Baltimore and Harford Counties. In Worcester County all levels of testing are distinguishable from the baseline, but not from each other. Interestingly, the story changes when we analyze deaths in these three counties within the simulated days, as seen in Figure . The differences between certain levels of testing become more pronounced. Again, applying a two-sample Kolmogorov-Smirnov test to the different pair-wise combinations of testing levels we can now differentiate all pairs except the highest two levels in Baltimore and Harford County (α = 0.05). The levels remain indistinguishable in Worcester County. These results are important for decision-makers trying to allocate scarce resources, such as testing and contact tracing, across multiple regions of interest because the same results can be obtained in some places with fewer resources than in others.
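For readers unfamiliar with the test, a two-sample Kolmogorov-Smirnov comparison of per-replication outcomes (for example, peak infections at two testing levels) can be sketched as follows. This is a standard-library illustration that uses the large-sample critical value rather than an exact p-value; it is not the analysis code used in the study, and a library routine such as `scipy.stats.ks_2samp` would normally be preferable.

```python
import bisect
import math


def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_reject(sample_a, sample_b, alpha=0.05):
    """Reject equality of distributions using the large-sample critical value."""
    n, m = len(sample_a), len(sample_b)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))  # about 1.358 for alpha = 0.05
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(sample_a, sample_b) > critical
```

Note that the test compares whole distributions of outcomes across replications, so two testing levels with similar means but different spreads can still be distinguished.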

If we focus on any one county, we can analyze the interaction effects of different levels of testing and contact tracing, as well as whether symptomatic and random testing produce different results. Recall that five of our design points between symptomatic testing and random testing overlap. Specifically, for all five levels of optInRate, the . % testing level is included in both experiments. Holding the optInRate constant at . , we can see in Figure that more than . % random testing is required to achieve the same results as . % symptomatic testing. This result may seem misleading at first because healthcare professionals agree that more testing and random testing of asymptomatic people is highly recommended. It is important to note that all symptomatic agents in the model are indeed infected with COVID-19. So . % testing of symptomatic individuals is testing a large percentage of the infectious population. Conversely, the random testing regime is forced to distribute the tests across a mix of infectious and non-infectious people. Since the non-infectious population is larger when NPIs are employed, the diluted number of true-positive tests makes the random regime appear less effective at higher levels of testing. In actuality, this reinforces the message of increased random testing. We know that many COVID-19 cases often exhibit few symptoms even though the individuals are infectious. These individuals are less likely to submit for testing because they might not even know they are sick. Increased levels of random testing provide greater opportunity of finding and isolating those cases, as illustrated by the model results, but that also means a greater level of testing is required to actually find those who are infected. It is also important to note that random testing is required to estimate the prevalence of the disease.
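The dilution effect can be made concrete with a small expected-value calculation. The pool sizes and case counts below are hypothetical numbers chosen only to illustrate the arithmetic; the function name is likewise an assumption.

```python
def expected_positives(n_tests, n_infectious_in_pool, pool_size,
                       false_negative_rate=0.0):
    """Expected true positives when n_tests are drawn at random from a pool."""
    prevalence = n_infectious_in_pool / pool_size
    return n_tests * prevalence * (1.0 - false_negative_rate)


# HYPOTHETICAL example: 200 symptomatic agents, all infected (as in the
# model), within a population of 10,000 agents.
symptomatic = expected_positives(100, 200, 200)      # tests go to symptomatic agents
random_draw = expected_positives(100, 200, 10_000)   # tests spread over everyone
print(symptomatic, random_draw)
```

In this toy case the same 100 tests would be expected to find 100 cases under symptomatic testing but only about 2 under random testing, which is why far higher random testing levels are needed for a comparable effect, even though only random testing can reach infectious agents who show no symptoms.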

Discussion
In the present work, we illustrated a modeling approach for assessing alternate strategies of implementing and subsequently lifting non-pharmaceutical interventions in response to the COVID-19 pandemic. We underscored the previously known result that social contact structure is a key factor in the size of an outbreak or pandemic and we illustrated how estimates of social contact structure combined with an agent-based model can be used to provide insight to decision-makers in the face of uncertainty. We summarized our findings in a limited design of experiments that focused on the counties and county-equivalents of the state of Maryland. We showed that the different counties of Maryland fall into at least two distinct categories in terms of risk of large outbreaks and illustrated how different levels of testing can be employed to the same effect if the social contact structure is taken into account. It should be noted that the simulation is designed to enable the exploration of imposing and relaxing NPI strategies in a dynamic setting. For clarity of exposition, this initial effort focused on the imposition of NPIs to fully suppress the spread of the disease. A possible future effort could use the same model and input parameters to explore the optimal duration of a given NPI strategy or any combination of NPIs and duration.
We believe our work contributes to the ongoing struggle to contain the pandemic in the following ways. First, it is designed to be used by local governments with limited resources. This is critical for countries that delegated the decision-making to local provinces or states. Second, it is designed to use readily available data for any region that conducts a census. Third, it is designed to be easy to use to explore the use of NPIs and inform decision-makers about how pandemic dynamics may change as a function of the timing of NPIs and the compliance of the population.
It is important to note that no model or suite of models is a panacea. Ultimately decision-makers are forced to make a choice under uncertainty to protect both the health and the economic well-being of the population. The approach outlined here is designed to provide insight into the marginal impact of one NPI and testing strategy versus another. This approach should be utilized by the decision-makers in conjunction with empirical analysis of the current state of their county or region of interest. The model parameters or logic should be constantly updated and the models re-run with new information as it becomes available. That is, this modeling approach is designed for use in real time alongside decision-makers at the time decisions are being formulated and implemented. To that end, the analysis presented here should be taken as notional rather than indicative of what might or might not happen in Maryland over the coming months.
During the writing of this report, unforeseen events extraneous to COVID-19 led to social unrest, protests, and riots in many major cities across the United States. Most of these locations were still operating under some level of restrictions to control the pandemic. Clearly, protests and riots bring people into close proximity and may ultimately prove to be super-spreading events. This sort of unpredictable event is not included in our model nor have we seen them included in the models we reviewed. This serves to underscore the difficulty and challenges of forecasting the progression of complex systems. Models and the insights they provide can help, but they are ultimately limited by assumptions. Decision-makers therefore require a combination of reliable data from their region of interest, rigorously designed models that make as few simplifying assumptions as possible, and ultimately the fortitude to make a decision in the face of uncertainty.