Evaluation of Nowcasting for Real-Time COVID-19 Tracking — New York City, March–May 2020

To account for delays between specimen collection and report, the New York City Department of Health and Mental Hygiene used a time-correlated Bayesian nowcasting approach to support real-time COVID-19 situational awareness. We retrospectively evaluated nowcasting performance for case counts among residents diagnosed during March–May 2020, a period when the median reporting delay was 2 days. Nowcasts with a 2-week moving window and a negative binomial distribution had lower mean absolute error, lower relative root mean square error, and higher 95% prediction interval coverage than nowcasts conducted with a 3-week moving window or with a Poisson distribution. Nowcasts conducted toward the end of the week outperformed nowcasts performed earlier in the week, given fewer patients diagnosed on weekends and lack of day-of-week adjustments. When estimating case counts for weekdays only, metrics were similar across days the nowcasts were conducted, with Mondays having the lowest mean absolute error, of 183 cases in the context of an average daily weekday case count of 2,914. Nowcasting ensured that recent decreases in observed case counts were not overinterpreted as true declines and supported health department leadership in anticipating the magnitude and timing of hospitalizations and deaths and allocating resources geographically.


Introduction
Timeliness is a key attribute of surveillance systems for reportable infectious diseases (1,2). Timely COVID-19 surveillance data are used by governments and communities to allocate resources and to decide when to tighten or loosen physical distancing and other prevention measures (3,4). However, public health authorities track reportable diseases at a lag, given delays from infection to symptom onset, care seeking, specimen collection, laboratory testing, and reporting (5). Monitoring pre-diagnostic data sources (e.g., emergency department syndromic surveillance (6), internet searches and social media (7), participatory surveillance of self-reported symptoms (8), smart thermometers (9), etc.) can improve timeliness at the expense of specificity, such as an inability to distinguish increases in respiratory illness attributable to influenza from COVID-19. Another approach that preserves specificity when monitoring COVID-19 disease trends is to leverage partially reported disease data, formally accounting for data lags.
The terms "nowcasting," or predicting the present, and "hindcasting," or predicting through the day prior to the present, describe a wide range of statistical adjustments used to fill in cases that are not-yet-reported, offering health officials a more up-to-date picture for situational awareness (10). For example, researchers have assessed the potential to nowcast COVID-19 cases and deaths using Google Trends data available in near-real time (11), and have applied a range of modeling approaches that leverage reporting delays to estimate the number of not-yetreported cases and deaths (12,13). Using mathematical models to exploit COVID-19 transmission dynamics, nowcasting also has been extended to COVID-19 forecasting systems (14,15). In a majority of these approaches, the nowcasting mechanism relies on accurately estimating the distribution of reporting delays; however, infectious disease transmission contains for use under a CC0 license. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. . an important temporal component, in that incidence is correlated from one time point to the next, which has also been shown to be important in nowcasting (10). We describe the use and evaluation of a time-correlated Bayesian nowcasting approach at the New York City Department of Health and Mental Hygiene (NYC DOHMH) during the first epidemic wave of COVID-19 to support real-time situational awareness and resource allocation.

Reportable disease surveillance data
Persons tested. Clinical and commercial laboratories are required to report all results (including positive, negative, and indeterminate results) for SARS-CoV-2 tests for New York State residents to the New York State Electronic Clinical Laboratory Reporting System (ECLRS) (16,17). For NYC residents, ECLRS transmits reports to NYC DOHMH. These laboratory reports include specimen collection date and patient demographic information, including residential address.
For nowcasting persons newly tested, NYC DOHMH deduplicated laboratory reports, retaining the first report received ("report date") in ECLRS per person of a SARS-CoV-2 polymerase chain reaction (PCR) test. We retained the first specimen collection date for that associated test report date and the patient's ZIP code of residence at time of report.
ZIP codes are collections of points constituting a mail delivery route. The United States Census Bureau developed ZIP code tabulation areas, which are aggregates of census blocks, to provide an areal representation of ZIP codes. NYC DOHMH created a custom geography referred to as modified ZCTA (modZCTA) by merging ZCTAs with populations <3000 to an for use under a CC0 license. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.  (22), nowcasts were conducted for persons newly tested by PCR for SARS-CoV-2. Each outcome was nowcasted citywide and also stratified by modZCTA of patient residence, to support targeting of community-based resources. Hospitalized cases were also nowcasted stratifying by health care facility, to support allocating resources to hospitals.
To account for reporting delays and the shape of the outcome-specific epidemic curve, we applied the R package Nowcasting by Bayesian Smoothing (NobBS) (10,23) to data for specimens collected or diagnoses during the 3 weeks prior to the nowcast through the date prior to the nowcast, assuming a Poisson distribution. Briefly, this approach corrects for underestimation of cases in real-time caused by delays in reporting, learning the historical distribution of delays and relationship between cases in sequential time points to estimate the number of cases not-yet-reported. The 3-week moving window was selected to balance recency with stability. In performing stratified nowcasts, NobBS estimated the delay distribution citywide and the epidemic curve uniquely by stratum. Reports visualizing nowcast results were distributed weekly to DOHMH leadership for situational awareness. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Retrospective nowcasting evaluation
The copyright holder for this preprint this version posted October 20, 2020. . https://doi.org/10.1101/2020.10.18.20209189 doi: medRxiv preprint diagnosed less frequently on weekends, when health care provider offices were typically closed or had more limited hours, [2] window length, given time-varying SARS-CoV-2 testing availability and uptake in NYC, and [3] assumed underlying distribution (Poisson or negative binomial) for case occurrence. In addition, for nowcasting the number of cases stratified by modZCTA, we compared results using [1] the "strata" option in NobBS -which estimated the delay distribution citywide and epidemic curve separately for each modZCTA -vs. estimating both the delay distribution and epidemic curve separately for each modZCTA, and [2] 10,000 vs.
Data for the evaluation were frozen as of June 30, 2020, capturing reports received through 1-month after the end of the assessment period. We mimicked prospective surveillance at weekly intervals and daily temporal resolution, retaining the number of estimated cases for each of the prior 7 days (i.e. 1-7 day hindcasts). We used the mean absolute error and the average daily relative root mean square error across all days evaluated to compare the point estimate of the number of daily hindcasted cases over the time series to the true number of cases reported.
For each of these metrics, lower numbers indicate better performance of the hindcast. We also assessed the 95% prediction interval coverage, i.e., the proportion of days during the study period when the 95% prediction interval included the true number of cases (10), which should ideally be 95%.
This work was reviewed and deemed public health surveillance that is non-research by the DOHMH Institutional Review Board.

Results
for use under a CC0 license.
This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
Hindcasts were performed weekly on Mondays in real-time, with results visualized for DOHMH leadership (e.g., Figure 1). However, the retrospective performance evaluation determined that real-time hindcasts on Mondays using a 3-week window and an assumed Poisson distribution more often overestimated than underestimated the number of not-yet-reported cases and resulted in overly narrow 95% prediction intervals ( Figure 2 and and Web Figure 1).
We found that citywide hindcasts with a 2-week moving window and a negative binomial distribution had a 44% lower mean absolute error, a 31% lower relative root mean square error, and 0.65 higher 95% prediction interval coverage than hindcasts conducted with a 3-week moving window or with a Poisson distribution ( Hindcasts conducted towards the end of the week (Thursday-Saturday) performed better than hindcasts performed earlier in the week, presumably as they had the furthest distance from the weekends. Weekends had lower overall case counts than weekdays ( Figure 1). Until mid-May, hindcasts more often overestimated than underestimated true cases counts, whereas at the end of May hindcasts more often underestimated case counts, reflecting changes in the delay distribution over time (Figure 1, Web Figure 3).
To minimize day-of-week effects that were most prominent on weekends, we also restricted performance analysis to hindcasts of cases on weekdays only, which resulted in better for use under a CC0 license.
This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. . https://doi.org/10.1101/2020.10.18.20209189 doi: medRxiv preprint metrics, as expected (Table 1 and Web Table 1). The hindcasts restricted to estimating case counts for weekdays with a 2-week moving window and negative biniomial distribution also performed better than the hindcasts with a 3-week moving window and Poisson distribution, with 54% lower mean absolute error, 46% lower relative root mean square error and 0.69 higher 95% prediction interval coverage (Table 1, Web Table 1). Performance metrics were similar across days the hindcasts were conducted, with Mondays having the lowest mean average error and relative root mean square error, as expected given the 2 additional days between the last day reported (Friday) and the day the hindcast was conducted (Monday). On weekdays during the study period, the average daily case count after data lags resolved was 2,914, the average hindcasted case count with a 2-week window and negative binomial distribution conducted on Mondays was 2,878, and the mean absolute error was 183.
For hindcasts at the modZCTA-level, a 2-week moving window and negative binomial distribution performed best across all metrics evaluated (Table 2 and Web Table 1), although the prediction interval coverage for the nowcasts with a Poisson distribution was higher than for citywide hindcasts. The hindcasts that assumed a citywide delay distribution performed slightly better than hindcasts that assumed different distributions by modZCTA. Metrics for 3,000 vs.
10,000 adapations were essentially the same.

Discussion
NYC DOHMH improved situational awareness of COVID-19 testing and cases during the first epidemic wave in near-real time by applying NobBS, a readily accessible nowcasting and hindcasting method. As a result of the retrospective performance evaluation, to improve nowcast accuracy prospectively effective August 2020, we implemented the following changes for use under a CC0 license.
This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. . https://doi.org/10.1101/2020. 10.18.20209189 doi: medRxiv preprint to the nowcasting approach: (i) used a negative binomial case distribution instead of a Poisson, (ii) linked the determination of the moving-window length (2 or 3 weeks) to the 90 th percentile of the lag between specimen collection and report for reports received in the most recent week, choosing 3 weeks if the 90 th percentile of the lag distribution is >14 days, and (iii) suppressed nowcasting results for specimens collected on weekends, given lack of adjustment for day-ofweek effects. The evaluation supported the results of nowcasting conducted on any weekday. A previous evaluation of influenza nowcasts also found a negative binomial distribution more appropriately captured uncertainty in case counts than a Poisson distribution (10).
Despite a mature electronic laboratory reporting system and strong informatics infrastructure and data cleaning procedures at NYC DOHMH, input data available for nowcasting had several limitations. First, for records with long lags between specimen collection and report, as long as the specimen was reported to have been collected during the pandemic period, it was not possible to distinguish long lags attributable to true delays in testing or reporting -and thus informative to the delay distribution -from long lags attributable to laboratory data entry errors in specimen collection dates. Second, nowcasting by patient modZCTA of residence relied on accurate laboratory reporting of patient address. For example, one week of real-time nowcasting results were biased when, for a batch of reports, one commercial laboratory misreported its own address as the residential address of all patients tested. Third, a large proportion of records had missing onset date. NobBS is designed for use with complete linelists with no missing onset or report dates. Given the complexities of imputing onset date from diagnosis date, nowcasts were instead conducted by specimen collection or diagnosis date. Fourth, patient hospitalization status was largely ascertained by matching administrative records. To allow time for record matching, hospitalization nowcasts were for use under a CC0 license.
This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. . https://doi.org/10.1101/2020.10.18.20209189 doi: medRxiv preprint conducted at a 3-day lag, limiting the real time availability of results. Furthermore, records from certain facilities were unavailable in near-real time, so nowcasts of hospitalizations by patient residence and by facility were subject to spatial bias, although still considered by DOHMH leadership to be useful for situational awareness.
This version of NobBS (0.1.0) also had several limitations when applied for nowcasting COVID-19 in NYC. First, citywide nowcasting was based on a linelist with only 2 data elements: specimen collection or diagnosis date and report date. Stratified nowcasts included a third data element: modZCTA or health care facility. There was no built-in functionality in NobBS to account for additional observable factors influencing data lags, including day-of-week and holiday effects in outpatient testing, and time-varying testing backlogs at specific laboratories differentially processing specimens for residents across neighborhoods. Similarly, there was no functionality to account for temporal trends in testing, e.g., the time-varying ratio of number of tests performed to number of cases detected. Increased testing uptake strongly influenced the shape of the observed epidemic curve during the study period as testing criteria at public health laboratories were relaxed, commercial and hospital labs developed testing capacity, and additional testing sites were opened and promoted. Third, while 95% prediction intervals reflected uncertainty in the nowcasts themselves -encompassing uncertainty in the estimation of the delay distribution as well as in the time evolution of the epidemic curve -they did not reflect uncertainty introduced by the user-specified window length. In other words, the moving window length had the potential to change nowcast estimates considerably. Analysts found it challenging to select the optimal window length in real time, and, given competing priorities during a pandemic, busy DOHMH officials would not have had adequate time to consider multiple nowcast versions with different window lengths as sensitivity analyses. The for use under a CC0 license. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. . retrospective analysis, which covered a period when the 90 th percentile of delay between specimen collection and report was 7 days, found that a 2-week nowcasting window resulted in higher performance than a 3-week window. This finding might not be generalizable to periods with more extensive reporting delays (24). Fourth, in generating geographically stratified nowcasts, the "strata" option in NobBS estimated the delay distribution citywide and epidemic curve separately for each modZCTA or health care facility stratum. For a highly transmissible infectious disease, nowcasting performance might be improved by considering spatial relationships across geographic strata, including spatial autocorrelation. Fifth, early in the pandemic when case counts were sparse, cumulative case counts with 95% prediction intervals were of interest to DOHMH leadership, but this functionality was not built into NobBS and required a separate solution, though is now in a development version of the package available at https://github.com/sarahhbellum/NobBS. Finally, although government officials have demonstrated interest in publicizing test percent positivity by report date (25, 26), which can be biased by data lags, NobBS did not have functionality to nowcast percentages as an outcome.
NobBS could be used to separately nowcast persons testing positive and negative and then to calculate test percent positivity, but there is no functionality to appropriately account for the separate uncertainties in the numerator and denominator of this percentage.

Conclusion
When tracking ongoing outbreaks using epidemic curves, public health officials recognize that data for recent days are incomplete because of reporting delays. Data lags can make it difficult for policymakers to discern in near real-time whether apparent decreases in for use under a CC0 license. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. . This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. . This article's contents are solely the responsibility of the authors and do not necessarily represent the official views of the Centers for Disease Control and Prevention, the National Institutes of Health, or the Department of Health and Human Services.
Conflict of interest: none declared.
for use under a CC0 license. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. .
(including "rapid" tests) and ensuring complete reporting of patient demographics for use under a CC0 license. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. . This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

New York State Governor's Press
The copyright holder for this preprint this version posted October 20, 2020. . This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020.  This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. . This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. . This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. . https://doi.org/10.1101/2020.10.18.20209189 doi: medRxiv preprint on for use under a CC0 license. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. . https://doi.org/10.1101/2020.10.18.20209189 doi: medRxiv preprint on for use under a CC0 license. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint this version posted October 20, 2020. .