Warning: This analysis contains the results of a predictive model. It relies on a number of assumptions, some of which are speculative. Furthermore, this analysis was not prepared or reviewed by an epidemiologist. The assumptions and methods presented should therefore be scrutinized carefully before drawing any conclusions.

Summary for the United States on 2020-05-06:

Reported Case Count: 1,195,376

Predicted Case Count: 1,426,405

Percentage Underreporting in Case Count: 16.2%
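
As a quick check on the summary figures, the sketch below reproduces the under-reporting percentage from the two counts above. It assumes the percentage is measured relative to the predicted count (not the reported count); the variable names are illustrative only.

```python
# Summary figures for the United States on 2020-05-06 (from the report above).
reported_cases = 1_195_376
predicted_cases = 1_426_405

# Assumption: under-reporting is the share of predicted cases that were not reported.
underreporting = (predicted_cases - reported_cases) / predicted_cases

print(f"{underreporting:.1%}")  # 16.2%
```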

COVID-19 Case Estimates, by State

Definition Of Fields:

  • Reported Cases: The number of cases reported by each state, which is a function of how many tests are positive.
  • Est Cases: The predicted number of cases, accounting for the fact that not everyone is tested.
  • Est Range: The 95% confidence interval of the predicted number of cases.
  • Ratio: Estimated Cases divided by Reported Cases.
  • Tests per Million: The number of tests administered per one million people. Generally, the fewer tests administered per capita, the larger the difference between the reported and estimated number of cases.
  • Cases per Million: The number of reported cases per one million people.
  • Positive Test Rate: The reported percentage of positive tests.
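
The derived columns can be sketched from four raw quantities per state. This is a minimal illustration of the definitions above, not the code behind the table; the function name, its parameters, and the example inputs are hypothetical.

```python
def derived_fields(reported, estimated, tests, population):
    """Compute the derived table columns for one state (illustrative helper)."""
    return {
        "ratio": estimated / reported,                 # Est Cases / Reported Cases
        "tests_per_million": tests / population * 1e6,
        "cases_per_million": reported / population * 1e6,
        "positive_test_rate": reported / tests,        # share of tests that were positive
    }

# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
row = derived_fields(reported=10_000, estimated=12_000, tests=50_000, population=1_000_000)
print(row)

# Consistency check against the NY row of the table: cases per million divided
# by tests per million should match the reported positive test rate (31%).
ny_positive_rate = 16510.7 / 52890.0
print(f"{ny_positive_rate:.0%}")
```

Because both per-million columns share the same population, their quotient gives the positive test rate without needing the population itself.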
State  Reported Cases  Est Cases  Est Range  Ratio  Tests per Million  Cases per Million  Positive Test Rate
NY 321192 374234 (342466, 444162) 1.2 52890.0 16510.7 31%
NJ 130593 154689 (140563, 183970) 1.2 32382.0 14702.8 45%
MA 70271 82005 (74573, 99028) 1.2 48364.0 10195.3 21%
IL 65962 78460 (70918, 94522) 1.2 27327.2 5205.4 19%
CA 56212 68518 (60857, 86185) 1.2 19738.2 1422.6 7%
PA 50957 62013 (55230, 75561) 1.2 19597.1 3980.4 20%
MI 44397 53681 (48087, 66679) 1.2 22312.9 4445.5 20%
FL 37439 45387 (40684, 55012) 1.2 21682.5 1743.2 8%
TX 33369 41048 (35740, 51554) 1.2 14733.5 1150.8 8%
CT 30621 36495 (32958, 43636) 1.2 30472.4 8588.6 28%
GA 29711 36040 (32032, 43924) 1.2 18920.1 2798.3 15%
LA 29996 35058 (31889, 41555) 1.2 40490.3 6452.4 16%
MD 27117 32545 (29152, 39449) 1.2 23174.1 4485.4 19%
OH 20969 25989 (22942, 33519) 1.2 13750.8 1793.9 13%
IN 21033 25685 (22770, 31680) 1.2 17205.9 3124.2 18%
VA 20256 25090 (22178, 31513) 1.2 13321.7 2373.1 18%
CO 16907 20898 (18306, 26024) 1.2 14760.9 2935.9 20%
WA 15462 18391 (16576, 22465) 1.2 28407.5 2030.5 7%
TN 13690 16245 (14721, 19577) 1.2 32048.1 2004.6 6%
NC 12256 15219 (13413, 19288) 1.2 14473.6 1168.6 8%
IA 10111 12362 (11028, 14915) 1.2 19197.4 3204.7 17%
AZ 9305 11651 (10270, 14663) 1.3 12125.8 1278.4 11%
RI 9933 11387 (10498, 13222) 1.1 72152.0 9376.4 13%
MO 8916 11017 (9793, 13503) 1.2 15489.6 1452.7 9%
WI 8566 10527 (9262, 13269) 1.2 15892.5 1471.2 9%
AL 8285 10009 (8930, 11972) 1.2 21774.8 1689.7 8%
MS 8207 9790 (8863, 11770) 1.2 26983.9 2757.6 10%
MN 7851 9723 (8552, 12229) 1.2 15605.5 1392.1 9%
SC 6757 8422 (7393, 10689) 1.2 13162.7 1312.4 10%
NE 6083 7397 (6599, 9239) 1.2 17891.3 3144.6 18%
NV 5594 6842 (6113, 8295) 1.2 15546.0 1816.1 12%
KS 5458 6753 (6012, 8234) 1.2 13761.6 1873.5 14%
KY 5245 6524 (5768, 7940) 1.2 13440.3 1174.0 9%
DE 5371 6446 (5795, 7699) 1.2 25345.0 5515.7 22%
UT 5449 6426 (5776, 7664) 1.2 39524.8 1699.6 4%
DC 5322 6275 (5690, 7522) 1.2 34472.6 7540.9 22%
OK 4127 4985 (4459, 6109) 1.2 20070.1 1043.0 5%
NM 4031 4747 (4324, 5758) 1.2 38973.1 1922.4 5%
AR 3496 4264 (3784, 5320) 1.2 18104.2 1158.5 6%
OR 2839 3494 (3104, 4384) 1.2 15511.6 673.1 4%
SD 2721 3280 (2952, 3907) 1.2 21502.1 3075.8 14%
NH 2588 3150 (2809, 3816) 1.2 19761.6 1903.3 10%
PR 1924 2628 (2205, 3554) 1.4 3518.5 602.4 17%
ID 2106 2573 (2293, 3231) 1.2 16959.7 1178.5 7%
ME 1226 1523 (1341, 1928) 1.2 15452.9 912.1 6%
ND 1266 1475 (1349, 1757) 1.2 47792.7 1661.3 3%
WV 1238 1475 (1326, 1795) 1.2 30590.7 690.8 2%
VT 907 1085 (971, 1289) 1.2 28074.2 1453.6 5%
HI 621 751 (670, 927) 1.2 24012.8 438.6 2%
WY 596 727 (644, 892) 1.2 18859.3 1029.8 5%
MT 456 566 (502, 711) 1.2 14289.2 426.7 3%
AK 371 441 (399, 530) 1.2 31019.3 507.1 2%

Appendix: Model Diagnostics

Derived relationship between Test Capacity and Case Under-reporting

Plotted is the estimated relationship between test capacity (in people per test; larger values mean less testing) and the likelihood that a COVID-19 case is reported (lower values mean more under-reporting of cases).

The lines represent the posterior samples from our MCMC run (note the x-axis is plotted on a log scale). The rug plot shows the current test capacity for each state (black '|') and the capacity one week ago (cyan '+'). For comparison, South Korea's testing capacity is currently at the very left of the graph (200 people per test).
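
For readers who want to place a state on the plot's x-axis, the sketch below converts the table's "Tests per Million" column into people per test. It assumes that "people per test" means population divided by cumulative tests, which the text does not state explicitly; the helper name is hypothetical.

```python
def people_per_test(tests_per_million):
    """Convert tests per one million people into people per test."""
    if tests_per_million <= 0:
        raise ValueError("tests_per_million must be positive")
    return 1_000_000 / tests_per_million

# "Tests per Million" values taken from the table above.
for state, tests_per_million in [("NY", 52890.0), ("TX", 14733.5), ("PR", 3518.5)]:
    print(state, round(people_per_test(tests_per_million), 1))
```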

About this Analysis

This analysis was done by Joseph Richards.

This project [1] uses the testing rates per state from https://covidtracking.com/, which reports case counts and mortality by state. These data are used to estimate the number of unreported (untested) COVID-19 cases in each U.S. state.

The analysis makes a few assumptions:

  1. The probability that a case is reported by a state is a function of the number of tests run per person in that state. Hence the degree of under-reported cases is a function of tests run per capita.
  2. The underlying mortality rate is the same across every state.
  3. Patients take time to succumb to COVID-19, so the mortality counts today reflect the case counts 7 days ago. That is, mortality rate = (cumulative deaths today) / (cumulative cases 7 days ago).
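
The lagged mortality rate in assumption 3 can be sketched as follows. The function name and the daily series are hypothetical and exist only to show the indexing; both lists are assumed to be cumulative counts aligned day by day, with the last element being today.

```python
def lagged_mortality_rate(cum_deaths, cum_cases, lag_days=7):
    """Cumulative deaths today divided by cumulative cases `lag_days` ago."""
    if len(cum_deaths) != len(cum_cases):
        raise ValueError("series must cover the same days")
    if len(cum_cases) <= lag_days:
        raise ValueError("need more than lag_days of history")
    return cum_deaths[-1] / cum_cases[-1 - lag_days]

# Hypothetical cumulative counts for ten consecutive days (not real data).
cum_cases = [100, 120, 150, 190, 240, 300, 370, 450, 540, 640]
cum_deaths = [1, 1, 2, 3, 4, 5, 7, 9, 12, 15]

# Deaths today (15) over cases seven days earlier (150).
rate = lagged_mortality_rate(cum_deaths, cum_cases)
print(f"{rate:.1%}")  # 10.0%
```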

The model attempts to find the most likely relationship between state-wise test volume (per capita) and under-reporting, such that the true underlying mortality rates of the individual states are as similar as possible. The model simultaneously estimates the posterior distribution of mortality rates, the most likely true case count per state, and the relationship between test volume and case under-reporting.
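
The intuition behind this objective can be shown with a toy calculation. This is not the MCMC model itself: the two states, their counts, and the candidate reporting probabilities are all hypothetical, and the spread of mortality rates stands in for what the real model expresses through a likelihood.

```python
from statistics import pstdev

def adjusted_mortality(deaths, reported_cases, p_report):
    """Mortality rate after scaling reported cases up to estimated true cases."""
    estimated_cases = reported_cases / p_report
    return deaths / estimated_cases

# Hypothetical states: (deaths, lagged reported cases, candidate reporting probability).
states = {
    "A": (40, 1_000, 1.0),  # heavy testing: every case assumed reported
    "B": (40, 500, 0.5),    # light testing: half of cases assumed reported
}

naive = [deaths / cases for deaths, cases, _ in states.values()]
adjusted = [adjusted_mortality(*values) for values in states.values()]

# A good candidate relationship makes the adjusted rates agree across states.
naive_spread = pstdev(naive)
adjusted_spread = pstdev(adjusted)
print(naive_spread, adjusted_spread)
```

With these made-up numbers the naive rates differ (4% versus 8%), while the adjusted rates coincide at 4%, which is the kind of agreement the model looks for when fitting the test volume versus under-reporting relationship.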


[1] Full details about the model are available at: https://github.com/jwrichar/COVID19-mortality