Anyone who follows the Florida Department of Health data APIs as obsessively as we do will tell you something funny has been going on at DOH this week.
Yesterday, on August 6, the data came in late again, and with 66,000 more cases than what DOH stated in their morning press release. Where these 66,000 cases came from, and more importantly where they went, remains a mystery.
When questioned by members of the media about why the cases shot up and then disappeared, DOH Director of Communications Alberto Moscoso went on the offensive, blaming ESRI, the developer that hosts the software, for the cases that apparated into the data and for the delay in updating the data today.
“This issue stems from the vendor that the Department has partnered with to create and maintain the Dashboard,” his office said in an email. “The Department has engaged with the vendor to resolve this issue as quickly as possible.”
That’s strange, because I asked the vendor, which Moscoso later identified as Environmental Systems Research Institute (ESRI), about it, and they didn’t receive any communication about this problem today from DOH.
When asked how a file uploaded to their system could just create 66,000 unique new positive persons, one ESRI representative cleared up the matter, adding, “We don’t touch that data at all.”
That’s true. The process of updating the data is as simple as transferring it from one system to another via Python. If you have to, you can just manually overwrite the data by uploading a CSV file directly into the online interface.
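To give a sense of how small that kind of transfer can be, here is a minimal sketch. It is not DOH's actual script: the function name `transfer_cases`, the `expected_total` check, and the column names are all assumptions made up for illustration, and in-memory text streams stand in for the two systems.

```python
import csv
import io


def transfer_cases(source_file, dest_file, expected_total=None):
    """Copy case rows from one CSV stream to another and return the row count.

    Hypothetical helper: if expected_total is given and the number of rows
    read does not match it, nothing is written and a ValueError is raised.
    """
    reader = csv.DictReader(source_file)
    rows = list(reader)

    # Refuse to publish if the count disagrees with the announced total.
    if expected_total is not None and len(rows) != expected_total:
        raise ValueError(f"expected {expected_total} rows, got {len(rows)}")

    writer = csv.DictWriter(dest_file, fieldnames=reader.fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


# In-memory stand-ins for the source export and the published file.
source = io.StringIO("case_id,county,age\n1,Leon,34\n2,Dade,9\n")
dest = io.StringIO()
copied = transfer_cases(source, dest, expected_total=2)
```

A count check like this, run before anything is overwritten, is one simple way a mismatch between the file and the announced total could be caught early.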
Take this as an example.
On Saturday, April 11, Epidemiology decided to switch the language for querying Merlin (the case database) to SQL and didn’t tell me. So when I ran the updates the next day, April 12, there were loads of issues and errors in the data. I managed to get it pushed out for the morning update, but was running behind on the evening update. Back then we updated twice daily – 11 AM ET and 6:30 PM ET.
Around 7 that night, while I was cramming to fix the code because I was already running about a half-hour behind, I got a call telling me that my parents’ house had been destroyed in an EF-4 tornado and that my mom was missing. Her Jeep was still in the driveway, according to a neighbor, and the house was gone.
It wasn’t until the next morning that we learned she was OK.
The updates were still less than an hour behind that night.
A change in programming language and my mom missing with our house gone, and I still had the data up less than an hour behind. By myself.
There’s no excuse for this.
The public has the right to this data. DOH could have simply uploaded the CSV files to their website for people to download if they thought it was an ESRI issue, but they didn’t.
There’s another issue with this data becoming increasingly unreliable and untrustworthy.
The case line data is the only place to get information about pediatric cases on a daily basis. DOH updates its pediatric report only once a week, and that report doesn’t include all of the details that the case line data does, nor does it make it possible to track individual pediatric cases to see if any are removed from the weekly report.
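With daily case line files, that kind of tracking comes down to comparing case IDs between two snapshots. Here is a minimal sketch, assuming hypothetical `case_id` and `age` columns (the real field names may differ) and treating anyone 17 or younger as pediatric:

```python
import csv
import io


def pediatric_ids(snapshot_file, max_age=17):
    """Return the set of case IDs at or under max_age in one snapshot."""
    reader = csv.DictReader(snapshot_file)
    return {
        row["case_id"]
        for row in reader
        # Skip rows whose age is blank or not a plain whole number.
        if row["age"].strip().isdigit() and int(row["age"]) <= max_age
    }


def removed_cases(yesterday_file, today_file):
    """Pediatric case IDs present yesterday but absent from today's file."""
    return pediatric_ids(yesterday_file) - pediatric_ids(today_file)


# Tiny made-up snapshots; real files would come from the daily download.
yesterday = io.StringIO("case_id,age\nA1,9\nA2,15\nA3,40\n")
today = io.StringIO("case_id,age\nA1,9\nA3,40\n")
missing = removed_cases(yesterday, today)
```

In this made-up example, `missing` holds the one pediatric case that dropped out between the two days. The same comparison is impossible with a weekly summary that carries no per-case rows.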
Florida COVID Action launched a school monitoring project this week which allows the public to safely and anonymously report cases in their schools. We need the case line data to verify the data in those reports.
I wish I could just accio data, but I fear my wizarding powers are limited to multi-tasking and being ignored by one Stephen Colbert.