Objective: Perform spatial data analysis for COVID-19 data and visualize its spread in 2020.
Data Source: https://github.com/CSSEGISandData/COVID-19 and https://www.worldometers.info/
Collection Methodology: https://github.com/imdevskp/covid_19_jhu_data_web_scrap_and_cleaning
Analysis: The dataset includes geographic information system (GIS) data: daily case counts keyed by latitude and longitude. We can therefore count the confirmed, recovered, death, and active cases per day for each location, grouped by country and/or WHO region.
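The per-day grouping described above can be sketched with pandas. This is a minimal illustration, not the original notebook: the column names (Country/Region, WHO Region, Date, Confirmed, Deaths, Recovered, Active) are assumed from the cleaned dataset's layout, and the sample rows are made-up placeholders.

```python
import io

import pandas as pd

# Hypothetical sample in the assumed layout of the cleaned dataset;
# the figures are invented for illustration only.
SAMPLE_CSV = """Country/Region,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
Australia,-35.47,149.01,2020-07-26,10,1,4,5,Western Pacific
Australia,-33.87,151.21,2020-07-26,20,2,8,10,Western Pacific
Australia,-35.47,149.01,2020-07-27,12,1,6,5,Western Pacific
Australia,-33.87,151.21,2020-07-27,25,2,13,10,Western Pacific
Italy,41.87,12.57,2020-07-27,100,10,60,30,Europe
"""

METRICS = ["Confirmed", "Deaths", "Recovered", "Active"]


def daily_counts(df, by):
    """Sum the four case counts per day for each value of the `by` column."""
    return df.groupby(["Date", by], as_index=False)[METRICS].sum()


records = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["Date"])
by_country = daily_counts(records, "Country/Region")  # one row per day and country
by_region = daily_counts(records, "WHO Region")       # one row per day and WHO region
```

Rows that share a country but differ in latitude and longitude (provinces or states) are summed into a single row per day.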
Data by Country: The table gives the total confirmed, active, death, and recovered cases by country as of July 27, 2020. Applying a heat map and ordering the results by total confirmed cases shows that confirmed cases are highest in the USA, Brazil, and India. The USA also has, by far, the most COVID-19 deaths of any country.
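One way to produce such a ranked, color-graded table is pandas' Styler. The sketch below is an assumption about the approach, not the original code; the country names and figures are placeholders, and the helper names rank_by and as_heat_map are hypothetical.

```python
import pandas as pd

# Placeholder rows; these are NOT the real totals for any country.
totals = pd.DataFrame({
    "Country/Region": ["A", "B", "C"],
    "Confirmed": [500, 1500, 1000],
    "Deaths": [40, 30, 90],
    "Recovered": [300, 900, 600],
    "Active": [160, 570, 310],
})


def rank_by(df, column):
    """Order countries from the highest to the lowest value of `column`."""
    return df.sort_values(column, ascending=False).reset_index(drop=True)


def as_heat_map(df, columns):
    """Shade each listed column from light to dark red.

    Styler.background_gradient needs matplotlib and jinja2 installed,
    so it is kept in its own function and only called when rendering.
    """
    return df.style.background_gradient(cmap="Reds", subset=columns)


ranked = rank_by(totals, "Confirmed")
```

In a notebook, as_heat_map(ranked, ["Confirmed", "Deaths", "Recovered", "Active"]) would render the shaded table.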
Rather than ordering by confirmed cases, let's apply the heat map again, this time ordering by deaths and calculating deaths per 100 confirmed cases. This ratio is a better measure for comparing death rates between countries because it does not depend on how many cases each country has.
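The ratio is simply 100 × deaths / confirmed cases. A minimal sketch follows, with placeholder figures and a hypothetical helper name (add_deaths_per_100); countries with zero confirmed cases are left without a ratio rather than dividing by zero.

```python
import pandas as pd

# Placeholder figures for illustration only.
totals = pd.DataFrame({
    "Country/Region": ["X", "Y", "Z"],
    "Confirmed": [1000, 200, 0],
    "Deaths": [50, 20, 0],
})


def add_deaths_per_100(df):
    """Add a deaths-per-100-cases column and order the rows by deaths."""
    out = df.copy()
    # Mask zero-case rows so the division yields NaN instead of inf.
    cases = out["Confirmed"].where(out["Confirmed"] > 0)
    out["Deaths / 100 Cases"] = (100 * out["Deaths"] / cases).round(2)
    return out.sort_values("Deaths", ascending=False).reset_index(drop=True)


rated = add_deaths_per_100(totals)
```

In this toy data, X has more deaths than Y (50 versus 20) but a lower ratio (5 versus 10 per 100 cases), which is the kind of difference the raw counts hide.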
Geospatial Analysis: Now, let's plot some choropleth maps, starting with the count of confirmed cases by country.
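A choropleth like this can be drawn with Plotly Express. The sketch below assumes plotly is the plotting library (the write-up does not name one), uses made-up rows, and relies on matching countries by name, which may miss entries whose names differ from Plotly's (ISO-3 codes are more robust).

```python
import pandas as pd

# Made-up placeholder rows in the assumed column layout.
records = pd.DataFrame({
    "Country/Region": ["Italy", "Italy", "Brazil", "Brazil"],
    "Date": pd.to_datetime(["2020-07-26", "2020-07-27", "2020-07-26", "2020-07-27"]),
    "Confirmed": [90, 100, 180, 200],
})


def latest_totals(df):
    """Keep the most recent date and sum confirmed cases per country."""
    latest = df[df["Date"] == df["Date"].max()]
    return latest.groupby("Country/Region", as_index=False)["Confirmed"].sum()


def confirmed_choropleth(totals):
    """Build a world map shaded by confirmed cases."""
    # Imported here so the data preparation above runs without plotly installed.
    import plotly.express as px

    return px.choropleth(
        totals,
        locations="Country/Region",
        locationmode="country names",  # match on country name, not ISO code
        color="Confirmed",
        hover_name="Country/Region",
        color_continuous_scale="Reds",
        title="Confirmed cases by country",
    )


totals = latest_totals(records)
```

Calling confirmed_choropleth(totals).show() opens the map; swapping color="Confirmed" for a deaths column would give the corresponding deaths map.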
The same trend appears when I plot deaths by country. While Greenland had few confirmed cases, it appears to have even fewer deaths. Interestingly, the same African countries with few cases have more deaths than Greenland.
Next, I can plot the spread of confirmed COVID-19 cases over time.
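An animated choropleth is one way to show the spread over time: each date becomes a frame. This is a hedged sketch assuming Plotly Express and the same column names as before; the rows are placeholders and the helper names are hypothetical.

```python
import pandas as pd

# Placeholder rows, deliberately out of date order.
records = pd.DataFrame({
    "Country/Region": ["China", "Italy", "China", "Italy"],
    "Date": pd.to_datetime(["2020-03-01", "2020-03-01", "2020-02-01", "2020-02-01"]),
    "Confirmed": [80, 20, 50, 0],
})


def animation_frames(df):
    """One row per date and country, in chronological order."""
    frames = df.groupby(["Date", "Country/Region"], as_index=False)["Confirmed"].sum()
    frames = frames.sort_values("Date").reset_index(drop=True)
    # Frame labels are shown on the slider, so use plain date strings.
    frames["Date"] = frames["Date"].dt.strftime("%Y-%m-%d")
    return frames


def spread_over_time(frames):
    """Build a choropleth with one animation frame per date."""
    # Imported here so the data preparation above runs without plotly installed.
    import plotly.express as px

    return px.choropleth(
        frames,
        locations="Country/Region",
        locationmode="country names",
        color="Confirmed",
        animation_frame="Date",
        # Fix the color scale so shades are comparable across frames.
        range_color=(0, int(frames["Confirmed"].max())),
        color_continuous_scale="Reds",
    )


frames = animation_frames(records)
```

Fixing range_color to the overall maximum keeps the shading comparable from one date to the next; without it, each frame would be rescaled on its own.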
Conclusion: One can see that COVID-19 started in China, then spread to Europe, and then to the USA.