Global COVID-19 Database Exploration

I conducted a study of COVID-19 data from the last two years, using SQL as my primary tool for data exploration. The study aimed to understand the impact of the virus on different countries and regions, as well as the effectiveness of government responses to the pandemic.
I began by collecting data from a Kaggle dataset covering many countries. It included the number of confirmed cases and deaths, demographic information such as age and gender, and data on tests conducted, hospitalizations, recoveries, and more.
Once the data was collected, I used SQL to clean and organize it. I created tables and views to store and manipulate the data, and temporary tables to hold intermediate results. I wrote queries using SELECT, JOIN, WHERE, GROUP BY, HAVING, and other commands to extract the information I needed. With these queries I examined the number of confirmed cases and deaths over time and identified hotspots and high-risk areas where the virus was particularly prevalent. I also analyzed government responses to the pandemic, such as testing policies, quarantine measures, and vaccine distribution.
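The workflow described above (a base table, a view, a temporary table, and a filtering query) can be sketched roughly as follows. This is only an illustration, not the project's actual queries: it uses Python's built-in sqlite3 module so it runs without a server, and SQLite syntax differs in places from SQL Server (for example, SQL Server names temporary tables with a leading #). The table names, column names, threshold, and the two rows of data are all hypothetical placeholders, not values from the Kaggle dataset.

```python
import sqlite3


def find_hotspots(min_pct_infected=5.0):
    """Return (location, pct_infected) rows above a threshold, highest first."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical schema; the real dataset's columns may differ.
        CREATE TABLE covid_cases (
            location    TEXT,
            report_date TEXT,
            new_cases   INTEGER,
            new_deaths  INTEGER
        );
        CREATE TABLE populations (
            location   TEXT PRIMARY KEY,
            population INTEGER
        );

        -- Made-up placeholder rows, for illustration only.
        INSERT INTO covid_cases VALUES
            ('Country A', '2021-01-01', 100, 2),
            ('Country A', '2021-01-02', 150, 3),
            ('Country B', '2021-01-01', 10, 0),
            ('Country B', '2021-01-02', NULL, 1);
        INSERT INTO populations VALUES
            ('Country A', 1000),
            ('Country B', 2000);

        -- View: per-country totals, treating missing counts as zero.
        CREATE VIEW country_totals AS
        SELECT c.location,
               SUM(COALESCE(c.new_cases, 0))  AS total_cases,
               SUM(COALESCE(c.new_deaths, 0)) AS total_deaths,
               p.population
        FROM covid_cases AS c
        JOIN populations AS p ON p.location = c.location
        GROUP BY c.location, p.population;

        -- Temporary table: intermediate result reused by later queries.
        CREATE TEMP TABLE infection_rates AS
        SELECT location,
               100.0 * total_cases / population AS pct_infected
        FROM country_totals
        WHERE population > 0;
        """
    )
    rows = conn.execute(
        "SELECT location, pct_infected FROM infection_rates "
        "WHERE pct_infected > ? ORDER BY pct_infected DESC",
        (min_pct_infected,),
    ).fetchall()
    conn.close()
    return rows


hotspots = find_hotspots()
```

Keeping the aggregation in a view and the derived rate in a temporary table means later queries can filter or rank countries without repeating the JOIN and GROUP BY logic.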
I found that the number of confirmed cases and deaths varied greatly between countries and regions, and that hotspots and high-risk areas often correlated with population density and socio-economic factors. Government responses also varied widely, with some countries implementing more effective measures than others. In addition, the number of tests conducted was related to the number of confirmed cases and deaths.
One of the most interesting findings of the study was this correlation between the number of tests each country conducted and its confirmed cases and deaths. Countries that conducted more tests tend to report case and death counts that more accurately represent the actual spread of the virus.
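One way to look at the testing relationship is to compare tests per confirmed case and the share of tests that came back positive for each country. The sketch below is an assumption about how such a comparison might be written, not the query used in the study. It again runs on Python's sqlite3 module rather than SQL Server, and the table name, column names, and figures are hypothetical placeholders.

```python
import sqlite3


def testing_summary():
    """Return (location, tests_per_case, positive_rate_pct) per country."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical per-country totals; values are made up.
        CREATE TABLE country_testing (
            location    TEXT PRIMARY KEY,
            total_tests INTEGER,
            total_cases INTEGER
        );
        INSERT INTO country_testing VALUES
            ('Country A', 1000, 250),
            ('Country B', 5000, 10),
            ('Country C', 200, 0);
        """
    )
    rows = conn.execute(
        """
        SELECT location,
               -- NULLIF avoids division by zero when a count is 0.
               1.0 * total_tests / NULLIF(total_cases, 0)   AS tests_per_case,
               100.0 * total_cases / NULLIF(total_tests, 0) AS positive_rate_pct
        FROM country_testing
        ORDER BY tests_per_case DESC
        """
    ).fetchall()
    conn.close()
    return rows


summary = testing_summary()
```

A high number of tests per confirmed case, together with a low positive rate, suggests that testing is broad enough for the reported counts to be closer to the real picture. A country with no confirmed cases gets NULL (None in Python) for tests per case instead of raising an error.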

GitHub


Technologies:
  • SQL Server