Movies Correlation Project

This project looks for correlations between variables in the movie industry, using Python, pandas, NumPy, and seaborn on a Kaggle dataset. The dataset includes movie budgets, ticket sales revenue, ratings, and other variables relevant to a movie's success.
I began by importing the dataset into a pandas DataFrame and used pandas to clean and organize the data. I then used NumPy for calculations on the data and seaborn for visualizations.
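A minimal sketch of what the loading and cleaning step might look like. The column names (`name`, `budget`, `gross`), the `clean_movies` helper, and the `movies.csv` filename mentioned in the comment are assumptions for illustration, and the small in-memory sample stands in for the real Kaggle file.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample standing in for the Kaggle file. With the real data this
# would be something like pd.read_csv("movies.csv") (hypothetical filename).
raw = pd.DataFrame(
    {
        "name": ["A", "B", "C", "C", "D"],
        "budget": [1_000_000.0, np.nan, 5_000_000.0, 5_000_000.0, 20_000_000.0],
        "gross": [3_000_000.0, 2_000_000.0, 4_000_000.0, 4_000_000.0, 90_000_000.0],
    }
)


def clean_movies(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows missing budget or gross, remove exact duplicates, cast to int64."""
    out = df.dropna(subset=["budget", "gross"]).drop_duplicates()
    out = out.astype({"budget": "int64", "gross": "int64"})
    return out.reset_index(drop=True)


movies = clean_movies(raw)

# Share of missing values per column in the raw data, useful before deciding
# whether dropping rows is acceptable.
missing_share = raw.isna().mean()
```

Dropping incomplete rows is only one option; whether it is appropriate depends on how many rows are affected, which `missing_share` helps to check.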
One of my main goals was to understand the relationship between a movie's budget and its ticket sales revenue. To do this, I created a scatter plot with budget on the x-axis and ticket sales revenue on the y-axis. The plot showed a positive correlation: movies with higher budgets tended to earn higher ticket sales revenue. However, the relationship was far from perfect, and many high-budget movies still performed poorly at the box office.
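The budget comparison can be sketched roughly as follows. The column names `budget` and `gross`, the helper `plot_budget_vs_gross`, and the rule used to flag underperformers (budget above the median and revenue below budget) are assumptions for illustration. The numbers are made up and are not results from the dataset.

```python
import pandas as pd

# Made-up illustrative values (in millions), not figures from the Kaggle dataset.
movies = pd.DataFrame(
    {
        "budget": [1, 5, 20, 50, 100, 150],
        "gross": [3, 4, 90, 40, 300, 120],
    }
)

# Pearson correlation coefficient between budget and revenue (pandas default).
r = movies["budget"].corr(movies["gross"])

# One possible way to flag high-budget underperformers: budget above the
# median and revenue lower than the budget itself.
underperformers = movies[
    (movies["budget"] > movies["budget"].median())
    & (movies["gross"] < movies["budget"])
]


def plot_budget_vs_gross(df: pd.DataFrame):
    """Scatter plot with budget on the x-axis and revenue on the y-axis."""
    # Plotting libraries are imported here so the numeric part above
    # still runs where no plotting stack is installed.
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.scatterplot(data=df, x="budget", y="gross")
    ax.set_xlabel("Budget")
    ax.set_ylabel("Ticket sales revenue")
    ax.set_title("Budget vs. ticket sales revenue")
    plt.show()
    return ax
```

A positive `r` matches the upward trend in the scatter plot, while the rows in `underperformers` correspond to the points that sit well below that trend.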
Beyond budget and ticket sales revenue, I looked at other variables that could be related to a movie's success, such as its rating and genre. Movies with higher ratings tended to sell more tickets, and certain genres, such as action and adventure, tended to earn more than others.
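A comparison across ratings and genres might be sketched like this. The column names (`genre`, `score`, `budget`, `gross`) are assumptions, and the values are invented for illustration rather than taken from the dataset.

```python
import pandas as pd

# Made-up illustrative values, not figures from the Kaggle dataset.
movies = pd.DataFrame(
    {
        "genre": ["Action", "Action", "Adventure", "Comedy", "Comedy", "Drama"],
        "score": [7.5, 6.0, 8.0, 6.5, 5.5, 7.0],
        "budget": [100, 150, 120, 20, 10, 15],
        "gross": [300, 120, 400, 40, 15, 30],
    }
)

# Pairwise Pearson correlations between the numeric columns.
corr_matrix = movies[["score", "budget", "gross"]].corr()

# Median revenue per genre, highest first. The median is less sensitive
# than the mean to a single blockbuster within a genre.
median_gross_by_genre = (
    movies.groupby("genre")["gross"].median().sort_values(ascending=False)
)
```

The correlation matrix can also be passed to `seaborn.heatmap` to view all pairwise relationships in one figure.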
Overall, this project gave me a deeper understanding of the data and surfaced several correlations between variables in the movie industry. These insights could be useful to movie studios and other industry professionals making decisions about budgets and marketing strategies. It was also a good opportunity to practice data analysis and visualization in Python.

GitHub


Technologies:
  • Pandas
  • Matplotlib
  • Seaborn
  • NumPy
  • Jupyter Notebook