In this project, I analyze the TMDB dataset, which is available on Kaggle. The dataset contains information on around 10,000 movies collected from TMDB (www.themoviedb.org), including each movie’s budget, revenue, viewer ratings, genres, production companies, director, cast, associated keywords, popularity, and runtime.
This dataset can help answer questions about profitability, trends in runtime and popularity over the years, which genres are most profitable, and the connection between popularity ratings and profit. It can also reveal which directors, casts, and production companies have been profitable over the period covered.
Note: This analysis does not use inferential statistics or machine learning, so the findings are tentative.
I focus on answering the following questions as part of this project:
- 1. What is the trend of the number of movies released each year?
- 2. Do certain casts and directors work together to make profitable movies?
- 3. What are the most profitable, highest revenue, and highest budget movies?
- 4. What is the highest revenue movie in each of the recent years?
- 5. What is the correlation between budget, revenue, and profit in recent years?
- 6. Which genres are more profitable in recent years?
- 7. Which are some of the most used keywords in recent profitable movies?
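
As a rough illustration of how questions 1 and 5 might be approached, here is a minimal pandas sketch. It runs on a tiny made-up sample rather than the real data, and the column names (`release_year`, `budget`, `revenue`) as well as the definition of profit as revenue minus budget are assumptions for illustration, not details confirmed above.

```python
import pandas as pd

# Tiny made-up sample; values and column names are assumptions for illustration only.
movies = pd.DataFrame({
    "release_year": [2013, 2013, 2014, 2015, 2015, 2015],
    "budget": [100, 50, 80, 200, 30, 60],
    "revenue": [300, 40, 160, 900, 90, 60],
})

# Question 1: number of movies released each year.
movies_per_year = movies.groupby("release_year").size()

# Profit is taken here as revenue minus budget (an assumed definition).
movies["profit"] = movies["revenue"] - movies["budget"]

# Question 5: pairwise Pearson correlation between budget, revenue, and profit.
correlation = movies[["budget", "revenue", "profit"]].corr()

print(movies_per_year)
print(correlation.round(2))
```

On the real dataset, the same steps would apply after loading the file and, for the "recent years" questions, filtering rows to whatever year range is chosen as recent.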