Movie Recommendation System – 2nd way – with source code – 2022


In this blog, we will see one more way of implementing the Movie Recommendation System. This blog is also going to be a very interesting one, so without further ado, here is the idea.

The simple intuition behind this 2nd way is that we combine the main features of each movie, like the cast, director, genres, etc., and measure how similar these combined features are across movies. Most of the time, the same directors make similar movies, and the same actors like to perform in similar types of movies.

Let’s do it…

Step 1 – Importing libraries required for Movie Recommendation System.

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

Step 2 – Reading input data.

org_movies = pd.read_csv('movie_dataset.csv')
Figure: Input data

Step 3 – Checking columns of our data.
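
The screenshot for this step is not available, so here is a minimal sketch of how the column names could be inspected. It uses a tiny made-up DataFrame in place of movie_dataset.csv; the toy rows and the variable column_names are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for the DataFrame loaded in Step 2 (all values are made up).
org_movies = pd.DataFrame({
    "index": [0, 1],
    "title": ["Movie A", "Movie B"],
    "genres": ["Action Adventure", "Drama"],
    "keywords": ["space hero", None],
    "cast": ["Actor One Actor Two", "Actor Three"],
    "director": ["Director X", None],
})

# List the column names to see which features are available.
column_names = list(org_movies.columns)
print(column_names)
```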


Step 4 – Just keeping important columns.

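movies = org_movies[['genres', 'keywords', 'cast', 'title', 'director']].copy()
  • We drop all the unnecessary columns/features and keep just these 5 columns.
  • The .copy() call gives us an independent DataFrame, so adding a new column to it later does not trigger pandas' SettingWithCopyWarning.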

Step 5 – Checking info. of our data.
  • The info output shows that our data has some NULL values.
  • So we will fill these NULL values in the next step.
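
A minimal sketch of this check, assuming a toy DataFrame with the same five columns (the rows are made up). movies.info() prints the non-null count per column, and isnull().sum() gives the number of missing values directly.

```python
import pandas as pd

# Toy stand-in for the 5-column DataFrame from Step 4 (made-up rows).
movies = pd.DataFrame({
    "genres": ["Action Adventure", "Drama", None],
    "keywords": ["space hero", None, "family"],
    "cast": ["Actor One", "Actor Two", "Actor Three"],
    "title": ["Movie A", "Movie B", "Movie C"],
    "director": ["Director X", None, None],
})

# Prints dtypes and non-null counts for every column.
movies.info()

# Number of missing values per column.
null_counts = movies.isnull().sum()
print(null_counts)
```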
Figure: Some NULL values present

Step 6 – Filling NULL values.

  • We are simply filling the NULL values with an empty space.
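
The code for this step is not shown in the text. One way it might be done, sketched here on toy data, is with DataFrame.fillna. The single-space fill value is an assumption based on the description above; an empty string would work just as well for this purpose.

```python
import pandas as pd

# Toy stand-in for the 5-column DataFrame (made-up rows with gaps).
movies = pd.DataFrame({
    "genres": ["Action", None],
    "keywords": [None, "family"],
    "cast": ["Actor One", "Actor Two"],
    "title": ["Movie A", "Movie B"],
    "director": ["Director X", None],
})

# Replace every missing value with a single space, so that joining the
# columns as strings later does not turn into NaN for rows with a gap.
movies = movies.fillna(" ")

remaining_nulls = int(movies.isnull().sum().sum())
print(remaining_nulls)
```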

Step 7 – Again checking info.
  • If we check again, we can see that there are no NULL values left.
Figure: No NULL values

Step 8 – Making a column called combined features.

movies['combined_features'] = movies['genres'] +' '+ movies['keywords'] +' '+ movies['cast'] +' '+ movies['title'] +' '+ movies['director']
  • Here we create a new column called combined_features, which holds all these features concatenated into one string per movie.

Step 9 – Observe the first entry in the combined feature column.

  • This is how the first combined_feature looks.
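
A small sketch of how the first entry could be inspected, using made-up rows instead of the real dataset (the variable first_entry is an assumption for illustration):

```python
import pandas as pd

# Toy stand-in for the cleaned 5-column DataFrame (made-up rows).
movies = pd.DataFrame({
    "genres": ["Action Adventure", "Drama"],
    "keywords": ["space hero", "family"],
    "cast": ["Actor One Actor Two", "Actor Three"],
    "title": ["Movie A", "Movie B"],
    "director": ["Director X", "Director Y"],
})

# Same concatenation as in Step 8.
movies["combined_features"] = (
    movies["genres"] + " " + movies["keywords"] + " " + movies["cast"]
    + " " + movies["title"] + " " + movies["director"]
)

# Look at the combined string of the first row.
first_entry = movies["combined_features"].iloc[0]
print(first_entry)
```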

Step 10 – Initializing CountVectorizer.

cv = CountVectorizer()
count_matrix = cv.fit_transform(movies['combined_features'])
  • Here we use CountVectorizer() to convert the combined features into a bag of words, i.e. word-count vectors, because we can't compute similarities directly on raw strings.
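
To make the bag-of-words idea concrete, here is a minimal sketch on two made-up strings. It shows the vocabulary that CountVectorizer learns and the resulting count vectors; the sample strings and the variables vocab and counts are assumptions for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Two made-up "combined features" strings.
docs = ["Action hero space", "Action drama"]

cv = CountVectorizer()
count_matrix = cv.fit_transform(docs)

# Words in column order (CountVectorizer lowercases text by default).
vocab = sorted(cv.vocabulary_, key=cv.vocabulary_.get)
print(vocab)

# One row per string, one column per word, each cell a word count.
counts = count_matrix.toarray().tolist()
print(counts)
```

Each movie becomes one row of numbers, and movies that share words have non-zero counts in the same columns.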

Step 11 – Finding similarities between different entries.

cs = cosine_similarity(count_matrix)
  • Here we use cosine_similarity to calculate the similarity between every pair of combined features.
  • That is, we calculate the similarity between the 1st and 2nd combined features, between the 2nd and 3rd, between the 1st and 3rd, and so on.
  • The result is a 4803 X 4803 matrix of similarities, with one row and one column per movie.
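
For intuition, the cosine similarity of two count vectors is their dot product divided by the product of their lengths. The sketch below computes it by hand in plain Python for three made-up count vectors; it illustrates the formula and is not scikit-learn's implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up count vectors for three movies over a 4-word vocabulary.
vectors = [
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
]

# Pairwise similarity matrix: entry [i][j] compares movie i with movie j.
sim = [[cosine(a, b) for b in vectors] for a in vectors]
for row in sim:
    print([round(value, 3) for value in row])
```

Every movie is compared with every movie, including itself, which is why N movies give an N X N matrix that is symmetric and has 1 on the diagonal.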

Step 12 – Two utility functions.

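def get_movie_name_from_index(index):
    # Title of the row whose 'index' column equals the given index
    return org_movies[org_movies['index'] == index]['title'].values[0]

def get_index_from_movie_name(name):
    # 'index' value of the row whose title equals the given name
    return org_movies[org_movies['title'] == name]['index'].values[0]
  • These are 2 small utility functions that rely on the index column of the dataset.
  • The first function returns the movie name for a given index.
  • The second function returns the index for a given movie name; the name has to match a title in the data exactly.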
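
A quick sketch of how these two helpers behave, using a toy DataFrame with an index column like the one the functions expect (the rows and the variables idx, name and lookup_failed are made up for illustration):

```python
import pandas as pd

# Toy stand-in for org_movies (made-up rows).
org_movies = pd.DataFrame({
    "index": [0, 1, 2],
    "title": ["Movie A", "Movie B", "Movie C"],
})

def get_movie_name_from_index(index):
    return org_movies[org_movies["index"] == index]["title"].values[0]

def get_index_from_movie_name(name):
    return org_movies[org_movies["title"] == name]["index"].values[0]

# Round trip: title -> index -> title.
idx = get_index_from_movie_name("Movie B")
name = get_movie_name_from_index(idx)
print(idx, name)

# A title that is not in the data matches no rows, so .values[0] fails.
try:
    get_index_from_movie_name("Unknown Movie")
    lookup_failed = False
except IndexError:
    lookup_failed = True
print(lookup_failed)
```

Because a misspelled or unknown title raises an IndexError, it helps to check the exact titles in the data before entering one.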

Step 13 – Printing all movie names.
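
The screenshot for this step is not available. A minimal sketch of listing the titles, on made-up data, could look like this (the variable all_titles is an assumption for illustration):

```python
import pandas as pd

# Toy stand-in for org_movies (made-up rows).
org_movies = pd.DataFrame({
    "index": [0, 1, 2],
    "title": ["Movie A", "Movie B", "Movie C"],
})

# Collect every title to see which exact names exist in the data.
all_titles = org_movies["title"].tolist()
print(all_titles)
```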

Figure: All movie names

Step 14 – Live predictor.

test_movie_name = input('Enter Movie name --> ')
test_movie_index = get_index_from_movie_name(test_movie_name)
movie_corrs = cs[test_movie_index]
movie_corrs = enumerate(movie_corrs)
sorted_similar_movies = sorted(movie_corrs, key=lambda x: x[1], reverse=True)
# Print the titles of the 10 highest-scoring entries
for i in range(10):
    print(get_movie_name_from_index(sorted_similar_movies[i][0]))
  • Enter the movie name, e.g. ‘The Avengers’.
  • Get its index.
  • Get its similarities with all other movies from the cosine_similarity matrix.
  • Enumerate the similarities. This step turns a list of similarities like [0.001, 0.2, 0.65, 0.02…] into [(0,0.001), (1,0.2), (2,0.65), (3,0.02)…]; it simply puts an index in front of each of them.
  • Then sort the results by the 2nd element of each pair, the similarity (position 0 holds the index and position 1 holds the similarity), from highest to lowest.
  • And then print the first 10. The entered movie itself usually comes first, because its similarity with itself is 1.
  • The results look pretty good: someone who likes ‘The Avengers’ will most likely also like Avengers: Age of Ultron, Iron Man 2, Captain America, etc.
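
The enumerate-and-sort steps described above can be sketched in plain Python with the example similarity values from the list. The helper name top_matches and its skip_index parameter are hypothetical additions for illustration; skipping the queried movie is an optional tweak, not part of the original code.

```python
def top_matches(similarities, n, skip_index=None):
    """Return the n (index, score) pairs with the highest scores."""
    pairs = enumerate(similarities)
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    if skip_index is not None:
        # Optionally drop the queried movie itself from the results.
        ranked = [pair for pair in ranked if pair[0] != skip_index]
    return ranked[:n]

# Example similarity row, taken from the explanation above.
scores = [0.001, 0.2, 0.65, 0.02]

ranked_all = top_matches(scores, n=4)
print(ranked_all)

# Same ranking, but leaving out index 2 as if it were the queried movie.
ranked_without_2 = top_matches(scores, n=2, skip_index=2)
print(ranked_without_2)
```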
Figure: Final results

Download Source Code for Movie Recommendation System…

NOTE – To download the data, open the following link, right-click, and choose Save as.

Download Data for Movie Recommendation System…

Do let me know if you have any queries regarding the Movie Recommendation System by contacting me via email or LinkedIn.

So this is all for this blog, folks. Thanks for reading, and I hope you are taking something with you after reading this. Till next time…


Check out my other machine learning projects, deep learning projects, computer vision projects, NLP projects, and Flask projects.
