How Recommendation Systems Work
Recommendation systems are among the most widespread and profitable applications of Machine Learning. They power recommendations on Amazon, Netflix, Spotify, YouTube, and virtually every platform that personalizes user experience. The goal is to predict user preferences and suggest relevant content they have not yet discovered. It is estimated that 35% of Amazon sales and 80% of Netflix content viewed come from recommendation systems.
There are three main approaches: collaborative filtering (based on similar users' behavior), content-based filtering (based on item characteristics), and hybrid approaches that combine both to overcome each one's limitations.
What You Will Learn in This Article
- Collaborative Filtering: user-based and item-based
- Content-Based Filtering: feature similarity
- Matrix Factorization: SVD and NMF
- Hybrid Systems and the Cold-Start Problem
- Evaluation metrics: Precision@K, NDCG, RMSE
- Practical implementation with Python
Collaborative Filtering
Collaborative filtering is based on the idea that users with similar tastes in the past will have similar tastes in the future. The user-item matrix is the fundamental data structure: rows are users, columns are items, and values are ratings (or interactions). This matrix is typically very sparse: each user has interacted with only a small fraction of available items.
User-based CF finds users similar to the target and recommends items those users liked. Item-based CF finds items similar to those the user liked and recommends them. Similarity is typically measured with cosine similarity or Pearson correlation.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# User-item matrix (0 = not rated)
# Rows: users, Columns: movies
ratings = np.array([
    [5, 3, 0, 1, 4],  # User 0
    [4, 0, 0, 1, 5],  # User 1
    [1, 1, 0, 5, 0],  # User 2
    [0, 0, 5, 4, 0],  # User 3
    [0, 3, 4, 0, 0],  # User 4
])
movie_names = ['Matrix', 'Inception', 'Notebook', 'Titanic', 'Interstellar']

# --- USER-BASED CF ---
# Cosine similarity between full user rating vectors
# (note: unrated items count as 0 here, which biases similarity on sparse data)
user_similarity = cosine_similarity(ratings)
np.fill_diagonal(user_similarity, 0)  # exclude self-similarity

def predict_user_based(user_id, item_id, ratings, similarity, k=2):
    """Predict a user's rating for an item from the k most similar users."""
    item_ratings = ratings[:, item_id]
    rated_mask = item_ratings > 0   # users who actually rated this item
    rated_mask[user_id] = False     # exclude the target user
    if not rated_mask.any():
        return 0
    sim_scores = similarity[user_id][rated_mask]
    user_ratings = item_ratings[rated_mask]
    # Keep the k most similar raters
    top_k_idx = np.argsort(sim_scores)[-k:]
    top_sims = sim_scores[top_k_idx]
    top_ratings = user_ratings[top_k_idx]
    if top_sims.sum() == 0:
        return 0
    # Similarity-weighted average of their ratings
    return np.dot(top_sims, top_ratings) / top_sims.sum()

# Predictions for User 0 on unseen movies
user = 0
print(f"Recommendations for User {user}:")
for item in range(ratings.shape[1]):
    if ratings[user, item] == 0:
        pred = predict_user_based(user, item, ratings, user_similarity)
        print(f"  {movie_names[item]:<15s}: predicted rating = {pred:.2f}")
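The item-based variant works the same way with the roles swapped: similarity is computed between columns (items) of the rating matrix instead of rows, and a prediction is a weighted average of the user's own ratings on similar items. A minimal sketch, reusing the toy matrix from above:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Same toy matrix: rows = users, columns = movies, 0 = not rated
ratings = np.array([
    [5, 3, 0, 1, 4],
    [4, 0, 0, 1, 5],
    [1, 1, 0, 5, 0],
    [0, 0, 5, 4, 0],
    [0, 3, 4, 0, 0],
])

# Item-item similarity: transpose so each row is an item's rating vector
item_similarity = cosine_similarity(ratings.T)
np.fill_diagonal(item_similarity, 0)  # exclude self-similarity

def predict_item_based(user_id, item_id, ratings, similarity):
    """Weighted average of the user's ratings on items similar to item_id."""
    user_ratings = ratings[user_id]
    rated_mask = user_ratings > 0  # items this user has rated
    sims = similarity[item_id][rated_mask]
    if sims.sum() == 0:
        return 0.0
    return np.dot(sims, user_ratings[rated_mask]) / sims.sum()

# Predict User 0's rating for 'Notebook' (item 2)
pred = predict_item_based(0, 2, ratings, item_similarity)
print(f"Predicted rating: {pred:.2f}")
```

Item-based CF is often preferred in production (it was popularized by Amazon) because item-item similarities are more stable over time than user-user similarities and can be precomputed offline.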
Content-Based Filtering
Content-based filtering recommends items similar to those already liked by the user, based on item characteristics (genre, director, actors for movies; artist, genre, BPM for music). Each item is represented as a feature vector, and similarity between items is calculated with cosine similarity. It does not suffer from the cold-start problem for items (knowing their characteristics is enough), but cannot recommend items outside the user's profile.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

# Movie database with short keyword descriptions
films = {
    'Matrix': 'sci-fi action cyberpunk hacker virtual reality',
    'Inception': 'sci-fi action dreams psychological thriller',
    'Notebook': 'romance drama love story emotional',
    'Titanic': 'romance drama historical ship love',
    'Interstellar': 'sci-fi space exploration time gravity',
    'Blade Runner': 'sci-fi cyberpunk android dystopia',
    'Pride and Prejudice': 'romance historical drama society',
    'Dark Knight': 'action superhero crime thriller',
}

# Build a TF-IDF matrix from the descriptions
film_names = list(films.keys())
descriptions = list(films.values())
tfidf = TfidfVectorizer()
tfidf_matrix = tfidf.fit_transform(descriptions)

# Cosine similarity between all pairs of movies
item_similarity = cosine_similarity(tfidf_matrix)

def recommend_content_based(liked_films, all_films, similarity, n=3):
    """Recommend the movies most similar to those the user liked."""
    liked_idx = [all_films.index(f) for f in liked_films if f in all_films]
    if not liked_idx:
        return []
    # Average similarity to the liked movies
    scores = np.zeros(len(all_films))
    for idx in liked_idx:
        scores += similarity[idx]
    scores /= len(liked_idx)
    # Exclude movies already seen
    for idx in liked_idx:
        scores[idx] = -1
    # Top-n recommendations
    top_idx = np.argsort(scores)[::-1][:n]
    return [(all_films[i], scores[i]) for i in top_idx]

# Recommend based on liked movies
liked = ['Matrix', 'Interstellar']
recs = recommend_content_based(liked, film_names, item_similarity)
print(f"You liked: {liked}")
print("Recommendations:")
for film, score in recs:
    print(f"  {film:<25s} (score: {score:.3f})")
Matrix Factorization: SVD
Matrix Factorization decomposes the sparse user-item matrix into two dense, low-dimensional matrices: one for users and one for items. The product of these matrices approximates the original matrix and fills in missing values with predictions. SVD (Singular Value Decomposition) and NMF (Non-negative Matrix Factorization) are the most widely used techniques. The latent factors capture abstract concepts such as preferred genres or narrative style.
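A minimal sketch of the idea with NumPy's SVD on the toy matrix from earlier. Note this is a simplification: it treats unrated entries as zeros, whereas production recommenders (e.g. FunkSVD or ALS) fit latent factors only on the observed ratings.

```python
import numpy as np

# Toy user-item matrix (0 = not rated, treated as 0 for this illustration)
ratings = np.array([
    [5, 3, 0, 1, 4],
    [4, 0, 0, 1, 5],
    [1, 1, 0, 5, 0],
    [0, 0, 5, 4, 0],
    [0, 3, 4, 0, 0],
], dtype=float)

# Decompose, then keep only k latent factors
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
k = 2
user_factors = U[:, :k] * s[:k]  # each user as a k-dimensional vector
item_factors = Vt[:k, :]         # each item as a k-dimensional vector

# The low-rank product approximates the matrix and fills the zeros
# with predicted scores
predicted = user_factors @ item_factors
print(np.round(predicted, 2))
```

The key point is the shape change: instead of one 5x5 sparse matrix, we get a 5x2 user matrix and a 2x5 item matrix, and a prediction for any (user, item) pair is just a dot product of two k-dimensional vectors.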
The Cold-Start Problem
The cold-start problem is the biggest challenge of recommendation systems: how to recommend to a new user whose preferences we do not know? How to recommend a new item that nobody has rated yet? Solutions include: asking for initial preferences (onboarding quiz), using demographic information, combining content-based (which does not suffer from cold-start for items) with collaborative filtering, and using popularity-based recommendations as fallback.
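The popularity fallback mentioned above is simple to sketch: when a user has no interaction history, recommend the items that the most users have rated. The matrix below repeats the toy data with a hypothetical new user appended.

```python
import numpy as np

# Toy matrix from earlier, plus a brand-new user (row 5) with no ratings
ratings = np.array([
    [5, 3, 0, 1, 4],
    [4, 0, 0, 1, 5],
    [1, 1, 0, 5, 0],
    [0, 0, 5, 4, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0],  # new user: cold start
])

def recommend_popular(user_id, ratings, n=2):
    """Popularity fallback: if the user has no history, return the most-rated items."""
    if ratings[user_id].sum() > 0:
        return None  # user has history; collaborative filtering applies instead
    counts = (ratings > 0).sum(axis=0)  # number of users who rated each item
    return list(np.argsort(counts)[::-1][:n])

print(recommend_popular(5, ratings))  # item indices, most popular first
```

In practice the fallback would blend popularity with whatever signal is available (demographics, onboarding answers) and hand off to collaborative filtering as soon as the user accumulates a few interactions.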
Collaborative vs Content-Based: Collaborative filtering discovers unexpected preferences (serendipity) but suffers from cold-start. Content-based does not require data about other users but remains confined to the user's profile (filter bubble). Hybrid systems combine the strengths of both and are the industry standard.
Evaluation Metrics
Recommendation system metrics fall into two groups: rating prediction metrics (RMSE, MAE) and ranking metrics (Precision@K, Recall@K, NDCG). Precision@K measures how many of the top-K recommended items are relevant. NDCG (Normalized Discounted Cumulative Gain) takes position into account: a relevant item in first position is worth more than one in tenth. Hit Rate measures the probability that at least one relevant item appears in the recommendations.
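Both ranking metrics are easy to compute by hand with binary relevance. A sketch, assuming we have a ranked recommendation list and a set of items known to be relevant (e.g. from a held-out test set):

```python
import numpy as np

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    top_k = recommended[:k]
    return len(set(top_k) & set(relevant)) / k

def ndcg_at_k(recommended, relevant, k):
    """NDCG with binary relevance: hits near the top of the list count more."""
    gains = [1.0 if item in relevant else 0.0 for item in recommended[:k]]
    # DCG: discount each hit by log2 of its (1-based) position + 1
    dcg = sum(g / np.log2(i + 2) for i, g in enumerate(gains))
    # Ideal DCG: all relevant items ranked first
    ideal = sum(1.0 / np.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0

recommended = ['Matrix', 'Titanic', 'Inception', 'Notebook', 'Interstellar']
relevant = {'Matrix', 'Inception', 'Interstellar'}
print(f"Precision@3: {precision_at_k(recommended, relevant, 3):.3f}")  # 2/3
print(f"NDCG@3:      {ndcg_at_k(recommended, relevant, 3):.3f}")
```

Note how the two disagree: Precision@3 only counts hits (2 of 3), while NDCG@3 also penalizes the irrelevant 'Titanic' for sitting in second place, where a hit would have earned a larger discounted gain.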
Key Takeaways
- Collaborative Filtering leverages similar users; Content-Based leverages item features
- Cosine similarity is the standard metric for measuring similarity
- Matrix Factorization (SVD) decomposes the user-item matrix into latent factors
- The cold-start problem requires hybrid strategies and intelligent fallbacks
- Precision@K and NDCG are the most important metrics for evaluating recommendations
- Hybrid systems (collaborative + content-based) are the industry standard