Machine Learning
Hybrid recommender system combining collaborative and content-based filtering using deep learning for personalized Steam game recommendations.
This project implements a production-ready hybrid recommender system for Steam games that combines collaborative filtering and content-based filtering approaches using deep learning. The system provides personalized game recommendations to users based on their gaming history, preferences, and similarity to other users.
The recommender is built around a PyTorch neural network, DynamicHybridRecommender, that jointly predicts recommendation probability and expected playtime. It addresses three common challenges in recommender systems (the cold-start problem, data sparsity, and recommendation diversity) by combining several recommendation strategies.
The system follows a three-tier architecture with a Next.js frontend, FastAPI backend, and PyTorch model layer. The backend serves RESTful APIs for various recommendation strategies while the frontend provides a Steam-like UI for interactive exploration of recommendations.
graph TB
Frontend[Next.js Frontend] -->|HTTP/REST API| Backend[FastAPI Backend]
Backend -->|Model Inference| Model[PyTorch Model<br/>DynamicHybridRecommender]
Backend -->|Data Loading| DataLoader[DataLoader Service]
Model -->|Embeddings| Embeddings[User/Item Embeddings]
DataLoader -->|Steam Dataset| Dataset[(Steam Dataset)]
Backend: Python, FastAPI, PyTorch, NumPy, Pandas, scikit-learn
Frontend: Next.js, TypeScript, React, Steam-like UI components
Model: PyTorch neural network with user/item embeddings (32-dim), content features (50-dim), and dual output heads
Data Processing: k-core filtering, temporal split evaluation, MultiLabelBinarizer for tag vectors
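The k-core filtering step listed under Data Processing can be sketched in plain Python. This is a minimal illustration rather than the project's code: the function name `k_core_filter`, the (user, item) pair representation, and the repeat-until-stable loop are assumptions, while k=5 is the value the system uses.

```python
from collections import Counter


def k_core_filter(interactions, k=5):
    """Keep only (user, item) pairs whose user and item each have >= k interactions.

    Removing a sparse user can push an item below k (and vice versa),
    so the filter repeats until no more pairs are dropped.
    """
    pairs = set(interactions)
    while True:
        user_counts = Counter(user for user, _ in pairs)
        item_counts = Counter(item for _, item in pairs)
        kept = {
            (user, item)
            for user, item in pairs
            if user_counts[user] >= k and item_counts[item] >= k
        }
        if len(kept) == len(pairs):  # stable: nothing was removed this pass
            return kept
        pairs = kept
```

With a small k the cascading effect is easy to see: dropping a rarely played game can leave its only player below the threshold, so that player's remaining interactions are removed on the next pass.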
DynamicHybridRecommender concatenates user embeddings (32-dim), item embeddings (32-dim), and content features (50-dim, reduced to 32-dim) into a 96-dimensional vector. This vector passes through two fully connected layers (96→64→32) with ReLU activation and dropout (0.2), then feeds two output heads: a sigmoid output for recommendation probability and a linear output for predicted playtime.
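A minimal PyTorch sketch of this architecture follows. The dimensions, dropout rate, and the two heads come from the description above; the attribute names, the use of a single linear layer to reduce the content features from 50 to 32 dimensions, and the placement of dropout after each hidden layer are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class DynamicHybridRecommender(nn.Module):
    """Sketch of the hybrid model: embeddings + content features, two heads."""

    def __init__(self, n_users, n_items, content_dim=50, emb_dim=32, dropout=0.2):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, emb_dim)
        self.item_emb = nn.Embedding(n_items, emb_dim)
        # Assumption: the 50 -> 32 content reduction is a single linear layer.
        self.content_proj = nn.Linear(content_dim, emb_dim)
        # Shared trunk: 96 -> 64 -> 32 with ReLU and dropout.
        self.mlp = nn.Sequential(
            nn.Linear(3 * emb_dim, 64), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(64, 32), nn.ReLU(), nn.Dropout(dropout),
        )
        self.rec_head = nn.Linear(32, 1)       # passed through a sigmoid
        self.playtime_head = nn.Linear(32, 1)  # linear output

    def forward(self, user_ids, item_ids, content):
        x = torch.cat(
            [
                self.user_emb(user_ids),
                self.item_emb(item_ids),
                self.content_proj(content),
            ],
            dim=-1,
        )  # shape: (batch, 96)
        h = self.mlp(x)
        prob = torch.sigmoid(self.rec_head(h)).squeeze(-1)
        playtime = self.playtime_head(h).squeeze(-1)
        return prob, playtime
```

Sharing the trunk between both heads means the playtime signal can shape the same representation that drives the recommendation probability.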
The system applies k-core filtering (k=5) to obtain a denser interaction matrix, removing users and items with fewer than 5 interactions. For cold-start scenarios, it builds a synthetic user embedding by finding existing users who played similar games and taking a weighted average of their embeddings.
Diversity re-ranking drops any candidate game whose cosine similarity to an already-selected recommendation exceeds 0.8, and novelty scoring boosts less popular games.
On the test set, the model achieved precision@K of 0.0194, recall@K of 0.0520, and AUC-ROC of 0.6763.
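The cold-start and diversity steps described above could look roughly like the NumPy sketch below. The function names, the way similar users and their weights are obtained, and the greedy selection order are assumptions; only the weighted average of embeddings and the 0.8 cosine-similarity threshold come from the description. Novelty scoring is left out because its formula is not given here.

```python
import numpy as np


def synthetic_user_embedding(similar_user_embeddings, weights):
    """Weighted average of the embeddings of users who played similar games.

    similar_user_embeddings: array of shape (n_similar_users, emb_dim)
    weights: one non-negative weight per similar user (hypothetical, e.g.
             the number of games shared with the new user)
    """
    weights = np.asarray(weights, dtype=float)
    return weights @ np.asarray(similar_user_embeddings, dtype=float) / weights.sum()


def diversity_rerank(ranked_candidates, item_vectors, top_n, max_sim=0.8):
    """Greedily pick items, skipping any too similar to an earlier pick.

    ranked_candidates: item indices sorted by model score, best first
    item_vectors: array of shape (n_items, dim) used for cosine similarity
    """
    item_vectors = np.asarray(item_vectors, dtype=float)
    norms = np.linalg.norm(item_vectors, axis=1, keepdims=True)
    unit = item_vectors / np.clip(norms, 1e-12, None)  # avoid division by zero
    selected = []
    for idx in ranked_candidates:
        if all(unit[idx] @ unit[s] <= max_sim for s in selected):
            selected.append(idx)
        if len(selected) == top_n:
            break
    return selected
```

Because the re-ranking is greedy, the highest-scoring game is always kept and later candidates are only compared against games that were already accepted.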