Machine Learning

Steam Game Recommender System

Hybrid recommender system combining collaborative and content-based filtering using deep learning for personalized Steam game recommendations.

Overview

This project implements a production-ready hybrid recommender system for Steam games that combines collaborative filtering and content-based filtering approaches using deep learning. The system provides personalized game recommendations to users based on their gaming history, preferences, and similarity to other users.

The recommender uses a PyTorch-based neural network architecture called DynamicHybridRecommender that predicts both recommendation probability and expected playtime from a single shared network. The system addresses common recommender-system challenges: the cold-start problem is handled with synthetic user embeddings built from played games, data sparsity with k-core filtering, and recommendation diversity with diversity and novelty re-ranking.

Architecture

The system follows a three-tier architecture with a Next.js frontend, FastAPI backend, and PyTorch model layer. The backend serves RESTful APIs for various recommendation strategies while the frontend provides a Steam-like UI for interactive exploration of recommendations.

graph TB
    Frontend[Next.js Frontend] -->|HTTP/REST API| Backend[FastAPI Backend]
    Backend -->|Model Inference| Model[PyTorch Model<br/>DynamicHybridRecommender]
    Backend -->|Data Loading| DataLoader[DataLoader Service]
    Model -->|Embeddings| Embeddings[User/Item Embeddings]
    DataLoader -->|Steam Dataset| Dataset[(Steam Dataset)]

Key Features

Hybrid Recommendation Model: Combines user-item collaborative filtering embeddings with content-based features (game tags/genres) in a unified neural network
Multiple Recommendation Strategies:
- User-based recommendations using learned user embeddings
- Played-games-based recommendations for cold-start scenarios (synthetic user embeddings)
- Similar users discovery using cosine similarity
- Similar games discovery using content-based and co-occurrence analysis
Dual Output Prediction: Simultaneously predicts recommendation probability (0-1) and expected playtime (hours)
Diversity & Novelty: Re-ranking algorithms ensure varied recommendations and boost less popular games
Production-Ready API: FastAPI backend with comprehensive endpoints for all recommendation types
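
The played-games-based strategy for cold-start users can be illustrated with a small sketch. It is not the project's actual code: the function name, the dictionary-based data layout, and the choice to weight each existing user by the number of games shared with the new user are all assumptions made for illustration.

```python
def synthetic_user_embedding(played_games, user_games, user_embeddings):
    """Build an embedding for a new user from users with overlapping libraries.

    played_games:    iterable of game ids the new user has played
    user_games:      dict user_id -> set of game ids (hypothetical layout)
    user_embeddings: dict user_id -> list of floats (learned embeddings)
    Weighting by overlap count is an assumption, not the documented scheme.
    """
    if not user_embeddings:
        return None
    played = set(played_games)
    dim = len(next(iter(user_embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for user_id, games in user_games.items():
        if user_id not in user_embeddings:
            continue
        overlap = len(played & set(games))  # shared games act as the weight
        if overlap == 0:
            continue
        for i, value in enumerate(user_embeddings[user_id]):
            total[i] += overlap * value
        weight_sum += overlap
    if weight_sum == 0:
        return None  # no existing user shares a game with the new user
    return [value / weight_sum for value in total]


user_games = {"u1": {"a", "b"}, "u2": {"b", "c"}, "u3": {"d"}}
user_embeddings = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "u3": [5.0, 5.0]}
embedding = synthetic_user_embedding(["a", "b"], user_games, user_embeddings)
```

The resulting vector can then be scored against item embeddings in the same way as a learned user embedding. When no overlap exists the sketch returns `None`, leaving the fallback (for example, popularity-based results) to the caller.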

Technology Stack

Backend: Python, FastAPI, PyTorch, NumPy, Pandas, scikit-learn
Frontend: Next.js, TypeScript, React, Steam-like UI components
Model: PyTorch neural network with user/item embeddings (32-dim), content features (50-dim), and dual output heads
Data Processing: k-core filtering, temporal split evaluation, MultiLabelBinarizer for tag vectors
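
As a rough illustration of the k-core filtering step, the sketch below repeatedly drops users and items with fewer than k interactions until nothing more is removed. The `(user_id, item_id)` pair format and the function name are assumptions; the project's own preprocessing may use Pandas and differ in detail.

```python
from collections import Counter


def k_core_filter(interactions, k=5):
    """Return the interactions whose user and item each appear at least k times.

    interactions: iterable of (user_id, item_id) pairs (hypothetical layout).
    Removing a user can push an item below k (and vice versa), so the
    filter is repeated until the set of interactions stops shrinking.
    """
    current = list(dict.fromkeys(interactions))  # de-duplicate, keep order
    while True:
        user_counts = Counter(user for user, _ in current)
        item_counts = Counter(item for _, item in current)
        kept = [
            (user, item)
            for user, item in current
            if user_counts[user] >= k and item_counts[item] >= k
        ]
        if len(kept) == len(current):
            return kept
        current = kept


pairs = [
    ("u1", "a"), ("u1", "b"),
    ("u2", "a"), ("u2", "b"),
    ("u3", "b"), ("u3", "c"),
]
core = k_core_filter(pairs, k=2)
```

In this toy example with k=2, game `c` is dropped first because only one user played it, which in turn leaves `u3` with a single interaction, so `u3` is dropped on the next pass.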

Technical Highlights

The DynamicHybridRecommender model architecture concatenates user embeddings (32-dim), item embeddings (32-dim), and content features (50-dim reduced to 32-dim) into a 96-dimensional vector. This passes through two fully connected layers (96→64→32) with ReLU activation and dropout (0.2) before splitting into two output heads: a sigmoid output for recommendation probability and a linear output for predicted playtime.
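
The layer sizes described above can be written down as a minimal PyTorch sketch. Only the dimensions, activations, dropout rate, and the two heads come from the description; the attribute names, the use of a single linear layer for the 50-to-32 content reduction, and the exact placement of dropout are assumptions.

```python
import torch
from torch import nn


class DynamicHybridRecommender(nn.Module):
    """Sketch of the described architecture; internal names are hypothetical."""

    def __init__(self, num_users, num_items, embed_dim=32, content_dim=50, dropout=0.2):
        super().__init__()
        self.user_embedding = nn.Embedding(num_users, embed_dim)
        self.item_embedding = nn.Embedding(num_items, embed_dim)
        # Assumed: a plain linear projection reduces content features 50 -> 32.
        self.content_proj = nn.Linear(content_dim, embed_dim)
        # Shared trunk: 96 -> 64 -> 32 with ReLU and dropout.
        self.trunk = nn.Sequential(
            nn.Linear(3 * embed_dim, 64), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(64, 32), nn.ReLU(), nn.Dropout(dropout),
        )
        self.recommend_head = nn.Linear(32, 1)  # sigmoid applied in forward()
        self.playtime_head = nn.Linear(32, 1)   # linear output, in hours

    def forward(self, user_ids, item_ids, content_features):
        combined = torch.cat(
            [
                self.user_embedding(user_ids),
                self.item_embedding(item_ids),
                self.content_proj(content_features),
            ],
            dim=-1,
        )  # shape: (batch, 96)
        hidden = self.trunk(combined)
        probability = torch.sigmoid(self.recommend_head(hidden)).squeeze(-1)
        playtime = self.playtime_head(hidden).squeeze(-1)
        return probability, playtime


model = DynamicHybridRecommender(num_users=10, num_items=20)
model.eval()  # disable dropout for inference
with torch.no_grad():
    probability, playtime = model(
        torch.tensor([0, 1]), torch.tensor([3, 4]), torch.rand(2, 50)
    )
```

Sharing the trunk between the two heads lets the playtime signal shape the same representation used for the recommendation probability; how the two losses are weighted during training is not specified here.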

The system uses k-core filtering (k=5) to produce a denser interaction matrix, removing users and items with fewer than 5 interactions. For cold-start scenarios, the system constructs a synthetic user embedding by finding users who played similar games and computing a weighted average of their embeddings. Diversity re-ranking drops any game whose cosine similarity to an already-selected recommendation exceeds 0.8, and novelty scoring boosts less popular games. On the test set, the model achieved precision@K of 0.0194, recall@K of 0.0520, and AUC-ROC of 0.6763.
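
The diversity re-ranking rule can be sketched as a greedy pass over the score-ordered candidates. This is an illustrative version only: the function names are hypothetical, the sketch does not say which item vectors (tag vectors or learned embeddings) the real system compares, and the novelty boost is left out.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def diversify(ranked_items, item_vectors, top_n=10, max_similarity=0.8):
    """Greedily keep items that are not too similar to anything already kept.

    ranked_items: item ids sorted by model score, best first
    item_vectors: dict item_id -> list of floats (hypothetical layout)
    """
    selected = []
    for item in ranked_items:
        vector = item_vectors[item]
        if all(
            cosine_similarity(vector, item_vectors[kept]) <= max_similarity
            for kept in selected
        ):
            selected.append(item)
        if len(selected) == top_n:
            break
    return selected


item_vectors = {
    "a": [1.0, 0.0],
    "b": [0.99, 0.1],  # nearly identical to "a"
    "c": [0.0, 1.0],
    "d": [0.7, 0.7],   # about 0.71 similarity to both "a" and "c"
}
picked = diversify(["a", "b", "c", "d"], item_vectors, top_n=3)
```

Here `b` is skipped because it is almost a duplicate of `a`, while `d` survives because its similarity to each kept game stays below the 0.8 threshold.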

Technologies Used

Python, PyTorch, FastAPI, Next.js, TypeScript, Deep Learning, Neural Networks, Collaborative Filtering