Machine Learning
Unsupervised deep learning pipeline using CNNs to detect visual anomalies in wood surfaces.

This project implements a Wood Surface Anomaly Detection and Segmentation System built on three unsupervised deep learning models: PaDiM, EfficientAD, and STFPM. It provides a web-based interface for detecting and visualizing defects in wood surface images, delivered as a full-stack application with a FastAPI backend and a React/TypeScript frontend.
The application automates the detection and segmentation of anomalies (defects) on wood surfaces using computer vision and deep learning. It is designed for industrial quality control, where identifying defects in wood materials is crucial. Each model has a different strength: PaDiM localizes defects using an explicit statistical model of normal features, EfficientAD offers very fast inference, and STFPM balances multi-scale detection with good accuracy.
The system follows a modular architecture with separate service layers for each model. The FastAPI backend exposes RESTful endpoints for each detection method, while the React frontend provides an intuitive interface for image upload, model selection, and result visualization.
graph TB
Frontend[React Frontend] -->|HTTP POST| Backend[FastAPI Backend]
Backend -->|Model Selection| PaDiM[PaDiM Service]
Backend -->|Model Selection| EfficientAD[EfficientAD Service]
Backend -->|Model Selection| STFPM[STFPM Service]
PaDiM -->|ResNet50| Model1[PaDiM Model]
EfficientAD -->|PDN Networks| Model2[EfficientAD Models]
STFPM -->|ResNet18| Model3[STFPM Models]
Backend -->|Result Image| Static[Static Files]
Frontend -->|Display| Static
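The model-selection step in the diagram above could be sketched as a small registry that maps a model name to its service function. This is only an illustration in plain Python: the names `DetectionResult`, `SERVICES`, `detect`, and the `run_*` stubs are hypothetical and do not come from the project, and the stubs return placeholder values instead of running a model.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class DetectionResult:
    model: str
    score: float
    result_image: str  # hypothetical path of the rendered result under the static directory


# Hypothetical stand-ins for the three service layers; a real service would
# decode the image, run the model, and save the composite visualization.
def run_padim(image_bytes: bytes) -> DetectionResult:
    return DetectionResult("padim", 0.0, "static/padim_result.png")


def run_efficientad(image_bytes: bytes) -> DetectionResult:
    return DetectionResult("efficientad", 0.0, "static/efficientad_result.png")


def run_stfpm(image_bytes: bytes) -> DetectionResult:
    return DetectionResult("stfpm", 0.0, "static/stfpm_result.png")


SERVICES: Dict[str, Callable[[bytes], DetectionResult]] = {
    "padim": run_padim,
    "efficientad": run_efficientad,
    "stfpm": run_stfpm,
}


def detect(model_name: str, image_bytes: bytes) -> DetectionResult:
    """Route an uploaded image to the service registered for model_name."""
    try:
        service = SERVICES[model_name.lower()]
    except KeyError:
        raise ValueError(f"Unknown model: {model_name!r}") from None
    return service(image_bytes)
```

Keeping the routing in a dictionary means a fourth model can be added by registering one more function, without touching the endpoint logic.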
Backend: Python, FastAPI, PyTorch, Torchvision, OpenCV, Pillow, NumPy, SciPy, scikit-image, scikit-learn
Frontend: React, TypeScript, Material-UI (MUI), React Router DOM, react-dropzone, Vite
Models:
PaDiM (Patch Distribution Modeling) uses a pretrained ResNet50 to extract multi-layer features, models the distribution of normal features at each patch position with a mean vector and a covariance matrix (regularized with Ledoit-Wolf shrinkage), and scores test patches by their Mahalanobis distance to that distribution. It requires no gradient-based training: the only fitting step is estimating these statistics from normal images.
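The statistical step of PaDiM can be illustrated for a single patch position with NumPy. This is a minimal sketch, not the project's code: `fit_gaussian` and `mahalanobis` are hypothetical names, and the fixed `shrinkage` coefficient is a simplifying assumption standing in for Ledoit-Wolf, which estimates that coefficient from the data.

```python
import numpy as np


def fit_gaussian(features: np.ndarray, shrinkage: float = 0.1):
    """Fit mean and inverse covariance for one patch position.

    features: (n_samples, n_dims) embeddings taken from normal images.
    """
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    # Assumption: shrink toward a scaled identity with a fixed coefficient.
    # Ledoit-Wolf would choose this coefficient from the data instead.
    mu = np.trace(cov) / cov.shape[0]
    cov = (1.0 - shrinkage) * cov + shrinkage * mu * np.eye(cov.shape[0])
    return mean, np.linalg.inv(cov)


def mahalanobis(x: np.ndarray, mean: np.ndarray, cov_inv: np.ndarray) -> float:
    """Distance of one embedding from the fitted normal distribution."""
    delta = x - mean
    return float(np.sqrt(delta @ cov_inv @ delta))
```

Repeating this for every patch position and arranging the distances on the image grid would give the anomaly map; the shrinkage term keeps the covariance invertible when few normal samples are available.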
EfficientAD employs a three-network architecture: a frozen teacher network, a trainable student network, and an autoencoder. The student learns to match the teacher's features on normal images, while the autoencoder learns to reconstruct normal images and fails on anomalies. The final anomaly map combines two difference maps: student versus teacher and autoencoder versus student.
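The combination of the two difference maps could look like the following NumPy sketch. The function names are hypothetical, the inputs are assumed to be feature maps of identical shape `(C, H, W)`, and the per-image min-max normalization with equal weights is a simplifying assumption rather than the normalization the project necessarily uses.

```python
import numpy as np


def squared_diff_map(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Mean squared difference over channels: (C, H, W) inputs -> (H, W) map."""
    return ((a - b) ** 2).mean(axis=0)


def combine_maps(map_st: np.ndarray, map_ae: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Average the two maps after rescaling each to roughly [0, 1]."""

    def normalize(m: np.ndarray) -> np.ndarray:
        lo, hi = m.min(), m.max()
        return (m - lo) / (hi - lo + eps)

    # Assumption: equal weights for the student-teacher and autoencoder-student maps.
    return 0.5 * normalize(map_st) + 0.5 * normalize(map_ae)
```

Normalizing before averaging keeps one branch from dominating simply because its raw differences have a larger numeric range.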
STFPM (Student-Teacher Feature Pyramid Matching) uses a pretrained ResNet18 teacher and a randomly initialized student, matching their features at multiple scales. The student learns to replicate the teacher's features on normal images but fails on anomalous ones, and matching at several scales captures anomalies at different resolutions.
All models generate composite visualizations showing the original image, the heatmap, and an overlay for interpretability.
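The multi-scale matching used by STFPM can be sketched in NumPy as follows. The function names are hypothetical, feature maps are assumed to be square `(C, H, W)` arrays whose side length divides the output size, and nearest-neighbour upsampling is used only to keep the example dependency-free. The per-scale maps are multiplied element-wise here; summing them is another possible choice.

```python
import numpy as np


def scale_anomaly_map(teacher: np.ndarray, student: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Half squared distance between channel-normalized features: (C, H, W) -> (H, W)."""
    t = teacher / (np.linalg.norm(teacher, axis=0, keepdims=True) + eps)
    s = student / (np.linalg.norm(student, axis=0, keepdims=True) + eps)
    return 0.5 * ((t - s) ** 2).sum(axis=0)


def upsample_nearest(m: np.ndarray, size: int) -> np.ndarray:
    """Nearest-neighbour upsampling of a square map to (size, size)."""
    factor = size // m.shape[0]
    return np.repeat(np.repeat(m, factor, axis=0), factor, axis=1)


def multiscale_map(teacher_feats, student_feats, size: int) -> np.ndarray:
    """Combine the per-scale maps into one (size, size) anomaly map."""
    total = np.ones((size, size))
    for t, s in zip(teacher_feats, student_feats):
        total *= upsample_nearest(scale_anomaly_map(t, s), size)
    return total
```

With a product, a location scores high only when the student disagrees with the teacher at every scale, which suppresses noise that appears at a single resolution.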