An image classification project using transfer learning with ResNet50 to classify waste materials into 6 categories for improved recycling and waste management.
Waste contamination in recycling is a growing problem, driven in part by a lack of awareness about which items are recyclable. This project builds a deep learning classifier that takes RGB images of waste materials and categorizes them into 6 classes, helping to reduce contamination and improve recycling efficiency.
Improper sorting of recyclables leads to entire batches being discarded, exacerbating pollution and resource waste. This classifier aims to automate waste categorization using computer vision and deep learning.
TrashNet Dataset by Gary Thung and Mindy Yang
- Total Images: ~2,500 RGB images
- Classes: 6 categories
- Cardboard
- Glass
- Metal
- Paper
- Plastic
- Trash
Key Characteristics:
- Images organized in class-labeled folders
- Class imbalance present (Trash: 137 samples, Paper: 594 samples)
- Real-world waste material images with varying backgrounds
Dataset Split: 70% Train / 15% Validation / 15% Test (Stratified)
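The stratified split can be reproduced with two calls to scikit-learn's `train_test_split`, as in the sketch below (the `paths` and `labels` lists are hypothetical stand-ins for the dataset index):

```python
from sklearn.model_selection import train_test_split

# First carve off the 70% training portion, stratified by class label
train_paths, rest_paths, train_labels, rest_labels = train_test_split(
    paths, labels, test_size=0.30, stratify=labels, random_state=42
)
# Split the remaining 30% evenly into validation and test sets
val_paths, test_paths, val_labels, test_labels = train_test_split(
    rest_paths, rest_labels, test_size=0.50, stratify=rest_labels, random_state=42
)
```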
Base Model: ResNet50 pretrained on ImageNet
- Total Parameters: ~25M
- Architecture: 50-layer deep residual network
- Custom Classifier Head:
Linear(2048 → 512) → ReLU → Dropout(0.5) → Linear(512 → 6)
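A minimal sketch of how this architecture could be assembled in PyTorch. The class name matches the usage snippet later in this README, but the implementation details here are assumptions:

```python
import torch.nn as nn
from torchvision import models

class ResNet50WasteClassifier(nn.Module):
    def __init__(self, num_classes=6, freeze_features=True):
        super().__init__()
        # ImageNet-pretrained ResNet50 backbone
        self.base_model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        if freeze_features:
            for param in self.base_model.parameters():
                param.requires_grad = False
        # Swap the 1000-way ImageNet head for the custom 6-way classifier;
        # the new layers are trainable regardless of the freezing above
        self.base_model.fc = nn.Sequential(
            nn.Linear(2048, 512),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes),
        )

    def forward(self, x):
        return self.base_model(x)
```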
Model 1 (Frozen Features): Only classifier trained
- Test Accuracy: 83.95%

Model 2 (Full Fine-Tuning): All layers trainable
- Test Accuracy: 75.26%

Model 3 (Selective Fine-Tuning): Last conv block (layer4) + classifier
- Test Accuracy: 91.05% ← Best Baseline
- Resize: 224×224 pixels
- Normalization: ImageNet mean/std
- Augmentation Pipeline:
- Random horizontal flip
- Random rotation (±5–10°)
- Color jitter (brightness, contrast, saturation)
- Random affine transformations
- Random erasing (Aug2)
- Gaussian blur (Aug3)
- Oversampling minority classes to match the maximum class count
- Augmentation applied only to the oversampled copies
- Result: 416 samples per class in the balanced training set (see the sketch below)
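A hypothetical sketch of this balancing scheme; `BalancedAugmentedDataset` and its arguments are illustrative names, not the project's actual code:

```python
import random
from PIL import Image
from torch.utils.data import Dataset

class BalancedAugmentedDataset(Dataset):
    """Oversample every class to the majority count; only the extra copies are augmented."""
    def __init__(self, samples_by_class, base_transform, aug_transform):
        target = max(len(paths) for paths in samples_by_class.values())
        self.items = []
        for label, paths in samples_by_class.items():
            # Originals keep the plain resize/normalize transform
            self.items += [(p, label, False) for p in paths]
            # Randomly drawn duplicates are flagged for augmentation
            self.items += [(p, label, True)
                           for p in random.choices(paths, k=target - len(paths))]
        self.base_transform = base_transform
        self.aug_transform = aug_transform

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        path, label, augment = self.items[idx]
        image = Image.open(path).convert("RGB")
        return (self.aug_transform if augment else self.base_transform)(image), label
```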
| Configuration | Test Accuracy | Notes |
|---|---|---|
| Baseline (No Aug, No Balance) | 91.05% | Model 3 architecture |
| Aug2 + Balanced | 92.11% | ← Best overall |
| Aug1 + Balanced | 89.74% | Lighter augmentation |
| Aug3 + Balanced | 90.00% | Aggressive transforms |
Aug1 (Geometric + Color):
- Random flip, rotation, color jitter, affine, perspective
Aug2 (Spatial + Erasing):
- Random crop, rotation, affine, random erasing
- Best performer at 92.11% (sketched below, after Aug3)
Aug3 (Filter-based):
- Gaussian blur, invert, posterize, grayscale
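As a sketch, Aug2 might be expressed with torchvision transforms as below; the crop scale, rotation range, and erasing probability are assumptions:

```python
from torchvision import transforms

aug2_transform = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),       # random crop
    transforms.RandomRotation(10),                              # small rotations
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),   # random shifts
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],            # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
    transforms.RandomErasing(p=0.5),                            # operates on tensors, so last
])
```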
✅ Balancing + Augmentation: +1.06% improvement over baseline
✅ Selective Fine-Tuning: Better than full fine-tuning (91% vs 75%)
✅ Data Efficiency: ~2,500 images sufficient with proper augmentation
✅ Cross-Domain Test: 81% accuracy on completely new waste images
- Glass ↔ Plastic: Highest similarity (0.95+)
- Metal ↔ Glass: Strong correlation due to shine/reflectivity
- Paper ↔ Cardboard: Similar texture patterns
- Trash: Dispersed across all categories
- Moderately distinct clusters for each material
- Overlap between visually similar pairs (Glass-Plastic, Paper-Cardboard)
- Trash class dispersed throughout the feature space (one way to extract and project these features is sketched below)
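One way to reproduce this view: temporarily swap the classifier head for `nn.Identity()` to expose the 2048-d pooled features, then project them with t-SNE. A sketch, assuming the `ResNet50WasteClassifier` layout above and a `device` variable:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.manifold import TSNE

# Bypass the classifier head so the forward pass returns pooled features
head, model.base_model.fc = model.base_model.fc, nn.Identity()
model.eval()
features, targets = [], []
with torch.no_grad():
    for images, labels in test_loader:
        features.append(model(images.to(device)).cpu().numpy())
        targets.append(labels.numpy())
model.base_model.fc = head  # restore the head

features_arr = np.concatenate(features)
targets_arr = np.concatenate(targets)
embedding = TSNE(n_components=2, perplexity=30).fit_transform(features_arr)
```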
Based on test set errors (tabulated with the sketch below):
- Paper ↔ Cardboard (texture similarity)
- Glass ↔ Plastic (transparency/reflectivity)
- Metal ↔ Glass (surface properties)
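These pairs can be tabulated with scikit-learn's confusion matrix; a sketch assuming the trained `model`, `test_loader`, and `device` from the surrounding examples:

```python
import torch
from sklearn.metrics import confusion_matrix

all_preds, all_labels = [], []
model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        all_preds.append(model(images.to(device)).argmax(dim=1).cpu())
        all_labels.append(labels)
# Rows are true classes, columns are predicted classes
cm = confusion_matrix(torch.cat(all_labels), torch.cat(all_preds))
print(cm)
```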
```bash
pip install torch torchvision numpy matplotlib scikit-learn tqdm seaborn
```

```bash
git clone https://github.com/VK4041/TrashNet_Image_Classification.git
cd TrashNet_Image_Classification
```

- Download TrashNet dataset from GitHub
- Place in `Data/TrashNet/trashnet/` directory
- Structure should be:

```
trashnet/
├── cardboard/
├── glass/
├── metal/
├── paper/
├── plastic/
└── trash/
```
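With this layout, torchvision's `ImageFolder` infers labels directly from the folder names; `base_transform` below stands in for the resize/normalize pipeline:

```python
from torchvision import datasets

dataset = datasets.ImageFolder("Data/TrashNet/trashnet", transform=base_transform)
print(dataset.classes)  # ['cardboard', 'glass', 'metal', 'paper', 'plastic', 'trash']
```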
```python
# Load best model configuration
model = ResNet50WasteClassifier(num_classes=6, freeze_features=True)
for param in model.base_model.layer4.parameters():
    param.requires_grad = True

# Train with balanced augmented data
train_model(model, balanced_train_loader, val_loader, test_loader, epochs=20)
```

Before Optimization:
- DataLoader CPU time: ~730ms
- Main bottleneck: Data loading
After Optimization (num_workers=4, pin_memory=True):
- DataLoader CPU time: ~72ms
- 90% reduction in data loading time
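Those settings correspond to a `DataLoader` configured as below (`balanced_train_dataset` is a placeholder name):

```python
from torch.utils.data import DataLoader

balanced_train_loader = DataLoader(
    balanced_train_dataset,
    batch_size=128,     # matches the training configuration below
    shuffle=True,
    num_workers=4,      # parallel worker processes for decoding and augmentation
    pin_memory=True,    # page-locked host memory speeds host-to-GPU copies
)
```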
- Epochs: 20 with early stopping (patience=3)
- Batch size: 128
- Optimizer: Adam (lr=0.001)
- Loss: CrossEntropyLoss
- Device: CUDA (T4 GPU)
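A minimal sketch of what `train_model` might look like under this configuration; the early-stopping bookkeeping and checkpoint path are assumptions, and the final test evaluation is omitted for brevity:

```python
import torch
import torch.nn as nn

def train_model(model, train_loader, val_loader, test_loader, epochs=20, patience=3):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    criterion = nn.CrossEntropyLoss()
    # Only optimize parameters left trainable by the freezing strategy
    optimizer = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=0.001
    )
    best_val_loss, stale_epochs = float("inf"), 0
    for epoch in range(epochs):
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        # Validation pass drives early stopping
        model.eval()
        val_loss = 0.0
        with torch.no_grad():
            for images, labels in val_loader:
                images, labels = images.to(device), labels.to(device)
                val_loss += criterion(model(images), labels).item()
        val_loss /= len(val_loader)
        if val_loss < best_val_loss:
            best_val_loss, stale_epochs = val_loss, 0
            torch.save(model.state_dict(), "best_model.pt")
        else:
            stale_epochs += 1
            if stale_epochs >= patience:  # patience=3 per the configuration above
                break
```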
- Class distribution plots
- Random sample display with denormalization
- Training/validation loss curves
- Accuracy progression graphs
- Misclassified image analysis
- Feature correlation heatmaps
- t-SNE cluster visualization
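The correlation heatmap can be approximated by correlating per-class mean feature vectors. This sketch reuses `features_arr` and `targets_arr` from the t-SNE example and `dataset.classes` from the `ImageFolder` example:

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Mean 2048-d feature vector per class, then a 6x6 Pearson correlation matrix
class_means = np.stack([features_arr[targets_arr == c].mean(axis=0) for c in range(6)])
corr = np.corrcoef(class_means)
sns.heatmap(corr, annot=True, fmt=".2f",
            xticklabels=dataset.classes, yticklabels=dataset.classes)
plt.title("Class-mean feature correlations")
plt.tight_layout()
plt.show()
```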
Course: SIT744 Deep Learning - Deakin University
Author: Varun Kumar
Research Paper: Classification of Trash for Recyclability Status
- ✅ Transfer learning with pretrained CNNs
- ✅ Class imbalance handling through oversampling
- ✅ Strategic data augmentation
- ✅ Selective layer fine-tuning
- ✅ Early stopping to prevent overfitting
- ✅ Cross-domain generalization testing
- ✅ Feature extraction and visualization
- ✅ Performance profiling and optimization
- Smart Fine-Tuning: Unfreezing only the last convolutional block achieved best results
- Balanced Learning: Oversampling + augmentation improved accuracy by about 1 percentage point (91.05% → 92.11%)
- Real-World Testing: 81% accuracy on completely new waste images
- Efficient Pipeline: 90% reduction in data loading time through optimization
- Interpretability: Correlation analysis explains misclassification patterns
- Designed for Google Colab with GPU support
- Requires ~2GB storage for dataset
- Training time: ~10-15 minutes on T4 GPU
- Best results with balanced + Aug2 configuration
Contributions welcome! Feel free to:
- Add new augmentation strategies
- Test on different architectures
- Expand to more waste categories
- Improve cross-domain performance
This project is open source and available under the MIT License.
For questions or collaborations, please open an issue on GitHub.
Impact: This classifier can help reduce recycling contamination, saving resources and reducing environmental pollution through automated waste sorting.
Find the resources here: https://drive.google.com/drive/folders/14TOWegOtAg8Tfdjy9NRyVN0jLoLlrlA7?usp=sharing
Note: This project was developed as part of academic coursework. The techniques demonstrated are applicable to various computer vision tasks requiring domain adaptation with limited computational resources.