For computational feasibility, this repository contains an implementation of Masked Autoencoders (MAE) applied to the CIFAR-10 dataset instead of the larger datasets used in the original paper.
To reproduce the experiments, use the following scripts:
recunstruct_images.py
This script visualizes the model's reconstruction of masked images from the test set. You can select specific samples by changing the start_index variable.
classify_images.py
This script classifies test-set images with a classifier fine-tuned from the encoder of the pretrained Masked Autoencoder. You can select specific samples by changing the start_index variable.
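Both scripts select test-set samples the same way, by indexing from start_index. A minimal sketch of what that selection could look like (only start_index comes from the actual scripts; the other names are illustrative stand-ins):

```python
# Minimal sketch: selecting specific CIFAR-10 test samples via start_index.
# Everything except start_index is a hypothetical stand-in for the
# repository's actual loading code.
import torchvision
import torchvision.transforms as T

test_set = torchvision.datasets.CIFAR10(
    root="./data", train=False, download=True, transform=T.ToTensor()
)

start_index = 0   # change this value to work with different samples
num_samples = 8   # hypothetical number of images processed at once
images = [test_set[i][0] for i in range(start_index, start_index + num_samples)]
```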
src/train_reconstruction_mae.py
This script pretrains the full MAE model following the pre-training setting described in the original paper.
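For orientation, here is a minimal sketch of the pre-training objective described in the paper: a large fraction of patches (75% by default here) is randomly masked, and the loss is the mean squared error computed only on the masked patches. All names and tensor shapes below are illustrative, not the repository's actual code:

```python
# Sketch of MAE-style random masking and the masked-patch MSE loss.
# Shapes assume CIFAR-10 (32x32) split into 4x4 patches: an 8x8 grid
# of 64 patches, each with 4*4*3 = 48 values.
import torch

def random_masking(patches, mask_ratio=0.75):
    """Keep a random subset of patches; return kept patches and the mask."""
    B, N, D = patches.shape
    num_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N)                  # per-patch random scores
    ids_shuffle = noise.argsort(dim=1)        # random permutation of patches
    ids_keep = ids_shuffle[:, :num_keep]
    kept = torch.gather(patches, 1, ids_keep.unsqueeze(-1).expand(-1, -1, D))
    mask = torch.ones(B, N)                   # 1 = masked, 0 = visible
    mask.scatter_(1, ids_keep, 0)
    return kept, mask

patches = torch.randn(4, 64, 48)      # dummy batch of patchified images
kept, mask = random_masking(patches)  # encoder would see only `kept`
recon = torch.randn_like(patches)     # stand-in for the decoder's output
per_patch = ((recon - patches) ** 2).mean(dim=-1)   # per-patch MSE
loss = (per_patch * mask).sum() / mask.sum()        # masked patches only
```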
src/train_mae_classifier.py
This script fine-tunes a classifier on top of the pretrained encoder (end-to-end fine-tuning, as described in the original paper).
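Conceptually, this replaces the MAE decoder with a classification head and updates all weights, encoder included. A rough sketch under assumed names (MAEClassifier, load_pretrained_encoder, and embed_dim are illustrative, not the repository's API):

```python
# Sketch of end-to-end fine-tuning: a linear head on top of a pretrained
# encoder, with gradients flowing through the whole model.
import torch
import torch.nn as nn

class MAEClassifier(nn.Module):
    def __init__(self, encoder, embed_dim, num_classes=10):
        super().__init__()
        self.encoder = encoder                  # pretrained MAE encoder
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        tokens = self.encoder(x)                # (B, N, D) patch tokens
        return self.head(tokens.mean(dim=1))    # pool tokens, then classify

# Hypothetical usage: every parameter, encoder included, is optimized.
# encoder = load_pretrained_encoder(...)       # hypothetical loader
# model = MAEClassifier(encoder, embed_dim=192)
# optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```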
The config.yaml file can be edited to run the training scripts with different settings.
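The actual keys are defined in config.yaml itself; as a sketch, the scripts presumably read it along these lines (the key names in the comments are hypothetical):

```python
# Minimal sketch of loading config.yaml (requires PyYAML).
import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

# Hypothetical entries the file might hold, e.g.:
#   mask_ratio: 0.75
#   epochs: 200
#   batch_size: 256
mask_ratio = cfg.get("mask_ratio", 0.75)
```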
In the provided scripts:
- The default model used to reconstruct images is the one trained with 75% masking.
- The default model used to classify images is the one obtained by fine-tuning the encoder pretrained with 75% masking.
It is possible to change the model path in the scripts (e.g. from mae-75-masking to mae-25-masking) to use any of the weights released in the src/data/weights folder.
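For example, the switch could look like this inside either script (the variable name is an assumption; only the folder and the mae-75-masking / mae-25-masking names come from this repository):

```python
# Hypothetical variable name; the checkpoint names and the
# src/data/weights folder are from this repository.
weights_path = "src/data/weights/mae-75-masking"    # default weights
# weights_path = "src/data/weights/mae-25-masking"  # 25% masking variant
```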
Original Paper: Masked Autoencoders Are Scalable Vision Learners

