acyrl/MoS

Mixture of Softmaxes

Keras implementation of Mixture of Softmaxes. This layer is a type of ensemble method described in "Breaking the Softmax Bottleneck: A High-Rank RNN Language Model".

I have linked below a few blogs that can do this layer more justice than I can.
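
As a rough illustration of what the layer computes, here is a minimal NumPy sketch of the formulation in the paper: K softmax distributions are computed from the same context vector and then averaged in probability space with learned mixture weights. This is not the Keras layer from this repository; the function name `mixture_of_softmaxes`, the weight names, and the shapes are assumptions made up for the example, and the weights are random rather than trained.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)


def mixture_of_softmaxes(g, W_prior, W_proj, W_out):
    """Mix K softmax distributions computed from a shared context vector.

    Hypothetical shapes, chosen for this sketch only:
      g:       (batch, d)       context vectors from the preceding layer
      W_prior: (d, K)           produces the mixture weights pi
      W_proj:  (K, d, d)        one projection per mixture component
      W_out:   (d, n_classes)   output weights shared by all components
    """
    pi = softmax(g @ W_prior)                          # (batch, K)
    h = np.tanh(np.einsum("bd,kde->bke", g, W_proj))   # (batch, K, d)
    component_probs = softmax(h @ W_out)               # (batch, K, n_classes)
    # Weighted average of probabilities, not of logits.
    return np.einsum("bk,bkc->bc", pi, component_probs)


rng = np.random.default_rng(0)
batch, d, K, n_classes = 4, 8, 3, 10
probs = mixture_of_softmaxes(
    rng.normal(size=(batch, d)),
    rng.normal(size=(d, K)),
    rng.normal(size=(K, d, d)),
    rng.normal(size=(d, n_classes)),
)
print(probs.shape)        # one distribution over classes per example
print(probs.sum(axis=1))  # each row sums to 1 up to floating-point error
```

The mixing happens after each softmax is applied. Averaging the logits first and applying a single softmax would collapse back to an ordinary softmax, which is the bottleneck the paper sets out to avoid.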

Experiments

I'm planning on testing this layer with a few different architectures and datasets.

For MNIST, I compared the mixture of softmaxes (combining 3 softmaxes) against a plain softmax; the improvement is around 1% in accuracy. See the MNIST notebook.

The plan is to play around with CIFAR-10 and CIFAR-100 next. I will then move to some actual language models.

Resources

Some useful references:

  • The official implementation can be found here.
  • There are two interesting blog posts that are quite enlightening, found here and here.
  • A Mixture of Experts layer for Keras is here.
