Re-implementation for my bachelor thesis project (thesis defense in April 2022).
Embeddings are widely used in natural language processing tasks. One concern, however, is that the embeddings produced by existing language models are dense and high-dimensional, which makes them difficult to interpret. In this work, we propose a distribution-based method to identify an ultradense subspace in a contextualized embedding space.
We use pairs of words to define two contrasting categories, e.g. female and male, and we extract BERT embeddings for these pairs from sentences in several corpora.
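As an illustration, the extraction step could look roughly like the sketch below. It assumes the Hugging Face `transformers` library with `bert-base-uncased`, matches only a word's first subtoken, and stands in a hypothetical `corpus_sentences` list for real corpus handling.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def word_embeddings(word, sentences):
    """Collect contextualized BERT vectors of `word` over `sentences`."""
    vectors = []
    word_id = tokenizer(word, add_special_tokens=False)["input_ids"][0]
    for sentence in sentences:
        encoded = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**encoded).last_hidden_state[0]  # (seq_len, 768)
        # match on the word's first subtoken -- a simplification for words
        # that BERT splits into several word pieces
        for pos, tok in enumerate(encoded["input_ids"][0].tolist()):
            if tok == word_id:
                vectors.append(hidden[pos])
    return torch.stack(vectors)  # (n_occurrences, 768)

female_emb = word_embeddings("woman", corpus_sentences)  # `corpus_sentences`: list of str
male_emb = word_embeddings("man", corpus_sentences)
```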
The embeddings form the representation spaces of the two categories. We multiply them by an orthogonal matrix Q and take the first dimension (or the first several dimensions). We model these dimensions as normal distributions and maximize the divergence between the two categories, measured by the Wasserstein distance. We optimize Q under this objective while keeping it orthogonal.
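A minimal sketch of this optimization, assuming PyTorch's orthogonal parametrization and the `female_emb` / `male_emb` matrices from the extraction sketch above. It uses the closed form of the squared 2-Wasserstein distance between two 1-D Gaussians, W₂² = (μ₁ − μ₂)² + (σ₁ − σ₂)².

```python
import torch
from torch.nn.utils.parametrizations import orthogonal

# a 768x768 linear map whose weight is constrained to stay orthogonal
rotation = orthogonal(torch.nn.Linear(768, 768, bias=False))
optimizer = torch.optim.Adam(rotation.parameters(), lr=1e-3)

def wasserstein_gaussian(a, b):
    """Squared 2-Wasserstein distance between the 1-D Gaussians fit to a and b."""
    return (a.mean() - b.mean()) ** 2 + (a.std() - b.std()) ** 2

for step in range(500):
    optimizer.zero_grad()
    f = rotation(female_emb)[:, 0]  # first dimension after rotating by Q
    m = rotation(male_emb)[:, 0]
    loss = -wasserstein_gaussian(f, m)  # maximize by minimizing the negative
    loss.backward()
    optimizer.step()

Q = rotation.weight.detach()  # the optimized orthogonal matrix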
For evaluation, we prepare a list of words, e.g. professions. As in the extraction step above, we obtain their BERT embeddings from the same corpora. We first compute the cosine similarities of each word to woman and to man, respectively. Then we apply the transformation with the optimized orthogonal matrix, remove the identified dimensions, and compute the similarities again in the complement space.
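A sketch of this evaluation under the same assumptions: `prof_emb`, `woman_emb`, and `man_emb` are hypothetical averaged BERT vectors (e.g. `word_embeddings(...).mean(0)`), and `Q` is the optimized matrix from the sketch above.

```python
import torch.nn.functional as F

def complement(v, Q, k=1):
    """Rotate v by Q and drop the first k (category-encoding) dimensions."""
    return (Q @ v)[k:]

# similarity difference to "woman" vs. "man" before the transformation ...
before = (F.cosine_similarity(prof_emb, woman_emb, dim=0)
          - F.cosine_similarity(prof_emb, man_emb, dim=0)).item()
# ... and in the complement space after it
after = (F.cosine_similarity(complement(prof_emb, Q), complement(woman_emb, Q), dim=0)
         - F.cosine_similarity(complement(prof_emb, Q), complement(man_emb, Q), dim=0)).item()
print(f"similarity difference before: {before:+.4f}  after: {after:+.4f}")
```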
Here we present part of our results:
We find that the absolute differences between the similarities to woman and to man decrease after the transformation. This means the embeddings in the complement space carry less gender information than before.