
Towards Ultradense Contextual Embeddings by Distribution-Based Orthogonal Transformation

Re-implementation of my bachelor thesis project (thesis defended in April 2022).

Project Description

Embeddings are widely used in natural language processing tasks. One concern, however, is that the embeddings produced by existing language models are dense and high-dimensional, which makes them difficult for people to interpret. In this work, we propose a distribution-based method to identify an ultradense subspace of a contextualized embedding space.

Demonstration

We use pairs of words to define two categories, e.g. female and male, and we extract BERT embeddings for these pairs with the help of a set of corpora.

(Demo 1: word pairs for the two categories and their extracted BERT embeddings)
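Below is a minimal sketch of this extraction step, assuming the Hugging Face `transformers` library; the model name, the subtoken-averaging strategy, and the example sentences are illustrative assumptions, not taken from this repository.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def embed_word_in_sentence(word: str, sentence: str) -> torch.Tensor:
    """Return the contextual embedding of `word`, averaged over its subtokens."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
    # Locate the subtokens belonging to `word` (naive match on token ids).
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    ids = inputs["input_ids"][0].tolist()
    for i in range(len(ids) - len(word_ids) + 1):
        if ids[i : i + len(word_ids)] == word_ids:
            return hidden[i : i + len(word_ids)].mean(dim=0)
    raise ValueError(f"{word!r} not found in sentence")

# One embedding per occurrence of each category word in the corpus:
e_woman = embed_word_in_sentence("woman", "The woman read the report.")
e_man = embed_word_in_sentence("man", "The man read the report.")
```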

The embeddings form a representation space for the two categories. We multiply them by an orthogonal matrix $Q$ and then take the first (or the first several) dimensions. We model each of these dimensions as a normal distribution and maximize the divergence between the two categories, measured by the Wasserstein distance. Optimizing $Q$ to maximize this distance concentrates the category (here: gender) information in the first dimension.

(Demo 2: rotating the embedding space with $Q$ so that the first dimension separates the two categories)
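A minimal PyTorch sketch of this optimization follows. It uses the closed-form 2-Wasserstein distance between two one-dimensional Gaussians, $W_2^2 = (\mu_1 - \mu_2)^2 + (\sigma_1 - \sigma_2)^2$, applied to the first dimension of the rotated embeddings, and keeps $Q$ orthogonal via PyTorch's orthogonal parametrization. The variable names, learning rate, and step count are illustrative assumptions, and only the first dimension (k = 1) is separated here.

```python
import torch
from torch import nn
from torch.nn.utils.parametrizations import orthogonal

dim = 768  # BERT hidden size

# The parametrization keeps the weight matrix Q orthogonal throughout training.
rotation = orthogonal(nn.Linear(dim, dim, bias=False))
opt = torch.optim.Adam(rotation.parameters(), lr=1e-3)

def gaussian_w2_sq(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Squared 2-Wasserstein distance between 1-D Gaussians fit to a and b."""
    return (a.mean() - b.mean()) ** 2 + (a.std() - b.std()) ** 2

# X_f, X_m: (n, dim) matrices of embeddings for the two categories.
X_f, X_m = torch.randn(100, dim), torch.randn(100, dim)  # stand-in data

for step in range(1000):
    opt.zero_grad()
    # First dimension of the rotated embeddings for each category.
    z_f = rotation(X_f)[:, 0]
    z_m = rotation(X_m)[:, 0]
    loss = -gaussian_w2_sq(z_f, z_m)  # negate to maximize the distance
    loss.backward()
    opt.step()

Q = rotation.weight.detach()  # the learned orthogonal matrix
```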

Results

For evaluation, we prepare a list of words, e.g. professions. As in the first demonstration, we extract their BERT embeddings with the help of the same corpora. In the original space, we can already compute the cosine similarities of each word to woman and to man. We then apply the transformation with the optimized orthogonal matrix $Q$ and take the complement space (all dimensions except the first) of the transformed embeddings. Finally, we compute the cosine similarities of each word to woman and to man in this complement space.
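A hedged sketch of this evaluation step: rotate the embeddings by $Q$, drop the first (category-carrying) dimension, and compare cosine similarities in the remaining dimensions. The names `Q`, `e_woman`, and `e_man` are assumed to come from the previous sketches, and `e_word` stands in for a profession-word embedding.

```python
import torch
import torch.nn.functional as F

def complement(e: torch.Tensor, Q: torch.Tensor, k: int = 1) -> torch.Tensor:
    """Rotate e by Q and drop the first k (category-carrying) dimensions."""
    return (e @ Q.T)[k:]

def cos(a: torch.Tensor, b: torch.Tensor) -> float:
    return F.cosine_similarity(a, b, dim=0).item()

e_word = torch.randn(768)  # stand-in for e.g. a profession word

# Before: similarity gap in the original space.
gap_before = abs(cos(e_word, e_woman) - cos(e_word, e_man))
# After: similarity gap in the complement space.
gap_after = abs(
    cos(complement(e_word, Q), complement(e_woman, Q))
    - cos(complement(e_word, Q), complement(e_man, Q))
)
print(gap_before, gap_after)  # gap_after is expected to be smaller
```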

Here we present part of our results:

(Results: cosine similarities to woman and man before and after the transformation)

We find that the absolute differences between the similarities to woman and to man decrease after the transformation. This means the embeddings in the complement space carry less gender information than before.

About

This project interprets high-dimensional BERT embeddings via a distribution-based orthogonal transformation.
