This is the official implementation of ELISE (Effective and Lightweight Representation Learning for Signed Bipartite Graphs).
The paper is published in Neural Networks (Elsevier). [Paper Link]:

- Effective and Lightweight Representation Learning for Signed Bipartite Graphs
Gyeongmin Gu*, Minseo Jeon*, Hyun-Je Song, Jinhong Jung
How can we effectively and efficiently learn node representations in signed bipartite graphs? A signed bipartite graph consists of two node sets in which nodes of different types are positively or negatively connected, and it has been extensively used to model various real-world relationships such as e-commerce, peer review systems, etc. To analyze such graphs, previous studies have focused on designing methods for learning node representations using graph neural networks (GNNs). In particular, these methods insert edges between nodes of the same type based on balance theory, enabling them to leverage the augmented structures during learning. However, the existing methods rely on a naive message passing design, which is prone to over-smoothing and susceptible to noisy interactions in real-world graphs. Furthermore, they suffer from computational inefficiency due to their heavy design and the significant increase in the number of added edges.
In this paper, we propose ELISE, an effective and lightweight GNN-based approach for representation learning on signed bipartite graphs. We first extend personalized propagation to a signed bipartite graph, incorporating signed edges during message passing. This extension adheres to balance theory without introducing additional edges, mitigating the over-smoothing issue and enhancing representation power. We then jointly learn node embeddings on a low-rank approximation of the signed bipartite graph, which reduces potential noise and emphasizes its global structure, further improving expressiveness without a significant loss of efficiency. We encapsulate these ideas into ELISE, designing it to be lightweight, unlike previous methods that add too many edges and cause inefficiency. Through extensive experiments on real-world signed bipartite graphs, we demonstrate that ELISE outperforms its competitors at predicting link signs while providing faster training and inference.
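As a rough illustration of the first idea, the sketch below shows APPNP-style personalized propagation adapted to a signed biadjacency matrix in NumPy. The function names, degree normalization, and restart ratio `c` are illustrative assumptions for exposition, not the authors' actual implementation:

```python
import numpy as np

def normalize_signed(B):
    # Degree-normalize using |B| so that edge signs are preserved
    # while message magnitudes are scaled by node degrees.
    du = np.abs(B).sum(axis=1, keepdims=True).clip(min=1.0)
    dv = np.abs(B).sum(axis=0, keepdims=True).clip(min=1.0)
    return B / np.sqrt(du) / np.sqrt(dv)

def signed_personalized_propagation(B, Xu, Xv, c=0.01, num_layers=5):
    """Propagate features across a signed biadjacency matrix B
    (entries in {+1, 0, -1}) with restart to the initial features.

    Xu: initial features of U-type nodes, shape (|U|, d)
    Xv: initial features of V-type nodes, shape (|V|, d)
    c:  restart (teleport) ratio, as in personalized PageRank.
    """
    Bn = normalize_signed(B)
    Hu, Hv = Xu, Xv
    for _ in range(num_layers):
        Hu = (1.0 - c) * (Bn @ Hv) + c * Xu    # U side gathers from V
        Hv = (1.0 - c) * (Bn.T @ Hu) + c * Xv  # V side gathers from U
    return Hu, Hv
```

Because each step blends the propagated signal with the initial features, embeddings do not collapse to a single vector as depth grows, which is the over-smoothing behavior the paper targets.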
The packages used in this repository are as follows:

```
python==3.12.3
numpy==1.26.4
pytorch==2.2.1
tqdm==4.66.4
loguru==0.7.2
fire==0.7.0
```
You can create a conda environment with these packages by typing the following commands in your terminal:

```shell
conda env create --file environment.yml
conda activate ELISE
```

We provide the datasets used in the paper for reproducibility.
You can find the raw datasets in the `./datasets` folder, where each file is named `${DATASET}.tsv`.
`${DATASET}` is one of `review`, `bonanza`, `ml-1m`, and `amazon-dm`.
Each file contains a list of signed edges, where each line is a tuple of `(src, dst, sign)`.
The details of the datasets are provided in the following table:
| Dataset | $\vert\mathcal{U}\vert$ | $\vert\mathcal{V}\vert$ | $\vert\mathcal{E}\vert$ | $\vert\mathcal{E}^{+}\vert$ | $\vert\mathcal{E}^{-}\vert$ | $p$ (+) |
|---|---|---|---|---|---|---|
| Review | 182 | 304 | 1,170 | 464 | 706 | 40.3 |
| Bonanza | 7,919 | 1,973 | 36,543 | 35,805 | 738 | 98.0 |
| ML-1m | 6,040 | 3,706 | 1,000,209 | 836,478 | 163,731 | 83.6 |
| Amazon-DM | 11,796 | 16,565 | 169,781 | 165,777 | 4,004 | 97.6 |
- $|\mathcal{U}|$: the number of nodes of type $\mathcal{U}$
- $|\mathcal{V}|$: the number of nodes of type $\mathcal{V}$
- $|\mathcal{E}|$: the number of edges
- $|\mathcal{E}^{+}|$ and $|\mathcal{E}^{-}|$: the numbers of positive and negative edges, respectively
- $p$ (+): the ratio of positive edges (%)
You can run a simple demo by typing the following commands in your terminal:

```shell
cd src
python -m main --dataset review
```

You can perform the training process of ELISE with the following command:

```shell
cd src
python -m main --dataset {dataset_name} --num_layers {num_layer} --seed {seed} ...
```

The default configurations for the Review dataset are as follows:
| Option | Description | Default |
|---|---|---|
| model | model name | elise |
| dataset_name | dataset name | review |
| seed | random seed value | 600 |
| device | device name | cuda:0 |
| epochs | number of epochs | 200 |
| lr | learning rate of the optimizer | 0.0005 |
| wdc | weight decay (L2 regularization) | 0.00001 |
| num_layer | number of propagation layers | 5 |
| num_decoder_layers | number of classifier layers | 2 |
| c | ratio for personalized propagation | 0.01 |
| rank_ratio | ratio for the low-rank approximation | 0.4 |
| input_dim | model input feature dimension | 32 |
| decoder_input_dim | decoder input feature dimension | 256 |
| split_ratio | train/validation/test split ratio | [0.85, 0.05, 0.1] |
| dataset_shuffle | whether to shuffle the dataset | true |
| optimizer | optimizer name | Adam |
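Since the repo uses `fire` for its CLI, each option above can presumably be overridden as a command-line flag. The exact flag names below are assumptions based on the demo command and the option table; consult the repo's entry point (e.g. `python -m main --help`) for the authoritative list:

```shell
cd src
# Hypothetical run overriding a few defaults for the Review dataset
python -m main --dataset review --num_layers 5 --seed 600 --device cuda:0
```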
The link sign prediction performance of ELISE on each dataset is as follows:

| Dataset | AUC | Binary-F1 | Macro-F1 | Micro-F1 |
|---|---|---|---|---|
| Review | 0.764 | 0.564 | 0.674 | 0.712 |
| Bonanza | 0.698 | 0.991 | 0.557 | 0.982 |
| ML-1M | 0.829 | 0.920 | 0.685 | 0.860 |
| Amazon-DM | 0.903 | 0.990 | 0.695 | 0.980 |
All experiments were conducted on an RTX A5000 (24GB) GPU with CUDA version 12.0, and the above results were produced with the random seed `seed=600`.