ReREF

"A Unified Retrieval Framework with Document Ranking and EDU Filtering for Multi-document Summarization"

Introduction

This is the code for ReREF, a novel retrieval-based framework that integrates query selection, document ranking, and document shortening into a unified process. Our approach identifies the most salient elementary discourse units (EDUs) from the input documents and uses them as latent queries. These queries guide document ranking by computing relevance scores through a multi-query mechanism. Instead of relying on traditional truncation, our approach filters out irrelevant EDUs to fit within the context length, ensuring that only the most critical information is preserved for summarization. We evaluate our framework on multiple MDS datasets, demonstrating consistent improvements in ROUGE metrics while confirming its scalability and flexibility across diverse model architectures.
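As a rough illustration of the multi-query ranking idea, the sketch below scores each document by aggregating its cosine similarity to the latent EDU queries. The encoder choice (the multi-qa-mpnet-base-cos-v1 model mentioned in the setup) and the mean aggregation are assumptions for illustration, not the exact implementation in rank_primer.py.

from sentence_transformers import SentenceTransformer, util

# Illustrative sketch only: the encoder and the mean aggregation are assumptions.
encoder = SentenceTransformer("multi-qa-mpnet-base-cos-v1")

def rank_documents(documents, latent_edu_queries):
    # Embed documents and latent EDU queries with the same encoder.
    doc_emb = encoder.encode(documents, convert_to_tensor=True)
    query_emb = encoder.encode(latent_edu_queries, convert_to_tensor=True)
    # relevance[i, j] = cosine similarity between query i and document j.
    relevance = util.cos_sim(query_emb, doc_emb)
    # Multi-query mechanism (assumed here as a mean over queries).
    doc_scores = relevance.mean(dim=0)
    order = doc_scores.argsort(descending=True)
    return [documents[int(i)] for i in order]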

Framework

Overall Framework

Set up

  1. Create a Python environment with Python 3.9 and install the packages in requirements.txt

Install with pip (we recommend using conda):

pip install -r requirements.txt

Usage of ReREF

Usage of ReREF follows the procedure of our framework, which includes three steps: 1. Dataset creation, 2. Retriever training, 3. Summarizer training.

1. Dataset creation

  1. Download the source code of DMRST_Parser from here into ./DMRST_Parser.
  2. Install the Python package sentence_transformers and download the semantic search model multi-qa-mpnet-base-cos-v1 (a scoring sketch follows this list).
  3. Use the Jupyter notebook make dataset upload.ipynb to create the dataset for retriever training.
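The exact labeling procedure lives in make dataset upload.ipynb; as a minimal sketch of one plausible way to derive EDU salience scores with the semantic search model above, the snippet below scores each EDU by its similarity to the reference summary. Pairing EDUs with the reference summary is an assumption for illustration, not necessarily what the notebook does.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("multi-qa-mpnet-base-cos-v1")

def edu_salience_scores(edus, reference_summary):
    # Score each EDU by cosine similarity to the reference summary;
    # higher scores suggest more salient EDUs (illustrative assumption).
    edu_emb = model.encode(edus, convert_to_tensor=True)
    ref_emb = model.encode(reference_summary, convert_to_tensor=True)
    return util.cos_sim(edu_emb, ref_emb).squeeze(-1).tolist()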

2. Retriever training

After creating the dataset for retriever training, which uses ||||| to separate documents and || to separate EDUs, you can use retriever/run_train_test.sh to train/test the retriever, or use the following command to train the retriever:

python rank_primer.py --mode ${mode} --loss_type bpr \
            --model_path ${model_path} --strategy auto --dataset_name ${dataset_name}
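For reference, a minimal sketch of reading one example in this delimiter format back into documents and their EDUs (the function name is illustrative):

def split_example(source_text):
    # Documents are separated by "|||||", EDUs within a document by "||".
    documents = source_text.split("|||||")
    return [[edu.strip() for edu in doc.split("||") if edu.strip()] for doc in documents]

# Example: two documents, the first containing two EDUs.
docs_with_edus = split_example("EDU one || EDU two ||||| Another document's only EDU")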

After the retriever training, update the dataset with the retriever's predicted document ranks and EDU scores, in the same way as make dataset upload.ipynb.

3. Summarizer training

You can use summarizer/primer_main_modify.py to train/test PRIMERA with either the retriever-based truncation strategy or the original truncation strategy, and summarizer/compared_model_main_modify.py to train/test BART/PEGASUS/LED. You can follow the shell script summarizer/run_train_shell.sh for fully and few-shot supervised training of the summarizer models.

python primer_hf_main_modify.py --mode ${mode} \
        --model_path ${model_path} --beam_size 5 --batch_size 16 --strategy auto \
        --model_name ${model_name} --join_method truncate_last_ranking_filtering \
        --dataset_name ${dataset_name}
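The join_method truncate_last_ranking_filtering corresponds to the filtering strategy described in the introduction: instead of plain truncation, low-scoring EDUs are dropped until the input fits the context length. A minimal sketch of that idea, assuming per-EDU retriever scores and a token budget (the helper name and token counting are illustrative, not the actual implementation):

def filter_edus_to_budget(edus, edu_scores, max_tokens, count_tokens=lambda s: len(s.split())):
    # Drop the lowest-scoring EDUs until the remaining text fits the budget,
    # while preserving the original order of the kept EDUs.
    keep = list(range(len(edus)))
    total = sum(count_tokens(edus[i]) for i in keep)
    for i in sorted(keep, key=lambda i: edu_scores[i]):
        if total <= max_tokens:
            break
        keep.remove(i)
        total -= count_tokens(edus[i])
    return [edus[i] for i in keep]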

For the large language models Llama 3 and StableLM-Zephyr, we use LitGPT for fast and convenient training. The prompt used for LLMs is:

Prompt
The following document comes from the topic-related multiple sources. Using the information only in the document, summarize the document that the sources mainly talk about.
Document: ${document}
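When preparing LLM inputs, the ${document} placeholder is replaced with the (retrieval-filtered) concatenated documents; a minimal sketch of that substitution, with an illustrative placeholder string:

from string import Template

PROMPT = Template(
    "The following document comes from the topic-related multiple sources. "
    "Using the information only in the document, summarize the document that the sources mainly talk about.\n"
    "Document: ${document}"
)

llm_input = PROMPT.substitute(document="...filtered multi-document input...")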

Datasets

All of the original datasets are provided by PRIMERA.

  • Multi-News and Multi-XScience are downloaded automatically from Hugging Face.
  • WCEP-10: the preprocessed version can be found here.
  • Wikisum: we only use the subset version provided by PRIMERA, which can be found here.
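For example, the Hugging Face datasets library can pull Multi-News directly (the dataset identifier below is the public hub name, not necessarily the exact loading path used in our scripts):

from datasets import load_dataset

# Downloads and caches Multi-News from the Hugging Face hub.
multi_news = load_dataset("multi_news")
print(multi_news["train"][0]["document"][:200])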

Note that we do not include LightPAL and LogicSumm as baselines, as they are open-domain MDS methods that typically rely on carefully crafted queries and retrieve information from external knowledge bases, rather than focusing solely on the content within the provided multi-document inputs.

We do not include a zero-shot setting in our experiments, for the following reasons:

  • Models such as BART, BartGraphSum, and Pegasus are not pre-trained specifically for multi-document summarization (MDS) and thus require fine-tuning to perform well.
  • Although PRIMERA is pre-trained for MDS, directly integrating our retrieval module without fine-tuning would result in an unfair comparison, as its pre-training is tightly coupled with its original input format.
  • As for large language models (LLMs), we aim for a fair comparison. Without fine-tuning, LLMs produce unconvincing and low-quality summaries in both the original and retrieval-based settings.
