Learning to Refine with Fine-Grained Natural Language Feedback

July 2, 2024

This repo contains code and instructions for reproducing the experiments in the paper "Learning to Refine with Fine-Grained Natural Language Feedback". We propose a new method, Detect, Critique and Refine (DCR), for post-hoc editing of document-grounded summaries to make them more factual.

To run end-to-end editing with DCR, use the following code and arguments:

from run_end_to_end_refinement.dcr import DCR

document_instruction = ''  # source document with the summarization instruction
initial_response = ''  # initial response to be refined
model = "llama3-ft"  # critique and refinement model: could be any HF model or GPT-4
# path_to_minicheck and cache_dir below are the authors' local paths; replace them with your own
dcr = DCR(cuda_id=0, model_name=model, path_to_minicheck="/home/mwadhwa/code/MiniCheck/", cache_dir="/data/users/mwadhwa/")
refinement = dcr.refine(source_text=document_instruction, initial_response=initial_response)
print(refinement)
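
To refine several summaries in one pass, the call above can be wrapped in a small loop. The following is a minimal sketch and not part of the repo: the `refine_all` helper and the `(source_text, initial_response)` pair layout are hypothetical, and the only thing assumed about the refiner is the `refine(source_text=..., initial_response=...)` call shown above.

```python
from typing import Iterable, List, Tuple


def refine_all(refiner, pairs: Iterable[Tuple[str, str]]) -> List[str]:
    """Refine each (source_text, initial_response) pair, preserving order.

    `refiner` is any object exposing refine(source_text=..., initial_response=...),
    such as the DCR instance created above. This helper is hypothetical and
    not part of the repo.
    """
    refinements = []
    for source_text, initial_response in pairs:
        refinements.append(
            refiner.refine(source_text=source_text, initial_response=initial_response)
        )
    return refinements
```

With the `dcr` object from the snippet above, this would be called as `refine_all(dcr, [(document_instruction, initial_response)])`.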

Models

Our fine-tuned feedback and refinement models are available on HuggingFace 🤗:

Critique Model: Llama2-7b-Chat Fine-Tuned / Llama3-8b-Instruct Fine-Tuned
Refinement Model: Llama2-7b-Chat Fine-Tuned / Llama3-8b-Instruct Fine-Tuned

Data for fine-tuning

The fine-tuning data distilled from GPT-4 is available on HuggingFace: https://huggingface.co/datasets/wadhma/dcr_data

Setup

You need to set up the following:

pip install -r requirements.txt
Set up MiniCheck here

Evaluation

We use the following metrics for evaluation:

AlignScore (here)
GPT-4 Likert Score on a scale of 1-5
GPT-4 pairwise score
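
Once GPT-4 judgments have been collected, the Likert and pairwise metrics reduce to simple aggregates. The sketch below is illustrative only: the function names, the `"win"` / `"tie"` / `"loss"` labels, and the input layout are assumptions, not the repo's evaluation code, and it does not call GPT-4 or AlignScore.

```python
from collections import Counter
from statistics import mean
from typing import Dict, Iterable

# Hypothetical label set for pairwise judgments (refined vs. initial summary).
PAIRWISE_LABELS = ("win", "tie", "loss")


def mean_likert(scores: Iterable[int]) -> float:
    """Average Likert scores, checking that each lies on the 1-5 scale."""
    scores = list(scores)
    if not scores:
        raise ValueError("no Likert scores given")
    if any(not 1 <= s <= 5 for s in scores):
        raise ValueError("Likert scores must be between 1 and 5")
    return float(mean(scores))


def pairwise_rates(judgments: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of judgments carrying each pairwise label."""
    counts = Counter(judgments)
    unknown = set(counts) - set(PAIRWISE_LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pairwise judgments given")
    return {label: counts[label] / total for label in PAIRWISE_LABELS}
```

Reporting the tie rate alongside the win rate keeps ties from being silently counted as wins or losses.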
