LaCour! Questions and Opinions

December 13, 2023

Companion dataset and experiments to the arXiv preprint presenting the LaCour! subset Questions and Opinions.

Dataset

Loading the dataset

To use the dataset, all you need is dataset_questions_opinions.json.

You can load the dataset with pandas, for example:

import pandas as pd
df = pd.read_json('dataset_questions_opinions.json', dtype={'webcast_id': str})

The dataset contains the following fields:

  • webcast_id: the identifier of the associated hearing webcast; it can be used to link to the rest of the LaCour! corpus
  • name: the name of the judge (only used to link question and opinion)
  • has_question: a boolean indicating whether the judge asked a question during the hearing
  • has_opinion: a boolean indicating whether the judge wrote an opinion in the subsequent judgment
  • language: the language of the question
  • question: the question asked by the judge
  • case_id: the identifier of the relevant judgment document
  • opinion: a tuple containing the opinion title and the full text of the judge's opinion
  • opinion_type: the inferred type of the opinion (label), one of PARTLY, DISSENTING, CONCURRING, OPINION, or UNKOWN if the type cannot be determined
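The fields above can be combined to pick out judges who both asked a question and wrote an opinion. The following is a minimal sketch, not part of the repository. The helper names `questions_with_opinions`, `split_opinions` and `opinion_type_counts` are hypothetical, and the code assumes that `has_question` and `has_opinion` load as boolean columns and that `opinion` is a two-element sequence (title, text) wherever `has_opinion` is true.

```python
import pandas as pd


def questions_with_opinions(df: pd.DataFrame) -> pd.DataFrame:
    """Keep rows where the judge both asked a question and wrote an opinion."""
    # Assumption: both flag columns have boolean dtype after loading.
    mask = df["has_question"] & df["has_opinion"]
    columns = ["webcast_id", "case_id", "name", "language", "question", "opinion_type"]
    return df.loc[mask, columns]


def split_opinions(df: pd.DataFrame) -> pd.DataFrame:
    """Split the `opinion` field into separate title and text columns."""
    # Assumption: `opinion` is a (title, text) pair on rows with an opinion.
    with_opinion = df[df["has_opinion"]].copy()
    with_opinion["opinion_title"] = with_opinion["opinion"].apply(lambda o: o[0])
    with_opinion["opinion_text"] = with_opinion["opinion"].apply(lambda o: o[1])
    return with_opinion


def opinion_type_counts(df: pd.DataFrame) -> pd.Series:
    """Count the opinion_type labels over rows that have an opinion."""
    return df.loc[df["has_opinion"], "opinion_type"].value_counts()
```

With `df` loaded as shown under "Loading the dataset", `questions_with_opinions(df)` would return the question and opinion-type pairs, and `opinion_type_counts(df)` would give the label distribution.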

Creating the dataset

[To be updated] To create the dataset yourself, you first have to extract all relevant information. For this, please refer to the creation scripts in lacour-generation.

conda create -n lacour-qando python=3.9
conda activate lacour-qando
git clone https://github.com/trusthlt/lacour-qando.git
cd lacour-qando
pip install -r requirements.txt

Experiments

[tbd]