Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism (EMNLP 2023)

November 5, 2025

This repository contains the official code for the EMNLP 2023 paper: "Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism".

Experimental Overview

The experimental pipeline consists of two main stages:

  1. Data Generation: Generate vocabulary and question datasets tailored for different experimental settings.
  2. Experiment Execution: Run experiments using various models and experimental configurations.

Setup

  1. Clone the repository:

    git clone https://github.com/muyo8692/stepbystep-reasoning-vs-negation.git
    cd stepbystep-reasoning-vs-negation
    
  2. Install dependencies:

    pip install -r requirements.txt
    

Running Experiments

Step 1: Generate Data

Execute the following script to generate the necessary datasets for all experiments.

    bash src/scripts/generate_data.sh

This will populate the data/ directory with the required vocabulary and question files.

Step 2: Run an Experiment

Run experiments with src/scripts/run_experiment.sh. The model, reasoning setting, task domain, task type, and GPU allocation are all set through command-line arguments.

Experimental Settings

The --exp_level argument controls the reasoning setting, as described in the paper:

Setting  | Description                                                        | exemplars_type | questions_type | vocab_type
---------|--------------------------------------------------------------------|----------------|----------------|-----------
BASE     | Standard syllogistic reasoning with real-world knowledge.          | base           | base           | real
FIC      | Reasoning with fictional knowledge to test logical deduction.      | base           | base           | fiction
FICNEG   | In-domain negation: both exemplars and questions contain negation. | neg            | neg            | fiction
FICNEG-O | Out-of-domain negation: only questions contain negation.           | base           | neg            | fiction
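
The settings table can be read as a lookup from a setting name to three configuration values. The Python sketch below shows that reading. The names `Setting`, `SETTINGS`, and `resolve_setting` are illustrative assumptions and are not taken from the repository, whose internal representation may differ.

```python
from typing import NamedTuple


class Setting(NamedTuple):
    exemplars_type: str
    questions_type: str
    vocab_type: str


# Values copied from the settings table in this README. The dict itself is a
# hypothetical illustration, not the repository's actual data structure.
SETTINGS = {
    "BASE": Setting("base", "base", "real"),
    "FIC": Setting("base", "base", "fiction"),
    "FICNEG": Setting("neg", "neg", "fiction"),
    "FICNEG-O": Setting("base", "neg", "fiction"),
}


def resolve_setting(exp_level: str) -> Setting:
    """Return the three configuration values for an --exp_level name."""
    try:
        return SETTINGS[exp_level]
    except KeyError:
        raise ValueError(f"unknown exp_level: {exp_level!r}") from None


if __name__ == "__main__":
    for name, setting in SETTINGS.items():
        print(name, setting)
```

Read this way, FICNEG and FICNEG-O differ only in `exemplars_type`. That difference is what separates in-domain from out-of-domain negation.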

Examples

  • Run the default experiment (BASE setting with GPT-3.5):

    bash src/scripts/run_experiment.sh
    
  • Run the FICNEG-O experiment on the occupation domain with opt-175b on a specific GPU:

    bash src/scripts/run_experiment.sh \
        --model_name opt-175b \
        --exp_level FICNEG-O \
        --task_domain occupation \
        --gpu_num 1 \
        --certain_gpus_list 0
    
  • Run an exemplar_reorder experiment in the FIC setting:

    bash src/scripts/run_experiment.sh \
        --model_name openai-gpt-4 \
        --exp_level FIC \
        --task_domain sports \
        --task_type exemplar_reorder \
        --exemplar_order_list 'exemplar_a' 'exemplar_b' 'exemplar_c' \
        --exemplar_label_str 'yes_no_no'
    

Results will be saved to the output/ directory, organized by date and model name.
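
To find the latest results after a run, one option is to list the most recently modified files under output/. The Python sketch below makes no assumption about the exact subdirectory names or file formats. The helper `newest_files` is hypothetical and not part of the repository.

```python
from pathlib import Path


def newest_files(root="output", limit=5):
    """Return up to `limit` files under `root`, most recently modified first."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []  # nothing has been written yet
    files = [p for p in root_path.rglob("*") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    for path in newest_files():
        print(path)
```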

Cite our work

@inproceedings{ye-etal-2023-assessing,
    title = "Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism",
    author = "Ye, Mengyu and Kuribayashi, Tatsuki and Suzuki, Jun and Kobayashi, Goro and Funayama, Hiroaki",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    year = "2023",
    url = "https://aclanthology.org/2023.emnlp-main.912/",
}