Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism (EMNLP 2023)
November 5, 2025
This repository contains the official code for the EMNLP 2023 paper: "Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism".
Experimental Overview
The experimental pipeline consists of two main stages:
- Data Generation: Generate vocabulary and question datasets tailored for different experimental settings.
- Experiment Execution: Run experiments using various models and experimental configurations.
Setup
- Clone the repository:

      git clone https://github.com/muyo8692/stepbystep-reasoning-vs-negation.git
      cd stepbystep-reasoning-vs-negation

- Install dependencies:

      pip install -r requirements.txt
Running Experiments
Step 1: Generate Data
Execute the following script to generate the necessary datasets for all experiments.
    bash src/scripts/generate_data.sh

This populates the `data/` directory with the required vocabulary and question files.
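As a quick sanity check after data generation, the contents of `data/` can be listed from Python. This is a minimal sketch: the helper name `list_generated_files` is hypothetical, and since the exact file names produced by the script are not documented here, it simply reports whatever files it finds.

```python
from pathlib import Path
from typing import List


def list_generated_files(data_dir: str = "data") -> List[str]:
    """Return the sorted relative paths of all files under data_dir.

    Hypothetical helper, not part of the repository. It makes no
    assumption about which files generate_data.sh writes.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"{data_dir!r} does not exist; run generate_data.sh first")
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    # Run from the repository root after generating the data.
    for name in list_generated_files("data"):
        print(name)
```

An empty listing would suggest that the generation script did not complete.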
Step 2: Run an Experiment
Use `src/scripts/run_experiment.sh` to run experiments. The script is configured through command-line arguments.
Experimental Settings
The --exp_level argument controls the reasoning setting, as described in the paper:
| Setting | Description | exemplars_type | questions_type | vocab_type |
|---|---|---|---|---|
| BASE | Standard syllogistic reasoning with real-world knowledge. | base | base | real |
| FIC | Reasoning with fictional knowledge to test logical deduction. | base | base | fiction |
| FICNEG | In-domain negation: both exemplars and questions contain negation. | neg | neg | fiction |
| FICNEG-O | Out-of-domain negation: only questions contain negation. | base | neg | fiction |
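The table above can be read as a lookup from setting name to three values. The sketch below encodes it that way in Python. The names `SettingConfig`, `EXP_LEVELS`, and `resolve_setting` are hypothetical and chosen for illustration; the repository may represent these settings differently.

```python
from typing import NamedTuple


class SettingConfig(NamedTuple):
    exemplars_type: str
    questions_type: str
    vocab_type: str


# Values copied from the settings table. The structure itself is an
# assumption made for illustration, not the repository's own code.
EXP_LEVELS = {
    "BASE": SettingConfig("base", "base", "real"),
    "FIC": SettingConfig("base", "base", "fiction"),
    "FICNEG": SettingConfig("neg", "neg", "fiction"),
    "FICNEG-O": SettingConfig("base", "neg", "fiction"),
}


def resolve_setting(exp_level: str) -> SettingConfig:
    """Look up the three type values for a given setting name."""
    try:
        return EXP_LEVELS[exp_level]
    except KeyError:
        known = ", ".join(EXP_LEVELS)
        raise ValueError(f"unknown exp_level {exp_level!r}; expected one of: {known}") from None


if __name__ == "__main__":
    for level in EXP_LEVELS:
        print(level, resolve_setting(level))
```

Read this way, `FICNEG-O` differs from `FICNEG` only in `exemplars_type`: the exemplars stay negation-free while the questions contain negation.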
Examples
- Run the default experiment (`BASE` setting with GPT-3.5):

      bash src/scripts/run_experiment.sh

- Run the `FICNEG-O` experiment on the `occupation` domain with `opt-175b` on a specific GPU:

      bash src/scripts/run_experiment.sh \
        --model_name opt-175b \
        --exp_level FICNEG-O \
        --task_domain occupation \
        --gpu_num 1 \
        --certain_gpus_list 0

- Run an `exemplar_reorder` experiment in the `FIC` setting:

      bash src/scripts/run_experiment.sh \
        --model_name openai-gpt-4 \
        --exp_level FIC \
        --task_domain sports \
        --task_type exemplar_reorder \
        --exemplar_order_list 'exemplar_a' 'exemplar_b' 'exemplar_c' \
        --exemplar_label_str 'yes_no_no'
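When sweeping over several models or settings, it can help to assemble these invocations programmatically. The following is a minimal sketch, assuming the flags behave as in the examples above. The helper `build_command` is hypothetical and not part of the repository; it only builds the argument list and does not launch anything.

```python
import shlex
from typing import List, Mapping, Sequence, Union

OptionValue = Union[str, int, Sequence[str]]

# Script path as used in the examples above.
SCRIPT = "src/scripts/run_experiment.sh"


def build_command(options: Mapping[str, OptionValue]) -> List[str]:
    """Turn {"flag": value} pairs into an argument list for run_experiment.sh.

    List or tuple values are expanded into several arguments after the flag,
    mirroring how --exemplar_order_list is passed in the examples.
    """
    command = ["bash", SCRIPT]
    for flag, value in options.items():
        command.append(f"--{flag}")
        if isinstance(value, (list, tuple)):
            command.extend(str(item) for item in value)
        else:
            command.append(str(value))
    return command


if __name__ == "__main__":
    command = build_command({
        "model_name": "openai-gpt-4",
        "exp_level": "FIC",
        "task_domain": "sports",
        "task_type": "exemplar_reorder",
        "exemplar_order_list": ["exemplar_a", "exemplar_b", "exemplar_c"],
        "exemplar_label_str": "yes_no_no",
    })
    # Print a shell-ready string for inspection before running anything.
    print(shlex.join(command))
```

To actually launch the run, the list can be passed to `subprocess.run(command, check=True)` from the repository root. Passing a list rather than a single string avoids shell quoting issues.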
Results will be saved to the output/ directory, organized by date and model name.
Cite our work
@inproceedings{ye-etal-2023-assessing,
title = "Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism",
author = "Ye, Mengyu and Kuribayashi, Tatsuki and Suzuki, Jun and Kobayashi, Goro and Funayama, Hiroaki",
booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
year = "2023",
url = "https://aclanthology.org/2023.emnlp-main.912/",
}