
January 3, 2026

Brainteaser


This is the official implementation of "Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models" (NeurIPS 2025).

Citation

@article{han2025creativity,
  title={Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models},
  author={Han, Simeng and Dai, Howard and Xia, Stephen and Zhang, Grant and Liu, Chen and Chen, Lichang and Nguyen, Hoang Huy and Mei, Hongyuan and Mao, Jiayuan and McCoy, R. Thomas},
  journal={Advances in Neural Information Processing Systems},
  year={2025}
}

Usage

(Note: the command below is pending an update.)

python queryGPT.py \
    --name [Name of your experiment] \
    --dataset [Math/Logic] \
    --prompt ["promptInstructions.txt"] \
    --rows [Default is 1] \
    --samples [Default is 1]

# --name:    the name of your experiment
# --dataset: Math or Logic
# --prompt:  your instructions file
# --rows:    the number of problems to sample from the dataset (default: 1)
# --samples: the number of times to query each problem (default: 1)
# Results are saved in ./results
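
To make the flags and their defaults concrete, here is a minimal sketch of how a command-line parser for them could look. This is not the parser in queryGPT.py, which may differ. The function name `build_parser`, the help strings, the experiment name `demo`, and the choice to make `--name`, `--dataset`, and `--prompt` required are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the repository's code)."""
    parser = argparse.ArgumentParser(description="Query a model on brainteaser problems.")
    # Assumption: the first three flags are required; the README only lists them.
    parser.add_argument("--name", required=True, help="Name of the experiment.")
    parser.add_argument("--dataset", required=True, choices=["Math", "Logic"],
                        help="Which dataset to use.")
    parser.add_argument("--prompt", required=True, help="Instructions file.")
    # The README documents a default of 1 for both of these flags.
    parser.add_argument("--rows", type=int, default=1,
                        help="Number of problems to sample from the dataset.")
    parser.add_argument("--samples", type=int, default=1,
                        help="Number of times to query each problem.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is easy to try.
args = build_parser().parse_args(
    ["--name", "demo", "--dataset", "Math", "--prompt", "promptInstructions.txt"]
)
print(vars(args))
```

Omitting `--rows` and `--samples`, as in the example call, falls back to the documented default of 1 for each.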

Dependencies

We developed the codebase in a Miniconda environment, which we created as follows:

# Optional: Update to libmamba solver.
conda update -n base conda
conda install -n base conda-libmamba-solver
conda config --set solver libmamba

conda create --name bt pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia -c anaconda -c conda-forge -y
conda activate bt
conda install scikit-learn scikit-image pandas matplotlib seaborn tqdm -c pytorch -c anaconda -c conda-forge -y

python -m pip install beautifulsoup4 requests
python -m pip install nltk

python -m pip install openai
python -m pip install google-genai
python -m pip install anthropic
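
After installation, a quick sanity check can confirm that the packages above are importable from the active environment. The sketch below is not part of the repository; the helper name `find_missing` is hypothetical. The `MODULES` list uses import names, which differ from the install names for some packages (for example, `scikit-learn` is imported as `sklearn` and `beautifulsoup4` as `bs4`).

```python
import importlib.util

# Import names (not conda/pip package names) for the dependencies listed above.
MODULES = [
    "torch", "torchvision", "torchaudio",
    "sklearn", "skimage", "pandas", "matplotlib", "seaborn", "tqdm",
    "bs4", "requests", "nltk",
    "openai", "google.genai", "anthropic",
]


def find_missing(modules):
    """Return the module names that cannot be located in the current environment."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is itself unavailable.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


print("Missing modules:", find_missing(MODULES) or "none")
```

Run it inside the activated `bt` environment; any name it reports points to a package that still needs to be installed.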