ir_datasets

May 28, 2026

⚠️ This repository has moved to https://github.com/ir-datasets/ir-datasets

ir_datasets is a Python package that provides a common interface to many IR ad-hoc ranking benchmarks, training datasets, and related resources.

It can now be found at https://github.com/ir-datasets/ir-datasets.
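As a rough illustration of what a common interface means in practice, here is a minimal sketch. It assumes the package is installed (`pip install ir_datasets`) and uses the small `vaswani` collection purely as an example dataset ID. The helper names `preview` and `load_and_preview` are hypothetical and not part of the package; only `ir_datasets.load`, `docs_iter`, and `queries_iter` come from the library.

```python
from itertools import islice


def preview(dataset, limit=3):
    """Return the first `limit` documents and queries of a dataset.

    Works with any object exposing docs_iter() and queries_iter(),
    which is the iteration interface ir_datasets datasets provide.
    """
    docs = list(islice(dataset.docs_iter(), limit))
    queries = list(islice(dataset.queries_iter(), limit))
    return docs, queries


def load_and_preview(dataset_id="vaswani", limit=3):
    """Hypothetical helper: load a dataset by ID and print a short preview."""
    import ir_datasets  # imported here so `preview` stays usable without the package

    dataset = ir_datasets.load(dataset_id)  # data is downloaded on first use
    docs, queries = preview(dataset, limit)
    for doc in docs:
        print(doc.doc_id, doc.text[:60])
    for query in queries:
        print(query.query_id, query.text)
```

Calling `load_and_preview()` should print a few document and query records. Because every dataset is accessed through the same iterator methods, swapping in a different dataset ID should not require changes to `preview`.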

@inproceedings{DBLP:conf/sigir/MacAvaneyYFDCG21,
  author       = {Sean MacAvaney and
                  Andrew Yates and
                  Sergey Feldman and
                  Doug Downey and
                  Arman Cohan and
                  Nazli Goharian},
  title        = {Simplified Data Wrangling with ir{\_}datasets},
  booktitle    = {{SIGIR}},
  pages        = {2429--2436},
  publisher    = {{ACM}},
  year         = {2021}
}