SemEval-2026 Task 7: Everyday Knowledge Across Diverse Languages and Cultures Trial Data

September 16, 2025

This README describes the datasets trial_data_multiple_choice.tsv and trial_data_unique_answer.tsv, provided as trial data for SemEval-2026 Task 7: Everyday Knowledge Across Diverse Languages and Cultures.

trial_data_multiple_choice.tsv

This file contains the trial data with multiple-choice options.

Columns:

  • index: Data point index.

  • lang_reg: Language and region code.

  • The mapping between the codes and the full language-country names is as follows:

| Code | Language – Region |
| --- | --- |
| ms-SG | Malay (Singapore) |
| ta-SG | Tamil (Singapore) |
| zh-SG | Chinese (Singapore) |
| es-EC | Spanish (Ecuador) |
| en-GB | English (United Kingdom) |
| zh-CN | Chinese (China) |
| es-ES | Spanish (Spain) |
| es-MX | Spanish (Mexico) |
| id-ID | Indonesian (Indonesia) |
| ko-KR | Korean (South Korea) |
| el-GR | Greek (Greece) |
| fa-IR | Persian/Farsi (Iran) |
| ar-EG | Arabic (Egypt) |
| ar-MA | Arabic (Morocco) |
| ar-SA | Arabic (Saudi Arabia) |
| en-AU | English (Australia) |
| eu-ES | Basque (Spain – Basque Country) |
| fr-FR | French (France) |
| ga-IE | Irish (Ireland) |
| ta-LK | Tamil (Sri Lanka) |
| tl-PH | Tagalog (Philippines) |
| bg-BG | Bulgarian (Bulgaria) |
| ja-JP | Japanese (Japan) |

  • question: The question text.

  • multiple_choice_options: The multiple-choice options for the question. Newline characters separate individual options.

  • correct_answer: The correct answer to the question. Leading and trailing whitespace has been removed from this column.
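The column layout above can be read with Python's standard csv module. The following is a minimal sketch, not an official loader: it assumes the file has a header row with the column names listed above and that cells containing newlines are quoted. The helper name `read_multiple_choice` and the sample row are hypothetical and are not taken from the trial data.

```python
import csv
import io


def read_multiple_choice(handle):
    """Yield one dict per data row, adding an 'options' list to each.

    Assumes a header row naming the columns described in this README.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        # The options share one cell and are separated by newlines.
        cell = row["multiple_choice_options"]
        row["options"] = [opt.strip() for opt in cell.splitlines() if opt.strip()]
        yield row


# Hypothetical sample, invented for illustration only.
sample = (
    "index\tlang_reg\tquestion\tmultiple_choice_options\tcorrect_answer\n"
    '0\ten-GB\tExample question?\t"first option\nsecond option\nthird option"\tsecond option\n'
)

rows = list(read_multiple_choice(io.StringIO(sample)))
print(rows[0]["lang_reg"], rows[0]["options"])
```

To read the real file instead of the sample, pass a handle such as `open("trial_data_multiple_choice.tsv", encoding="utf-8", newline="")`. The `newline=""` argument lets the csv module handle line breaks inside quoted cells itself.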

trial_data_unique_answer.tsv

This file contains the trial data with only the correct answer, excluding the multiple-choice options.

Columns:

  • index: Data point index.

  • lang_reg: Language and region code.

  • The mapping between the codes and the full language-country names is as follows:

| Code | Language – Region |
| --- | --- |
| ms-SG | Malay (Singapore) |
| ta-SG | Tamil (Singapore) |
| zh-SG | Chinese (Singapore) |
| es-EC | Spanish (Ecuador) |
| en-GB | English (United Kingdom) |
| zh-CN | Chinese (China) |
| es-ES | Spanish (Spain) |
| es-MX | Spanish (Mexico) |
| id-ID | Indonesian (Indonesia) |
| ko-KR | Korean (South Korea) |
| el-GR | Greek (Greece) |
| fa-IR | Persian/Farsi (Iran) |
| ar-EG | Arabic (Egypt) |
| ar-MA | Arabic (Morocco) |
| ar-SA | Arabic (Saudi Arabia) |
| en-AU | English (Australia) |
| eu-ES | Basque (Spain – Basque Country) |
| fr-FR | French (France) |
| ga-IE | Irish (Ireland) |
| ta-LK | Tamil (Sri Lanka) |
| tl-PH | Tagalog (Philippines) |
| bg-BG | Bulgarian (Bulgaria) |
| ja-JP | Japanese (Japan) |

  • question: The question text.

  • correct_answer: The correct answer to the question.
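
Because lang_reg combines a language code and a region code, it can be handy to split it and to check how many data points each pair contributes. The sketch below illustrates one way to do that, assuming a header row with the column names listed above. The helper names `split_lang_reg` and `count_by_lang_reg` and the sample rows are hypothetical and are not taken from the trial data.

```python
import csv
import io
from collections import Counter


def split_lang_reg(code):
    """Split a code such as 'eu-ES' into (language, region)."""
    language, region = code.split("-", 1)
    return language, region


def count_by_lang_reg(handle):
    """Count data rows per lang_reg value in a tab-separated file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row["lang_reg"] for row in reader)


# Hypothetical sample, invented for illustration only.
sample = (
    "index\tlang_reg\tquestion\tcorrect_answer\n"
    "0\tta-SG\tExample question 1?\texample answer 1\n"
    "1\tta-LK\tExample question 2?\texample answer 2\n"
    "2\tta-SG\tExample question 3?\texample answer 3\n"
)

counts = count_by_lang_reg(io.StringIO(sample))
for code, n in sorted(counts.items()):
    language, region = split_lang_reg(code)
    print(code, language, region, n)
```

For the real data, replace the sample handle with `open("trial_data_unique_answer.tsv", encoding="utf-8", newline="")`.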