File Formats Supported by Different Tasks

October 14, 2022

The table below lists the file formats supported for each task.

For dataset files, if your dataset is already supported by datalab, you don't need to prepare it yourself. Otherwise, you can upload a custom dataset in one of the supported formats.

You may refer to the example files for more specific information about the formats, or click on the task link for more explanation.

To upload custom features and analyses, please refer to this instruction.

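| Task | File Type | File Format | Example File |
| --- | --- | --- | --- |
| conditional generation (machine translation/summarization) | dataset | TSV | cnndm_mini-dataset.tsv |
| conditional generation (machine translation/summarization) | dataset | JSON | conala-dataset.json |
| conditional generation (machine translation/summarization) | output | JSON | conala-baseline-output.json |
| conditional generation (machine translation/summarization) | output | TXT | cnndm_mini-bart-output.txt |
| text classification | dataset | TSV | sst2-dataset.tsv |
| text classification | dataset | JSON | text-classification-dataset.json |
| text classification | output | JSON | text-classification-output.json |
| text classification | output | TXT | sst2-lstm-output.txt |
| sequence labeling (NER/word segmentation/chunking) | dataset | CoNLL | conll2003-dataset.conll |
| sequence labeling (NER/word segmentation/chunking) | output | CoNLL | conll2003-elmo-output.conll |
| sequence labeling (NER/word segmentation/chunking) | output | JSON | |
| cloze multiple choice | dataset | JSON | |
| cloze multiple choice | output | JSON | |
| cloze generative | dataset | JSON | |
| cloze generative | output | JSON | |
| QA (extractive) | dataset | JSON | squad_mini-dataset.json |
| QA (extractive) | output | JSON | squad_mini-example-output.json |
| QA (MCQ) | dataset | JSON | fig_qa-dataset.json |
| QA (MCQ) | output | JSON | fig_qa-bert-output.json |
| QA (open domain) | dataset | JSON | |
| QA (open domain) | output | TXT | test.dpr.nq.txt |
| aspect-based sentiment analysis | dataset | TSV | absa-dataset.tsv |
| aspect-based sentiment analysis | dataset | JSON | |
| aspect-based sentiment analysis | output | JSON | absa-example-output-confidence.json |
| aspect-based sentiment analysis | output | TXT | absa-example-output.txt |
| grammatical error correction | dataset | JSON | |
| grammatical error correction | output | JSON | |
| text pair classification | dataset | TSV | snli-dataset.tsv |
| text pair classification | dataset | JSON | |
| text pair classification | output | JSON | |
| text pair classification | output | TXT | snli-roberta-output.txt |
| knowledge graph link tail prediction | dataset | JSON | |
| knowledge graph link tail prediction | output | JSON | |
| language modeling | dataset | JSON | |
| language modeling | dataset | TXT | |
| language modeling | output | JSON | |
| language modeling | output | TXT | |
| tabular classification | dataset | JSON | sst2-tabclass-dataset.json |
| tabular classification | output | JSON | |
| tabular classification | output | TXT | |
| tabular regression | dataset | JSON | sst2-tabreg-dataset.json |
| tabular regression | output | JSON | |
| tabular regression | output | TXT | sst2-tabreg-lstm-output.txt |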
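
Before uploading a file, it can help to check its format against the table above. The snippet below is a minimal sketch of such a check, not part of the project's API: `SUPPORTED_FORMATS` and `is_supported` are hypothetical names, the mapping is transcribed by hand from a few rows of the table, and it assumes the file extension matches the format name, as it does for the example files listed.

```python
# Hypothetical helper for illustration only; it is not provided by the project.
# The mapping is transcribed from a subset of the table rows above.
SUPPORTED_FORMATS = {
    ("text classification", "dataset"): {"tsv", "json"},
    ("text classification", "output"): {"json", "txt"},
    ("sequence labeling", "dataset"): {"conll"},
    ("sequence labeling", "output"): {"conll", "json"},
    ("QA (extractive)", "dataset"): {"json"},
    ("QA (extractive)", "output"): {"json"},
    ("QA (open domain)", "dataset"): {"json"},
    ("QA (open domain)", "output"): {"txt"},
}


def is_supported(task, file_type, filename):
    """Return True if the file's extension is a format listed for the task.

    Assumption: the extension (text after the last dot) names the format.
    """
    if "." not in filename:
        return False
    extension = filename.rsplit(".", 1)[-1].lower()
    # Unknown (task, file_type) pairs have no listed formats.
    return extension in SUPPORTED_FORMATS.get((task, file_type), set())


if __name__ == "__main__":
    print(is_supported("text classification", "dataset", "sst2-dataset.tsv"))
    print(is_supported("QA (open domain)", "output", "predictions.json"))
```

A check like this only validates the extension. The required fields or columns inside each file are described by the example files and the task pages.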