TabFlash

November 27, 2025

Official implementation of TabFlash: Efficient Table Understanding with Progressive Question Conditioning and Token Focusing, in collaboration with Google Cloud AI, accepted at AAAI 2026 (Main Technical Track).

Overview

🤖 TabFlash is an efficient and accurate multimodal LLM for table understanding, achieving state-of-the-art performance (outperforming GPT-4o and Gemini 2.5 Pro) at exceptionally low computational cost.

🚀 TabFlash (3B) achieves state-of-the-art performance while reducing FLOPs by 27% and memory usage by 30% compared to the second-best MLLM.

⚡ TabFlash (1B) outperforms most MLLMs with exceptionally low TFLOPs and just 11.2 GB of peak memory, enabling deployment on low-memory GPUs.

Accuracy vs TFLOPs Plot

Setup

This code is tested with Python 3.9, CUDA 12.4, PyTorch 2.4.1, and FlashAttention 2.7.3.

1. Create Conda Environment

conda create -n tabflash python=3.9 -y
conda activate tabflash

2. Install InternVL-2.5

Follow the official guide.

cd InternVL
pip install -r requirements.txt

3. Install PyTorch

pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124

4. Install Flash Attention v2.7.3

git clone --branch v2.7.3 --single-branch https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
python setup.py install
cd ..

5. Install Additional Dependencies

pip install wandb sacrebleu distance apted bitsandbytes --upgrade
pip install datasets==2.18.0
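As a quick sanity check that the environment above is complete, the following sketch verifies that each package can be found by Python. The module list is assumed from the setup steps (note that flash-attention installs under the import name flash_attn):

```python
# Sanity-check that the packages installed above are importable.
# Module list assumed from the setup steps; adjust to your environment.
import importlib.util

def installed(module_name: str) -> bool:
    """Return True if the module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None

# Import names, not pip names (e.g. flash-attention imports as flash_attn).
for module in ["torch", "torchvision", "flash_attn", "datasets", "sacrebleu"]:
    print(f"{module}: {'OK' if installed(module) else 'MISSING'}")
```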

Dataset Preparation

TabFlash uses MMTab from Table-LLaVA.

1. Download MMTab-pre (Pretraining)

  1. Download MMTab-instruct_table_images_82K.zip and MMTab-pre_table_images_part_2_16K.zip.
  2. Place them under data/LLaVA-Pretrain/images and unzip them. Rename the resulting IID_train_image directory to table_pretrain_part_1.
  3. Download table_only_pretrain_data_with_length.jsonl. Place it under data/LLaVA-Pretrain.

2. Download MMTab-instruct (Instruction Fine-tuning)

  1. Download MMTab-instruct_table_images_82K.zip.
  2. Place it under data/LLaVA-Finetune/images/table_instructV and unzip it. Rename the resulting IID_train_image directory to images.
  3. Download table_only_sft_data_with_length.jsonl. Place it under data/LLaVA-Finetune.
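The training jsonl files follow the LLaVA-style conversation format used by Table-LLaVA. Below is a minimal sketch of parsing one such record; the field names ("id", "image", "conversations") are the common LLaVA convention, assumed here rather than verified against the downloaded files, and the sample values are made up:

```python
# Parse one LLaVA-style jsonl record. Field names follow the usual LLaVA
# convention and are assumptions here; the sample content is synthetic.
import json

sample_line = json.dumps({
    "id": "TABMWP_0",  # hypothetical example id
    "image": "table_instructV/images/TABMWP_0.jpg",
    "conversations": [
        {"from": "human", "value": "<image>\nWhat is the total of row 2?"},
        {"from": "gpt", "value": "42"},
    ],
})

record = json.loads(sample_line)
question = record["conversations"][0]["value"]
answer = record["conversations"][1]["value"]
print(record["id"], "->", answer)
```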

3. Download MMTab-eval (Inference)

  1. Download MMTab-eval_test_data_49K_llava_jsonl_format.jsonl and MMTab-eval_table_images_23K.zip.
  2. Place them under data/LLaVA-Inference and unzip the image archive.

4. Download files for evaluation

  1. Download MMTab-eval_test_data_49K.json and MMTab-eval_test_tables_23K.json.
  2. Place them under data/MMTab-eval_evaluation.

Final file structure

TabFlash/
├── InternVL/
│   ├── internvl_chat/
│   │   ├── scripts/
│   │   ├── inference.py
│   │   ├── mmtab_eval.py
│   │   └── ...
│   └── ...
├── data/
│   ├── LLaVA-Pretrain/
│   │   ├── images/
│   │   │   ├── table_pretrain_part_1/
│   │   │   └── table_pretrain_part_2/
│   │   └── table_only_pretrain_data_with_length.jsonl
│   ├── LLaVA-Finetune/
│   │   ├── images/
│   │   │   └── table_instructV/
│   │   │       └── images/
│   │   └── table_only_sft_data_with_length.jsonl
│   ├── LLaVA-Inference/
│   │   ├── all_test_image/
│   │   └── MMTab-eval_test_data_49K_llava_jsonl_format.jsonl
│   └── MMTab-eval_evaluation/
│       ├── MMTab-eval_test_data_49K.json
│       └── MMTab-eval_test_tables_23K.json
├── assets/
│   ├── acc_tflops_plot.png
│   └── ...
└── README.md

Move into code directory

Move into the directory below before running training, inference, or evaluation.

cd InternVL/internvl_chat/

Training

Pre-trained models

If you only want to use the model, download tabflash_stage2_4b.tar and tabflash_stage2_1b.tar and extract them under work_dirs/internvl_chat_v2_5/tabflash_4b and work_dirs/internvl_chat_v2_5/tabflash_1b, respectively.

If you want to train the model from scratch, follow the instructions below. TabFlash training consists of two stages:

Stage 1

bash scripts/4b_train_stage1.sh # For 4B model
bash scripts/1b_train_stage1.sh # For 1B model

Stage 2

bash scripts/4b_train_stage2.sh # For 4B model
bash scripts/1b_train_stage2.sh # For 1B model

Inference

Run inference on test set:

bash scripts/4b_inference.sh # For 4B model
bash scripts/1b_inference.sh # For 1B model

Evaluation

Evaluate the model predictions:

python mmtab_eval.py --pred_file results/{exp_name}/result.jsonl
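mmtab_eval.py computes the benchmark metrics. For a quick first look at a prediction file before running the full evaluation, a rough exact-match sketch may help; this is not the official metric, and the field names ("prediction", "answer") are assumptions about the result.jsonl schema:

```python
# Rough exact-match accuracy over jsonl prediction records. NOT the official
# MMTab evaluation; the "prediction"/"answer" field names are assumptions.
import json

def exact_match_rate(lines):
    """Fraction of records whose prediction string equals the ground truth."""
    hits = total = 0
    for line in lines:
        rec = json.loads(line)
        total += 1
        hits += rec["prediction"].strip() == rec["answer"].strip()
    return hits / total if total else 0.0

# Demo on two synthetic records; for a real run, iterate over the lines of
# results/{exp_name}/result.jsonl instead.
demo = [
    json.dumps({"prediction": "42", "answer": "42"}),
    json.dumps({"prediction": "41", "answer": "42"}),
]
print(exact_match_rate(demo))  # 0.5
```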

Citation

If you find this work useful, please cite:

@inproceedings{kim2026tabflash,
    title={TabFlash: Efficient Table Understanding with Progressive Question Conditioning and Token Focusing},
    author={Kim, Jongha and Bae, Minseong and Lee, Sanghyeok and Yoon, Jinsung and Kim, Hyunwoo J},
    booktitle={AAAI},
    year={2026}
}

Acknowledgements

This codebase is based on InternVL and Table-LLaVA.