TabFlash
November 27, 2025 · View on GitHub
Official implementation of TabFlash: Efficient Table Understanding with Progressive Question Conditioning and Token Focusing, in collaboration with Google Cloud AI, accepted at AAAI 2026 (Main Technical Track).
Overview
🤗 TabFlash is an efficient and accurate multimodal LLM that achieves state-of-the-art performance, outperforming GPT-4o and Gemini 2.5 Pro at exceptionally low computational cost.
🚀 TabFlash (3B) achieves state-of-the-art performance while reducing FLOPs by 27% and memory usage by 30% compared to the second-best MLLM.
⚡ TabFlash (1B) outperforms most MLLMs with exceptionally low TFLOPs and just 11.2 GB of peak memory, enabling deployment on low-memory GPUs.
Setup
This code is tested with Python 3.9, CUDA 12.4, PyTorch 2.4.1, and FlashAttention 2.7.3.
1. Create Conda Environment
conda create -n tabflash python=3.9 -y
conda activate tabflash
2. Install InternVL-2.5
Follow the official guide.
cd InternVL
pip install -r requirements.txt
3. Install PyTorch
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124
4. Install Flash Attention v2.7.3
git clone --branch v2.7.3 --single-branch https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
python setup.py install
cd ..
5. Install Additional Dependencies
pip install wandb sacrebleu distance apted bitsandbytes --upgrade
pip install datasets==2.18.0
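To confirm the pinned versions above actually landed in the environment, a small sanity check can help. This is a sketch under the assumption that the pins from the setup steps (`torch` 2.4.1, `flash-attn` 2.7.3, `datasets` 2.18.0) are the ones you care about; extend the table as needed.

```python
# Sketch: verify that the pinned package versions from the setup steps
# are installed. The EXPECTED table mirrors the pins above.
from importlib.metadata import version, PackageNotFoundError

EXPECTED = {
    "torch": "2.4.1",
    "flash-attn": "2.7.3",
    "datasets": "2.18.0",
}

def check_versions(expected, get_version):
    """Return {package: (expected, found)} for mismatched or missing packages."""
    mismatches = {}
    for pkg, want in expected.items():
        try:
            got = get_version(pkg)
        except PackageNotFoundError:
            got = None
        if got != want:
            mismatches[pkg] = (want, got)
    return mismatches

if __name__ == "__main__":
    for pkg, (want, got) in check_versions(EXPECTED, version).items():
        print(f"{pkg}: expected {want}, found {got}")
```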
Dataset Preparation
TabFlash uses MMTab from Table-LLaVA.
1. Download MMTab-pre (Pretraining)
- Download MMTab-instruct_table_images_82K.zip and MMTab-pre_table_images_part_2_16K.zip.
- Place them under `data/LLaVA-Pretrain/images` and unzip them. Rename the `IID_train_image` directory to `table_pretrain_part_1`.
- Download table_only_pretrain_data_with_length.jsonl and place it under `data/LLaVA-Pretrain`.
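The unzip-and-rename steps above can be automated. This is a minimal sketch using only the standard library; the example paths in the comment mirror the manual steps, and `extract_and_rename` is a hypothetical helper, not part of the released code.

```python
# Sketch: unzip an MMTab image archive into a destination directory,
# then rename the extracted directory (e.g. IID_train_image), matching
# the manual preparation steps above.
import zipfile
from pathlib import Path

def extract_and_rename(zip_path, dest_dir, old_name, new_name):
    """Unzip zip_path into dest_dir, then rename dest_dir/old_name to dest_dir/new_name."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
    extracted = dest / old_name
    if extracted.exists():
        extracted.rename(dest / new_name)
    return dest / new_name

# Example for the pretraining images (paths as in the steps above):
# extract_and_rename("MMTab-instruct_table_images_82K.zip",
#                    "data/LLaVA-Pretrain/images",
#                    "IID_train_image", "table_pretrain_part_1")
```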
2. Download MMTab-instruct (Instruction Fine-tuning)
- Download MMTab-instruct_table_images_82K.zip.
- Place it under `data/LLaVA-Finetune/images/table_instructV` and unzip it. Rename the resulting `IID_train_image` directory to `images`.
- Download table_only_sft_data_with_length.jsonl and place it under `data/LLaVA-Finetune`.
3. Download MMTab-eval (Inference)
- Download MMTab-eval_test_data_49K_llava_jsonl_format.jsonl and MMTab-eval_table_images_23K.zip.
- Place them under `data/LLaVA-Inference` and unzip the zip file.
4. Download files for evaluation
- Download MMTab-eval_test_data_49K.json and MMTab-eval_test_tables_23K.json.
- Place them under `data/MMTab-eval_evaluation`.
Final file structure
TabFlash/
├── InternVL/
│   ├── internvl_chat/
│   │   ├── scripts/
│   │   ├── inference.py
│   │   ├── mmtab_eval.py
│   │   └── ...
│   └── ...
├── data/
│   ├── LLaVA-Pretrain/
│   │   ├── images/
│   │   │   ├── table_pretrain_part_1/
│   │   │   └── table_pretrain_part_2/
│   │   └── table_only_pretrain_data_with_length.jsonl
│   ├── LLaVA-Finetune/
│   │   ├── images/
│   │   │   └── table_instructV/
│   │   │       └── images/
│   │   └── table_only_sft_data_with_length.jsonl
│   ├── LLaVA-Inference/
│   │   ├── all_test_image/
│   │   └── MMTab-eval_test_data_49K_llava_jsonl_format.jsonl
│   └── MMTab-eval_evaluation/
│       ├── MMTab-eval_test_data_49K.json
│       └── MMTab-eval_test_tables_23K.json
├── assets/
│   ├── acc_tflops_plot.png
│   └── ...
└── README.md
Move into code directory
Move into the directory below for training / inference / evaluation.
cd InternVL/internvl_chat/
Training
Pre-trained models
If you only want to use the model, download tabflash_stage2_4b.tar and tabflash_stage2_1b.tar and unzip them under work_dirs/internvl_chat_v2_5/tabflash_4b and work_dirs/internvl_chat_v2_5/tabflash_1b, respectively.
If you want to train the model from scratch, follow the instructions below. TabFlash training consists of two stages:
Stage 1
bash scripts/4b_train_stage1.sh # For 4B model
bash scripts/1b_train_stage1.sh # For 1B model
Stage 2
bash scripts/4b_train_stage2.sh # For 4B model
bash scripts/1b_train_stage2.sh # For 1B model
Inference
Run inference on test set:
bash scripts/4b_inference.sh # For 4B model
bash scripts/1b_inference.sh # For 1B model
Evaluation
Evaluate the model predictions:
python mmtab_eval.py --pred_file results/{exp_name}/result.jsonl
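The `--pred_file` argument points at a JSON Lines file with one prediction per line. As a sketch of how such a file can be loaded for inspection: the field names in the comment ("question_id", "prediction") are illustrative assumptions, not the exact schema mmtab_eval.py expects.

```python
# Sketch: read a JSON Lines prediction file such as
# results/{exp_name}/result.jsonl, one JSON object per non-empty line.
# Field names like "question_id"/"prediction" are illustrative only.
import json

def load_jsonl(path):
    """Parse one JSON object per non-empty line and return them as a list."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```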
Citation
If you find this work useful, please cite:
@inproceedings{
kim2026tabflash,
title={TabFlash: Efficient Table Understanding with Progressive Question Conditioning and Token Focusing},
author={Kim, Jongha and Bae, Minseong and Lee, Sanghyeok and Yoon, Jinsung and Kim, Hyunwoo J},
booktitle={AAAI},
year={2026}
}
Acknowledgements
This codebase is based on InternVL and Table-LLaVA.