Models

February 24, 2025

All our models are released under a research-only RAIL Model License.

Downloading the pretrained models

All our models described in our tech report are released on our GitHub.

For VaViM-L and VaVAM-L we use the following script to convert the weights to tar files:

# Performs something akin to:
# tar czf - filename | split -b 1900MB - filename.tar.gz.part_
python scripts/handle_checkpoints.py \
--mode create \
--checkpoint_dir XXXX \
--outdir vavam_release \
--maxsize 1900MB
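
The `tar czf ... | split` pipeline in the comment above can be sketched in plain Python as follows. This is only an illustration of the idea, not the code of `scripts/handle_checkpoints.py`: the function name `create_chunks`, the numeric `.part_000` suffixes, and the `buffer_bytes` parameter are assumptions made for this example.

```python
import tarfile
from pathlib import Path


def create_chunks(src: Path, outdir: Path, max_bytes: int,
                  buffer_bytes: int = 1 << 20) -> list[Path]:
    """Archive `src` as tar.gz, then split it into parts of at most `max_bytes`.

    Hypothetical helper: the part naming scheme is an assumption.
    """
    outdir.mkdir(parents=True, exist_ok=True)
    archive = outdir / f"{src.name}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname=src.name)

    parts: list[Path] = []
    with archive.open("rb") as stream:
        index = 0
        while True:
            # Zero-padded suffixes keep the parts in order when sorted by name.
            part = outdir / f"{archive.name}.part_{index:03d}"
            written = 0
            with part.open("wb") as out:
                # Copy in small blocks so a full chunk is never held in memory.
                while written < max_bytes:
                    block = stream.read(min(buffer_bytes, max_bytes - written))
                    if not block:
                        break
                    out.write(block)
                    written += len(block)
            if written == 0:  # archive exhausted, drop the empty part
                part.unlink()
                break
            parts.append(part)
            index += 1
    archive.unlink()  # keep only the parts
    return parts
```

With `max_bytes=1_900_000_000`, this would mirror the `--maxsize 1900MB` setting, assuming MB means 10^6 bytes as it does for `split -b`.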

The weights are thus split into several tar chunks. To merge them:

  1. Download all tar files (for VaViM-L or VaVAM-L).
  2. Put them in a single folder (e.g., vavam_l_release_chunks).
  3. Run the following command:
# Performs something akin to:
# cat filename.tar.gz.part_* > filename.tar.gz
# tar xzf filename.tar.gz
python scripts/handle_checkpoints.py \
--mode extract \
--checkpoint_dir vavam_l_release_chunks \
--outdir vavam_release
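
For readers who prefer not to run the script, the `cat` and `tar xzf` steps in the comment above can be sketched in plain Python. This is a minimal sketch and not the implementation in `scripts/handle_checkpoints.py`: the function name `merge_and_extract` is hypothetical, and it assumes the parts follow the `filename.tar.gz.part_*` pattern and sort correctly by name.

```python
import shutil
import tarfile
from pathlib import Path


def merge_and_extract(chunk_dir: Path, outdir: Path) -> Path:
    """Concatenate `*.tar.gz.part_*` files in name order, then extract the archive.

    Hypothetical helper; returns the path of the merged archive.
    """
    # Sorting by name mirrors the order in which a shell expands `part_*`.
    parts = sorted(chunk_dir.glob("*.tar.gz.part_*"))
    if not parts:
        raise FileNotFoundError(f"no chunk files found in {chunk_dir}")

    outdir.mkdir(parents=True, exist_ok=True)
    archive = outdir / parts[0].name.split(".part_")[0]  # e.g. filename.tar.gz

    # Equivalent of: cat filename.tar.gz.part_* > filename.tar.gz
    with archive.open("wb") as merged:
        for part in parts:
            with part.open("rb") as chunk:
                shutil.copyfileobj(chunk, merged)

    # Equivalent of: tar xzf filename.tar.gz
    # Only extract archives obtained from a source you trust.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(outdir)
    return archive
```

A call such as `merge_and_extract(Path("vavam_l_release_chunks"), Path("vavam_release"))` would correspond to the command above, provided the folder holds the chunks of a single model.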

The other models are released as single torch pickle files (can be loaded with torch.load).
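
As a rough illustration of loading one of these single-file checkpoints, the sketch below calls `torch.load` on the CPU and lists the top-level keys. The helper names and the `trusted` flag are made up for this example, and the layout of the checkpoint (for instance, whether it is a dict with a `state_dict` entry) is not stated here, so inspect it before relying on any key.

```python
from pathlib import Path
from typing import Any


def load_checkpoint(path: Path, trusted: bool = False) -> Any:
    """Load a torch pickle file onto the CPU (hypothetical helper)."""
    import torch  # imported lazily so the helper below works without PyTorch

    # weights_only=True restricts unpickling to tensors and basic containers.
    # Pass trusted=True only for files from a source you trust, since full
    # unpickling can execute arbitrary code.
    return torch.load(path, map_location="cpu", weights_only=not trusted)


def top_level_keys(checkpoint: Any) -> list[str]:
    """Return the sorted top-level keys if the checkpoint is a dict, else []."""
    if isinstance(checkpoint, dict):
        return sorted(str(key) for key in checkpoint)
    return []


# Usage, with a placeholder path:
# ckpt = load_checkpoint(Path("path/to/downloaded_checkpoint"))
# print(top_level_keys(ckpt))
```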

Available models

Main models

Here are the links for our main VaViM and VaVAM models:

| model | # of params | VaViM | VaVAM |
| --- | --- | --- | --- |
| VaVAM-S | 185M + 21M | part 1 | part 1 |
| VaVAM-B | 318M + 38M | part 1 | part 1 |
| VaVAM-L | 1.2B + 150M | part 1, part 2, part 3 | part 1, part 2, part 3, part 4 |

VaViM only

We also release the checkpoints that helped us compute our scaling laws. Here are the VaViM models, at different sizes and trained on different amounts of data:

| model | # params (in M) | # data (in ×10³) | pre-trained | fine-tuned |
| --- | --- | --- | --- | --- |
| VaViM-S | 185 | 38 | part 1 | part 1 |
| VaViM-S | 185 | 77 | part 1 | part 1 |
| VaViM-S | 185 | 116 | part 1 | part 1 |
| VaViM-S | 185 | 139 | part 1 | part 1 |
| VaViM-B | 318 | 38 | part 1 | part 1 |
| VaViM-B | 318 | 77 | part 1 | part 1 |
| VaViM-B | 318 | 116 | part 1 | part 1 |
| VaViM-B | 318 | 139 | part 1 | part 1 |
| VaViM-L | 1200 | 139 | part 1, part 2, part 3 | part 1, part 2, part 3 |

VaVAM

We trained the VaVAM models on top of the VaViM models. Here are the VaVAM models, each listed with the amount of pre-training data used for its VaViM model:

| model | # params | # data (in ×10³) | VaVAM |
| --- | --- | --- | --- |
| VaVAM-S | 185M + 21M | 38 | part 1 |
| VaVAM-S | 185M + 21M | 77 | part 1 |
| VaVAM-S | 185M + 21M | 116 | part 1 |
| VaVAM-S | 185M + 21M | 139 | part 1 |
| VaVAM-B | 318M + 38M | 38 | part 1 |
| VaVAM-B | 318M + 38M | 77 | part 1 |
| VaVAM-B | 318M + 38M | 116 | part 1 |
| VaVAM-B | 318M + 38M | 139 | part 1 |
| VaVAM-L | 1.2B + 150M | 139 | part 1, part 2, part 3, part 4 |