LLaMA Pro: Progressive LLaMA with Block Expansion

May 20, 2024 · View on GitHub

📃 Paper • 🤗 Demo & Model

News

🔥 Comprehensive Results

| Model | GSM8k Pass@1 | MATH Pass@1 |
| --- | --- | --- |
| WizardMath-7B | 54.9 | 10.7 |
| LLaMA-2-70B | 56.8 | 13.5 |
| WizardMath-13B | 63.9 | 14.0 |
| MetaMath-7B | 66.5 | 19.8 |
| MetaMath-13B | 72.3 | 22.4 |
| MetaMath-Mistral-7B | 77.7 | 28.2 |
| MetaMath-Llemma-7B | 69.2 | 30.0 |
| 🔥 MetaMath-Mistral-Pro | **78.4** | **30.3** |

Acknowledgement

The instruction-tuning code is based on the official implementation of open-instruct.

Thanks to Hugging Face and wisemodel for hosting our checkpoints.

Citation

The code and models in this repository are mostly developed for or derived from the paper below. Please cite it if you find the repository helpful.

@article{wu2024llama,
  title={Llama pro: Progressive llama with block expansion},
  author={Wu, Chengyue and Gan, Yukang and Ge, Yixiao and Lu, Zeyu and Wang, Jiahao and Feng, Ye and Luo, Ping and Shan, Ying},
  journal={arXiv preprint arXiv:2401.02415},
  year={2024}
}