static_models.md

April 1, 2025 · View on GitHub

静态图模型下载支持

  • 静态图模型已支持 Deepseek 系列、Qwen 系列、llama 系列等模型 详细支持列表如下:

DeepSeekV2

模型名称静态图下载 model_name
deepseek-ai/DeepSeek-V2-Chat🚧
deepseek-ai/DeepSeek-V2-Lite-Chat🚧

DeepSeekV3

模型名称静态图下载 model_name
deepseek-ai/DeepSeek-V3🚧

DeepSeekR1

部署硬件要求:

  • 除MTP模型,Fp8模型之外支持的最低版本是SM80 (机器:A100 / A800) 要求CUDA 11.8 以上
  • DeepSeek-R1-MTP 与 Fp8 模型 支持的最低版本是SM90 (机器:H800) 要求CUDA 12.4 以上
  • Append-Attn:
    • 支持的最低版本是SM80 (机器:A100 / A800)
    • 要求CUDA 11.8 以上
  • Block-Attn:
    • 支持的最低版本是SM70 (机器:V100)
    • 要求CUDA 11.8 以上
模型名称精度MTP节点数静态图下载 model_name
deepseek-ai/DeepSeek-R1weight_only_int41deepseek-ai/DeepSeek-R1/weight_only_int4
deepseek-ai/DeepSeek-R1weight_only_int41deepseek-ai/DeepSeek-R1-MTP/weight_only_int4
deepseek-ai/DeepSeek-R1weight_only_int82deepseek-ai/DeepSeek-R1-2nodes/weight_only_int8
deepseek-ai/DeepSeek-R1weight_only_int82deepseek-ai/DeepSeek-R1-MTP-2nodes/weight_only_int8
deepseek-ai/DeepSeek-R1a8w8_fp82deepseek-ai/DeepSeek-R1-2nodes/a8w8_fp8
deepseek-ai/DeepSeek-R1a8w8_fp82deepseek-ai/DeepSeek-R1-MTP-2nodes/a8w8_fp8
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5Bweight_only_int8--deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/weight_only_int8,deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B-Block-Attn/weight_only_int8
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5Bfp16--deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B-Block-Attn/float16
deepseek-ai/DeepSeek-R1-Distill-Qwen-7Bweight_only_int8--deepseek-ai/DeepSeek-R1-Distill-Qwen-7B/weight_only_int8,deepseek-ai/DeepSeek-R1-Distill-Qwen-7B-Block-Attn/weight_only_int8
deepseek-ai/DeepSeek-R1-Distill-Qwen-7Bfp16--deepseek-ai/DeepSeek-R1-Distill-Qwen-7B-Block-Attn/float16
deepseek-ai/DeepSeek-R1-Distill-Qwen-14Bweight_only_int8--deepseek-ai/DeepSeek-R1-Distill-Qwen-14B/weight_only_int8,deepseek-ai/DeepSeek-R1-Distill-Qwen-14B-Block-Attn/weight_only_int8
deepseek-ai/DeepSeek-R1-Distill-Qwen-14Bfp16--deepseek-ai/DeepSeek-R1-Distill-Qwen-14B-Block-Attn/float16
deepseek-ai/DeepSeek-R1-Distill-Qwen-32Bweight_only_int8--deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/weight_only_int8,deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-Block-Attn/weight_only_int8
deepseek-ai/DeepSeek-R1-Distill-Qwen-32Bfp16--deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-Block-Attn/float16
deepseek-ai/DeepSeek-R1-Distill-Llama-8Bweight_only_int8--deepseek-ai/DeepSeek-R1-Distill-Llama-8B/weight_only_int8,deepseek-ai/DeepSeek-R1-Distill-Llama-8B-Block-Attn/weight_only_int8
deepseek-ai/DeepSeek-R1-Distill-Llama-8Bfp16--deepseek-ai/DeepSeek-R1-Distill-Llama-8B-Block-Attn/float16
deepseek-ai/DeepSeek-R1-Distill-Llama-70Bweight_only_int8--deepseek-ai/DeepSeek-R1-Distill-Llama-70B/weight_only_int8

QWQ

模型名称静态图下载 model_name
Qwen/QwQ-32BQwen/QwQ-32B-Append-Attn/weight_only_int8
Qwen/QwQ-32B-Preview🚧

LLaMA

模型名称静态图下载 model_name
facebook/llama-7b🚧
facebook/llama-13b🚧
facebook/llama-30b🚧
facebook/llama-65b🚧

Llama2

模型名称静态图下载 model_name
meta-llama/Llama-2-7b🚧
meta-llama/Llama-2-7b-chat🚧
meta-llama/Llama-2-13b🚧
meta-llama/Llama-2-13b-chat🚧
meta-llama/Llama-2-70b🚧
meta-llama/Llama-2-70b-chat🚧

Llama3

部署硬件要求:

  • Append-Attn:
    • 支持的最低版本是SM80 (机器:A100 / A800)
    • 要求CUDA 11.8 以上
  • Block-Attn:
    • 支持的最低版本是SM70 (机器:V100)
    • 要求CUDA 11.8 以上
模型名称静态图下载 model_name
meta-llama/Meta-Llama-3-8B🚧
meta-llama/Meta-Llama-3-8B-Instructmeta-llama/Meta-Llama-3-8B-Instruct-Append-Attn/bfloat16,meta-llama/Meta-Llama-3-8B-Instruct-Block-Attn/float16
meta-llama/Meta-Llama-3-70B🚧
meta-llama/Meta-Llama-3-70B-Instruct🚧

Llama3.1

模型名称静态图下载 model_name
meta-llama/Meta-Llama-3.1-8B🚧
meta-llama/Meta-Llama-3.1-8B-Instruct🚧
meta-llama/Meta-Llama-3.1-70B🚧
meta-llama/Meta-Llama-3.1-70B-Instruct🚧
meta-llama/Meta-Llama-3.1-405B🚧
meta-llama/Meta-Llama-3.1-405B-Instruct🚧
meta-llama/Llama-Guard-3-8B🚧

Llama3.2

模型名称静态图下载 model_name
meta-llama/Llama-3.2-1B🚧
meta-llama/Llama-3.2-1B-Instruct🚧
meta-llama/Llama-3.2-3B🚧
meta-llama/Llama-3.2-3B-Instruct🚧
meta-llama/Llama-Guard-3-1B🚧

Llama3.3

模型名称静态图下载 model_name
meta-llama/Llama-3.3-70B-Instruct🚧

Mixtral

模型名称静态图下载 model_name
mistralai/Mixtral-8x7B-Instruct-v0.1🚧

Qwen

模型名称静态图下载 model_name
qwen/qwen-7b🚧
qwen/qwen-7b-chat🚧
qwen/qwen-14b🚧
qwen/qwen-14b-chat🚧
qwen/qwen-72b🚧
qwen/qwen-72b-chat🚧

Qwen1.5

部署硬件要求:

  • Block-Attn:
    • 支持的最低版本是SM70 (机器:V100)
    • 要求CUDA 11.8 以上
模型名称静态图下载 model_name
Qwen/Qwen1.5-0.5BQwen/Qwen1.5-0.5B-Block-Attn/bfloat16,Qwen/Qwen1.5-0.5B-Block-Attn/float16
Qwen/Qwen1.5-0.5B-Chat🚧
Qwen/Qwen1.5-1.8B🚧
Qwen/Qwen1.5-1.8B-Chat🚧
Qwen/Qwen1.5-4B🚧
Qwen/Qwen1.5-4B-Chat🚧
Qwen/Qwen1.5-7B🚧
Qwen/Qwen1.5-7B-Chat🚧
Qwen/Qwen1.5-14B🚧
Qwen/Qwen1.5-14B-Chat🚧
Qwen/Qwen1.5-32B🚧
Qwen/Qwen1.5-32B-Chat🚧
Qwen/Qwen1.5-72B🚧
Qwen/Qwen1.5-72B-Chat🚧
Qwen/Qwen1.5-110B🚧
Qwen/Qwen1.5-110B-Chat🚧
Qwen/Qwen1.5-MoE-A2.7B🚧
Qwen/Qwen1.5-MoE-A2.7B-Chat🚧

Qwen2

部署硬件要求:

  • Append-Attn:
    • 支持的最低版本是SM80 (机器:A100 / A800)
    • 要求CUDA 11.8 以上
  • Block-Attn:
    • 支持的最低版本是SM70 (机器:V100)
    • 要求CUDA 11.8 以上
模型名称静态图下载 model_name
Qwen/Qwen2-0.5B🚧
Qwen/Qwen2-0.5B-Instruct🚧
Qwen/Qwen2-1.5B🚧
Qwen/Qwen2-1.5B-InstructQwen/Qwen2-1.5B-Instruct-Append-Attn/bfloat16, Qwen/Qwen2-1.5B-Instruct-Block-Attn/float16
Qwen/Qwen2-7B🚧
Qwen/Qwen2-7B-Instruct🚧
Qwen/Qwen2-72B🚧
Qwen/Qwen2-72B-Instruct🚧
Qwen/Qwen2-57B-A14B🚧
Qwen/Qwen2-57B-A14B-Instruct🚧

Qwen2-Math

模型名称静态图下载 model_name
Qwen/Qwen2-Math-1.5B🚧
Qwen/Qwen2-Math-1.5B-Instruct🚧
Qwen/Qwen2-Math-7B🚧
Qwen/Qwen2-Math-7B-Instruct🚧
Qwen/Qwen2-Math-72B🚧
Qwen/Qwen2-Math-72B-Instruct🚧
Qwen/Qwen2-Math-RM-72B🚧

Qwen2.5

模型名称静态图下载 model_name
Qwen/Qwen2.5-0.5B🚧
Qwen/Qwen2.5-0.5B-Instruct🚧
Qwen/Qwen2.5-1.5B🚧
Qwen/Qwen2.5-1.5B-Instruct🚧
Qwen/Qwen2.5-3B🚧
Qwen/Qwen2.5-3B-Instruct🚧
Qwen/Qwen2.5-7B🚧
Qwen/Qwen2.5-7B-Instruct🚧
Qwen/Qwen2.5-14B🚧
Qwen/Qwen2.5-14B-Instruct🚧
Qwen/Qwen2.5-32B🚧
Qwen/Qwen2.5-32B-Instruct🚧
Qwen/Qwen2.5-72B🚧
Qwen/Qwen2.5-72B-Instruct🚧

Qwen2.5-Math

模型名称静态图下载 model_name
Qwen/Qwen2.5-Math-1.5B🚧
Qwen/Qwen2.5-Math-1.5B-Instruct🚧
Qwen/Qwen2.5-Math-7B🚧
Qwen/Qwen2.5-Math-7B-Instruct🚧
Qwen/Qwen2.5-Math-72B🚧
Qwen/Qwen2.5-Math-72B-Instruct🚧
Qwen/Qwen2.5-Math-RM-72B🚧

Qwen2.5-Coder

模型名称静态图下载 model_name
Qwen/Qwen2.5-Coder-1.5B🚧
Qwen/Qwen2.5-Coder-1.5B-Instruct🚧
Qwen/Qwen2.5-Coder-7B🚧
Qwen/Qwen2.5-Coder-7B-Instruct🚧