
# bert4torch


Documentation | Torch4keras | Examples | build_MiniLLM_from_scratch | bert4vector

## Table of Contents

1. Installation
2. Features
3. Quick Start
4. Versions and Update History
5. Pretrained Weights
6. Acknowledgements
7. Citation
8. Miscellaneous

## 1. Installation

Install the stable release:

```shell
pip install bert4torch
```

Install the latest development version:

```shell
pip install git+https://github.com/Tongjilibo/bert4torch
```

- Note: PyPI releases lag behind the development version on git; when using `git clone`, mind the import path and check whether the weights need conversion.
- Test cases: `git clone https://github.com/Tongjilibo/bert4torch`, then edit the pretrained-model and data paths in the examples to run the scripts.
- Training on your own data: modify the corresponding data-processing code blocks.
- Development environment: originally developed on `torch==1.10`, now developed on `torch2.0`; feedback is welcome if other versions hit compatibility issues.

## 2. Features

- LLMs: load open-source LLM weights such as chatglm, llama, baichuan, ziya, and bloom for inference and fine-tuning; deploy an LLM with a single command line.

- Core: load pretrained weights for bert, roberta, albert, xlnet, nezha, bart, RoFormer, RoFormer_V2, ELECTRA, GPT, GPT2, T5, GAU-alpha, ERNIE, etc. for further finetuning, with the flexibility to define your own models on top of bert.

- Rich examples: solutions covering llm, pretrain, sentence_classfication, sentence_embedding, sequence_labeling, relation_extraction, seq2seq, serving, and more.

- Experimental validation: verified on public datasets; see the examples for the datasets and experiment metrics.

- Handy tricks: common training tricks are integrated and plug-and-play.

- Other features: use models from the transformers library alongside bert4torch; concise and efficient calling conventions; dynamic training progress bar; print parameter counts with torchinfo; default Logger and Tensorboard for easy training logging; customizable fit procedure for advanced needs (a minimal training sketch follows the comparison table below).

- Training process: (animated demo image)

| Feature | bert4torch | transformers | Notes |
|---|---|---|---|
| Training progress bar | ✅ | ✅ | Progress bar prints loss and user-defined metrics |
| Distributed training (dp/ddp) | ✅ | ✅ | Uses torch's built-in dp/ddp |
| Assorted callbacks | ✅ | ✅ | Logging / Tensorboard / early stopping / wandb, etc. |
| LLM inference with stream/batch output | ✅ | ✅ | Shared across models, no per-model scripts to maintain |
| LLM fine-tuning | ✅ | ✅ | LoRA relies on the peft library; P-Tuning v2 is built in |
| Rich tricks | ✅ | ❌ | Plug-and-play tricks such as adversarial training |
| Concise, readable code with room for customization | ✅ | ❌ | High code reuse, Keras-style training code |
| Repo maintenance / influence / adoption / compatibility | ❌ | ✅ | This repo is currently maintained by a single person |
| One-command LLM deployment | ✅ | ❌ | |
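
For a feel of the Keras-style training loop referenced above, here is a minimal sketch along the lines of the repo's sentence-classification examples. The paths, the 2-class head, and `train_dataloader` are placeholders; check the examples directory for the exact API details.

```python
import torch.nn as nn
import torch.optim as optim
from bert4torch.models import build_transformer_model, BaseModel

class Model(BaseModel):
    def __init__(self, config_path, checkpoint_path):
        super().__init__()
        # with_pool=True also returns the pooled [CLS] representation
        self.bert = build_transformer_model(config_path, checkpoint_path, with_pool=True)
        self.dense = nn.Linear(768, 2)  # 768 = hidden size of bert-base; 2-class head

    def forward(self, token_ids, segment_ids):
        hidden_states, pooled_output = self.bert([token_ids, segment_ids])
        return self.dense(pooled_output)

# placeholder paths; point these at your downloaded weights
model = Model('./model/bert4torch_config.json', './model/pytorch_model.bin')

# Keras-style compile/fit: the progress bar prints loss and metrics per step
model.compile(loss=nn.CrossEntropyLoss(),
              optimizer=optim.Adam(model.parameters(), lr=2e-5),
              metrics=['accuracy'])
# train_dataloader: a torch DataLoader yielding ([token_ids, segment_ids], labels)
model.fit(train_dataloader, epochs=5, steps_per_epoch=None)
```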

## 3. Quick Start

### 3.1 Tutorials

### 3.2 Deploy an LLM service from the command line

- Local / hub loading

  ```shell
  # download everything from the hub
  bert4torch serve --checkpoint_path Qwen2-0.5B-Instruct

  # load a local model and download bert4torch_config.json from the hub
  bert4torch serve --checkpoint_path /data/pretrain_ckpt/Qwen/Qwen2-0.5B-Instruct --config_path Qwen/Qwen2-0.5B-Instruct

  # load a local model with bert4torch_config.json already downloaded into the same directory
  bert4torch serve --checkpoint_path /data/pretrain_ckpt/Qwen/Qwen2-0.5B-Instruct
  ```

- Command line / gradio web UI / openai_api (a client-side sketch follows below)

  ```shell
  # command line
  bert4torch serve --checkpoint_path /data/pretrain_ckpt/Qwen/Qwen2-0.5B-Instruct --mode cli

  # gradio web UI
  bert4torch serve --checkpoint_path /data/pretrain_ckpt/Qwen/Qwen2-0.5B-Instruct --mode gradio

  # openai_api
  bert4torch serve --checkpoint_path /data/pretrain_ckpt/Qwen/Qwen2-0.5B-Instruct --mode openai
  ```

- Command-line chat example: (screenshot)
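
With `--mode openai`, the service exposes an OpenAI-compatible API, so a standard `openai` client can talk to it. Below is a minimal sketch; the `base_url` (host/port) and the served `model` name are assumptions here, so check the server's startup log for the actual address.

```python
from openai import OpenAI

# base_url is an assumption; read the real host/port from the serve startup log
client = OpenAI(base_url='http://localhost:8000/v1', api_key='EMPTY')

resp = client.chat.completions.create(
    model='Qwen2-0.5B-Instruct',  # assumed to match the served checkpoint name
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(resp.choices[0].message.content)
```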

## 4. Versions and Update History

### 4.1 Version History

| Date | bert4torch | torch4keras | Release notes |
|---|---|---|---|
| 20260114 | 0.6.1 | 0.3.3 | Added paddleocr-vl; refactored the code structure; removed hard-coded model config entries |
| 20250925 | 0.6.0 | 0.3.2 | Added Qwen3-moe; support for mainstream quantization schemes such as gptq and awq; other code improvements |
| 20250721 | 0.5.9.post2 | 0.3.1 | Added Ernie4_5; fixed a hub download bug; split out openai_client |

More versions

### 4.2 Update History

More history

## 5. Pretrained Weights

### 5.1 Loading weights

```python
from bert4torch.models import build_transformer_model

# 1. config_path only: initialize the model structure from scratch, without loading pretrained weights
model = build_transformer_model('./model/bert4torch_config.json')

# 2. checkpoint_path only:
## 2.1 directory path: automatically finds the *.bin/*.safetensors weight files in the directory;
##     bert4torch_config.json must be downloaded into that directory
model = build_transformer_model(checkpoint_path='./model')

## 2.2 file path / list of paths: the path(s) point directly at the weight file(s);
##     bert4torch_config.json is looked up in the same directory
model = build_transformer_model(checkpoint_path='./pytorch_model.bin')

## 2.3 model_name: name of pretrained weights on HF; the HF weights and the
##     bert4torch_config.json file are downloaded automatically
model = build_transformer_model(checkpoint_path='google-bert/bert-base-chinese')

# 3. both config_path and checkpoint_path (any combination of local paths and model names):
#    local paths load from disk, pretrained model names download from the hub
config_path = './model/bert4torch_config.json'  # or 'google-bert/bert-base-chinese'
checkpoint_path = './model/pytorch_model.bin'  # or 'google-bert/bert-base-chinese'
model = build_transformer_model(config_path, checkpoint_path)
```
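
As a quick sanity check after loading, the snippet below runs a single forward pass. It is a sketch that assumes a local BERT checkpoint with a `vocab.txt` next to it, using the library's `Tokenizer` in the style of the repo's basic examples.

```python
import torch
from bert4torch.models import build_transformer_model
from bert4torch.tokenizers import Tokenizer

# placeholder paths; adjust to wherever your weights live
config_path = './model/bert4torch_config.json'
checkpoint_path = './model/pytorch_model.bin'
dict_path = './model/vocab.txt'

tokenizer = Tokenizer(dict_path, do_lower_case=True)
model = build_transformer_model(config_path, checkpoint_path)

token_ids, segment_ids = tokenizer.encode('语言模型')
model.eval()
with torch.no_grad():
    # with default flags this is the last-layer hidden states;
    # the exact outputs depend on the with_* options passed above
    outputs = model([torch.tensor([token_ids]), torch.tensor([segment_ids])])
print(outputs)
```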

### 5.2 Weight links

| Model family | Model name | Weight source | checkpoint_path | config_path |
|---|---|---|---|---|
| bert | bert-base-chinese | google-bert | google-bert/bert-base-chinese 🤗 | 🤗 |
| | chinese_L-12_H-768_A-12 | Google TF weights | Tongjilibo/bert-chinese_L-12_H-768_A-12 🤗 | |
| | chinese-bert-wwm-ext | HFL | hfl/chinese-bert-wwm-ext 🤗 | 🤗 |
| | bert-base-multilingual-cased | google-bert | google-bert/bert-base-multilingual-cased 🤗 | 🤗 |
| | bert-base-cased | google-bert | google-bert/bert-base-cased 🤗 | 🤗 |
| | bert-base-uncased | google-bert | google-bert/bert-base-uncased 🤗 | 🤗 |
| | MacBERT | HFL | hfl/chinese-macbert-base 🤗<br>hfl/chinese-macbert-large 🤗 | 🤗<br>🤗 |
| | WoBERT | Zhuiyi Technology | junnyu/wobert_chinese_base 🤗<br>junnyu/wobert_chinese_plus_base 🤗 | 🤗<br>🤗 |
| roberta | chinese-roberta-wwm-ext | HFL | hfl/chinese-roberta-wwm-ext 🤗<br>hfl/chinese-roberta-wwm-ext-large 🤗 (the large model's MLM weights are randomly initialized) | 🤗<br>🤗 |
| | roberta-small/tiny | Zhuiyi Technology | Tongjilibo/chinese_roberta_L-4_H-312_A-12 🤗<br>Tongjilibo/chinese_roberta_L-6_H-384_A-12 🤗 | |
| | roberta-base | FacebookAI | FacebookAI/roberta-base 🤗 | 🤗 |
| | guwenbert | ethanyt | ethanyt/guwenbert-base 🤗 | 🤗 |
| albert | albert_zh<br>albert_pytorch | brightmart | voidful/albert_chinese_tiny 🤗<br>voidful/albert_chinese_small 🤗<br>voidful/albert_chinese_base 🤗<br>voidful/albert_chinese_large 🤗<br>voidful/albert_chinese_xlarge 🤗<br>voidful/albert_chinese_xxlarge 🤗 | 🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗 |
| nezha | NEZHA<br>NeZha_Chinese_PyTorch | huawei_noah | sijunhe/nezha-cn-base 🤗<br>sijunhe/nezha-cn-large 🤗<br>sijunhe/nezha-base-wwm 🤗<br>sijunhe/nezha-large-wwm 🤗 | 🤗<br>🤗<br>🤗<br>🤗 |
| | nezha_gpt_dialog | bojone | Tongjilibo/nezha_gpt_dialog 🤗 | |
| xlnet | Chinese-XLNet | HFL | hfl/chinese-xlnet-base 🤗 | 🤗 |
| | transformer_xl | huggingface | transfo-xl/transfo-xl-wt103 🤗 | 🤗 |
| deberta | Erlangshen-DeBERTa-v2 | IDEA | IDEA-CCNL/Erlangshen-DeBERTa-v2-97M-Chinese 🤗<br>IDEA-CCNL/Erlangshen-DeBERTa-v2-320M-Chinese 🤗<br>IDEA-CCNL/Erlangshen-DeBERTa-v2-710M-Chinese 🤗 | 🤗<br>🤗<br>🤗 |
| electra | Chinese-ELECTRA | HFL | hfl/chinese-electra-base-discriminator 🤗 | 🤗 |
| ernie | ernie | Baidu ERNIE | nghuyong/ernie-1.0-base-zh 🤗<br>nghuyong/ernie-3.0-base-zh 🤗 | 🤗<br>🤗 |
| roformer | roformer | Zhuiyi Technology | junnyu/roformer_chinese_base 🤗 | 🤗 |
| | roformer_v2 | Zhuiyi Technology | junnyu/roformer_v2_chinese_char_base 🤗 | 🤗 |
| simbert | simbert | Zhuiyi Technology | Tongjilibo/simbert-chinese-base 🤗<br>Tongjilibo/simbert-chinese-small 🤗<br>Tongjilibo/simbert-chinese-tiny 🤗 | |
| | simbert_v2/roformer-sim | Zhuiyi Technology | junnyu/roformer_chinese_sim_char_base 🤗<br>junnyu/roformer_chinese_sim_char_ft_base 🤗<br>junnyu/roformer_chinese_sim_char_small 🤗<br>junnyu/roformer_chinese_sim_char_ft_small 🤗 | 🤗<br>🤗<br>🤗<br>🤗 |
| gau | GAU-alpha | Zhuiyi Technology | Tongjilibo/chinese_GAU-alpha-char_L-24_H-768 🤗 | |
| ModernBERT | ModernBERT | answerdotai | answerdotai/ModernBERT-base 🤗<br>answerdotai/ModernBERT-large 🤗 | 🤗<br>🤗 |
| uie | uie<br>uie_pytorch | Baidu | Tongjilibo/uie-base 🤗 | |
| gpt | CDial-GPT | thu-coai | thu-coai/CDial-GPT_LCCC-base 🤗<br>thu-coai/CDial-GPT_LCCC-large 🤗 | 🤗<br>🤗 |
| | cmp_lm (2.6B) | Tsinghua | TsinghuaAI/CPM-Generate 🤗 | 🤗 |
| | nezha_gen | huawei_noah | Tongjilibo/chinese_nezha_gpt_L-12_H-768_A-12 🤗 | |
| | gpt2-chinese-cluecorpussmall | UER | uer/gpt2-chinese-cluecorpussmall 🤗 | 🤗 |
| | gpt2-ml | imcaspar | Tongjilibo/gpt2-ml_15g_corpus 🤗<br>Tongjilibo/gpt2-ml_30g_corpus 🤗 | torch, BaiduYun(84dh) |
| bart | bart_base_chinese | Fudan fnlp | fnlp/bart-base-chinese 🤗<br>fnlp/bart-base-chinese-v1.0 | 🤗<br>🤗 |
| t5 | t5 | UER | uer/t5-small-chinese-cluecorpussmall 🤗<br>uer/t5-base-chinese-cluecorpussmall 🤗 | 🤗<br>🤗 |
| | mt5 | Google | google/mt5-base 🤗 | 🤗 |
| | t5_pegasus | Zhuiyi Technology | Tongjilibo/chinese_t5_pegasus_small 🤗<br>Tongjilibo/chinese_t5_pegasus_base 🤗 | |
| | chatyuan | clue-ai | ClueAI/ChatYuan-large-v1 🤗<br>ClueAI/ChatYuan-large-v2 🤗 | 🤗<br>🤗 |
| | PromptCLUE | clue-ai | ClueAI/PromptCLUE-base 🤗 | 🤗 |
| chatglm | ChatGLM-6B | zai-org | zai-org/chatglm-6b 🤗<br>zai-org/chatglm-6b-int8 🤗<br>zai-org/chatglm-6b-int4 🤗<br>zai-org/chatglm-6b-v0.1.0 🤗 | 🤗<br>🤗<br>🤗<br>🤗 |
| | ChatGLM2-6B | zai-org | zai-org/chatglm2-6b 🤗<br>zai-org/chatglm2-6b-int4 🤗<br>zai-org/chatglm2-6b-32k 🤗 | 🤗<br>🤗<br>🤗 |
| | ChatGLM3 | zai-org | zai-org/chatglm3-6b 🤗<br>zai-org/chatglm3-6b-32k 🤗 | 🤗<br>🤗 |
| | GLM-4 | zai-org | zai-org/glm-4-9b 🤗<br>zai-org/glm-4-9b-chat 🤗<br>zai-org/glm-4-9b-chat-1m 🤗<br>zai-org/glm-4v-9b 🤗<br>zai-org/GLM-4-9B-0414 🤗<br>zai-org/GLM-Z1-9B-0414 🤗 | 🤗<br>🤗<br>🤗<br>🤗 |
| llama | llama | meta | meta-llama/llama-7b<br>meta-llama/llama-13b | 🤗<br>🤗 |
| | llama-2 | meta | meta-llama/Llama-2-7b-hf 🤗<br>meta-llama/Llama-2-7b-chat-hf 🤗<br>meta-llama/Llama-2-13b-hf 🤗<br>meta-llama/Llama-2-13b-chat-hf 🤗 | 🤗<br>🤗<br>🤗<br>🤗 |
| | llama-3 | meta | meta-llama/Meta-Llama-3-8B 🤗<br>meta-llama/Meta-Llama-3-8B-Instruct 🤗 | 🤗<br>🤗 |
| | llama-3.1 | meta | meta-llama/Meta-Llama-3.1-8B 🤗<br>meta-llama/Meta-Llama-3.1-8B-Instruct 🤗 | 🤗<br>🤗 |
| | llama-3.2 | meta | meta-llama/Llama-3.2-1B 🤗<br>meta-llama/Llama-3.2-1B-Instruct 🤗<br>meta-llama/Llama-3.2-3B 🤗<br>meta-llama/Llama-3.2-3B-Instruct 🤗 | 🤗<br>🤗<br>🤗<br>🤗 |
| | llama-3.2-vision | meta | meta-llama/Llama-3.2-11B-Vision 🤗<br>meta-llama/Llama-3.2-11B-Vision-Instruct 🤗 | 🤗<br>🤗 |
| llama-series | Chinese-LLaMA-Alpaca | HFL | hfl/chinese-alpaca-plus-lora-7b 🤗<br>hfl/chinese-llama-plus-lora-7b 🤗 (the LoRA weights must be merged before use) | 🤗<br>🤗 |
| | Chinese-LLaMA-Alpaca-2 | HFL | to be added | |
| | Chinese-LLaMA-Alpaca-3 | HFL | to be added | |
| | Belle_llama | LianjiaTech | BelleGroup/BELLE-LLaMA-7B-2M-enc 🤗 (weight-merging instructions) | 🤗 |
| | Ziya | IDEA-CCNL | IDEA-CCNL/Ziya-LLaMA-13B-v1 🤗<br>IDEA-CCNL/Ziya-LLaMA-13B-v1.1 🤗<br>IDEA-CCNL/Ziya-LLaMA-13B-Pretrain-v1 🤗 | 🤗<br>🤗 |
| | vicuna | lmsys | lmsys/vicuna-7b-v1.5 🤗 | 🤗 |
| Baichuan | Baichuan | baichuan-inc | baichuan-inc/Baichuan-7B 🤗<br>baichuan-inc/Baichuan-13B-Base 🤗<br>baichuan-inc/Baichuan-13B-Chat 🤗 | 🤗<br>🤗<br>🤗 |
| | Baichuan2 | baichuan-inc | baichuan-inc/Baichuan2-7B-Base 🤗<br>baichuan-inc/Baichuan2-7B-Chat 🤗<br>baichuan-inc/Baichuan2-13B-Base 🤗<br>baichuan-inc/Baichuan2-13B-Chat 🤗 | 🤗<br>🤗<br>🤗<br>🤗 |
| Yi | Yi | 01-ai | 01-ai/Yi-6B 🤗<br>01-ai/Yi-6B-200K 🤗<br>01-ai/Yi-9B 🤗<br>01-ai/Yi-9B-200K 🤗 | 🤗<br>🤗<br>🤗<br>🤗 |
| | Yi-1.5 | 01-ai | 01-ai/Yi-1.5-6B 🤗<br>01-ai/Yi-1.5-6B-Chat 🤗<br>01-ai/Yi-1.5-9B 🤗<br>01-ai/Yi-1.5-9B-32K 🤗<br>01-ai/Yi-1.5-9B-Chat 🤗<br>01-ai/Yi-1.5-9B-Chat-16K 🤗 | 🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗 |
| bloom | bloom | bigscience | bigscience/bloom-560m 🤗<br>bigscience/bloomz-560m 🤗 | 🤗<br>🤗 |
| Qwen | Qwen | Alibaba Cloud | Qwen/Qwen-1_8B 🤗<br>Qwen/Qwen-1_8B-Chat 🤗<br>Qwen/Qwen-7B 🤗<br>Qwen/Qwen-7B-Chat 🤗<br>Qwen/Qwen-14B 🤗<br>Qwen/Qwen-14B-Chat 🤗 | 🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗 |
| | Qwen1.5 | Alibaba Cloud | Qwen/Qwen1.5-0.5B 🤗<br>Qwen/Qwen1.5-0.5B-Chat 🤗<br>Qwen/Qwen1.5-1.8B 🤗<br>Qwen/Qwen1.5-1.8B-Chat 🤗<br>Qwen/Qwen1.5-7B 🤗<br>Qwen/Qwen1.5-7B-Chat 🤗<br>Qwen/Qwen1.5-14B 🤗<br>Qwen/Qwen1.5-14B-Chat 🤗 | 🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗 |
| | Qwen2 | Alibaba Cloud | Qwen/Qwen2-0.5B 🤗<br>Qwen/Qwen2-0.5B-Instruct 🤗<br>Qwen/Qwen2-1.5B 🤗<br>Qwen/Qwen2-1.5B-Instruct 🤗<br>Qwen/Qwen2-7B 🤗<br>Qwen/Qwen2-7B-Instruct 🤗 | 🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗 |
| | Qwen2-VL | Alibaba Cloud | Qwen/Qwen2-VL-2B-Instruct 🤗<br>Qwen/Qwen2-VL-7B-Instruct 🤗 | 🤗<br>🤗 |
| | Qwen2.5 | Alibaba Cloud | Qwen/Qwen2.5-0.5B 🤗<br>Qwen/Qwen2.5-0.5B-Instruct 🤗<br>Qwen/Qwen2.5-1.5B 🤗<br>Qwen/Qwen2.5-1.5B-Instruct 🤗<br>Qwen/Qwen2.5-3B 🤗<br>Qwen/Qwen2.5-3B-Instruct 🤗<br>Qwen/Qwen2.5-7B 🤗<br>Qwen/Qwen2.5-7B-Instruct 🤗<br>Qwen/Qwen2.5-14B 🤗<br>Qwen/Qwen2.5-14B-Instruct 🤗 | 🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗 |
| | Qwen2.5-VL | Alibaba Cloud | Qwen/Qwen2.5-VL-3B-Instruct 🤗<br>Qwen/Qwen2.5-VL-7B-Instruct 🤗<br>Qwen/Qwen2.5-VL-32B-Instruct 🤗 | 🤗<br>🤗<br>🤗 |
| | Qwen3 | Alibaba Cloud | Qwen/Qwen3-0.6B-Base 🤗<br>Qwen/Qwen3-0.6B 🤗<br>Qwen/Qwen3-0.6B-GPTQ-Int8 🤗<br>Qwen/Qwen3-1.7B-Base 🤗<br>Qwen/Qwen3-1.7B 🤗<br>Qwen/Qwen3-4B-Base 🤗<br>Qwen/Qwen3-4B 🤗<br>Qwen/Qwen3-4B-AWQ 🤗<br>Qwen/Qwen3-8B-Base 🤗<br>Qwen/Qwen3-8B 🤗<br>Qwen/Qwen3-14B-Base 🤗<br>Qwen/Qwen3-14B 🤗<br>Qwen/Qwen3-32B 🤗<br>Qwen/Qwen3-4B-Instruct-2507 🤗<br>Qwen/Qwen3-4B-Thinking-2507 🤗<br>Qwen/Qwen3-30B-A3B-Instruct-2507 🤗<br>Qwen/Qwen3-30B-A3B-Thinking-2507 🤗 | 🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗 |
| | Qwen3-VL | Alibaba Cloud | Qwen/Qwen3-VL-2B-Instruct 🤗<br>Qwen/Qwen3-VL-2B-Thinking 🤗<br>Qwen/Qwen3-VL-4B-Instruct 🤗<br>Qwen/Qwen3-VL-4B-Thinking 🤗<br>Qwen/Qwen3-VL-8B-Instruct 🤗<br>Qwen/Qwen3-VL-8B-Thinking 🤗<br>Qwen/Qwen3-VL-30B-A3B-Instruct 🤗<br>Qwen/Qwen3-VL-30B-A3B-Thinking 🤗<br>Qwen/Qwen3-VL-32B-Instruct 🤗<br>Qwen/Qwen3-VL-32B-Thinking 🤗 | 🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗 |
| | Qwen3-Embedding | Alibaba Cloud | Qwen/Qwen3-Embedding-0.6B 🤗<br>Qwen/Qwen3-Embedding-4B 🤗<br>Qwen/Qwen3-Embedding-8B 🤗 | 🤗<br>🤗<br>🤗 |
| | Qwen3-Reranker | Alibaba Cloud | Qwen/Qwen3-Reranker-0.6B 🤗<br>Qwen/Qwen3-Reranker-4B 🤗<br>Qwen/Qwen3-Reranker-8B 🤗 | 🤗<br>🤗<br>🤗 |
| Intern | InternLM | Shanghai AI Laboratory | internlm/internlm-7b 🤗<br>internlm/internlm-chat-7b 🤗 | 🤗<br>🤗 |
| | InternLM2 | Shanghai AI Laboratory | internlm/internlm2-1_8b 🤗<br>internlm/internlm2-chat-1_8b 🤗<br>internlm/internlm2-7b 🤗<br>internlm/internlm2-chat-7b 🤗<br>internlm/internlm2-20b 🤗<br>internlm/internlm2-chat-20b 🤗 | 🤗<br>🤗<br>🤗<br>🤗 |
| | InternLM2.5 | Shanghai AI Laboratory | internlm/internlm2_5-7b 🤗<br>internlm/internlm2_5-7b-chat 🤗<br>internlm/internlm2_5-7b-chat-1m 🤗 | 🤗<br>🤗<br>🤗 |
| | InternLM3 | Shanghai AI Laboratory | internlm/internlm3-8b-instruct 🤗 | 🤗 |
| | InternVL1.0-1.5 | Shanghai AI Laboratory | OpenGVLab/Mini-InternVL-Chat-4B-V1-5 🤗<br>OpenGVLab/Mini-InternVL-Chat-2B-V1-5 🤗 | to be added |
| | InternVL2.0 | Shanghai AI Laboratory | OpenGVLab/InternVL2-1B 🤗<br>OpenGVLab/InternVL2-2B 🤗<br>OpenGVLab/InternVL2-4B 🤗<br>OpenGVLab/InternVL2-8B 🤗 | to be added |
| | InternVL2.5 | Shanghai AI Laboratory | OpenGVLab/InternVL2_5-1B 🤗<br>OpenGVLab/InternVL2_5-2B 🤗<br>OpenGVLab/InternVL2_5-4B 🤗<br>OpenGVLab/InternVL2_5-8B 🤗 | 🤗<br>to be added<br>to be added<br>to be added |
| Falcon | Falcon | tiiuae | tiiuae/falcon-rw-1b 🤗<br>tiiuae/falcon-7b 🤗<br>tiiuae/falcon-7b-instruct 🤗 | 🤗<br>🤗<br>🤗 |
| DeepSeek | DeepSeek-MoE | DeepSeek | deepseek-ai/deepseek-moe-16b-base 🤗<br>deepseek-ai/deepseek-moe-16b-chat 🤗 | 🤗<br>🤗 |
| | DeepSeek-LLM | DeepSeek | deepseek-ai/deepseek-llm-7b-base 🤗<br>deepseek-ai/deepseek-llm-7b-chat 🤗 | 🤗<br>🤗 |
| | DeepSeek-V2 | DeepSeek | deepseek-ai/DeepSeek-V2-Lite 🤗<br>deepseek-ai/DeepSeek-V2-Lite-Chat 🤗 | 🤗<br>🤗 |
| | DeepSeek-Coder | DeepSeek | deepseek-ai/deepseek-coder-1.3b-base 🤗<br>deepseek-ai/deepseek-coder-1.3b-instruct 🤗<br>deepseek-ai/deepseek-coder-6.7b-base 🤗<br>deepseek-ai/deepseek-coder-6.7b-instruct 🤗<br>deepseek-ai/deepseek-coder-7b-base-v1.5 🤗<br>deepseek-ai/deepseek-coder-7b-instruct-v1.5 🤗 | 🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗 |
| | DeepSeek-Coder-V2 | DeepSeek | deepseek-ai/DeepSeek-Coder-V2-Lite-Base 🤗<br>deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct 🤗 | 🤗<br>🤗 |
| | DeepSeek-Math | DeepSeek | deepseek-ai/deepseek-math-7b-base 🤗<br>deepseek-ai/deepseek-math-7b-instruct 🤗<br>deepseek-ai/deepseek-math-7b-rl 🤗 | 🤗<br>🤗<br>🤗 |
| | DeepSeek-R1 | DeepSeek | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B 🤗<br>deepseek-ai/DeepSeek-R1-Distill-Qwen-7B 🤗<br>deepseek-ai/DeepSeek-R1-Distill-Llama-8B 🤗<br>deepseek-ai/DeepSeek-R1-Distill-Qwen-14B 🤗<br>deepseek-ai/DeepSeek-R1-Distill-Qwen-32B 🤗<br>deepseek-ai/DeepSeek-R1-0528-Qwen3-8B 🤗 | 🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗 |
| Seed-OSS | Seed-OSS | ByteDance | ByteDance-Seed/Seed-OSS-36B-Instruct 🤗<br>ByteDance-Seed/Seed-OSS-36B-Base 🤗<br>ByteDance-Seed/Seed-OSS-36B-Base-woSyn 🤗 | |
| Ernie4_5 | Ernie4_5 | Baidu | baidu/ERNIE-4.5-0.3B-Base-PT 🤗<br>baidu/ERNIE-4.5-0.3B-PT 🤗<br>baidu/ERNIE-4.5-21B-A3B-Base-PT 🤗<br>baidu/ERNIE-4.5-21B-A3B-PT 🤗<br>baidu/ERNIE-4.5-VL-28B-A3B-Base-PT 🤗<br>baidu/ERNIE-4.5-VL-28B-A3B-PT 🤗 | 🤗<br>🤗 |
| PaddleOCR | PaddleOCR-VL | Baidu | PaddlePaddle/PaddleOCR-VL 🤗 | 🤗 |
| | PaddleOCR-VL-1.5 | Baidu | PaddlePaddle/PaddleOCR-VL-1.5 🤗 | 🤗 |
| MiniCPM | MiniCPM | OpenBMB | openbmb/MiniCPM-2B-sft-bf16 🤗<br>openbmb/MiniCPM-2B-dpo-bf16 🤗<br>openbmb/MiniCPM-2B-128k 🤗<br>openbmb/MiniCPM-1B-sft-bf16 🤗<br>openbmb/MiniCPM3-4B 🤗<br>openbmb/MiniCPM4-0.5B 🤗<br>openbmb/MiniCPM4-8B 🤗 | 🤗<br>🤗<br>🤗<br>🤗<br>to be added<br>to be added<br>to be added |
| | MiniCPM-o | OpenBMB | openbmb/MiniCPM-Llama3-V-2_5 🤗<br>openbmb/MiniCPM-V-2_6 🤗<br>openbmb/MiniCPM-o-2_6 🤗<br>openbmb/MiniCPM-V-4 🤗 | 🤗<br>🤗<br>to be added<br>to be added |
| embedding | text2vec-base-chinese | shibing624 | shibing624/text2vec-base-chinese 🤗 | 🤗 |
| | m3e | moka-ai | moka-ai/m3e-base 🤗 | 🤗 |
| | bge | BAAI | BAAI/bge-large-en-v1.5 🤗<br>BAAI/bge-large-zh-v1.5 🤗<br>BAAI/bge-base-en-v1.5 🤗<br>BAAI/bge-base-zh-v1.5 🤗<br>BAAI/bge-small-en-v1.5 🤗<br>BAAI/bge-small-zh-v1.5 🤗 | 🤗<br>🤗<br>🤗<br>🤗<br>🤗<br>🤗 |
| | gte | thenlper | thenlper/gte-large-zh 🤗<br>thenlper/gte-base-zh 🤗 | 🤗<br>🤗 |

*Notes:

1. Entries shown in highlighted format (e.g. `bert-base-chinese`) can be downloaded directly from the hub with `build_transformer_model()`.

2. To speed up downloads through a mirror site (within mainland China), either:

   - `HF_ENDPOINT=https://hf-mirror.com python your_script.py`
   - run `export HF_ENDPOINT=https://hf-mirror.com` before executing your Python code, or
   - set it at the top of your Python code:

     ```python
     import os
     os.environ['HF_ENDPOINT'] = "https://hf-mirror.com"
     ```

## 6. Acknowledgements

- Thanks to Su Jianlin for his bert4keras; this implementation consults the bert4keras source code in many places, and I sincerely thank him for his selfless contribution.
- Thanks also to the bert4pytorch project, which gave me the idea and direction for reimplementing bert4keras in pytorch.

## 7. Citation

```bibtex
@misc{bert4torch,
  title={bert4torch},
  author={Bo Li},
  year={2022},
  howpublished={\url{https://github.com/Tongjilibo/bert4torch}},
}
```

## 8. Miscellaneous

- WeChat & Star History Chart
- The WeChat group has passed 200 members (WeChat's invite limit), so to join, add my personal WeChat and include the note: bert4torch-name-company

*(images: personal WeChat QR code · WeChat group QR code · Star History Chart)*