ncnn

May 26, 2026

ncnn is a high-performance neural network inference framework optimized for mobile, embedded, and desktop deployment. It has no third-party runtime dependencies, runs across CPU and Vulkan GPU backends, and provides tools such as pnnx for converting PyTorch and ONNX models to ncnn. Developers can deploy deep learning models efficiently on phones, PCs, browsers, and edge devices. ncnn is currently being used in many Tencent applications, such as QQ, Qzone, WeChat, Pitu, and so on.



Quick Start

The recommended beginner path is PyTorch -> pnnx -> ncnn.

Install pnnx in a PyTorch environment

pip3 install pnnx

Export a PyTorch model to ncnn

import torch
import torch.nn as nn
import pnnx

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 1)
        self.relu = nn.ReLU()
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        x = self.conv(x)
        x = self.relu(x)
        x = x.mean((2, 3))
        return self.fc(x)

model = Model().eval()

x = torch.rand(1, 3, 224, 224)
pnnx.export(model, "model.pt", (x,))

This generates model.ncnn.param and model.ncnn.bin.
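The blob names passed to the extractor (in0, out0) come from the text .param file. As a rough sketch of how to read them without any ncnn tooling, the parser below assumes the plain-text layout described in the param and model file spec: a magic line, a "layer_count blob_count" line, then one layer per line. The sample text is hand-written for illustration and is not real pnnx output; Layer, parse_param, and graph_io are hypothetical helper names.

```python
from dataclasses import dataclass, field

PARAM_MAGIC = "7767517"  # first line of a text .param file

# Hand-written, simplified sample; the key=value parameters are illustrative only.
SAMPLE_PARAM = """\
7767517
4 4
Input         in0   0 1 in0
Convolution   conv  1 1 in0 1 0=8 1=1 5=1 6=24 9=1
Reduction     mean  1 1 1 2 0=3
InnerProduct  fc    1 1 2 out0 0=4 1=1 2=32
"""


@dataclass
class Layer:
    type: str
    name: str
    inputs: list
    outputs: list
    params: dict = field(default_factory=dict)


def parse_param(text):
    """Parse text .param content into a list of Layer records."""
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    if len(rows) < 2 or rows[0] != [PARAM_MAGIC]:
        raise ValueError("not a text ncnn .param file")
    layer_count, _blob_count = (int(v) for v in rows[1])
    layers = []
    for tok in rows[2:]:
        n_in, n_out = int(tok[2]), int(tok[3])
        blobs = tok[4:4 + n_in + n_out]
        # Everything after the blob names is treated as opaque key=value text.
        params = dict(kv.split("=", 1) for kv in tok[4 + n_in + n_out:])
        layers.append(Layer(tok[0], tok[1], blobs[:n_in], blobs[n_in:], params))
    if len(layers) != layer_count:
        raise ValueError("layer count does not match the header")
    return layers


def graph_io(layers):
    """Return (input blob names, output blob names) of the graph."""
    consumed = {b for layer in layers for b in layer.inputs}
    inputs = [b for layer in layers if layer.type == "Input" for b in layer.outputs]
    # A graph output is a blob that some layer produces and no layer consumes.
    outputs = [b for layer in layers for b in layer.outputs if b not in consumed]
    return inputs, outputs


layers = parse_param(SAMPLE_PARAM)
inputs, outputs = graph_io(layers)
print("inputs:", inputs, "outputs:", outputs)
```

For a real export, replace SAMPLE_PARAM with the contents of model.ncnn.param read from disk.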

Run with ncnn C++ API

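#include "net.h"

int main()
{
    ncnn::Net net;

    // Both loaders return 0 on success.
    if (net.load_param("model.ncnn.param") != 0) return -1;
    if (net.load_model("model.ncnn.bin") != 0) return -1;

    // Width 224, height 224, 3 channels; fill it, since a new Mat is uninitialized.
    ncnn::Mat in(224, 224, 3);
    in.fill(0.f);

    ncnn::Extractor ex = net.create_extractor();
    ex.input("in0", in);

    ncnn::Mat out;
    ex.extract("out0", out);

    return 0;
}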

Or use Python

import numpy as np
import ncnn

net = ncnn.Net()
net.load_param("model.ncnn.param")
net.load_model("model.ncnn.bin")

x = np.zeros((3, 224, 224), np.float32)
mat = ncnn.Mat(x)

ex = net.create_extractor()
ex.input("in0", mat)

ret, out = ex.extract("out0")
print(np.array(out).shape)
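The Python snippet ignores the status codes it receives: ret from extract, and the integer results of load_param and load_model. A minimal guard could look like the sketch below, assuming the Python binding follows the C++ convention that 0 means success. check is a hypothetical helper, not part of the ncnn package.

```python
def check(ret, what):
    """Raise if an ncnn-style status code signals failure (assumption: 0 == success)."""
    if ret != 0:
        raise RuntimeError(f"{what} failed with status {ret}")


# Intended use with the snippet above. These lines need the ncnn package and
# the exported model files, so they are left as comments here:
#   check(net.load_param("model.ncnn.param"), "load_param")
#   check(net.load_model("model.ncnn.bin"), "load_model")
#   ret, out = ex.extract("out0")
#   check(ret, "extract out0")
```

Failing early this way tends to surface a wrong blob name or a missing file at the call that caused it, rather than later as an empty output.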

For complete workflows, see the pnnx documentation, the guide on using ncnn with PyTorch or ONNX, the Python API reference, and the examples.


Community

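  • Technical discussion QQ group: 637093648 (lots of experts). Entry answer: 卷卷卷卷卷 (group is full)
  • Telegram group: https://t.me/ncnnyes
  • Discord channel: https://discord.gg/YRsxgmF
  • Pocky QQ group (MLIR YES!): 677104663 (lots of experts). Entry answer: multi-level intermediate representation
  • "They have no idea how handy pnnx is" QQ group: 818998520 (new group!)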

Download & Build status

https://github.com/Tencent/ncnn/releases/latest

The "how to build ncnn library" guide covers Linux / Windows / macOS / Raspberry Pi 3, Pi 4 / POWER / Android / NVIDIA Jetson / iOS / WebAssembly / AllWinner D1 / Loongson 2K1000.

Targets covered by release downloads and CI build status:

  • Source
  • Android, Android shared
  • HarmonyOS, HarmonyOS shared
  • iOS, iOS-Simulator
  • macOS, Mac-Catalyst
  • watchOS, watchOS-Simulator
  • tvOS, tvOS-Simulator
  • visionOS, visionOS-Simulator
  • Apple xcframework
  • Ubuntu 22.04, Ubuntu 24.04
  • Windows: VS2015, VS2017, VS2019, VS2022
  • WebAssembly
  • Linux (arm), Linux (aarch64), Linux (mips), Linux (mips64), Linux (ppc64), Linux (riscv64), Linux (loongarch64)

Build

Use the prebuilt packages above when possible. To build from source, see the full "how to build ncnn library" guide, which covers Linux, Windows, macOS, Android, iOS, WebAssembly, HarmonyOS, Raspberry Pi, Jetson, and embedded targets.

Common Linux build:

git clone --recursive https://github.com/Tencent/ncnn.git
cd ncnn
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DNCNN_VULKAN=ON -DNCNN_BUILD_EXAMPLES=ON ..
cmake --build . -j$(nproc)
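The configure line is where the Vulkan backend and the example programs are switched on. As a small sketch, the helpers below assemble the same two cmake invocations with those switches made explicit, which can be handy in a build script. configure_args and build_args are hypothetical names; only the options that appear in the commands above are used.

```python
import os


def on_off(enabled):
    """Render a boolean as the ON/OFF value CMake expects."""
    return "ON" if enabled else "OFF"


def configure_args(source_dir="..", build_type="Release", vulkan=True, examples=True):
    """Build the argv list for the configure step shown above."""
    return [
        "cmake",
        f"-DCMAKE_BUILD_TYPE={build_type}",
        f"-DNCNN_VULKAN={on_off(vulkan)}",
        f"-DNCNN_BUILD_EXAMPLES={on_off(examples)}",
        source_dir,
    ]


def build_args(jobs=None):
    """Build the argv list for the compile step; defaults to one job per CPU."""
    jobs = jobs or os.cpu_count() or 1
    return ["cmake", "--build", ".", f"-j{jobs}"]


# CPU-only configuration, for machines without a Vulkan SDK or driver.
print(" ".join(configure_args(vulkan=False)))
print(" ".join(build_args(jobs=4)))

# To actually run the steps from inside the build directory:
#   import subprocess
#   subprocess.run(configure_args(), check=True)
#   subprocess.run(build_args(), check=True)
```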

Model Conversion

| Source model | Recommended path | Docs |
| --- | --- | --- |
| PyTorch | pnnx.export(model, "model.pt", (input_tensor,)) or pnnx model.pt inputshape=[...] | pnnx, PyTorch / ONNX guide |
| ONNX | pnnx model.onnx | pnnx, onnx tools |
| ncnn model optimization | ncnnoptimize model.param model.bin new.param new.bin flag | quantization, model file spec |
| Legacy Caffe / MXNet / Darknet | Use compatibility converters when maintaining older models | caffe, mxnet, darknet, AlexNet legacy tutorial |

Use Netron to inspect .param, .onnx, and .pnnx.param graphs.
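The command-line paths in the table can be wrapped in a small dispatcher when converting many models. The sketch below only builds argument lists and does not run anything. pnnx_command and ncnnoptimize_command are hypothetical helpers, requiring an input shape for .pt files is an assumption taken from the table row, and the ncnnoptimize flag is passed through untouched because its meaning is defined by the tool's own documentation.

```python
from pathlib import Path


def pnnx_command(model_path, input_shape=None):
    """Return the pnnx argv list for a TorchScript (.pt) or ONNX (.onnx) file."""
    suffix = Path(model_path).suffix.lower()
    if suffix == ".onnx":
        return ["pnnx", model_path]
    if suffix == ".pt":
        if input_shape is None:
            raise ValueError("a .pt model needs an input shape, e.g. (1, 3, 224, 224)")
        shape = ",".join(str(dim) for dim in input_shape)
        return ["pnnx", model_path, f"inputshape=[{shape}]"]
    raise ValueError(f"unsupported model type: {suffix or model_path}")


def ncnnoptimize_command(param, model_bin, new_param, new_bin, flag):
    """Return the ncnnoptimize argv list in the argument order shown in the table."""
    return ["ncnnoptimize", param, model_bin, new_param, new_bin, str(flag)]


print(" ".join(pnnx_command("model.pt", (1, 3, 224, 224))))
print(" ".join(pnnx_command("model.onnx")))
```

The returned lists can be handed to subprocess.run(..., check=True) once pnnx or ncnnoptimize is on PATH.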


Features

  • No third-party runtime dependencies and no BLAS / NNPACK requirement.
  • Pure C++ implementation with C API and Python binding.
  • Optimized CPU inference for mobile and embedded processors, including ARM NEON and multi-core scheduling.
  • Vulkan GPU acceleration for supported platforms.
  • Low memory footprint with explicit blob/workspace allocator design.
  • Supports multi-input, multi-output, and multi-branch graphs.
  • PyTorch and ONNX conversion through pnnx, plus legacy converter support for older model formats.
  • Supports fp16 storage/arithmetic paths, int8 quantized inference, model optimization, and custom layers.
  • Direct memory reference loading for .param and .bin models.
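To give a feel for what int8 quantized inference trades away, here is a generic symmetric quantization sketch in plain Python. It is not ncnn's implementation: the scale here is taken from the largest absolute value in the list, an assumption made for simplicity, whereas the quantization documentation describes how ncnn's own tools derive scales.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] with one shared scale."""
    absmax = max(abs(v) for v in values)
    scale = 127.0 / absmax if absmax > 0 else 1.0
    quantized = [max(-127, min(127, round(v * scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the int8 codes."""
    return [q / scale for q in quantized]


weights = [2.0, -0.7, 0.3]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

print("codes:", codes)
print("max abs error:", max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored)))
```

Each value is stored in one byte instead of four, at the cost of a rounding error of at most half a quantization step (0.5 / scale) per value.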

Model and Workload Coverage

ncnn is still strong for classic and mobile CNN workloads, but current usage is broader than CNN-only deployment.

For operator-level detail, see supported PyTorch operator status, supported ONNX operator status, and operation param weight table.


Project Examples

| Area | Project |
| --- | --- |
| Image generation | zimage-ncnn-vulkan - Z-Image generation with ncnn and Vulkan |
| LLM / embedding / vision-language | ncnn_llm - LLM, embedding, and vision-language examples with ncnn |
| Android classification | ncnn-android-squeezenet |
| Android style transfer | ncnn-android-styletransfer |
| Android detection | ncnn-android-mobilenetssd, ncnn-android-yolov5, ncnn-android-yolov7, ncnn-android-scrfd |
| Face detection | mtcnn_ncnn |
| Qt / Android integration | qt_android_ncnn_lib_encrypt_example |
| Colorization | ncnn-colorization-siggraph17 |
| Fortran binding | ncnn-fortran |
| Speech recognition | sherpa - real-time speech recognition on embedded and mobile devices |

Documentation And FAQ

| Topic | Links |
| --- | --- |
| Build | how to build |
| PyTorch / ONNX conversion | use ncnn with PyTorch or ONNX, pnnx, PyTorch converter notes |
| API and examples | C++ examples, Python API, low-level operation API |
| Model format | param and model file spec, operation param weight table |
| Extension | custom layer guide, plugin tools |
| FAQ | deepwiki, throw error, wrong result, Vulkan |
| Legacy beginner material | use ncnn with AlexNet, AlexNet Chinese tutorial |

License

BSD 3-Clause