PaddleSharp 🌟 [](https://github.com/sdcb/PaddleSharp/actions/workflows/main.yml) [](https://jq.qq.com/?_wv=1027&k=K4fBqpyQ)

July 4, 2025

💗 .NET wrapper for the PaddleInference C API, supporting Windows (x64) 💻, NVIDIA CUDA 11.8+ GPUs 🎮, and Linux (Ubuntu 22.04 x64) 🐧. It currently contains the following main components:

NuGet Packages/Docker Images 📦

Release notes 📝

Please check out this page 📄.

Infrastructure packages 🏗️

| NuGet Package 💼 | Version 📌 | Description 📚 |
| --- | --- | --- |
| Sdcb.PaddleInference | NuGet | Paddle Inference C API .NET binding ⚙️ |

Native Packages 🏗️

| Package | Version 📌 | Description |
| --- | --- | --- |
| Sdcb.PaddleInference.runtime.win64.mkl | NuGet | Recommended for most users (CPU, MKL) |
| Sdcb.PaddleInference.runtime.win64.openblas | NuGet | CPU, OpenBLAS |
| Sdcb.PaddleInference.runtime.win64.openblas-noavx | NuGet | CPU, no AVX, for old CPUs |
| Sdcb.PaddleInference.runtime.win64.cu118_cudnn89_sm61 | NuGet | CUDA 11.8, GTX 10 Series |
| Sdcb.PaddleInference.runtime.win64.cu118_cudnn89_sm75 | NuGet | CUDA 11.8, RTX 20/GTX 16xx Series |
| Sdcb.PaddleInference.runtime.win64.cu118_cudnn89_sm86 | NuGet | CUDA 11.8, RTX 30 Series |
| Sdcb.PaddleInference.runtime.win64.cu118_cudnn89_sm89 | NuGet | CUDA 11.8, RTX 40 Series |
| Sdcb.PaddleInference.runtime.win64.cu126_cudnn95_sm61 | NuGet | CUDA 12.6, GTX 10 Series |
| Sdcb.PaddleInference.runtime.win64.cu126_cudnn95_sm75 | NuGet | CUDA 12.6, RTX 20/GTX 16xx Series |
| Sdcb.PaddleInference.runtime.win64.cu126_cudnn95_sm86 | NuGet | CUDA 12.6, RTX 30 Series |
| Sdcb.PaddleInference.runtime.win64.cu126_cudnn95_sm89 | NuGet | CUDA 12.6, RTX 40 Series |
| Sdcb.PaddleInference.runtime.win64.cu129_cudnn910_sm61 | NuGet | CUDA 12.9, GTX 10 Series |
| Sdcb.PaddleInference.runtime.win64.cu129_cudnn910_sm75 | NuGet | CUDA 12.9, RTX 20/GTX 16xx Series |
| Sdcb.PaddleInference.runtime.win64.cu129_cudnn910_sm86 | NuGet | CUDA 12.9, RTX 30 Series |
| Sdcb.PaddleInference.runtime.win64.cu129_cudnn910_sm89 | NuGet | CUDA 12.9, RTX 40 Series |
| Sdcb.PaddleInference.runtime.win64.cu129_cudnn910_sm120 | NuGet | CUDA 12.9, RTX 50 Series |
| Sdcb.PaddleInference.runtime.linux-x64.openblas | NuGet | Linux x64, OpenBLAS |
| Sdcb.PaddleInference.runtime.linux-x64.mkl | NuGet | Linux x64, MKL |
| Sdcb.PaddleInference.runtime.linux-x64 | NuGet | Linux x64, MKL+OpenVINO |
| Sdcb.PaddleInference.runtime.linux-arm64 | NuGet | Linux ARM64 |
| Sdcb.PaddleInference.runtime.osx-x64 | NuGet | macOS x64, includes ONNX Runtime |
| Sdcb.PaddleInference.runtime.osx-arm64 | NuGet | macOS ARM64 |

Package Selection Guide:

  • We recommend Sdcb.PaddleInference.runtime.win64.mkl for most users; it offers the best balance between performance and package size. Note that this package does not support GPU acceleration, but it is suitable for most general scenarios.
  • openblas-noavx is tailored for older CPUs that do not support the AVX2 instruction set.
  • The remaining packages cover various CUDA combinations (GPU acceleration), supporting three CUDA versions:
    • CUDA 11.8: Supports 10โ€“40 series NVIDIA GPUs
    • CUDA 12.6: Supports 10โ€“40 series NVIDIA GPUs
    • CUDA 12.9: Supports 10โ€“50 series NVIDIA GPUs

Important:
Not all GPU packages are suitable for every card. Please refer to the following GPU-to-sm suffix mapping:

| sm Suffix | Supported GPU Series |
| --- | --- |
| sm61 | GTX 10 Series |
| sm75 | RTX 20 Series (and GTX 16xx series such as GTX 1660) |
| sm86 | RTX 30 Series |
| sm89 | RTX 40 Series |
| sm120 | RTX 50 Series (supported by CUDA 12.9 only) |

Any other package that starts with Sdcb.PaddleInference.runtime might be deprecated.

All packages were compiled manually by me, with some code patches from here: https://github.com/sdcb/PaddleSharp/blob/master/build/capi.patch

Paddle Devices

  • Mkldnn - PaddleDevice.Mkldnn()

    Based on MKLDNN; generally fast

  • Openblas - PaddleDevice.Openblas()

    Based on OpenBLAS; slower, but with smaller dependency files and lower memory consumption

  • Onnx - PaddleDevice.Onnx()

    Based on ONNX Runtime; also fairly fast, and consumes less memory

  • Gpu - PaddleDevice.Gpu()

    Much faster, but relies on an NVIDIA GPU and CUDA

    If you want to use the GPU, refer to the FAQ section "How to enable GPU?"; CUDA/cuDNN/TensorRT need to be installed manually.
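As a sketch, one of the configurators above is passed wherever the library asks for a device. The `Action<PaddleConfig>` delegate type and the `PADDLE_USE_GPU` opt-in variable below are assumptions for illustration and may differ across package versions:

```csharp
using System;
using Sdcb.PaddleInference;

// Pick a device configurator; GPU needs CUDA/cuDNN installed (see FAQ),
// so gate it behind an opt-in environment variable here.
Action<PaddleConfig> device = Environment.GetEnvironmentVariable("PADDLE_USE_GPU") == "1"
    ? PaddleDevice.Gpu()
    : PaddleDevice.Mkldnn();

// Pass `device` to whichever predictor/consumer API accepts a device parameter.
```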

FAQ ❓

Why does my code run fine on my Windows machine, but throw DllNotFoundException on another machine? 💻

  1. Please ensure the latest Visual C++ Redistributable is installed on Windows (it is typically installed automatically if you have Visual Studio installed) 🛠️ Otherwise, it will fail with the following error (Windows only):

    DllNotFoundException: Unable to load DLL 'paddle_inference_c' or one of its dependencies (0x8007007E)
    

    If it's Unable to load DLL OpenCvSharpExtern.dll or one of its dependencies, then most likely Media Foundation is not installed on the Windows Server 2012 R2 machine.

  2. Many old CPUs do not support AVX instructions. Please ensure your CPU supports AVX, or download the x64-noavx-openblas DLLs and disable Mkldnn by using PaddleDevice.Openblas() 🚀

  3. If you're using Win7-x64, and your CPU does support AVX2, then you might also need to extract the following 3 DLLs into the C:\Windows\System32 folder to make it run: 💾

    • api-ms-win-core-libraryloader-l1-2-0.dll
    • api-ms-win-core-processtopology-obsolete-l1-1-0.dll
    • API-MS-Win-Eventing-Provider-L1-1-0.dll

    You can download these 3 DLLs here: win7-x64-onnxruntime-missing-dlls.zip ⬇️
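For the AVX issue in step 2, .NET can probe CPU support at runtime instead of letting the native library crash at load time. A minimal sketch (the `Action<PaddleConfig>` delegate type is an assumption and may differ by package version):

```csharp
using System;
using System.Runtime.Intrinsics.X86;
using Sdcb.PaddleInference;

// MKLDNN requires AVX; on older CPUs, probe support first and fall
// back to the OpenBLAS backend (pair it with the openblas-noavx
// runtime package so the native DLLs also avoid AVX instructions).
Action<PaddleConfig> device = Avx.IsSupported
    ? PaddleDevice.Mkldnn()
    : PaddleDevice.Openblas();
```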

How to enable GPU? 🎮

Enabling GPU support can significantly improve throughput and lower CPU usage. 🚀

Steps to use GPU in Windows:

  1. (For Windows) Install one of the Sdcb.PaddleInference.runtime.win64.cu* packages that matches your CUDA version and GPU (see the table above) instead of Sdcb.PaddleInference.runtime.win64.mkl; do not install both. 📦
  2. Install CUDA from NVIDIA, and add it to the PATH (Windows) or LD_LIBRARY_PATH (Linux) environment variable 🔧
  3. Install cuDNN from NVIDIA, and add it to the PATH (Windows) or LD_LIBRARY_PATH (Linux) environment variable 🛠️
  4. Install TensorRT from NVIDIA, and add it to the PATH (Windows) or LD_LIBRARY_PATH (Linux) environment variable ⚙️
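As an optional sanity check, the PATH configuration from steps 2-4 can be verified from .NET before Paddle tries to load the native libraries. A sketch for Windows; the `cudart64`/`cudnn64` DLL name prefixes are assumptions, and exact file names vary by CUDA/cuDNN version:

```csharp
using System;
using System.IO;
using System.Linq;

// Scan every PATH directory for CUDA runtime and cuDNN DLLs so a
// missing environment-variable entry is caught early with a clear message.
string[] prefixes = { "cudart64", "cudnn64" };
string[] dirs = (Environment.GetEnvironmentVariable("PATH") ?? "")
    .Split(Path.PathSeparator, StringSplitOptions.RemoveEmptyEntries);

foreach (string prefix in prefixes)
{
    bool found = dirs
        .Where(Directory.Exists)
        .SelectMany(d => Directory.EnumerateFiles(d, prefix + "*.dll"))
        .Any();
    Console.WriteLine($"{prefix}*.dll {(found ? "found" : "NOT found")} on PATH");
}
```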

You can refer to this blog post for GPU on Windows: 关于PaddleSharp GPU使用 常见问题记录 (Notes on common issues with PaddleSharp GPU usage) 📝

If you're using Linux, you need to compile your own OpenCvSharp4 environment following the docker build scripts, and complete the CUDA/cuDNN/TensorRT configuration tasks. 🐧

After these steps are completed, you can try specifying PaddleDevice.Gpu() as the paddle device configuration parameter, then enjoy the performance boost! 🎉

Thanks & Sponsors 🙏

Contact 📞

QQ group for C#/.NET computer vision technical communication: 579060605