cpp-api-reference.md
April 30, 2024
This section introduces some public classes and functions of PPLNN.
For brevity, we assume that using namespace ppl::nn; is always used in the following code snippets.
Engine and Ops
Defined in include/ppl/nn/engines/engine.h.
An Engine is a collection of op implementations that run on a specific device, such as a CPU or an NVIDIA GPU.
Functions
ppl::common::RetCode Configure(uint32_t option, ...);
Sets various options for this engine. Parameters vary depending on the first parameter option.
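Because the arguments after option depend on its value, each option is called with its own argument list. The following is a minimal sketch of that variadic calling convention using a stand-in class. SketchEngine, SketchRetCode, OPT_SET_THREADS, OPT_SET_NAME and OPT_MAX are hypothetical names invented for illustration; they are not PPLNN options, and the real option ids are defined in each engine's headers.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdint>
#include <string>

// Hypothetical option ids for illustration only; these are NOT PPLNN options.
enum : uint32_t { OPT_SET_THREADS = 0, OPT_SET_NAME = 1, OPT_MAX = 2 };

// Stand-in for ppl::common::RetCode.
enum SketchRetCode { SKETCH_SUCCESS = 0, SKETCH_INVALID_VALUE = 1 };

// Stand-in class; it only demonstrates the calling convention.
class SketchEngine {
public:
    // `option` decides how many trailing arguments are read and with which types.
    SketchRetCode Configure(uint32_t option, ...) {
        if (option >= OPT_MAX) {
            return SKETCH_INVALID_VALUE; // unknown option: read no arguments
        }
        va_list args;
        va_start(args, option);
        switch (option) {
            case OPT_SET_THREADS: // expects one uint32_t
                threads_ = va_arg(args, uint32_t);
                break;
            case OPT_SET_NAME: // expects one const char*
                name_ = va_arg(args, const char*);
                break;
            default:
                assert(false && "unreachable: option was range-checked above");
                break;
        }
        va_end(args);
        return SKETCH_SUCCESS;
    }

    uint32_t threads() const { return threads_; }
    const std::string& name() const { return name_; }

private:
    uint32_t threads_ = 1;
    std::string name_;
};
```

The compiler cannot check variadic arguments, so passing a value whose type differs from what the selected option expects is undefined behavior. Checking the returned code after every Configure() call is the only feedback available.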
x86::EngineFactory
A built-in engine factory that is used to create engines running on x86-compatible CPUs.
If you want to use built-in op implementations, you should call x86::RegisterBuiltinOps() manually.
Functions
Engine* x86::EngineFactory::Create(const x86::EngineOptions& options);
Creates an X86 engine instance with the specified options.
cuda::EngineFactory
A built-in engine factory that is used to create engines running on NVIDIA GPUs.
If you want to use built-in op implementations, you should call cuda::RegisterBuiltinOps() manually.
Functions
Engine* cuda::EngineFactory::Create(const cuda::EngineOptions& options);
Creates a CUDA engine instance with the specified options.
arm::EngineFactory
A built-in engine factory that is used to create engines running on ARM AArch64 CPUs.
If you want to use built-in op implementations, you should call arm::RegisterBuiltinOps() manually.
Functions
Engine* arm::EngineFactory::Create(const arm::EngineOptions& options);
Creates an ARM engine instance with the specified options.
riscv::EngineFactory
A built-in engine factory that is used to create engines running on riscv64 CPUs.
If you want to use built-in op implementations, you should call riscv::RegisterBuiltinOps() manually.
Functions
Engine* riscv::EngineFactory::Create(const riscv::EngineOptions& options);
Creates a RISC-V engine instance with the specified options.
onnx::RuntimeBuilderFactory
Defined in include/ppl/nn/models/onnx/runtime_builder_factory.h.
Functions
onnx::RuntimeBuilder* Create();
Creates an onnx::RuntimeBuilder instance.
onnx::RuntimeBuilder
Defined in include/ppl/nn/models/onnx/runtime_builder.h.
onnx::RuntimeBuilder is used to create Runtime instances.
Functions
ppl::common::RetCode LoadModel(const char* model_file);
ppl::common::RetCode LoadModel(const char* model_buf, uint64_t buf_len, const char* model_file_dir = nullptr);
Initializes an onnx::RuntimeBuilder instance from an ONNX model file or buffer. model_file_dir is used to parse external data and can be nullptr if there is no external data.
struct Resources final {
/** `engines` are used to evaluate the compute graph. Note that callers should guarantee that engines are valid during inferencing. */
Engine** engines;
uint32_t engine_num;
};
ppl::common::RetCode SetResources(const Resources&);
Sets the resources needed for preprocessing and evaluating models.
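Resources holds only raw pointers, so the caller has to keep both the engines and the pointer array alive for as long as inference may run. The sketch below shows one way to tie those lifetimes together. Engine and Resources here are stand-ins that mirror the declarations above, and EngineSet is a hypothetical helper that is not part of PPLNN.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Stand-ins mirroring the documented declarations (not the real PPLNN types).
struct Engine {
    virtual ~Engine() {}
};
struct Resources final {
    Engine** engines;
    uint32_t engine_num;
};

// Hypothetical helper: owns the engines and exposes them as a Resources view.
class EngineSet {
public:
    void Add(std::unique_ptr<Engine> e) {
        assert(e != nullptr);
        raw_.push_back(e.get());
        owned_.push_back(std::move(e));
    }

    // The returned view points into this object. It becomes invalid if Add()
    // is called afterwards or if the EngineSet is destroyed.
    Resources AsResources() {
        Resources r;
        r.engines = raw_.data();
        r.engine_num = static_cast<uint32_t>(raw_.size());
        return r;
    }

private:
    std::vector<std::unique_ptr<Engine>> owned_; // keeps the engines alive
    std::vector<Engine*> raw_;                   // contiguous array for Resources
};
```

With this arrangement, declaring the EngineSet before the builder and the runtimes means it is destroyed after them, which matches the validity requirement stated in the struct comment.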
ppl::common::RetCode Preprocess();
Prepares the builder for creating Runtime instances.
Runtime* CreateRuntime() const;
Creates a Runtime instance which is used to evaluate a compute graph. This function is thread-safe.
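The functions above are typically called in the order they are listed: LoadModel(), SetResources(), Preprocess(), then CreateRuntime(). The following is a sketch of that sequence with an error check after each step. It compiles on its own because RetCode, Engine, Resources, Runtime and MockBuilder are stand-ins defined in the snippet. The mock rejects out-of-order calls, which is an assumption made for the demo and not documented behavior of the real builder.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the PPLNN types; the values are illustrative.
enum RetCode { RC_SUCCESS = 0, RC_INVALID_VALUE = 1 };
struct Engine {};
struct Runtime {};
struct Resources final {
    Engine** engines;
    uint32_t engine_num;
};

// Mock builder exposing the documented method names.
class MockBuilder {
public:
    RetCode LoadModel(const char* model_file) {
        if (!model_file) return RC_INVALID_VALUE;
        stage_ = 1;
        return RC_SUCCESS;
    }
    RetCode SetResources(const Resources& r) {
        if (stage_ != 1 || r.engine_num == 0) return RC_INVALID_VALUE;
        stage_ = 2;
        return RC_SUCCESS;
    }
    RetCode Preprocess() {
        if (stage_ != 2) return RC_INVALID_VALUE;
        stage_ = 3;
        return RC_SUCCESS;
    }
    // The caller owns the returned object in this mock.
    Runtime* CreateRuntime() const {
        return stage_ == 3 ? new Runtime() : nullptr;
    }

private:
    int stage_ = 0;
};

// Runs the four steps and returns nullptr as soon as any step fails.
template <typename Builder>
Runtime* BuildRuntime(Builder* builder, const char* model_file, const Resources& res) {
    assert(builder != nullptr);
    if (builder->LoadModel(model_file) != RC_SUCCESS) return nullptr;
    if (builder->SetResources(res) != RC_SUCCESS) return nullptr;
    if (builder->Preprocess() != RC_SUCCESS) return nullptr;
    return builder->CreateRuntime();
}
```

Since CreateRuntime() is thread-safe, one preprocessed builder can hand out a separate Runtime to each worker thread.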
pmx::RuntimeBuilderFactory
Defined in include/ppl/nn/models/pmx/runtime_builder_factory.h.
Functions
pmx::RuntimeBuilder* Create();
Creates a pmx::RuntimeBuilder instance.
pmx::RuntimeBuilder
Defined in include/ppl/nn/models/pmx/runtime_builder.h.
pmx::RuntimeBuilder is used to create Runtime instances.
Functions
ppl::common::RetCode LoadModel(const char* model_file);
ppl::common::RetCode LoadModel(const char* model_buf, uint64_t buf_len);
Initializes a pmx::RuntimeBuilder instance from a PMX model file or buffer.
struct Resources final {
/** `engines` are used to evaluate the compute graph. Note that callers should guarantee that engines are valid during inferencing. */
Engine** engines;
uint32_t engine_num;
};
ppl::common::RetCode SetResources(const Resources&);
Sets the resources needed for preprocessing and evaluating models.
ppl::common::RetCode Preprocess();
Prepares the builder for creating Runtime instances.
Runtime* CreateRuntime() const;
Creates a Runtime instance which is used to evaluate a compute graph. This function is thread-safe.
Runtime
Defined in include/ppl/nn/runtime/runtime.h.
Runtime is the main structure for evaluating a model.
Functions
ppl::common::RetCode ReserveTensor(const char* tensor_name);
Marks the tensor specified by tensor_name as reserved so that it is not fused or reused during inference.
uint32_t GetInputCount() const;
Returns the number of inputs of the associated graph.
Tensor* GetInputTensor(uint32_t idx) const;
Returns the input tensor at position idx. Note that idx should be less than the number of inputs.
ppl::common::RetCode Run();
Runs the model with given inputs. Input data MUST be filled via the returned value of GetInputTensor() before calling this function.
uint32_t GetOutputCount() const;
Returns the number of outputs of the associated graph.
Tensor* GetOutputTensor(uint32_t idx) const;
Returns the output tensor at position idx. Note that idx should be less than the number of outputs.
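The input, run and output functions above combine into a fill, run, read loop. The sketch below writes that loop as a template that relies only on the method names documented here. RetCode, MockTensor and MockRuntime are stand-ins so the snippet compiles on its own, and RunWithInputs is a hypothetical helper. The mock tensors are pre-allocated; with the real API, input buffers must already be allocated before CopyFromHost() is called.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for ppl::common::RetCode; the values are illustrative.
enum RetCode { RC_SUCCESS = 0, RC_INVALID_VALUE = 1 };

// Mock tensor holding a fixed, pre-allocated float buffer.
class MockTensor {
public:
    explicit MockTensor(size_t n) : data_(n, 0.0f) {}
    RetCode CopyFromHost(const void* src) {
        if (!src) return RC_INVALID_VALUE;
        std::memcpy(data_.data(), src, data_.size() * sizeof(float));
        return RC_SUCCESS;
    }
    RetCode CopyToHost(void* dst) const {
        if (!dst) return RC_INVALID_VALUE;
        std::memcpy(dst, data_.data(), data_.size() * sizeof(float));
        return RC_SUCCESS;
    }
    std::vector<float>& data() { return data_; }

private:
    std::vector<float> data_;
};

// Mock runtime with one input and one output; Run() doubles each element.
class MockRuntime {
public:
    MockRuntime() : in_(3), out_(3) {}
    uint32_t GetInputCount() const { return 1; }
    uint32_t GetOutputCount() const { return 1; }
    MockTensor* GetInputTensor(uint32_t idx) const { return idx < 1 ? &in_ : nullptr; }
    MockTensor* GetOutputTensor(uint32_t idx) const { return idx < 1 ? &out_ : nullptr; }
    RetCode Run() {
        for (size_t i = 0; i < in_.data().size(); ++i) {
            out_.data()[i] = 2.0f * in_.data()[i];
        }
        return RC_SUCCESS;
    }

private:
    mutable MockTensor in_;
    mutable MockTensor out_;
};

// Fills every input from host memory, then runs the model.
template <typename RuntimeT>
RetCode RunWithInputs(RuntimeT* runtime, const std::vector<const void*>& inputs) {
    assert(runtime != nullptr);
    if (inputs.size() != runtime->GetInputCount()) return RC_INVALID_VALUE;
    for (uint32_t i = 0; i < runtime->GetInputCount(); ++i) { // idx stays below the count
        RetCode rc = runtime->GetInputTensor(i)->CopyFromHost(inputs[i]);
        if (rc != RC_SUCCESS) return rc;
    }
    return runtime->Run();
}
```

After a successful run, results are read the same way in reverse: loop idx from 0 up to GetOutputCount() and call CopyToHost() on each tensor returned by GetOutputTensor(idx).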
uint32_t GetDeviceContextCount() const;
Returns the number of DeviceContext instances used by this Runtime instance.
DeviceContext* GetDeviceContext(uint32_t idx) const;
Returns the DeviceContext at position idx. Note that idx should be less than GetDeviceContextCount().
ppl::common::RetCode GetProfilingStatistics(ProfilingStatistics*) const;
Returns profiling statistics of each kernel. Note that this function is available only if PPLNN_ENABLE_KERNEL_PROFILING is enabled.
Tensor* GetTensor(const char* name) const;
Returns the tensor with the specified name, or nullptr if not found.
Tensor
Defined in include/ppl/nn/runtime/tensor.h.
This structure represents the input/output data.
Functions
ppl::common::TensorShape* GetShape() const;
Returns the shape of this tensor.
ppl::common::RetCode CopyToHost(void* dst) const;
Copies the tensor's data to dst, which points to host memory. Note that dst MUST have enough space.
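To give dst enough space, the host buffer has to be sized from the tensor's shape and element type before the copy. The helper below is a rough sketch of that arithmetic for a densely packed layout with no padding. DenseBytes is a hypothetical function, not part of PPLNN; in real code the byte count should come from the TensorShape returned by GetShape().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes needed for a densely packed tensor with the given dimensions.
// Returns 0 if any dimension is negative (treated here as unknown).
// An empty `dims` is treated as a scalar and yields `elem_size`.
inline size_t DenseBytes(const std::vector<int64_t>& dims, size_t elem_size) {
    assert(elem_size > 0);
    size_t count = 1;
    for (int64_t d : dims) {
        if (d < 0) return 0;
        count *= static_cast<size_t>(d);
    }
    return count * elem_size;
}
```

For example, a float tensor of shape 1x3x224x224 needs 1 * 3 * 224 * 224 * 4 = 602112 bytes, so a std::vector<char> of that size can be allocated and its data() pointer passed to CopyToHost().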
ppl::common::RetCode CopyFromHost(const void* src);
Copies data from src, which points to host memory, to the inner buffer. Note that the inner buffer MUST be allocated before calling this function.
ppl::common::RetCode ConvertToHost(void* dst, const ppl::common::TensorShape& dst_desc) const;
Converts tensor's data to dst with the shape dst_desc.
ppl::common::RetCode ConvertFromHost(const void* src, const ppl::common::TensorShape& src_desc);
Converts data to inner buffer from src with the shape src_desc. Note that inner buffer MUST be allocated before calling this function.
DeviceContext* GetDeviceContext() const;
Returns the context of the underlying Device.
void SetDeviceContext(DeviceContext* ctx);
Sets the DeviceContext of this tensor to ctx. Note that this tensor's buffer will be released before ctx is set.
void SetBufferPtr(void* buf);
Sets the underlying buffer pointer. Note that buf can be read/written by the internal Device class.
void* GetBufferPtr() const;
Returns the underlying buffer pointer.
TensorShape
Defined in include/ppl/nn/common/tensor_shape.h.
Logger
Defined in include/ppl/nn/common/logger.h.
void SetCurrentLogger(Logger*);
Sets the global logger for PPLNN internal logging.
Logger* GetCurrentLogger();
Returns the logger for PPLNN internal logging. The default is a StdioLogger that prints logs to stdout/stderr.
void Logger::SetLogLevel(uint32_t);
uint32_t Logger::GetLogLevel() const;
Sets/gets the logging level. Messages whose level is lower than the value returned by Logger::GetLogLevel() are not logged.
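The level acts as a threshold: a message is emitted only when its level is at least the configured one. The sketch below models that rule with a stand-in class. SketchLogger, ShouldLog and the LEVEL_* constants are hypothetical names with illustrative values; the real level constants are defined in logger.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical level values, ordered from least to most severe.
enum : uint32_t { LEVEL_DEBUG = 0, LEVEL_INFO = 1, LEVEL_WARNING = 2, LEVEL_ERROR = 3 };

// Stand-in logger that models only the threshold rule.
class SketchLogger {
public:
    void SetLogLevel(uint32_t level) {
        assert(level <= LEVEL_ERROR); // the sketch knows only these four levels
        level_ = level;
    }
    uint32_t GetLogLevel() const { return level_; }

    // True when a message at `message_level` passes the threshold.
    bool ShouldLog(uint32_t message_level) const { return message_level >= level_; }

private:
    uint32_t level_ = LEVEL_DEBUG; // log everything by default in this sketch
};
```

Raising the level to the warning value, for instance, would suppress debug and info messages while keeping warnings and errors.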
The full API reference (in HTML format) can be generated by running doxygen docs/Doxyfile.