C++ inference with MNIST classification model.
February 25, 2022
Introduction
This example demonstrates the workflow for training a classification model in Python and executing it in C++. The instructions target Ubuntu 16.04, but a similar procedure should work on other operating systems once the build scripts are adapted to your OS. We plan to add more examples in the near future.
Install C++ libraries
Follow the installation manual of the C++ utility library.
Note: this example also requires the NNabla Python package to be installed.
Train a classification model in Python
First, train an MNIST classification model in Python. Example scripts for MNIST classification training are provided in the NNabla Examples repository. Clone or download it, then train a classification model with the following commands.
# at nnabla-examples/image-classification/mnist-collection/
python classification.py # Optionally you can use -c cudnn option.
After training finishes, you will find a parameter file named lenet_params_010000.h5 in the tmp.monitor folder.
Create NNP file
To execute your trained model from C++ code, the trained model parameters must be converted to the NNabla file format (NNP) together with a network definition. The NNabla file format can store network definitions, parameters, executor settings, and so on. We provide an example script (found in this folder) that creates an NNP file from the learned parameters and a Python script containing the model definition.
# at .
NNABLA_EXAMPLES_ROOT=<your local path to nnabla-examples> python save_nnp_classification.py
It reads the parameter file and creates a computation graph using the loaded parameters. The computation graph is used only to dump the network structure into the NNP file.
runtime_contents = {
    'networks': [
        {'name': 'runtime',
         'batch_size': args.batch_size,
         'outputs': {'y': pred},
         'names': {'x': image}}],
    'executors': [
        {'name': 'runtime',
         'network': 'runtime',
         'data': ['x'],
         'output': ['y']}]}
nn.utils.save.save(nnp_file, runtime_contents)
In the code above, the network structure, including its parameters, and the execution configuration are saved into the NNP file lenet_010000.nnp. The contents are described in a JSON-like format. In the networks field, you add a network named runtime. It has a default batch size. The computation graph is specified by the output variable pred in the outputs field, which at the same time registers pred under the name y. To query an input or intermediate variable of the computation graph through the C++ interface, set the names field in the format {<name>: <Variable>}.
The named variables in the network are referenced by the executors config, which is used in C++ to execute a network in a simpler way. Here, an executor named runtime is added that executes the network runtime. Its input and output variables are specified by the names registered in the networks field.
Build MNIST inference example C++ code
You can find an executable file 'mnist_runtime' under the build directory located at nnabla/build/bin. If you want to build it yourself with the Makefile, follow the process below in a Linux environment. You can also build an executable file 'mnist_runtime_cuda', which is not included in the build directory, by the process below.
make
The above command generates an executable mnist_runtime in the current directory.
The build file GNUmakefile is simple. It links libnnabla.so and libnnabla_utils.so with the executable generated from mnist_runtime.cpp, and compiles with the C++14 option -std=c++14.
CUDA_VERSION_SUFFIX=-102_8 make cuda
CUDA_VERSION_SUFFIX depends on the CUDA library version you are using; you can check it in /usr/local/lib.
With the above command, you can also compile an executable mnist_runtime_cuda that runs computation on your CUDA device, provided nnabla-ext-cuda is installed in the correct path. See GNUmakefile for details.
Execute handwritten digit classification
Running the generated executable with no arguments prints the usage documentation.
./mnist_runtime
Output:
Usage: ./mnist_runtime nnp_file input_pgm
Positional arguments:
nnp_file : .nnp file created by examples/vision/mnist/save_nnp_classification.py.
input_pgm : PGM (P5) file of a 28 x 28 image where pixel values < 256.
Sample images that I created with the GIMP editor are located in this folder.
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
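If you want to try inputs beyond the bundled samples, a PGM (P5) file is simple enough to generate by hand. The following is a minimal sketch of one way to do it. `write_pgm_p5` and `make_vertical_stroke` are hypothetical helpers that are not part of the example sources, and the bright-stroke-on-dark-background polarity is an assumption based on the usual MNIST convention, not something this example states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the example sources): writes an 8-bit
// binary PGM (P5) image. Returns false if the file cannot be written.
bool write_pgm_p5(const std::string &path, int width, int height,
                  const std::vector<std::uint8_t> &pixels) {
  // A size mismatch is a programming error, so it is checked with assert.
  assert(width > 0 && height > 0);
  assert(pixels.size() ==
         static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

  std::ofstream out(path, std::ios::binary);
  if (!out) {
    return false;
  }
  // Header: magic number, width, height, maximum gray value. Exactly one
  // whitespace character separates the header from the raw pixel bytes.
  out << "P5\n" << width << " " << height << "\n255\n";
  out.write(reinterpret_cast<const char *>(pixels.data()),
            static_cast<std::streamsize>(pixels.size()));
  out.flush();
  return static_cast<bool>(out);
}

// Hypothetical helper: a 28 x 28 image with a vertical bar, roughly a "1".
// Assumes bright strokes on a dark background.
std::vector<std::uint8_t> make_vertical_stroke() {
  std::vector<std::uint8_t> image(28 * 28, 0); // dark background
  for (int y = 4; y < 24; ++y) {
    for (int x = 13; x < 15; ++x) {
      image[static_cast<std::size_t>(y) * 28 + x] = 255; // bright stroke
    }
  }
  return image;
}
```

Calling `write_pgm_p5("one.pgm", 28, 28, make_vertical_stroke())` should produce a file that matches the input_pgm description in the usage text, although how well the model classifies such a synthetic image is not guaranteed.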
The following command executes image classification with the trained model lenet_010000.nnp given an input image.

./mnist_runtime lenet_010000.nnp 5.pgm
The output shows the prediction. In my case, it is correct.
Prediction scores: -24.1875 -14.0103 -13.2646 7.52215 -13.7401 31.1683 -0.501035 -4.69472 6.2626 1.87513
Prediction: 5
NOTE: The recognition performance is not perfect for real handwritten digit images (i.e. the digit images contained in this example). For example, the model often misclassifies the digit 6 as 5.
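The prediction scores include negative values and values far above 1, so they do not read as probabilities. Assuming they are unnormalized logits (the example does not say so explicitly), a softmax turns them into one confidence value per digit, which can help when judging borderline cases such as the 6 versus 5 confusion. The helpers `softmax` and `argmax` below are illustrative only and are not part of mnist_runtime.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <vector>

// Converts raw scores into probabilities with a numerically stable softmax:
// the maximum score is subtracted before exponentiation to avoid overflow.
std::vector<float> softmax(const std::vector<float> &scores) {
  assert(!scores.empty());
  const float max_score = *std::max_element(scores.begin(), scores.end());
  std::vector<float> probs(scores.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < scores.size(); ++i) {
    probs[i] = std::exp(scores[i] - max_score);
    sum += probs[i];
  }
  for (float &p : probs) {
    p /= sum; // normalize so that the probabilities sum to 1
  }
  return probs;
}

// Returns the index of the largest element (the first one on ties).
std::size_t argmax(const std::vector<float> &values) {
  assert(!values.empty());
  return static_cast<std::size_t>(std::distance(
      values.begin(), std::max_element(values.begin(), values.end())));
}
```

For the scores printed above, the gap of more than 23 between the top score and the runner-up means that class 5 receives almost all of the probability mass. A high softmax value only reflects the model's confidence; it does not guarantee that the prediction is correct.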
Walk through the example code
- Include the NNabla header files.
#include <nbla/logger.hpp>
#include <nbla_utils/nnp.hpp>
- Create an execution engine context. The following configuration enables our cached (memory pool) CPU array as the array backend.
nbla::Context ctx{"cpu", "CpuCachedArray", "0", "default"};
- Create an Nnp object with the default context.
nbla::utils::nnp::Nnp nnp(ctx);
- Set the NNP file to the Nnp object. It immediately parses the file format and stores the extracted info.
nnp.add(nnp_file);
- Get an executor instance. The save_nnp_classification.py script above saved an executor named runtime.
auto executor = nnp.get_executor("runtime");
- Override the batch size with 1. This example always processes one image at a time.
executor->set_batch_size(1); // Use batch_size = 1.
- Get the input data as a CPU array. See computation_graph/variable.hpp and variable.hpp for the API documentation written in the headers.
nbla::CgVariablePtr x = executor->get_data_variables().at(0).variable;
uint8_t *data = x->variable()->cast_data_and_get_pointer<uint8_t>(ctx);
- Read the input PGM file and store the image data into the CPU array. read_pgm_mnist is implemented above the main function.
read_pgm_mnist(input_bin, data);
- Execute prediction.
executor->execute();
- Get the output as a CPU array.
nbla::CgVariablePtr y = executor->get_output_variables().at(0).variable;
const float *y_data = y->variable()->get_data_pointer<float>(ctx);
- Show prediction scores and the most likely predicted number of the input image.
int prediction = 0;
float max_score = -1e10;
std::cout << "Prediction scores:";
for (int i = 0; i < 10; i++) {
  if (y_data[i] > max_score) {
    prediction = i;
    max_score = y_data[i];
  }
  std::cout << " " << std::setw(5) << y_data[i];
}
std::cout << std::endl;
std::cout << "Prediction: " << prediction << std::endl;
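The walkthrough calls read_pgm_mnist without showing its body. As a rough idea of what such a reader has to do, here is a minimal sketch of a P5 loader for a 28 x 28 image. `read_pgm_28x28` and `read_header_int` are hypothetical names, and the actual implementation in mnist_runtime.cpp may differ, for example in how it handles header comments or errors.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <istream>
#include <limits>
#include <string>

// Reads the next integer of a PGM header, skipping whitespace and '#'
// comment lines (some image editors write a comment into the header).
bool read_header_int(std::istream &in, int &value) {
  in >> std::ws;
  while (in.peek() == '#') {
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    in >> std::ws;
  }
  return static_cast<bool>(in >> value);
}

// Hypothetical stand-in for read_pgm_mnist (the real one may differ).
// Reads a 28 x 28, 8-bit binary PGM (P5) file into `data`, which must point
// to at least 784 bytes. Returns false on a missing or malformed file.
bool read_pgm_28x28(const std::string &path, std::uint8_t *data) {
  assert(data != nullptr);
  std::ifstream in(path, std::ios::binary);
  if (!in) {
    return false;
  }

  std::string magic;
  in >> magic;
  int width = 0, height = 0, maxval = 0;
  if (magic != "P5" || !read_header_int(in, width) ||
      !read_header_int(in, height) || !read_header_int(in, maxval)) {
    return false;
  }
  if (width != 28 || height != 28 || maxval <= 0 || maxval > 255) {
    return false;
  }

  in.get(); // the single whitespace byte that ends the header
  in.read(reinterpret_cast<char *>(data), 28 * 28);
  return in.gcount() == 28 * 28; // fail if the pixel data is truncated
}
```

In the walkthrough, the pointer returned by cast_data_and_get_pointer<uint8_t> would be the natural destination for `data`, assuming the input variable holds a single 28 x 28 image after the batch size is set to 1.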








