OpenVINO™ Model Server Developer Guide
March 19, 2026
Introduction
This document describes how to set up the development environment and how to build, run, and debug OpenVINO Model Server tests. It covers the following topics:
Set up the Development Environment
The tests in this guide are written in Python. Therefore, to complete the functional tests, Python 3.8 must be installed.
In case of problems, see Debugging.
Prepare Environment to Use the Tests
git clone https://github.com/openvinotoolkit/model_server.git
cd model_server
Step 1: Compile source code
- Build the development `openvino/model_server-build` Docker* image:

  ```bash
  make docker_build
  ```

  or

  ```bash
  make docker_build DLDT_PACKAGE_URL=<URL>
  ```

  Note: the URL to the OpenVINO Toolkit package can be received after registration on the OpenVINO™ Toolkit website.

  The `docker_build` target builds multiple Docker images by default:
  - `openvino/model_server:latest` - the smallest release image, containing only the files necessary to run the model server on CPU
  - `openvino/model_server:latest-gpu` - release image with support for Intel GPU and CPU
  - `openvino/model_server:latest-nginx-mtls` - release image containing an exemplary NGINX mTLS configuration
  - `openvino/model_server-build:latest` - builder environment image with all the tools needed to build OVMS
Note: the `docker_build` target accepts the same set of parameters as the `release_image` target described in build_from_source.md.
- Download the test LLM models:

  ```bash
  ./prepare_llm_models.sh ./src/test/llm_testing
  ```
- Mount the source code in the Docker container:

  ```bash
  docker run -it -v ${PWD}:/ovms --entrypoint bash -p 9178:9178 openvino/model_server-build:latest
  ```
- In the Docker container context, compile the source code (choose distro `ubuntu` or `redhat` depending on the image type):

  ```bash
  bazel build --//:distro=ubuntu --config=mp_on_py_on //src:ovms
  ```
NOTE: There are several options that disable specific parts of OVMS. For details, check the OVMS Bazel build files.
- From the container, run a single unit test (choose distro `ubuntu` or `redhat` depending on the image type):

  ```bash
  bazel test --//:distro=ubuntu --config=mp_on_py_on --test_summary=detailed --test_output=all --test_filter='ModelVersionStatus.*' //src:ovms_test
  ```
| Argument | Description |
|---|---|
| `test` | builds and runs the specified test target |
| `--test_summary=detailed` | the output includes detailed failure information |
| `--test_output=all` | logs all test stdout at the end |
| `--test_filter='ModelVersionStatus.*'` | limits the run to the indicated tests |
| `//src:ovms_test` | the test target |
NOTE: For more information, see the Bazel command-line reference.
NOTE: If the container has access to an Intel GPU device and the test models, add `--test_env RUN_GPU_TESTS=1` to run the GPU unit tests.
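The `--test_filter` value uses GoogleTest's wildcard syntax: `:`-separated positive patterns, optionally followed by `-` and negative patterns, with `*` matching any substring. A rough sketch of the matching semantics using Python's `fnmatch` (the test names below are illustrative, not actual OVMS tests):

```python
from fnmatch import fnmatch

def gtest_match(test_name, filter_expr):
    """Approximate GoogleTest filter matching: positive patterns,
    optionally followed by '-' and negative patterns."""
    positive, _, negative = filter_expr.partition("-")
    pos = positive.split(":") if positive else ["*"]
    neg = negative.split(":") if negative else []
    return any(fnmatch(test_name, p) for p in pos) and \
        not any(fnmatch(test_name, n) for n in neg)

# Hypothetical test names, for illustration only
names = ["ModelVersionStatus.to_string", "ModelVersionStatus.update", "ConfigTest.empty"]
selected = [n for n in names if gtest_match(n, "ModelVersionStatus.*")]
print(selected)  # only the ModelVersionStatus tests remain
```

Note that `fnmatch` also gives `[seq]` a special meaning that gtest does not, but for plain `*` filters like the one above the behavior matches.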
- Select one of these options to change the target image name or the network port used in tests. It can be helpful on a shared development host:
  - With a Docker cache:

    ```bash
    OVMS_CPP_DOCKER_IMAGE=<replace_with_unique_image_name> make docker_build
    OVMS_CPP_DOCKER_IMAGE=<replace_with_unique_image_name> make test_functional
    OVMS_CPP_CONTAINER_PORT=<unique_network_port> make test_perf
    ```
  - Without a Docker cache:

    ```bash
    make docker_build NO_DOCKER_CACHE=true
    ```
Step 2: Install software
- Install Python release 3.8.
NOTE: Python is only necessary to complete the functional tests in this guide.
- Install the `virtualenv` package:

  ```bash
  pip3 install virtualenv
  ```
Now the tests can be run.
Run the Tests
Use the tests below depending on the requirement. Select the test that needs to be run:
Run test inference
- Download the exemplary ResNet50-binary model:

  ```bash
  source tests/performance/download_model.sh
  ```

  The script stores the model in the user home folder.
- Start an OVMS Docker container with the downloaded model:

  ```bash
  docker run -d --name server-test -v ~/resnet50-binary:/models/resnet50-binary -p 9178:9178 \
  openvino/model_server:latest --model_name resnet-binary --model_path /models/resnet50-binary --port 9178
  ```
- Run the gRPC client, which connects to the OpenVINO Model Server service running on port 9178:

  ```bash
  make venv
  source .venv/bin/activate
  pip3 install -r demos/common/python/requirements.txt
  python tests/performance/grpc_latency.py --images_numpy_path tests/performance/imgs.npy --labels_numpy_path tests/performance/labels.npy \
    --iteration 1000 --model_name resnet-binary --batchsize 1 --report_every 100 --input_name 0 --output_name 1463 --grpc_port 9178
  ```
Where:
| Argument | Description |
|---|---|
| `--images_numpy_path tests/performance/imgs.npy` | path to a numpy file; imgs.npy contains a batch of input data |
| `--labels_numpy_path tests/performance/labels.npy` | numpy file labels.npy with the expected image classification results |
| `--iteration 1000` | run the inference 1000 times |
| `--batchsize 1` | batch size to be used in the inference request |
| `--report_every 100` | number of iterations between results summary reports |
| `--input_name 0` | name of the deployed model input, "0" |
| `--output_name 1463` | name of the deployed model output, "1463" |
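Conceptually, `grpc_latency.py` measures the latency of each request and prints a summary every `--report_every` iterations. A simplified sketch of that reporting loop in plain Python, where the `infer` callable is a stand-in for the real gRPC request:

```python
import time

def run_latency_test(infer, iterations=1000, report_every=100):
    """Call infer() repeatedly, tracking current and average latency in ms."""
    latencies = []
    for i in range(1, iterations + 1):
        start = time.perf_counter()
        infer()  # placeholder for the real gRPC inference request
        current_ms = (time.perf_counter() - start) * 1000.0
        latencies.append(current_ms)
        if i % report_every == 0:
            avg = sum(latencies) / len(latencies)
            print(f"Iteration {i}/{iterations}; "
                  f"Current latency: {current_ms:.2f}ms; Average latency: {avg:.2f}ms")
    return sum(latencies) / len(latencies)

# Usage with a dummy workload standing in for an inference request:
avg = run_latency_test(lambda: time.sleep(0.001), iterations=10, report_every=5)
```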
Run functional tests
The functional tests are written in Python. Therefore, to complete the tests in this section, Python 3.8 must be installed.
NOTE: In case of problems, see the Debugging section.
- Run the command:

  ```bash
  make test_functional
  ```
- Configuration options:
| Variable | Description |
|---|---|
| `IMAGE` | Docker image name used by the tests |
| `TEST_DIR_CACHE` | location from which models and test data are downloaded |
| `TEST_DIR` | location to which models and test data are copied during tests |
| `TEST_DIR_CLEANUP` | set to `True` to remove the directory under `TEST_DIR` after the tests |
| `LOG_LEVEL` | the log level |
| `BUILD_LOGS` | path to save artifacts |
| `START_CONTAINER_COMMAND` | the command to start the OpenVINO Model Server container |
| `CONTAINER_LOG_LINE` | the log line in the container that confirms the container started properly |
- Add any configuration variable to the command line in this format:

  ```bash
  export IMAGE="openvino/model_server:latest"
  ```
- To make command repetition easier, create and store the configuration options in a file named `user_config.py` in the main project directory. Example:

  ```python
  os.environ["IMAGE"] = "openvino/model_server"
  ```
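A fuller `user_config.py` might set several of the variables from the table above at once; a sketch (the values are examples, not project defaults):

```python
# user_config.py - example test configuration; all values are illustrative
import os

os.environ["IMAGE"] = "openvino/model_server"
os.environ["TEST_DIR"] = "/tmp/ovms_test"      # where models/test data are copied
os.environ["TEST_DIR_CLEANUP"] = "True"        # remove TEST_DIR after the tests
os.environ["LOG_LEVEL"] = "DEBUG"
```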
Run performance tests
Automated tests are configured to use the ResNet50 model.
- Execute the command to run the latency test:

  ```bash
  make test_perf
  ```
- Output:

  ```
  Running latency test
  [--] Starting iterations
  [--] Iteration 100/ 1000; Current latency: 10.52ms; Average latency: 11.35ms
  [--] Iteration 200/ 1000; Current latency: 10.99ms; Average latency: 11.03ms
  [--] Iteration 300/ 1000; Current latency: 9.60ms; Average latency: 11.02ms
  [--] Iteration 400/ 1000; Current latency: 10.20ms; Average latency: 10.93ms
  [--] Iteration 500/ 1000; Current latency: 10.45ms; Average latency: 10.84ms
  [--] Iteration 600/ 1000; Current latency: 10.70ms; Average latency: 10.82ms
  [--] Iteration 700/ 1000; Current latency: 9.47ms; Average latency: 10.88ms
  [--] Iteration 800/ 1000; Current latency: 10.70ms; Average latency: 10.83ms
  [--] Iteration 900/ 1000; Current latency: 11.09ms; Average latency: 10.85ms
  [--] Iterations: 1000; Final average latency: 10.86ms; Classification accuracy: 100.0%
  ```
- Execute the command to run the throughput test:

  ```bash
  make test_throughput
  ```
- Output:

  ```
  Running throughput test
  [25] Starting iterations
  [23] Starting iterations
  ...
  [11] Starting iterations
  [24] Iterations: 500; Final average latency: 20.50ms; Classification accuracy: 100.0%
  [25] Iterations: 500; Final average latency: 20.81ms; Classification accuracy: 100.0%
  [6 ] Iterations: 500; Final average latency: 20.80ms; Classification accuracy: 100.0%
  [26] Iterations: 500; Final average latency: 20.80ms; Classification accuracy: 100.0%
  ...
  [11] Iterations: 500; Final average latency: 20.84ms; Classification accuracy: 100.0%
  real 0m13.397s
  user 1m22.277s
  sys 0m39.333s
  1076 FPS
  ```
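The final FPS figure is the total number of completed inferences across all parallel clients divided by the wall-clock (`real`) time of the run. A sketch of the arithmetic; the client count below is an assumption for illustration, since the exact number of parallel clients depends on the test configuration:

```python
# Throughput = total inferences / wall-clock time.
clients = 29                   # assumed number of parallel clients
iterations_per_client = 500    # as shown in the log above
real_seconds = 13.397          # the `real` time reported by the test

fps = clients * iterations_per_client / real_seconds
print(f"{fps:.0f} FPS")
```

With these assumed numbers the result lands in the same ballpark as the 1076 FPS reported above.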
Run tests on an OpenVINO Model Server binary file
- To run tests on an OpenVINO Model Server binary file, unpack the archive and set the following variable in `user_config.py` or in the environment, pointing at your binary file:

  ```bash
  tar -xvzf dist/<os_name>/ovms.tar.gz -C dist/<os_name>/
  ```

  In `user_config.py`:

  ```python
  os.environ["OVMS_BINARY_PATH"] = os.getcwd() + "/dist/<os_name>/ovms/bin/ovms"
  ```

  or in the environment:

  ```bash
  export OVMS_BINARY_PATH="${PWD}/dist/<os_name>/ovms/bin/ovms"
  ```
- The following command, executed in the OpenVINO Model Server binary file directory, should return paths to the unpacked `lib` directory included in `ovms.tar.gz` (`ovms/bin/./../lib`):

  ```bash
  ldd dist/<os_name>/ovms/bin/ovms
  ```
- Otherwise, set the following variable in the `user_config.py` file or in the environment:

  ```python
  os.environ["LD_LIBRARY_PATH"] = os.getcwd() + "/dist/<os_name>/ovms/lib"
  ```

  ```bash
  export LD_LIBRARY_PATH="${PWD}/dist/<os_name>/ovms/lib"
  ```
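The `ldd` check above relies on the runtime libraries sitting next to the binary, at `ovms/bin/./../lib`. A small sketch that derives the expected `lib` directory from a binary path the same way (the path below is illustrative):

```python
import os

def lib_dir_for_binary(binary_path):
    """Given .../ovms/bin/ovms, return the sibling .../ovms/lib directory."""
    bin_dir = os.path.dirname(binary_path)              # .../ovms/bin
    return os.path.normpath(os.path.join(bin_dir, "..", "lib"))

binary = "/home/user/dist/ubuntu/ovms/bin/ovms"         # example path
print(lib_dir_for_binary(binary))  # /home/user/dist/ubuntu/ovms/lib
```

If `ldd` does not resolve to this directory, exporting `LD_LIBRARY_PATH` as shown above makes the loader find the bundled libraries.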
NOTE: For additional problems, see the debugging section.
Run unit tests
Executing the unit tests requires the OVMS build image. All unit tests are expected to be started in a container based on the model server build image. Some unit tests require test models to be pulled and attached to the container. The following commands create the build image and start the unit tests:

```bash
make ovms_builder_image
make run_unit_tests
```

To run the unit tests verifying integration with Intel GPUs, add the `RUN_GPU_TESTS=1` parameter:

```bash
make run_unit_tests RUN_GPU_TESTS=1
```
NOTE: Follow this guide to prepare the host machine to work with VA API (Ubuntu).
On bare metal, run (just once):

```bash
sudo apt install -y \
    linux-headers-$(uname -r) \
    linux-modules-extra-$(uname -r) \
    flex bison \
    intel-fw-gpu intel-i915-dkms xpu-smi
sudo reboot
```

NOTE: The GPU unit tests must be executed on a machine with an Intel Data Center GPU.
NOTE: On a Red Hat base OS, the unit tests that require VA API are skipped.
Checking code coverage of unit tests
To check the code coverage of the unit tests, execute the following command to create the build image and run the unit tests with code coverage enabled:

```bash
make ovms_builder_image BASE_OS=ubuntu24 CHECK_COVERAGE=1 RUN_TESTS=1 MEDIAPIPE_DISABLE=0 PYTHON_DISABLE=0 OV_USE_BINARY=1 OVMS_CPP_DOCKER_IMAGE=ovms_coverage
```

Then run the `get_coverage` target to extract the report from the container:

```bash
make get_coverage OVMS_CPP_DOCKER_IMAGE=ovms_coverage
```

The report is created in the `genhtml` directory. Open the `index.html` file in this directory to view the code coverage report.
Debugging
The following debugging options are available. Choose the required option:
Use gdb to debug in Docker
- Build the project in debug mode:

  ```bash
  make docker_build BAZEL_BUILD_TYPE=dbg
  ```

  NOTE: You can also build the debug version of major dependencies, like OpenVINO Runtime, using the extra flag `CMAKE_BUILD_TYPE=Debug`.
- Run the container:

  ```bash
  docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v ${PWD}:/ovms -p 9178:9178 --entrypoint bash openvino/model_server-build:latest
  ```
- Prepare the resnet50 model for OVMS in the /models catalog and recompile the OpenVINO Model Server in the Docker container with debug symbols:

  ```bash
  mkdir -p /models/1 && \
    wget -P /models/1 https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.bin && \
    wget -P /models/1 https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.xml
  bazel build --config=mp_on_py_on //src:ovms -c dbg
  gdb --args ./bazel-bin/src/ovms --model_name resnet --model_path /models --port 9178
  ```

  NOTE: For best results, use the makefile parameter `BAZEL_BUILD_TYPE=dbg` to build the dependencies in debug mode as shown above.
- For unit test debugging, run:

  ```bash
  gdb --args ./bazel-bin/src/ovms_test --gtest_filter='OvmsConfigTest.emptyInput'
  ```
- For debugging tests that fork, enable fork-follow mode in the gdb CLI:

  ```
  set follow-fork-mode child
  ```
- To trace which OpenVINO calls are used underneath, add the `--define OV_TRACE=1` option when building OVMS or its tests with Bazel.
Use minitrace to display flame graph
Download the model files and store them in the models directory:

```bash
mkdir -p models/resnet/1
curl https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.bin https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.xml -o models/resnet/1/resnet50-binary-0001.bin -o models/resnet/1/resnet50-binary-0001.xml
```
Option 1. Use OpenVINO Model Server build image.
This way is convenient during development, when new traces need to be added or existing ones removed.
- Build the OVMS build image locally:

  ```bash
  make docker_build
  ```
- Start the container:

  ```bash
  docker run -it -v ${PWD}:/ovms --entrypoint bash -p 9178:9178 openvino/model_server-build:latest
  ```
- Build OVMS with minitrace enabled:

  ```bash
  bazel build --config=linux --copt="-DMTR_ENABLED" //src:ovms
  ```
- Run OVMS with `--trace_path` specifying where to save the flame graph JSON file:

  ```bash
  bazel-bin/src/ovms --model_name resnet --model_path models/resnet --trace_path trace.json --port 9178
  ```
- On application exit, the trace info is saved into `trace.json`.
- Use the Chrome web browser's `chrome://tracing` tool to display the graph.
Option 2. Build OVMS image with minitrace enabled
This option is convenient when the final image has to be used on a different machine and the existing traces do not need to be modified for debugging.
- Build the OVMS image with minitrace enabled locally:

  ```bash
  make docker_build MINITRACE=ON
  ```
- Run OVMS with minitrace enabled and `--trace_path` specifying where to save the trace JSON file. Since the file is flushed and saved at container shutdown, mount a host directory with write access to persist the file after the container stops:

  ```bash
  mkdir traces
  chmod -R 777 traces
  docker run -it -v ${PWD}:/workspace:rw -p 9178:9178 openvino/model_server --model_name resnet --model_path /workspace/models/resnet --trace_path /workspace/traces/trace.json --port 9178
  ```
- On application exit, the trace info is saved into `${PWD}/traces/trace.json`.
- Use the Chrome web browser's `chrome://tracing` tool to display the graph, as in Option 1.
Profiling macros
| Macro | Description | Example Usage |
|---|---|---|
| OVMS_PROFILE_FUNCTION | Add this macro at the very beginning of a function. It automatically adds the function name to the trace marker. | OVMS_PROFILE_FUNCTION(); |
| OVMS_PROFILE_SCOPE | Add this macro at the beginning of a code scope together with a marker name. An ending marker is added automatically at the end of the scope. | OVMS_PROFILE_SCOPE("My Code Scope Marker"); |
| OVMS_PROFILE_SYNC_BEGIN | For custom start and end markers, use this macro to mark the beginning of a synchronous event. Use the same marker name for the beginning and the end. | OVMS_PROFILE_SYNC_BEGIN("My Synchronous Event"); |
| OVMS_PROFILE_SYNC_END | For custom start and end markers, use this macro to mark the end of a synchronous event. Use the same marker name for the beginning and the end. | OVMS_PROFILE_SYNC_END("My Synchronous Event"); |
| OVMS_PROFILE_ASYNC_BEGIN | Use this macro to mark the beginning of an asynchronous event. Use the same marker name and id for the beginning and the end; asynchronous markers need an identifier to correctly match events. | OVMS_PROFILE_ASYNC_BEGIN("My Asynchronous Event", unique_id); |
| OVMS_PROFILE_ASYNC_END | Use this macro to mark the end of an asynchronous event. Use the same marker name and id for the beginning and the end; asynchronous markers need an identifier to correctly match events. | OVMS_PROFILE_ASYNC_END("My Asynchronous Event", unique_id); |
More information can be found in profiler.hpp file.
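`chrome://tracing` consumes Chrome's trace event JSON format: a list of events with a name, a timestamp `ts` in microseconds, process/thread ids, and a phase (`B`/`E` for synchronous begin/end, `b`/`e` for asynchronous events matched by `id`). A minimal sketch of such a file, mirroring the begin/end pairing the macros above produce (the event names are illustrative):

```python
import json

events = [
    # Synchronous begin/end pair, like OVMS_PROFILE_SYNC_BEGIN/END
    {"name": "My Synchronous Event", "ph": "B", "ts": 0, "pid": 1, "tid": 1},
    {"name": "My Synchronous Event", "ph": "E", "ts": 250, "pid": 1, "tid": 1},
    # Asynchronous pair matched by the same id, like OVMS_PROFILE_ASYNC_BEGIN/END
    {"name": "My Asynchronous Event", "ph": "b", "id": 42, "cat": "async",
     "ts": 50, "pid": 1, "tid": 1},
    {"name": "My Asynchronous Event", "ph": "e", "id": 42, "cat": "async",
     "ts": 300, "pid": 1, "tid": 1},
]

# A file like this can be loaded directly in chrome://tracing.
with open("trace.json", "w") as f:
    json.dump({"traceEvents": events}, f)
```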
Debug functional tests
Use the OpenVINO Model Server build image, because it has the necessary tools installed.
- Add the ENTRYPOINT line to Dockerfile.ubuntu:

  ```bash
  echo 'ENTRYPOINT ["/bin/bash", "-c", "sleep 3600; echo Server started on port; sleep 100000"]' >> Dockerfile.ubuntu
  ```
- Build the project in debug mode:

  ```bash
  make docker_build BAZEL_BUILD_TYPE=dbg
  ```
- Open a terminal.
- Run a test in this terminal. Change `TEST_PATH` to point to the test you want to debug:

  ```bash
  make test_functional TEST_PATH=tests/functional/test_batching.py::TestBatchModelInference::test_run_inference_rest IMAGE=openvino/model_server-build:latest
  ```
Open a second terminal.
-
In this terminal identify the ID/hash of a running Docker container:
docker ps -
Use the ID to execute a new bash shell into this container and start gdb. Make sure the parameters you pass to the OpenVINO Model Server match the parameters in the test code :
docker exec -ti HASH bashIn docker container:
cd /ovms/bazel-bin/src/ ; gdb --args ./ovms --model_name age_gender --model_path /opt/ml/age_gender --port 9000 --rest_port 5500 --log_level TRACE -
Open a third terminal.
-
In this terminal use the Docker container ID/hash to stop the sleep process that is preventing the tests from starting. These tests are waiting for stdout text "Server started on port" :
docker exec -ti HASH bashIn docker container:
yum install psmisc; killall sleep -
Return to the first terminal to debug the test execution.