Intel® Optane™ Persistent Memory Controller Exporter

January 7, 2023

DISCONTINUATION OF PROJECT

This project will no longer be maintained by Intel.

Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project.

Intel no longer accepts patches to this project.

If you have an ongoing need to use this project, are interested in independently developing it, or would like to maintain patches for the open source software community, please create your own fork of this project.

Contact: webadmin@linux.intel.com

Intel® Optane™ PMCE is a utility that exposes health and performance metrics from Intel Optane DC Persistent Memory Modules (DCPMM) to Prometheus, an open source monitoring system. The exporter links against libipmctl and consumes its API; the library is part of the IPMCTL project.

Exported Metrics

Basic set, exported when the exporter is run without extra options:

sudo ./ipmctl_exporter

| Name | Description |
| --- | --- |
| ipmctl_health | DCPMM health as reported in the SMART log |
| ipmctl_media_temperature_celsius | Device media temperature in degrees Celsius |
| ipmctl_controller_temperature_celsius | Device controller temperature in degrees Celsius |
| ipmctl_lifespan_percentage_remaining | Amount of lifespan remaining as a percentage |
| ipmctl_latched_dirty_shutdown_count_total | Device shutdowns without notification |
| ipmctl_power_on_time_seconds_total | Total power-on time over the lifetime of the device |
| ipmctl_up_time_seconds_total | Total power-on time since the last power cycle of the device |
| ipmctl_power_cycles_total | Number of power cycles over the lifetime of the device |
| ipmctl_fw_error_total | The total number of firmware error log entries |
| ipmctl_unlatched_dirty_shutdown_count_total | Number of times that the FW received an unexpected power loss |
| ipmctl_total_media_reads_total | Lifetime number of 64 byte reads from media on the DCPMM |
| ipmctl_total_media_writes_total | Lifetime number of 64 byte writes to media on the DCPMM |
| ipmctl_total_read_requests_total | Lifetime number of DDRT read transactions the DCPMM has serviced |
| ipmctl_total_write_requests_total | Lifetime number of DDRT write transactions the DCPMM has serviced |
| ipmctl_device_discovery_info | Describes the capabilities supported by a DCPMM |
| ipmctl_device_security_capabilities_info | Describes the security capabilities of a device |
| ipmctl_device_discovery_info | Describes an enterprise-level view of a device |
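
As a rough illustration of consuming these metrics outside Prometheus, the sketch below parses text in the Prometheus exposition format and converts the lifetime media write count into bytes, using the 64 byte write size stated in the table. The sample text, the `uid` label on each series, and the helper names `parse_metrics` and `media_bytes_written` are assumptions made for illustration; real exporter output may look different.

```python
import re

# Hypothetical sample; not captured from a real exporter.
SAMPLE = '''\
# HELP ipmctl_health DCPMM health as reported in the SMART log
# TYPE ipmctl_health gauge
ipmctl_health{uid="8089-a2-0000-00000001"} 0
ipmctl_media_temperature_celsius{uid="8089-a2-0000-00000001"} 31
ipmctl_total_media_writes_total{uid="8089-a2-0000-00000001"} 1000
'''

# metric_name{label="value",...} value [timestamp]
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, float_value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comments.
        if not line or line.startswith('#'):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group('labels') or ''))
        samples.append((match.group('name'), labels, float(match.group('value'))))
    return samples


def media_bytes_written(samples):
    """Map uid -> lifetime bytes written, assuming 64 bytes per media write."""
    return {
        labels.get('uid', ''): value * 64
        for name, labels, value in samples
        if name == 'ipmctl_total_media_writes_total'
    }
```

Calling `media_bytes_written(parse_metrics(SAMPLE))` yields one entry per module. The same pattern would apply to `ipmctl_total_media_reads_total`.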

To define Prometheus alerts that fire when configured thresholds are reached, you can also enable the threshold metrics (they are disabled by default):

sudo ./ipmctl_exporter --enable-thresholds

| Name | Description |
| --- | --- |
| ipmctl_media_temperature_enabled | Indicates if firmware notifications are enabled when media temperature value is critical |
| ipmctl_media_temperature_upper_critical_threshold_celsius | The upper media temperature critical threshold |
| ipmctl_media_temperature_lower_critical_threshold_celsius | The lower media temperature critical threshold |
| ipmctl_media_temperature_upper_fatal_threshold_celsius | The upper media temperature fatal threshold |
| ipmctl_media_temperature_lower_fatal_threshold_celsius | The lower media temperature fatal threshold |
| ipmctl_media_temperature_upper_noncritical_threshold_celsius | The upper media temperature noncritical threshold |
| ipmctl_media_temperature_lower_noncritical_threshold_celsius | The lower media temperature noncritical threshold |
| ipmctl_controller_temperature_enabled | Indicates if firmware notifications are enabled when controller temperature value is critical |
| ipmctl_controller_temperature_upper_critical_threshold_celsius | The upper controller temperature critical threshold |
| ipmctl_controller_temperature_lower_critical_threshold_celsius | The lower controller temperature critical threshold |
| ipmctl_controller_temperature_upper_fatal_threshold_celsius | The upper controller temperature fatal threshold |
| ipmctl_controller_temperature_lower_fatal_threshold_celsius | The lower controller temperature fatal threshold |
| ipmctl_controller_temperature_upper_noncritical_threshold_celsius | The upper controller temperature noncritical threshold |
| ipmctl_controller_temperature_lower_noncritical_threshold_celsius | The lower controller temperature noncritical threshold |
| ipmctl_lifespan_percentage_remaining_enabled | Indicates if firmware notifications are enabled when lifespan percentage remaining value is critical |
| ipmctl_lifespan_percentage_remaining_upper_critical_threshold | The upper lifespan percentage remaining critical threshold |
| ipmctl_lifespan_percentage_remaining_lower_critical_threshold | The lower lifespan percentage remaining critical threshold |
| ipmctl_lifespan_percentage_remaining_upper_fatal_threshold | The upper lifespan percentage remaining fatal threshold |
| ipmctl_lifespan_percentage_remaining_lower_fatal_threshold | The lower lifespan percentage remaining fatal threshold |
| ipmctl_lifespan_percentage_remaining_upper_noncritical_threshold | The upper lifespan percentage remaining noncritical threshold |
| ipmctl_lifespan_percentage_remaining_lower_noncritical_threshold | The lower lifespan percentage remaining noncritical threshold |
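
To show how the threshold metrics might be combined with a live reading, the sketch below compares a media temperature against the three upper thresholds from the table. It is only an illustration: the helper name `exceeded_upper_thresholds`, the use of `>=` for the comparison, and the numbers in the usage note are assumptions, and the firmware may evaluate thresholds differently.

```python
def exceeded_upper_thresholds(values):
    """Return the upper threshold levels the media temperature has reached.

    `values` maps metric names for a single module to numbers.
    Thresholds that are missing from the mapping are skipped.
    """
    temperature = values.get('ipmctl_media_temperature_celsius')
    if temperature is None:
        return []
    reached = []
    # Ordered from least to most severe.
    for level in ('noncritical', 'critical', 'fatal'):
        key = f'ipmctl_media_temperature_upper_{level}_threshold_celsius'
        limit = values.get(key)
        # Assumption: reaching the limit counts as exceeding it.
        if limit is not None and temperature >= limit:
            reached.append(level)
    return reached
```

For example, with a hypothetical reading of 84 and upper thresholds of 80 (noncritical), 83 (critical), and 90 (fatal), the function reports the noncritical and critical levels. In Prometheus itself the same comparison would normally be written as an alerting rule rather than in application code.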

Labels returned by ipmctl_device_discovery_info

| Name | Description |
| --- | --- |
| capacity | Raw capacity in bytes. |
| channel_id | The memory channel number. |
| channel_pos | The memory module's position in the memory channel. |
| controller_revision_id | Revision identifier of the DCPMM non-volatile memory subsystem controller from FIS. |
| device_id | The device identifier - Little Endian. |
| fw_api_version | API version of the currently running FW. |
| fw_revision | The current active firmware revision. |
| interface_format_codes | Calculate_capabilities_for_populated_devices() in device.c. |
| lock_state | Indicates if the DCPMM is in a locked security state. |
| manageability | Compatibility of the device, FW and configuration with the management software. |
| manufacturer | The manufacturer ID code determined by JEDEC JEP-106 - Little Endian. |
| manufacturing_date | Date the DCPMM was manufactured, assigned by vendor; only valid if manufacturing_info_valid=1. |
| manufacturing_info_valid | Manufacturing location and date validity. |
| manufacturing_location | DCPMM manufacturing location assigned by vendor; only valid if manufacturing_info_valid=1. |
| master_passphrase_enabled | If 1, master passphrase is enabled on the DCPMM. |
| memory_controller_id | The ID of the associated memory controller. |
| memory_type | The type of memory used by the DCPMM. |
| node_controller_id | The node controller ID. |
| part_number | The manufacturer's model part number. |
| physical_id | The unique physical ID of the memory module. |
| revision_id | The revision identifier. |
| serial_number | Serial number assigned by the vendor - Little Endian. |
| sku | Stock keeping unit. |
| socket_id | The processor socket identifier. |
| subsystem_device_id | Device identifier of the DCPMM non-volatile memory subsystem controller. |
| subsystem_revision_id | Revision identifier of the DCPMM non-volatile memory subsystem controller from NFIT. |
| subsystem_vendor_id | Vendor identifier of the DCPMM non-volatile memory subsystem controller - Little Endian. |
| uid | Unique identifier of the device. |
| vendor_id | The vendor identifier - Little Endian. |
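
Because the discovery data is carried in labels rather than in the sample value, a consumer has to read the labels to build an inventory. The sketch below sums the `capacity` label per `socket_id`. The sample lines, the assumption that `capacity` is rendered as a decimal integer string, and the helper name `capacity_by_socket` are illustrative and not taken from real exporter output.

```python
import re
from collections import defaultdict

# Matches label="value" pairs, allowing escaped characters inside the value.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Hypothetical sample lines with only a few of the labels shown.
SAMPLE_LINES = [
    'ipmctl_device_discovery_info{uid="dimm-a",socket_id="0",capacity="100"} 1',
    'ipmctl_device_discovery_info{uid="dimm-b",socket_id="0",capacity="200"} 1',
    'ipmctl_device_discovery_info{uid="dimm-c",socket_id="1",capacity="300"} 1',
    'ipmctl_health{uid="dimm-a"} 0',
]


def capacity_by_socket(lines):
    """Return {socket_id: total raw capacity in bytes} from discovery info lines."""
    totals = defaultdict(int)
    for line in lines:
        # Only the discovery info series carries the capacity label.
        if not line.startswith('ipmctl_device_discovery_info{'):
            continue
        labels = dict(LABEL_RE.findall(line))
        if 'socket_id' in labels and 'capacity' in labels:
            totals[labels['socket_id']] += int(labels['capacity'])
    return dict(totals)
```

The label values stay strings, as they are in the exposition format, so `socket_id` keys in the result are strings as well.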

Labels returned by ipmctl_device_security_capabilities_info

| Name | Description |
| --- | --- |
| erase_crypto_capable | DCPMM supports nvm_erase command with the CRYPTO. |
| master_passphrase_capable | DCPMM supports set master passphrase command. |
| passphrase_capable | DCPMM supports the nvm_(set/remove)_passphrase command. |
| uid | Unique identifier of the device. |
| unlock_device_capable | DCPMM supports the nvm_unlock_device command. |

Build

Because the IPMCTL exporter uses libipmctl and libndctl (both external libraries), the supported systems depend on the availability of these libraries on each operating system.

For Linux we highly recommend:

Fedora newer than 29 (Workstation Edition) x64 with the latest golang compiler, pkg-config, GCC, cmake, and ipmctl + ndctl libraries installed. Follow the steps below to prepare your environment for builds:

dnf install -y git cmake pkg-config gcc golang ndctl-libs libipmctl
git clone https://github.com/intel/ipmctl-exporter
cd ./ipmctl-exporter
cmake -S . -B output

To proceed with build:

export PKG_CONFIG_PATH=`pwd`/output/
make -C output
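
Before running the Linux steps above, it can help to confirm that the build tools are actually on the PATH. The following is a minimal sketch of such a check; the helper name `missing_tools` and the executable names in the default list (for example `go` for the golang compiler) are assumptions about a typical Fedora installation, and the check does not verify versions or the presence of the ipmctl and ndctl libraries.

```python
import shutil

# Executable names assumed for the tools used in the build steps.
DEFAULT_TOOLS = ('git', 'cmake', 'pkg-config', 'gcc', 'go', 'make')


def missing_tools(tools=DEFAULT_TOOLS):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from `missing_tools()` suggests the toolchain is in place; any names returned need to be installed first.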

For Windows we highly recommend:

Windows Server 2016 Standard or Windows 7/8/8.1/10 x64 with the latest golang compiler, pkg-config, TDM64-GCC, cmake, and ipmctl library installed. Follow the steps below to prepare your environment for builds:

  • Install golang to the C:\Go directory
  • Install tdm64-gcc to the C:\TDM-GCC-64 directory
  • Install pkgconfiglite to the C:\TDM-GCC-64\bin directory
  • Install cmake to the C:\Program Files\CMake directory
  • Install the ipmctl library, choosing the latest build for Windows
  • From cmd.exe:

Attention: Avoid whitespace in the path of the git repository directory; some Windows versions may have trouble parsing such paths.


git clone https://github.com/intel/ipmctl-exporter
cd ipmctl-exporter
cmake -S . -B output -G "MinGW Makefiles"

To proceed with build:

set PKG_CONFIG_PATH=%cd%\output\
mingw32-make -C output

Run

In line with the list of default ports, ipmctl-exporter serves on 0.0.0.0:9757 at the /metrics endpoint by default. For more details about usage, run:

sudo ./ipmctl_exporter --help

ipmctl_exporter, like the ipmctl tool, has to be run as the root user; otherwise expect error code 268 (INVALID PERMISSIONS) when it tries to collect some data.
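
Once the exporter is running, the endpoint can be read with any HTTP client. The sketch below uses only the Python standard library to build the default URL (port 9757, path /metrics, as stated above) and return the lines that start with `ipmctl_`. The helper names `metrics_url` and `fetch_metric_lines`, the `localhost` default, and the timeout value are assumptions for illustration.

```python
from urllib.request import urlopen


def metrics_url(host='localhost', port=9757, path='/metrics'):
    """Build the URL of the exporter's metrics endpoint."""
    return f'http://{host}:{port}{path}'


def fetch_metric_lines(url, prefix='ipmctl_', timeout=5.0):
    """Fetch the endpoint and return sample lines starting with `prefix`.

    Raises urllib.error.URLError if the exporter is not reachable.
    """
    with urlopen(url, timeout=timeout) as response:
        text = response.read().decode('utf-8')
    # HELP/TYPE comment lines start with '#', so the prefix filter drops them.
    return [line for line in text.splitlines() if line.startswith(prefix)]
```

On the host running the exporter, `fetch_metric_lines(metrics_url())` should return the raw sample lines. The client itself does not need root privileges, only the exporter process does.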

Code of Conduct

We follow the rules defined by the Contributor Covenant Code of Conduct, version 2.0.