docs: document OTLPMetricsWriter feature

2026-02-03 20:40:17 -05:00 · 2026-01-20 17:38:58 +01:00 · 2026-01-20 17:38:58 +01:00 · ac1a0b891c
commit ac1a0b891c
parent 88c657d3b7
3 changed files with 111 additions and 0 deletions
--- a/doc/06-distributed-monitoring.md
+++ b/doc/06-distributed-monitoring.md
@@ -2959,6 +2959,7 @@ By default, the following features provide advanced HA functionality:
 * [Graphite](09-object-types.md#objecttype-graphitewriter)
 * [InfluxDB](09-object-types.md#objecttype-influxdb2writer) (v1 and v2)
 * [OpenTsdb](09-object-types.md#objecttype-opentsdbwriter)
+* [OTLPMetrics](09-object-types.md#objecttype-otlpmetricswriter)
 * [Perfdata](09-object-types.md#objecttype-perfdatawriter) (for PNP)

 #### High-Availability with Checks <a id="distributed-monitoring-high-availability-checks"></a>
--- a/doc/09-object-types.md
+++ b/doc/09-object-types.md
@@ -1865,6 +1865,43 @@ Configuration Attributes:
  host_template             | Dictionary                | **Optional.** Specify additional tags to be included with host metrics. This requires a sub-dictionary named `tags`. Also specify a naming prefix by setting `metric`. More information can be found in [OpenTSDB custom tags](14-features.md#opentsdb-custom-tags) and [OpenTSDB Metric Prefix](14-features.md#opentsdb-metric-prefix). More information can be found in [OpenTSDB custom tags](14-features.md#opentsdb-custom-tags). Defaults to an `empty Dictionary`.
  service_template          | Dictionary                | **Optional.** Specify additional tags to be included with service metrics. This requires a sub-dictionary named `tags`. Also specify a naming prefix by setting `metric`. More information can be found in [OpenTSDB custom tags](14-features.md#opentsdb-custom-tags) and [OpenTSDB Metric Prefix](14-features.md#opentsdb-metric-prefix). Defaults to an `empty Dictionary`.

+### OTLPMetricsWriter <a id="objecttype-otlpmetricswriter"></a>
+
+Sends metrics in [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/) format to an OpenTelemetry Collector
+or any other backend that accepts OTLP data over HTTP. This configuration object is available as the
+[otlpmetrics feature](14-features.md#otlpmetrics-writer). You can find more information about OpenTelemetry and OTLP
+on the [OpenTelemetry website](https://opentelemetry.io/).
+
+A basic example configuration, ready to copy and paste, is shown below:
+
+```
+object OTLPMetricsWriter "otlp-metrics" {
+  host = "127.0.0.1"
+  port = 4318
+  metrics_endpoint = "/v1/metrics"
+  service_namespace = "icinga2-production"
+}
+```
+
+There are more configuration options available as described in the table below.
+
+| Name                     | Type       | Description                                                                                                                                  |
+|--------------------------|------------|----------------------------------------------------------------------------------------------------------------------------------------------|
+| host                     | String     | **Required.** OTLP collector host address. Defaults to `127.0.0.1`.                                                                          |
+| port                     | Number     | **Required.** OTLP collector HTTP port. Defaults to `4318`.                                                                                  |
+| metrics\_endpoint        | String     | **Required.** OTLP metrics endpoint path. Defaults to `/v1/metrics`.                                                                         |
+| service\_namespace       | String     | **Required.** The namespace to associate with emitted metrics used in the `service.namespace` OTel resource attribute. Defaults to `icinga`. |
+| basic\_auth              | Dictionary | **Optional.** Username and password for HTTP basic authentication.                                                                           |
+| flush\_interval          | Duration   | **Optional.** How long to buffer data points before transferring to the OTLP collector. Defaults to `15s`.                                   |
+| flush\_threshold         | Number     | **Optional.** How many bytes to buffer before forcing a transfer to the OTLP collector. Defaults to `32MiB`.                                 |
+| enable\_ha               | Boolean    | **Optional.** Enable the high availability functionality. Has no effect in non-cluster setups. Defaults to `false`.                          |
+| enable\_send\_thresholds | Boolean    | **Optional.** Whether to stream warning, critical, minimum & maximum as separate metrics to the OTLP collector. Defaults to `false`.         |
+| disconnect\_timeout      | Duration   | **Optional.** Timeout to wait for any outstanding data to be flushed to the OTLP collector before disconnecting. Defaults to `10s`.          |
+| enable\_tls              | Boolean    | **Optional.** Whether to use a TLS stream. Defaults to `false`.                                                                              |
+| tls\_insecure\_noverify  | Boolean    | **Optional.** Disable TLS peer verification. Defaults to `false`.                                                                            |
+| tls\_ca\_file            | String     | **Optional.** Path to CA certificate to validate the remote host.                                                                            |
+| tls\_cert\_file          | String     | **Optional.** Path to the client certificate to present to the OTLP collector for mutual verification.                                       |
+| tls\_key\_file           | String     | **Optional.** Path to the client certificate key.                                                                                            |
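+
+As a sketch of how the TLS and buffering attributes from the table above fit together, the following object sends
+metrics to a remote collector over TLS. The host name and all file paths are placeholders chosen for illustration
+and need to be adjusted to your environment:
+
+```
+object OTLPMetricsWriter "otlp-metrics" {
+  host = "otel-collector.example.com" // placeholder host name
+  port = 4318
+  metrics_endpoint = "/v1/metrics"
+
+  enable_tls = true
+  tls_ca_file = "/etc/icinga2/otlp/ca.crt"       // placeholder path
+  tls_cert_file = "/etc/icinga2/otlp/client.crt" // placeholder path, only needed for mutual verification
+  tls_key_file = "/etc/icinga2/otlp/client.key"  // placeholder path, only needed for mutual verification
+
+  flush_interval = 30s // buffer data points a little longer than the default of 15s
+}
+```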

 ### PerfdataWriter <a id="objecttype-perfdatawriter"></a>

--- a/doc/14-features.md
+++ b/doc/14-features.md
@@ -73,6 +73,7 @@ best practice is to provide performance data.

 This data is parsed by features sending metrics to time series databases (TSDB):

+* [OpenTelemetry](14-features.md#otlpmetrics-writer)
 * [Graphite](14-features.md#graphite-carbon-cache-writer)
 * [InfluxDB](14-features.md#influxdb-writer)
 * [OpenTSDB](14-features.md#opentsdb-writer)
@@ -751,6 +752,78 @@ mechanism ensures that metrics are written even if the cluster fails.
 The recommended way of running OpenTSDB in this scenario is a dedicated server
 where you have OpenTSDB running.

+### OTLPMetrics Writer <a id="otlpmetrics-writer"></a>
+
+The [OpenTelemetry Protocol (OTLP/HTTP)](https://opentelemetry.io/docs/specs/otlp/#otlphttp) metrics writer feature
+allows Icinga 2 to send metrics to an OpenTelemetry Collector or any other backend that supports OTLP over HTTP,
+such as the [Prometheus OTLP receiver](https://prometheus.io/docs/guides/opentelemetry/),
+[Grafana Mimir](https://grafana.com/docs/mimir/latest/configure/configure-otel-collector/), or
+[OpenSearch Data Prepper](https://docs.opensearch.org/latest/data-prepper/pipelines/configuration/sources/otlp-source/).
+This makes Icinga 2 metrics available to OpenTelemetry-based observability stacks for further analysis and
+visualization of your monitoring data. OpenTelemetry provides a standardized way to collect, process, and export
+telemetry data, which makes it easier to integrate with numerous
+[monitoring and observability](https://opentelemetry.io/docs/collector/components/exporter/) tools.
+
+!!! note
+
+    This feature has successfully been tested with OpenTelemetry Collector, Prometheus OTLP receiver, OpenSearch Data
+    Prepper, and Grafana Mimir. However, it should work with any backend that supports the OTLP HTTP protocol as well.
+
+In order to enable this feature, you can use the following command:
+
+```bash
+icinga2 feature enable otlpmetrics
+```
+
+By default, the OTLPMetrics Writer expects the OpenTelemetry Collector or any other OTLP HTTP receiver to listen at
+`127.0.0.1` on port `4318`. Most third-party backends use their own ports, so you may need to adjust the
+configuration accordingly. The `metrics_endpoint` can also vary depending on the backend you are using.
+For example, the OpenTelemetry Collector uses `/v1/metrics` (the default), while the Prometheus OTLP receiver uses
+`/api/v1/otlp/v1/metrics`. Therefore, it is important to set the correct `metrics_endpoint` in the configuration file.
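+
+For instance, a writer that targets a local Prometheus OTLP receiver might look like the following sketch. Port `9090`
+is an assumption based on the Prometheus default listen port, so verify it against your own Prometheus setup:
+
+```
+object OTLPMetricsWriter "otlp-metrics" {
+  host = "127.0.0.1"
+  port = 9090 // assumed Prometheus default port, adjust as needed
+  metrics_endpoint = "/api/v1/otlp/v1/metrics"
+}
+```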
+
+You can find more details about the configuration options [here](09-object-types.md#objecttype-otlpmetricswriter).
+
+The generated metric names follow the OpenTelemetry naming conventions and cannot be customized by end users, so they
+are always the same across all Icinga 2 installations. The OTLPMetrics Writer currently sends the following metrics:
+
+| Metric Name                     | Description                           |
+|---------------------------------|---------------------------------------|
+| state_check.perfdata            | Performance data metrics from checks. |
+| state_check.thresholds.warning  | Warning threshold values for checks.  |
+| state_check.thresholds.critical | Critical threshold values for checks. |
+| state_check.thresholds.min      | Minimum threshold values for checks.  |
+| state_check.thresholds.max      | Maximum threshold values for checks.  |
+
+By default, the writer will not stream any data point for the `state_check.thresholds.*` metrics. To enable the
+streaming of threshold metrics, you need to set the `enable_send_thresholds` option to `true` in the OTLPMetrics Writer
+configuration. Once enabled, it will send the threshold values for each performance data metric if they are available
+in the produced check results.
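+
+A minimal sketch of such a configuration, using the default connection settings purely for illustration:
+
+```
+object OTLPMetricsWriter "otlp-metrics" {
+  host = "127.0.0.1"
+  port = 4318
+  enable_send_thresholds = true // also emit the state_check.thresholds.* metrics
+}
+```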
+
+The data point type for all of the above metrics is [`gauge`](https://opentelemetry.io/docs/specs/otel/metrics/data-model/#gauge),
+and the perfdata labels and their units (if available) are mapped to OpenTelemetry metric point attributes. For example,
+a perfdata label `load1` with a value of `0.5` and unit `%` will be sent to the `state_check.perfdata` metric stream
+as a metric point with a value of `0.5` and the attributes `label="load1"` and `unit="%"`. Additionally,
+each metric point is accompanied by other relevant attributes such as `icinga2.host.name`, `icinga2.service.name`,
+`icinga2.command.name`, etc. as resource attributes. The complete data format and list of attributes can be obtained by
+letting the OpenTelemetry Collector log the received metrics either to the standard output or to a JSON file in a
+human-readable format.
+
+At the moment, the OTLPMetrics Writer allows you to configure only a single resource attribute,
+[`service.namespace`](https://opentelemetry.io/docs/specs/semconv/registry/attributes/service/#service-namespace), via
+the `service_namespace` option in the OTLPMetrics Writer config. This attribute can be used to group related metrics
+together in the backend. By default, it is set to `icinga`. You can customize it to better fit your monitoring
+environment. For example, you might set it to `production`, `staging`, or any other namespace that categorizes the
+Icinga 2 metrics sent to the OpenTelemetry backend.
+
+#### OTLPMetrics in HA Cluster Zones <a id="otlpmetrics-writer-ha-cluster"></a>
+
+This writer supports [High Availability (HA)](06-distributed-monitoring.md#distributed-monitoring-high-availability-features)
+cluster zones in Icinga 2. If you enable this feature on all of your cluster endpoints and leave `enable_ha` at its
+default of `false`, each OTLPMetrics Writer will send metrics independently to the configured OTLP collector. To avoid
+duplicate metrics being sent from multiple cluster endpoints, it is recommended to set the `enable_ha` option to `true`
+in the OTLPMetrics Writer config on all cluster endpoints. This ensures that only one writer in the cluster is active
+at any given time, sending metrics to the configured OTLP collector. The other writers remain in standby mode, ready
+to take over if the active endpoint fails or becomes unavailable for any reason.
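+
+A sketch of the corresponding configuration, which would be deployed identically on every cluster endpoint (the
+connection settings are the defaults and only serve as an illustration):
+
+```
+object OTLPMetricsWriter "otlp-metrics" {
+  host = "127.0.0.1"
+  port = 4318
+  enable_ha = true // only one endpoint actively sends metrics, the others stand by
+}
+```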

 ### Writing Performance Data Files <a id="writing-performance-data-files"></a>