docs: document OTLPMetricsWriter feature

Yonas Habteab 2026-01-20 17:38:58 +01:00
parent 88c657d3b7
commit ac1a0b891c
3 changed files with 111 additions and 0 deletions

@@ -2959,6 +2959,7 @@ By default, the following features provide advanced HA functionality:
* [Graphite](09-object-types.md#objecttype-graphitewriter)
* [InfluxDB](09-object-types.md#objecttype-influxdb2writer) (v1 and v2)
* [OpenTsdb](09-object-types.md#objecttype-opentsdbwriter)
* [OTLPMetrics](09-object-types.md#objecttype-otlpmetricswriter)
* [Perfdata](09-object-types.md#objecttype-perfdatawriter) (for PNP)
#### High-Availability with Checks <a id="distributed-monitoring-high-availability-checks"></a>

@@ -1865,6 +1865,43 @@ Configuration Attributes:
host_template | Dictionary | **Optional.** Specify additional tags to be included with host metrics. This requires a sub-dictionary named `tags`. Also specify a naming prefix by setting `metric`. More information can be found in [OpenTSDB custom tags](14-features.md#opentsdb-custom-tags) and [OpenTSDB Metric Prefix](14-features.md#opentsdb-metric-prefix). Defaults to an `empty Dictionary`.
service_template | Dictionary | **Optional.** Specify additional tags to be included with service metrics. This requires a sub-dictionary named `tags`. Also specify a naming prefix by setting `metric`. More information can be found in [OpenTSDB custom tags](14-features.md#opentsdb-custom-tags) and [OpenTSDB Metric Prefix](14-features.md#opentsdb-metric-prefix). Defaults to an `empty Dictionary`.
### OTLPMetricsWriter <a id="objecttype-otlpmetricswriter"></a>
Emits metrics in [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/) format to a configured OpenTelemetry
Collector or any other OTLP-compatible backend that accepts OTLP data over HTTP. This configuration object is available
as the [otlpmetrics feature](14-features.md#otlpmetrics-writer). You can find more information about OpenTelemetry and
OTLP on the [OpenTelemetry website](https://opentelemetry.io/).
A basic example configuration that can be copied and pasted is shown below:
```
object OTLPMetricsWriter "otlp-metrics" {
  host = "127.0.0.1"
  port = 4318
  metrics_endpoint = "/v1/metrics"
  service_namespace = "icinga2-production"
}
```
All available configuration attributes are described in the table below.
| Name | Type | Description |
|--------------------------|------------|----------------------------------------------------------------------------------------------------------------------------------------------|
| host | String | **Required.** OTLP collector host address. Defaults to `127.0.0.1`. |
| port | Number | **Required.** OTLP collector HTTP port. Defaults to `4318`. |
| metrics\_endpoint | String | **Required.** OTLP metrics endpoint path. Defaults to `/v1/metrics`. |
| service\_namespace | String | **Required.** The namespace to associate with emitted metrics used in the `service.namespace` OTel resource attribute. Defaults to `icinga`. |
| basic\_auth | Dictionary | **Optional.** Username and password for HTTP basic authentication. |
| flush\_interval | Duration | **Optional.** How long to buffer data points before transferring to the OTLP collector. Defaults to `15s`. |
| flush\_threshold | Number | **Optional.** How many bytes to buffer before forcing a transfer to the OTLP collector. Defaults to `32MiB`. |
| enable\_ha | Boolean | **Optional.** Enable the high availability functionality. Has no effect in non-cluster setups. Defaults to `false`. |
| enable\_send\_thresholds | Boolean | **Optional.** Whether to stream warning, critical, minimum, and maximum thresholds as separate metrics to the OTLP collector. Defaults to `false`. |
| disconnect\_timeout | Duration | **Optional.** Timeout to wait for any outstanding data to be flushed to the OTLP collector before disconnecting. Defaults to `10s`. |
| enable\_tls | Boolean | **Optional.** Whether to use a TLS stream. Defaults to `false`. |
| tls\_insecure\_noverify | Boolean | **Optional.** Disable TLS peer verification. Defaults to `false`. |
| tls\_ca\_file | String | **Optional.** Path to CA certificate to validate the remote host. |
| tls\_cert\_file | String | **Optional.** Path to the client certificate to present to the OTLP collector for mutual verification. |
| tls\_key\_file | String | **Optional.** Path to the client certificate key. |
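As a sketch of how the TLS-related attributes from the table above fit together, the following object sends metrics to
a collector over TLS and presents a client certificate. The host name and all file paths are placeholder assumptions
and need to be replaced with values from your own environment:

```
object OTLPMetricsWriter "otlp-metrics" {
  // Hypothetical collector address, replace with your own.
  host = "otel-collector.example.com"
  port = 4318
  metrics_endpoint = "/v1/metrics"

  // Use a TLS stream and validate the collector against the given CA.
  enable_tls = true
  tls_ca_file = "/path/to/ca.crt"

  // Client certificate and key (placeholder paths), only relevant if the
  // collector asks for mutual verification.
  tls_cert_file = "/path/to/client.crt"
  tls_key_file = "/path/to/client.key"
}
```

Leaving `tls_insecure_noverify` at its default of `false` keeps TLS peer verification enabled.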
### PerfdataWriter <a id="objecttype-perfdatawriter"></a>

@@ -73,6 +73,7 @@ best practice is to provide performance data.
This data is parsed by features sending metrics to time series databases (TSDB):
* [OpenTelemetry](14-features.md#otlpmetrics-writer)
* [Graphite](14-features.md#graphite-carbon-cache-writer)
* [InfluxDB](14-features.md#influxdb-writer)
* [OpenTSDB](14-features.md#opentsdb-writer)
@@ -751,6 +752,78 @@ mechanism ensures that metrics are written even if the cluster fails.
The recommended way of running OpenTSDB in this scenario is a dedicated server
where you have OpenTSDB running.
### OTLPMetrics Writer <a id="otlpmetrics-writer"></a>
The [OpenTelemetry Protocol (OTLP/HTTP)](https://opentelemetry.io/docs/specs/otlp/#otlphttp) metrics writer feature
allows Icinga 2 to send metrics to an OpenTelemetry Collector or any other backend that supports the OTLP HTTP protocol,
such as the [Prometheus OTLP](https://prometheus.io/docs/guides/opentelemetry/) receiver,
[Grafana Mimir](https://grafana.com/docs/mimir/latest/configure/configure-otel-collector/), or
[OpenSearch Data Prepper](https://docs.opensearch.org/latest/data-prepper/pipelines/configuration/sources/otlp-source/).
It integrates Icinga 2 metrics into existing observability stacks, so that you can use OpenTelemetry tooling for
analysis and visualization of your monitoring data. OpenTelemetry provides a standardized way to collect, process, and
export telemetry data, which makes it easier to integrate with numerous
[monitoring and observability](https://opentelemetry.io/docs/collector/components/exporter/) tools.
!!! note
    This feature has been tested successfully with OpenTelemetry Collector, the Prometheus OTLP receiver, OpenSearch
    Data Prepper, and Grafana Mimir. It should also work with any other backend that supports the OTLP HTTP protocol.
In order to enable this feature, you can use the following command:
```bash
icinga2 feature enable otlpmetrics
```
By default, the OTLPMetrics Writer expects the OpenTelemetry Collector or another OTLP HTTP receiver to listen at
`127.0.0.1` on port `4318`. Most third-party backends use their own ports, so you may need to adjust the configuration
accordingly. The `metrics_endpoint` also varies by backend: the OpenTelemetry Collector uses `/v1/metrics` (the
default), while the Prometheus OTLP receiver uses `/api/v1/otlp/v1/metrics`. Make sure to set the correct
`metrics_endpoint` in the configuration file. All configuration options are described in the
[OTLPMetricsWriter object type](09-object-types.md#objecttype-otlpmetricswriter) reference.
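As a rough sketch, a writer pointed at a Prometheus OTLP receiver could look like the following. The endpoint path is
the one mentioned above. The port `9090` is an assumption based on the usual Prometheus default and is not taken from
this documentation, so check it against your own Prometheus setup:

```
object OTLPMetricsWriter "otlp-metrics" {
  host = "127.0.0.1"
  // Assumed Prometheus listen port, adjust to your installation.
  port = 9090
  // Prometheus expects OTLP metrics on this path instead of the default /v1/metrics.
  metrics_endpoint = "/api/v1/otlp/v1/metrics"
}
```

Other backends typically differ only in `host`, `port`, and `metrics_endpoint`. The remaining attributes can stay at
their defaults unless the backend requires TLS or authentication.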
The generated metric names follow the OpenTelemetry naming conventions and cannot be customized, so they are identical
across all Icinga 2 installations. The OTLPMetrics Writer currently sends the following metrics:
| Metric Name | Description |
|---------------------------------|---------------------------------------|
| state_check.perfdata | Performance data metrics from checks. |
| state_check.thresholds.warning | Warning threshold values for checks. |
| state_check.thresholds.critical | Critical threshold values for checks. |
| state_check.thresholds.min | Minimum threshold values for checks. |
| state_check.thresholds.max | Maximum threshold values for checks. |
By default, the writer will not stream any data point for the `state_check.thresholds.*` metrics. To enable the
streaming of threshold metrics, you need to set the `enable_send_thresholds` option to `true` in the OTLPMetrics Writer
configuration. Once enabled, it will send the threshold values for each performance data metric if they are available
in the produced check results.
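A minimal sketch of a writer with threshold streaming turned on is shown below. It reuses the default host and port
from this section, and everything not listed keeps its default value:

```
object OTLPMetricsWriter "otlp-metrics" {
  host = "127.0.0.1"
  port = 4318
  // Additionally emit state_check.thresholds.* data points whenever the
  // check result contains warning, critical, min, or max values.
  enable_send_thresholds = true
}
```

Keep in mind that this can add up to four extra data points per performance data value, which increases the volume
sent to the backend.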
All of the above metrics use the [`gauge`](https://opentelemetry.io/docs/specs/otel/metrics/data-model/#gauge) data
point type, and the perfdata labels and their units (if available) are mapped to OpenTelemetry metric point attributes.
For example, a perfdata label `load1` with a value of `0.5` and unit `%` is sent to the `state_check.perfdata` metric
stream as a metric point with the value `0.5` and the attributes `label="load1"` and `unit="%"`. Additionally, each
metric point includes other relevant attributes such as `icinga2.host.name`, `icinga2.service.name`, and
`icinga2.command.name` as resource attributes. To inspect the complete data format and all attributes, let the
OpenTelemetry Collector log the received metrics either to the standard output or to a JSON file in a human-readable
format.
At the moment, the OTLPMetrics Writer allows you to configure only a single metrics resource attribute,
[`service.namespace`](https://opentelemetry.io/docs/specs/semconv/registry/attributes/service/#service-namespace), via
the `service_namespace` option in the OTLPMetrics Writer config. This attribute can be used to group related metrics
together in the backend. By default, it is set to `icinga`. You can change it to fit your monitoring environment, for
example to `production`, `staging`, or any other namespace that helps you categorize the Icinga 2 metrics in the
OpenTelemetry backend.
#### OTLPMetrics in HA Cluster Zones <a id="otlpmetrics-writer-ha-cluster"></a>
This writer supports [High Availability (HA)](06-distributed-monitoring.md#distributed-monitoring-high-availability-features)
cluster zones in Icinga 2. If you enable this feature on all of your cluster endpoints, each OTLPMetrics Writer sends
metrics independently to the configured OTLP collector. To avoid duplicate metrics being sent from multiple cluster
endpoints, it is recommended to set the `enable_ha` option to `true` in the OTLPMetrics Writer config on all cluster
endpoints. This ensures that only one writer in the cluster is active at any given time and sends metrics to the
configured OTLP collector. The other writers remain in standby mode, ready to take over if the active endpoint fails
or becomes unavailable.
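As an illustration, the following object enables the HA behavior described in this section. It is a sketch that assumes
the same object is deployed on every endpoint of the zone and that all endpoints can reach the collector at the given
address:

```
object OTLPMetricsWriter "otlp-metrics" {
  host = "127.0.0.1"
  port = 4318
  // Only one endpoint in the zone actively sends metrics at a time,
  // the others stay in standby until a failover happens.
  enable_ha = true
}
```

With `host = "127.0.0.1"`, each endpoint would need its own local collector. Pointing all endpoints at a shared
collector address is an alternative if you prefer a single receiver.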
### Writing Performance Data Files <a id="writing-performance-data-files"></a>