Why I recommend native Prometheus instrumentation over OpenTelemetry

July 17, 2025 by Julius Volz

With all the hype around OpenTelemetry (OTel), you may be tempted to use OpenTelemetry and its SDKs for all of your application instrumentation needs. However, when it comes to generating metrics for use in Prometheus, you should at least think twice before going all in on OTel. Not only do you risk throwing away some of the core features that define Prometheus as a monitoring system, but you'll also end up with awkward metrics translation and escaping issues, as well as other inefficiencies and complexities. That's why I still recommend using Prometheus's own native instrumentation client libraries over the OTel SDKs if you want the best possible Prometheus monitoring experience. Let's have a look at the reasons.

Disclaimer: As a co-founder of Prometheus, I'm clearly biased here. There may still be reasons why you absolutely have to use OpenTelemetry with Prometheus, but advertisements for OpenTelemetry are found everywhere else already, so I won't repeat them here.

OpenTelemetry vs. Prometheus scopes

First, if you're not super familiar with OpenTelemetry and Prometheus, here's a quick comparison of the scopes of the two systems:

OTel vs. Prometheus scopes diagram

In short:

  • OpenTelemetry handles all three signal types (logs, metrics, and traces), but only cares about the generation (instrumentation) and transfer of the generated signals to some third-party backend system. The transfer usually happens via the OpenTelemetry Protocol (OTLP).
  • Prometheus only handles metrics (no logs or traces to be seen here), but it doesn't stop at generating them. Prometheus is a full monitoring system, so it also provides solutions for actively collecting and storing the data and making it queryable for dashboarding, alerting, and other use cases. The querying happens via the PromQL query language.

The scopes of the two systems overlap in the generation and transfer of metrics, but for that overlapping part, each comes with its own client libraries, transfer protocols, data model, and philosophy.

The basics of feeding OpenTelemetry metrics into Prometheus

While the two systems evolved separately, there are several ways to send OpenTelemetry metrics to Prometheus these days. Although you could send them directly from an application to a Prometheus server, the more common approach in the OTel world is to first send all data to the OpenTelemetry Collector, a separate service that receives and aggregates telemetry from multiple applications. The Collector can then send the processed data to a backend like Prometheus:

OTel-to-Prometheus architecture diagram

From the Collector, the metrics can make it into Prometheus in a few different ways:

  • The Prometheus Exporter allows a Prometheus server to pull metrics from the Collector in Prometheus's native metrics format. This is not ideal for larger use cases, as it forces all metrics into a single large Prometheus scrape, introducing scaling and reliability challenges.
  • The Prometheus Remote Write Exporter pushes metrics to the Prometheus server using Prometheus's own remote write format for inter-server metrics transfers. This is more suitable for larger use cases.
  • Most common: The OTLP Exporter uses OpenTelemetry's OTLP to push metrics to Prometheus. This is a good alternative to using Prometheus's remote write format, and in practice what most people use when sending OTel data to Prometheus.
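
To make this concrete, here is a minimal sketch of an OpenTelemetry Collector configuration for that last (OTLP) path. The Prometheus hostname is made up, and this is meant as a starting point rather than a complete production setup:

# Collector sketch: receive OTLP from applications, push metrics to Prometheus.
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  # The otlphttp exporter appends /v1/metrics to this base endpoint,
  # matching Prometheus's OTLP receiver path /api/v1/otlp/v1/metrics.
  otlphttp/prometheus:
    endpoint: http://prometheus.example.com:9090/api/v1/otlp

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [otlphttp/prometheus]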

Reasons against using OpenTelemetry with Prometheus

With that out of the way, here are some of the reasons why I recommend using native Prometheus instrumentation over OpenTelemetry, especially if you mostly care about metrics and using them in Prometheus.

Reason 1: You throw away Prometheus's target health monitoring

Prometheus is a full monitoring system, and a monitoring system should not just be a mindless receptacle for random incoming metrics. To do its job, a monitoring system fundamentally needs to know what the world should look like, what the world actually currently looks like, and what the discrepancies are between the desired and actual states of the world. To achieve this, it needs to know which application processes, machines, and other monitoring targets should currently exist and be healthy in your infrastructure.

Prometheus's native monitoring model solves this challenge by combining two crucial concepts:

  • Service discovery: Prometheus integrates with a variety of service discovery mechanisms to figure out which monitoring targets should currently exist. For example, on a Kubernetes cluster the Prometheus server can subscribe to a list of all pods, services, ingresses, and so on, of a given type via the Kubernetes API. This list is then continuously updated. If Prometheus does not support the kind of service discovery that you need, you can even build your own.
  • Pull-based metrics collection with built-in health checks: By actively pulling and labeling metrics from each discovered target, Prometheus can automatically detect if a target is down or not responding correctly.
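
To give a small sketch of how these two concepts combine in practice, here is roughly what a Kubernetes-based scrape configuration could look like. The opt-in annotation is just a common convention, not something Prometheus mandates:

scrape_configs:
  - job_name: "kubernetes-pods"
    kubernetes_sd_configs:
      - role: pod   # continuously subscribe to the pod list via the Kubernetes API
    relabel_configs:
      # Only keep pods that opt in via a (conventional) annotation.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Attach the pod name as a target label.
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod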

So Prometheus always has an up-to-date view of which targets should exist, and it records a synthetic up metric every time it scrapes a target, setting the sample value to 1 if the scrape succeeded and to 0 if it failed:

Prometheus's active target health monitoring

This makes it trivial to build basic health monitoring for all of your targets, meaning that you can easily find and alert on any targets that are down or unreachable:

# Alert if any target in the "demo" job is down for more than 5 minutes.
alert: TargetDown
expr: up{job="demo"} == 0
for: 5m

In contrast, OTLP is a push-based protocol and has no integration with Prometheus's service discovery capabilities. So there is no way for Prometheus to tell you that an expected metrics source is not reporting in (or that an unexpected one is). You lose the ability to detect if any of the following things go wrong:

  • A target that is supposed to be running is up, but is not sending metrics.
  • A target that is supposed to be running is down / absent and thus also not sending metrics.
  • A target that is not supposed to be monitored by this Prometheus server is sending metrics anyway (the pull model avoids this automatically).

Even if a target first reports data and then stops, you will have no way of knowing whether the target has been intentionally shut down or whether it is now crashing or its metrics reporting is broken. So you may actually have hundreds of broken service processes that you don't even know about.

To achieve a similar level of target health monitoring with OTLP, you would have to generate and ingest a separate set of metrics from some source of truth that tells you the intended state of each monitored target, and then manually correlate that set with the incoming OTLP data by joining the two on compatible identifying labels. This is a lot of extra work, so many people just ignore the problem completely (knowingly or not) and never notice that they don't have proper target health monitoring.
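
To make the extra work concrete, here is a rough, hypothetical sketch of such a rule. It assumes you separately ingest a metric from your source of truth (called expected_target_info here) that carries the same job and instance labels as the incoming OTLP data - both the metric name and the label choice are assumptions for illustration:

# Hypothetical: alert on targets that should exist but have not reported
# any OTLP data (visible via the auto-generated target_info metric) recently.
alert: ExpectedTargetNotReporting
expr: expected_target_info unless on(job, instance) max_over_time(target_info[10m])
for: 5m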

Reason 2: You get changed metric names and/or ugly PromQL selectors

When using OpenTelemetry metrics in Prometheus, you will run into some character set compatibility issues, different metric naming conventions, and escaping woes:

Character set differences

In contrast to OpenTelemetry, Prometheus has never just focused on generating and transferring metrics, but also cares about using them after collection. Prometheus's approach to this is the PromQL query language that allows you to write expressions to power dashboards, alerting rules, ad-hoc debugging, and other use cases. When you design a query language, you naturally have to care about syntactical clashes between identifiers and other language constructs like operators - ideally without requiring any escaping around the identifiers. This is why Prometheus has historically been very conservative about the set of characters that it allowed in metric and label names. This is similar to most programming languages, which also don't allow characters like ., -, or / in identifiers, as they would clash with operators in the language. So until Prometheus 3.0, Prometheus only allowed alphanumeric characters and underscores in label names, while also allowing colons in metric names (regexes [a-zA-Z_][a-zA-Z0-9_]* and [a-zA-Z_:][a-zA-Z0-9_:]*, respectively).

In contrast, OpenTelemetry allows dots, dashes, and other operator-like characters in metric and attribute (label) names - this makes me question whether using OTel metrics in a query language was ever a major design consideration. This is despite the fact that PromQL was already a widely implemented de facto standard for querying metrics when OpenTelemetry was being standardized. In practice, prior to Prometheus 3.0, this meant that the metric and attribute names coming from OpenTelemetry had to be translated to Prometheus-compatible names by replacing unsupported characters with underscores (so my.metric.name would become my_metric_name).

Unit and type suffixes

At the same time, Prometheus's metric naming conventions require metric names to end with a suffix indicating the metric's unit, as well as a _total suffix for counter metrics (to disambiguate them from gauges indicating a current count). Both conventions make a metric more understandable, especially in the context of a larger PromQL expression that may be embedded somewhere in a YAML file that is checked in as code. In contrast, OpenTelemetry's naming conventions only treat the unit and type of a metric as separate metadata fields that are not supposed to be included in the metric name - this is inconvenient if you are working with PromQL expressions in an environment where you don't have immediate access to a metric's metadata. To work around this and still keep metric names understandable, the OTLP-to-Prometheus translation layer adds Prometheus-style unit and type suffixes to OpenTelemetry metric names by default.

So for example, the OTel metric k8s.pod.cpu.time would become k8s_pod_cpu_time_seconds_total after translation, both replacing the dots with underscores, as well as clarifying the unit (seconds) and type (counter) of the metric. Unfortunately, this name change can also be confusing for users.

UTF-8 support in Prometheus 3.0

Starting with version 3.0, Prometheus adds full UTF-8 support for metric and label names, so you can now theoretically store unaltered OTel metric names in Prometheus. However, this comes with some drawbacks when writing PromQL selectors: you have to quote names that use the extended character set, and you have to move any such quoted metric names inside of the curly braces (the label matchers list) of the time series selector. Otherwise PromQL would be unable to tell the metric name from a regular string literal expression.

So for example, if you had a traditional selector like this:

my_metric{my_label="value"}

...you would need to write it like this to support the extended character set (dots instead of underscores in this example):

{"my.metric", "my.label"="value"}

This selector syntax is less intuitive to read and more cumbersome to write, so keep this in mind before deciding to go beyond the original Prometheus character set for identifiers.

Starting with the UTF-8 support in Prometheus 3, the otlp.translation_strategy setting in the Prometheus configuration file lets you choose how incoming OTel metric names are translated:

  • UnderscoreEscapingWithSuffixes: Flatten names to the traditional character set with underscores, as before. This is still the default.
  • NoUTF8EscapingWithSuffixes: Keep all original characters, but still add counter and unit suffixes to enhance clarity.
  • NoTranslation: Keep all original characters and don't add any suffixes (still experimental, might be removed).

Whether you want to keep the original character set and live with the noisier selector syntax shown above is up to you. However, I would strongly recommend against the NoTranslation option, which even omits the unit and type suffixes - that will just make your metrics harder to understand when you're working with them later on.
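
For reference, this setting lives under the otlp section of the Prometheus configuration file. A minimal sketch, assuming you want to keep the original characters but retain the clarifying suffixes:

otlp:
  # Keep UTF-8 metric and label names as-is, but still append unit and _total suffixes.
  translation_strategy: NoUTF8EscapingWithSuffixes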

In contrast, if you use Prometheus's own instrumentation client libraries and naming conventions, you will never even have to think about these character set or naming differences, and your metric and label names will look identical on the instrumentation and usage sides.

Reason 3: Resource attributes vs. target labels - same same but different?

Both Prometheus and OpenTelemetry like to attach a set of labels (or attributes, in OTel terminology) that give you information about the source of a metric, such as the application or service that generated it. In OpenTelemetry, these are called resource attributes, while Prometheus calls them target labels.

However, the two systems have somewhat different ideas about how to use these labels:

  • Prometheus target labels are attached to scraped metrics by the Prometheus server based on the target's metadata, which usually originates from a dynamic service discovery mechanism. Target labels tend to be relatively few and are mostly identifying (meaning that they are necessary to identify a target, instead of just adding extra information beyond that).
  • OpenTelemetry resource attributes are chosen by the application that generates the metrics and are usually way more numerous and detailed. They often provide a lot of additional context about the metric source, such as the programming language and version of the OTel SDK and other non-critical metadata.

Since Prometheus target labels are relatively few and mostly identifying, Prometheus attaches all of them (by default) to every metric that it scrapes from a target. That's convenient, but this model breaks down for OpenTelemetry's resource attributes, since attaching a long list of informational attributes to every ingested series makes the resulting metrics too expensive and harder to use.

So when receiving metrics via OTLP, Prometheus only attaches a small subset of the OTel resource attributes to all metrics by default:

  • service.name: This becomes the job label in Prometheus, which is used to identify the job (or service).
  • service.instance.id: This becomes the instance label in Prometheus, which is used to identify the instance (specific process) within the job.

All other resource attributes are only attached to a single target_info metric that is generated once for each resource. If you then really need to include one of these additional attributes in a query result, you have to join it in via PromQL. However, this gets a bit unwieldy. Instead of just writing a simple query like this:

rate(http_server_request_duration_seconds_count[5m])

...you will now have to write the following expression to also include the k8s_cluster_name resource attribute from the target_info metric:

    rate(http_server_request_duration_seconds_count[5m])
* on(job, instance) group_left(k8s_cluster_name)
    target_info

There is also a new info() function in Prometheus 3.0 that aims to make this a bit easier, but it's still marked as experimental and is not yet fully fleshed out.
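
For completeness, here is roughly what the info()-based version of the query above looks like. Since the function is experimental (it currently sits behind the promql-experimental-functions feature flag), its exact shape may still change:

info(
  rate(http_server_request_duration_seconds_count[5m]),
  {k8s_cluster_name=~".+"}
)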

If you are a Prometheus server operator, you can also configure which resource attributes get promoted to regular labels on every series ingested via OTLP.
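
This promotion is also configured in the otlp section of the Prometheus configuration file. A short sketch - the attribute names below are just examples of commonly promoted attributes, not a recommendation:

otlp:
  promote_resource_attributes:
    - service.namespace
    - k8s.cluster.name
    - k8s.namespace.name
    - k8s.pod.name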

In any case, you now have to think about all this - how to either join additional labels into your queries, or how to configure your Prometheus server to promote just the right subset of resource attributes to target labels. In contrast, if you use Prometheus's native instrumentation client libraries and scraping model, only the Prometheus administrator will usually have to think about a general target labeling configuration (or use the default).

Reason 4: More Prometheus settings required for ingestion

This is admittedly not a big downside, but more of an FYI: To allow Prometheus to receive metrics via OTLP, you will have to configure two additional settings on the Prometheus server:

  • You'll need to set the --web.enable-otlp-receiver command-line flag so that the Prometheus server starts the OTLP receiver on /api/v1/otlp/v1/metrics. This is because Prometheus is primarily intended to be a pull-based monitoring system, so allowing external clients to push metrics to it could present a security issue and needs to be enabled explicitly. You will now also have to protect this endpoint against pushes from unauthorized sources, a concern that doesn't exist when Prometheus is in charge of pulling metrics from targets.
  • Since OTLP does not guarantee in-order delivery of data (contrary to Prometheus's usual model of scraping and timestamping data itself), you will need to configure the Prometheus TSDB to allow out-of-order appends for some period of time (like 30 minutes):
    storage:
      tsdb:
        out_of_order_time_window: 30m

There are also potentially other settings you'll want to configure when using OTLP with Prometheus. See the Prometheus documentation for details.

Reason 5: OTel SDKs are complex and can be very slow

Another concern is the overall complexity and implementation inefficiency that you are taking on when using OpenTelemetry SDKs instead of Prometheus's native client libraries. OTel is a large and complex system that tries to do a lot of things, including logs and traces and complex features around views and aggregations for metrics inside of its instrumentation libraries. This means that its SDKs are also large and complex, and you will have to understand more concepts. In contrast, Prometheus's native instrumentation libraries are small and focused on just generating Prometheus metrics in an efficient way.

Disclaimer: In this example, I will mostly look at the Prometheus and OpenTelemetry SDKs for the Go programming language. I can't say for sure how the situation looks in all the other languages, but:

  • Go is a relevant example, as it's a popular language in the cloud native world and is used for heavy-duty server applications where performance matters.
  • Due to OTel's larger inherent conceptual complexity in the path between an event (like a counter increment) and the final OTLP output, I would not be surprised if the results are at least directionally similar in other languages as well.
  • I did create a vibe-coded variant of the same benchmarks for the Prometheus and OTel Java SDKs (since I don't speak Java) as well, and the results were similar. At least for multi-threaded performance, the Prometheus SDK was up to >30x faster.

I will also not go into OTel's more complex SDK initialization and setup code here, since that is a one-time cost that quickly amortizes itself in a larger codebase. However, I do want to look at one representative operation: the speed of counter increments in both the OpenTelemetry and Prometheus Go SDKs. This is a common operation that may happen up to millions of times a second in a busy multi-core server application, and you don't want your instrumentation framework to make up a noticeable portion of your CPU usage in that case. The Prometheus Go SDK has therefore been highly optimized for this kind of operation, as it has for metric value updates in general.
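
To make the comparison concrete, here is a hedged sketch of what the hot-path increment looks like in each Go SDK. The metric and label names are made up, and the full SDK setup (registries, meter providers, exporters) is omitted:

package main

import (
    "context"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/metric"
)

// client_golang: a counter vector registered with the default registry.
var httpRequests = promauto.NewCounterVec(
    prometheus.CounterOpts{
        Name: "http_requests_total",
        Help: "Total number of HTTP requests.",
    },
    []string{"method"},
)

func main() {
    ctx := context.Background()

    // Prometheus hot path: look up the labeled child, then increment it.
    httpRequests.WithLabelValues("GET").Inc()

    // OpenTelemetry hot path: Add() takes a context and attribute options.
    // (Assumes a real MeterProvider has been installed elsewhere; without
    // one, otel.Meter returns a no-op meter.)
    meter := otel.Meter("example")
    otelRequests, err := meter.Int64Counter("http.server.requests")
    if err != nil {
        panic(err)
    }
    otelRequests.Add(ctx, 1,
        metric.WithAttributes(attribute.String("http.request.method", "GET")))
}

Note how the Prometheus call needs neither a context nor option values on the hot path, which is part of how it manages to avoid allocations on increments.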

Comparing counter increment performance

You can find the quick-and-dirty benchmark code on GitHub and see some raw results in a gist, but in short, the code creates and then increments a counter metric in a busy loop under different conditions: with and without labels (attributes), with different levels of parallelism, and with different levels of labeled child metric reference caching. In every instance, the Prometheus Go SDK is much faster than the OpenTelemetry one (on an Intel i7-12700KF CPU with 20 cores on Arch Linux):

Benchmarks for counter increments without labels

Benchmarks for counter increments with uncached labels

  • In the worst case (no labels and a parallelism of 16), the Prometheus SDK was around 26x faster.
  • In the best case (uncached labels and a parallelism of 2), the Prometheus SDK was still around 4.4x faster.

If you look at the raw benchmark results, you will also see memory allocations happening in the OTel SDK whenever attributes are set on a counter increment, while the Prometheus Go SDK allocates zero new memory for all cases.

Further optimizations with cached labels

Prometheus's Go SDK also lets you keep a reference to a specific labeled child metric, in case you know that you'll need to increment the same label set many times in a row. In this case, the comparison is even more extreme (up to a 53x difference), since the OTel SDK does not allow for this kind of optimization:

Benchmarks for counter increments with cached labels
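
In the Prometheus Go SDK, that optimization is just a matter of resolving the labeled child once and reusing it. A short sketch with made-up metric and label names:

package main

import (
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
)

var requests = promauto.NewCounterVec(
    prometheus.CounterOpts{
        Name: "http_requests_total",
        Help: "Total number of HTTP requests.",
    },
    []string{"method", "status"},
)

func main() {
    // Uncached: every increment pays for the label-value-to-child lookup.
    for i := 0; i < 1000; i++ {
        requests.WithLabelValues("GET", "200").Inc()
    }

    // Cached: resolve the child once, then increment it directly in the hot loop.
    cached := requests.WithLabelValues("GET", "200")
    for i := 0; i < 1000; i++ {
        cached.Inc()
    }
}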

A more subjective impression of the SDK code complexity

As another, more subjective point of comparison: finding the line of code in the SDKs where the actual value increment happens took me about 5 seconds for the Prometheus Go SDK (starting at the Inc() call in the application code in VS Code). For the OpenTelemetry Go SDK, I gave up after around 15 minutes of chasing the corresponding line through various indirections and abstractions. I might have been able to find it eventually, for example by profiling the running code or by looking at things again with a fresher mind. Still, this goes to show the different levels of complexity at work in the two SDKs.

Reason 6: If you want open standards, Prometheus is open and established as well

Finally, many people are adopting OpenTelemetry because they want an open standard for their monitoring and observability needs. However, Prometheus is also open source and has an open governance model. It is a mature, battle-tested, and widely used system backed by a large community of contributors and users. Many cloud vendors have adopted PromQL, Prometheus's Remote Write format, and other Prometheus interfaces as de facto standards for ingesting and querying metrics. And Prometheus's text-based format for exposing metrics from a target is so simple that you can produce it in almost any environment. If you really need to, you can even implement basic forms of the format in a few lines of shell script without having to depend on a library. For example, the following script serves a simple Prometheus /metrics endpoint containing a single metric:

#!/bin/bash
# Write a single metric in the Prometheus text format and serve the
# directory with any static file server (here: the "serve" npm package).
mkdir -p /tmp/metrics
echo 'my_metric{label="value"} 42' > /tmp/metrics/metrics
npx serve -l 8080 /tmp/metrics

You could now point a Prometheus server at http://localhost:8080/metrics to pull metrics from this endpoint. Contrast this with OTLP, a deeply nested protocol-buffer-based transfer format that is much harder to implement without a proper SDK.
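
To actually scrape the little shell exporter above, the Prometheus-side configuration is just as small (a sketch; the job name is arbitrary, and /metrics is already the default metrics path):

scrape_configs:
  - job_name: "shell-demo"
    static_configs:
      - targets: ["localhost:8080"]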

And you always have the option of bridging Prometheus metrics endpoints into OpenTelemetry / OTLP later on if you really need to. For example, in Go you can use the Prometheus Bridge to expose native Prometheus metrics via OTLP.
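
Here is a rough sketch of what that looks like in Go, based on my understanding of the opentelemetry-go-contrib bridge - double-check the package paths and options against the current documentation before relying on them:

package main

import (
    "context"
    "time"

    prombridge "go.opentelemetry.io/contrib/bridges/prometheus"
    "go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp"
    sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
    ctx := context.Background()

    // OTLP/HTTP exporter; the target endpoint comes from the usual
    // OTEL_EXPORTER_OTLP_* environment variables or explicit options.
    exporter, err := otlpmetrichttp.New(ctx)
    if err != nil {
        panic(err)
    }

    // The bridge gathers metrics from client_golang's default registry and
    // feeds them into the OTel SDK reader as an external producer.
    reader := sdkmetric.NewPeriodicReader(
        exporter,
        sdkmetric.WithInterval(15*time.Second),
        sdkmetric.WithProducer(prombridge.NewMetricProducer()),
    )

    provider := sdkmetric.NewMeterProvider(sdkmetric.WithReader(reader))
    defer func() { _ = provider.Shutdown(ctx) }()

    // ... run your application. Metrics registered with the default
    // Prometheus registry are now also exported via OTLP.
}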

Conclusion

If you are using Prometheus as your monitoring system, I still strongly recommend using Prometheus's own native instrumentation client libraries and Prometheus's pull-based monitoring model for monitoring your services. You end up with more reliable, complete, and efficient monitoring, and you avoid the complexities of metric name translation, escaping issues, and the need to join additional labels into queries. Of course you may still have other reasons to use OpenTelemetry, but now you know the downsides as well.



Tags: prometheus, opentelemetry, instrumentation, metrics, service discovery, monitoring
