Monitoring TLS Endpoint Certificate Expiration with Prometheus

February 06, 2024 by Julius Volz

If you have HTTPS endpoints in your infrastructure, you will want to make sure that none of their TLS certificates expire without you noticing. Let's look at how you can use the Prometheus Blackbox Exporter to detect TLS certificates that are about to expire, so you can renew them in time.

Setting up the Blackbox Exporter

The Blackbox Exporter is a Prometheus exporter that actively probes service endpoints from the outside and returns Prometheus metrics about each executed probe request. If you want to learn more about how this exporter works together with Prometheus and what else it can do, you can find all the details in our Probing Services - Blackbox Exporter training.

Let's download and run the Blackbox Exporter:

# Download the Blackbox Exporter.
wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.24.0/blackbox_exporter-0.24.0.linux-amd64.tar.gz

# Unpack it.
tar xvfz blackbox_exporter-0.24.0.linux-amd64.tar.gz

# Change into the unpacked directory.
cd blackbox_exporter-0.24.0.linux-amd64

# Run the Blackbox Exporter on its default port of 9115.
./blackbox_exporter

The bundled default blackbox.yml configuration file already contains an http_2xx prober module configuration that allows us to execute HTTP(S) requests through the Blackbox Exporter from Prometheus and get back various Prometheus metrics about the probe.

Try heading to http://localhost:9115/probe?module=http_2xx&target=https://promlabs.com in your browser to trigger a manual HTTPS probe request against the URL https://promlabs.com. In the response, you will see the resulting Prometheus metrics about the probe request:

# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 0.006111063
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.753497273
# [...abridged...]
# HELP probe_ssl_earliest_cert_expiry Returns last SSL chain expiry in unixtime
# TYPE probe_ssl_earliest_cert_expiry gauge
probe_ssl_earliest_cert_expiry 1.710767408e+09
# [...abridged...]

We will later configure Prometheus to scrape this /probe endpoint to trigger a probe and collect the same metrics.

The returned metrics include probe_ssl_earliest_cert_expiry. Its value is a Unix timestamp that tells us when the earliest-expiring certificate in the endpoint's certificate chain expires (hopefully this timestamp is in the future, otherwise the certificate has already expired). We will use this metric later to alert on certificates that are about to expire.
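
To make the timestamp easier to read, here is a minimal shell sketch that pulls the metric value out of probe output and converts it to a UTC date. The helper names extract_expiry and to_utc_date are hypothetical, and the date invocation assumes GNU date (as found on the Linux system used in this post). The sample value is the one from the probe output shown above.

```shell
# Print the value of probe_ssl_earliest_cert_expiry from probe output on stdin
# as a plain integer Unix timestamp (the exporter prints it in e-notation).
extract_expiry() {
  awk '$1 == "probe_ssl_earliest_cert_expiry" { printf "%.0f\n", $2 }'
}

# Convert a Unix timestamp to a human-readable UTC date (GNU date syntax).
to_utc_date() {
  date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}

# Sample line copied from the probe output shown above.
sample='probe_ssl_earliest_cert_expiry 1.710767408e+09'
expiry=$(printf '%s\n' "$sample" | extract_expiry)
to_utc_date "$expiry"   # prints: 2024-03-18 13:10:08 UTC
```

Against a running exporter, the same helper could read from curl instead of the sample line, for example by piping the output of curl -s 'http://localhost:9115/probe?module=http_2xx&target=https://promlabs.com' into extract_expiry.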

Setting up Prometheus to probe endpoints through the Blackbox Exporter

Let's download and unpack Prometheus:

# Download Prometheus.
wget https://github.com/prometheus/prometheus/releases/download/v2.49.1/prometheus-2.49.1.linux-amd64.tar.gz

# Unpack it.
tar xvfz prometheus-2.49.1.linux-amd64.tar.gz

# Change into the unpacked directory.
cd prometheus-2.49.1.linux-amd64

Now we need to create a Prometheus configuration file to scrape the Blackbox Exporter's /probe endpoint with the correct parameters to select the target to probe and the type of probe to execute. In this example, we will check the certificates of the URLs https://promlabs.com and https://prometheus.io. To get alerted about certificates that are about to expire, we also create a rule file with an alerting rule that compares the TLS certificate expiration timestamp against the current time and alerts when a certificate will expire within 7 days.

Overwrite the downloaded example prometheus.yml file with the following contents (explanatory comments inline):

global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "tls-cert-rules.yml"

scrape_configs:
  - job_name: blackbox
    metrics_path: /probe # Do not scrape /metrics, but /probe.
    params:
      module: [http_2xx] # Send a "module" HTTP parameter to the exporter to select the right probe module.
    static_configs:
      - targets:
        - https://promlabs.com
        - https://prometheus.io
    # A relabeling config that lets us scrape each target through the Blackbox Exporter,
    # while labeling the resulting metrics with the probed target's URL.
    relabel_configs:
      # Set the "target" HTTP parameter to the target URL that we want to probe.
      - source_labels: [__address__]
        target_label: __param_target
      # Set the "instance" label to the target URL that we want to probe.
      - source_labels: [__param_target]
        target_label: instance
      # Don't actually scrape the target itself, but the Blackbox Exporter.
      - target_label: __address__
        replacement: localhost:9115

Note: We use a set of relabeling rules above to scrape each target through the Blackbox Exporter, while labeling the resulting metrics with the probed target's URL instead of the Blackbox Exporter's own address. To learn more about relabeling in general, take a look at our free Relabeling training.
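
To illustrate what these relabeling rules amount to, the following sketch prints the URL that Prometheus effectively requests for each configured target. The probe_url function is a hypothetical helper written for this illustration, not part of Prometheus or the Blackbox Exporter, and it leaves the parameter values unencoded for readability, whereas Prometheus URL-encodes them when scraping.

```shell
# Build the /probe URL for one target.
# Arguments: exporter address, probe module, target URL.
# Note: Prometheus additionally URL-encodes the parameter values;
# this sketch leaves them readable.
probe_url() {
  printf 'http://%s/probe?module=%s&target=%s\n' "$1" "$2" "$3"
}

# The two targets from the static_configs section above.
for target in https://promlabs.com https://prometheus.io; do
  probe_url localhost:9115 http_2xx "$target"
done
```

The first URL printed is the same one we opened manually in the browser earlier. The resulting series carry the target URL in their instance label, because the second relabeling rule copies it there.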

Create the tls-cert-rules.yml file with the expiration alerting rule:

groups:
- name: tls-cert-rules
  rules:
  - alert: TLSCertificateExpiring
    expr: probe_ssl_earliest_cert_expiry - time() < 7*24*3600
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Certificate for {{ $labels.instance }} is about to expire"
      description: "The certificate for {{ $labels.instance }} is about to expire in less than 7 days."
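
As a rough illustration of the comparison that the alert expression performs, here is the same arithmetic as a small shell function. The name cert_expiring and the example timestamps are made up for this sketch; Prometheus evaluates the actual rule itself.

```shell
# Succeed (exit status 0) when the certificate expires within the threshold.
# Arguments: expiry timestamp, current timestamp, threshold in days (default 7).
# Mirrors: probe_ssl_earliest_cert_expiry - time() < 7*24*3600
cert_expiring() {
  threshold=$(( ${3:-7} * 24 * 3600 ))
  [ $(( $1 - $2 )) -lt "$threshold" ]
}

expiry=1710767408   # sample expiry timestamp from the probe output earlier

# Hypothetical "now" values: 10 days and 3 days before the expiry.
if cert_expiring "$expiry" $(( expiry - 10 * 86400 )); then echo "alert"; else echo "ok"; fi
if cert_expiring "$expiry" $(( expiry - 3 * 86400 )); then echo "alert"; else echo "ok"; fi
```

Because the difference becomes negative once a certificate has expired, the condition also holds for certificates that are already past their expiry time. Before starting Prometheus, you can optionally validate the rule file with the promtool binary that ships in the same tarball by running ./promtool check rules tls-cert-rules.yml.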

Now we can start Prometheus with our new configuration:

./prometheus

Give Prometheus a few seconds to scrape the Blackbox Exporter and collect metrics. Then head to http://localhost:9090/targets to check that the two endpoints are being probed correctly:

Prometheus scraping Blackbox Exporter targets

Head to http://localhost:9090/alerts to see the alerting rule we just created:

TLS certificate expiration alert

You should see an alert called TLSCertificateExpiring that is currently inactive. If you pointed Prometheus at one of your own endpoints for which a certificate is about to expire, the alert should become active.

You can also manually query the probe_ssl_earliest_cert_expiry metric to see the expiration timestamp of each certificate:

Querying TLS certificate expiry

To check how many days are left until each certificate expires, you can use the time() function to get the current Unix time in seconds, subtract it from the probe_ssl_earliest_cert_expiry metric, and convert the resulting number of seconds to days:

(probe_ssl_earliest_cert_expiry - time()) / 24 / 3600

Querying TLS certificate expiry duration

In the example above, the certificate for prometheus.io expires in 50.5 days, while the certificate for promlabs.com expires in 40.8 days.
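
The same arithmetic can be reproduced outside of Prometheus. The sketch below uses a hypothetical days_left helper; the expiry timestamp is the sample value from the probe output earlier, while the "now" timestamp is an assumed value chosen only so that the result matches the 40.8 days mentioned above.

```shell
# Days until expiry, to one decimal place. Mirrors the PromQL expression
# (probe_ssl_earliest_cert_expiry - time()) / 24 / 3600.
# Arguments: expiry timestamp, current timestamp (both in Unix seconds).
days_left() {
  awk -v expiry="$1" -v now="$2" \
    'BEGIN { printf "%.1f\n", (expiry - now) / 24 / 3600 }'
}

# Assumed "now" for illustration; on a live system use $(date +%s) instead.
days_left 1710767408 1707242288   # prints: 40.8
```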

Conclusion

We have now set up a Prometheus instance that scrapes the Blackbox Exporter's /probe endpoint to trigger probes against our HTTPS endpoints and collect the resulting metrics. We also have an alerting rule that fires when a certificate will expire within 7 days, allowing us to renew it in time.

If you want to learn more about how to set up alerting rules in Prometheus, take a look at our Alerting with Prometheus training. You can also learn more about the Blackbox Exporter in our Probing Services - Blackbox Exporter training.



Tags: alerting, tls, certificates, expiration, blackbox exporter