How to Configure Alertmanager and Integrate it with Prometheus
Alertmanager is the component of the Prometheus monitoring system that handles alerts sent by client applications such as the Prometheus server. It can group, filter, and route these alerts to notification channels such as email, Slack, PagerDuty, or webhooks. In this guide, we will walk through the steps to configure Alertmanager, create an alert, configure a webhook, and integrate Alertmanager with Prometheus.
Step 1: Install and Configure Alertmanager
First, you need to install Alertmanager. You can download the latest release from the official Prometheus downloads page; after downloading, extract the archive and move the binary to a directory on your system.
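On a Linux amd64 host, the installation looks roughly like this (the version number is an assumption; check https://prometheus.io/download/ for the current release):

# Download and unpack a release (substitute the current version)
wget https://github.com/prometheus/alertmanager/releases/download/v0.27.0/alertmanager-0.27.0.linux-amd64.tar.gz
tar xvf alertmanager-0.27.0.linux-amd64.tar.gz
cd alertmanager-0.27.0.linux-amd64

Next, create a configuration file for Alertmanager. Here is an example configuration file: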
global:
  resolve_timeout: 5m

route:
  receiver: 'webhook'

receivers:
  - name: 'webhook'
    webhook_configs:
      - url: 'http://example.com/alert'
In this example, we configure a global resolve timeout of 5 minutes, which means that Alertmanager declares an alert resolved if it has not received an update for it within 5 minutes. We also define the root route, which sends all alerts to the receiver named webhook. Finally, we define that webhook receiver and the URL Alertmanager should POST alerts to.
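The route block can also control how alerts are grouped into notifications and how often they repeat. A sketch of a fuller root route, with illustrative timing values you would tune for your environment:

route:
  receiver: 'webhook'
  group_by: ['alertname', 'instance']   # batch alerts that share these labels
  group_wait: 30s                       # wait before sending a brand-new group
  group_interval: 5m                    # wait before sending updates for a group
  repeat_interval: 4h                   # re-send while an alert keeps firing

These settings are optional; the minimal file above is enough for this guide.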
Save the configuration file as alertmanager.yml in the same directory as the Alertmanager binary.
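You can then validate the file and start Alertmanager, which listens on port 9093 by default:

# Validate the configuration (amtool ships with Alertmanager)
./amtool check-config alertmanager.yml

# Start Alertmanager
./alertmanager --config.file=alertmanager.yml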
Step 2: Create an Alert
Next, we need to create an alert for Alertmanager to send. In this example, we will create an alert that fires when the CPU usage on a server is too high. We assume that Prometheus is already installed and scraping the server's node_exporter, which exposes the node_cpu_seconds_total metric used in the rule below.
Create a file called cpu_alert.rules with the following content:
groups:
  - name: cpu_alert.rules
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"
          description: "The average CPU usage has been above 80% for more than 1 minute"
This rule file defines an alert called HighCPUUsage that fires when a server's CPU usage stays above 80% for one minute (the for: 1m clause). The expression derives the busy-CPU percentage by subtracting the per-instance average idle rate, computed over a 5-minute lookback window, from 100. The alert carries a severity label of warning, plus a summary and a description annotation. Save this file next to your Prometheus configuration so you can reference it from prometheus.yml, as shown below.
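Prometheus only loads rule files that are listed under rule_files in its configuration, so add an entry like this to prometheus.yml (the path assumes the rule file sits alongside the configuration):

rule_files:
  - 'cpu_alert.rules'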
Step 3: Configure a Webhook
Next, we need to configure a webhook to receive the alerts from Alertmanager. In this example, we will use a webhook that sends alerts to a Slack channel.
Create a new incoming webhook in Slack and copy its URL. Then, modify the Alertmanager configuration file to include the following webhook configuration:
global:
  resolve_timeout: 5m

route:
  receiver: 'slack'

receivers:
  - name: 'slack'
    webhook_configs:
      - url: 'https://hooks.slack.com/services/XXXXXXXXX/YYYYYYYYY/ZZZZZZZZZZZZZZZZZZZZZZZZ'
        send_resolved: true
In this example, we define a new receiver called slack that sends alerts to the Slack webhook URL. We also set the send_resolved flag to true, which means that Alertmanager will send a notification when an alert is resolved.
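Note that webhook_configs sends Alertmanager's own JSON payload, which Slack's incoming webhooks do not render as a message out of the box. For Slack specifically, Alertmanager's dedicated slack_configs integration is usually the better fit, since it formats notifications for Slack natively. A sketch using the same webhook URL (the channel name is a placeholder):

receivers:
  - name: 'slack'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/XXXXXXXXX/YYYYYYYYY/ZZZZZZZZZZZZZZZZZZZZZZZZ'
        channel: '#alerts'
        send_resolved: true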
Save the Alertmanager configuration file and restart Alertmanager.
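Instead of a full restart, you can also ask a running Alertmanager to reload its configuration:

# Either send SIGHUP to the process ...
kill -HUP $(pgrep alertmanager)

# ... or POST to the reload endpoint (default listen address assumed)
curl -X POST http://localhost:9093/-/reload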
Step 4: Integrate Alertmanager with Prometheus
To integrate Alertmanager with Prometheus, we need to add Alertmanager as a target in the Prometheus configuration file. Open the Prometheus configuration file (prometheus.yml) and add the following Alertmanager configuration:
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093
In this example, we define a single static Alertmanager target running on localhost at port 9093. Save the Prometheus configuration file and restart Prometheus.
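Before restarting, you can validate the edited configuration, including the rule file it references, with promtool, which ships with Prometheus:

./promtool check config prometheus.yml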
Now that Alertmanager and Prometheus are configured and integrated, we can test the alert by causing the CPU usage on the server to exceed 80%. Once the alert fires, Alertmanager will send a notification to the configured webhook (in this example, the Slack webhook).
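One quick way to push CPU usage above the threshold on a Linux host is to start a busy loop per core (a rough sketch; run it on the monitored server and clean up afterwards):

# Start one busy loop per CPU core
for i in $(seq "$(nproc)"); do yes > /dev/null & done

# ... wait for HighCPUUsage to fire, then stop the loops
pkill yes

You can confirm that the alert reached Alertmanager with amtool alert query --alertmanager.url=http://localhost:9093.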
Supported Integrations for Alertmanager
Alertmanager supports various integrations with third-party tools and services such as email, PagerDuty, VictorOps, Slack, Microsoft Teams, and more. To configure an integration, you need to define a receiver in the Alertmanager configuration file and configure its settings according to the integration’s requirements. Here is an example receiver configuration for sending alerts to an email address:
receivers:
  - name: 'email'
    email_configs:
      - to: 'admin@example.com'
        send_resolved: true
In this example, we define a receiver called email that sends alerts to the email address admin@example.com. We also set the send_resolved flag to true, which means that Alertmanager will send a notification when an alert is resolved.
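Email delivery also requires SMTP settings, which the snippet above omits. A minimal sketch of the corresponding global block, assuming a hypothetical relay at smtp.example.com; substitute your own host and credentials:

global:
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alertmanager@example.com'
  smtp_auth_username: 'alertmanager@example.com'
  smtp_auth_password: 'your-smtp-password'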
Complete Alertmanager Configuration File
Here is a complete Alertmanager configuration file that combines the Slack webhook and email integrations in a single receiver:
global:
  resolve_timeout: 5m

route:
  receiver: 'slack'

receivers:
  - name: 'slack'
    webhook_configs:
      - url: 'https://hooks.slack.com/services/XXXXXXXXX/YYYYYYYYY/ZZZZZZZZZZZZZZZZZZZZZZZZ'
        send_resolved: true
    email_configs:
      - to: 'admin@example.com'
        send_resolved: true
In this example, we define a single route that sends every alert to the slack receiver, which notifies both the Slack webhook (configured in the url field) and the email address (configured in the to field). The send_resolved flag is set to true for both integrations, so Alertmanager will send a notification when an alert is resolved. As noted above, the email integration additionally needs the global SMTP settings in order to deliver mail.
Conclusion
Alertmanager is a powerful component of the Prometheus monitoring system that helps you manage and respond to alerts from various sources. By following the steps outlined above, you can configure Alertmanager, integrate it with Prometheus, and send alerts to notification channels such as email, Slack, PagerDuty, or webhooks. With Alertmanager, you can stay on top of your system's health and quickly respond to issues as they arise.