Comprehensive Guide to Monitoring and Alerting with Prometheus, Grafana, and Alertmanager

Introduction

This guide explains how to set up alerting for a Node.js application with Prometheus, Alertmanager, and Grafana. It covers installing Alertmanager, defining alerting rules in Prometheus, routing notifications through Alertmanager, and viewing alert state in Grafana.

Prerequisites

Before starting, ensure that you have the following:

Prometheus: Installed and running.
Grafana: Installed and running.
Alertmanager: Handles alert notifications (installed in Step 1).
Node.js application: Exposing metrics at the /metrics endpoint.
Basic understanding of Prometheus metrics and alerts. Familiarity with Docker is optional but useful.

Contents

Step 1: Install Alertmanager
- Option 1: Install Alertmanager Using Docker (Recommended)
- Option 2: Manual Installation
Step 2: Configure Alerting Rules in Prometheus
Step 3: Set Up Alertmanager
Step 4: Verify Alerts in Prometheus
Step 5: Display Alerts in Grafana
Conclusion

Step 1: Install Alertmanager

Option 1: Install Alertmanager Using Docker (Recommended)

To quickly set up Alertmanager using Docker, execute the following command in your terminal:

docker run -d --name=alertmanager -p 9093:9093 prom/alertmanager

Access: Once running, Alertmanager will be available at http://localhost:9093.

Option 2: Manual Installation

Download Alertmanager from the official Prometheus downloads page.
Extract the downloaded tarball:
```
tar -xvzf alertmanager-*.tar.gz
```

From the extracted directory, which includes a sample alertmanager.yml, start Alertmanager with the following command:

 ./alertmanager --config.file=alertmanager.yml
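
Whichever option you used, it can help to confirm that Alertmanager is actually answering before moving on. The following is a minimal sketch, not part of the original setup: it polls Alertmanager's /-/healthy endpoint with curl. The function name wait_for_alertmanager and the default of 10 attempts are illustrative choices, and the default URL assumes the port mapping used above.

```shell
# Poll Alertmanager's health endpoint until it answers or attempts run out.
# Arguments (both optional): base URL, number of attempts.
wait_for_alertmanager() {
  local url="${1:-http://localhost:9093}"
  local attempts="${2:-10}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors; output is discarded.
    if curl -fsS "${url}/-/healthy" >/dev/null 2>&1; then
      echo "Alertmanager is healthy at ${url}"
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "Alertmanager did not respond at ${url} after ${attempts} attempts" >&2
  return 1
}

# Usage:
#   wait_for_alertmanager http://localhost:9093 10
```

The function returns a non-zero status on failure, so it can also gate later steps in a setup script.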

Step 2: Configure Alerting Rules in Prometheus

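Set up Prometheus to trigger alerts based on specific conditions (e.g., CPU usage exceeding 80%). The example rule below uses node_cpu_seconds_total, which is exported by Node Exporter (host-level CPU) rather than by the Node.js application, so Node Exporter must be among your scrape targets. For the application's own CPU time, a Node.js process instrumented with prom-client's default metrics exposes process_cpu_seconds_total instead. The 1m rate window also assumes a scrape interval well under one minute (for example, 15s), since rate() needs at least two samples in the window.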

Create an Alerting Rules File: Create a file named alert.rules.yml in your Prometheus directory with the following content:

 groups:
   - name: example
     rules:
       - alert: HighCPUUsage
         expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100) > 80
         for: 1m
         labels:
           severity: critical
         annotations:
           summary: "High CPU usage detected on {{ $labels.instance }}"
           description: "CPU usage has been over 80% for the past 1 minute on {{ $labels.instance }}."
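
Before pointing Prometheus at the rules file, it is worth checking its syntax, since a malformed rule prevents the configuration from loading. The sketch below wraps promtool check rules, which ships with the Prometheus release. The wrapper name check_rules is hypothetical, and it assumes promtool is on your PATH.

```shell
# Validate a Prometheus rules file with promtool.
# Argument (optional): path to the rules file, defaults to alert.rules.yml.
check_rules() {
  local rules_file="${1:-alert.rules.yml}"
  if [ ! -f "$rules_file" ]; then
    echo "Rules file not found: $rules_file" >&2
    return 2
  fi
  # promtool exits non-zero and prints the error if the file is invalid.
  promtool check rules "$rules_file"
}

# Usage:
#   check_rules alert.rules.yml
```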

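Update the Prometheus Configuration: Add the following to your prometheus.yml. If Prometheus itself runs in a Docker container, localhost refers to the Prometheus container, so replace it with an address at which that container can reach Alertmanager (for example, the Alertmanager container name on a shared Docker network):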

 rule_files:
   - "alert.rules.yml"

 alerting:
   alertmanagers:
     - static_configs:
         - targets:
           - "localhost:9093"  # Address of the Alertmanager instance

Restart Prometheus:

If running manually:

  ./prometheus --config.file=prometheus.yml

If using Docker (assuming your container is named prometheus):
```
docker restart prometheus
```
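
A full restart is not the only way to pick up the new rules. As a sketch, and only if Prometheus was started with the --web.enable-lifecycle flag, a POST to the /-/reload endpoint asks it to re-read its configuration in place. The function name reload_prometheus is illustrative, and the default URL assumes Prometheus listens on port 9090.

```shell
# Ask a running Prometheus to reload its configuration and rule files.
# Requires Prometheus to be started with --web.enable-lifecycle.
# Argument (optional): base URL of Prometheus.
reload_prometheus() {
  local url="${1:-http://localhost:9090}"
  if curl -fsS -X POST "${url}/-/reload"; then
    echo "Reload requested at ${url}"
  else
    echo "Reload failed; is --web.enable-lifecycle set?" >&2
    return 1
  fi
}

# Usage:
#   reload_prometheus http://localhost:9090
# Without the lifecycle flag, sending SIGHUP to the Prometheus
# process triggers the same reload.
```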

Step 3: Set Up Alertmanager

Configure Alertmanager to handle alerts triggered by Prometheus.

Create/Modify the Alertmanager Configuration: Create or edit alertmanager.yml with the following example configuration to send notifications via email (adjust to your needs):

 route:
   receiver: 'email'

 receivers:
   - name: 'email'
     email_configs:
       - to: 'your-email@example.com'
         from: 'alertmanager@example.com'
         smarthost: 'smtp.example.com:587'
         auth_username: 'username'
         auth_password: 'password'
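
Alertmanager refuses to start with an invalid configuration, so checking the file first saves a restart cycle. The sketch below uses amtool check-config; amtool is included in the Alertmanager release archive. The wrapper name check_alertmanager_config is hypothetical, and it assumes amtool is on your PATH.

```shell
# Validate an Alertmanager configuration file with amtool.
# Argument (optional): path to the config file, defaults to alertmanager.yml.
check_alertmanager_config() {
  local config_file="${1:-alertmanager.yml}"
  if [ ! -f "$config_file" ]; then
    echo "Config file not found: $config_file" >&2
    return 2
  fi
  # amtool exits non-zero and prints the error if the config is invalid.
  amtool check-config "$config_file"
}

# Usage:
#   check_alertmanager_config alertmanager.yml
```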

Start/Restart Alertmanager:

If running manually:

  ./alertmanager --config.file=alertmanager.yml

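If using Docker:
```
docker restart alertmanager
```
A restart only picks up your changes if alertmanager.yml is mounted into the container. The container started in Step 1 has no such mount, so it keeps using the image's default configuration at /etc/alertmanager/alertmanager.yml.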
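
To apply a custom configuration under Docker, the file needs to be mounted into the container. A minimal sketch, assuming the container name and port from Step 1 and the default config path of the prom/alertmanager image: it removes the existing container and starts a new one with the file bind-mounted read-only. The function name recreate_alertmanager is illustrative.

```shell
# Recreate the Alertmanager container with a custom config mounted.
# Argument (optional): absolute path to alertmanager.yml
# (Docker bind mounts need an absolute host path).
recreate_alertmanager() {
  local config="${1:-$PWD/alertmanager.yml}"
  if [ ! -f "$config" ]; then
    echo "Config file not found: $config" >&2
    return 2
  fi
  # Remove the old container if present; ignore the error when none exists.
  docker rm -f alertmanager >/dev/null 2>&1 || true
  # Mount the config read-only at the path the image reads by default.
  docker run -d --name=alertmanager -p 9093:9093 \
    -v "$config:/etc/alertmanager/alertmanager.yml:ro" \
    prom/alertmanager
}

# Usage:
#   recreate_alertmanager "$PWD/alertmanager.yml"
```

With the mount in place, later edits to the file take effect after a plain docker restart alertmanager.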

Step 4: Verify Alerts in Prometheus

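Access Prometheus UI: Open http://localhost:9090.
Navigate to Alerts: Go to the Alerts section to see inactive, pending, and firing alerts.
Simulate High CPU Usage:
- Use a tool like stress to create load on your CPU (for example, stress --cpu 4 --timeout 180).
- Alternatively, temporarily lower the threshold in the rule's expr (for example, > 1) so that the condition holds under normal load.
Monitor Alert Status: Once CPU usage exceeds the threshold, the alert moves from Inactive to Pending. If the condition holds for the full for duration (1m in this rule), it moves to Firing and is sent to Alertmanager.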
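
The alert state shown in the UI is also available over the Prometheus HTTP API at /api/v1/alerts, which is convenient for scripted checks. The sketch below counts firing alerts in that response. The function names are illustrative, and the grep-based count assumes the compact JSON that Prometheus emits; a JSON parser such as jq would be more robust.

```shell
# Count alerts in the "firing" state.
# Reads the JSON body of /api/v1/alerts on stdin, prints the count.
count_firing() {
  # grep exits non-zero when nothing matches; "|| true" keeps the
  # pipeline usable under "set -e" / "pipefail" so that 0 is printed.
  { grep -o '"state":"firing"' || true; } | wc -l | tr -d ' '
}

# Fetch the current alerts from Prometheus and print how many are firing.
# Argument (optional): base URL of Prometheus.
firing_alert_count() {
  local url="${1:-http://localhost:9090}"
  curl -fsS "${url}/api/v1/alerts" | count_firing
}

# Usage:
#   firing_alert_count http://localhost:9090
```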

Step 5: Display Alerts in Grafana

Grafana can visualize alerts from Prometheus, providing a unified view for monitoring.

Open Grafana: Access Grafana at http://localhost:3000.
Configure Data Source:
- Go to Configuration (gear icon) > Data Sources.
- Select your Prometheus data source.
- Ensure the Alerting section is enabled.
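Create a Panel for Alerts:
- Create a new dashboard by clicking the + icon and selecting Dashboard.
- Click on Add Panel and use the following PromQL to display alerts. ALERTS is a synthetic time series that Prometheus generates for every pending or firing alert, labeled with alertname and alertstate:
```
ALERTS
```
- Choose a Table or Graph visualization type (Graph is named Time series in recent Grafana versions).
- Click Apply to add the panel, then save the dashboard.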
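
If the panel stays empty, it can help to run the same query directly against Prometheus and see what Grafana would receive. The following sketch sends a label-filtered variant, ALERTS{alertstate="firing"}, to the /api/v1/query endpoint; the same expression can be pasted into the panel to show firing alerts only. The function name query_firing_alerts is illustrative, and the default URL assumes Prometheus on port 9090.

```shell
# Run an instant query for firing alerts against the Prometheus HTTP API.
# Argument (optional): base URL of Prometheus.
query_firing_alerts() {
  local url="${1:-http://localhost:9090}"
  # -G sends the URL-encoded data as query-string parameters (a GET request).
  curl -fsS -G \
    --data-urlencode 'query=ALERTS{alertstate="firing"}' \
    "${url}/api/v1/query"
}

# Usage:
#   query_firing_alerts http://localhost:9090
```

An empty result list here means no alert is currently firing, which points to the rule or the load test rather than to Grafana.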

Conclusion

By following this guide, you've successfully set up Prometheus to trigger alerts based on specific metrics such as CPU usage. You configured Alertmanager to manage those alerts and displayed them in Grafana for comprehensive monitoring. This setup enhances your ability to monitor the health of your Node.js application and respond promptly to issues.