Comprehensive Guide to Monitoring and Alerting with Prometheus, Grafana, and Alertmanager

Introduction

This guide explains how to add alerting to a Prometheus and Grafana setup that monitors a Node.js application. You'll install Alertmanager, configure Prometheus alerting rules based on specific metrics, route the resulting alerts through Alertmanager, and display them in Grafana.

Prerequisites

Before starting, ensure that you have the following:

  • Prometheus: Installed and running.

  • Grafana: Installed and running.

  • Alertmanager: Handles alert notifications (installed in Step 1).

  • Node.js application: Exposing metrics at the /metrics endpoint.

  • Basic understanding of Prometheus metrics, alerts, and Docker (optional but useful).
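
Before continuing, it can help to confirm that the application's /metrics endpoint really returns Prometheus text-format output. The following is a minimal sketch using only the Python standard library; the address http://localhost:8080/metrics is a hypothetical placeholder (the guide does not state the application's port), and the helper names are illustrative.

```python
import urllib.request


def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like: name{label="v"} 1.5   or   name 1.5
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def fetch_metric_names(url="http://localhost:8080/metrics"):  # hypothetical address
    """Fetch a /metrics endpoint and list the metric names it exposes."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return metric_names(response.read().decode("utf-8"))
```

Calling fetch_metric_names() with the application's real address should return a non-empty set if the endpoint is serving metrics.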

Table of Contents

  1. Step 1: Install Alertmanager

    • Option 1: Install using Docker

    • Option 2: Manual installation

  2. Step 2: Configure Alerting Rules in Prometheus

  3. Step 3: Set Up Alertmanager

  4. Step 4: Verify Alerts in Prometheus

  5. Step 5: Display Alerts in Grafana

  6. Conclusion

Step 1: Install Alertmanager

Option 1: Install Using Docker

To quickly set up Alertmanager using Docker, run the following command in your terminal:

docker run -d --name=alertmanager -p 9093:9093 prom/alertmanager

Option 2: Manual Installation

  1. Download Alertmanager from the official Prometheus downloads page.

  2. Extract the downloaded tarball:

     tar -xvzf alertmanager-*.tar.gz
    
  3. From the extracted directory, start Alertmanager with the following command:

     ./alertmanager --config.file=alertmanager.yml
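
Whichever installation option you use, you can check that Alertmanager is up by requesting its /-/healthy endpoint. The sketch below assumes the default address http://localhost:9093 used in this guide; the function names are illustrative.

```python
import urllib.error
import urllib.request


def health_url(base_url):
    """Build the URL of Alertmanager's health endpoint."""
    return base_url.rstrip("/") + "/-/healthy"


def is_healthy(base_url="http://localhost:9093"):
    """Return True if Alertmanager answers its health check with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False
```

If is_healthy() returns False, check that the process or container is running and that port 9093 is reachable.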
    

Step 2: Configure Alerting Rules in Prometheus

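Set up Prometheus to trigger alerts based on specific conditions (e.g., CPU usage exceeding 80%). The example rule in this step uses node_cpu_seconds_total, a host-level metric exported by Node Exporter rather than by the Node.js application itself, so Prometheus must be scraping a Node Exporter instance for the rule to return data.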
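
To make the 80% condition concrete, the following sketch mirrors the arithmetic of the example rule's expression in this step: average the per-core idle rate, convert it to a percentage, and subtract from 100. The idle-rate values are hypothetical and chosen only for illustration.

```python
def cpu_usage_percent(idle_rates):
    """Mirror of: 100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100)."""
    average_idle = sum(idle_rates) / len(idle_rates)
    return 100 - average_idle * 100


def exceeds_threshold(idle_rates, threshold=80):
    """Return True when the derived CPU usage is strictly above the threshold."""
    return cpu_usage_percent(idle_rates) > threshold


# Hypothetical per-core idle rates (fraction of each second spent idle).
busy_host = [0.15, 0.10, 0.20, 0.15]   # averages to 0.15 idle, about 85% used
quiet_host = [0.90, 0.95, 0.85, 0.90]  # averages to 0.90 idle, about 10% used
```

With these sample values, busy_host is above the 80% threshold and quiet_host is not. In Prometheus the condition must additionally hold for the rule's "for" duration before the alert fires.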

  1. Create an Alerting Rules File: Create a file named alert.rules.yml in your Prometheus directory with the following content:

     groups:
       - name: example
         rules:
           - alert: HighCPUUsage
             expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100) > 80
             for: 1m
             labels:
               severity: critical
             annotations:
               summary: "High CPU usage detected on {{ $labels.instance }}"
               description: "CPU usage has been over 80% for the past 1 minute on {{ $labels.instance }}."
    
  2. Update the Prometheus Configuration: Add the following to your prometheus.yml:

     rule_files:
       - "alert.rules.yml"
    
     alerting:
       alertmanagers:
         - static_configs:
             - targets:
               - "localhost:9093"  # Alertmanager address as seen from Prometheus; localhost only works if both share a host or network namespace
    
  3. Restart Prometheus:

    • If running manually, stop the running Prometheus process and start it again:

        ./prometheus --config.file=prometheus.yml
      
    • If using Docker (assuming the container is named prometheus):

        docker restart prometheus
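
As an alternative to a full restart, Prometheus can reload its configuration and rule files through an HTTP POST to /-/reload. This endpoint is only available when Prometheus was started with the --web.enable-lifecycle flag; without it the request is rejected. The sketch below assumes the default address http://localhost:9090, and the function names are illustrative.

```python
import urllib.request


def build_reload_request(base_url="http://localhost:9090"):
    """Build (but do not send) the POST request for the Prometheus reload endpoint."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/-/reload", data=b"", method="POST"
    )


def reload_prometheus(base_url="http://localhost:9090"):
    """Ask Prometheus to reload its configuration; raises on HTTP errors."""
    with urllib.request.urlopen(build_reload_request(base_url), timeout=10) as response:
        return response.status == 200
```

After a successful reload, the new rule should appear on the Alerts page without any gap in scraping.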
      

Step 3: Set Up Alertmanager

Configure Alertmanager to handle alerts triggered by Prometheus.

  1. Create/Modify the Alertmanager Configuration: Create or edit alertmanager.yml with the following example configuration to send notifications via email (adjust to your needs):

     route:
       receiver: 'email'
    
     receivers:
       - name: 'email'
         email_configs:
           - to: 'your-email@example.com'
             from: 'alertmanager@example.com'
             smarthost: 'smtp.example.com:587'
             auth_username: 'username'
             auth_password: 'password'
    
  2. Start/Restart Alertmanager:

    • If running manually:

        ./alertmanager --config.file=alertmanager.yml
      
    • If using Docker:

        docker restart alertmanager
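
To test the notification path without waiting for a real alert, you can post a hand-made alert to Alertmanager's v2 API (POST /api/v2/alerts). The following is a sketch under assumptions: the alert name, labels, and annotation text are made up for the test, and the address is the default http://localhost:9093 used in this guide.

```python
import json
import urllib.request


def build_test_alert(alertname="TestAlert", severity="critical", instance="test-instance"):
    """Return a one-element alert list in the shape the v2 alerts endpoint accepts."""
    return [
        {
            # Labels identify the alert; these values are hypothetical.
            "labels": {"alertname": alertname, "severity": severity, "instance": instance},
            "annotations": {"summary": "Manual test alert to check notification delivery"},
        }
    ]


def send_test_alert(base_url="http://localhost:9093"):
    """POST the test alert to Alertmanager; raises on HTTP errors."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/v2/alerts",
        data=json.dumps(build_test_alert()).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status == 200
```

If the email receiver is configured correctly, the test alert should show up in the Alertmanager UI and produce a notification once the route's grouping delay has passed.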
      

Step 4: Verify Alerts in Prometheus

  1. Access Prometheus UI: Open http://localhost:9090.

  2. Navigate to Alerts: Go to the Alerts section to see active and inactive alerts.

  3. Simulate High CPU Usage:

    • Use a tool like stress to create load on your CPU.

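    • Alternatively, temporarily lower the threshold in the rule's PromQL expression so that the alert fires under normal load, then restore it afterward.

  4. Monitor Alert Status: Once CPU usage exceeds the defined threshold, the alert enters the Pending state. If the condition holds for the full "for" duration (1m in this example), it transitions to Firing.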
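
The same information shown on the Alerts page is available from the Prometheus HTTP API at /api/v1/alerts, which is convenient for scripted checks. The sketch below groups alerts by state; the sample payload is illustrative rather than captured from a real server, and the function names are hypothetical.

```python
import json
import urllib.request
from collections import defaultdict


def group_alerts_by_state(payload):
    """Map each alert state to a sorted list of (alertname, instance) pairs."""
    grouped = defaultdict(list)
    for alert in payload.get("data", {}).get("alerts", []):
        labels = alert.get("labels", {})
        grouped[alert.get("state", "unknown")].append(
            (labels.get("alertname", ""), labels.get("instance", ""))
        )
    return {state: sorted(pairs) for state, pairs in grouped.items()}


def fetch_alerts(base_url="http://localhost:9090"):
    """Fetch the current alerts from a running Prometheus server."""
    with urllib.request.urlopen(base_url.rstrip("/") + "/api/v1/alerts", timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Illustrative response body, not captured from a real server.
sample_payload = {
    "status": "success",
    "data": {
        "alerts": [
            {"labels": {"alertname": "HighCPUUsage", "instance": "host-a"}, "state": "firing"},
            {"labels": {"alertname": "HighCPUUsage", "instance": "host-b"}, "state": "pending"},
        ]
    },
}
```

Against a live server, group_alerts_by_state(fetch_alerts()) gives a quick view of which instances are pending and which are firing.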

Step 5: Display Alerts in Grafana

Grafana can visualize alerts from Prometheus, providing a unified view for monitoring.

  1. Open Grafana: Access Grafana at http://localhost:3000.

  2. Configure Data Source:

    • Go to Configuration (gear icon) > Data Sources.

    • Select your Prometheus data source.

    • Ensure the Alerting section is enabled.

  3. Create a Panel for Alerts:

    • Click the + icon and select Dashboard to create a new dashboard, or open an existing one.

    • Click on Add Panel and use the following PromQL to display alerts:

        ALERTS
      
    • Choose a Table or Graph visualization type (Graph is named Time series in newer Grafana versions).

    • Click Apply to save the panel.
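
The ALERTS series carries an alertstate label (pending or firing) alongside each alert's own labels, so the panel query can be narrowed to, for example, ALERTS{alertstate="firing"}. To preview what such a query returns before building the panel, you can run it against the Prometheus query API. The sketch below only builds the request URL; the default address and the function name are assumptions.

```python
from urllib.parse import urlencode


def alerts_query_url(base_url="http://localhost:9090", alertstate="firing"):
    """Build an instant-query URL for the ALERTS series filtered by alert state."""
    promql = 'ALERTS{alertstate="%s"}' % alertstate
    # urlencode percent-escapes the braces, quotes, and equals sign in the PromQL.
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})
```

Opening the resulting URL in a browser, or fetching it with any HTTP client, returns the matching series as JSON.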

Conclusion

By following this guide, you've successfully set up Prometheus to trigger alerts based on specific metrics such as CPU usage. You configured Alertmanager to manage those alerts and displayed them in Grafana for comprehensive monitoring. This setup enhances your ability to monitor the health of your Node.js application and respond promptly to issues.