Day 2: Metrics, Monitoring and Prometheus

Metrics vs Monitoring

Metrics - Historical data about events, used to understand the health of the system.

Example: A patient in the ICU is checked periodically, and readings such as blood pressure (BP), heartbeat (HB) and blood glucose level are recorded.

10:00am - HB - 72

10:15am - HB - 78
Together, these time-stamped data points describe the state of the system over time. That is what we call metrics.
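To make the idea of time-stamped data points concrete, here is a minimal sketch of the heartbeat readings as a small time series. The `Sample` class, the helper functions and the calendar date are assumptions made for illustration; only the times and HB values come from the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Sample:
    """One time-stamped data point of a metric."""
    timestamp: datetime
    value: float


# The HB readings from the example. The date itself is an assumption.
heartbeat = [
    Sample(datetime(2024, 1, 1, 10, 0), 72),
    Sample(datetime(2024, 1, 1, 10, 15), 78),
]


def latest(series):
    """Return the most recent sample in a series."""
    return max(series, key=lambda s: s.timestamp)


def change(series):
    """Return the last value minus the first value, ordered by time."""
    ordered = sorted(series, key=lambda s: s.timestamp)
    return ordered[-1].value - ordered[0].value
```

Here `latest(heartbeat)` returns the 10:15am reading, and `change(heartbeat)` shows how far the value moved between the first and last reading.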

Monitoring: Metrics + Visualization + Alerting

A monitoring platform such as Prometheus scrapes metrics from the system. The data is then presented visually on dashboards, and alerts are sent when the metrics meet certain conditions.

Example:

  • CPU utilization of k8s nodes is the real-time data.

  • Abnormalities - for example, if CPU utilization > 80%, send an alert to Alertmanager.

  • Dashboard - in pie chart or graph format.
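The alerting condition in the example can be sketched as a simple threshold check. This is only an illustration of the logic in plain Python; in Prometheus the condition would be written as an alerting rule. The function name, node names and readings are hypothetical.

```python
CPU_ALERT_THRESHOLD = 80.0  # percent, taken from the example above


def nodes_to_alert(cpu_by_node, threshold=CPU_ALERT_THRESHOLD):
    """Return the names of nodes whose CPU utilization exceeds the threshold."""
    return sorted(node for node, cpu in cpu_by_node.items() if cpu > threshold)


# Hypothetical readings; node names and values are made up for illustration.
readings = {"node-a": 42.5, "node-b": 91.0, "node-c": 80.0}
print(nodes_to_alert(readings))  # prints ['node-b']
```

Note that `node-c` sits exactly at 80% and does not trigger, because the condition is strictly "greater than".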

Prometheus: Monitoring Platform for Kubernetes

Prometheus server:

At the core is the Prometheus server, which is responsible for scraping metrics from different targets and storing them in a time series database (TSDB).

Components:

1. Retrieval - handles the scraping of metrics from different targets.

2. TSDB - stores the scraped data as time series, each identified by a metric name and key-value labels.

3. HTTP server - provides the UI and the API through which PromQL queries are sent to the Prometheus server.

Storage - the scraped data is physically stored on local disk (HDD/SSD).
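As a rough illustration of how a PromQL query reaches the HTTP server, the sketch below builds the URL for an instant query against the Prometheus HTTP API (`/api/v1/query`). The helper name and the base URL `http://localhost:9090` are assumptions; the address depends on how you expose Prometheus (for example through port forwarding).

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Assumption: Prometheus is reachable on localhost:9090.
url = instant_query_url("http://localhost:9090", "up")
print(url)  # prints http://localhost:9090/api/v1/query?query=up
```

Sending a GET request to that URL returns the query result as JSON. `up` is the built-in metric that shows whether each scrape target is reachable.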

How does Prometheus pull metrics from various sources (targets)?

  1. Node-level metrics - Node Exporter runs as a DaemonSet (DS) and gathers all node-level metrics.

  2. K8s resource metrics - kube-state-metrics queries the Kubernetes API server to gather metrics about Kubernetes resources such as pods, deployments and services.

  3. Application-level metrics - Developers expose application-specific metrics at the /metrics endpoint. Prometheus scrapes these endpoints.
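For the application-level case, here is a minimal sketch of what exposing a /metrics endpoint could look like, using only the Python standard library and the Prometheus text exposition format. The metric names, values and port are hypothetical. A real application would more likely use an official Prometheus client library than hand-render the text.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical application counters; a real app would update these as it runs.
METRICS = {
    "app_requests_total": ("counter", "Total requests handled.", 1027),
    "app_exceptions_total": ("counter", "Total exceptions raised.", 3),
}


def render_metrics(metrics):
    """Render metrics in the Prometheus text exposition format."""
    lines = []
    for name, (kind, help_text, value) in metrics.items():
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {kind}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answer GET /metrics with the rendered metrics, anything else with 404."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve /metrics until interrupted (blocks the calling thread)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint. Prometheus would then be pointed at this host and port as a scrape target.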

Alertmanager is used to send alerts based on rules configured in Prometheus.
Grafana is used for data visualization, with Prometheus configured as a data source.

All Prometheus needs is targets to monitor. What are those targets?

  • Targets - a server, an application, an application service or a database service.

  • Units of those targets:

      • System - memory and disk usage data

      • Application - number of exceptions, number of requests and request durations

Installation and configuration:

  1. Step 1: Create an EKS cluster with a managed node group.

  2. Step 2: Associate an OIDC provider with the cluster.

  3. Step 3: Update the kubeconfig file to access the cluster API.

  4. Step 4: Install kube-prometheus-stack using its Helm chart, deploying the chart into a new namespace "monitoring" (alongside the Alertmanager config).

  5. Step 5: Verify the installation and, after port forwarding, access the Prometheus UI, Grafana UI and Alertmanager UI.

  6. Step 6: Clean up - uninstall the Helm chart, then delete the namespace and the cluster.
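The steps above can be sketched as a list of CLI commands. The script below only builds and prints them (a dry run), and nothing is executed. The cluster name, region and Helm release name are assumptions chosen for illustration, and only the namespace "monitoring" comes from Step 4. The custom Alertmanager config and the port-forward commands are left out because they depend on your values file and on the service names the chart creates.

```python
import shlex

# Assumptions for illustration only: cluster name, region and Helm release name.
CLUSTER = "observability"
REGION = "us-east-1"
RELEASE = "monitoring"
NAMESPACE = "monitoring"  # the namespace named in Step 4


def install_commands(cluster=CLUSTER, region=REGION,
                     release=RELEASE, namespace=NAMESPACE):
    """Return the commands for Steps 1 to 5 as argument lists."""
    return [
        # Step 1: EKS cluster with a managed node group
        ["eksctl", "create", "cluster", "--name", cluster,
         "--region", region, "--managed"],
        # Step 2: associate the OIDC provider
        ["eksctl", "utils", "associate-iam-oidc-provider",
         "--cluster", cluster, "--region", region, "--approve"],
        # Step 3: update kubeconfig
        ["aws", "eks", "update-kubeconfig", "--name", cluster,
         "--region", region],
        # Step 4: install kube-prometheus-stack into the new namespace
        ["helm", "repo", "add", "prometheus-community",
         "https://prometheus-community.github.io/helm-charts"],
        ["helm", "repo", "update"],
        ["helm", "install", release,
         "prometheus-community/kube-prometheus-stack",
         "--namespace", namespace, "--create-namespace"],
        # Step 5: verify; the listed services are what you would port-forward
        ["kubectl", "get", "pods", "--namespace", namespace],
        ["kubectl", "get", "svc", "--namespace", namespace],
    ]


def cleanup_commands(cluster=CLUSTER, region=REGION,
                     release=RELEASE, namespace=NAMESPACE):
    """Return the commands for Step 6 as argument lists."""
    return [
        ["helm", "uninstall", release, "--namespace", namespace],
        ["kubectl", "delete", "namespace", namespace],
        ["eksctl", "delete", "cluster", "--name", cluster, "--region", region],
    ]


for cmd in install_commands() + cleanup_commands():
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them and substitute your own names before running anything against a real AWS account.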
