Welcome, fellow engineers and IT enthusiasts! In today’s dynamic world of cloud-native applications, microservices have become the backbone of scalable and resilient systems. While offering immense benefits in terms of agility and resilience, they also introduce a significant challenge: understanding what’s happening inside your distributed ecosystem. How do you keep an eye on hundreds, or even thousands, of tiny services interacting in real-time? How do you know when something’s about to break, or when performance is degrading?
That’s where robust observability comes in. We’re talking about tools that don’t just tell you *if* something is down, but *why* it’s down, and how it’s performing. Today, we’re diving deep into the industry-standard duo for monitoring: Prometheus for metric collection and Grafana for powerful visualization, all running seamlessly on Kubernetes.
This isn’t just theory; we’re going to walk through a practical, hands-on example, deploying sample Go microservices, setting up Prometheus to scrape their metrics, and then building interactive dashboards in Grafana to gain immediate insights. By the end of this post (and the accompanying video and code), you’ll have a complete, functional monitoring stack and a clear understanding of how to implement it in your own Kubernetes environments. Get ready to bring clarity to your microservice chaos!
Understanding the Observability Duo: Prometheus and Grafana
At the heart of our monitoring solution lies the powerful combination of Prometheus and Grafana.
Prometheus: The Metric Collector
Prometheus, often referred to as a monitoring and alerting toolkit, acts as our time-series database. Unlike traditional monitoring systems that wait for data to be pushed to them, Prometheus operates on a **pull model**. It periodically scrapes metrics from configured targets, which in our case will be our microservices. These metrics are numeric data points recorded over time, like request counts, error rates, or CPU utilization. It’s purpose-built for the dynamic nature of cloud environments, making it ideal for microservices.
Grafana: The Visualization Powerhouse
Then, we have Grafana, the visualization powerhouse. Grafana doesn’t collect data itself; instead, it connects to various data sources, including Prometheus, and transforms raw metrics into stunning, interactive dashboards. Think of it as the ultimate storyteller for your data, turning complex numbers into intuitive graphs and charts that help you quickly understand the health and performance of your applications. Together, they form a symbiotic relationship: Prometheus collects the rich, detailed telemetry, and Grafana presents it in a way that allows you to make informed decisions swiftly. This separation of concerns—collection versus visualization—is one of the key reasons this stack is so popular and effective.
The Sample Application: Designed for Observability
To demonstrate this setup, we’ve prepared a simple yet effective sample application. It consists of two basic Go microservices, aptly named microservice-a and microservice-b. These aren’t just any services; they’re designed with observability in mind. Specifically, they expose Prometheus metrics on a dedicated /metrics endpoint.
Inside the sample-app/main.go file, you’ll find how standard Prometheus client libraries are used to define custom metrics, such as http_requests_total (a counter tracking HTTP requests) and last_request_timestamp_seconds (a gauge recording the Unix time of the last request). This instrumentation is crucial because it’s how your applications communicate their internal state to the monitoring system.
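If you don’t want to open the repo just yet, the standard-library-only sketch below shows the idea: it hand-writes the Prometheus text exposition format that the client library normally generates for you. The metric names match the post, but treat renderMetrics and the HELP strings as hypothetical, not the repo’s actual code:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// requestCount and lastRequestUnix back the two metrics the sample app exposes.
var (
	requestCount    atomic.Int64
	lastRequestUnix atomic.Int64
)

// renderMetrics produces the Prometheus text exposition format that a
// /metrics endpoint returns; the client library generates this for you.
func renderMetrics() string {
	return fmt.Sprintf(
		"# HELP http_requests_total Total HTTP requests handled.\n"+
			"# TYPE http_requests_total counter\n"+
			"http_requests_total %d\n"+
			"# HELP last_request_timestamp_seconds Unix time of the last request.\n"+
			"# TYPE last_request_timestamp_seconds gauge\n"+
			"last_request_timestamp_seconds %d\n",
		requestCount.Load(), lastRequestUnix.Load())
}

func main() {
	// Simulate two handled requests, then print what /metrics would return.
	for i := 0; i < 2; i++ {
		requestCount.Add(1)
		lastRequestUnix.Store(time.Now().Unix())
	}
	fmt.Print(renderMetrics())
	// In the real service this string is served by an HTTP handler, e.g.:
	//   http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
	//       fmt.Fprint(w, renderMetrics())
	//   })
}
```

Notice the counter/gauge distinction: the counter only ever goes up (Prometheus derives rates from it), while the gauge records a point-in-time value that can move in either direction.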
Furthermore, when we deploy these services to Kubernetes using the manifests in the k8s/ directory, you’ll notice specific annotations on their Kubernetes Service objects:
apiVersion: v1
kind: Service
metadata:
  name: microservice-a
  labels:
    app: microservice-a
  annotations:
    # Prometheus annotations for discovery via Service DNS
    prometheus.io/scrape: "true"   # Indicate that Prometheus should scrape this service
    prometheus.io/port: "8080"     # The port on which metrics are exposed
    prometheus.io/path: "/metrics" # The path where metrics are exposed (default is /metrics)
spec:
  selector:
    app: microservice-a
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
      name: http
These annotations are not just metadata; they are key signals that Prometheus will use for automatic service discovery, telling it exactly which services to monitor and on which ports. It’s a clever way to integrate monitoring directly into your Kubernetes deployment strategy without needing manual configuration for every new service.
Prerequisites: Setting Up Your Environment
Before we can deploy our monitoring stack and microservices, we need to ensure our environment is ready. First and foremost, you’ll need a running Kubernetes cluster. Whether it’s a local setup like Minikube or Kind, or a managed service like GKE, EKS, or AKS, the principles remain the same. Alongside your cluster, you’ll need:
- kubectl: The Kubernetes command-line tool, for interacting with your cluster.
- helm: The Kubernetes package manager, which we’ll use extensively for deploying Prometheus and Grafana.
- docker: The Docker engine, if you plan to build the sample microservice images locally.
Step-by-Step Deployment Guide
Let’s get hands-on and deploy our monitoring stack.
1. Build and Load Sample Microservice Images
The first practical step involves building these Docker images and making them accessible to your Kubernetes cluster. Our project includes a Dockerfile within the sample-app directory.
# Navigate to the sample-app directory
cd sample-app
# Build the Docker image
docker build -t local-registry/sample-microservice:latest .
If you’re using a local cluster like Kind or Minikube, you’ll then need to load this image directly into the cluster’s image cache, avoiding the need for a remote registry:
# For Kind cluster
kind load docker-image local-registry/sample-microservice:latest
# For Minikube cluster
minikube image load local-registry/sample-microservice:latest
For remote clusters (GKE, EKS, AKS, etc.), the standard practice is to push your image to a container registry (like Docker Hub or Google Container Registry) and then update your Kubernetes deployment manifests to pull from that registry.
2. Deploy Sample Microservices
With our custom microservice images built and accessible, the next step is to deploy them onto our Kubernetes cluster. We’ve provided Kubernetes deployment and service manifests for microservice-a and microservice-b within the k8s/ directory. Remember the Prometheus annotations we discussed earlier – they are crucial here.
# Navigate back to the root of the project
cd ..
# Apply the Kubernetes manifests
kubectl apply -f k8s/
After applying the manifests, it’s always good practice to verify that your pods are in a Running state:
kubectl get pods -l app=microservice-a
kubectl get pods -l app=microservice-b
This ensures our applications are up and ready to expose their metrics for collection, laying the groundwork for Prometheus to begin its work.
3. Deploy Prometheus using Helm
Now that our microservices are running and exposing metrics, it’s time to bring in Prometheus. We’ll deploy Prometheus using its official Helm chart, which simplifies the installation and configuration process significantly.
# Add the Prometheus community Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Deploy Prometheus using the custom values file
helm install prometheus prometheus-community/prometheus -f prometheus/prometheus-values.yaml --namespace monitoring --create-namespace
Wait for Prometheus pods to be ready:
kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus
Demystifying prometheus/prometheus-values.yaml
This file is the core of how Prometheus intelligently discovers and scrapes our microservices. Within this file, we’ve defined an extraScrapeConfigs section, specifically a job named kubernetes-services. This job leverages Kubernetes service discovery, meaning Prometheus will actively look for Service objects within the cluster.
The relabel_configs are where the filtering and transformation logic happens:
- The first rule ensures that Prometheus *only* scrapes services that have the prometheus.io/scrape: "true" annotation.
- The second rule further refines this by targeting services whose app label matches microservice-a or microservice-b, focusing only on our sample applications.
- Subsequent rules dynamically set __metrics_path__ and __address__ based on the prometheus.io/path and prometheus.io/port annotations respectively. This means Prometheus automatically knows which path to hit and which port to use for metrics.
- Finally, we add a service_name label based on the Kubernetes service name. This custom label is incredibly powerful for querying and filtering metrics later in Grafana, allowing us to easily distinguish between microservice-a’s and microservice-b’s metrics.
This intelligent, annotation-driven discovery is why Prometheus scales so well in dynamic Kubernetes environments.
# prometheus/prometheus-values.yaml (excerpt)
server:
  extraScrapeConfigs: |
    - job_name: 'kubernetes-services'
      kubernetes_sd_configs:
        - role: service
      relabel_configs:
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
          regex: 'true'
          action: keep
        - source_labels: [__meta_kubernetes_service_label_app]
          regex: '(microservice-a|microservice-b)'
          action: keep
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
          regex: '(.+)'
          target_label: __metrics_path__
          replacement: $1
          action: replace
        - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          regex: '([^:]+)(?::\d+)?;(\d+)'
          target_label: __address__
          replacement: $1:$2
          action: replace
        - source_labels: [__meta_kubernetes_service_name]
          target_label: service_name
          action: replace
4. Deploy Grafana using Helm
With Prometheus now diligently collecting metrics, it’s time to visualize that data with Grafana. Similar to Prometheus, we’ll deploy Grafana using its official Helm chart.
# Add the Grafana Helm repository
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Deploy Grafana using the custom values file
helm install grafana grafana/grafana -f grafana/grafana-values.yaml --namespace monitoring
Wait for Grafana pod to be ready:
kubectl get pods -n monitoring -l app.kubernetes.io/name=grafana
Understanding grafana/grafana-values.yaml
This file is packed with essential configurations:
- It sets the initial admin username and password (remember to change these in production!).
- It enables persistence for Grafana data, ensuring your dashboards and settings aren’t lost if the pod restarts.
- Most importantly, it provisions our Prometheus data source and a pre-built dashboard. The datasources.yaml section tells Grafana how to connect to our Prometheus server using its internal Kubernetes service name, http://prometheus-server.monitoring.svc.cluster.local. This internal URL ensures seamless communication within the cluster.
- Furthermore, the dashboards section enables automatic dashboard provisioning, pointing Grafana to a ConfigMap that will contain our microservices-dashboard.json. This means our sample dashboard will be automatically loaded upon Grafana’s deployment.
# grafana/grafana-values.yaml (excerpt)
admin:
  user: admin
  password: prom-grafana-password # Change this!
persistence:
  enabled: true
  type: pvc
  size: 5Gi
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        uid: prometheus-ds
        url: http://prometheus-server.monitoring.svc.cluster.local
        access: proxy
        isDefault: true
dashboards:
  default:
    enabled: true
    providers:
      - name: 'default'
        orgId: 1
        folder: ''
        type: file
        disableDelete: false
        editable: true
        options:
          path: /var/lib/grafana/dashboards/default
5. Access Prometheus and Grafana UIs
With both Prometheus and Grafana deployed, let’s explore their user interfaces.
Access Prometheus UI
To access the Prometheus UI, we’ll use kubectl port-forward:
kubectl port-forward svc/prometheus-server 9090:80 -n monitoring
Open your browser and navigate to http://localhost:9090. From here, you can go to the “Status” menu and click on “Targets” to verify that Prometheus is actively scraping metrics from microservice-a and microservice-b. You should see their status as “UP”.
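The same check works from the Graph page: query the built-in up metric, which Prometheus sets to 1 for every target whose last scrape succeeded (the job label below matches the scrape job defined in our values file):

```promql
# 1 = last scrape succeeded, 0 = scrape failed
up{job="kubernetes-services"}
```

This is often the first query to reach for when a service's metrics are mysteriously missing from a dashboard.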
Access Grafana UI
For Grafana, use another port-forward command:
kubectl port-forward svc/grafana 3000:80 -n monitoring
Open http://localhost:3000 in your browser. You’ll be prompted to log in. Use the credentials defined in grafana-values.yaml:
- Username: admin
- Password: prom-grafana-password
Once logged in, you should immediately see the “Sample Microservices Overview” dashboard available under the Dashboards section. Click on it to see the pre-configured graphs and metrics. You can also confirm the Prometheus data source is correctly configured by navigating to “Connections” -> “Data sources”. This step confirms our entire monitoring pipeline is operational, from metric collection to visualization.
6. Generate Some Traffic (Optional, but Recommended)
To truly see our Grafana dashboard come alive, let’s generate some artificial traffic for our microservices.
We’ll start by port-forwarding to one of our microservices, for instance, microservice-a:
kubectl port-forward svc/microservice-a 8080:80
Then, in a new terminal, you can make a series of curl requests to http://localhost:8080/ or http://localhost:8080/data:
curl http://localhost:8080/
curl http://localhost:8080/data
curl http://localhost:8080/
# Repeat a few times
As you make these requests, switch back to your Grafana dashboard. You should observe the “HTTP Requests Per Second” graph updating in near real-time, showing the rate of requests for both microservice-a and microservice-b. You’ll also see the “Total HTTP Requests” graphs for each service steadily climbing, and the “Last Request Timestamp” panels updating. This interactive demonstration highlights the power of this monitoring stack: you can trigger events in your application and immediately see their impact reflected visually in Grafana, providing invaluable feedback for development, debugging, and operational insights. It truly brings your system’s behavior to light.
Understanding the Grafana Dashboard: A PromQL Primer
Let’s briefly peek behind the curtain of our Grafana dashboard and understand how it pulls data from Prometheus using PromQL, the Prometheus Query Language. Take, for example, the “HTTP Requests Per Second by Service” panel. Its query uses:
sum by (service_name) (rate(http_requests_total{job="kubernetes-services"}[1m]))
Let’s break that down:
- http_requests_total: This is our counter metric, tracking cumulative HTTP requests.
- [1m]: Specifies a 1-minute window, meaning the rate is calculated over the last minute of samples.
- rate(): This function is crucial for understanding how frequently events are occurring, rather than just their cumulative count. It calculates the per-second average rate of increase of the time series.
- job="kubernetes-services": Filters the metrics to only those collected by our custom Kubernetes service discovery job in Prometheus.
- sum by (service_name): Aggregates the results, grouping them by the service_name label we added during Prometheus’s relabeling phase. This allows us to see separate lines for microservice-a and microservice-b on the same graph.
Similarly, other panels utilize different PromQL functions and label filters to display total requests, or simply the last recorded value of a gauge like last_request_timestamp_seconds. Understanding PromQL, even at a basic level, empowers you to create custom queries and tailor your dashboards to extract precise insights about your microservices’ behavior.
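As a starting point for your own panels, here are two more queries built from the same metrics and labels (assuming, as in the sample app, that the last_request_timestamp_seconds gauge holds a Unix timestamp):

```promql
# Requests handled over the last 5 minutes, per service
sum by (service_name) (increase(http_requests_total{job="kubernetes-services"}[5m]))

# Seconds since each service last handled a request
time() - last_request_timestamp_seconds{job="kubernetes-services"}
```

The second query is a handy staleness check: a steadily climbing value means a service has stopped receiving traffic.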
Watch the Full Walkthrough
Prefer a visual guide? Watch the accompanying YouTube video for a complete, step-by-step walkthrough of everything discussed in this blog post:
Explore the Code
All the source code, Kubernetes manifests, and Helm values files used in this guide are available on GitHub. Feel free to clone the repository, experiment, and adapt it for your own projects.
Cleanup
To remove all deployed components from your cluster:
# 1. Delete Prometheus and Grafana Helm releases:
helm uninstall prometheus -n monitoring
helm uninstall grafana -n monitoring
# 2. Delete the sample microservices:
kubectl delete -f k8s/
# 3. Delete the monitoring namespace:
kubectl delete namespace monitoring
Conclusion
And that brings us to the end of our deep dive into monitoring microservices on Kubernetes with Prometheus and Grafana. We’ve gone from the very basics of why observability is critical in a microservice architecture to deploying a full, functional monitoring stack using Helm. We set up our sample Go microservices to expose Prometheus metrics, configured Prometheus to intelligently discover and scrape those metrics using Kubernetes service annotations and custom scrape rules, and finally, deployed Grafana to visualize everything on a dynamic, pre-built dashboard. We even generated some traffic to see the metrics flowing in real-time.
This setup provides you with an incredibly powerful foundation for understanding the health, performance, and behavior of your distributed applications. It empowers you to proactively identify issues, optimize resource utilization, and ensure the reliability of your services. Observability isn’t just a buzzword; it’s an essential discipline for modern software development.
I hope this guide has provided you with a clear roadmap and the practical knowledge to implement these best practices in your own projects. If you found this helpful, please share it with your fellow engineers and consider subscribing to our channel for more in-depth technical tutorials. Your support helps us create more valuable content like this!