In an earlier post we saw how to set up a Prometheus instance in our cluster. Here we add alerting on top of that monitoring, so we are notified when some action is needed on our pods. With Prometheus alerting there is no need for constant manual monitoring.
For in-depth documentation on Prometheus, see the official Prometheus documentation. More on alert management with the CoreOS Prometheus Operator can be found in its documentation.
Alertmanager
Here is an overview of the Prometheus architecture in a diagram.

We already saw multiple Custom Resource Definitions (CRDs) that the Prometheus Operator gives us. The Alertmanager is also a CRD, and the definition of an instance is very simple. The complete code for this setup is found on this branch on github.
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: sybrenbolandit
spec:
  replicas: 1
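As a quick check you can apply this manifest and watch the Operator spin up the Alertmanager pod. A minimal sketch, where the file name and the sybrenbolandit namespace are assumptions for this example:
# Apply the Alertmanager resource; the Operator reconciles it into a StatefulSet
kubectl apply -f alertmanager.yaml --namespace sybrenbolandit

# List the pods; expect something like alertmanager-sybrenbolandit-0
kubectl get pods --namespace sybrenbolandit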
We also need to tell the Prometheus pod which Alertmanager to use:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  replicas: 1
  alerting:
    alertmanagers:
      - name: alertmanager-sybrenbolandit
        namespace: sybrenbolandit
        port: web
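To verify that Prometheus has actually discovered the Alertmanager, you can ask its HTTP API. A sketch, assuming the Operator created its usual prometheus-operated service in the sybrenbolandit namespace:
# Forward the Prometheus web port to localhost
kubectl port-forward svc/prometheus-operated 9090 --namespace sybrenbolandit &

# Ask Prometheus which Alertmanagers it is currently sending alerts to
curl -s http://localhost:9090/api/v1/alertmanagers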
But the most interesting part is the configuration of the Receiver, the place where the alerts are sent. Here we configure alerts to be posted in a Slack channel. We need the api_url of our Slack workspace, which you can obtain by setting up an incoming webhook in Slack. Here is the config file:
apiVersion: v1
kind: Secret
metadata:
  name: "alertmanager-sybrenbolandit"
stringData:
  alertmanager.yaml: |
    global:
      resolve_timeout: 1m
    route:
      group_by: ['env', 'job', 'alertname']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 1m
      receiver: 'slack'
    receivers:
      - name: slack
        slack_configs:
          - api_url: ##Your slack api url##
            icon_url: https://avatars3.githubusercontent.com/u/3380462
            send_resolved: true
            channel: '##Your slack channel name##'
            title: '[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.prometheus }} {{ .GroupLabels.job }}'
            text: "<!channel> \nsummary: {{ .CommonAnnotations.summary }}\ndocumentation: {{ .CommonAnnotations.documentation }}\n"
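The Operator picks this configuration up because the Secret is named alertmanager-<name>, matching our Alertmanager instance. Before applying it you can sanity-check the embedded routing config with amtool; the file names below are assumptions for this sketch:
# Validate the routing/receiver configuration (the contents of the
# alertmanager.yaml key) before wrapping it in the Secret
amtool check-config alertmanager.yaml

# Create or update the Secret in the cluster
kubectl apply -f alertmanager-secret.yaml --namespace sybrenbolandit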
Note that a lot more can be configured here, but that is out of scope for now.
Alert – Application down!
Now we have the architecture in place. We only need to add a PrometheusRule for our application that triggers an alert. We build on the Spring Boot application that we deployed to the cluster in an earlier post. The configuration of this rule is done as another CRD and is found on this branch on github.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  creationTimestamp: null
  name: java-spring-api-rules
  labels:
    prometheus: sybrenbolandit
    role: alert-rules
spec:
  groups:
    - name: java-spring-api
      rules:
        - alert: NoHealthyHosts
          expr: sum(up{job="java-spring-api"}) < 1 or absent(up{job="java-spring-api"})
          for: 30s
          labels:
            severity: critical
          annotations:
            summary: "No healthy hosts - java-spring-api"
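You can also evaluate the expression by hand against the Prometheus HTTP API before the rule ever fires. A sketch, assuming the port-forward from earlier is still running on localhost:9090:
# Evaluate the alert expression directly; an empty result means the hosts are
# healthy, a non-empty result means NoHealthyHosts would fire after the 30s 'for' window
curl -s http://localhost:9090/api/v1/query \
  --data-urlencode 'query=sum(up{job="java-spring-api"}) < 1 or absent(up{job="java-spring-api"})'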
Note that the important part is under expr. This is where we define when the alert fires: here we say that we want to be warned when there are no healthy hosts of our application.
Now we scale our application down to 0 replicas to force an alert, and scale up again some time later (see the commands below):
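A sketch of the scaling commands, assuming the Deployment from the earlier post is called java-spring-api and lives in the sybrenbolandit namespace:
# Scale to zero so the 'up' metric disappears and NoHealthyHosts fires
kubectl scale deployment java-spring-api --replicas=0 --namespace sybrenbolandit

# Once the alert has arrived in Slack, scale back up; with send_resolved: true
# Alertmanager will post a resolved message as well
kubectl scale deployment java-spring-api --replicas=1 --namespace sybrenbolandit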

Hopefully you can now start with your own Prometheus Alerting. Happy alerting!