Alertmanager Deployment Guide

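A quick introduction: Alertmanager usually serves as an external alert handler for Grafana. Once it is configured in Grafana, a triggered alerting rule is handed to Alertmanager, which notifies the receivers through whatever channel its own configuration defines.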

The example in this guide configures Alertmanager, deployed as a Kubernetes service, to send alerts over SMTP.

Prepare the configuration files

  1. ConfigMap

    cat > Altermanager-configMap.yaml << EOF
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: alertmanager-config
      namespace: kube-system
      labels:
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: EnsureExists
    data:
      alertmanager.yml: |
        global:
          resolve_timeout: 5m
          smtp_smarthost: 'smtp.163.com:25'
          smtp_from: 'Sunnyrain233@163.com'
          smtp_auth_username: 'Sunnyrain233@163.com'
          # Placeholder: replace with your own SMTP password
          smtp_auth_password: 'xxxxxx'
          smtp_require_tls: true

        receivers:
        - name: default-receiver
          email_configs:
          - to: "Sunnyrain233@163.com"

        route:
          group_interval: 1m
          group_wait: 10s
          receiver: default-receiver
          repeat_interval: 1m

    EOF

    Detailed configuration documentation

  2. Deployment

    cat > Altermanager-deployment.yaml << EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: alertmanager
      namespace: kube-system
      labels:
        k8s-app: alertmanager
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
        version: latest
    spec:
      replicas: 1
      selector:
        matchLabels:
          k8s-app: alertmanager
          version: latest
      template:
        metadata:
          labels:
            k8s-app: alertmanager
            version: latest
          annotations:
            scheduler.alpha.kubernetes.io/critical-pod: ''
        spec:
          priorityClassName: system-cluster-critical
          containers:
            - name: prometheus-alertmanager
              image: "prom/alertmanager"
              imagePullPolicy: "IfNotPresent"
              args:
                - --config.file=/etc/config/alertmanager.yml
                - --storage.path=/data
                # - --web.external-url=/
              ports:
                - containerPort: 9093
              readinessProbe:
                httpGet:
                  path: /#/status
                  port: 9093
                initialDelaySeconds: 30
                timeoutSeconds: 30
              volumeMounts:
                - name: config-volume
                  mountPath: /etc/config
                - name: storage-volume
                  mountPath: "/data"
                  subPath: ""
              resources:
                limits:
                  cpu: 10m
                  memory: 50Mi
                requests:
                  cpu: 10m
                  memory: 50Mi
            - name: prometheus-alertmanager-configmap-reload
              image: "jimmidyson/configmap-reload:v0.1"
              imagePullPolicy: "IfNotPresent"
              args:
                - --volume-dir=/etc/config
                - --webhook-url=http://localhost:9093/-/reload
              volumeMounts:
                - name: config-volume
                  mountPath: /etc/config
                  readOnly: true
              resources:
                limits:
                  cpu: 10m
                  memory: 10Mi
                requests:
                  cpu: 10m
                  memory: 10Mi
          volumes:
            - name: config-volume
              configMap:
                name: alertmanager-config
            - name: storage-volume
              persistentVolumeClaim:
                claimName: alertmanager
    EOF

    DockerHub

  3. PVC(PersistentVolumeClaim)

    cat > Altermanager-pvc.yaml << EOF
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: alertmanager
      namespace: kube-system
      labels:
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: EnsureExists
    spec:
      storageClassName: managed-nfs-storage
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: "2Gi"
    EOF
  4. Service

    cat > Altermanager-service.yaml << EOF
    apiVersion: v1
    kind: Service
    metadata:
      name: alertmanager
      namespace: kube-system
      labels:
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
        kubernetes.io/name: "Alertmanager"
    spec:
      ports:
        - name: http
          port: 80
          protocol: TCP
          targetPort: 9093
      selector:
        k8s-app: alertmanager
      type: "ClusterIP"
    EOF
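
A note on the route block in the ConfigMap from step 1: with repeat_interval set to 1m, a firing alert is re-sent roughly every minute, which is handy while testing but noisy afterwards. The sketch below writes a calmer route block to a scratch file using the same heredoc style. The file name route-example.yml and the group_by label are assumptions for illustration only; the three timing values are Alertmanager's documented defaults, not something the original setup used.

```shell
# Writes an illustrative route block to a scratch file (hypothetical file name).
# Quoting 'EOF' keeps the shell from expanding anything inside the heredoc.
cat > route-example.yml << 'EOF'
route:
  receiver: default-receiver
  group_by: ['alertname']   # assumption: batch alerts that share a name
  group_wait: 30s           # wait before the first notification for a new group
  group_interval: 5m        # wait before notifying about new alerts in an existing group
  repeat_interval: 4h       # wait before re-sending a notification that was already sent
EOF
```

If these values suit you, copy the block over the route section inside alertmanager.yml in the ConfigMap; the configmap-reload sidecar from step 2 should then trigger a reload once the mounted file changes.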

Apply the configuration

kubectl apply -f Altermanager-configMap.yaml
kubectl apply -f Altermanager-pvc.yaml
kubectl apply -f Altermanager-deployment.yaml
kubectl apply -f Altermanager-service.yaml
kubectl get svc -n kube-system | grep alertmanager
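
Before moving on to Grafana, it can help to confirm that the Deployment actually came up. The following is a minimal sketch, assuming the names and labels from the manifests in this guide; the helper name verify_alertmanager and the 120s timeout are my own choices, not part of the original setup.

```shell
# Hypothetical helper; deployment name, namespace, and label come from the manifests above.
verify_alertmanager() {
  # Block until the Deployment finishes rolling out, or give up after the timeout.
  kubectl rollout status deployment/alertmanager -n kube-system --timeout=120s || return 1
  # List the matching Pod together with its node and IP address.
  kubectl get pods -n kube-system -l k8s-app=alertmanager -o wide
}
```

Run verify_alertmanager after the apply commands. If the Pod stays Pending, the PVC is a likely suspect; kubectl describe pvc alertmanager -n kube-system shows whether it was bound.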

Once the containers are running normally, go to the Admin page under Alerting in Grafana and click Add AlertManager.


Look up the Alertmanager Pod's IP address:
kubectl describe pod $(kubectl get pod -n kube-system | awk '{print $1}' | grep alert) -n kube-system | grep IP

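Add your Alertmanager URL to Grafana: http://<pod-IP>:9093. A Pod IP changes whenever the Pod is recreated, so if Grafana runs in the same cluster, the Service address http://alertmanager.kube-system:80 is likely the more stable choice.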
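
The IP lookup and the URL can also be produced in one step. This is only a sketch: the helper name alertmanager_pod_url is hypothetical, and it assumes the k8s-app=alertmanager label and port 9093 from the Deployment above, with a single replica.

```shell
# Hypothetical helper: print the Alertmanager URL for the first matching Pod.
# Assumes the k8s-app=alertmanager label and container port 9093 from the Deployment.
alertmanager_pod_url() {
  # jsonpath extracts just the Pod IP instead of grepping describe output.
  pod_ip=$(kubectl get pod -n kube-system -l k8s-app=alertmanager \
    -o jsonpath='{.items[0].status.podIP}') || return 1
  # An empty result means the Pod has no IP assigned yet.
  [ -n "$pod_ip" ] || return 1
  printf 'http://%s:9093\n' "$pod_ip"
}
```

Calling alertmanager_pod_url prints a URL that can be pasted straight into the Grafana form.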

Then add alert rules in Grafana. Once an alert reaches the firing state, an email is sent to your inbox.
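
To check the email path without waiting for a Grafana rule to fire, an alert can be posted to Alertmanager's v2 API by hand. The sketch below assumes curl is available and that ALERTMANAGER_URL is set to the address added to Grafana; the helper names, the label values, and the localhost default are placeholders of mine.

```shell
# Assumption: ALERTMANAGER_URL points at your Alertmanager; the default is only
# a placeholder that would work behind a local port-forward.
ALERTMANAGER_URL="${ALERTMANAGER_URL:-http://localhost:9093}"

# Hypothetical helper: build a one-alert JSON payload for the v2 API.
# Arguments: alert name, severity label, summary annotation.
build_test_alert() {
  printf '[{"labels":{"alertname":"%s","severity":"%s"},"annotations":{"summary":"%s"}}]\n' \
    "$1" "$2" "$3"
}

# Hypothetical helper: POST the payload to Alertmanager's /api/v2/alerts endpoint.
send_test_alert() {
  build_test_alert "$@" | curl -sS -X POST \
    -H 'Content-Type: application/json' \
    --data @- "${ALERTMANAGER_URL}/api/v2/alerts"
}
```

For example, send_test_alert TestAlert warning "Manual test" should show up in the Alertmanager UI, and with the group_wait of 10s from the ConfigMap the email should follow shortly after, provided the SMTP settings are correct.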
