Adding prometheus-operator alert rules in Rancher
2021-01-04
First, list the PrometheusRule CRD resources. You must specify the namespace in which prometheus-operator is deployed:
kubectl get PrometheusRule -n cattle-prometheus
Export the relevant CRD resource as YAML:
kubectl get PrometheusRule c-55xkf -n cattle-prometheus -o yaml > rules.yml
Use the exported file as a template: edit the rules in it, then apply it to the cluster again:
kubectl apply -f rules.yml
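The export, edit, and apply steps above can be sketched as a small shell helper. This is only an illustration: rename_rule is a hypothetical function, not part of kubectl or Rancher, and it assumes the manifest has the layout that kubectl get -o yaml normally produces, with metadata fields indented by exactly two spaces. Editing YAML with sed is fragile, so check the result before applying it.

```shell
#!/bin/sh
# Sketch: copy an exported PrometheusRule manifest under a new metadata.name.
# rename_rule is a hypothetical helper name chosen for this example.
rename_rule() {
    src=$1       # exported manifest, e.g. rules.yml
    new_name=$2  # new object name, e.g. mycrd
    # Rewrite only lines that start with exactly two spaces plus "name:",
    # so "- name:" entries under spec.groups and "namespace:" stay untouched.
    sed "s/^  name: .*/  name: ${new_name}/" "$src"
}

# Usage (needs cluster access, so shown as comments only):
#   kubectl get PrometheusRule c-55xkf -n cattle-prometheus -o yaml > rules.yml
#   rename_rule rules.yml mycrd > mycrd.yml
#   kubectl apply -f mycrd.yml
```

Writing the result to a separate file keeps the original export intact, which makes it easy to diff the two before running kubectl apply.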
Example
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  annotations:
  labels:
    cattle.io/creator: norman
    source: rancher-alert
  name: mycrd
  namespace: cattle-prometheus
spec:
  groups:
  - name: c-55xkf:event-alert
    rules:
    - alert: container restart
      annotations:
        current_value: 'The container {{ $labels.container }} in pod {{ $labels.pod }} has restarted at least {{ humanize $value}} times in the last hour on instance {{ $labels.instance }}.'
      expr: delta(kube_pod_container_status_restarts_total[20m])>0
      for: 10s
      labels:
        alert_name: container restart
        alert_type: metric
        cluster_name: 'test-cluster (ID: c-55xkf)'
        comparison: greater than
        duration: 10s
        expression: delta(kube_pod_container_status_restarts_total[20m])>0
        group_id: c-55xkf:event-alert
        rule_id: c-55xkf:event-alert_car-kqqpn
        severity: critical
        threshold_value: "0"
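Notes:
- Change name (the example uses mycrd). Prometheus then generates a corresponding new configuration file in its rules directory.
- Before each update, fetch the current resourceVersion from the YAML.
- If you created a new alert group, keep at least one alert rule under it; otherwise the alert rules created with this group_id will not take effect.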
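The resourceVersion note can be illustrated with a short sketch. Kubernetes uses resourceVersion for optimistic concurrency, so a manifest carrying a stale value is typically rejected with a conflict on update. get_resource_version below is a hypothetical helper that reads the value from a local export; it assumes the two-space metadata indentation of kubectl get -o yaml output. The commented kubectl line shows how the live value could be queried instead.

```shell
#!/bin/sh
# Sketch: print metadata.resourceVersion from an exported manifest.
# get_resource_version is a hypothetical helper name chosen for this example.
get_resource_version() {
    # Take the first two-space-indented resourceVersion line, strip the
    # surrounding double quotes from its value, print it, and stop.
    awk '/^  resourceVersion:/ { gsub(/"/, "", $2); print $2; exit }' "$1"
}

# Usage (the kubectl line needs cluster access, so it is shown as a comment):
#   get_resource_version rules.yml
#   kubectl get PrometheusRule mycrd -n cattle-prometheus \
#     -o jsonpath='{.metadata.resourceVersion}'
```

Comparing the value in the local file with the live one before applying is one way to notice that the export is out of date and should be fetched again.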
Title: Adding prometheus-operator alert rules in Rancher
Author: fish2018
URL: http://devopser.org/articles/2020/12/22/1608622948424.html