Autoscaling in Kubernetes Based on Prometheus Metrics

admin, October 8, 2020 21:16:34

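One of the main advantages of using Kubernetes for container orchestration is that it makes it very easy to scale our applications horizontally to handle increased load. Natively, horizontal pod autoscaling can scale a deployment based on CPU and memory usage, but in more complex scenarios we want to take other metrics into account before making a scaling decision.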

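Prometheus is the standard tool for monitoring deployed workloads and the Kubernetes cluster itself. The Prometheus adapter lets us take the metrics collected by Prometheus and use them to make scaling decisions. These metrics are exposed through an API service and can be readily consumed by our Horizontal Pod Autoscaler object.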

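Deployment Architecture

We will use the Prometheus adapter to pull custom metrics from our Prometheus installation, and then let the Horizontal Pod Autoscaler (HPA) use them to scale the pods up or down.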

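Prerequisites:

  1. Basic knowledge of horizontal pod autoscaling.
  2. Prometheus deployed in the cluster or reachable through an endpoint.

We will use a Prometheus-Thanos high-availability deployment.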

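Deploying the Sample Application

First, we deploy a sample application on which we will test autoscaling based on Prometheus metrics. We can use the manifest below to do so: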

apiVersion: v1
kind: Namespace
metadata:
  name: nginx
---
# Note: extensions/v1beta1 Deployments are no longer served as of Kubernetes 1.16.
# On newer clusters, use apps/v1 and add a spec.selector matching the pod labels.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: nginx
  name: nginx-deployment
spec:
  replicas: 1
  template:
    metadata:
      annotations:
        prometheus.io/path: "/status/format/prometheus"
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
      labels:
        app: nginx-server
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - nginx-server
              topologyKey: kubernetes.io/hostname
      containers:
      - name: nginx-demo
        image: vaibhavthakur/nginx-vts:1.0
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 2500m
          requests:
            cpu: 2000m
        ports:
        - containerPort: 80
          name: http
---
apiVersion: v1
kind: Service
metadata:
  namespace: nginx
  name: nginx-service
spec:
  ports:
  - port: 80
    targetPort: 80
    name: http
  selector:
    app: nginx-server
  type: LoadBalancer

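This creates a namespace named nginx and deploys the sample Nginx application in it. The application can be reached through the service, and it exposes the nginx vts metrics on port 80 at the endpoint /status/format/prometheus. For this setup, we created a DNS entry for the ExternalIP that maps to nginx.gotham.com.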

root$ kubectl get deploy 
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   1/1     1            1           43d

root$ kubectl get pods 
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-65d8df7488-c578v   1/1     Running   0          9h

root$ kubectl get svc
NAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
nginx-service   ClusterIP   10.63.253.154   35.232.67.34      80/TCP    43d

root$ kubectl describe deploy nginx-deployment
Name:                   nginx-deployment
Namespace:              nginx
CreationTimestamp:      Tue, 08 Oct 2019 11:47:36 -0700
Labels:                 app=nginx-server
Annotations:            deployment.kubernetes.io/revision: 1
                        kubectl.kubernetes.io/last-applied-configuration:
                          {"apiVersion":"extensions/v1beta1","kind":"Deployment","metadata":{"annotations":{},"name":"nginx-deployment","namespace":"nginx"},"spec":...
Selector:               app=nginx-server
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  1 max unavailable, 1 max surge
Pod Template:
  Labels:       app=nginx-server
  Annotations:  prometheus.io/path: /status/format/prometheus
                prometheus.io/port: 80
                prometheus.io/scrape: true
  Containers:
   nginx-demo:
    Image:      vaibhavthakur/nginx-vts:v1.0
    Port:       80/TCP
    Host Port:  0/TCP
    Limits:
      cpu:  250m
    Requests:
      cpu:        200m
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   nginx-deployment-65d8df7488 (1/1 replicas created)
Events:          <none>


root$ curl nginx.gotham.com
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

These are all the metrics the application currently exposes:

$ curl nginx.gotham.com/status/format/prometheus
# HELP nginx_vts_info Nginx info
# TYPE nginx_vts_info gauge
nginx_vts_info{hostname="nginx-deployment-65d8df7488-c578v",version="1.13.12"} 1
# HELP nginx_vts_start_time_seconds Nginx start time
# TYPE nginx_vts_start_time_seconds gauge
nginx_vts_start_time_seconds 1574283147.043
# HELP nginx_vts_main_connections Nginx connections
# TYPE nginx_vts_main_connections gauge
nginx_vts_main_connections{status="accepted"} 215
nginx_vts_main_connections{status="active"} 4
nginx_vts_main_connections{status="handled"} 215
nginx_vts_main_connections{status="reading"} 0
nginx_vts_main_connections{status="requests"} 15577
nginx_vts_main_connections{status="waiting"} 3
nginx_vts_main_connections{status="writing"} 1
# HELP nginx_vts_main_shm_usage_bytes Shared memory [ngx_http_vhost_traffic_status] info
# TYPE nginx_vts_main_shm_usage_bytes gauge
nginx_vts_main_shm_usage_bytes{shared="max_size"} 1048575
nginx_vts_main_shm_usage_bytes{shared="used_size"} 3510
nginx_vts_main_shm_usage_bytes{shared="used_node"} 1
# HELP nginx_vts_server_bytes_total The request/response bytes
# TYPE nginx_vts_server_bytes_total counter
# HELP nginx_vts_server_requests_total The requests counter
# TYPE nginx_vts_server_requests_total counter
# HELP nginx_vts_server_request_seconds_total The request processing time in seconds
# TYPE nginx_vts_server_request_seconds_total counter
# HELP nginx_vts_server_request_seconds The average of request processing times in seconds
# TYPE nginx_vts_server_request_seconds gauge
# HELP nginx_vts_server_request_duration_seconds The histogram of request processing time
# TYPE nginx_vts_server_request_duration_seconds histogram
# HELP nginx_vts_server_cache_total The requests cache counter
# TYPE nginx_vts_server_cache_total counter
nginx_vts_server_bytes_total{host="_",direction="in"} 3303449
nginx_vts_server_bytes_total{host="_",direction="out"} 61641572
nginx_vts_server_requests_total{host="_",code="1xx"} 0
nginx_vts_server_requests_total{host="_",code="2xx"} 15574
nginx_vts_server_requests_total{host="_",code="3xx"} 0
nginx_vts_server_requests_total{host="_",code="4xx"} 2
nginx_vts_server_requests_total{host="_",code="5xx"} 0
nginx_vts_server_requests_total{host="_",code="total"} 15576
nginx_vts_server_request_seconds_total{host="_"} 0.000
nginx_vts_server_request_seconds{host="_"} 0.000
nginx_vts_server_cache_total{host="_",status="miss"} 0
nginx_vts_server_cache_total{host="_",status="bypass"} 0
nginx_vts_server_cache_total{host="_",status="expired"} 0
nginx_vts_server_cache_total{host="_",status="stale"} 0
nginx_vts_server_cache_total{host="_",status="updating"} 0
nginx_vts_server_cache_total{host="_",status="revalidated"} 0
nginx_vts_server_cache_total{host="_",status="hit"} 0
nginx_vts_server_cache_total{host="_",status="scarce"} 0
nginx_vts_server_bytes_total{host="*",direction="in"} 3303449
nginx_vts_server_bytes_total{host="*",direction="out"} 61641572
nginx_vts_server_requests_total{host="*",code="1xx"} 0
nginx_vts_server_requests_total{host="*",code="2xx"} 15574
nginx_vts_server_requests_total{host="*",code="3xx"} 0
nginx_vts_server_requests_total{host="*",code="4xx"} 2
nginx_vts_server_requests_total{host="*",code="5xx"} 0
nginx_vts_server_requests_total{host="*",code="total"} 15576
nginx_vts_server_request_seconds_total{host="*"} 0.000
nginx_vts_server_request_seconds{host="*"} 0.000
nginx_vts_server_cache_total{host="*",status="miss"} 0
nginx_vts_server_cache_total{host="*",status="bypass"} 0
nginx_vts_server_cache_total{host="*",status="expired"} 0
nginx_vts_server_cache_total{host="*",status="stale"} 0
nginx_vts_server_cache_total{host="*",status="updating"} 0
nginx_vts_server_cache_total{host="*",status="revalidated"} 0
nginx_vts_server_cache_total{host="*",status="hit"} 0
nginx_vts_server_cache_total{host="*",status="scarce"} 0

Among these, we are most interested in nginx_vts_server_requests_total. We will use the value of this metric to decide whether to scale our Nginx deployment.
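Because nginx_vts_server_requests_total is a counter, its raw value only ever grows; what matters for scaling is how fast it grows. The following is a minimal Python sketch, not part of the original setup, that parses two sample lines in the Prometheus text format and computes an average per-second increase between two scrapes. The parser is deliberately simplified (no escaped quotes, no timestamps), the helper names are hypothetical, and the second scrape value of 15876 is an invented number for illustration. Prometheus's own rate() additionally handles counter resets and extrapolation.

```python
import re

# Two sample lines in the Prometheus text exposition format, using the same
# values as the /status/format/prometheus output shown above.
SAMPLE = '''\
nginx_vts_server_requests_total{host="*",code="2xx"} 15574
nginx_vts_server_requests_total{host="*",code="total"} 15576
'''

LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Yield (metric_name, labels_dict, float_value) for each sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        yield match.group("name"), labels, float(match.group("value"))


def per_second_rate(earlier, later, seconds):
    """Average per-second increase of a counter between two scrapes."""
    return (later - earlier) / seconds


samples = list(parse_samples(SAMPLE))
total = next(v for _, labels, v in samples if labels.get("code") == "total")

# Hypothetical second scrape 60 seconds later, for illustration only.
rate = per_second_rate(total, 15876.0, 60)
print(total, rate)
```

This is why the adapter rule in the next section wraps the series in rate(...) rather than exposing the counter directly.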

Creating the Prometheus Adapter ConfigMap

Use the manifest below to create the Prometheus adapter ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
data:
  config.yaml: |
    rules:
    - seriesQuery: 'nginx_vts_server_requests_total'
      resources:
        overrides:
          kubernetes_namespace:
            resource: namespace
          kubernetes_pod_name:
            resource: pod
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: (sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))

This ConfigMap specifies only a single metric, but we can always add more.
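To make the rule above more concrete, here is a rough Python emulation of two of its parts: the name rewrite (matches / as) and the expansion of the metricsQuery template. This is only a sketch; the adapter itself is written in Go and uses Go regular expressions and templates. The function names are hypothetical, and the label matcher string is an assumed example; the exact matchers the adapter generates may differ.

```python
import re


def rename_metric(series_name, matches, as_template):
    """Emulate the rule's name rewrite using Python's re module."""
    # The adapter config writes capture groups as ${1}; Python expects \g<1>.
    replacement = re.sub(
        r"\$\{(\d+)\}", lambda m: "\\g<" + m.group(1) + ">", as_template
    )
    return re.sub(matches, replacement, series_name)


def expand_query(template, series, label_matchers, group_by):
    """Fill in the three placeholders used by the metricsQuery template."""
    return (
        template.replace("<<.Series>>", series)
        .replace("<<.LabelMatchers>>", label_matchers)
        .replace("<<.GroupBy>>", group_by)
    )


TEMPLATE = "(sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))"

api_name = rename_metric(
    "nginx_vts_server_requests_total", "^(.*)_total", "${1}_per_second"
)

# Assumed matcher values, for illustration only.
promql = expand_query(
    TEMPLATE,
    series="nginx_vts_server_requests_total",
    label_matchers='kubernetes_namespace="nginx",'
    'kubernetes_pod_name="nginx-deployment-65d8df7488-c578v"',
    group_by="kubernetes_pod_name",
)

print(api_name)
print(promql)
```

The renamed metric is the name the HPA later refers to, while the expanded query is what gets sent to Prometheus.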

Creating the Prometheus Adapter Deployment

Use the following manifest to deploy the Prometheus adapter:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: custom-metrics-apiserver
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-metrics-apiserver
  template:
    metadata:
      labels:
        app: custom-metrics-apiserver
      name: custom-metrics-apiserver
    spec:
      serviceAccountName: monitoring
      containers:
      - name: custom-metrics-apiserver
        image: quay.io/coreos/k8s-prometheus-adapter-amd64:v0.4.1
        args:
        - /adapter
        - --secure-port=6443
        - --tls-cert-file=/var/run/serving-cert/serving.crt
        - --tls-private-key-file=/var/run/serving-cert/serving.key
        - --logtostderr=true
        - --prometheus-url=http://thanos-querier.monitoring:9090/
        - --metrics-relist-interval=30s
        - --v=10
        - --config=/etc/adapter/config.yaml
        ports:
        - containerPort: 6443
        volumeMounts:
        - mountPath: /var/run/serving-cert
          name: volume-serving-cert
          readOnly: true
        - mountPath: /etc/adapter/
          name: config
          readOnly: true
      volumes:
      - name: volume-serving-cert
        secret:
          secretName: cm-adapter-serving-certs
      - name: config
        configMap:
          name: adapter-config

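This creates our deployment, which spawns the Prometheus adapter pod that pulls metrics from Prometheus. Note that we have set the argument --prometheus-url=http://thanos-querier.monitoring:9090/. This is because we deployed a Prometheus-Thanos cluster in the monitoring namespace of the same Kubernetes cluster as the Prometheus adapter. You can change this argument to point to your own Prometheus deployment.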

If you look at the logs of this container, you can see it fetching the metric defined in the config file:

I1122 00:26:53.228394       1 api.go:74] GET http://thanos-querier.monitoring:9090/api/v1/series?match%5B%5D=nginx_vts_server_requests_total&start=1574381213.217 200 OK
I1122 00:26:53.234234       1 api.go:93] Response Body: {"status":"success","data":[{"__name__":"nginx_vts_server_requests_total","app":"nginx-server","cluster":"prometheus-ha","code":"1xx","host":"*","instance":"10.60.64.39:80","job":"kubernetes-pods","kubernetes_namespace":"nginx","kubernetes_pod_name":"nginx-deployment-65d8df7488-sbp95","pod_template_hash":"65d8df7488"},{"__name__":"nginx_vts_server_requests_total","app":"nginx-server","cluster":"prometheus-ha","code":"1xx","host":"*","instance":"10.60.64.8:80","job":"kubernetes-pods","kubernetes_namespace":"nginx","kubernetes_pod_name":"nginx-deployment-65d8df7488-mwzxg","pod_template_hash":"65d8df7488"}
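The first log line shows the adapter calling the /api/v1/series endpoint of the Prometheus HTTP API. When pointing --prometheus-url at a different Prometheus deployment, it can help to build the same request URL yourself and try it from inside the cluster. The sketch below only constructs the URL and does not send a request; the function name is hypothetical, and the start timestamp is copied from the log line above.

```python
from urllib.parse import urlencode, urljoin


def series_url(base_url, series_name, start):
    """Build a /api/v1/series request URL like the one in the adapter log."""
    query = urlencode({"match[]": series_name, "start": start})
    return urljoin(base_url, "api/v1/series") + "?" + query


url = series_url(
    "http://thanos-querier.monitoring:9090/",
    "nginx_vts_server_requests_total",
    "1574381213.217",
)
print(url)
```

If a request to this URL from a pod in the cluster does not return a success status, the adapter will not be able to discover the series either.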

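Creating the Prometheus Adapter API Service

The manifest below creates an API service so that the Kubernetes API can reach our Prometheus adapter, which in turn lets our Horizontal Pod Autoscaler fetch the metrics.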

apiVersion: v1
kind: Service
metadata:
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  ports:
  - port: 443
    targetPort: 6443
  selector:
    app: custom-metrics-apiserver
---
# Note: apiregistration.k8s.io/v1beta1 is no longer served as of Kubernetes 1.22;
# on newer clusters use apiregistration.k8s.io/v1.
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  service:
    name: custom-metrics-apiserver
    namespace: monitoring
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100

Testing the Setup

Let's check all the custom metrics that are available:

root$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .

{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "pods/nginx_vts_server_requests_per_second",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "namespaces/nginx_vts_server_requests_per_second",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}
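When the adapter config grows to cover many metrics, the discovery response gets long. As a small illustration, the Python sketch below groups the entries of such a response by metric name and lists the resource types each metric is available for. The JSON is a trimmed copy of the response above, and the function name is hypothetical.

```python
import json

# Trimmed copy of the APIResourceList returned by the custom metrics API.
DISCOVERY = """
{
  "kind": "APIResourceList",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {"name": "pods/nginx_vts_server_requests_per_second", "namespaced": true},
    {"name": "namespaces/nginx_vts_server_requests_per_second", "namespaced": false}
  ]
}
"""


def metrics_by_name(resource_list):
    """Map each metric name to the resource types it is exposed for."""
    result = {}
    for resource in resource_list["resources"]:
        # Entries are named "<resource type>/<metric name>".
        scope, _, metric = resource["name"].partition("/")
        result.setdefault(metric, []).append(scope)
    return result


available = metrics_by_name(json.loads(DISCOVERY))
print(available)
```

In practice the input would come from the output of the kubectl get --raw command shown above rather than from a hard-coded string.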

We can see that the nginx_vts_server_requests_per_second metric is available. Now let's check its current value:

root$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/nginx/pods/*/nginx_vts_server_requests_per_second" | jq .

{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/nginx/pods/%2A/nginx_vts_server_requests_per_second"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "nginx",
        "name": "nginx-deployment-65d8df7488-v575j",
        "apiVersion": "/v1"
      },
      "metricName": "nginx_vts_server_requests_per_second",
      "timestamp": "2019-11-19T18:38:21Z",
      "value": "1236m"
    }
  ]
}
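The value "1236m" uses Kubernetes quantity notation, where the m suffix means milli-units, so this pod is serving about 1.236 requests per second. A minimal Python sketch of that conversion follows; it handles only plain numbers and the m suffix, which is all that appears in this article, and the function name is hypothetical.

```python
from decimal import Decimal


def parse_quantity(quantity):
    """Convert a plain or milli-suffixed Kubernetes quantity to a Decimal."""
    if quantity.endswith("m"):
        # "m" means thousandths of a unit.
        return Decimal(quantity[:-1]) / Decimal(1000)
    return Decimal(quantity)


current = parse_quantity("1236m")
target = parse_quantity("4000m")
print(current, target)
```

The same notation explains why a target written as 4000m in an HPA manifest is displayed as 4 by kubectl.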

Next, create an HPA that uses this metric. We can do so with the manifest below.

# Note: autoscaling/v2beta1 is no longer served as of Kubernetes 1.25;
# newer clusters use autoscaling/v2, which has a different metrics syntax.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-custom-hpa
  namespace: nginx
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: nginx_vts_server_requests_per_second
      targetAverageValue: 4000m

After applying this manifest, you can check the current state of the HPA as follows:

root$ kubectl describe hpa
Name:               nginx-custom-hpa
Namespace:          nginx
Labels:             <none>
Annotations:        autoscaling.alpha.kubernetes.io/metrics:
                      [{"type":"Pods","pods":{"metricName":"nginx_vts_server_requests_per_second","targetAverageValue":"4"}}]
                    kubectl.kubernetes.io/last-applied-configuration:
                      {"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx-custom-hpa","namespace":"n...
CreationTimestamp:  Thu, 21 Nov 2019 11:11:05 -0800
Reference:          Deployment/nginx-deployment
Min replicas:       2
Max replicas:       10
Deployment pods:    0 current / 0 desired
Events:             <none>

Now let's generate some load on the service. To do this, we will use a utility called Vegeta.
Run the following command in a separate terminal:

echo "GET http://nginx.gotham.com/" | vegeta attack -rate=5 -duration=0 | vegeta report

Meanwhile, watch the nginx pods and the Horizontal Pod Autoscaler. You should see something like this:


root$ kubectl get -w pods
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-65d8df7488-mwzxg   1/1     Running   0          9h
nginx-deployment-65d8df7488-sbp95   1/1     Running   0          4m9s
NAME                                AGE
nginx-deployment-65d8df7488-pwjzm   0s
nginx-deployment-65d8df7488-pwjzm   0s
nginx-deployment-65d8df7488-pwjzm   0s
nginx-deployment-65d8df7488-pwjzm   2s
nginx-deployment-65d8df7488-pwjzm   4s
nginx-deployment-65d8df7488-jvbvp   0s
nginx-deployment-65d8df7488-jvbvp   0s
nginx-deployment-65d8df7488-jvbvp   1s
nginx-deployment-65d8df7488-jvbvp   4s
nginx-deployment-65d8df7488-jvbvp   7s
nginx-deployment-65d8df7488-skjkm   0s
nginx-deployment-65d8df7488-skjkm   0s
nginx-deployment-65d8df7488-jh5vw   0s
nginx-deployment-65d8df7488-skjkm   0s
nginx-deployment-65d8df7488-jh5vw   0s
nginx-deployment-65d8df7488-jh5vw   1s
nginx-deployment-65d8df7488-skjkm   2s
nginx-deployment-65d8df7488-jh5vw   2s
nginx-deployment-65d8df7488-skjkm   3s
nginx-deployment-65d8df7488-jh5vw   4s

root$ kubectl get hpa
NAME               REFERENCE                     TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-custom-hpa   Deployment/nginx-deployment   5223m/4   2         10        3          5m5s
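The TARGETS column above reads 5223m/4, meaning the average per-pod value is about 5.223 against a target of 4, with 3 replicas running. The Kubernetes documentation describes the core HPA calculation as desiredReplicas = ceil(currentReplicas * currentValue / targetValue), skipped when the ratio is within a tolerance (0.1 by default) and clamped to the min/max bounds. The Python sketch below applies that formula to the numbers shown here. It is a simplification: the real controller also accounts for unready pods, missing metrics, and scale-down stabilization.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified version of the documented HPA replica calculation."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas  # close enough to target: no change
    else:
        desired = math.ceil(current_replicas * ratio)
    # Keep the result within the HPA's minReplicas/maxReplicas bounds.
    return max(min_replicas, min(max_replicas, desired))


# Values taken from the kubectl get hpa output and the HPA manifest.
next_count = desired_replicas(3, 5.223, 4.0, min_replicas=2, max_replicas=10)
print(next_count)
```

Under these assumptions the controller would move from 3 to 4 replicas, and it would keep adding pods for as long as the per-pod average stays more than 10% above the target.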

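We can clearly see that the HPA scaled up the pods as required. When we interrupt the Vegeta command, we get the Vegeta report, which shows that the application served all of our requests.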

root$ echo "GET http://nginx.gotham.com/" | vegeta attack -rate=5 -duration=0 | vegeta report
^CRequests [total, rate, throughput] 224, 5.02, 5.02
Duration [total, attack, wait] 44.663806863s, 44.601823883s, 61.98298ms
Latencies [mean, 50, 95, 99, max] 63.3879ms, 60.867241ms, 79.414139ms, 111.981619ms, 229.310088ms
Bytes In [total, mean] 137088, 612.00
Bytes Out [total, mean] 0, 0.00
Success [ratio] 100.00%
Status Codes [code:count] 200:224
Error Set:

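Conclusion

This setup demonstrates how to use the Prometheus adapter to autoscale a deployment based on a custom metric. For simplicity, we fetched only one metric from the Prometheus server. However, the adapter ConfigMap can be extended to fetch some or all of the available metrics and use them for autoscaling.

If Prometheus is installed outside our Kubernetes cluster, just make sure the query endpoint is reachable from the cluster, then update it in the adapter deployment manifest. In more complex scenarios, multiple metrics can be fetched and used in combination to make scaling decisions.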
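When an HPA lists several metrics, the Kubernetes documentation states that the replica calculation is done for each metric and the largest result is chosen. The Python sketch below illustrates that rule. The second metric, hypothetical_queue_depth, and its values are invented purely for illustration, and the sketch leaves out tolerance and stabilization handling.

```python
import math


def replicas_for_metric(current_replicas, current_value, target_value):
    """Replica count a single metric would ask for."""
    return math.ceil(current_replicas * current_value / target_value)


def desired_replicas(current_replicas, metrics, min_replicas, max_replicas):
    """Evaluate every metric and take the largest proposal, within bounds."""
    proposals = [
        replicas_for_metric(current_replicas, current, target)
        for current, target in metrics.values()
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))


metrics = {
    # (current average per pod, target average per pod)
    "nginx_vts_server_requests_per_second": (5.223, 4.0),
    "hypothetical_queue_depth": (30.0, 10.0),  # invented example metric
}

combined = desired_replicas(3, metrics, min_replicas=2, max_replicas=10)
print(combined)
```

Here the request-rate metric alone would ask for 4 replicas, but the invented queue-depth metric asks for 9, so the larger value wins.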

Article source: python运维技术

This article was first published on the WeChat official account 马哥Linux运维: Autoscaling in Kubernetes Based on Prometheus Metrics

When reposting, please keep the link to this article (CN-SEC中文网): http://cn-sec.com/archives/152275.html
