

AWS EKS: understanding and using Karpenter to dynamically provision nodes based on load

Karpenter's main purpose is to dynamically provision nodes based on pods' scheduling requests so that unschedulable pods can run, and to quickly remove nodes and release resources once the pods are drained. Typical scheduling events:

    default-scheduler  0/2 nodes are available: 1 Insufficient cpu, 1 node(s) were unschedulable
    default-scheduler  0/1 nodes are available: 1 Insufficient cpu.
    karpenter          Pod should schedule on ip-192-168-21-5.cn-north-1.compute.internal

The workflow: Watching → Evaluating → Provisioning → Removing.

Deploying and configuring Karpenter

Create an EKS cluster

    #!/bin/bash
    export CLUSTER_NAME="testkarpenter"
    export AWS_DEFAULT_REGION="cn-north-1"
    export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
    echo $KARPENTER_VERSION $CLUSTER_NAME $AWS_DEFAULT_REGION $AWS_ACCOUNT_ID

    eksctl create cluster -f - <<EOF
    ---
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    metadata:
      name: ${CLUSTER_NAME}
      region: ${AWS_DEFAULT_REGION}
      version: "1.23"
      tags:
        karpenter.sh/discovery: ${CLUSTER_NAME}
    managedNodeGroups:
      - instanceType: m5.large
        amiFamily: AmazonLinux2
        name: ${CLUSTER_NAME}-ng
        desiredCapacity: 1
        minSize: 1
        maxSize: 2
    iam:
      withOIDC: true
    EOF

Deploy Karpenter

    export KARPENTER_VERSION=v0.22.1
    export CLUSTER_NAME="testkarpenter"
    export AWS_DEFAULT_REGION="cn-north-1"
    export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
    echo $KARPENTER_VERSION $CLUSTER_NAME $AWS_DEFAULT_REGION $AWS_ACCOUNT_ID

    TEMPOUT=$(mktemp)
    curl -fsSL https://karpenter.sh/"${KARPENTER_VERSION}"/getting-started/getting-started-with-eksctl/cloudformation.yaml > $TEMPOUT \
    && aws cloudformation deploy \
      --stack-name "Karpenter-${CLUSTER_NAME}" \
      --template-file "${TEMPOUT}" \
      --capabilities CAPABILITY_NAMED_IAM \
      --parameter-overrides "ClusterName=${CLUSTER_NAME}"

    eksctl create iamidentitymapping \
      --username system:node:{{EC2PrivateDNSName}} \
      --cluster "${CLUSTER_NAME}" \
      --arn "arn:aws-cn:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}" \
      --group system:bootstrappers \
      --group system:nodes

    eksctl create iamserviceaccount \
      --cluster "${CLUSTER_NAME}" --name karpenter --namespace karpenter \
      --role-name "${CLUSTER_NAME}-karpenter" \
      --attach-policy-arn "arn:aws-cn:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}" \
      --role-only \
      --approve

    aws iam create-service-linked-role --aws-service-name spot.amazonaws.com || true

    export CLUSTER_ENDPOINT="$(aws eks describe-cluster --name ${CLUSTER_NAME} --query "cluster.endpoint" --output text)"
    export KARPENTER_IAM_ROLE_ARN="arn:aws-cn:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"

    helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version ${KARPENTER_VERSION} \
      --namespace karpenter --create-namespace \
      --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KARPENTER_IAM_ROLE_ARN} \
      --set settings.aws.clusterName=${CLUSTER_NAME} \
      --set settings.aws.clusterEndpoint=${CLUSTER_ENDPOINT} \
      --set settings.aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
      --set settings.aws.interruptionQueueName=${CLUSTER_NAME} \
      --wait

    # Older v0.18-style deployment:
    # helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version ${KARPENTER_VERSION} --namespace karpenter --create-namespace \
    #   --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KARPENTER_IAM_ROLE_ARN} \
    #   --set clusterName=${CLUSTER_NAME} \
    #   --set clusterEndpoint=${CLUSTER_ENDPOINT} \
    #   --set aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
    #   --wait

Create a sample Provisioner

    export CLUSTER_NAME="testkarpenter"
    cat <<EOF | kubectl apply -f -
    apiVersion: karpenter.sh/v1alpha5
    kind: Provisioner
    metadata:
      name: default
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
      limits:
        resources:
          cpu: 1000
      providerRef:
        name: default
      ttlSecondsAfterEmpty: 10
    ---
    apiVersion: karpenter.k8s.aws/v1alpha1
    kind: AWSNodeTemplate
    metadata:
      name: default
    spec:
      subnetSelector:
        karpenter.sh/discovery: ${CLUSTER_NAME}
      securityGroupSelector:
        karpenter.sh/discovery: ${CLUSTER_NAME}
    EOF

Deploy a load test

    cat <<EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: inflate
    spec:
      replicas: 0
      selector:
        matchLabels:
          app: inflate
      template:
        metadata:
          labels:
            app: inflate
        spec:
          terminationGracePeriodSeconds: 0
          containers:
            - name: inflate
              image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
              resources:
                requests:
                  cpu: 1
    EOF
    kubectl scale deployment inflate --replicas 5
    kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller

Clean up resources

    #!/bin/bash
    set -x
    export CLUSTER_NAME="testkarpenter"
    export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
    # helm uninstall karpenter --namespace karpenter
    aws iam detach-role-policy --role-name="${CLUSTER_NAME}-karpenter" --policy-arn="arn:aws-cn:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}"
    aws iam delete-policy --policy-arn="arn:aws-cn:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}"
    aws iam delete-role --role-name="${CLUSTER_NAME}-karpenter"
    aws cloudformation delete-stack --stack-name "Karpenter-${CLUSTER_NAME}"
    aws ec2 describe-launch-templates \
      | jq -r ".LaunchTemplates[].LaunchTemplateName" \
      | grep -i "Karpenter-${CLUSTER_NAME}" \
      | xargs -I{} aws ec2 delete-launch-template --launch-template-name {}
    eksctl delete cluster --name "${CLUSTER_NAME}"

Karpenter can be configured either through a ConfigMap or through container environment variables:

https://karpenter.sh/v0.22.1/concepts/settings/#environment-variables--cli-flags

    Environment:
      CLUSTER_NAME:     testkarpenter
      CLUSTER_ENDPOINT: https://Bxxxxxxxxxxxxxxx39D808.sk1.cn-north-1.eks.amazonaws.com.c

provisioner

https://karpenter.sh/v0.22.1/concepts/provisioners/

A Provisioner is a Karpenter CRD that specifies provisioning configuration for nodes; each Provisioner manages a distinct set of nodes.

Provisioners accommodate pods with different resource requirements: Karpenter schedules and provisions capacity based on pod attributes (the exact computation is not documented here), so you no longer need to create node groups.

  • Pods can use well-known labels to request specific instance properties (type, architecture, operating system, etc.)
  • The default AWSNodeTemplate discovers the security groups and subnets to launch nodes with, using resources tagged karpenter.sh/discovery
  • ttlSecondsAfterEmpty is the delay between a node becoming empty (after its pods are deleted) and the node being terminated
  • ttlSecondsUntilExpired sets a node's maximum lifetime
  • weight sets the Provisioner's priority
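Taken together, these fields all live on the Provisioner spec. A minimal sketch (the name and values here are illustrative, not from the original setup):

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: example                     # illustrative name
spec:
  requirements:
    # Well-known labels restrict what Karpenter may launch
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64"]
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]
  providerRef:
    name: default                   # the AWSNodeTemplate to use
  ttlSecondsAfterEmpty: 30          # terminate a node 30s after it becomes empty
  ttlSecondsUntilExpired: 604800    # replace nodes after 7 days
  weight: 10                        # preferred over lower-weight provisioners
```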

As long as a pod's scheduling requests do not exceed the Provisioner's limits, Karpenter looks for the best match; if no Provisioner can satisfy the pod, it stays in the unscheduled state.

Once the cluster and the Karpenter controller are running, configure Provisioner and workload constraints:

  • Set up provisioners: the constraints that can be set include taints, labels, and requirements
  • Deploy workloads: the pod-level constraints include resources, nodeSelector, nodeAffinity, podAffinity/podAntiAffinity, tolerations, topologySpreadConstraints, and persistent volume topology
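A workload expresses these constraints in its pod template; the following sketch (deployment name and values are illustrative) combines a well-known-label nodeSelector with a zone spread constraint:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spread-workload                 # hypothetical name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: spread-workload
  template:
    metadata:
      labels:
        app: spread-workload
    spec:
      nodeSelector:
        karpenter.sh/capacity-type: spot    # well-known label: steer onto spot capacity
      topologySpreadConstraints:
        - maxSkew: 1                        # spread replicas evenly across zones
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: spread-workload
      containers:
        - name: app
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 500m                     # requests drive Karpenter's sizing decision
```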

Karpenter automatically provisions new nodes in response to unschedulable pods: it watches cluster events and then sends commands to the underlying cloud provider.

How Provisioners are used

A Provisioner can set constraints on nodes, such as restricting the zones and instance types in which nodes are created, and can add labels to nodes at launch.

Every Provisioner configured in the cluster is evaluated in a loop.

Provisioners should be mutually exclusive: a pod should not match more than one Provisioner. If multiple Provisioners do match, the one with the highest weight is chosen.

You can refer to the official Provisioner configuration to launch instances:

https://karpenter.sh/v0.22.1/concepts/provisioners/

Here we use the default AWSNodeTemplate, add a taint at node launch, set the capacity type to on-demand, and set the weight to 10:

    apiVersion: karpenter.sh/v1alpha5
    kind: Provisioner
    metadata:
      name: banana
    spec:
      taints:
        - key: example.com/special-taint
          effect: NoSchedule
      labels:
        billing-team: my-team
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      limits:
        resources:
          cpu: 1000
      providerRef:
        name: default
      ttlSecondsAfterEmpty: 10
      weight: 10

When a pod specifies the matching toleration and nodeSelector, both the default and the banana Provisioner fit, so banana is chosen by weight:
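A pod spec that matches banana might look like the following sketch (the pod name is hypothetical; the taint and label come from the banana Provisioner above):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: condition-workload            # hypothetical name
spec:
  nodeSelector:
    billing-team: my-team             # label applied by the banana provisioner
  tolerations:
    - key: example.com/special-taint  # tolerate banana's NoSchedule taint
      operator: Exists
      effect: NoSchedule
  containers:
    - name: app
      image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
      resources:
        requests:
          cpu: 1
```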

    Launching node with 3 pods requesting {"cpu":"3125m","pods":"5"} from types t3a.xlarge, c5a.xlarge, t3.xlarge, c6i.xlarge, c5.xlarge and 166 other(s) {"commit":"51becf8-dirty", "provisioner":"banana"}

node templates

https://karpenter.sh/v0.22.1/concepts/node-templates/

AWSNodeTemplate holds the AWS-specific configuration; multiple Provisioners can reference the same AWSNodeTemplate. It can be understood as overriding parameters of the launch template that would otherwise be created with defaults.

Example template and the available parameters:

    apiVersion: karpenter.sh/v1alpha5
    kind: Provisioner
    metadata:
      name: default
    spec:
      providerRef:
        name: default
    ---
    apiVersion: karpenter.k8s.aws/v1alpha1
    kind: AWSNodeTemplate
    metadata:
      name: default
    spec:
      subnetSelector: { ... }         # required, discovers tagged subnets to attach to instances
      securityGroupSelector: { ... }  # required, discovers tagged security groups to attach to instances
      instanceProfile: "..."          # optional, overrides the node's identity from global settings
      amiFamily: "..."                # optional, resolves a default ami and userdata
      amiSelector: { ... }            # optional, discovers tagged amis to override the amiFamily's default
      userData: "..."                 # optional, overrides autogenerated userdata with a merge semantic
      tags: { ... }                   # optional, propagates tags to underlying EC2 resources
      metadataOptions: { ... }        # optional, configures IMDS for the instance
      blockDeviceMappings: [ ... ]    # optional, configures storage devices for the instance

For example, an AWSNodeTemplate can specify the AMI to use; at launch Karpenter automatically detects the architectures the AMI is compatible with and starts a matching instance. If multiple usable AMIs are found, the most recent one is used; if no AMI is found, no instance is launched.

https://karpenter.sh/v0.22.1/concepts/node-templates/#ami-selection

    apiVersion: karpenter.k8s.aws/v1alpha1
    kind: AWSNodeTemplate
    metadata:
      name: default
    spec:
      amiSelector:
        aws-ids: "ami-08f2bf224b42c81da"
      subnetSelector:
        karpenter.sh/discovery: testkarpenter
      securityGroupSelector:
        karpenter.sh/discovery: testkarpenter

Karpenter workflow

Karpenter provisions capacity subject to several layers of constraints:

  • Cloud provider constraints: instance types, architectures, zones
  • Provisioner constraints
  • Pod (deployment) constraints
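As a sketch of how the layers intersect (names and zone values are illustrative): the Provisioner narrows the allowed zones, a pod's nodeSelector narrows them further, and Karpenter launches capacity satisfying the intersection — within whatever the cloud provider actually offers.

```yaml
# Provisioner layer: allow two zones (illustrative)
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: zoned                         # hypothetical name
spec:
  requirements:
    - key: topology.kubernetes.io/zone
      operator: In
      values: ["cn-north-1a", "cn-north-1b"]
  providerRef:
    name: default
---
# Pod layer: narrow to one of the allowed zones
apiVersion: v1
kind: Pod
metadata:
  name: pinned                        # hypothetical name
spec:
  nodeSelector:
    topology.kubernetes.io/zone: cn-north-1b
  containers:
    - name: app
      image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
      resources:
        requests:
          cpu: 1
```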

Node launch flow

  • Discover the pods that need capacity; the log includes the commit id:

        Found 15 provisionable pod(s) {"commit":"51becf8-dirty"}

  • Compute how many nodes the pods need; the expected pod count of 17 below shows that DaemonSet pods are included:

        Launching node with 15 pods requesting {"cpu":"15125m","pods":"17"} from types c3.4xlarge, c4.4xlarge, r3.4xlarge, m4.4xlarge, c5.4xlarge and 110 other(s) {"commit":"51becf8-dirty", "provisioner":"default"}

  • Create a launch template and launch the instance:

        Created launch template, Karpenter-testkarpenter-8870469202198737887 {"commit":"51becf8-dirty", "provisioner":"default"}
        Launched instance: i-09fxxxxxxxx14e, hostname: ip-192-168-39-191.cn-north-1.compute.internal, type: r5.4xlarge, zone: cn-north-1b, capacityType: spot {"commit":"51becf8-dirty", "provisioner":"default"}

    The userdata in the created launch template looks like:

        #!/bin/bash -xe
        exec > >(tee /var/log/user-data.log|logger -t user-data -s 2>/dev/console) 2>&1
        /etc/eks/bootstrap.sh 'testkarpenter' --apiserver-endpoint 'https://xxxxxxxx.sk1.cn-north-1.eks.amazonaws.com.cn' --b64-cluster-ca 'xxxxxxx' \
          --container-runtime containerd \
          --kubelet-extra-args '--node-labels=billing-team=my-team,karpenter.sh/capacity-type=on-demand,karpenter.sh/provisioner-name=banana --register-with-taints=example.com/special-taint=:NoSchedule'

  • After the pods are deleted and the node sits empty, a termination TTL is added:

        Added TTL to empty node {"commit":"51becf8-dirty", "node":"ip-192-168-39-191.cn-north-1.compute.internal"}

  • Karpenter uses a finalizer to handle node deletion: it cordons the node, drains all pods, then terminates the instance and deletes the node:

        Triggering termination after 30s for empty node {"commit":"51becf8-dirty", "node":"ip-192-168-39-191.cn-north-1.compute.internal"}
        Cordoned node {"commit":"51becf8-dirty", "node":"ip-192-168-39-191.cn-north-1.compute.internal"}
        Deleted node {"commit":"51becf8-dirty", "node":"ip-192-168-39-191.cn-north-1.compute.internal"}
        Deleted launch template Karpenter-testkarpenter-xx (lt-0aa5exxxxbcfe0) {"commit":"51becf8-dirty"}

Common errors

If no Provisioner has been created, the following error appears and the pod stays pending:

    Provisioning failed, creating scheduler, no provisioners found {"commit":"51becf8-dirty"}

If a Provisioner's configuration does not match the pod's scheduling requirements and no other usable Provisioner exists, the following incompatibility error appears:

    controller.provisioning Could not schedule pod, incompatible with provisioner "banana", did not tolerate example.com/special-taint=:NoSchedule {"commit":"51becf8-dirty", "pod":"default/condition-workload-64f8b97857-lxsjv"}

If capacity is insufficient at launch, the following error appears:

    Provisioning failed, launching node, creating cloud provider instance, with fleet error(s), InsufficientInstanceCapacity: We currently do not have sufficient p3.8xlarge capacity in the Availability Zone you requested (cn-north-1a). Our system will be working on provisioning additional capacity. You can currently get p3.8xlarge capacity by not specifying an Availability Zone in your request or choosing cn-north-1b.

When the default Provisioner is incompatible, the controller relaxes soft constraints and keeps trying to schedule the pod:

    Could not schedule pod, incompatible with provisioner "default", incompatible requirements, key karpenter.sh/provisioner-name, karpenter.sh/provisioner-name DoesNotExist not in karpenter.sh/provisioner-name In [default] {"commit":"51becf8-dirty", "pod":"karpenter/karpenter-76f776664b-5r9fz"}
    controller.provisioning Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.topologySpreadConstraints = {"maxSkew":1,"topologyKey":"topology.kubernetes.io/zone","whenUnsatisfiable":"ScheduleAnyway","labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"karpenter","app.kubernetes.io/name":"karpenter"}}} {"commit":"51becf8-dirty", "pod":"karpenter/karpenter-76f776664b-5r9fz"}

Reprinted from: https://blog.csdn.net/sinat_41567654/article/details/128662286
Copyright belongs to the original author, zhojiew. In case of infringement, please contact us for removal.
