

Flink on K8s Cluster Setup and StreamPark Platform Installation

1. Environment Preparation

1.1 Introduction

Flink and Spark share a great deal that can be abstracted and reused, from the programming model through launch configuration to operations management. StreamPark currently provides a one-stop stream-processing platform for Flink, supporting the full lifecycle of a streaming job from development to going live.
Spark support is on the roadmap but is not available yet.

1.2 Download

StreamPark package download: https://streampark.apache.org/download
StreamPark website: https://streampark.apache.org/docs/intro
The latest version is 2.1.2; this guide installs version 2.1.2.

1.3 Existing Components and Versions

| No. | Name                    | Specification                                                            |
|-----|-------------------------|--------------------------------------------------------------------------|
| 1   | K8s (TKE)               | CPU: 316, Mem: 361.57G, Storage: 349.1G, Pods: 3253, IP: 3*273 (v1.26.1) |
| 2   | NFS (CFS)               | 40G disk (current size, expandable)                                      |
| 3   | Harbor (image registry) | unknown                                                                  |
| 4   | Flink                   | 1.13.0 / 1.14.4 / 1.16.2                                                 |
| 5   | StreamPark              | 2.1.2                                                                    |
| 6   | MySQL                   | 5.7                                                                      |

2. Mounts and Permissions

2.1 K8s access

Ask the ops team to grant access, mainly kubectl permissions.

Download kubectl.
Credentials file: /root/.kube/config

2.2 NFS mount

Ask the ops team for the service IP.

    # Before mounting, make sure nfs-utils or nfs-common is installed
    sudo yum install nfs-utils

    # Create the target mount directory
    mkdir <target-mount-dir>
    mkdir /localfolder/

    # Mount the CFS root directory.
    # The exact command can be copied from the CFS console (File System Details -> Mount Point Details).
    # Some older file systems do not support the noresvport parameter, so prefer the command the console suggests.
    # With noresvport set, a new TCP port is used on network reconnection, so the connection between client
    # and file system survives transient network failures; enabling it is recommended.
    # Also, some older Linux kernels need vers=4: if mounting with vers=4.0 fails, try vers=4 instead.
    sudo mount -t nfs -o vers=4.0,noresvport <mount-point-IP>:/ <local-mount-dir>
    sudo mount -t nfs -o vers=4.0,noresvport 10.0.24.4:/ /localfolder

    # Mount a CFS subdirectory (same notes as above)
    sudo mount -t nfs -o vers=4.0,noresvport 10.0.24.4:/subfolder /localfolder

Tencent NFS (CFS) documentation: https://cloud.tencent.com/document/product/582/11523
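The mount steps above can be wrapped in a small idempotent helper. The sketch below is hypothetical (not from the CFS docs); the IP 10.0.24.4 and /localfolder are the placeholder values used in this section, and the helper skips the mount when the target is already a mount point.

```shell
# Hypothetical wrapper for the mount steps above; the mount IP and
# directories are the placeholder values used in this section.
nfs_mount_cmd() {
  # $1 = mount point IP, $2 = remote path ("/" for the CFS root), $3 = local dir
  echo "mount -t nfs -o vers=4.0,noresvport ${1}:${2} ${3}"
}

mount_cfs() {
  mkdir -p "$3"
  if mountpoint -q "$3"; then
    echo "already mounted: $3"   # idempotent: do nothing on re-run
  else
    sudo $(nfs_mount_cmd "$1" "$2" "$3")
  fi
}

# Usage: mount_cfs 10.0.24.4 / /localfolder
```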

2.3 Harbor

Ask the ops team to grant access.

Note: the project repository must be set to public to be usable.

3. Testing Flink

On K8s, create a dedicated namespace for Flink and the corresponding service account:

    kubectl create namespace flink
    kubectl create serviceaccount flink -n flink
    kubectl create clusterrolebinding flink-role-bind --clusterrole=edit --serviceaccount=flink:flink

On a node with K8s access, run a Flink session and submit a test job:

    bin/flink run \
      -e kubernetes-session \
      -Dkubernetes.namespace=flink \
      -Dkubernetes.rest-service.exposed.type=NodePort \
      -Dkubernetes.cluster-id=flink-cluster \
      -c WordCount1 \
      /data/package/jar/flink_test-1.0-SNAPSHOT.jar

    # Reference configuration for starting the session cluster
    bin/kubernetes-session.sh \
      -Dkubernetes.namespace=flink \
      -Dkubernetes.jobmanager.service-account=flink \
      -Dkubernetes.rest-service.exposed.type=NodePort \
      -Dkubernetes.cluster-id=flink-cluster \
      -Dkubernetes.jobmanager.cpu=0.2 \
      -Djobmanager.memory.process.size=1024m \
      -Dresourcemanager.taskmanager-timeout=3600000 \
      -Dkubernetes.taskmanager.cpu=0.2 \
      -Dtaskmanager.memory.process.size=1024m \
      -Dtaskmanager.numberOfTaskSlots=1
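Since the session start and the job submission share most of their `-D` flags, one might keep them in a single helper so the resource settings live in one place. This is a hypothetical sketch, not part of the Flink CLI; the values mirror the reference configuration above.

```shell
# Hypothetical helper that collects the -D flags shared by
# kubernetes-session.sh and `flink run` above; values are illustrative.
flink_session_args() {
  # $1 = namespace, $2 = cluster-id
  printf '%s ' \
    "-Dkubernetes.namespace=$1" \
    "-Dkubernetes.cluster-id=$2" \
    "-Dkubernetes.jobmanager.service-account=flink" \
    "-Dkubernetes.rest-service.exposed.type=NodePort" \
    "-Dkubernetes.jobmanager.cpu=0.2" \
    "-Djobmanager.memory.process.size=1024m" \
    "-Dresourcemanager.taskmanager-timeout=3600000" \
    "-Dkubernetes.taskmanager.cpu=0.2" \
    "-Dtaskmanager.memory.process.size=1024m" \
    "-Dtaskmanager.numberOfTaskSlots=1"
}

# Usage, on a node with kubectl access:
#   bin/kubernetes-session.sh $(flink_session_args flink flink-cluster)
```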

4. Installing StreamPark

4.1 Building the StreamPark (K8s) image

    # Build
    ./build
    # Note: configure a Maven mirror, or dependencies will not be found; npm must be installed as well.
    # Check that npm is available
    npm -v

    # Add the MySQL connector jar to the StreamPark package
    cp /data/module/streampart_2.12-2.1.2/lib/mysql-connector-java-8.0.30.jar lib/

    # Copy the Maven settings
    cp /data/module/maven-3.6.3/conf/settings.xml /data/module/docker/streampark-docker/

    # Edit application.yml
    profiles.active: mysql #[h2,pgsql,mysql]
    lark-url: https://open.feishu.cn
    workspace:
      local: /opt/streampark_workspace

    # Write application-mysql.yml
    tee /data/module/docker/streampark-docker/streampark-2.1.2/conf/application-mysql.yml <<-'EOF'
    spring.datasource.driver-class-name: com.mysql.cj.jdbc.Driver
    streampark.docker.http-client.docker-host: ${DOCKER_HOST:}
    streampark.maven.settings: ${MAVEN_SETTINGS_PATH:/root/.m2/settings.xml}
    streampark.workspace.local: ${WORKSPACE_PATH:/opt/streampark_workspace}
    EOF

    # Write the Dockerfile
    # (prepare kubectl, settings.xml, and config -- the kubectl credentials -- in advance)
    tee Dockerfile <<-'EOF'
    FROM flink:1.17.1-scala_2.12-java8
    WORKDIR /opt/streampark/
    ADD ./streampark-2.1.2/ /opt/streampark/
    ADD ./kubectl /opt/streampark/
    ADD ./settings.xml /root/.m2/
    USER root
    RUN sed -i -e 's/eval $NOHUP/eval/' bin/streampark.sh \
        && sed -i -e 's/>> "$APP_OUT" 2>&1 "&"//' bin/streampark.sh \
        && install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl \
        && mkdir -p ~/.kube
    WORKDIR /opt/streampark/
    ADD ./config /root/.kube/
    RUN chown -R flink /opt/streampark/
    EXPOSE 10000
    EOF

    # Build the image
    docker build -f Dockerfile -t apache/streampark-flink:2.1.2 .

    # Push the image to the registry
    docker tag apache/streampark-flink:2.1.2 storage/bigdata/streampark-flink:2.1.2
    docker push storage/bigdata/streampark-flink:2.1.2
    docker tag apache/streampark-flink:2.1.2-rc4 storage/bigdata/streampark-flink:2.1.2-rc4
    docker push storage/bigdata/streampark-flink:2.1.2-rc4
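The tag-and-push pair above repeats for every image version. A hypothetical helper can factor it out; the registry prefix "storage/bigdata" matches the Harbor project used in this guide.

```shell
# Hypothetical helper for the tag-and-push step above; the registry
# prefix matches the Harbor project used in this guide.
REGISTRY_PREFIX="storage/bigdata"

remote_name() {
  # Map a local image name (org/name:tag) to its registry name
  echo "${REGISTRY_PREFIX}/${1#*/}"
}

push_image() {
  local target
  target="$(remote_name "$1")"
  docker tag "$1" "$target" && docker push "$target"
}

# Usage: push_image apache/streampark-flink:2.1.2
```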

4.2 Deploying the MySQL pod

    # Create the mysql namespace and service account on K8s, then bind the role.
    # Syntax: kubectl create clusterrolebinding <binding-name> --clusterrole=<role> --serviceaccount=<namespace>:<service-account> -n <namespace>
    kubectl create namespace mysql
    kubectl create serviceaccount mysql -n mysql
    kubectl create clusterrolebinding mysql-role-bind --clusterrole=edit --serviceaccount=mysql:mysql -n mysql
    # output: clusterrolebinding.rbac.authorization.k8s.io/mysql-role-bind created

    # Inspect the role bindings
    kubectl get clusterrolebinding flink-role-bind -o yaml
    kubectl get clusterrolebinding mysql-role-bind -o yaml

    # Define the PV and PVC backed by NFS (Tencent CFS can be used directly)
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: data-mysql
    spec:
      accessModes:
        - ReadWriteMany
      capacity:
        storage: 10Gi
      csi:
        driver: com.tencent.cloud.csi.cfs
        volumeAttributes:
          host: xxx
          path: /data_mysql
          vers: "4"
        volumeHandle: cfs # must differ for every PV, otherwise mounting two PVCs fails
      persistentVolumeReclaimPolicy: Retain
      storageClassName: data-mysql
      volumeMode: Filesystem
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: data-mysql
      namespace: mysql
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 10Gi
      storageClassName: data-mysql

    # Tag the mysql image with docker and push it to the registry
    docker tag mysql:5.7 storage/bigdata/mysql:5.7
    docker push storage/bigdata/mysql:5.7

    # Check the cluster node IPs
    kubectl get node

    # Write the mysql yaml file and its configuration
    sudo mkdir -p /data/module/docker/mysql/{conf,data}/
    sudo tee /data/module/docker/mysql/conf/pod-db-mysql.yaml <<-'EOF'
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: conf-mysql
      namespace: mysql
    data:
      mysql.cnf: |
        [mysqld]
        # Unique ID of this MySQL server; every instance needs its own
        server-id=1
        # IP range allowed to connect
        bind-address=0.0.0.0
        # Time zone
        default-time_zone='+8:00'
        # Default character set; utf8mb4 also covers emoji (4 bytes per character)
        character-set-server=utf8mb4
        # Collation; must match character-set-server
        collation-server=utf8mb4_general_ci
        # Character set for client connections, to avoid mojibake
        init_connect='SET NAMES utf8mb4'
        # Case sensitivity of table names; 1 = case-insensitive
        lower_case_table_names=1
        # Maximum number of connections
        max_connections=400
        # Maximum number of failed connections
        max_connect_errors=1000
        # Allow NULL in TIMESTAMP columns not declared NOT NULL
        explicit_defaults_for_timestamp=true
        # Maximum SQL packet size; consider 1G when BLOBs are involved
        max_allowed_packet=128M
        # Idle connections are closed after this many seconds
        # (the default wait_timeout is 8 hours; interactive_timeout must be set together with it)
        interactive_timeout=3600
        wait_timeout=3600
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-db-mysql
      namespace: mysql
    spec:
      #serviceAccount: mysql
      nodeName: xxx
      hostNetwork: true # visible on the host network (occupies node ports)
      containers:
        - name: mysql-k8s
          image: storage/bigdata/mysql:5.7
          env:
            - name: TZ
              value: "Asia/Shanghai"
            - name: LANG
              value: "zh_CN.UTF-8"
            - name: MYSQL_ROOT_PASSWORD
              value: "xxxx"
          ports:
            #- containerPort: 3306
          volumeMounts:
            - mountPath: /var/lib/mysql
              subPath: mysql
              name: data-mysql
            - mountPath: /etc/mysql/conf.d
              name: conf-volume
              readOnly: true
      volumes:
        - name: data-mysql
          persistentVolumeClaim:
            claimName: data-mysql
        - name: conf-volume
          configMap:
            name: conf-mysql
    EOF

    # Start the pod (kubectl delete -f ... removes it again)
    kubectl apply -f /data/module/docker/mysql/conf/pod-db-mysql.yaml
    #kubectl delete -f /data/module/docker/mysql/conf/pod-db-mysql.yaml

    # Wait a moment, then check
    kubectl get pod -A -o wide
    kubectl describe pod pod-db-mysql -n mysql
    kubectl logs --tail=100 -f pod-db-mysql -n mysql
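Rather than waiting "a moment" and eyeballing the pod list, the pod can be polled until it reports Ready. A hypothetical sketch; the pod and namespace names follow the manifest above.

```shell
# Hypothetical readiness check; pod/namespace names follow the manifest above.
wait_ready_cmd() {
  # $1 = pod name, $2 = namespace, $3 = timeout in seconds
  echo "kubectl wait --for=condition=Ready pod/$1 -n $2 --timeout=${3}s"
}

# Usage, on the admin node:
#   $(wait_ready_cmd pod-db-mysql mysql 300) && kubectl logs --tail=100 pod-db-mysql -n mysql
```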

4.3 Initializing the MySQL database

    # Copy the database scripts onto the NFS volume so they are visible inside the pod
    cp -r /data/software/incubator-streampark-2.1.2-rc3/streampark-console/streampark-console-service/src/main/assembly/script/ /localnfs/data_mysql/mysql/streampark-sql/
    cp -r /data/module/docker/streampark-docker/streampark-2.2.0/script/ /localnfs/data_mysql/mysql/streampark-sql/

    # Create the user and database: enter the mysql container
    kubectl exec -n mysql -it pod-db-mysql -- bash
    #------------------------ inside the mysql container ------------------------
    mysql -uroot -proot
    create database if not exists `streampark` character set utf8mb4 collate utf8mb4_general_ci;
    create user 'xxxx'@'%' IDENTIFIED WITH mysql_native_password by 'xxx';
    grant ALL PRIVILEGES ON streampark.* to 'xxxx'@'%';
    flush privileges;
    -- Import the data files
    use streampark;
    source /var/lib/mysql/streampark-sql/schema/mysql-schema.sql
    source /var/lib/mysql/streampark-sql/data/mysql-data.sql
    -- Leave mysql
    quit
    #------------------------ leave the mysql container ------------------------
    exit

    # Review the schema and seed-data scripts
    vim /data/module/docker/streampark-docker/streampark-2.2.0/script/schema/mysql-schema.sql
    vim /data/module/docker/streampark-docker/streampark-2.2.0/script/data/mysql-data.sql
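A quick sanity check after the import is to count the tables that landed in the streampark database. This is a hypothetical sketch; the pod, namespace, and database names follow the steps above, and the root password is a placeholder.

```shell
# Hypothetical post-import check: a non-zero table count means
# mysql-schema.sql was applied to the streampark database.
COUNT_SQL="SELECT COUNT(*) FROM information_schema.tables WHERE table_schema='streampark';"

# On the admin node (password is a placeholder):
#   kubectl exec -n mysql pod-db-mysql -- mysql -uroot -proot -N -e "$COUNT_SQL"
```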

4.4 Creating the StreamPark pod

    # Create the streampark namespace and service account on K8s, then bind the role.
    # Syntax: kubectl create clusterrolebinding <binding-name> --clusterrole=<role> --serviceaccount=<namespace>:<service-account> -n <namespace>
    kubectl create namespace streampark
    kubectl create serviceaccount streampark -n streampark
    kubectl create clusterrolebinding streampark-role-bind --clusterrole=edit --serviceaccount=streampark:streampark -n streampark
    # output: clusterrolebinding.rbac.authorization.k8s.io/streampark-role-bind created

    # Define the PV and PVC backed by NFS
    vim pv-pvc-streampark.yaml
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: data-streampark
    spec:
      accessModes:
        - ReadWriteMany
      capacity:
        storage: 10Gi
      csi:
        driver: com.tencent.cloud.csi.cfs
        volumeAttributes:
          host: xxx
          path: /data_streampark
          vers: "4"
        volumeHandle: cfs # must differ for every PV, otherwise mounting two PVCs fails
      persistentVolumeReclaimPolicy: Retain
      storageClassName: data-streampark
      volumeMode: Filesystem
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: data-streampark
      namespace: streampark
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 10Gi
      storageClassName: data-streampark

    # Apply (kubectl delete -f ... removes it again)
    kubectl apply -f /data/module/docker/k8s/pv-pvc-streampark.yaml
    #kubectl delete -f /data/module/docker/k8s/pv-pvc-streampark.yaml

    # Run on node xxx to install StreamPark on that specific node
    sudo tee /data/pod-app-streampark.yaml <<-'EOF'
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: pod-app-streampark
      name: pod-app-streampark
      namespace: streampark
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: pod-app-streampark
      template:
        metadata:
          labels:
            app: pod-app-streampark
        spec:
          nodeName: xxx
          hostNetwork: true # visible on the host network (occupies node ports)
          containers:
            - name: streampark
              image: storage/bigdata/streampark-flink:2.1.2-rc4
              imagePullPolicy: Always
              env:
                - name: TZ
                  value: "Asia/Shanghai"
                - name: LANG
                  value: "zh_CN.UTF-8"
                - name: SPRING_PROFILES_ACTIVE
                  value: "mysql"
                - name: SPRING_DATASOURCE_URL
                  value: "jdbc:mysql://xxx:3306/streampark?useSSL=false&useUnicode=true&characterEncoding=UTF-8&allowPublicKeyRetrieval=false&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8"
                - name: SPRING_DATASOURCE_USERNAME
                  value: "xxxx"
                - name: SPRING_DATASOURCE_PASSWORD
                  value: "xxxx"
                - name: DOCKER_HOST
                  value: "tcp://xxx:2375"
                - name: DEBUG_OPTS # debug port settings
                  value: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:10001"
              ports:
                #- containerPort: 10000
              volumeMounts:
                - mountPath: /root/.kube
                  subPath: .kube
                  name: conf-volume
                - mountPath: /opt/streampark_workspace
                  subPath: streampark_workspace
                  name: data-volume
              command: ["sh","-c","bash bin/startup.sh debug"]
          volumes:
            - name: conf-volume
              hostPath:
                path: /root
                type: DirectoryOrCreate
            - name: data-volume
              nfs:
                path: /data_streampark
                server: xxx
    EOF

    # Start the pod (kubectl delete -f ... removes it again)
    kubectl apply -f /data/pod-app-streampark.yaml
    #kubectl delete -f /data/pod-app-streampark.yaml

    # Wait a moment, then check (the Deployment's pod name carries a generated suffix)
    kubectl get pod -n streampark -o wide
    kubectl describe pod <pod-app-streampark-xxx> -n streampark
    kubectl logs --tail=1000 -f <pod-app-streampark-xxx> -n streampark

    # Enter the streampark container
    kubectl exec -n streampark -it <pod-app-streampark-xxx> -- bash

    # Grant permissions to the default namespace
    # (e.g. kubectl create clusterrolebinding flink-role-binding-default --clusterrole=edit --serviceaccount=flink_dev:default)
    kubectl create clusterrolebinding flink-role-binding-default --clusterrole=edit --serviceaccount=default:default
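Once the pod is running, a simple smoke test is to poll the StreamPark web port (10000; with hostNetwork it listens on the node itself) until it answers. A hypothetical sketch; the node IP is a placeholder.

```shell
# Hypothetical smoke test: poll a URL until it responds or attempts run out.
wait_http() {
  # $1 = url, $2 = max attempts (5s apart)
  i=1
  while [ "$i" -le "$2" ]; do
    if curl -sf --max-time 5 -o /dev/null "$1"; then
      echo up
      return 0
    fi
    [ "$i" -lt "$2" ] && sleep 5   # no sleep after the last attempt
    i=$((i + 1))
  done
  echo down
  return 1
}

# Usage: wait_http "http://<node-ip>:10000" 60   # then log in to the web UI
```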

Reposted from: https://blog.csdn.net/Brother_ning/article/details/135228630
Copyright belongs to the original author, tuoluzhe8521; in case of infringement, please contact us for removal.
