

Configuring YARN Log Aggregation and History Logs for Flink

For a YARN application that has already finished, the Flink processes have exited and can no longer serve the web UI. So you need the JobHistoryServer to view the logs that YARN has kept. Below I'll share my experience with this configuration.

1. Configure log aggregation in YARN

Edit: yarn-site.xml

Note: once enabled, logs are uploaded to HDFS only after the job has finished.

Query: yarn logs -applicationId application_1546250639760_0055

Configuration:

    <!--
      2022-04-02: enable log aggregation - begin
      Note: once enabled, logs are uploaded to HDFS only after the job has finished.
      Query command: yarn logs -applicationId application_1546250639760_0055
    -->
    <property>
      <name>yarn.log-aggregation.retain-seconds</name>
      <value>10080</value>
      <description>How long aggregated logs are kept, in seconds</description>
    </property>
    <property>
      <name>yarn.log-aggregation-enable</name>
      <value>true</value>
      <description>Whether log aggregation is enabled</description>
    </property>
    <property>
      <name>yarn.nodemanager.remote-app-log-dir</name>
      <value>/yarn</value>
      <description>HDFS directory that logs are moved to once an application finishes (effective only when log aggregation is enabled); this is what makes the job logs viewable afterwards</description>
    </property>
    <property>
      <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
      <value>logs</value>
      <description>Sub-directory name appended under the remote log directory (effective only when log aggregation is enabled)</description>
    </property>
    <!-- 2022-04-02: enable log aggregation - end -->
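
With these properties in place (and the ResourceManager/NodeManagers restarted), aggregation can be verified directly against HDFS and the yarn CLI. A minimal sketch, assuming the user and application id from the experiment below; the -log_files option is available in recent Hadoop releases:

    # Aggregated logs land under <remote-app-log-dir>/<user>/<remote-app-log-dir-suffix>/<applicationId>,
    # i.e. /yarn/hdfs/logs/... with the configuration above.
    hdfs dfs -ls /yarn/hdfs/logs/application_1648877577075_0001

    # Pull all aggregated logs for the application
    yarn logs -applicationId application_1648877577075_0001

    # Or only a single log file, e.g. stdout
    yarn logs -applicationId application_1648877577075_0001 -log_files stdout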
Experiment 1: Hadoop's built-in wordcount example.
    # Word count
    # 1. Create a file with vim and write a few lines of text in it
    # 2. Put it onto HDFS (here: /test1)
    # 3. Run the job:
    hadoop jar \
      /usr/local/BigData/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount \
      /test1 \
      /test2/o2

Result: everything works as expected.
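
To sanity-check the job output itself, the reducer output can be read back from HDFS; with a single reducer the default output file is part-r-00000:

    hdfs dfs -ls /test2/o2
    hdfs dfs -cat /test2/o2/part-r-00000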

1. Before the configuration: running yarn logs xxxxxxxx shows none of the logs produced during the run.

1.1 Run:

    [hdfs@bigdata1 hadoop]$ hadoop jar /usr/local/BigData/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount /test1 /test2/o3
    2022-04-02 01:33:47,691 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
    2022-04-02 01:33:48,229 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hdfs/.staging/job_1648877577075_0001
    2022-04-02 01:33:48,445 INFO input.FileInputFormat: Total input files to process : 1
    2022-04-02 01:33:48,519 INFO mapreduce.JobSubmitter: number of splits:1
    2022-04-02 01:33:48,556 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
    2022-04-02 01:33:48,659 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648877577075_0001
    2022-04-02 01:33:48,661 INFO mapreduce.JobSubmitter: Executing with tokens: []
    2022-04-02 01:33:48,843 INFO conf.Configuration: resource-types.xml not found
    2022-04-02 01:33:48,844 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
    2022-04-02 01:33:49,261 INFO impl.YarnClientImpl: Submitted application application_1648877577075_0001
    2022-04-02 01:33:49,300 INFO mapreduce.Job: The url to track the job: http://bigdata1:8088/proxy/application_1648877577075_0001/
    2022-04-02 01:33:49,300 INFO mapreduce.Job: Running job: job_1648877577075_0001
    2022-04-02 01:33:56,416 INFO mapreduce.Job: Job job_1648877577075_0001 running in uber mode : false
    2022-04-02 01:33:56,417 INFO mapreduce.Job:  map 0% reduce 0%
    2022-04-02 01:34:02,490 INFO mapreduce.Job:  map 100% reduce 0%
    2022-04-02 01:34:08,529 INFO mapreduce.Job:  map 100% reduce 100%
    2022-04-02 01:34:08,540 INFO mapreduce.Job: Job job_1648877577075_0001 completed successfully
    2022-04-02 01:34:08,633 INFO mapreduce.Job: Counters: 53
        File System Counters
            FILE: Number of bytes read=1843
            FILE: Number of bytes written=417739
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=2071
            HDFS: Number of bytes written=1386
            HDFS: Number of read operations=8
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=2
        Job Counters
            Launched map tasks=1
            Launched reduce tasks=1
            Rack-local map tasks=1
            Total time spent by all maps in occupied slots (ms)=13196
            Total time spent by all reduces in occupied slots (ms)=28888
            Total time spent by all map tasks (ms)=3299
            Total time spent by all reduce tasks (ms)=3611
            Total vcore-milliseconds taken by all map tasks=3299
            Total vcore-milliseconds taken by all reduce tasks=3611
            Total megabyte-milliseconds taken by all map tasks=13512704
            Total megabyte-milliseconds taken by all reduce tasks=29581312
        Map-Reduce Framework
            Map input records=50
            Map output records=167
            Map output bytes=2346
            Map output materialized bytes=1843
            Input split bytes=97
            Combine input records=167
            Combine output records=113
            Reduce input groups=113
            Reduce shuffle bytes=1843
            Reduce input records=113
            Reduce output records=113
            Spilled Records=226
            Shuffled Maps =1
            Failed Shuffles=0
            Merged Map outputs=1
            GC time elapsed (ms)=100
            CPU time spent (ms)=1440
            Physical memory (bytes) snapshot=557318144
            Virtual memory (bytes) snapshot=13850431488
            Total committed heap usage (bytes)=390070272
            Peak Map Physical memory (bytes)=331268096
            Peak Map Virtual memory (bytes)=5254959104
            Peak Reduce Physical memory (bytes)=226050048
            Peak Reduce Virtual memory (bytes)=8595472384
        Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
        File Input Format Counters
            Bytes Read=1974
        File Output Format Counters
            Bytes Written=1386

1.2 Run the query command:

    [hdfs@bigdata1 hadoop]$ yarn logs -applicationId application_1648877577075_0001
    2022-04-02 01:40:46,605 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
    File /yarn/hdfs/logs/application_1648877577075_0001 does not exist.
    Can not find any log file matching the pattern: [ALL] for the application: application_1648877577075_0001
    Can not find the logs for the application: application_1648877577075_0001 with the appOwner: hdfs
    [hdfs@bigdata1 hadoop]$ yarn logs -applicationId application_1648877577075_0002
    2022-04-02 01:40:57,983 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
    Unable to get ApplicationState. Attempting to fetch logs directly from the filesystem.
    Can not find the appOwner. Please specify the correct appOwner
    Could not locate application logs for application_1648877577075_0002

2. After the configuration: the complete run log is visible.

2.1 Run:

    [hdfs@bigdata1 hadoop]$ hadoop jar /usr/local/BigData/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount /test1 /test2/a1
    2022-04-02 02:25:09,179 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
    2022-04-02 02:25:09,718 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hdfs/.staging/job_1648879195625_0002
    2022-04-02 02:25:09,936 INFO input.FileInputFormat: Total input files to process : 1
    2022-04-02 02:25:10,009 INFO mapreduce.JobSubmitter: number of splits:1
    2022-04-02 02:25:10,043 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
    2022-04-02 02:25:10,144 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648879195625_0002
    2022-04-02 02:25:10,145 INFO mapreduce.JobSubmitter: Executing with tokens: []
    2022-04-02 02:25:10,325 INFO conf.Configuration: resource-types.xml not found
    2022-04-02 02:25:10,325 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
    2022-04-02 02:25:10,380 INFO impl.YarnClientImpl: Submitted application application_1648879195625_0002
    2022-04-02 02:25:10,417 INFO mapreduce.Job: The url to track the job: http://bigdata1:8088/proxy/application_1648879195625_0002/
    2022-04-02 02:25:10,417 INFO mapreduce.Job: Running job: job_1648879195625_0002
    2022-04-02 02:25:17,508 INFO mapreduce.Job: Job job_1648879195625_0002 running in uber mode : false
    2022-04-02 02:25:17,509 INFO mapreduce.Job:  map 0% reduce 0%
    2022-04-02 02:25:23,575 INFO mapreduce.Job:  map 100% reduce 0%
    2022-04-02 02:25:28,607 INFO mapreduce.Job:  map 100% reduce 100%
    2022-04-02 02:25:28,616 INFO mapreduce.Job: Job job_1648879195625_0002 completed successfully
    2022-04-02 02:25:28,707 INFO mapreduce.Job: Counters: 53
        File System Counters
            FILE: Number of bytes read=1843
            FILE: Number of bytes written=417711
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=2071
            HDFS: Number of bytes written=1386
            HDFS: Number of read operations=8
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=2
        Job Counters
            Launched map tasks=1
            Launched reduce tasks=1
            Data-local map tasks=1
            Total time spent by all maps in occupied slots (ms)=12876
            Total time spent by all reduces in occupied slots (ms)=21016
            Total time spent by all map tasks (ms)=3219
            Total time spent by all reduce tasks (ms)=2627
            Total vcore-milliseconds taken by all map tasks=3219
            Total vcore-milliseconds taken by all reduce tasks=2627
            Total megabyte-milliseconds taken by all map tasks=13185024
            Total megabyte-milliseconds taken by all reduce tasks=21520384
        Map-Reduce Framework
            Map input records=50
            Map output records=167
            Map output bytes=2346
            Map output materialized bytes=1843
            Input split bytes=97
            Combine input records=167
            Combine output records=113
            Reduce input groups=113
            Reduce shuffle bytes=1843
            Reduce input records=113
            Reduce output records=113
            Spilled Records=226
            Shuffled Maps =1
            Failed Shuffles=0
            Merged Map outputs=1
            GC time elapsed (ms)=97
            CPU time spent (ms)=1360
            Physical memory (bytes) snapshot=552706048
            Virtual memory (bytes) snapshot=13832036352
            Total committed heap usage (bytes)=391643136
            Peak Map Physical memory (bytes)=329871360
            Peak Map Virtual memory (bytes)=5243228160
            Peak Reduce Physical memory (bytes)=222834688
            Peak Reduce Virtual memory (bytes)=8588808192
        Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
        File Input Format Counters
            Bytes Read=1974
        File Output Format Counters
            Bytes Written=1386

2.2 Run the query command:

    ..................
    (a very large amount of complete log output)

Experiment summary:

  1) Once this is configured, yarn logs -applicationId application_xxxxxxx returns rich log content.
  2) But the logs link in the 8088 web UI can no longer be used. Why?
     Because no history server has been configured yet; see the next section.
Experiment 2: logs when running Flink in on-YARN mode.

  Experiment setup:
    A Flink job consumes from Kafka and emits every record with print().
  Job configuration:
    Flink parallelism: 1
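
For reference, submitting such a job with the Flink CLI's (legacy) yarn-cluster mode would look roughly like the sketch below; the jar path and entry class are placeholders, not taken from the original post:

    # Submit the Kafka-consuming job to YARN with parallelism 1
    # (jar path and main class are hypothetical examples)
    flink run -m yarn-cluster -p 1 \
      -c com.example.KafkaToPrintJob \
      /path/to/kafka-print-job.jar

    # Once the application has finished or been cancelled, its logs can only be
    # read back through log aggregation / the history server:
    yarn logs -applicationId application_xxxxxxxxxxxxx_xxxx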

First, start the job. You can see that the job's container is running on cm2.

[screenshot]

Next you can see some of this job's logs, both the JobManager's and the TaskManager's. The TaskManager log is what shows up under Stdout, and refreshing the page shows the file size growing.

[screenshot]

The container log location configured in YARN then tells us where these logs are stored. As the screenshot below shows, they sit under /yarn/container-logs on each node.

[screenshot]

Let's take a look at /yarn/container-logs on cm2.

[screenshot]

This is exactly the same as what the Flink web console shows.

[screenshot]

Once the program finishes or is cancelled, the container logs are deleted automatically, which raises question 1 below. (The aggregated logs, however, are present in HDFS as expected.)

[screenshot]

Question 1: isn't the setting below supposed to control how long logs are kept on the node the container ran on?

[screenshot]

Question 2: what if the job runs with a parallelism greater than 1?

  With multiple parallel subtasks, the "full log" is simply the logs of several containers merged together, which is exactly what log aggregation produces. Individual container logs can still be fetched one by one, as sketched below.
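
A sketch of inspecting per-container logs with the yarn CLI (the application and container ids are placeholders; -show_container_log_info is available in Hadoop 2.9+/3.x):

    # List the containers (and their log files) belonging to the application
    yarn logs -applicationId application_xxxxxxxxxxxxx_xxxx -show_container_log_info

    # Fetch the log of a single JobManager/TaskManager container
    yarn logs -applicationId application_xxxxxxxxxxxxx_xxxx \
      -containerId container_xxxxxxxxxxxxx_xxxx_01_000002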
2.2 Log aggregation on CDH (configured out of the box)

[screenshots]

  yarn.nodemanager.log-dirs is the local path on each NodeManager where the logs produced by its containers are written.
  Log aggregation then collects the scattered logs of a single job from those directories into HDFS.
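
In a plain yarn-site.xml, the split between the local and the aggregated location looks roughly like this; the local path matches the /yarn/container-logs directory seen in the screenshots, and /yarn is the remote directory from section 1:

    <!-- Local directory on each NodeManager where running containers write their logs -->
    <property>
      <name>yarn.nodemanager.log-dirs</name>
      <value>/yarn/container-logs</value>
    </property>
    <!-- HDFS directory that finished container logs are aggregated into -->
    <property>
      <name>yarn.nodemanager.remote-app-log-dir</name>
      <value>/yarn</value>
    </property>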

[screenshot]

2. Configure the YARN history server (history logs)

  References:
  https://blog.csdn.net/qq_38038143/article/details/88641288
  https://blog.csdn.net/qq_35440040/article/details/84233655?utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2~aggregatepage~first_rank_ecpm_v1~rank_v31_ecpm-1-84233655.pc_agg_new_rank&utm_term=yarn%E5%8E%86%E5%8F%B2%E6%97%A5%E5%BF%97&spm=1000.2123.3001.4430
  https://blog.csdn.net/duyenson/article/details/118994693
  https://www.cnblogs.com/zwgblog/p/6079361.html

Configuration 1: mapred-site.xml

    <property>
      <name>mapreduce.jobhistory.address</name>
      <value>master:10020</value>
    </property>
    <property>
      <name>mapreduce.jobhistory.webapp.address</name>
      <value>master:19888</value>
    </property>

Configuration 2: yarn-site.xml

    <!-- Spark / YARN -->
    <!-- Whether to enable log aggregation -->
    <property>
      <name>yarn.log-aggregation-enable</name>
      <value>true</value>
    </property>
    <!-- Address of the log server, used by the worker nodes -->
    <property>
      <name>yarn.log.server.url</name>
      <value>http://master:19888/jobhistory/logs/</value>
    </property>
    <!-- Log retention time, in seconds -->
    <property>
      <name>yarn.log-aggregation.retain-seconds</name>
      <value>86400</value>
    </property>

Distribute the configuration files to all nodes.
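
A minimal distribution-and-restart sketch, assuming the other nodes are reachable as bigdata2 and bigdata3 (placeholder host names) and HADOOP_HOME points at the install directory:

    # Copy the changed configuration files to the other nodes (host names are placeholders)
    for host in bigdata2 bigdata3; do
      scp "$HADOOP_HOME"/etc/hadoop/{mapred-site.xml,yarn-site.xml} "$host":"$HADOOP_HOME"/etc/hadoop/
    done

    # Restart YARN so the new settings take effect
    stop-yarn.sh
    start-yarn.sh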

Start it: mr-jobhistory-daemon.sh start historyserver
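
mr-jobhistory-daemon.sh still works on Hadoop 3.x but is marked deprecated; the newer equivalent is:

    mapred --daemon start historyserver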

jps
[screenshot]

View the logs: on the 8088 web UI, click the application id, then click logs.

3. YARN history logs, plus: the Timeline Service

The configuration below came from a screenshot an expert shared in a chat. He was on Hadoop 3.1.3; I was on 3.0.0 at the time and could not get it to work. Something to revisit later.

yarn-site.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>bigdata1</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>bigdata1:8088</value>
      </property>
      <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
      </property>
      <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>512</value>
      </property>
      <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>4096</value>
      </property>
      <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>30720</value>
      </property>
      <property>
        <name>yarn.scheduler.minimum-allocation-vcores</name>
        <value>1</value>
      </property>
      <property>
        <name>yarn.scheduler.maximum-allocation-vcores</name>
        <value>4</value>
      </property>
      <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>12</value>
      </property>
      <property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>false</value>
      </property>
      <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
      </property>
      <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>86400</value>
      </property>
      <property>
        <name>yarn.timeline-service.enabled</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.timeline-service.hostname</name>
        <value>${yarn.resourcemanager.hostname}</value>
      </property>
      <property>
        <name>yarn.timeline-service.address</name>
        <value>${yarn.timeline-service.hostname}:10020</value>
      </property>
      <property>
        <name>yarn.timeline-service.webapp.address</name>
        <value>${yarn.timeline-service.hostname}:8188</value>
      </property>
      <property>
        <name>yarn.log.server.url</name>
        <value>http://${yarn.timeline-service.webapp.address}/applicationhistory/logs</value>
      </property>
      <property>
        <name>yarn.timeline-service.ttl-enable</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.timeline-service.ttl-ms</name>
        <value>86400000</value>
      </property>
      <property>
        <name>yarn.timeline-service.http-cross-origin.enabled</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.timeline-service.generic-application-history.enabled</name>
        <value>true</value>
      </property>
    </configuration>

mapred-site.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.job.emit-timeline-data</name>
        <value>true</value>
      </property>
    </configuration>
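
After distributing these two files, the Timeline Server daemon still has to be started and the ResourceManager restarted so it picks up the publisher settings. On Hadoop 3.x that would look roughly like the following (host name and port follow the yarn-site.xml above):

    # Start the YARN Timeline Server (shows up in jps as ApplicationHistoryServer)
    yarn --daemon start timelineserver

    # Quick checks: the daemon is running and the web UI answers on the configured port
    jps
    curl http://bigdata1:8188/applicationhistory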

And with that, all that's left is to shout: "elegant~".

Tags: big data, Flink, YARN

Reposted from: https://blog.csdn.net/myself_ning/article/details/125520608
Copyright belongs to the original author, 大宁哥. If there is any infringement, please contact us and it will be removed.
