Configuring YARN Log Aggregation and History Logs for Flink

Once a YARN application has finished, the Flink process has exited and can no longer serve its web UI, so you have to go through the JobHistoryServer to view the logs retained on YARN.
Below I'll share my experience with this configuration.

1. Configure log aggregation in YARN

Edit: yarn-site.xml

Note: once this is enabled, logs are uploaded to HDFS only after a task has finished.

Query: yarn logs -applicationId application_1546250639760_0055

Configuration:

<!--
    2022-04-02: enable log aggregation (start)
    Note: once enabled, logs are uploaded to HDFS only after a task has finished.
    Query command: yarn logs -applicationId application_1546250639760_0055
-->
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>10080</value>
    <description>How long to retain aggregated logs, in seconds</description>
</property>
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
    <description>Whether to enable log aggregation</description>
</property>
<property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/yarn</value>
    <description>HDFS directory the logs are moved to once the application finishes (effective only when log aggregation is enabled); the job's run logs can then be viewed through the ApplicationMaster UI</description>
</property>
<property>
    <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
    <value>logs</value>
    <description>Suffix for the remote log directory (effective only when log aggregation is enabled)</description>
</property>
<!-- 2022-04-02: enable log aggregation (end) -->
Experiment 1: Hadoop's built-in wordcount.

# Word count
# 1. Create a file with vim and write some arbitrary text into it
# 2. put it onto HDFS
# 3. Run the command
hadoop jar \
/usr/local/BigData/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount \
/test1 \
/test2/o2

Result: works fine.

1. Before the configuration: running yarn logs xxxxxxxx shows none of the logs produced by the run.

1.1 Run:

[hdfs@bigdata1 hadoop]$ hadoop jar /usr/local/BigData/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount /test1 /test2/o3
2022-04-02 01:33:47,691 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
2022-04-02 01:33:48,229 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hdfs/.staging/job_1648877577075_0001
2022-04-02 01:33:48,445 INFO input.FileInputFormat: Total input files to process : 1
2022-04-02 01:33:48,519 INFO mapreduce.JobSubmitter: number of splits:1
2022-04-02 01:33:48,556 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2022-04-02 01:33:48,659 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648877577075_0001
2022-04-02 01:33:48,661 INFO mapreduce.JobSubmitter: Executing with tokens: []
2022-04-02 01:33:48,843 INFO conf.Configuration: resource-types.xml not found
2022-04-02 01:33:48,844 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2022-04-02 01:33:49,261 INFO impl.YarnClientImpl: Submitted application application_1648877577075_0001
2022-04-02 01:33:49,300 INFO mapreduce.Job: The url to track the job: http://bigdata1:8088/proxy/application_1648877577075_0001/
2022-04-02 01:33:49,300 INFO mapreduce.Job: Running job: job_1648877577075_0001
2022-04-02 01:33:56,416 INFO mapreduce.Job: Job job_1648877577075_0001 running in uber mode : false
2022-04-02 01:33:56,417 INFO mapreduce.Job:  map 0% reduce 0%
2022-04-02 01:34:02,490 INFO mapreduce.Job:  map 100% reduce 0%
2022-04-02 01:34:08,529 INFO mapreduce.Job:  map 100% reduce 100%
2022-04-02 01:34:08,540 INFO mapreduce.Job: Job job_1648877577075_0001 completed successfully
2022-04-02 01:34:08,633 INFO mapreduce.Job: Counters: 53
        File System Counters
                FILE: Number of bytes read=1843
                FILE: Number of bytes written=417739
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=2071
                HDFS: Number of bytes written=1386
                HDFS: Number of read operations=8
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Rack-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=13196
                Total time spent by all reduces in occupied slots (ms)=28888
                Total time spent by all map tasks (ms)=3299
                Total time spent by all reduce tasks (ms)=3611
                Total vcore-milliseconds taken by all map tasks=3299
                Total vcore-milliseconds taken by all reduce tasks=3611
                Total megabyte-milliseconds taken by all map tasks=13512704
                Total megabyte-milliseconds taken by all reduce tasks=29581312
        Map-Reduce Framework
                Map input records=50
                Map output records=167
                Map output bytes=2346
                Map output materialized bytes=1843
                Input split bytes=97
                Combine input records=167
                Combine output records=113
                Reduce input groups=113
                Reduce shuffle bytes=1843
                Reduce input records=113
                Reduce output records=113
                Spilled Records=226
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=100
                CPU time spent (ms)=1440
                Physical memory (bytes) snapshot=557318144
                Virtual memory (bytes) snapshot=13850431488
                Total committed heap usage (bytes)=390070272
                Peak Map Physical memory (bytes)=331268096
                Peak Map Virtual memory (bytes)=5254959104
                Peak Reduce Physical memory (bytes)=226050048
                Peak Reduce Virtual memory (bytes)=8595472384
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=1974
        File Output Format Counters
                Bytes Written=1386

1.2 Run the query command:

[hdfs@bigdata1 hadoop]$ yarn logs -applicationId application_1648877577075_0001
2022-04-02 01:40:46,605 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
File /yarn/hdfs/logs/application_1648877577075_0001 does not exist.

Can not find any log file matching the pattern: [ALL] for the application: application_1648877577075_0001
Can not find the logs for the application: application_1648877577075_0001 with the appOwner: hdfs
[hdfs@bigdata1 hadoop]$ yarn logs -applicationId application_1648877577075_0002
2022-04-02 01:40:57,983 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
Unable to get ApplicationState. Attempting to fetch logs directly from the filesystem.
Can not find the appOwner. Please specify the correct appOwner
Could not locate application logs for application_1648877577075_0002

2. After the configuration: the complete run logs are visible.

2.1 Run:

[hdfs@bigdata1 hadoop]$ hadoop jar  /usr/local/BigData/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount /test1 /test2/a1
2022-04-02 02:25:09,179 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
2022-04-02 02:25:09,718 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hdfs/.staging/job_1648879195625_0002
2022-04-02 02:25:09,936 INFO input.FileInputFormat: Total input files to process : 1
2022-04-02 02:25:10,009 INFO mapreduce.JobSubmitter: number of splits:1
2022-04-02 02:25:10,043 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2022-04-02 02:25:10,144 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648879195625_0002
2022-04-02 02:25:10,145 INFO mapreduce.JobSubmitter: Executing with tokens: []
2022-04-02 02:25:10,325 INFO conf.Configuration: resource-types.xml not found
2022-04-02 02:25:10,325 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2022-04-02 02:25:10,380 INFO impl.YarnClientImpl: Submitted application application_1648879195625_0002
2022-04-02 02:25:10,417 INFO mapreduce.Job: The url to track the job: http://bigdata1:8088/proxy/application_1648879195625_0002/
2022-04-02 02:25:10,417 INFO mapreduce.Job: Running job: job_1648879195625_0002
2022-04-02 02:25:17,508 INFO mapreduce.Job: Job job_1648879195625_0002 running in uber mode : false
2022-04-02 02:25:17,509 INFO mapreduce.Job:  map 0% reduce 0%
2022-04-02 02:25:23,575 INFO mapreduce.Job:  map 100% reduce 0%
2022-04-02 02:25:28,607 INFO mapreduce.Job:  map 100% reduce 100%
2022-04-02 02:25:28,616 INFO mapreduce.Job: Job job_1648879195625_0002 completed successfully
2022-04-02 02:25:28,707 INFO mapreduce.Job: Counters: 53
        File System Counters
                FILE: Number of bytes read=1843
                FILE: Number of bytes written=417711
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=2071
                HDFS: Number of bytes written=1386
                HDFS: Number of read operations=8
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=12876
                Total time spent by all reduces in occupied slots (ms)=21016
                Total time spent by all map tasks (ms)=3219
                Total time spent by all reduce tasks (ms)=2627
                Total vcore-milliseconds taken by all map tasks=3219
                Total vcore-milliseconds taken by all reduce tasks=2627
                Total megabyte-milliseconds taken by all map tasks=13185024
                Total megabyte-milliseconds taken by all reduce tasks=21520384
        Map-Reduce Framework
                Map input records=50
                Map output records=167
                Map output bytes=2346
                Map output materialized bytes=1843
                Input split bytes=97
                Combine input records=167
                Combine output records=113
                Reduce input groups=113
                Reduce shuffle bytes=1843
                Reduce input records=113
                Reduce output records=113
                Spilled Records=226
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=97
                CPU time spent (ms)=1360
                Physical memory (bytes) snapshot=552706048
                Virtual memory (bytes) snapshot=13832036352
                Total committed heap usage (bytes)=391643136
                Peak Map Physical memory (bytes)=329871360
                Peak Map Virtual memory (bytes)=5243228160
                Peak Reduce Physical memory (bytes)=222834688
                Peak Reduce Virtual memory (bytes)=8588808192
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=1974
        File Output Format Counters
                Bytes Written=1386

2.2 Run the query command

..................
(a great deal of complete log output)

Experiment summary:

1) Once configured, yarn logs -applicationId application_xxxxxxx shows rich log content.
2) But the logs link in the 8088 web UI no longer works. Why?
    Because the history service is not configured yet; see the next section.
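For context, what the 8088 UI is missing is a redirect target for finished applications: the yarn.log.server.url property. A minimal sketch of the idea, assuming a JobHistoryServer web UI at master:19888 (the host name here is a placeholder, not from my cluster):

```xml
<!-- In yarn-site.xml: where the RM web UI redirects you when you click the
     logs link of a finished application; without it the link is dead. -->
<property>
    <name>yarn.log.server.url</name>
    <value>http://master:19888/jobhistory/logs/</value>
</property>
```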
Experiment 2: logs when running Flink in on-YARN mode.
Experiment contents:
    A Flink job consumes from Kafka and outputs via print().
Job configuration:
    Flink parallelism: 1

First, start the job. We can see that the job's container is running on cm2.

(screenshot)

Next we can see some of the job's logs, both the JobManager's and the TaskManager's. The TM log is what Stdout shows, and refreshing reveals the file size growing.

(screenshot)

Next, the container-log location we configured in YARN tells us where the logs are stored. As the screenshot below shows, they live under /yarn/container-logs on each node.

(screenshot)

Let's take a look under /yarn/container-logs on cm2.

(screenshot)

This matches exactly what the Flink console shows.

(screenshot)

Once the program finishes or is cancelled, the container logs are deleted automatically, which raises question 1 below. (The aggregated logs, though, are present on HDFS as expected.)

(screenshot)

Question 1: isn't the setting below supposed to control how long logs are kept on the node the container ran on?

(screenshot)
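The setting in question is presumably yarn.nodemanager.log.retain-seconds. Per the yarn-default.xml documentation it applies only when log aggregation is disabled; with aggregation enabled, local container logs are removed once they have been uploaded to HDFS, which would explain the deletion observed above. A sketch of the property (the value shown is the Hadoop default):

```xml
<!-- yarn-site.xml: how long each NodeManager keeps user logs locally.
     Takes effect ONLY when yarn.log-aggregation-enable is false. -->
<property>
    <name>yarn.nodemanager.log.retain-seconds</name>
    <value>10800</value>
</property>
```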

Question 2: what if the job runs with multiple parallel subtasks?
With multiple subtasks, the "total log" is all the containers' logs merged together, which is exactly what log aggregation produces.
2.2 Configuring log aggregation on CDH (enabled by default)

(screenshots)

# Here, yarn.nodemanager.log-dirs is the path where containers on each NodeManager write their logs.
    Log aggregation then collects a job's scattered logs and merges them into HDFS.
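A sketch of that local-log-dir setting in plain yarn-site.xml terms (the /yarn/container-logs path matches what the CDH screenshots show for my cluster; adjust it for yours):

```xml
<!-- Local directory (per NodeManager) where running containers write their
     stdout/stderr and log files; log aggregation uploads from here to HDFS. -->
<property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/yarn/container-logs</value>
</property>
```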

(screenshot)

2. Configure the YARN history server

References:
https://blog.csdn.net/qq_38038143/article/details/88641288

https://blog.csdn.net/qq_35440040/article/details/84233655?utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2~aggregatepage~first_rank_ecpm_v1~rank_v31_ecpm-1-84233655.pc_agg_new_rank&utm_term=yarn%E5%8E%86%E5%8F%B2%E6%97%A5%E5%BF%97&spm=1000.2123.3001.4430

https://blog.csdn.net/duyenson/article/details/118994693

https://www.cnblogs.com/zwgblog/p/6079361.html

Configuration 1: mapred-site.xml

<property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
</property>

Configuration 2: yarn-site.xml

<!-- Spark / YARN -->
<!-- Whether to enable log aggregation -->
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<!-- Address of the log server, used by worker nodes -->
<property>
    <name>yarn.log.server.url</name>
    <value>http://master:19888/jobhistory/logs/</value>
</property>
<!-- Log retention time, in seconds -->
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
</property>

Distribute the files to every node.

Start it: mr-jobhistory-daemon.sh start historyserver (this script is deprecated in Hadoop 3; the equivalent is mapred --daemon start historyserver)

jps
(screenshot)

View the logs: on the 8088 page, click the application id, then click logs.

3. YARN history logs plus: the timeline service

This config came from a chat screenshot a more experienced engineer shared. He was on Hadoop 3.13; I was on 3.0.0 at the time and never got it working. Something to keep digging into later.

yarn-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
    <property><name>yarn.resourcemanager.hostname</name><value>bigdata1</value></property>
    <property><name>yarn.resourcemanager.webapp.address</name><value>bigdata1:8088</value></property>
    <property><name>yarn.nodemanager.env-whitelist</name><value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value></property>
    <property><name>yarn.scheduler.minimum-allocation-mb</name><value>512</value></property>
    <property><name>yarn.scheduler.maximum-allocation-mb</name><value>4096</value></property>
    <property><name>yarn.nodemanager.resource.memory-mb</name><value>30720</value></property>
    <property><name>yarn.scheduler.minimum-allocation-vcores</name><value>1</value></property>
    <property><name>yarn.scheduler.maximum-allocation-vcores</name><value>4</value></property>
    <property><name>yarn.nodemanager.resource.cpu-vcores</name><value>12</value></property>
    <property><name>yarn.nodemanager.pmem-check-enabled</name><value>false</value></property>
    <property><name>yarn.nodemanager.vmem-check-enabled</name><value>false</value></property>
    <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
    <property><name>yarn.log-aggregation.retain-seconds</name><value>86400</value></property>
    <property><name>yarn.timeline-service.enabled</name><value>true</value></property>
    <property><name>yarn.timeline-service.hostname</name><value>${yarn.resourcemanager.hostname}</value></property>
    <property><name>yarn.timeline-service.address</name><value>${yarn.timeline-service.hostname}:10020</value></property>
    <property><name>yarn.timeline-service.webapp.address</name><value>${yarn.timeline-service.hostname}:8188</value></property>
    <property><name>yarn.log.server.url</name><value>http://${yarn.timeline-service.webapp.address}/applicationhistory/logs</value></property>
    <property><name>yarn.timeline-service.ttl-enable</name><value>true</value></property>
    <property><name>yarn.timeline-service.ttl-ms</name><value>86400000</value></property>
    <property><name>yarn.timeline-service.http-cross-origin.enabled</name><value>true</value></property>
    <property><name>yarn.resourcemanager.system-metrics-publisher.enabled</name><value>true</value></property>
    <property><name>yarn.timeline-service.generic-application-history.enabled</name><value>true</value></property>
</configuration>

mapred-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.job.emit-timeline-data</name>
        <value>true</value>
    </property>
</configuration>

And with that, let out a loud "Elegant~".

Tags: big data, flink, yarn

Reprinted from: https://blog.csdn.net/myself_ning/article/details/125520608
Copyright belongs to the original author, 大宁哥. If there is any infringement, please contact us for removal.
