

Hadoop 3.3.6 Cluster Setup


I. Prerequisites

  1. Servers: three machines (1 master, 2 workers) running CentOS 7:

     | IP | hostname | Role |
     | --- | --- | --- |
     | 192.168.108.137 | centos137 | master |
     | 192.168.108.138 | centos138 | node |
     | 192.168.108.139 | centos139 | node |

     The three servers must be able to reach each other by hostname:

     ```bash
     # change the hostname (repeat on each machine with its own name)
     hostnamectl set-hostname centos137
     # on all three machines, append the following to /etc/hosts
     192.168.108.137 centos137
     192.168.108.138 centos138
     192.168.108.139 centos139
     # reboot
     reboot
     ```
  2. A Hadoop cluster (version 2.2+) with the HDFS service installed
  3. JDK 1.8+ (installing the JDK yourself is recommended; the JAVA_HOME environment variable must be set)

This guide uses Hadoop 3.3.6.

II. Role Assignment

Role assignment per node

| Node | IP | NN | SNN | DN | RM | NM | HS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| centos137 | 192.168.108.137 | √ |  | √ | √ | √ |  |
| centos138 | 192.168.108.138 |  | √ | √ |  | √ | √ |
| centos139 | 192.168.108.139 |  |  | √ |  | √ |  |
Role legend

| HDFS | YARN | MapReduce |
| --- | --- | --- |
| NameNode (NN) | ResourceManager (RM) | HistoryServer (HS) |
| SecondaryNameNode (SNN) | NodeManager (NM) |  |
| DataNode (DN) |  |  |
Default port list

| Component | Port | Description |
| --- | --- | --- |
| HDFS | 8020 | NameNode |
| HDFS | 50010, 50020, 50075 | DataNode |
| YARN | 8032 | ResourceManager |
| YARN | 8088 | ResourceManager Web UI |
| YARN | 8040 | NodeManager protocol |
| YARN | 8042 | NodeManager Web UI |
| MapReduce | 10020 | HistoryServer protocol |
| MapReduce | 19888 | HistoryServer Web UI |
| Hadoop Common | 49152~65535 | Inter-process communication |
| ZooKeeper | 2181 | Cluster coordination service |
| Hadoop Web UI | 9870 | NameNode Web UI |
| Hadoop Web UI | 8088 | ResourceManager Web UI |
| Hadoop Web UI | 19888 | JobHistoryServer Web UI |
| Hadoop RPC | 8019 | Remote procedure call |

Note: 50010/50020/50075 are the Hadoop 2.x DataNode defaults; in Hadoop 3.x they changed to 9866 (data transfer), 9867 (IPC), and 9864 (HTTP).
Installation package (China mirror): Index of /apache/hadoop/common/hadoop-3.3.6 (tsinghua.edu.cn)

2.1 Software Installation (all servers)

Set up passwordless SSH login. First create the hadoop user:
useradd hadoop
passwd hadoop

Ignore the warning that the password is too short.

# switch to the hadoop user
su hadoop

Log in to localhost once with the password, so that the .ssh directory is created:

ssh localhost

Generate a key pair:

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

Distribute the public key (run on the master node):
# the argument is the hostname of the node to log in to without a password
ssh-copy-id centos137
ssh-copy-id centos138
ssh-copy-id centos139

From centos137, test that login to each node is passwordless, e.g. to centos138:

ssh centos138
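
The login test above can also be run non-interactively for all three nodes at once. This is a small sketch using the hostnames from this guide; BatchMode makes ssh fail immediately instead of prompting for a password:

```shell
# probe each node; a password prompt counts as a failure thanks to BatchMode
NODES="centos137 centos138 centos139"
for node in $NODES; do
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" hostname >/dev/null 2>&1; then
    echo "$node: passwordless login OK"
  else
    echo "$node: passwordless login FAILED"
  fi
done
```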

On every machine, create an /export directory in the root filesystem with data, servers, and software subdirectories:

mkdir -p /export/data
mkdir -p /export/servers
mkdir -p /export/software
1. Extract the archive:
tar -zxvf hadoop-3.3.6.tar.gz -C /export/servers/
2. Configure environment variables:
vi /etc/profile
# append at the end of the file
export HADOOP_HOME=/export/servers/hadoop-3.3.6
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# apply the environment variables
source /etc/profile
3. Verify:
[root@centos137 servers]# hadoop version
Hadoop 3.3.6
Source code repository https://github.com/apache/hadoop.git -r 1be78238728da9266a4f88195058f08fd012bf9c
Compiled by ubuntu on 2023-06-18T08:22Z
Compiled on platform linux-x86_64
Compiled with protoc 3.7.1
From source with checksum 5652179ad55f76cb287d9c633bb53bbd
This command was run using /export/servers/hadoop-3.3.6/share/hadoop/common/hadoop-common-3.3.6.jar
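
Besides `hadoop version`, the variables set in /etc/profile above can be sanity-checked directly. A small sketch (variable names taken from this guide's /etc/profile snippet):

```shell
# check that HADOOP_HOME is set and the hadoop binary is resolvable via PATH
if [ -z "$HADOOP_HOME" ]; then
  echo "HADOOP_HOME is not set -- run: source /etc/profile"
else
  echo "HADOOP_HOME=$HADOOP_HOME"
  command -v hadoop >/dev/null && echo "hadoop found on PATH" || echo "hadoop not on PATH"
fi
```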

**Note: after switching users or machines, make sure the environment variables above are in effect (re-run `source /etc/profile` if needed).**

2.2 Master Node Configuration

Go to the Hadoop configuration directory:
cd /export/servers/hadoop-3.3.6/etc/hadoop
Edit hadoop-env.sh:
vim hadoop-env.sh
# add JAVA_HOME
export JAVA_HOME=/export/servers/jdk

Edit workers:

vim workers

Add:
centos137
centos138
centos139

Contents:

[hadoop@centos138 sbin]$ cat workers 
centos137
centos138
centos139
Edit core-site.xml:
vim core-site.xml
  • Configure the HDFS URI and the temporary directory
  • Set the static user for HDFS web login
  • Add the following configuration:

```xml
<configuration>
  <!-- HDFS URI (NameNode address) -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://centos137:9000</value>
  </property>
  <!-- temp folder, default: /tmp/hadoop-${user.name} -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/export/servers/hadoop-3.3.6/tmp</value>
  </property>
  <!-- static user for HDFS web login -->
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>hadoop</value>
  </property>
</configuration>
```
Edit hdfs-site.xml:
vim hdfs-site.xml
  • Set the block replication factor
  • Configure the secondary NameNode
  • Add the following configuration:

```xml
<configuration>
  <!-- block replication factor -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- secondary NameNode address -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>centos138:50090</value>
  </property>
</configuration>
```
Edit mapred-site.xml:
vim mapred-site.xml
  • Set the framework MapReduce runs on: yarn here (the default is local)
  • Set the history server addresses

Add the following configuration:

```xml
<configuration>
  <!-- MapReduce execution framework: yarn or local -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>centos138:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>centos138:19888</value>
  </property>
</configuration>
```
Edit yarn-site.xml:
  • Set the address of the YARN ResourceManager
  • Route MapReduce through shuffle
  • Enable log aggregation
  • Set the log aggregation server address
  • Set log retention to 7 days
vim yarn-site.xml

```xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>centos137</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- enable log aggregation -->
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <!-- log server -->
  <property>
    <name>yarn.log.server.url</name>
    <value>http://centos138:19888/jobhistory/logs</value>
  </property>
  <!-- log retention in seconds (604800 = 7 days) -->
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
  </property>
</configuration>
```
Grant directory permissions

Give the hadoop user ownership of the installation directory:

chown -R hadoop:hadoop /export/

2.3 Distributing the Files

Copy the master's configuration to the worker nodes:

scp -r /export/servers centos138:/export
scp -r /export/servers centos139:/export

2.4 Starting the Hadoop Cluster

Format the NameNode:

hdfs namenode -format

Formatting the NameNode generates a new cluster ID. If the cluster ID recorded by the DataNodes no longer matches the freshly generated NameNode cluster ID, the DataNodes cannot find the NameNode. So before re-formatting the NameNode, always delete the data directory and the logs directory on every node first. Normally this format is run only once, during initial setup.
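
The per-node cleanup described above can be sketched as a small dry-run script. It only prints the commands (the paths are the hadoop.tmp.dir and logs directories used in this guide), so review the output and pipe it to bash to actually execute:

```shell
# print (not run) the cleanup commands for each node before re-formatting the NameNode
NODES="centos137 centos138 centos139"
HADOOP_DIR=/export/servers/hadoop-3.3.6
for node in $NODES; do
  echo "ssh $node rm -rf $HADOOP_DIR/tmp $HADOOP_DIR/logs"
done
```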

Run on the master (centos137):

# start the cluster
/export/servers/hadoop-3.3.6/sbin/start-all.sh
# stop the cluster
/export/servers/hadoop-3.3.6/sbin/stop-all.sh
[hadoop@centos137 hadoop-3.3.6]$ /export/servers/hadoop-3.3.6/sbin/start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [centos137]
Starting datanodes
Starting secondary namenodes [centos138]
Starting resourcemanager
Starting nodemanagers

If errors appear during startup, check that the directories were distributed correctly.

Start the history server (on the centos138 node):

 mapred --daemon start historyserver

Alternatively, start HDFS and YARN separately:

# start
start-dfs.sh
start-yarn.sh
# stop
stop-dfs.sh
stop-yarn.sh
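
With the daemons up, a quick smoke test is to submit the bundled pi example job. The jar path below assumes the install location used in this guide; the exact estimate printed will vary from run to run:

```shell
# submit the example MapReduce job; on success it prints "Estimated value of Pi is ..."
EXAMPLES_JAR=/export/servers/hadoop-3.3.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar
echo "Submitting: hadoop jar $EXAMPLES_JAR pi 2 10"
hadoop jar "$EXAMPLES_JAR" pi 2 10 || echo "job submission failed; is the cluster running?"
```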


2.5 Cluster Verification

  • Run the jps command on every node and verify that the started daemons match the role assignment.

centos137 roles: NN, DN, RM, NM

```bash
[hadoop@centos137 hadoop-3.3.6]$ jps
34082 ResourceManager
34228 NodeManager
33638 DataNode
33497 NameNode
37790 Jps
```

centos138 roles: SNN, DN, NM, HS

```bash
[hadoop@centos138 sbin]$ jps
32530 SecondaryNameNode
26679 JobHistoryServer
33321 Jps
32733 NodeManager
32383 DataNode
```

centos139 roles: NM, DN

```bash
[hadoop@centos139 hadoop]$ jps
51088 NodeManager
51685 Jps
50823 DataNode
```

Access the Web UIs according to the default port list above.
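
The Web UIs can also be probed from the command line. A minimal sketch using the hostnames and ports from this guide (HTTP 200 means the UI is up):

```shell
# report the HTTP status code of each Web UI; 000 means unreachable
for url in http://centos137:9870 http://centos137:8088 http://centos138:19888/jobhistory; do
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
  echo "$url -> HTTP $code"
done
```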


References:
https://blog.csdn.net/weixin_43655425/article/details/134751084
https://blog.csdn.net/tang5615/article/details/120382513


Reposted from: https://blog.csdn.net/admin_cx/article/details/140908892
Copyright belongs to the original author, admin_cx. If this infringes your rights, please contact us for removal.
