

Hadoop High-Availability Cluster Deployment (Step-by-Step Tutorial)

1. Plan the Hadoop high-availability cluster
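
The layout used in this tutorial can be read off the configuration files below; an illustrative plan (the DataNode/NodeManager placement is an assumption that matches a three-node workers list) is:

    hadoop1: NameNode (nn1), ZKFC, ResourceManager (rm1), JournalNode, ZooKeeper, DataNode, NodeManager
    hadoop2: NameNode (nn2), ZKFC, ResourceManager (rm2), JournalNode, ZooKeeper, DataNode, NodeManager
    hadoop3: JournalNode, ZooKeeper, DataNode, NodeManager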

2. Deployment and configuration

(1) Create the hadoop directory under /export/servers
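
For example (the hadoop-HA directory name is an assumption based on the installation path used later in this tutorial):

    mkdir -p /export/servers/hadoop-HA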

(2) Install Hadoop from the /export/software directory
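
A typical sequence, assuming the Hadoop 3.3.6 tarball has already been uploaded to /export/software (the archive name and target path follow the version and directories used elsewhere in this tutorial):

    cd /export/software
    tar -zxvf hadoop-3.3.6.tar.gz -C /export/servers/hadoop-HA/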

(3) Modify the system environment variables and verify

vi /etc/profile
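
The entries appended to /etc/profile would typically look like the following (the HADOOP_HOME path matches the installation directory used in this tutorial); after saving, reload the file and verify:

    export HADOOP_HOME=/export/servers/hadoop-HA/hadoop-3.3.6
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

    source /etc/profile
    hadoop version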

(4) Basic configuration files

4.1 Modify the hadoop-env.sh configuration file
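
In hadoop-env.sh you typically set JAVA_HOME and, since this tutorial runs the daemons as root, the per-daemon user variables required by Hadoop 3.x; a minimal sketch (the JDK path is an assumption, adjust it to your own installation):

    export JAVA_HOME=/export/servers/jdk1.8.0_241
    export HDFS_NAMENODE_USER=root
    export HDFS_DATANODE_USER=root
    export HDFS_JOURNALNODE_USER=root
    export HDFS_ZKFC_USER=root
    export YARN_RESOURCEMANAGER_USER=root
    export YARN_NODEMANAGER_USER=root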

4.2 Modify the core-site.xml configuration file
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/export/data/hadoop-HA/hadoop/</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
</property>
<property>
    <name>hadoop.http.staticuser.user</name>
    <value>root</value>
</property>
<property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
</property>
4.3 Modify the hdfs-site.xml configuration file
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
<property>
    <name>dfs.namenode.name.dir</name>
    <value>/export/data/hadoop/namenode/</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>/export/data/hadoop/datanode/</value>
</property>
<property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
</property>
<property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>hadoop1:9000</value>
</property>
<property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>hadoop1:9870</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>hadoop2:9000</value>
</property>
<property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>hadoop2:9870</value>
</property>
<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/ns1</value>
</property>
<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/export/data/journaldata/</value>
</property>
<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>
<property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>
<property>
    <name>dfs.ha.fencing.methods</name>
    <value>
        sshfence
        shell(/bin/true)
    </value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
</property>
4.4 Modify the mapred-site.xml configuration file
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop1:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop1:19888</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
4.5 Modify the yarn-site.xml configuration file
<property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop1</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop2</value>
</property>
<property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
</property>
<property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>hadoop1:8188</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>hadoop1:8130</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>hadoop2:8188</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>hadoop2:8130</value>
</property>
4.6 Modify the workers configuration file
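
The workers file (in $HADOOP_HOME/etc/hadoop) lists the hosts that run DataNode and NodeManager; assuming all three nodes act as workers, which matches the replication factor of 3 configured above, it contains:

    hadoop1
    hadoop2
    hadoop3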

(5) Distribute the Hadoop installation directory
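
For example, from hadoop1 copy the installation directory to the other two nodes (paths follow the install location used in this tutorial):

    scp -r /export/servers/hadoop-HA root@hadoop2:/export/servers/
    scp -r /export/servers/hadoop-HA root@hadoop3:/export/servers/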

(6) Distribute the system environment variable file and initialize the environment variables on each node
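
For example, copy /etc/profile to the other nodes and then reload it on each of them:

    scp /etc/profile root@hadoop2:/etc/profile
    scp /etc/profile root@hadoop3:/etc/profile
    source /etc/profile    # run this on hadoop2 and hadoop3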

(7) Start the Hadoop high-availability cluster:

7.1 Make sure ZooKeeper is running on every host: zkServer.sh start

7.2 Start the JournalNode on every host: hdfs --daemon start journalnode

7.3 Format the HDFS file system (on hadoop1 only): hdfs namenode -format

7.4 Synchronize the NameNode metadata to hadoop2:
scp -r /export/data/hadoop/namenode/ hadoop2:/export/data/hadoop/

7.5 Format the ZKFC: hdfs zkfc -formatZK

7.6 Start HDFS: start-dfs.sh

7.7 Start YARN: start-yarn.sh

(8) Check the status information of the NameNode and ResourceManager

8.1 Run jps to check the process information
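
Which processes appear depends on the role layout; on hadoop1 and hadoop2 you would typically expect something like the following (illustrative), while hadoop3 would show only the DataNode, NodeManager, JournalNode and ZooKeeper processes:

    NameNode
    DFSZKFailoverController
    ResourceManager
    DataNode
    NodeManager
    JournalNode
    QuorumPeerMain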

II. Check the status information of the two NameNode nodes

1. Open hadoop1:9870 in a browser

2. Open hadoop2:9870 in a browser

3. Check the status information of the two ResourceManager nodes

(1) Open hadoop1:8188 in a browser

(2) Open hadoop2:8188 in a browser

4. Test the active/standby failover of the Hadoop high-availability cluster

(1) Stop the NameNode on hadoop1

hdfs --daemon stop namenode

(2) Stop the ResourceManager on hadoop2

yarn --daemon stop resourcemanager

5. Use the browser to check whether the active and standby nodes have switched over

(1) You can see that the NameNode on the hadoop1 virtual machine can no longer be accessed, while the NameNode on hadoop2 has changed from standby to active, which shows that the HDFS NameNode has completed an active/standby failover.

(2) You can see that the ResourceManager on the hadoop2 virtual machine can no longer be accessed, while the ResourceManager on hadoop1 has changed from standby to active, which shows that the YARN ResourceManager has completed an active/standby failover.

III. Troubleshooting: the cluster has no NameNode process

1. Run the following three commands on every host (adjust the paths to your own directories; these are mine)

     rm -rf /export/data/hadoop/datanode/
     rm -rf /export/data/hadoop/namenode/
     rm -rf /export/servers/hadoop-HA/hadoop-3.3.6/logs/* 

2. Go to the following directory and delete the ns1 directory under it (do this on every host)

 cd /export/data/journaldata
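
Then remove the ns1 directory that the JournalNodes created for the nameservice:

    rm -rf ns1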

3. Reformat the NameNode

     hdfs namenode -format

4. Summary: in most cases a missing NameNode is caused by formatting the NameNode more than once (personal experience)

Tags: hadoop, big data, centos

This article is reposted from: https://blog.csdn.net/2401_82808073/article/details/143863802
Copyright belongs to the original author 沉默好烦. If there is any infringement, please contact us for removal.
