

Hadoop High-Availability Cluster Deployment (Step-by-Step Tutorial)

1. Plan the Hadoop high-availability cluster

2. Deployment and configuration

(1) Create a directory for Hadoop under /export/servers

(2) Install Hadoop from the /export/software directory
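
In shell form, steps (1) and (2) might look like this; the tarball name hadoop-3.3.6.tar.gz is an assumption inferred from the Hadoop 3.3.6 install path used later in this guide:

  # create the target directory, then unpack the downloaded tarball into it
  mkdir -p /export/servers/hadoop-HA
  cd /export/software
  tar -zxvf hadoop-3.3.6.tar.gz -C /export/servers/hadoop-HA/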

(3) Modify the system environment variables and verify them

vi /etc/profile
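
A minimal sketch of the entries to append to /etc/profile, assuming the install path /export/servers/hadoop-HA/hadoop-3.3.6 that this guide uses later; reload the file and run hadoop version to verify:

  # append to /etc/profile (adjust the path if your layout differs)
  export HADOOP_HOME=/export/servers/hadoop-HA/hadoop-3.3.6
  export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

  # reload and verify
  source /etc/profile
  hadoop version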

(4) Basic configuration file changes

4.1 Modify the hadoop-env.sh configuration file
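
The original does not spell out these edits; a typical sketch for a root-run Hadoop 3.x HA cluster sets JAVA_HOME (the JDK path below is an assumption, adjust to your own) plus the run-as users that the Hadoop 3 start scripts expect:

  export JAVA_HOME=/export/servers/jdk1.8.0    # assumption: adjust to your JDK path
  export HDFS_NAMENODE_USER=root
  export HDFS_DATANODE_USER=root
  export HDFS_JOURNALNODE_USER=root
  export HDFS_ZKFC_USER=root
  export YARN_RESOURCEMANAGER_USER=root
  export YARN_NODEMANAGER_USER=root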

4.2 Modify the core-site.xml configuration file
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/export/data/hadoop-HA/hadoop/</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
  </property>
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>root</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
4.3 Modify the hdfs-site.xml configuration file
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/export/data/hadoop/namenode/</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/export/data/hadoop/datanode/</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>hadoop1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>hadoop1:9870</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>hadoop2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>hadoop2:9870</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/ns1</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/export/data/journaldata/</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
4.4 Modify the mapred-site.xml configuration file
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop1:19888</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
4.5 Modify the yarn-site.xml configuration file
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>hadoop1:8188</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>hadoop1:8130</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>hadoop2:8188</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>hadoop2:8130</value>
  </property>
4.6 Modify the workers configuration file
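
For the three-node cluster planned above, the workers file just lists the hosts that should run a DataNode and NodeManager; assuming all three machines double as workers (consistent with the replication factor of 3 configured earlier), it would contain:

  hadoop1
  hadoop2
  hadoop3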

(5) Distribute the Hadoop installation directory
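
A sketch of the distribution from hadoop1, assuming passwordless SSH for root is already in place:

  scp -r /export/servers/hadoop-HA root@hadoop2:/export/servers/
  scp -r /export/servers/hadoop-HA root@hadoop3:/export/servers/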

(6) Distribute the system environment variable file and re-initialize the environment variables
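
Along the same lines, the environment file can be copied out and then re-sourced on every host:

  scp /etc/profile root@hadoop2:/etc/profile
  scp /etc/profile root@hadoop3:/etc/profile
  # then, on each host, make the new variables take effect:
  source /etc/profile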

(7) Start the Hadoop high-availability cluster:

7.1 Make sure ZooKeeper is running on every host: zkServer.sh start

7.2 Start the JournalNode on every host: hdfs --daemon start journalnode

7.3 Format the HDFS file system (on hadoop1 only): hdfs namenode -format

7.4 Synchronize the NameNode metadata to hadoop2:
scp -r /export/data/hadoop/namenode/ hadoop2:/export/data/hadoop/

7.5 Format ZKFC: hdfs zkfc -formatZK

7.6 Start HDFS: start-dfs.sh

7.7 Start YARN: start-yarn.sh

(8) Check the status of the NameNodes and ResourceManagers

8.1 Use jps to check the running processes
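
Roughly what jps should report on hadoop1 once everything is up (process names only; PIDs will differ, and ZooKeeper shows up as QuorumPeerMain):

  NameNode
  DataNode
  JournalNode
  DFSZKFailoverController
  ResourceManager
  NodeManager
  QuorumPeerMain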

II. Check the status information of the two NameNodes.

1. Open hadoop1:9870 in a browser

2. Open hadoop2:9870 in a browser

3. Check the status information of the two ResourceManagers

(1) Open hadoop1:8188 in a browser

(2) Open hadoop2:8188 in a browser

4. Test the active/standby failover of the Hadoop high-availability cluster

(1) Stop the NameNode on hadoop1:

hdfs --daemon stop namenode

(2) Stop the ResourceManager on hadoop2:

yarn --daemon stop resourcemanager

5. Check in the browser whether the active and standby nodes have switched over

(1) You can see that the NameNode on hadoop1 is no longer reachable, while the NameNode on hadoop2 has changed from standby to active, which shows that the HDFS NameNodes completed an active/standby failover.

(2) You can see that the ResourceManager on hadoop2 is no longer reachable, while the ResourceManager on hadoop1 has changed from standby to active, which shows that the YARN ResourceManagers completed an active/standby failover.
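
Besides the web UI, the states can also be queried with the standard HA admin commands; nn1/nn2 and rm1/rm2 are the IDs configured in hdfs-site.xml and yarn-site.xml above:

  hdfs haadmin -getServiceState nn1   # fails to connect after that NameNode was stopped
  hdfs haadmin -getServiceState nn2   # expect: active
  yarn rmadmin -getServiceState rm1   # expect: active
  yarn rmadmin -getServiceState rm2   # fails to connect after that ResourceManager was stopped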

III. Fixing a cluster with no NameNode

1. Run the following three commands on every host (check your own directories; these are mine)

  rm -rf /export/data/hadoop/datanode/
  rm -rf /export/data/hadoop/namenode/
  rm -rf /export/servers/hadoop-HA/hadoop-3.3.6/logs/*

2. Go into the following directory and delete the ns1 directory inside it (run on every host)

  cd /export/data/journaldata
  rm -rf ns1

3. Re-format the NameNode

  hdfs namenode -format
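
One caveat, inferred from steps 7.2-7.6 above rather than stated in the original: the format only succeeds while the JournalNodes are running (hdfs --daemon start journalnode on every host first), and after it completes the fresh metadata still has to be copied to the standby and ZKFC re-formatted before HDFS is restarted:

  scp -r /export/data/hadoop/namenode/ hadoop2:/export/data/hadoop/   # re-sync the standby (step 7.4)
  hdfs zkfc -formatZK                                                 # re-format ZKFC (step 7.5)
  start-dfs.sh                                                        # restart HDFS (step 7.6)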

4. Summary: in most cases, a missing NameNode is caused by formatting the NameNode more than once (personal experience)

Tags: hadoop, big data, centos

Reposted from: https://blog.csdn.net/2401_82808073/article/details/143863802
Copyright belongs to the original author 沉默好烦; in case of infringement, please contact us for removal.
