A Simple HDFS Configuration

1. Configure the hostname mapping file (/etc/hosts)

vim /etc/hosts
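
The mapping can be sanity-checked with a small shell sketch. It assumes the three hostnames used later in this article (master, slave1, slave2); the function name missing_hosts is made up for this example and is not part of any tool.

```shell
# Print every expected hostname that has no entry in the given hosts file.
# Usage: missing_hosts <hosts-file> <name>...
missing_hosts() {
  hosts_file=$1
  shift
  for name in "$@"; do
    # Strip comments, then look for the name in columns 2..N (column 1 is the IP).
    if ! sed 's/#.*//' "$hosts_file" |
        awk -v n="$name" '
          { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
          END { exit (found ? 0 : 1) }'; then
      echo "$name"
    fi
  done
}

# Example: check the hostnames used in this article (no output means all are mapped).
# missing_hosts /etc/hosts master slave1 slave2
```

Run the check on every node, since each machine has its own /etc/hosts.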

2. Configure hadoop-env.sh

(1) Check the directory that JAVA_HOME points to

echo $JAVA_HOME

# Copy this path

(2) Set JAVA_HOME in this file to that path, leave everything else unchanged, and save

vim hadoop-env.sh

---- Set JAVA_HOME to the path obtained above (export JAVA_HOME=/opt/modlue/jdk) ----
Press Esc, then type :wq to save and quit

3. Configure core-site.xml

(1) Search core-default.xml for fs.default and copy the entire property into core-site.xml:

<!-- Default configuration -->
<property>
  <name>fs.defaultFS</name>
  <!-- Defaults to the root of the local file system -->
  <value>file:///</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>

Then change the value:

<!-- Modified configuration -->
<property>
  <name>fs.defaultFS</name>
  <!-- value: the host that runs the NameNode, plus the port the NameNode uses to talk to DataNodes and clients -->
  <value>hdfs://master:9000</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
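
To make the URI structure concrete, here is a minimal shell sketch that splits an fs.defaultFS value into scheme, host and port. parse_default_fs is a hypothetical helper written for this article, not a Hadoop command; on a configured node the effective value can be read with `hdfs getconf -confKey fs.defaultFS`.

```shell
# Split an fs.defaultFS URI such as hdfs://master:9000 into its parts.
parse_default_fs() {
  uri=$1
  scheme=${uri%%://*}          # text before "://"
  authority=${uri#*://}        # text after "://"
  authority=${authority%%/*}   # drop any path component
  host=${authority%%:*}
  case $authority in
    *:*) port=${authority##*:} ;;
    *)   port= ;;
  esac
  echo "scheme=$scheme host=$host port=$port"
}

parse_default_fs "hdfs://master:9000"
# prints: scheme=hdfs host=master port=9000
```

With the default value file:/// the host and port come out empty, which matches the fact that the local file system has no NameNode to contact.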

(2) Set hadoop.tmp.dir

<!-- Base directory under which the DataNode stores data and the NameNode stores metadata -->
<property>
  <name>hadoop.tmp.dir</name>
  <!-- Note the spelling: the install directory used in this article is /opt/modlue -->
  <value>/opt/modlue/hadoop/data/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

4. Configure hdfs-site.xml

(1) Set the SecondaryNameNode address

<property>
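  <name>dfs.namenode.secondary.http-address</name>
  <!-- 50090 is the default HTTP port of the SecondaryNameNode; the https-address property defaults to port 50091 -->
  <value>slave2:50090</value>
  <description>
    The secondary namenode http server address and port.
  </description>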
</property>

(2) Set the replication factor

<property>
  <name>dfs.replication</name>
  <!-- Number of replicas HDFS keeps for each block -->
  <value>3</value>
  <description>Default block replication. 
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>

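5. In Hadoop's configuration directory (etc/hadoop under the installation directory), edit the slaves file (it tells the cluster which hosts are worker nodes):

vim slaves

Add the following: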

master
slave1
slave2
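
The slaves file is just a list of hostnames, one per line, so it is easy to reuse in your own scripts. The sketch below is one way to read it; list_workers is a hypothetical helper, and the commented loop assumes that HADOOP_HOME is set, that the file lives at etc/hadoop/slaves as in Hadoop 2.x, and that passwordless SSH (configured later in this article) already works.

```shell
# Print the hostnames listed in a slaves file,
# skipping blank lines and anything after a '#'.
list_workers() {
  sed -e 's/#.*//' -e 's/[[:space:]]//g' -e '/^$/d' "$1"
}

# Example: run a command on every listed host.
# for host in $(list_workers "$HADOOP_HOME/etc/hadoop/slaves"); do
#   ssh "$host" hostname
# done
```

Listing master here as well means a DataNode also runs on the master host.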

6. Configure the Hadoop environment variables

(1) Check where Hadoop is installed

echo $HADOOP_HOME  # copy this path

(2) Open the environment variable configuration file

vim /etc/profile

(3) Add the following to the file

HADOOP_HOME=/opt/modlue/hadoop
PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_HOME PATH   # export so that child processes inherit them

(4) Reload the configuration file

source /etc/profile

(5) Distribute the configured Hadoop directory to slave1 and slave2

scp -r hadoop-2.7.7 root@slave1:$PWD
scp -r hadoop-2.7.7 root@slave2:$PWD

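(6) Enter the password when prompted; the transfer then starts

(7) slave1 and slave2 do not have HADOOP_HOME configured yet, so once the transfer finishes, copy /etc/profile from master to slave1 and slave2: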

scp /etc/profile root@slave1:/etc/
scp /etc/profile root@slave2:/etc/

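Run source /etc/profile on slave1 and slave2 so that the new settings take effect

(8) Create a hadoop symbolic link on slave1 and slave2

cd /opt/modlue/   # go to the directory where Hadoop is installed

ln -s hadoop-2.7.7 hadoop # create the symbolic link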

7. Configure passwordless SSH login

cd ~
ls -la  # -a also lists hidden entries such as .ssh
cd .ssh
ssh-keygen -t rsa   # generate the key pair
[Press Enter three times]

### master->master,master->slave1,master->slave2 ###
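ssh-copy-id -i id_rsa.pub master # copy the public key to each of the three hosts that need passwordless login
yes      # accept the host key
123456   # the remote user's password (123456 in this setup)
# Append the public key to authorized_keys (ssh-copy-id does the same on the remote host)
cat ./id_rsa.pub >> ./authorized_keys
ssh-copy-id -i id_rsa.pub slave1
yes
123456
ssh-copy-id -i id_rsa.pub slave2
yes
123456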

[Note: set up passwordless login between matching accounts: root to root, or a regular user to the same regular user]

[Test: run the following on master]

ssh slave1
exit
ssh slave2
exit
# If you get in without being asked for a password, the setup works
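
The manual test can also be scripted. The following is a sketch, not a required step: check_passwordless is a hypothetical helper, and the SSH variable exists only so that a different ssh binary (or a stub) can be substituted. BatchMode=yes makes ssh fail instead of prompting for a password, which is what lets the script tell the two cases apart.

```shell
# Report, for each host, whether key-based login works without a prompt.
check_passwordless() {
  for host in "$@"; do
    # -n: do not read stdin; BatchMode=yes: never ask for a password.
    if ${SSH:-ssh} -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
      echo "$host: ok"
    else
      echo "$host: needs a password or is unreachable"
    fi
  done
}

# Example, run on master:
# check_passwordless master slave1 slave2
```

A failure here does not say why the login failed; run a plain ssh to the host to see the actual error.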

Distribute the configured directory to slave1 and slave2

scp -r <file-to-send> <remote-host>:<destination-path>

8. Format the HDFS distributed file system

hdfs namenode -format # Important: run this command only once
# Purpose: initializes the directory where the NameNode stores its metadata

tree # check that the name and current directories were created

hadoop-daemon.sh start namenode  # start the NameNode

hadoop-daemon.sh start datanode # start the DataNode

Check whether they started

jps
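
jps prints one line per running JVM in the form "<pid> <class name>", so the check can be automated. A minimal sketch, assuming that output format; check_daemons is a hypothetical helper that reads jps output on stdin.

```shell
# Read `jps` output on stdin and report whether each named daemon appears.
check_daemons() {
  jps_output=$(cat)
  for daemon in "$@"; do
    # Compare the whole second column so that NameNode
    # does not match SecondaryNameNode.
    if printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit (found ? 0 : 1) }'; then
      echo "$daemon: running"
    else
      echo "$daemon: not found"
    fi
  done
}

# Example, run on master:
# jps | check_daemons NameNode DataNode
```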

Start the DataNode on slave1 and slave2

The data directory is created when the DataNode starts

Open the cluster management page in a browser:

http://master:50070


Reposted from: https://blog.csdn.net/weixin_64314147/article/details/127950059
Copyright belongs to the original author, 爱摔跤的辰. If this infringes your rights, please contact us to have it removed.