1. Prerequisites
- Operating system: CentOS 7 or Ubuntu 20.04 is recommended (this walkthrough uses CentOS Linux release 7.9.2009 (Core)).
- Java: Java 8 or later is required.
- Hadoop: Hive relies on Hadoop for distributed storage; Hadoop 3.x is recommended (this walkthrough uses Hadoop 3.3.6).
- Database: the Hive Metastore needs a backing database; MySQL, PostgreSQL, or Oracle are recommended. This walkthrough uses MySQL.
- The server's IP address is 192.168.128.130.
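The checklist above can be sanity-checked with a short script. This is a minimal sketch (it only looks for the tools on the PATH; it does not verify exact versions):

```shell
# Report which prerequisite tools are already installed on this host.
missing=""
for tool in java hadoop mysql; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found at $(command -v "$tool")"
  else
    echo "$tool: NOT found -- install it before proceeding"
    missing="$missing $tool"
  fi
done
echo "missing:$missing"
```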
2. Download and Extract Hive
- Download the Hive 4.0.1 tarball:
wget https://downloads.apache.org/hive/hive-4.0.1/apache-hive-4.0.1-bin.tar.gz
- Extract it and move it to the installation path:
tar -zxvf apache-hive-4.0.1-bin.tar.gz
mv apache-hive-4.0.1-bin /opt/hive
- Set the environment variables by adding the following lines to ~/.bashrc:
export HIVE_HOME=/opt/hive
export PATH=$PATH:$HIVE_HOME/bin
Then run source ~/.bashrc to apply them.
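The append-then-source step above can be made idempotent so rerunning the setup does not duplicate lines. A minimal sketch, using a PROFILE variable as a stand-in for ~/.bashrc so it can be tried safely anywhere:

```shell
# Append the Hive environment variables only if they are not already present,
# then source the file and verify. Point PROFILE at ~/.bashrc on a real host.
PROFILE="${PROFILE:-./hive-env.sh}"
grep -qs 'HIVE_HOME=/opt/hive' "$PROFILE" || cat >> "$PROFILE" <<'EOF'
export HIVE_HOME=/opt/hive
export PATH=$PATH:$HIVE_HOME/bin
EOF
. "$PROFILE"
echo "HIVE_HOME=$HIVE_HOME"
```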
3. Configure the Hive Metastore Database
- Create the metastore database. The following is a MySQL example (refer to other documentation for installing the database itself):
- Start MySQL and log in:
mysql -u root -p
- Create the database:
CREATE DATABASE hive_metastore;
- Create a user and grant privileges:
CREATE USER 'hive'@'%' IDENTIFIED BY 'Hive_123456';
GRANT ALL PRIVILEGES ON hive_metastore.* TO 'hive'@'%';
FLUSH PRIVILEGES;
- Set the database connection details in the Hive configuration by editing hive-site.xml at $HIVE_HOME/conf/hive-site.xml:
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>Location of default Hive warehouse where managed tables are stored.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.128.130:3306/hive_metastore?createDatabaseIfNotExist=true</value>
<description>JDBC connection URL to connect to the Hive Metastore database, here with MySQL as the backend database.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
<description>JDBC driver class name for connecting to the Hive Metastore database.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>Username for connecting to the Hive Metastore database.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>Hive_123456</value>
<description>Password for connecting to the Hive Metastore database.</description>
</property>
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
<description>When set to true, DataNucleus will automatically create tables and columns if they do not already exist in the schema.</description>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>Disables schema verification, allowing automatic updates of the Metastore schema without manual intervention.</description>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
<description>Enables HiveServer2 to execute queries as the user who submitted the query, rather than the HiveServer2 service user.</description>
</property>
<property>
<name>hive.server2.authentication</name>
<value>NONE</value>
<description>Specifies the authentication mode for HiveServer2 connections. Options include NONE, KERBEROS, LDAP, PAM, and CUSTOM.</description>
</property>
</configuration>
- Make sure the database driver JAR has been placed in the $HIVE_HOME/lib directory:
cp /path/to/mysql-connector-java.jar $HIVE_HOME/lib/
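Whether the driver JAR landed in the right place can be checked before moving on. A minimal sketch, assuming HIVE_HOME defaults to /opt/hive and that the JAR filename starts with mysql-connector- (adjust the glob if your JAR is named differently):

```shell
# Check Hive's lib directory for a MySQL Connector/J JAR.
HIVE_HOME="${HIVE_HOME:-/opt/hive}"
if ls "$HIVE_HOME"/lib/mysql-connector-*.jar >/dev/null 2>&1; then
  echo "driver found:" "$HIVE_HOME"/lib/mysql-connector-*.jar
  driver_ok=yes
else
  echo "no mysql-connector JAR in $HIVE_HOME/lib -- copy it there first"
  driver_ok=no
fi
```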
4. Initialize the Metastore
Initialize the Hive metadata schema with:
schematool -initSchema -dbType mysql
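If initialization succeeds, schematool can also report the schema version recorded in the metastore. A minimal sketch that skips with a hint when schematool is not on the PATH:

```shell
# Query the metastore for its recorded schema version.
if command -v schematool >/dev/null 2>&1; then
  schematool -dbType mysql -info
else
  echo "schematool not on PATH -- run 'source ~/.bashrc' after installing Hive"
fi
```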
5. Start HiveServer2
Hive 4.0.1 has deprecated the Hive CLI, so connections must go through Beeline; the configuration above permits connections from unauthenticated users.
hive --service hiveserver2 &
- Check that port 10000 is listening to confirm the service started successfully.
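The port check can be scripted. A minimal sketch using ss, with a netstat fallback for older hosts:

```shell
# Check whether HiveServer2 is listening on port 10000.
if command -v ss >/dev/null 2>&1; then
  ss -lnt 2>/dev/null | grep ':10000' || echo "port 10000 not listening yet"
elif command -v netstat >/dev/null 2>&1; then
  netstat -lnt 2>/dev/null | grep ':10000' || echo "port 10000 not listening yet"
else
  echo "install iproute2 (ss) or net-tools (netstat) to check the port"
fi
```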
6. Configure Anonymous (Proxy-User) Login
Edit Hadoop's core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/log/hadoop/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
</configuration>
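Hadoop only reads core-site.xml at startup, but the proxy-user mappings can be reloaded at runtime with hdfs dfsadmin. A minimal sketch that falls back to a restart hint when hdfs is not on the PATH:

```shell
# Reload the superuser proxy-group mappings after editing core-site.xml.
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfsadmin -refreshSuperUserGroupsConfiguration
else
  echo "hdfs not on PATH -- restart Hadoop (stop-dfs.sh && start-dfs.sh) instead"
fi
```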
7. Verify the Deployment
beeline -u jdbc:hive2://192.168.128.130:10000 -n root
[root@master opt]# beeline -u jdbc:hive2://192.168.128.130:10000 -n root
Connecting to jdbc:hive2://192.168.128.130:10000
Connected to: Apache Hive (version 4.0.1)
Driver: Hive JDBC (version 4.0.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 4.0.1 by Apache Hive
0: jdbc:hive2://192.168.128.130:10000> create database test1;
INFO : Compiling command(queryId=root_20241029145312_c6b5e83b-f5a7-488b-b2ca-b3ef3336298a): create database test1
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=root_20241029145312_c6b5e83b-f5a7-488b-b2ca-b3ef3336298a); Time taken: 2.054 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=root_20241029145312_c6b5e83b-f5a7-488b-b2ca-b3ef3336298a): create database test1
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=root_20241029145312_c6b5e83b-f5a7-488b-b2ca-b3ef3336298a); Time taken: 0.169 seconds
No rows affected (2.721 seconds)
0: jdbc:hive2://192.168.128.130:10000> show databases;
INFO : Compiling command(queryId=root_20241029145320_63834d7a-1027-4ca4-933e-927dcccbebb8): show databases
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=root_20241029145320_63834d7a-1027-4ca4-933e-927dcccbebb8); Time taken: 0.236 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=root_20241029145320_63834d7a-1027-4ca4-933e-927dcccbebb8): show databases
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=root_20241029145320_63834d7a-1027-4ca4-933e-927dcccbebb8); Time taken: 0.11 seconds
+----------------+
| database_name  |
+----------------+
| default        |
| test1          |
+----------------+
2 rows selected (0.605 seconds)
0: jdbc:hive2://192.168.128.130:10000>
8. Two Connection Methods
- Connect via the hive command (in Hive 4.x it launches Beeline):
[root@master opt]# hive
Beeline version 4.0.1 by Apache Hive
beeline> !connect jdbc:hive2://192.168.128.130:10000 -n root
Connecting to jdbc:hive2://192.168.128.130:10000
Connected to: Apache Hive (version 4.0.1)
Driver: Hive JDBC (version 4.0.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://192.168.128.130:10000> !quit
Closing: 0: jdbc:hive2://192.168.128.130:10000
- Connect directly via the beeline command:
[root@master opt]# beeline -u jdbc:hive2://192.168.128.130:10000 -n root
Connecting to jdbc:hive2://192.168.128.130:10000
Connected to: Apache Hive (version 4.0.1)
Driver: Hive JDBC (version 4.0.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 4.0.1 by Apache Hive
0: jdbc:hive2://192.168.128.130:10000>
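Besides the interactive sessions above, Beeline can also run a single statement non-interactively with -e, which is convenient in scripts. A minimal sketch against the same server (skips with a notice when beeline is not installed):

```shell
# Run a one-shot query through Beeline instead of an interactive session.
if command -v beeline >/dev/null 2>&1; then
  beeline -u jdbc:hive2://192.168.128.130:10000 -n root -e "show databases;"
else
  echo "beeline not on PATH -- source ~/.bashrc after installing Hive"
fi
```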
Reposted from: https://blog.csdn.net/xhcx_25/article/details/143328896
Copyright belongs to the original author, xhcx_25. If this infringes your rights, please contact us for removal.