Prerequisite deployments
Hadoop 3.4.0 (latest version) cluster deployment on Linux (CSDN blog)
Deploying a MySQL 8 database on Linux (CSDN blog)
Official site: Apache Hive
Major change: in Hive 4.0.0 the Hive CLI is deprecated and replaced by Beeline. Starting Hive 4.0.0 therefore opens the Beeline command line by default, not the Hive CLI.
1. Download the installation package: apache-hive-4.0.0-bin.tar.gz
Download location: Index of /hive/hive-4.0.0
2. Extract the package
Upload apache-hive-4.0.0-bin.tar.gz to /usr/local/soft/ on the Linux host, then run:
cd /usr/local/soft/
tar -zxvf apache-hive-4.0.0-bin.tar.gz
3. Edit the system environment variables
vim /etc/profile
Add the following lines:
export HIVE_HOME=/usr/local/soft/apache-hive-4.0.0-bin
export PATH=$PATH:$HADOOP_HOME/sbin:$HIVE_HOME/bin
Save the file, then reload it:
source /etc/profile
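To confirm that the reloaded profile took effect, a small check along the following lines can help. This is only a sketch: path_has_entry is a hypothetical helper name, and the fallback value of HIVE_HOME assumes the install path used in this guide.

```shell
# Sketch: confirm that a directory appears as an entry in a PATH-style string.
# path_has_entry is a hypothetical helper, not part of Hive or Hadoop.
path_has_entry() {
    # $1 = colon-separated path list, $2 = directory to look for
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: fall back to this guide's install path if HIVE_HOME is unset.
HIVE_HOME=${HIVE_HOME:-/usr/local/soft/apache-hive-4.0.0-bin}
if path_has_entry "$PATH" "$HIVE_HOME/bin"; then
    echo "OK: $HIVE_HOME/bin is on PATH"
else
    echo "MISSING: $HIVE_HOME/bin is not on PATH; re-run 'source /etc/profile'"
fi
```

Wrapping both strings in colons keeps a longer directory such as .../binx from matching by accident.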
4. Edit the Hive environment variables
cd /usr/local/soft/apache-hive-4.0.0-bin/bin/
Edit the hive-config.sh file:
vi hive-config.sh
Add the following lines:
export JAVA_HOME=/usr/local/soft/jdk1.8.0_381
export HIVE_HOME=/usr/local/soft/apache-hive-4.0.0-bin
export HADOOP_HOME=/usr/local/soft/hadoop-3.4.0
export HIVE_CONF_DIR=/usr/local/soft/apache-hive-4.0.0-bin/conf
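A mistyped path in these four variables usually only shows up later as a confusing startup error, so it may be worth checking them right away. The sketch below assumes the directories listed above; check_dirs is a hypothetical helper name.

```shell
# Sketch: report every argument that is not an existing directory.
# check_dirs is a hypothetical helper, not part of Hive.
check_dirs() {
    missing=0
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "found:   $dir"
        else
            echo "missing: $dir"
            missing=1
        fi
    done
    return "$missing"
}

# The four paths exported in hive-config.sh above.
check_dirs \
    /usr/local/soft/jdk1.8.0_381 \
    /usr/local/soft/apache-hive-4.0.0-bin \
    /usr/local/soft/hadoop-3.4.0 \
    /usr/local/soft/apache-hive-4.0.0-bin/conf \
    || echo "Fix the paths in hive-config.sh before continuing."
```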
5. Copy the Hive configuration template
cd /usr/local/soft/apache-hive-4.0.0-bin/conf/
cp hive-default.xml.template hive-site.xml
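6. Edit the Hive configuration file hive-site.xml: find the properties below and change their values.
Alternatively, replace the entire file content with the following: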
<configuration>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>Username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root123</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.1.5:3306/hive?useUnicode=true&amp;characterEncoding=utf8&amp;useSSL=false&amp;serverTimezone=GMT</value>
<description>
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
</description>
</property>
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
<description>Auto creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once. To enable auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use cases, run schematool command instead.</description>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
Enforce metastore schema version consistency.
True: Verify that version information stored in the metastore is compatible with one from Hive jars. Also disable automatic
schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
proper metastore schema migration. (Default)
False: Warn if the version information stored in metastore doesn't match with one from Hive jars.
</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/usr/local/soft/apache-hive-4.0.0-bin/tmp/${user.name}</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>system:java.io.tmpdir</name>
<value>/usr/local/soft/apache-hive-4.0.0-bin/iotmp</value>
<description/>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/usr/local/soft/apache-hive-4.0.0-bin/tmp/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/usr/local/soft/apache-hive-4.0.0-bin/tmp/${system:user.name}</value>
<description>Location of Hive run time structured log file</description>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/usr/local/soft/apache-hive-4.0.0-bin/tmp/${system:user.name}/operation_logs</value>
<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
<property>
<name>hive.metastore.db.type</name>
<value>mysql</value>
<description>
Expects one of [derby, oracle, mysql, mssql, postgres].
Type of database used by the metastore. Information schema &amp; JDBCStorageHandler depend on it.
</description>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
<description>Whether to include the current database in the Hive prompt.</description>
</property>
<property>
<name>hive.cli.print.header</name>
<value>true</value>
<description>Whether to print the names of the columns in query output.</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://192.168.1.11:9083</value>
</property>
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>node11</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
</configuration>
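JDBC URLs with several parameters contain the & character, which has to be written as &amp; inside an XML file or the configuration will not parse. The sketch below scans a file for raw ampersands. find_raw_ampersands is a hypothetical helper name, and the file path assumes the layout used in this guide.

```shell
# Sketch: list lines of an XML file that contain '&' outside a well-formed
# entity reference (&amp; &lt; &gt; &quot; &apos; &#NN; &#xHH;).
# find_raw_ampersands is a hypothetical helper; it returns 0 if any such line exists.
find_raw_ampersands() {
    awk '{
        s = $0
        gsub(/&(amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);/, "", s)
        if (index(s, "&")) { print FILENAME ":" NR ": " $0; found = 1 }
    }
    END { exit (found ? 0 : 1) }' "$1"
}

# Assumption: hive-site.xml lives in the conf directory used in this guide.
conf=/usr/local/soft/apache-hive-4.0.0-bin/conf/hive-site.xml
if [ -f "$conf" ]; then
    if find_raw_ampersands "$conf"; then
        echo "Escape the '&' characters on the lines above as '&amp;'."
    else
        echo "No raw '&' found in $conf"
    fi
fi
```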
7. Upload the MySQL driver JAR to /usr/local/soft/apache-hive-4.0.0-bin/lib/
Driver package: mysql-connector-java-8.0.15.zip. Extract it and take the JAR file from inside.
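The extract-and-copy step can be sketched as follows. install_mysql_driver is a hypothetical helper name, and the example paths in the comments assume the zip was uploaded to /usr/local/soft and extracts into a directory named after the archive.

```shell
# Sketch: locate the connector JAR under an extracted directory and copy it
# into Hive's lib directory. install_mysql_driver is a hypothetical helper.
# $1 = directory where the zip was extracted, $2 = Hive lib directory
install_mysql_driver() {
    jar=$(find "$1" -type f -name 'mysql-connector-java-*.jar' | head -n 1)
    if [ -z "$jar" ]; then
        echo "no mysql-connector-java JAR found under $1" >&2
        return 1
    fi
    cp "$jar" "$2"/ && echo "copied $(basename "$jar") to $2"
}

# Typical use on the Hive host (paths are assumptions, adjust as needed):
#   cd /usr/local/soft && unzip mysql-connector-java-8.0.15.zip
#   install_mysql_driver /usr/local/soft/mysql-connector-java-8.0.15 \
#       /usr/local/soft/apache-hive-4.0.0-bin/lib
```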
8. Make sure MySQL contains a database named hive whose character set is latin1; otherwise dropping Hive tables can hang.
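One way to create such a database is to generate the statement and pipe it into the mysql client. This is a sketch only: metastore_db_sql is a hypothetical helper name, and the host and user in the comment are taken from the JDBC settings in hive-site.xml.

```shell
# Sketch: print the SQL that creates a latin1 metastore database.
# metastore_db_sql is a hypothetical helper, not a Hive or MySQL tool.
metastore_db_sql() {
    # $1 = database name (defaults to "hive", the name used in the JDBC URL)
    printf 'CREATE DATABASE IF NOT EXISTS `%s` DEFAULT CHARACTER SET latin1;\n' "${1:-hive}"
}

# Print the statement for review.
metastore_db_sql hive

# To apply it, pipe the statement into the mysql client, for example:
#   metastore_db_sql hive | mysql -h 192.168.1.5 -u root -p
```

If the database already exists with a different character set, IF NOT EXISTS leaves it unchanged, so check the existing database separately in that case.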
9. Initialize the metastore database
schematool -dbType mysql -initSchema
10. Make sure Hadoop is running
Run the following command on node11:
start-all.sh
11. Start the services
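Before connecting to Hive with the Beeline command line, make sure the following services are started and configured:
Service | Description | Command
Hadoop | Hive depends on Hadoop, so Hadoop must be running and its configuration files must be correct. | start-all.sh
Hive Metastore | Hive's metadata service. It must be running, and its address must match hive.metastore.uris in hive-site.xml. | hive --service metastore
HiveServer2 | Hive's query service. It must be running, and the JDBC URL used by Beeline must point to its host and port. | hive --service hiveserver2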
Start the metastore service:
hive --service metastore
Or run it in the background, so that no separate shell is needed afterwards:
hive --service metastore 2>&1 &
Start the HiveServer2 service (in a separate shell window):
hiveserver2
Or:
hive --service hiveserver2
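Both services can take a while to start listening, so a connection attempt made too early may be refused. The sketch below polls a TCP port until it accepts connections. wait_for_port is a hypothetical helper name, it assumes bash is installed (it uses bash's /dev/tcp pseudo-device), and the hosts and ports in the comment come from hive-site.xml.

```shell
# Sketch: poll a TCP port until it accepts a connection.
# wait_for_port is a hypothetical helper; it needs bash for /dev/tcp.
# $1 = host, $2 = port, $3 = max attempts (default 30), $4 = seconds between attempts (default 2)
wait_for_port() {
    attempts=${3:-30}
    delay=${4:-2}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Inside bash -c, $0 is the host and $1 is the port.
        if bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null; then
            echo "$1:$2 is accepting connections"
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    echo "$1:$2 did not open after $attempts attempts" >&2
    return 1
}

# Example, using the metastore and HiveServer2 addresses from hive-site.xml:
#   wait_for_port 192.168.1.11 9083 && wait_for_port node11 10000
```

An open port only shows that the process is listening; a failed query can still point to a configuration problem.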
12. Start the Beeline client
Run hive or beeline in a new window,
then enter:
!connect jdbc:hive2://node11:10000
Or start the client and connect in a single command:
beeline -u jdbc:hive2://node11:10000 -n root
FAQ
Exception in thread "main" MetaException(message:JDOFatalInternalException: Index/candidate part #0 for CTLGS already set
Root cause: org.datanucleus.exceptions.NucleusException: Index/candidate part #0 for CTLGS already set)
Solution:
cp /usr/local/soft/hadoop-3.4.0/share/hadoop/hdfs/lib/guava-27.0-jre.jar /usr/local/soft/apache-hive-4.0.0-bin/lib/
Or check whether Hadoop's core-site.xml contains the following:
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
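A quick way to check for these two entries is sketched below. has_proxyuser is a hypothetical helper name, and the core-site.xml path assumes the standard etc/hadoop layout under the Hadoop directory used in this guide.

```shell
# Sketch: check that core-site.xml names both hadoop.proxyuser.<user>.hosts
# and hadoop.proxyuser.<user>.groups. has_proxyuser is a hypothetical helper.
# $1 = path to core-site.xml, $2 = user that runs HiveServer2
has_proxyuser() {
    grep -q "<name>hadoop\.proxyuser\.$2\.hosts</name>" "$1" &&
        grep -q "<name>hadoop\.proxyuser\.$2\.groups</name>" "$1"
}

# Assumption: standard Hadoop 3 configuration directory.
core_site=/usr/local/soft/hadoop-3.4.0/etc/hadoop/core-site.xml
if [ -f "$core_site" ] && has_proxyuser "$core_site" root; then
    echo "proxyuser entries for root are present"
else
    echo "proxyuser entries for root are missing (or $core_site was not found)"
fi
```

The check only looks at property names, not values. After editing core-site.xml, Hadoop has to reload the settings, either by a restart or with hdfs dfsadmin -refreshSuperUserGroupsConfiguration and yarn rmadmin -refreshSuperUserGroupsConfiguration.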
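Beeline fails to start
Solution: HiveServer2 may need some time after startup before it accepts connections, so try connecting a few more times.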
Copyright belongs to the original author, 数智侠. If there is any infringement, please contact us for removal.