[Educoder (头歌)] Hive Basic Query Operations (II): Answers

This column collects all of the Educoder big data answers for reference.

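Hive Sorting Example

Hive is a widely used tool in big data processing that lets users query and analyze data with a SQL-like language. The example below shows how to sort data in Hive: it creates a database and a table, loads data, and then sorts the data for a specific date by a chosen field.

Level 1: Hive Sorting

Copy the answer and click Evaluate.

----------do not modify----------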
create database if not exists mydb;
use mydb;
create table if not exists total(
tradedate string,
tradetime string,
securityid string,
bidpx1 string,
bidsize1 int,
offerpx1 string,
bidsize2 int)
row format delimited fields terminated by ','
stored as textfile;
truncate table total;
load data local inpath '/root/files' into table total;
----------do not modify----------

----------begin----------
select securityid,sum(bidsize1) s from total where tradedate ='20130722' group by securityid order by s desc limit 3;
----------end----------

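In this example, we select the rows for the date 20130722, sum bidsize1 for each securityid, sort the totals in descending order, and keep the top three records.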
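To see how the group-then-sort pattern behaves, the sketch below runs the same query shape on a tiny scratch table. The table total_demo and its rows are hypothetical, made up only for illustration, and the `insert ... values` form assumes Hive 0.14 or later.

```sql
-- Hypothetical scratch table; not part of the exercise.
create table if not exists total_demo (
  tradedate  string,
  securityid string,
  bidsize1   int
);

-- Made-up rows: four securities on 20130722, one row on another date.
insert into table total_demo values
  ('20130722', '000001', 100),
  ('20130722', '000001', 300),
  ('20130722', '000002', 250),
  ('20130722', '000003', 50),
  ('20130722', '000004', 500),
  ('20130723', '000003', 900);

-- Same shape as the answer: filter by date, total per security,
-- sort the totals in descending order, keep the top three.
select securityid, sum(bidsize1) as s
from total_demo
where tradedate = '20130722'
group by securityid
order by s desc
limit 3;
-- Per-security totals on 20130722: 000004 = 500, 000001 = 400,
-- 000002 = 250, 000003 = 50, so the top three are
-- 000004, 000001, 000002 (the 20130723 row is filtered out).

-- Clean up the scratch table.
drop table if exists total_demo;
```

Note that `order by` in Hive produces a single global ordering, which is what a top-N query like this one needs.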

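Hive Type Conversion Example

When working with data of different types, converting types correctly is essential for accurate calculations. The example below shows how to convert a data type in Hive and run an aggregation over the data for a specific date.

Level 2: Hive Data Types and Type Conversion

Copy the answer and click Evaluate.

----------do not modify----------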
create database if not exists mydb;
use mydb;
create table if not exists total(
tradedate string,
tradetime string,
securityid string,
bidpx1 string,
bidsize1 int,
offerpx1 string,
bidsize2 int)
row format delimited fields terminated by ','
stored as textfile;
truncate table total;
load data local inpath '/root/files' into table total;
----------do not modify----------

----------begin----------
select securityid,sum(bidsize1*cast(bidpx1 as float)) from total where tradedate='20130725' group by securityid;
----------end----------

In this example, we cast bidpx1 from string to float, multiply it by bidsize1, and sum the products for each securityid on 20130725.
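The cast step can be explored on its own. The statements below are a minimal sketch, assuming a Hive version that allows `select` without a `from` clause (0.13 or later); the alias amount is a name introduced here for illustration.

```sql
-- A numeric string casts cleanly and can then be used in arithmetic.
select cast('12.5' as float) * 4;      -- 12.5 * 4 = 50

-- A string that is not a number does not raise an error;
-- the cast evaluates to NULL.
select cast('abc' as float);           -- NULL

-- Variation on the answer using double instead of float.
-- sum() skips NULLs, so rows whose bidpx1 cannot be parsed
-- simply do not contribute to the total.
select securityid,
       sum(bidsize1 * cast(bidpx1 as double)) as amount
from total
where tradedate = '20130725'
group by securityid;
```

The double variation is only an illustration of a wider type; its output may differ in the last digits from the float version, so it may not match what the exercise's evaluator expects.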

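Hive Sampling Query Example

When working with large datasets, sampling is an effective way to get an overview of the data. The example below shows how to create a bucketed table in Hive and draw a sample from it for analysis.

Level 3: Hive Sampling Query

Copy the answer and click Evaluate.

----------do not modify----------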
create database if not exists mydb;
use mydb;
create table if not exists total(
tradedate string,
tradetime string,
securityid string,
bidpx1 string,
bidsize1 int,
offerpx1 string,
bidsize2 int)
row format delimited fields terminated by ','
stored as textfile;
truncate table total;
load data local inpath '/root/files' into table total;
drop table if exists total_bucket;
----------do not modify----------

----------begin----------
create table if not exists total_bucket(
tradedate string,
securityid string,
bidsize1 int,
bidsize2 int
)clustered by(securityid) into 6 buckets
row format delimited fields terminated by ','
stored as textfile;
set hive.enforce.bucketing = true;
insert overwrite table total_bucket
select tradedate,securityid,bidsize1,bidsize2
from total;

select tradedate,securityid,sum(bidsize1+bidsize2) 
from total_bucket tablesample(bucket 2 out of 2 on securityid) 
group by tradedate,securityid;
----------end----------

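In this example, we create a bucketed table, total_bucket, clustered by securityid into 6 buckets, and insert the data into it. The TABLESAMPLE clause then draws the sample: bucket 2 out of 2 divides the rows into two groups by the hash of securityid and reads the second group, so it covers roughly half of the table (three of the six buckets) rather than a single bucket.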
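One way to check how much data a given TABLESAMPLE clause reads is to compare row counts. The queries below are a sketch that assumes total_bucket has already been created and populated as in the answer above; the exact proportions depend on how the securityid values hash, so the fractions are approximate.

```sql
use mydb;

-- "bucket 1 out of 6" matches the table's 6 buckets,
-- so it reads a single bucket: roughly 1/6 of the rows.
select count(*)
from total_bucket tablesample(bucket 1 out of 6 on securityid);

-- The form used in the answer: the rows are split into 2 groups
-- by the hash of securityid and the second group is read,
-- roughly 1/2 of the rows.
select count(*)
from total_bucket tablesample(bucket 2 out of 2 on securityid);

-- Full table, for comparison.
select count(*) from total_bucket;
```

Because the sample is taken on the same column the table is clustered by, Hive can limit the scan to the matching buckets instead of reading the whole table.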


Reposted from: https://blog.csdn.net/gjw3037109961/article/details/140727452
Copyright belongs to the original author, Seven_Two2. If this infringes your rights, please contact us for removal.
