Hive SQL常用函数

一、日期函数

1、将时间戳转化为日期

from_unixtime(bigint unixtime,string format)

举例：from_unixtime(1237573801,‘yyyy-MM-dd HH:mm:ss’)

常用string format 格式

‘yyyy-MM-dd HH:mm:ss’ 年月日时分秒格式

‘yyyy-MM-dd HH:mm’ 年月日时分格式

‘yyyy-MM-dd HH’ 年月日时格式

‘yyyy-MM-dd’ 年月日格式

2、将日期转化为时间戳

unix_timestamp(stringdate)

举例：unix_timestamp(‘2020-01-01 06:06:00’,‘yyyy-MM-dd HH:mm:ss’)

3、日期比较函数

datediff(string enddate,string startdate) 日期间隔天数，结束日期和开始日期间隔天数

举例：datediff(‘2020-02-02’,‘2020-02-01’)

结果：1

4、日期增加函数

date_add(string startdate,int days) 返回开始日期startdate增加days天后的日期。

举例：date_add(‘2020-02-12’,10)

结果：2020-02-22

5、日期减少函数

date_sub(string startdate,int days) 返回开始日期startdate减少days天后的日期。

举例：date_sub(‘2020-02-12’,10)

结果：2020-02-02

6、返回日期时间字段中的日期部分

to_date(‘yyyy-MM-dd HH:mm:ss’)

二、条件函数

1、if函数

if(条件表达式，结果1，结果2) 当条件为true时，返回结果1，否则结果2

举例：if(2 > 1,‘是’,‘否’)

结果：是

2、条件判断函数case when

case 表达式 when 条件1 then 结果1 when 条件2 then 结果2 else 结果3 end as 字段名

举例：

3、非空查找函数

coalesce(T v1,T v2) 返回参数中的第一个非空值；如果所有值都为NULL，那么返回NULL

举例：coalesce(a,b,c) 如果a为null，则选择b，如果b为null，则选择c;

如果a不为null，则选择a;

如果a b c都均为null，则返回null。

三、窗口函数

1、累计计算窗口函数

sum(A) over (partition by B order by C rows between D1 and D2)

avg(A) over (partition by B order by C rows between D1 and D2)

rows between unbounded preceding and current row 包括本行和之前所有的行

rows between current row preceding and unbounded following 包括本行和之后所有的行

rows between 3 preceding and current row 包括本行以内和前三行

rows between 3 preceding and 1 following 从前三行到下一行（5行）

2、分区排序窗口函数

row_number() over (partition by A order by B)

依次排序，序号不会重复

rank() over (partition by A order by B)

字段值相同，序号相同，跳跃排序，如果有两个第一名，接下来是第三名。

dense_rank() over (partition by A order by B)

字段值相同，序号相同，连续排序，如果有两个第一名，接下来是第二名。

3、分组排序窗口函数

ntile(n) over (partition by A order by B)

n为要分组的数量

4、偏移分析窗口函数

lag(exp_str,offset,defval) over (partition by A order by B)

同一次查询中取出同一字段的前N行数据

lead(exp_str,offset,defval) over (partition by A order by B)

同一次查询中取出同一字段的后N行数据

exp_str：字段名称

offset：偏移量，即上一个或者上N个的值，以lag函数为列子，假设当前行再表中排在第5行，则offset为3，则表示我们要找的数据行就是表中的第二行（即5-3=2），offset默认值为1。

defval：当两个函数取上N/下N个值，当在表中从当前行位置向前数N行已经超出了表的范围时，函数会将defval这个参数作为函数的返回值，若没有指定默认值，则返回null。

四、聚合函数

sum(): 求和

avg(): 平均值

max(): 最大值

min(): 最小值

count(): 计数

count(distinct …) 去重计数

五、hive解析json字符串

get_json_object(string json_string,string path)

string json_string：填写json对象变量

sting path：第二个参数使用
      表示 
     
    
      j 
     
    
      s 
     
    
      o 
     
    
      n 
     
    
      变量标识，通常为 
     
    
   
     表示json变量标识，通常为 
    
   
 表示json变量标识，通常为.key形式
举例：假设fruit为fruits表中的字段，其结构为fruit:{“type”:“apple”,“weight”:2,“price”:8,“color”:“red”} select get_json_object(fruit,‘$.type’)

from fruits

结果：apple

六、截取字符串函数

substr(string,int start,int len)

举例：substr(‘2020-01-01’,1,7)

t":2,“price”:8,“color”:“red”} select get_json_object(fruit,‘$.type’)

from fruits

结果：apple

六、截取字符串函数

substr(string,int start,int len)

举例：substr(‘2020-01-01’,1,7)

结果：2020-01

标签： hive sql

本文转载自: https://blog.csdn.net/m0_58725148/article/details/127594109
版权归原作者 _自有天意 所有，如有侵权，请联系我们删除。

Hive SQL常用函数

一、日期函数

二、条件函数

三、窗口函数

四、聚合函数

五、hive解析json字符串

六、截取字符串函数

六、截取字符串函数

发表评论

“Hive SQL常用函数”的评论:

关于作者

overfit同步小助手

相关阅读

文章导航