

HBase Query Commands: Filtering by Condition


To make testing easier, first create a table

hbase(main):001:0> create 'student','c1'

If no namespace is given, the table is created in the default namespace

List the existing namespaces

hbase(main):001:0> list_namespace

View all data in the table

hbase(main):002:0> scan 'default:student'

Insert some test data

put 'student','1001','c1:id','1001'
put 'student','1002','c1:id','1002'
put 'student','1003','c1:id','1003'
put 'student','1004','c1:id','1004'
put 'student','1005','c1:id','1005'

Query only one row

hbase(main):025:0> scan 'student',LIMIT=>1
ROW                        COLUMN+CELL
 1001                      column=c1:id, timestamp=1658911986336, value=1001

Count the total number of rows in the table

count 'student'

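Query data by write timestamp (TIMERANGE takes epoch milliseconds; the lower bound is inclusive and the upper bound is exclusive)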

scan 'student', {COLUMNS => 'c1', TIMERANGE => [1658827317000, 1658913717000]}
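
The TIMERANGE bounds are easy to get wrong by hand. As a minimal sketch, the Python helper below converts timezone-aware datetimes to epoch milliseconds and prints the matching shell command. The names `to_millis` and `timerange_scan` are made up for this example and are not part of HBase; the sketch also assumes the cell timestamps are the default millisecond timestamps assigned at write time.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(dt: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds (integer math)."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    return (dt - EPOCH) // timedelta(milliseconds=1)


def timerange_scan(table: str, family: str, start: datetime, end: datetime) -> str:
    """Build an HBase shell scan command string (hypothetical helper)."""
    lo, hi = to_millis(start), to_millis(end)
    if lo >= hi:
        raise ValueError("start must be earlier than end")
    # COLUMNS is the option name the shell's scan command documents for column selection.
    return f"scan '{table}', {{COLUMNS => '{family}', TIMERANGE => [{lo}, {hi}]}}"


start = datetime(2022, 7, 26, 9, 21, 57, tzinfo=timezone.utc)
end = start + timedelta(days=1)
print(timerange_scan("student", "c1", start, end))
```

The two bounds produced here are the same one-day window used in the command above, expressed in UTC.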

Query records whose value is 1002

hbase(main):004:0> scan 'student',FILTER=>"ValueFilter(=,'binary:1002')"
ROW                        COLUMN+CELL
 1002                      column=c1:id, timestamp=1658911989184, value=1002
1 row(s) in 0.1060 seconds

Query rows where the c1:id column has the value 1002

hbase(main):006:0> scan 'student',COLUMNS=>'c1:id',FILTER=>"ValueFilter(=,'binary:1002')"
ROW                        COLUMN+CELL
 1002                      column=c1:id, timestamp=1658911989184, value=1002
1 row(s) in 0.0340 seconds

Query records whose value contains 100, much like a fuzzy match (LIKE '%100%') in SQL

hbase(main):007:0> scan 'student',FILTER=>"ValueFilter(=,'substring:100')"
ROW                        COLUMN+CELL
 1001                      column=c1:id, timestamp=1658911986336, value=1001
 1002                      column=c1:id, timestamp=1658911989184, value=1002
 1003                      column=c1:id, timestamp=1658911989217, value=1003
 1004                      column=c1:id, timestamp=1658911989243, value=1004
 1005                      column=c1:id, timestamp=1658911989788, value=1005
5 row(s) in 0.0470 seconds
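
The prefix before the colon in the filter string selects the comparator. The sketch below is a plain-Python model of the two comparators used so far, not HBase code; `binary_equals` and `substring_matches` are illustrative names. It assumes, as HBase documents for its SubstringComparator, that substring matching is case-insensitive.

```python
def binary_equals(cell_value: bytes, operand: bytes) -> bool:
    """Model of '=' with a 'binary:' comparator: exact byte-for-byte match."""
    return cell_value == operand


def substring_matches(cell_value: bytes, operand: str) -> bool:
    """Model of '=' with a 'substring:' comparator: case-insensitive containment."""
    text = cell_value.decode("utf-8", errors="replace")
    return operand.lower() in text.lower()


values = [b"1001", b"1002", b"1003", b"1004", b"1005"]

exact = [v for v in values if binary_equals(v, b"1002")]
fuzzy = [v for v in values if substring_matches(v, "100")]

print(exact)  # only the value 1002
print(fuzzy)  # all five values contain "100"
```

Unlike SQL LIKE, the substring comparator has no wildcards. For pattern matching, HBase also provides a `regexstring:` comparator.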

To support the column-based queries that follow, add one more column

put 'student','1001','c1:sex','1'
put 'student','1002','c1:sex','2'
put 'student','1003','c1:sex','1'
put 'student','1004','c1:sex','2'
put 'student','1005','c1:sex','1'

hbase(main):015:0* scan 'student'
ROW                        COLUMN+CELL
 1001                      column=c1:id, timestamp=1658911986336, value=1001
 1001                      column=c1:sex, timestamp=1658914149713, value=1
 1002                      column=c1:id, timestamp=1658911989184, value=1002
 1002                      column=c1:sex, timestamp=1658914152500, value=2
 1003                      column=c1:id, timestamp=1658911989217, value=1003
 1003                      column=c1:sex, timestamp=1658914152535, value=1
 1004                      column=c1:id, timestamp=1658911989243, value=1004
 1004                      column=c1:sex, timestamp=1658914152563, value=2
 1005                      column=c1:id, timestamp=1658911989788, value=1005
 1005                      column=c1:sex, timestamp=1658914153242, value=1
5 row(s) in 0.0390 seconds

Query cells whose column qualifier starts with id

hbase(main):019:0> scan 'student',FILTER=>"ColumnPrefixFilter('id')"
ROW                        COLUMN+CELL
 1001                      column=c1:id, timestamp=1658911986336, value=1001
 1002                      column=c1:id, timestamp=1658911989184, value=1002
 1003                      column=c1:id, timestamp=1658911989217, value=1003
 1004                      column=c1:id, timestamp=1658911989243, value=1004
 1005                      column=c1:id, timestamp=1658911989788, value=1005
5 row(s) in 0.0270 seconds

Query conditions can be combined, as in the following example

Query cells whose column qualifier starts with id and whose value is 1003

hbase(main):020:0> scan 'student',FILTER=>"ColumnPrefixFilter('id') AND ValueFilter(=,'binary:1003')"
ROW                        COLUMN+CELL
 1003                      column=c1:id, timestamp=1658911989217, value=1003
1 row(s) in 0.0550 seconds
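
Filters such as ColumnPrefixFilter and ValueFilter are evaluated cell by cell, and AND keeps only the cells that pass both. The following is a toy Python model of that behaviour over the sample data. The `Cell` tuple and the helper names are invented for illustration, and nothing here calls HBase.

```python
from typing import Callable, List, NamedTuple


class Cell(NamedTuple):
    row: str
    qualifier: str
    value: str


CellFilter = Callable[[Cell], bool]


def column_prefix(prefix: str) -> CellFilter:
    """Model of ColumnPrefixFilter: keep cells whose qualifier starts with prefix."""
    return lambda cell: cell.qualifier.startswith(prefix)


def value_equals(expected: str) -> CellFilter:
    """Model of ValueFilter(=,'binary:...'): keep cells with exactly this value."""
    return lambda cell: cell.value == expected


def both(left: CellFilter, right: CellFilter) -> CellFilter:
    """Model of AND: a cell must pass both filters."""
    return lambda cell: left(cell) and right(cell)


# The same ten cells as the student table (family c1 omitted for brevity).
sexes = {"1001": "1", "1002": "2", "1003": "1", "1004": "2", "1005": "1"}
cells: List[Cell] = []
for row, sex in sexes.items():
    cells.append(Cell(row, "id", row))
    cells.append(Cell(row, "sex", sex))

combined = both(column_prefix("id"), value_equals("1003"))
matches = [cell for cell in cells if combined(cell)]
print(matches)
```

In this model only the id cell of row 1003 survives, which mirrors the single-row result of the combined filter.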

Query rows whose rowkey starts with 100

hbase(main):021:0> scan 'student',FILTER=>"PrefixFilter('100')"
ROW                        COLUMN+CELL
 1001                      column=c1:id, timestamp=1658911986336, value=1001
 1001                      column=c1:sex, timestamp=1658914149713, value=1
 1002                      column=c1:id, timestamp=1658911989184, value=1002
 1002                      column=c1:sex, timestamp=1658914152500, value=2
 1003                      column=c1:id, timestamp=1658911989217, value=1003
 1003                      column=c1:sex, timestamp=1658914152535, value=1
 1004                      column=c1:id, timestamp=1658911989243, value=1004
 1004                      column=c1:sex, timestamp=1658914152563, value=2
 1005                      column=c1:id, timestamp=1658911989788, value=1005
 1005                      column=c1:sex, timestamp=1658914153242, value=1

Query rows whose rowkey starts with 100, returning only the keys without the cell values

hbase(main):022:0> scan 'student',FILTER=>"PrefixFilter('100') AND KeyOnlyFilter()"
ROW                        COLUMN+CELL
 1001                      column=c1:id, timestamp=1658911986336, value=
 1001                      column=c1:sex, timestamp=1658914149713, value=
 1002                      column=c1:id, timestamp=1658911989184, value=
 1002                      column=c1:sex, timestamp=1658914152500, value=
 1003                      column=c1:id, timestamp=1658911989217, value=
 1003                      column=c1:sex, timestamp=1658914152535, value=
 1004                      column=c1:id, timestamp=1658911989243, value=
 1004                      column=c1:sex, timestamp=1658914152563, value=
 1005                      column=c1:id, timestamp=1658911989788, value=
 1005                      column=c1:sex, timestamp=1658914153242, value=
5 row(s) in 0.0670 seconds

Query three rows starting from a specific row

hbase(main):006:0> scan 'student',{STARTROW=>'1002',LIMIT=>3}
ROW                        COLUMN+CELL
 1002                      column=c1:id, timestamp=1658911989184, value=1002
 1002                      column=c1:sex, timestamp=1658914152500, value=2
 1003                      column=c1:id, timestamp=1658911989217, value=1003
 1003                      column=c1:sex, timestamp=1658914152535, value=1
 1004                      column=c1:id, timestamp=1658911989243, value=1004
 1004                      column=c1:sex, timestamp=1658914152563, value=2
3 row(s) in 0.0300 seconds
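
Rows are stored sorted by rowkey in byte order, so STARTROW acts as a seek position rather than a match condition, and LIMIT counts rows from there. A small Python model of that behaviour follows. The `scan` function is a stand-in written for this example, and the extra rowkeys `20` and `3` are made up to show the ordering.

```python
from bisect import bisect_left
from typing import List, Optional


def scan(sorted_keys: List[bytes], start_row: bytes = b"",
         limit: Optional[int] = None) -> List[bytes]:
    """Return keys >= start_row in byte order, at most `limit` of them."""
    begin = bisect_left(sorted_keys, start_row)
    end = len(sorted_keys) if limit is None else begin + limit
    return sorted_keys[begin:end]


# Byte-wise ordering: "20" and "3" sort after every key that starts with "1".
keys = sorted([b"3", b"1003", b"20", b"1001", b"1005", b"1002", b"1004"])
print(keys)

# Start at 1002 (inclusive) and take three rows.
print(scan(keys, start_row=b"1002", limit=3))

# The start row does not have to exist; the scan begins at the next key.
print(scan(keys, start_row=b"10025", limit=1))
```

Because the order is byte-wise rather than numeric, numeric rowkeys are often zero-padded to a fixed width so that the stored order matches the numeric order.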

Get a specific row

hbase(main):007:0> get 'student','1001'
COLUMN                     CELL
 c1:id                     timestamp=1658911986336, value=1001
 c1:sex                    timestamp=1658914149713, value=1
2 row(s) in 0.0170 seconds

By default a scan returns rows in ascending rowkey order; for descending order, use REVERSED => true

scan 'student', {REVERSED => true, LIMIT => 1}

The commands above cover most everyday query needs

Tags: hbase, big data

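This article is reposted from: https://blog.csdn.net/qq_38151907/article/details/126017298
Copyright belongs to the original author, MIDSUMMER_yy. If there is any infringement, please contact us for removal.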
