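Lab 3: Getting Familiar with Common HBase Operations
I. Objectives
(1) Understand the role HBase plays in the Hadoop architecture;
(2) Become proficient with the common HBase Shell commands;
(3) Become familiar with the common HBase Java API.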
II. Platform
Operating system: CentOS 7;
Hadoop version: 3.3;
HBase version: 2.2.2;
JDK version: 1.8;
Java IDE: IDEA.
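III. Content and Requirements
(I) Implement the following functions in code, and complete the same tasks with the HBase Shell commands:
(1) List information about all HBase tables, such as table name and creation time;
(2) Print all records of a specified table to the terminal;
(3) Add and delete a specified column family or column on an existing table;
(4) Clear all records from a specified table;
(5) Count the rows in a table.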
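(II) HBase database operations
1 Given the following tables and data from a relational database, convert them into tables suited to HBase storage and insert the data:
Student table (Student)
Student ID (S_No)   Name (S_Name)   Sex (S_Sex)   Age (S_Age)
2015001             Zhangsan        male          23
2015002             Mary            female        22
2015003             Lisi            male          24
Course table (Course)
Course ID (C_No)   Course name (C_Name)   Credits (C_Credit)
123001             Math                   2.0
123002             Computer               5.0
123003             English                3.0
Course selection table (SC)
Student ID (SC_Sno)   Course ID (SC_Cno)   Score (SC_Score)
2015001               123001               86
2015001               123003               69
2015002               123002               77
2015002               123003               99
2015003               123001               98
2015003               123002               95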
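The source does not show how the three tables were mapped, so the following is only a sketch of one possible layout, not the author's solution. It models HBase's logical structure in memory (row key, then "family:qualifier", then value) without calling the HBase API. The class name LogicalTable, the column family name info, and the composite SC row key joined with "_" are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** In-memory model of a table's logical layout: row key -> "family:qualifier" -> value. */
final class LogicalTable {
    // HBase keeps rows ordered by row key; a TreeMap mimics that for these ASCII keys.
    private final TreeMap<String, TreeMap<String, String>> rows = new TreeMap<>();

    void put(String rowKey, String family, String qualifier, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .put(family + ":" + qualifier, value);
    }

    /** Returns the cell value, or null when the row or the cell does not exist. */
    String get(String rowKey, String family, String qualifier) {
        Map<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(family + ":" + qualifier);
    }

    String firstRowKey() {
        return rows.firstKey();
    }

    int rowCount() {
        return rows.size();
    }

    public static void main(String[] args) {
        // Student: row key = S_No, one hypothetical family "info" holding the other columns.
        LogicalTable student = new LogicalTable();
        student.put("2015001", "info", "S_Name", "Zhangsan");
        student.put("2015001", "info", "S_Sex", "male");
        student.put("2015001", "info", "S_Age", "23");

        // SC: hypothetical composite row key SC_Sno + "_" + SC_Cno, so that
        // all courses of one student sort next to each other.
        LogicalTable sc = new LogicalTable();
        sc.put("2015001_123001", "info", "SC_Score", "86");
        sc.put("2015001_123003", "info", "SC_Score", "69");

        System.out.println(student.get("2015001", "info", "S_Name"));
        System.out.println(sc.get("2015001_123003", "info", "SC_Score"));
    }
}
```

With a layout like this, the relational primary key becomes the row key and every remaining column becomes a qualifier under a column family; other designs, such as one column family per course, are equally possible.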
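2 Implement the following functions in code:
(1) createTable(String tableName, String[] fields)
Create a table. The parameter tableName is the name of the table, and the string array fields holds the names of the record's fields. If a table named tableName already exists in HBase, delete the existing table first and then create the new one.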
package Main;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class main {
    public static Configuration configuration;
    public static Connection connection;
    public static Admin admin;

    // Open the connection
    public static void init() {
        configuration = HBaseConfiguration.create();
        configuration.set("hbase.rootdir", "hdfs://127.0.0.1:8020/hbase");
        try {
            connection = ConnectionFactory.createConnection(configuration);
            admin = connection.getAdmin();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Close the connection
    public static void close() {
        try {
            if (admin != null) {
                admin.close();
            }
            if (connection != null) {
                connection.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void createTable(String tableName, String[] fields) throws IOException {
        init();
        try {
            TableName tablename = TableName.valueOf(tableName); // table name
            if (admin.tableExists(tablename)) {
                System.out.println("Table already exists; deleting it first.");
                // A table must be disabled before it can be deleted
                if (admin.isTableEnabled(tablename)) {
                    admin.disableTable(tablename);
                }
                admin.deleteTable(tablename);
            }
            TableDescriptorBuilder tableDescriptor = TableDescriptorBuilder.newBuilder(tablename);
            // Each entry of fields becomes one column family
            for (int i = 0; i < fields.length; i++) {
                ColumnFamilyDescriptor family =
                        ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes(fields[i])).build();
                tableDescriptor.setColumnFamily(family);
            }
            admin.createTable(tableDescriptor.build());
        } finally {
            close();
        }
    }

    public static void main(String[] args) {
        String[] fields = {"id", "score"};
        try {
            createTable("test", fields);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Run result: [screenshot]
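(2) addRecord(String tableName, String row, String[] fields, String[] values)
Add the data in values to the cells of table tableName, row row (identified by S_Name), that are named by the string array fields. If an element of fields has a column qualifier under its column family, write it as "columnFamily:column". For example, to add scores to the three columns "Math", "Computer Science", and "English" at once, fields is {"Score:Math", "Score:Computer Science", "Score:English"}, and values holds the scores for those three courses.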
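The "columnFamily:column" naming convention can be looked at in isolation before it is wired into HBase. The following is a minimal sketch in plain Java; the class name FieldSpec and its parse method are hypothetical helpers introduced here, not part of the HBase API or of the author's code.

```java
/** Splits an addRecord-style field name into column family and column qualifier. */
final class FieldSpec {
    final String family;
    final String qualifier;

    private FieldSpec(String family, String qualifier) {
        this.family = family;
        this.qualifier = qualifier;
    }

    /**
     * "Score:Math" gives family "Score" and qualifier "Math";
     * "Score" alone gives family "Score" and an empty qualifier.
     */
    static FieldSpec parse(String field) {
        // A limit of 2 keeps any further ':' characters inside the qualifier.
        String[] parts = field.split(":", 2);
        return new FieldSpec(parts[0], parts.length == 2 ? parts[1] : "");
    }

    public static void main(String[] args) {
        String[] fields = {"Score:Math", "Score:Computer Science", "Score:English"};
        String[] values = {"85", "80", "90"};
        if (fields.length != values.length) {
            throw new IllegalArgumentException("fields and values must have the same length");
        }
        for (int i = 0; i < fields.length; i++) {
            FieldSpec spec = parse(fields[i]);
            System.out.println(spec.family + " / " + spec.qualifier + " = " + values[i]);
        }
    }
}
```

Each parsed pair corresponds to one addColumn(family, qualifier, value) call on a Put. Checking that fields and values have the same length up front avoids an index error halfway through building the Put.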
package Main;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class main {
    public static Configuration configuration;
    public static Connection connection;
    public static Admin admin;

    // Open the connection
    public static void init() {
        configuration = HBaseConfiguration.create();
        configuration.set("hbase.rootdir", "hdfs://127.0.0.1:8020/hbase");
        try {
            connection = ConnectionFactory.createConnection(configuration);
            admin = connection.getAdmin();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Close the connection
    public static void close() {
        try {
            if (admin != null) {
                admin.close();
            }
            if (connection != null) {
                connection.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void addRecord(String tableName, String row, String[] fields, String[] values)
            throws IOException {
        init(); // connect to HBase
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            Put put = new Put(Bytes.toBytes(row)); // one Put for the whole row
            for (int i = 0; i < fields.length; i++) {
                String[] cols = fields[i].split(":");
                if (cols.length == 1) {
                    // Column family only: use an empty qualifier
                    put.addColumn(Bytes.toBytes(fields[i]), Bytes.toBytes(""), Bytes.toBytes(values[i]));
                } else {
                    put.addColumn(Bytes.toBytes(cols[0]), Bytes.toBytes(cols[1]), Bytes.toBytes(values[i]));
                }
            }
            table.put(put); // write all cells to the table in one call
        } finally {
            close(); // close the connection
        }
    }

    public static void main(String[] args) {
        String[] fields = {"Score:Math", "Score:Computer Science", "Score:English"};
        String[] values = {"85", "80", "90"};
        try {
            addRecord("grade", "S_Name", fields, values);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
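(3) scanColumn(String tableName, String column)
Browse the data of one column of table tableName. If a row has no data in that column, return null. When the parameter column is the name of a column family that contains several column qualifiers, list the data of the column named by each qualifier; when column is the name of a specific column (for example "Score:Math"), only that column's data needs to be listed.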
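The two cases in this requirement (a whole column family versus a single "family:qualifier" column) and the null-for-missing rule can be sketched on an in-memory map before touching HBase. The class ScanColumnSketch, its sampleRows helper, and the sample scores below are hypothetical and exist only for illustration; the sketch does not use the HBase API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ScanColumnSketch {
    /**
     * rows maps row key -> ("family:qualifier" -> value).
     * Returns one entry per row: the matching cells, or null when the row has none.
     */
    static Map<String, List<String>> scanColumn(Map<String, Map<String, String>> rows, String column) {
        // No ':' means the caller asked for every qualifier under a column family.
        boolean wholeFamily = !column.contains(":");
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> row : rows.entrySet()) {
            List<String> hits = new ArrayList<>();
            for (Map.Entry<String, String> cell : row.getValue().entrySet()) {
                String name = cell.getKey();
                boolean match = wholeFamily ? name.startsWith(column + ":") : name.equals(column);
                if (match) {
                    hits.add(name + "=" + cell.getValue());
                }
            }
            out.put(row.getKey(), hits.isEmpty() ? null : hits);
        }
        return out;
    }

    /** Illustrative data only: two rows keyed by student name. */
    static Map<String, Map<String, String>> sampleRows() {
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        Map<String, String> zhangsan = new LinkedHashMap<>();
        zhangsan.put("Score:Math", "86");
        zhangsan.put("Score:English", "69");
        rows.put("Zhangsan", zhangsan);
        Map<String, String> mary = new LinkedHashMap<>();
        mary.put("Score:English", "99");
        rows.put("Mary", mary);
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(scanColumn(sampleRows(), "Score:Math"));
        System.out.println(scanColumn(sampleRows(), "Score"));
    }
}
```

In this model, a row that lacks the requested column still appears in the result with a null value, which is what the requirement asks for.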
package Main;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class main {
    public static Configuration configuration;
    public static Connection connection;
    public static Admin admin;

    // Open the connection
    public static void init() {
        configuration = HBaseConfiguration.create();
        configuration.set("hbase.rootdir", "hdfs://localhost:8020/hbase");
        try {
            connection = ConnectionFactory.createConnection(configuration);
            admin = connection.getAdmin();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Close the connection
    public static void close() {
        try {
            if (admin != null) {
                admin.close();
            }
            if (connection != null) {
                connection.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void showResult(Result result) {
        Cell[] cells = result.rawCells();
        for (int i = 0; i < cells.length; i++) {
            System.out.println("RowName:" + Bytes.toString(CellUtil.cloneRow(cells[i])));          // row key
            System.out.println("ColumnName:" + Bytes.toString(CellUtil.cloneQualifier(cells[i]))); // column qualifier
            System.out.println("Value:" + Bytes.toString(CellUtil.cloneValue(cells[i])));          // value
            System.out.println("Column Family:" + Bytes.toString(CellUtil.cloneFamily(cells[i]))); // column family
            System.out.println();
        }
    }

    public static void scanColumn(String tableName, String column) {
        init();
        Scan scan = new Scan();
        String[] cols = column.split(":", 2);
        if (cols.length == 1) {
            scan.addFamily(Bytes.toBytes(column)); // every qualifier in the family
        } else {
            scan.addColumn(Bytes.toBytes(cols[0]), Bytes.toBytes(cols[1])); // one specific column
        }
        // Note: rows that have no cell in the requested column are skipped by the scan.
        try (Table table = connection.getTable(TableName.valueOf(tableName));
             ResultScanner scanner = table.getScanner(scan)) {
            for (Result result : scanner) {
                showResult(result);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            close();
        }
    }

    public static void main(String[] args) {
        scanColumn("test", "id");
    }
}
Run result: [screenshot]
(4) modifyData(String tableName, String row, String column)
Modify the data in the cell of table tableName identified by row row (which can be the student name S_Name) and column column.
package Main;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class main {
    public static Configuration configuration;
    public static Connection connection;
    public static Admin admin;

    // Open the connection
    public static void init() {
        configuration = HBaseConfiguration.create();
        configuration.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
        try {
            connection = ConnectionFactory.createConnection(configuration);
            admin = connection.getAdmin();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Close the connection
    public static void close() {
        try {
            if (admin != null) {
                admin.close();
            }
            if (connection != null) {
                connection.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // The new value is passed as a fourth parameter; writing it with a Put
    // stores a newer version of the cell, which is how HBase "modifies" data.
    public static void modifyData(String tableName, String row, String column, String value)
            throws IOException {
        init();
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            Put put = new Put(Bytes.toBytes(row));
            String[] cols = column.split(":");
            if (cols.length == 1) {
                // Column family only: use an empty qualifier
                put.addColumn(Bytes.toBytes(column), Bytes.toBytes(""), Bytes.toBytes(value));
            } else {
                put.addColumn(Bytes.toBytes(cols[0]), Bytes.toBytes(cols[1]), Bytes.toBytes(value));
            }
            table.put(put);
        } finally {
            close();
        }
    }

    public static void main(String[] args) {
        try {
            modifyData("test", "1", "score", "100");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Run result
The score of row 1 has now been changed to 100.
(5) deleteRow(String tableName, String row)
Delete the record of the row specified by row from table tableName.
package Main;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class main {
    public static Configuration configuration;
    public static Connection connection;
    public static Admin admin;

    // Open the connection
    public static void init() {
        configuration = HBaseConfiguration.create();
        configuration.set("hbase.rootdir", "hdfs://localhost:8020/hbase");
        try {
            connection = ConnectionFactory.createConnection(configuration);
            admin = connection.getAdmin();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Close the connection
    public static void close() {
        try {
            if (admin != null) {
                admin.close();
            }
            if (connection != null) {
                connection.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void deleteRow(String tableName, String row) throws IOException {
        init();
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            // A Delete built from the row key alone removes every cell in that row
            Delete delete = new Delete(Bytes.toBytes(row));
            table.delete(delete);
        } finally {
            close();
        }
    }

    public static void main(String[] args) {
        try {
            deleteRow("test", "2");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Row 2 has now been deleted.
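Problems encountered
Problem 1
After installing HBase, the master-status page could not be opened in a browser, even though HMaster, HRegionServer, and QuorumPeerMain were all running.
Problem 2
Solutions
Problem 1
After connecting to HBase, running the list command produced the following output: [screenshot]
The cause was that I had started HBase with a separately installed ZooKeeper rather than the one bundled with HBase. Adding the following line to hbase-env.sh
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"
and restarting HBase
resolved the problem.
Problem 2
The cause turned out to be a problem with the packages imported by Maven; switching the hbase-client package to version 2.5.3 resolved it.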
Copyright belongs to the original author, ADBOEX. If there is any infringement, please contact us for removal.