HBase Java API Development: Batch Operations, Level 3: Bulk-Importing Data into HBase

Adding one record at a time hardly counts as big-data development, and real projects inevitably involve operating on large volumes of data.

Batch data operations in Java come down to adding data to Put objects in a loop and then submitting them through a Table object.

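So how do we perform a batch operation? The first thing that probably comes to mind is a loop.

Indeed, a loop is enough to add several pieces of data. For example: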

Table tableStep3 = connection.getTable(tableStep3Name);
// Add four columns to the same Put in a loop
byte[] row = Bytes.toBytes("20001");
Put put = new Put(row);
for (int i = 1; i <= 4; i++) {
byte[] columnFamily = Bytes.toBytes("data");
byte[] qualifier = Bytes.toBytes(String.valueOf(i));
byte[] value = Bytes.toBytes("value" + i);
put.addColumn(columnFamily, qualifier, value);
}

tableStep3.put(put);

Running this code adds four columns of data to the same row, 20001.
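Every row key, qualifier, and value in the example above passes through Bytes.toBytes, because HBase stores only byte arrays. For a String argument that call yields the UTF-8 encoding of the text. The following JDK-only sketch (no HBase dependency; the class name CellBytes and its two helpers are illustrative and not part of the HBase API) shows what ends up in a cell:

```java
import java.nio.charset.StandardCharsets;

// Illustrative helper, not part of HBase: mirrors what Bytes.toBytes(String)
// produces, namely the UTF-8 bytes of the text.
class CellBytes {
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // String.valueOf(1) is stored as the text "1" (a single byte),
        // not as a 4-byte binary integer.
        byte[] qualifier = encode(String.valueOf(1));
        System.out.println("qualifier length: " + qualifier.length);

        // Values round-trip as long as the reader decodes them as UTF-8 too.
        byte[] value = encode("value" + 1);
        System.out.println("decoded value: " + decode(value));
    }
}
```

Because the qualifiers are stored as text, anything that reads them back has to decode them as strings as well.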

How should we add multiple rows? You have probably guessed it already: use a collection!

List<Put> puts = new ArrayList<>();
// Build one Put per row in a loop and collect them
for (int i = 1; i <= 4; i++) {
byte[] row = Bytes.toBytes("row" + i);
Put put = new Put(row);
byte[] columnFamily = Bytes.toBytes("data");
byte[] qualifier = Bytes.toBytes(String.valueOf(i));
byte[] value = Bytes.toBytes("value" + i);
put.addColumn(columnFamily, qualifier, value);
puts.add(put);
}
Table table = connection.getTable(tableName);
table.put(puts);

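The code above adds four rows to HBase. Together with the previous exercise, it shows that the put() method of the Table object is overloaded: it accepts either a single Put object or a collection (List) of Put objects.

[Figure: the table after the data has been added]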
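When the list grows far beyond four rows, one option is to submit it in fixed-size chunks rather than in a single call. The sketch below is an assumption about how one might do that, not something the exercise requires: BatchChunks.partition is a hypothetical JDK-only helper, and in real code each returned chunk would be a List of Put objects handed to table.put(chunk) in turn.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of HBase): splits a list into chunks of at
// most chunkSize elements, preserving order.
class BatchChunks {
    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> rowKeys = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            rowKeys.add("row" + i);
        }
        for (List<String> chunk : partition(rowKeys, 4)) {
            // With HBase on the classpath this would be table.put(chunk)
            // for a chunk of Put objects; here the chunk is just printed.
            System.out.println(chunk);
        }
    }
}
```

The chunk size of 4 is arbitrary; a suitable value depends on row size and cluster settings.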

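Programming requirements

Now it's your turn. Between the begin-end markers in the editor on the right, write Java code that adds the following data to the HBase table stu (you need to create the table yourself):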
Table  Row key   Column family:qualifier  Value
stu    20181122  basic_info:name          阿克蒙德
stu    20181122  basic_info:gender        male
stu    20181122  basic_info:birthday      1987-05-23
stu    20181122  basic_info:connect       tel:13974036666
stu    20181122  basic_info:address       HuNan-ChangSha
stu    20181122  school_info:college      ChengXing
stu    20181122  school_info:class        class 1 grade 2
stu    20181122  school_info:object       Software
stu    20181123  basic_info:name          萨格拉斯
stu    20181123  basic_info:gender        male
stu    20181123  basic_info:birthday      1986-05-23
stu    20181123  basic_info:connect       tel:18774036666
stu    20181123  basic_info:address       HuNan-ChangSha
stu    20181123  school_info:college      ChengXing
stu    20181123  school_info:class        class 2 grade 2
stu    20181123  school_info:object       Software

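Notice that there are two column families here. How do we add more than one column family?

The setColumnFamily(family) method covered in the earlier table-creation exercise can be called more than once, one call per column family.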

package step3;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableDescriptors;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;
public class Task {
 public void batchPut()throws Exception{
   /********* Begin *********/
   // HBaseConfiguration.create() also loads hbase-default.xml and hbase-site.xml
   Configuration config = HBaseConfiguration.create();
   Connection conn = ConnectionFactory.createConnection(config);
   Admin admin = conn.getAdmin();
   // Create the table with two column families
   TableName tableName = TableName.valueOf(Bytes.toBytes("stu"));
   TableDescriptorBuilder builder = TableDescriptorBuilder.newBuilder(tableName);
   ColumnFamilyDescriptor family = ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("basic_info")).build();
   ColumnFamilyDescriptor family2 = ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("school_info")).build();
   builder.setColumnFamily(family);
   builder.setColumnFamily(family2);
   admin.createTable(builder.build());
   List<Put> puts = new ArrayList<>();
   String[] rows = {"20181122","20181123"};
   String[][] basic_infos = {{"阿克蒙德","male","1987-05-23","tel:139********","HuNan-ChangSha"},{"萨格拉斯","male","1986-05-23","tel:187********","HuNan-ChangSha"}};
   String[] basic_colums = {"name","gender","birthday","connect","address"};
   String[][] school_infos = {{"ChengXing","class 1 grade 2","Software"},{"ChengXing","class 2 grade 2","Software"}};
   String[] school_colums = {"college","class","object"};
   for (int x = 0; x < rows.length; x++) {
     // One Put per row key
     Put put = new Put(Bytes.toBytes(rows[x]));
     // Iterate over the qualifiers (not the number of rows) so every basic_info column is written
     for (int i = 0; i < basic_colums.length; i++) {
       byte[] columnFamily = Bytes.toBytes("basic_info");
       byte[] qualifier = Bytes.toBytes(basic_colums[i]);
       byte[] value = Bytes.toBytes(basic_infos[x][i]);
       put.addColumn(columnFamily, qualifier, value);
     }
     // Same for the school_info columns
     for (int i = 0; i < school_colums.length; i++) {
       byte[] columnFamily = Bytes.toBytes("school_info");
       byte[] qualifier = Bytes.toBytes(school_colums[i]);
       byte[] value = Bytes.toBytes(school_infos[x][i]);
       put.addColumn(columnFamily, qualifier, value);
     }
     puts.add(put);
   }
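   Table table = conn.getTable(tableName);
   // Submit all rows in one call
   table.put(puts);
   // Release client resources once the batch has been submitted
   table.close();
   admin.close();
   conn.close();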
   /********* End *********/
 }
}
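
To check the loop bounds without a running cluster, the nested arrays can first be expanded into plain Java collections. This is only a sketch: StuRows, addFamily, and build are hypothetical names, the nested map stands in for the Put objects, and only a subset of the columns from the table above is used. The point it illustrates is that the inner loop runs over the qualifier array, so every column is written for every row.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the HBase calls: row key -> ("family:qualifier" -> value).
class StuRows {
    // Adds one cell per qualifier to each row's cell map.
    static void addFamily(Map<String, Map<String, String>> table, String[] rows,
                          String family, String[] qualifiers, String[][] values) {
        for (int x = 0; x < rows.length; x++) {
            Map<String, String> cells =
                table.computeIfAbsent(rows[x], k -> new LinkedHashMap<>());
            // Bounded by the number of qualifiers, not by the number of rows.
            for (int i = 0; i < qualifiers.length; i++) {
                cells.put(family + ":" + qualifiers[i], values[x][i]);
            }
        }
    }

    static Map<String, Map<String, String>> build() {
        String[] rows = {"20181122", "20181123"};
        String[] basicColumns = {"gender", "birthday", "address"};
        String[][] basicInfos = {
            {"male", "1987-05-23", "HuNan-ChangSha"},
            {"male", "1986-05-23", "HuNan-ChangSha"}};
        String[] schoolColumns = {"college", "class", "object"};
        String[][] schoolInfos = {
            {"ChengXing", "class 1 grade 2", "Software"},
            {"ChengXing", "class 2 grade 2", "Software"}};

        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        addFamily(table, rows, "basic_info", basicColumns, basicInfos);
        addFamily(table, rows, "school_info", schoolColumns, schoolInfos);
        return table;
    }

    public static void main(String[] args) {
        // Print each row key with the cells that would be written for it.
        build().forEach((row, cells) -> System.out.println(row + " -> " + cells));
    }
}
```

If the inner loop were bounded by the number of rows instead, each row here would receive only two cells per family rather than three.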

This article is reposted from: https://blog.csdn.net/qq_61604164/article/details/128318122
Copyright belongs to the original author, 是草莓熊吖. In case of infringement, please contact us for removal.
