Big data bug - sqoop (Part 2: syncing MySQL data to Hive with sqoop while restricting columns)

1. Sqoop script walkthrough.

#!/bin/sh
# Positional arguments: MySQL connection info, the query, and the Hive target settings.
mysqlHost=$1
mysqlUserName=$2
mysqlUserPass=$3
mysqlDbName=$4
sql=$5
split=$6
target=$7
hiveDbName=$8
hiveTbName=$9
partFieldName=${10}
inputDate=${11}
 
echo ${mysqlHost}
echo ${mysqlUserName}
echo ${mysqlUserPass}
echo ${mysqlDbName}
echo ${sql}
echo ${split}
echo ${target}
echo ${hiveDbName}
echo ${hiveTbName}
echo ${partFieldName}
echo ${inputDate}
 
 
sqoop import "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" \
--connect "jdbc:mysql://${mysqlHost}/${mysqlDbName}?tinyInt1isBit=false" \
--username ${mysqlUserName} \
--password ${mysqlUserPass} \
--query "${sql}" \
--split-by ${split} \
--target-dir ${target} \
--hive-overwrite \
--delete-target-dir \
--fields-terminated-by '\t' \
--null-string "" \
--hive-import \
--null-non-string "false" \
--hive-database ${hiveDbName} \
--hive-table ${hiveTbName} \
--hive-drop-import-delims \
--hive-partition-key ${partFieldName} \
--hive-partition-value ${inputDate}
Three new parameters are added here:

1. --query "${sql}": the SQL statement for the source table. The query must contain a WHERE clause and must end with $CONDITIONS; if there is no real filter condition, write where 1=1 (see the sketch after this list for how Sqoop rewrites $CONDITIONS). Example:
"select id,key_id,key_type,'' as encryption_cert_chain,device_type,account_id_hash,user_identifier,user_id,request_id,device_id,vehicle_id,vehicle_identifier,device_info,device_oem_id,key_data,import_immobilizer_token_request_data,friendly_name,digital_key_status,digital_key_status_in_vehicle,digital_key_status_in_device,key_valid_from,key_valid_to,shared_keys,shareable_keys,manufacturer,state_in_vehicle,state_in_device,key_status_for_vehicle,'' as device_enc_public_key,'' as digital_key_public_key,'' as digital_key_cert,'' as instance_ca_cert,entitlement,rights,slot_id,protocol_type,group_identifier,verify_result,deleted,create_time,update_time,fsn,action_type from kts_key where 1=1 and \$CONDITIONS"
2. --split-by ${split}: the column used to split the data across parallel mappers, normally the MySQL primary key.
3. --target-dir ${target}: an HDFS staging path. Any directory will do, since --delete-target-dir removes it before each run.
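For intuition, Sqoop first issues a boundary query on the --split-by column and then rewrites $CONDITIONS into a per-mapper range predicate. A hedged sketch of the generated SQL, assuming --split-by id and 4 mappers, with illustrative boundary values (the real ranges depend on the data):

-- Boundary query Sqoop runs first ($CONDITIONS is replaced by 1=1 here):
SELECT MIN(id), MAX(id) FROM (select ... from kts_key where 1=1 and (1 = 1)) AS t1;
-- Each mapper then runs the user query with its own id range, e.g.:
select ... from kts_key where 1=1 and ( id >= 1 and id < 10001 );
select ... from kts_key where 1=1 and ( id >= 10001 and id < 20001 );
-- ...and so on for the remaining mappers.

This is also why a WHERE clause is mandatory: Sqoop splices the range in with "and", so the query must already be in a form where "and \$CONDITIONS" is syntactically valid, hence the where 1=1 trick.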

2. The command.

sh test.sh 99.99.99.99:3306 \
  bigdata 123222 ssss \
  "select id,key_id,key_type,'' as encryption_cert_chain,device_type,account_id_hash,user_identifier,user_id,request_id,device_id,vehicle_id,vehicle_identifier,device_info,device_oem_id,key_data,import_immobilizer_token_request_data,friendly_name,digital_key_status,digital_key_status_in_vehicle,digital_key_status_in_device,key_valid_from,key_valid_to,shared_keys,shareable_keys,manufacturer,state_in_vehicle,state_in_device,key_status_for_vehicle,'' as device_enc_public_key,'' as digital_key_public_key,'' as digital_key_cert,'' as instance_ca_cert,entitlement,rights,slot_id,protocol_type,group_identifier,verify_result,deleted,create_time,update_time,fsn,action_type from kts_key where 1=1 and \$CONDITIONS" \
  id "/tmp/test" ods ods_okp p_dt 2023-08-15
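Once the job finishes, a quick hedged check that the partition landed in Hive, using the database, table, and partition names from the command above:

hive -e "show partitions ods.ods_okp"                                # expect a line with p_dt=2023-08-15
hive -e "select count(*) from ods.ods_okp where p_dt='2023-08-15'"   # row count should match the MySQL query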
Tags: big data, hive, bug

This article is reposted from: https://blog.csdn.net/weixin_43446246/article/details/132302242
Copyright belongs to the original author, 宇智波云. If there is any infringement, please contact us and we will remove it.
