0


Spark SQL----INSERT TABLE

Spark SQL----INSERT TABLE

一、描述

INSERT语句将新行插入表中或覆盖表中的现有数据。插入的行可以由值表达式指定,也可以由查询结果指定。

二、语法

INSERT[INTO| OVERWRITE ][TABLE] table_identifier [ partition_spec ][( column_list )]
    { VALUES( { value|NULL } [,...])[,(...)]| query }

INSERTINTO[TABLE] table_identifier REPLACEWHERE boolean_expression query

三、参数

  • table_identifier 指定一个表名称,可以选择使用数据库名称对其进行限定。 语法:[ database_name. ] table_name
  • partition_spec 一个可选参数,用于指定分区的键值对的逗号分隔列表。请注意,可以在分区规范中使用类型化的文字(例如,date’2019-01-02’)。 语法:PARTITION ( partition_col_name = partition_col_val [ , … ] )
  • column_list 一个可选参数,用于指定属于table_identifier表的列的逗号分隔列表。Spark会根据指定的column列表对输入查询的列进行重新排序,以匹配表schema。 注:当前行为有一些局限性: - 所有指定的列都应存在于表中,并且不能相互重复。它包括除静态分区列之外的所有列。- column列表的大小应该与VALUES子句或查询中的数据大小完全相同。
  • VALUES ( { value | NULL } [ , … ] ) [ , ( … ) ] 指定要插入的值。可以插入显式指定的值或NULL。必须使用逗号分隔子句中的每个值。可以指定多个值集来插入多行。
  • boolean_expression 指定计算结果为布尔型的任何表达式。可以使用逻辑运算符(AND, OR)将两个或多个表达式组合在一起。
  • query 生成要插入的行的查询。它可以是以下格式之一: - SELECT语句- Inline Table语句- FROM语句

四、例子

4.1 Insert Into

使用VALUES子句进行单行插入

CREATETABLE students (name VARCHAR(64), address VARCHAR(64))USING PARQUET PARTITIONED BY(student_id INT);INSERTINTO students VALUES('Amy Smith','123 Park Ave, San Jose',111111);SELECT*FROM students;+---------+----------------------+----------+|     name|               address|student_id|+---------+----------------------+----------+|Amy Smith|123 Park Ave, San Jose|111111|+---------+----------------------+----------+

使用VALUES子句进行多行插入

INSERTINTO students VALUES('Bob Brown','456 Taylor St, Cupertino',222222),('Cathy Johnson','789 Race Ave, Palo Alto',333333);SELECT*FROM students;+-------------+------------------------+----------+|         name|                 address|student_id|+-------------+------------------------+----------+|    Amy Smith|123 Park Ave, San Jose|111111|+-------------+------------------------+----------+|    Bob Brown|456 Taylor St, Cupertino|222222|+-------------+------------------------+----------+|Cathy Johnson|789 Race Ave, Palo Alto|333333|+--------------+-----------------------+----------+

使用SELECT语句插入

-- Assuming the persons table has already been created and populated.SELECT*FROM persons;+-------------+--------------------------+---------+|         name|                   address|      ssn|+-------------+--------------------------+---------+|Dora Williams|134 Forest Ave, Menlo Park|123456789|+-------------+--------------------------+---------+|  Eddie Davis|245 Market St, Milpitas|345678901|+-------------+--------------------------+---------+INSERTINTO students PARTITION(student_id =444444)SELECT name, address FROM persons WHERE name ="Dora Williams";SELECT*FROM students;+-------------+--------------------------+----------+|         name|                   address|student_id|+-------------+--------------------------+----------+|    Amy Smith|123 Park Ave, San Jose|111111|+-------------+--------------------------+----------+|    Bob Brown|456 Taylor St, Cupertino|222222|+-------------+--------------------------+----------+|Cathy Johnson|789 Race Ave, Palo Alto|333333|+-------------+--------------------------+----------+|Dora Williams|134 Forest Ave, Menlo Park|444444|+-------------+--------------------------+----------+

使用TABLE语句插入

-- Assuming the visiting_students table has already been created and populated.SELECT*FROM visiting_students;+-------------+---------------------+----------+|         name|              address|student_id|+-------------+---------------------+----------+|Fleur Laurent|345 Copper St, London|777777|+-------------+---------------------+----------+|Gordon Martin|779 Lake Ave, Oxford|888888|+-------------+---------------------+----------+INSERTINTO students TABLE visiting_students;SELECT*FROM students;+-------------+--------------------------+----------+|         name|                   address|student_id|+-------------+--------------------------+----------+|    Amy Smith|123 Park Ave, San Jose|111111|+-------------+--------------------------+----------+|    Bob Brown|456 Taylor St, Cupertino|222222|+-------------+--------------------------+----------+|Cathy Johnson|789 Race Ave, Palo Alto|333333|+-------------+--------------------------+----------+|Dora Williams|134 Forest Ave, Menlo Park|444444|+-------------+--------------------------+----------+|Fleur Laurent|345 Copper St, London|777777|+-------------+--------------------------+----------+|Gordon Martin|779 Lake Ave, Oxford|888888|+-------------+--------------------------+----------+

使用FROM语句插入

-- Assuming the applicants table has already been created and populated.SELECT*FROM applicants;+-----------+--------------------------+----------+---------+|       name|                   address|student_id|qualified|+-----------+--------------------------+----------+---------+|Helen Davis|469 Mission St, San Diego|999999|true|+-----------+--------------------------+----------+---------+|   Ivy King|367 Leigh Ave, Santa Clara|101010|false|+-----------+--------------------------+----------+---------+| Jason Wang|908 Bird St, Saratoga|121212|true|+-----------+--------------------------+----------+---------+INSERTINTO students
     FROM applicants SELECT name, address, student_id WHERE qualified =true;SELECT*FROM students;+-------------+--------------------------+----------+|         name|                   address|student_id|+-------------+--------------------------+----------+|    Amy Smith|123 Park Ave, San Jose|111111|+-------------+--------------------------+----------+|    Bob Brown|456 Taylor St, Cupertino|222222|+-------------+--------------------------+----------+|Cathy Johnson|789 Race Ave, Palo Alto|333333|+-------------+--------------------------+----------+|Dora Williams|134 Forest Ave, Menlo Park|444444|+-------------+--------------------------+----------+|Fleur Laurent|345 Copper St, London|777777|+-------------+--------------------------+----------+|Gordon Martin|779 Lake Ave, Oxford|888888|+-------------+--------------------------+----------+|  Helen Davis|469 Mission St, San Diego|999999|+-------------+--------------------------+----------+|   Jason Wang|908 Bird St, Saratoga|121212|+-------------+--------------------------+----------+

为分区列值使用类型化日期文字插入

CREATETABLE students (name STRING, address  STRING) PARTITIONED BY(birthday DATE);INSERTINTO students PARTITION(birthday =date'2019-01-02')VALUES('Amy Smith','123 Park Ave, San Jose');SELECT*FROM students;+-------------+-------------------------+-----------+|         name|                  address|   birthday|+-------------+-------------------------+-----------+|    Amy Smith|123 Park Ave, San Jose|2019-01-02|

使用column列表插入

INSERTINTO students (address, name, student_id)VALUES('Hangzhou, China','Kent Yao',11215016);SELECT*FROM students WHERE name ='Kent Yao';+---------+----------------------+----------+|     name|               address|student_id|+---------+----------------------+----------+|Kent Yao |       Hangzhou, China|11215016|+---------+----------------------+----------+

同时使用分区spec和column列表进行插入

INSERTINTO students PARTITION(student_id =11215017)(address, name)VALUES('Hangzhou, China','Kent Yao Jr.');SELECT*FROM students WHERE student_id =11215017;+------------+----------------------+----------+|        name|               address|student_id|+------------+----------------------+----------+|Kent Yao Jr.|       Hangzhou, China|11215017|+------------+----------------------+----------+

4.2 Insert Overwrite

使用VALUES子句插入

-- Assuming the students table has already been created and populated.SELECT*FROM students;+-------------+--------------------------+----------+|         name|                   address|student_id|+-------------+--------------------------+----------+|    Amy Smith|123 Park Ave, San Jose|111111||    Bob Brown|456 Taylor St, Cupertino|222222||Cathy Johnson|789 Race Ave, Palo Alto|333333||Dora Williams|134 Forest Ave, Menlo Park|444444||Fleur Laurent|345 Copper St, London|777777||Gordon Martin|779 Lake Ave, Oxford|888888||  Helen Davis|469 Mission St, San Diego|999999||   Jason Wang|908 Bird St, Saratoga|121212|+-------------+--------------------------+----------+INSERT OVERWRITE students VALUES('Ashua Hill','456 Erica Ct, Cupertino',111111),('Brian Reed','723 Kern Ave, Palo Alto',222222);SELECT*FROM students;+----------+-----------------------+----------+|      name|                address|student_id|+----------+-----------------------+----------+|Ashua Hill|456 Erica Ct, Cupertino|111111||Brian Reed|723 Kern Ave, Palo Alto|222222|+----------+-----------------------+----------+

使用SELECT语句插入

-- Assuming the persons table has already been created and populated.SELECT*FROM persons;+-------------+--------------------------+---------+|         name|                   address|      ssn|+-------------+--------------------------+---------+|Dora Williams|134 Forest Ave, Menlo Park|123456789|+-------------+--------------------------+---------+|  Eddie Davis|245 Market St, Milpitas|345678901|+-------------+--------------------------+---------+INSERT OVERWRITE students PARTITION(student_id =222222)SELECT name, address FROM persons WHERE name ="Dora Williams";SELECT*FROM students;+-------------+--------------------------+----------+|         name|                   address|student_id|+-------------+--------------------------+----------+|   Ashua Hill|456 Erica Ct, Cupertino|111111|+-------------+--------------------------+----------+|Dora Williams|134 Forest Ave, Menlo Park|222222|+-------------+--------------------------+----------+

使用REPLACE WHERE语句插入

-- Assuming the persons and persons2 table has already been created and populated.SELECT*FROM persons;+-------------+--------------------------+---------+|         name|                   address|      ssn|+-------------+--------------------------+---------+|Dora Williams|134 Forest Ave, Menlo Park|123456789|+-------------+--------------------------+---------+|  Eddie Davis|245 Market St, Milpitas|345678901|+-------------+--------------------------+---------+SELECT*FROM persons2;+-------------+--------------------------+---------+|         name|                   address|      ssn|+-------------+--------------------------+---------+|   Ashua Hill|456 Erica Ct, Cupertino|432795921|+-------------+--------------------------+---------+-- in an atomic operation, 1) delete rows with ssn = 123456789 and 2) insert rows from persons2 INSERTINTO persons REPLACEWHERE ssn =123456789SELECT*FROM persons2

SELECT*FROM persons;+-------------+--------------------------+---------+|         name|                   address|      ssn|+-------------+--------------------------+---------+|  Eddie Davis|245 Market St, Milpitas|345678901|+-------------+--------------------------+---------+|   Ashua Hill|456 Erica Ct, Cupertino|432795921|+-------------+--------------------------+---------+

使用TABLE语句插入

-- Assuming the visiting_students table has already been created and populated.SELECT*FROM visiting_students;+-------------+---------------------+----------+|         name|              address|student_id|+-------------+---------------------+----------+|Fleur Laurent|345 Copper St, London|777777|+-------------+---------------------+----------+|Gordon Martin|779 Lake Ave, Oxford|888888|+-------------+---------------------+----------+INSERT OVERWRITE students TABLE visiting_students;SELECT*FROM students;+-------------+---------------------+----------+|         name|              address|student_id|+-------------+---------------------+----------+|Fleur Laurent|345 Copper St, London|777777|+-------------+---------------------+----------+|Gordon Martin|779 Lake Ave, Oxford|888888|+-------------+---------------------+----------+

使用FROM语句插入

-- Assuming the applicants table has already been created and populated.SELECT*FROM applicants;+-----------+--------------------------+----------+---------+|       name|                   address|student_id|qualified|+-----------+--------------------------+----------+---------+|Helen Davis|469 Mission St, San Diego|999999|true|+-----------+--------------------------+----------+---------+|   Ivy King|367 Leigh Ave, Santa Clara|101010|false|+-----------+--------------------------+----------+---------+| Jason Wang|908 Bird St, Saratoga|121212|true|+-----------+--------------------------+----------+---------+INSERT OVERWRITE students
    FROM applicants SELECT name, address, student_id WHERE qualified =true;SELECT*FROM students;+-----------+-------------------------+----------+|       name|                  address|student_id|+-----------+-------------------------+----------+|Helen Davis|469 Mission St, San Diego|999999|+-----------+-------------------------+----------+| Jason Wang|908 Bird St, Saratoga|121212|+-----------+-------------------------+----------+

为分区列值使用类型化日期文字插入

CREATETABLE students (name STRING, address  STRING) PARTITIONED BY(birthday DATE);INSERTINTO students PARTITION(birthday =date'2019-01-02')VALUES('Amy Smith','123 Park Ave, San Jose');SELECT*FROM students;+-------------+-------------------------+-----------+|         name|                  address|   birthday|+-------------+-------------------------+-----------+|    Amy Smith|123 Park Ave, San Jose|2019-01-02|+-------------+-------------------------+-----------+INSERT OVERWRITE students PARTITION(birthday =date'2019-01-02')VALUES('Jason Wang','908 Bird St, Saratoga');SELECT*FROM students;+-----------+-------------------------+-----------+|       name|                  address|   birthday|+-----------+-------------------------+-----------+| Jason Wang|908 Bird St, Saratoga|2019-01-02|+-----------+-------------------------+-----------+

使用column列表插入

INSERT OVERWRITE students (address, name, student_id)VALUES('Hangzhou, China','Kent Yao',11215016);SELECT*FROM students WHERE name ='Kent Yao';+---------+----------------------+----------+|     name|               address|student_id|+---------+----------------------+----------+|Kent Yao |       Hangzhou, China|11215016|+---------+----------------------+----------+

同时使用分区spec和column列表进行插入

INSERT OVERWRITE students PARTITION(student_id =11215016)(address, name)VALUES('Hangzhou, China','Kent Yao Jr.');SELECT*FROM students WHERE student_id =11215016;+------------+----------------------+----------+|        name|               address|student_id|+------------+----------------------+----------+|Kent Yao Jr.|       Hangzhou, China|11215016|+------------+----------------------+----------+
标签: spark sql 大数据

本文转载自: https://blog.csdn.net/gabriel_wang_sh/article/details/137492784
版权归原作者 85程序员老王 所有, 如有侵权,请联系我们删除。

“Spark SQL----INSERT TABLE”的评论:

还没有评论