0


【数据库内核分析系列】:数据库索引的创建过程

文章目录

数据库索引可以提高数据的访问速度,openGauss支持唯一索引、多字段索引、部分索引和表达式索引。

  • 行存表(ASTORE存储引擎)支持的索引类型:btree(行存表缺省值)、hash、gin、gist。行存表(USTORE存储引擎)支持的索引类型:ubtree。
  • 列存表支持的索引类型:Psort(列存表缺省值)、btree、gin。
  • 全局临时表不支持GIN索引和Gist索引。

如上一篇分析数据库表的创建过程,standard_ProcessUtility函数会根据nodeTag(parsetree)的值来确定sql的操作类型,create table一般都是进入T_CreateStmt分支调用CreateCommand函数。create index则进入T_IndexStmt分支调用DefineIndex函数。在调用DefineIndex前会首先执行函数transformIndexStmt,如果语句没有指定索引类型则会使用缺省值。

if(stmt->accessMethod ==NULL){if(!isColStore){/* row store using btree index by default */if(!RelationIsUstoreFormat(rel)){
                stmt->accessMethod = DEFAULT_INDEX_TYPE;//  行存表ASTORE存储引擎缺省值btree}else{
                stmt->accessMethod = DEFAULT_USTORE_INDEX_TYPE;// 行存表USTORE存储引擎缺省值ubtree}
……
            }else{/* column store using psort index by default */
                stmt->accessMethod = DEFAULT_CSTORE_INDEX_TYPE;// 列存表缺省值psort}}
#defineDEFAULT_INDEX_TYPE"btree"#defineDEFAULT_HASH_INDEX_TYPE"hash"#defineDEFAULT_CSTORE_INDEX_TYPE"psort"#defineDEFAULT_GIST_INDEX_TYPE"gist"#defineCSTORE_BTREE_INDEX_TYPE"cbtree"#defineDEFAULT_GIN_INDEX_TYPE"gin"#defineCSTORE_GINBTREE_INDEX_TYPE"cgin"#defineDEFAULT_USTORE_INDEX_TYPE"ubtree"

普通表索引

DefineInde

DefineIndex为创建索引主入口函数。通常创建索引以Share锁锁定表,允许并发查询,但禁上对表进行修改。如果创建索引时指定关键字CONCURRENTLY以不阻塞DML的方式创建索引,即允许读取和更新表,以ShareUpdateExclusiveLock锁锁定表。

在这里插入图片描述

    lockmode = concurrent ? ShareUpdateExclusiveLock : ShareLock;
    rel =heap_open(relationId, lockmode);

如果没有指定索引名,ChooseIndexName根据规则生成索引名:

/*
     * Select name for index if caller didn't specify.
     */
    indexRelationName = stmt->idxname;if(indexRelationName ==NULL){
        indexRelationName =ChooseIndexName(RelationGetRelationName(rel),
            namespaceId,
            indexColNames,
            stmt->excludeOpNames,
            stmt->primary,
            stmt->isconstraint);

为index_create函数构造参数IndexInfo结构体:

/*
 * Prepare arguments for index_create, primarily an IndexInfo structure.
 * Note that ii_Predicate must be in implicit-AND format.
 */
indexInfo = makeNode(IndexInfo);

Index_create函数创建索引:
/*

  • Make the catalog entries for the index, including constraints. Then, if
  • not skip_build || concurrent, actually build the index.

*/
indexRelationId = index_create(rel,
……
关闭表并返回索引表id:

heap_close(rel, NoLock);return indexRelationId;

Index_create函数

打开系统表pg_class:

pg_class = heap_open(RelationRelationId, RowExclusiveLock);

heap_create创建relcache和索引物理文件:

/*
     * create the index relation's relcache entry and physical disk file. (If
     * we fail further down, it's the smgr's responsibility to remove the disk
     * file again.)
     */
    StorageType storage_type =RelationGetStorageType(heapRelation);
    indexRelation =heap_create(indexRelationName, namespaceId, tableSpaceId, indexRelationId, relFileNode,RELATION_CREATE_BUCKET(heapRelation)? heapRelation->rd_bucketoid : InvalidOid, indexTupDesc, relKind,
        relpersistence, isLocalPart, false, shared_relation, mapped_relation, allow_system_table_mods,
        REL_CMPRS_NOT_SUPPORT,(Datum)reloptions, heapRelation->rd_rel->relowner, skip_create_storage,
        isUstore ? TAM_USTORE : TAM_HEAP,/* XXX: Index tables are by default HEAP Table Type */
        relindexsplit, storage_type, extra->crossBucket, accessMethodObjectId);

将索引表元信息存入系统表pg_class:

   /*
     * store index's pg_class entry
     */
    InsertPgClassTuple(
        pg_class, indexRelation, RelationGetRelid(indexRelation), (Datum)0, reloptions, relKind, NULL);
 
    /* done with pg_class */
    heap_close(pg_class, RowExclusiveLock);

将索引表元信息存入系统表pg_index:

UpdateIndexRelation(indexRelationId,
        heapRelationId,
        indexInfo,
……

Index_build建立索引:

}elseif(extra &&(!extra->isPartitionedIndex || extra->isGlobalPartitionedIndex)){/* support regular index or GLOBAL partition index */index_build(heapRelation,NULL, indexRelation,NULL, indexInfo, isprimary, false,PARTITION_TYPE(extra));}

index_build

index_build调用index_build_storage,如果创建的是btree索引最终调用btbuild,如果是hash索引最终调用hashbuild,如果是psort则最终调用psortbuild,更多索引访问方法的信息可查看系统表pg_am。

stats = index_build_storage(targetHeapRelation, targetIndexRelation, indexInfo);

static IndexBuildResult*index_build_storage(Relation heapRelation, Relation indexRelation, IndexInfo* indexInfo){
    RegProcedure procedure = indexRelation->rd_am->ambuild;Assert(RegProcedureIsValid(procedure));
 
    IndexBuildResult* stats =(IndexBuildResult*)DatumGetPointer(OidFunctionCall3(
        procedure,PointerGetDatum(heapRelation),PointerGetDatum(indexRelation),PointerGetDatum(indexInfo)));Assert(PointerIsValid(stats));if(RELPERSISTENCE_UNLOGGED == heapRelation->rd_rel->relpersistence){index_build_init_fork(heapRelation, indexRelation);}return stats;}
btree索引的procedure为btbuild。

#0btbuild(fcinfo=0x7fb4f9c63920) at nbtree.cpp:63
#10x00000000011fc07d in OidFunctionCall3Coll(functionId=338, collation=0, arg1=140415366247440, arg2=140415366237480, arg3=140415402419864) at fmgr.cpp:1857
#20x0000000000c16b77 in index_build_storage(heapRelation=0x7fb50006b410, indexRelation=0x7fb500068d28, indexInfo=0x7fb5022ea698) at index.cpp:2475
#30x0000000000c18097 in index_build(heapRelation=0x7fb50006b410, heapPartition=0x0, indexRelation=0x7fb500068d28, indexPartition=0x0, indexInfo=0x7fb5022ea698, isPrimary=false, isreindex=false,
    partitionType=INDEX_CREATE_NONE_PARTITION, parallel=true) at index.cpp:2834

以上函数调用栈如下:

#0 index_build_storage
#1 index_build
#2 index_create
#3 DefineIndex
#4 standard_ProcessUtility
#5 gsaudit_ProcessUtility_hook
#6 pgaudit_ProcessUtility
#7 hypo_utility_hook
#8 ProcessUtility
#9 PortalRunUtility
#10 PortalRun
#11 exec_simple_query
#12 PostgresMain

分区表索引

创建普通表索引语法如下:

CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [schema_name.]index_name ]
{ ON table_name [ USING method ] | [ USING method ] ON table_name }
({ { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] }[, …] )
[ index_option ]
[ WHERE predicate ];

创建分区表索引语法:

CREATE [ UNIQUE ] INDEX [ [schema_name.]index_name ]
{ ON table_name [ USING method ] | [ USING method ] ON table_name }
( {{ column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [ ASC | DESC ] [ NULLS LAST ] }[, …] )
[ LOCAL [ ( { PARTITION index_partition_name [ TABLESPACE index_partition_tablespace ] } [, …] ) ] | GLOBAL ]
[ index_option ]

两者执行流程基本一致,分区表索引在DefineIndex中是遍历每个分区调用partition_index_create。

openGauss: 一款高性能、高安全、高可靠的企业级开源关系型数据库。

🍒如果您觉得博主的文章还不错或者有帮助的话,请关注一下博主,如果三连点赞评论收藏就更好啦!谢谢各位大佬给予的支持!

标签: 数据库 云原生 sql

本文转载自: https://blog.csdn.net/GaussDB/article/details/128416478
版权归原作者 Gauss松鼠会 所有, 如有侵权,请联系我们删除。

“【数据库内核分析系列】:数据库索引的创建过程”的评论:

还没有评论