前言
在项目开发中,需要插入批量插入20多万条数据,通过日志观察,发现在调用MyBatis-Plus中的saveBatch()方法性能非常的差,本篇文章主要分享一下saveBatch()的原理以及使用的注意事项
原理
我们通过源码的形式进行解析saveBatch()方法的原理
@Transactional(rollbackFor =Exception.class)defaultbooleansaveBatch(Collection<T> entityList){//DEFAULT_BATCH_SIZE 默认是1000returnsaveBatch(entityList,DEFAULT_BATCH_SIZE);}
@Transactional(rollbackFor =Exception.class)@OverridepublicbooleansaveBatch(Collection<T> entityList,int batchSize){String sqlStatement =getSqlStatement(SqlMethod.INSERT_ONE);//分批执行SQLreturnexecuteBatch(entityList, batchSize,(sqlSession, entity)-> sqlSession.insert(sqlStatement, entity));}
我们看下saveBatch是怎么批量执行的
publicstatic<E>booleanexecuteBatch(Class<?> entityClass,Log log,Collection<E> list,int batchSize,BiConsumer<SqlSession,E> consumer){Assert.isFalse(batchSize <1,"batchSize must not be less than one");return!CollectionUtils.isEmpty(list)&&executeBatch(entityClass, log, sqlSession ->{int size = list.size();int i =1;for(E element : list){//数据最终保存在StatementImpl.batchArgs中,用于批量保存
consumer.accept(sqlSession, element);if((i % batchSize ==0)|| i == size){//批量保存StatementImpl.batchArgs中数据
sqlSession.flushStatements();}
i++;}});}
通过flushStatements()方法我们可以看到最终调用的是StatementImpl中的executeBatchInternal()方法。注意:代码过长,下面方法做了删减。
protectedlong[]executeBatchInternal()throwsSQLException{synchronized(checkClosed().getConnectionMutex()){if(this.connection.isReadOnly()){thrownewSQLException(Messages.getString("PreparedStatement.25")+Messages.getString("PreparedStatement.26"),MysqlErrorNumbers.SQL_STATE_ILLEGAL_ARGUMENT);}if(this.query.getBatchedArgs()==null||this.query.getBatchedArgs().size()==0){returnnewlong[0];}// we timeout the entire batch, not individual statementsint batchTimeout =getTimeoutInMillis();setTimeoutInMillis(0);resetCancelledState();try{statementBegins();clearWarnings();// 如果配置rewriteBatchedStatements 开启多SQL执行if(!this.batchHasPlainStatements &&this.rewriteBatchedStatements.getValue()){if(getQueryInfo().isRewritableWithMultiValuesClause()){returnexecuteBatchWithMultiValuesClause(batchTimeout);}if(!this.batchHasPlainStatements &&this.query.getBatchedArgs()!=null&&this.query.getBatchedArgs().size()>3/* cost of option setting rt-wise */){returnexecutePreparedBatchAsMultiStatement(batchTimeout);}}returnexecuteBatchSerially(batchTimeout);}finally{this.query.getStatementExecuting().set(false);clearBatch();}}}
我们再看下insert做了什么事情
publicintinsert(String statement,Object parameter){returnupdate(statement, parameter);}
publicintupdate(String statement,Object parameter){try{
dirty =true;MappedStatement ms = configuration.getMappedStatement(statement);return executor.update(ms,wrapCollection(parameter));}catch(Exception e){throwExceptionFactory.wrapException("Error updating database. Cause: "+ e, e);}finally{ErrorContext.instance().reset();}}
publicintupdate(MappedStatement ms,Object parameter)throwsSQLException{ErrorContext.instance().resource(ms.getResource()).activity("executing an update").object(ms.getId());if(closed){thrownewExecutorException("Executor was closed.");}clearLocalCache();returndoUpdate(ms, parameter);}
重点方法在doUpdate(ms,parameter). 完成SQL的拼装
@OverridepublicintdoUpdate(MappedStatement ms,Object parameterObject)throwsSQLException{finalConfiguration configuration = ms.getConfiguration();finalStatementHandler handler = configuration.newStatementHandler(this, ms, parameterObject,RowBounds.DEFAULT,null,null);finalBoundSql boundSql = handler.getBoundSql();finalString sql = boundSql.getSql();finalStatement stmt;// 数据的SQL语句必须完全一致,包括表名和列if(sql.equals(currentSql)&& ms.equals(currentStatement)){int last = statementList.size()-1;
stmt = statementList.get(last);applyTransactionTimeout(stmt);
handler.parameterize(stmt);// fix Issues 322BatchResult batchResult = batchResultList.get(last);
batchResult.addParameterObject(parameterObject);}else{Connection connection =getConnection(ms.getStatementLog());
stmt = handler.prepare(connection, transaction.getTimeout());
handler.parameterize(stmt);// fix Issues 322
currentSql = sql;
currentStatement = ms;
statementList.add(stmt);
batchResultList.add(newBatchResult(ms, sql, parameterObject));}
handler.batch(stmt);returnBATCH_UPDATE_RETURN_VALUE;}
以上就是saveBatch的原理。
总结
1: 想要批量执行操作 数据库链接参数加上rewriteBatchedStatements=true
rewriteBatchedStatements参数需要保证5.1.13以上版本的驱动才能实现高性能的批量插入
2: 根据doUpdate(ms,parameter). 完成SQL的拼装的原理可以得出,如果批量插入的数据,有些数据字段值为null,不会批量查询,而是单独拼装一个SQL执行。
例如:
publicclassStudent{privateString name;privateString address;}
100个Student,其中 20个name=null,其中 50个address==null。通过日志我们看下这种不会批量插入。
版权归原作者 大伟攀高峰 所有, 如有侵权,请联系我们删除。