Background
When the project starts, the log fills up with many lines like this:
c.a.d.pool.DruidAbstractDataSource: discard long time none received connection.
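Evidently, Druid discarded these pooled database connections because they had gone too long without receiving any data from the database. The service then has to create the connections again during startup, which lengthens startup time.
Locating the cause
Following the error message leads to this method in the Druid source: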
com.alibaba.druid.pool.DruidAbstractDataSource#testConnectionInternal(com.alibaba.druid.pool.DruidConnectionHolder, java.sql.Connection)
if (validConnectionChecker != null) {
    // Validate the connection; for MySQL, the code that actually runs is shown further below
    boolean valid = validConnectionChecker.isValidConnection(conn, validationQuery, validationQueryTimeout);
    long currentTimeMillis = System.currentTimeMillis();
    if (holder != null) {
        holder.lastValidTimeMillis = currentTimeMillis;
        holder.lastExecTimeMillis = currentTimeMillis;
    }

    if (valid && isMySql) { // unexcepted branch
        long lastPacketReceivedTimeMs = MySqlUtils.getLastPacketReceivedTimeMs(conn);
        if (lastPacketReceivedTimeMs > 0) {
            long mysqlIdleMillis = currentTimeMillis - lastPacketReceivedTimeMs;
            if (lastPacketReceivedTimeMs > 0 //
                    && mysqlIdleMillis >= timeBetweenEvictionRunsMillis) {
                discardConnection(holder);
                // This is where the warning is logged
                String errorMsg = "discard long time none received connection. "
                        + ", jdbcUrl : " + jdbcUrl
                        + ", version : " + VERSION.getVersionNumber()
                        + ", lastPacketReceivedIdleMillis : " + mysqlIdleMillis;
                LOG.warn(errorMsg);
                return false;
            }
        }
    }
    // ... omitted
}

// com.alibaba.druid.pool.vendor.MySqlValidConnectionChecker#isValidConnection
public boolean isValidConnection(Connection conn, String validateQuery, int validationQueryTimeout) throws Exception {
    if (conn.isClosed()) {
        return false;
    }

    if (usePingMethod) {
        // Check the connection's validity with ping
        if (conn instanceof DruidPooledConnection) {
            conn = ((DruidPooledConnection) conn).getConnection();
        }

        if (conn instanceof ConnectionProxy) {
            conn = ((ConnectionProxy) conn).getRawObject();
        }

        if (clazz.isAssignableFrom(conn.getClass())) {
            if (validationQueryTimeout <= 0) {
                validationQueryTimeout = DEFAULT_VALIDATION_QUERY_TIMEOUT;
            }

            try {
                ping.invoke(conn, true, validationQueryTimeout * 1000);
            } catch (InvocationTargetException e) {
                Throwable cause = e.getCause();
                if (cause instanceof SQLException) {
                    throw (SQLException) cause;
                }
                throw e;
            }
            return true;
        }
    }

    String query = validateQuery;
    if (validateQuery == null || validateQuery.isEmpty()) {
        // Validate the connection with the SQL statement SELECT 1
        query = DEFAULT_VALIDATION_QUERY;
    }

    Statement stmt = null;
    ResultSet rs = null;
    try {
        stmt = conn.createStatement();
        if (validationQueryTimeout > 0) {
            stmt.setQueryTimeout(validationQueryTimeout);
        }
        rs = stmt.executeQuery(query);
        return true;
    } finally {
        JdbcUtils.close(rs);
        JdbcUtils.close(stmt);
    }
}
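testConnectionInternal is triggered by its caller, one level above it.
Because we enabled the testOnBorrow switch, each database connection is tested immediately after it is successfully obtained, and the time of the connection's last heartbeat then decides whether it has been idle too long and must be discarded. The main purpose of this switch is to check a connection's validity right when it is taken from the pool.
With testOnBorrow disabled, Druid instead keeps checking how long connections have been idle while it holds them, and reclaims those that have been idle too long.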
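The discard decision described above can be reduced to a small standalone sketch. The class name IdleDiscardSketch and the method shouldDiscard are hypothetical names chosen for illustration and are not part of Druid; the condition itself mirrors the branch in testConnectionInternal quoted earlier, with the clock passed in as a parameter so the result is deterministic.

```java
// A minimal model of the discard decision; not Druid's actual class.
final class IdleDiscardSketch {

    /**
     * @param nowMs                         current time in milliseconds
     * @param lastPacketReceivedMs          timestamp reported for the connection; 0 or less means "unknown"
     * @param timeBetweenEvictionRunsMillis the configured eviction interval
     * @return true if the connection would be discarded when borrowed
     */
    static boolean shouldDiscard(long nowMs, long lastPacketReceivedMs, long timeBetweenEvictionRunsMillis) {
        if (lastPacketReceivedMs <= 0) {
            return false; // no timestamp available, so the idle check is skipped
        }
        long mysqlIdleMillis = nowMs - lastPacketReceivedMs;
        return mysqlIdleMillis >= timeBetweenEvictionRunsMillis;
    }

    public static void main(String[] args) {
        long now = 100_000L;
        // Idle for 20 s with a 15 s threshold: discarded.
        System.out.println(shouldDiscard(now, now - 20_000L, 15_000L)); // prints true
        // Idle for 5 s with a 15 s threshold: kept.
        System.out.println(shouldDiscard(now, now - 5_000L, 15_000L)); // prints false
    }
}
```

Note that the threshold is the eviction interval itself, so a connection that has been quiet for one full interval is already rejected the next time it is borrowed.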
com.alibaba.druid.util.MySqlUtils#getLastPacketReceivedTimeMs
This method returns the time at which the connection last received a message.
// Taking com.mysql.cj.jdbc.ConnectionImpl from MySQL Connector/J 6 as an example
// This is the method getLastPacketReceivedTimeMs actually uses to obtain the connection time
public long getIdleFor() {
    return this.lastQueryFinishedTime == 0 ? 0 : System.currentTimeMillis() - this.lastQueryFinishedTime;
}

// com.mysql.cj.NativeSession#execSQL
public <T extends Resultset> T execSQL(Query callingQuery, String query, int maxRows, NativePacketPayload packet, boolean streamResults,
        ProtocolEntityFactory<T, NativePacketPayload> resultSetFactory, ColumnDefinition cachedMetadata, boolean isBatch) {

    long queryStartTime = this.gatherPerfMetrics.getValue() ? System.currentTimeMillis() : 0;
    int endOfQueryPacketPosition = packet != null ? packet.getPosition() : 0;

    this.lastQueryFinishedTime = 0; // we're busy!

    if (this.autoReconnect.getValue() && (getServerSession().isAutoCommit() || this.autoReconnectForPools.getValue()) && this.needsPing && !isBatch) {
        try {
            ping(false, 0);
            this.needsPing = false;
        } catch (Exception Ex) {
            invokeReconnectListeners();
        }
    }

    try {
        return packet == null
                ? ((NativeProtocol) this.protocol).sendQueryString(callingQuery, query, this.characterEncoding.getValue(), maxRows, streamResults,
                        cachedMetadata, resultSetFactory)
                : ((NativeProtocol) this.protocol).sendQueryPacket(callingQuery, packet, maxRows, streamResults, cachedMetadata, resultSetFactory);
    } catch (CJException sqlE) {
        if (getPropertySet().getBooleanProperty(PropertyKey.dumpQueriesOnException).getValue()) {
            String extractedSql = NativePacketPayload.extractSqlFromPacket(query, packet, endOfQueryPacketPosition,
                    getPropertySet().getIntegerProperty(PropertyKey.maxQuerySizeToLog).getValue());
            StringBuilder messageBuf = new StringBuilder(extractedSql.length() + 32);
            messageBuf.append("\n\nQuery being executed when exception was thrown:\n");
            messageBuf.append(extractedSql);
            messageBuf.append("\n\n");
            sqlE.appendMessage(messageBuf.toString());
        }

        if ((this.autoReconnect.getValue())) {
            if (sqlE instanceof CJCommunicationsException) {
                // IO may be dirty or damaged beyond repair, force close it.
                this.protocol.getSocketConnection().forceClose();
            }
            this.needsPing = true;
        } else if (sqlE instanceof CJCommunicationsException) {
            invokeCleanupListeners(sqlE);
        }
        throw sqlE;
    } catch (Throwable ex) {
        if (this.autoReconnect.getValue()) {
            if (ex instanceof IOException) {
                // IO may be dirty or damaged beyond repair, force close it.
                this.protocol.getSocketConnection().forceClose();
            } else if (ex instanceof IOException) {
                invokeCleanupListeners(ex);
            }
            this.needsPing = true;
        }
        throw ExceptionFactory.createException(ex.getMessage(), ex, this.exceptionInterceptor);
    } finally {
        // Requires the JDBC connection parameter maintainTimeStats=true
        if (this.maintainTimeStats.getValue()) {
            // The connection's last-query-finished time is updated here
            this.lastQueryFinishedTime = System.currentTimeMillis();
        }
        if (this.gatherPerfMetrics.getValue()) {
            ((NativeProtocol) this.protocol).getMetricsHolder().registerQueryExecutionTime(System.currentTimeMillis() - queryStartTime);
        }
    }
}
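To make the effect of this timestamp concrete, here is a toy model of the bookkeeping shown above. TimestampSketch is a hypothetical class, not part of Connector/J. It assumes, as the analysis in this article implies, that a ping does not pass through execSQL and therefore leaves lastQueryFinishedTime untouched. The clock is passed in explicitly so the behavior is deterministic.

```java
// Toy model of the driver-side timestamp; not the real Connector/J session.
final class TimestampSketch {
    private final boolean maintainTimeStats;
    private long lastQueryFinishedTime = 0L;

    TimestampSketch(boolean maintainTimeStats) {
        this.maintainTimeStats = maintainTimeStats;
    }

    // Models the finally block of execSQL: the timestamp moves only when maintainTimeStats is on.
    void execSql(long nowMs) {
        if (maintainTimeStats) {
            lastQueryFinishedTime = nowMs;
        }
    }

    // Assumption: a ping keeps the connection alive but does not refresh the timestamp.
    void ping(long nowMs) {
        // intentionally leaves lastQueryFinishedTime unchanged
    }

    // Same shape as getIdleFor() in the driver code quoted above.
    long getIdleFor(long nowMs) {
        return lastQueryFinishedTime == 0 ? 0 : nowMs - lastQueryFinishedTime;
    }

    public static void main(String[] args) {
        TimestampSketch session = new TimestampSketch(true);
        session.execSql(1_000L);  // last real query
        session.ping(10_000L);    // keep-alive by ping
        System.out.println(session.getIdleFor(20_000L)); // prints 19000
    }
}
```

In this model the connection looks idle for 19 s even though it was pinged 10 s ago. That matches the situation described in this article, where pinged connections are still reported as idle for too long.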
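Solution
The source analysis makes the cause of the problem reasonably clear.
Druid obtains a batch of connections from the database and holds them locally so they can be used quickly.
To verify that a connection is still usable (for example, that the database has not reclaimed it after a timeout, or that the network has not failed), enabling the testOnBorrow switch makes Druid run the idle check whenever a client takes a connection from the pool.
The idle check compares the current time with the time the connection last finished executing SQL.
Our service runs no data queries during startup, and connection keep-alive is done with ping. Once startup takes longer than the 15 s we had configured, borrowing one of the connections pooled at the very beginning fails the check and produces the warning quoted at the start of this article.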
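We can raise the idle-connection eviction time and the keep-alive time so that idle connections survive the query-free period of service startup.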
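In addition, if the service is not very active, that is, it executes SQL only infrequently, you can set the system property druid.mysql.usePingMethod to false so that Druid keeps connections alive by executing SELECT 1 instead. As a side effect, this also refreshes the value returned by getLastPacketReceivedTimeMs.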
// com.alibaba.druid.pool.vendor.MySqlValidConnectionChecker#configFromProperties
public void configFromProperties(Properties properties) {
    if (properties == null) {
        return;
    }

    String property = properties.getProperty("druid.mysql.usePingMethod");
    if ("true".equals(property)) {
        setUsePingMethod(true);
    } else if ("false".equals(property)) {
        setUsePingMethod(false);
    }
}
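One detail of configFromProperties is easy to miss: only the exact strings "true" and "false" change the flag, and any other value leaves the current setting in place. The sketch below reproduces that parsing rule outside Druid. UsePingMethodSketch and resolve are hypothetical names, and the value passed as current merely stands in for whatever the checker held before the call.

```java
import java.util.Properties;

// Standalone copy of the parsing rule; not Druid's own class.
final class UsePingMethodSketch {

    // Returns the flag value after applying the property, or `current` if the property does not apply.
    static boolean resolve(Properties properties, boolean current) {
        if (properties == null) {
            return current;
        }
        String property = properties.getProperty("druid.mysql.usePingMethod");
        if ("true".equals(property)) {
            return true;
        } else if ("false".equals(property)) {
            return false;
        }
        return current; // missing or unrecognized value: keep the existing setting
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("druid.mysql.usePingMethod", "false");
        System.out.println(resolve(props, true)); // prints false

        props.setProperty("druid.mysql.usePingMethod", "FALSE");
        System.out.println(resolve(props, true)); // prints true, because the match is case-sensitive
    }
}
```

So a value such as FALSE or 0 would silently leave ping-based validation enabled.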
The source code suggests other approaches as well; readers can explore them on their own.
spring:
  datasource:
    druid:
      # Let the underlying JDBC driver maintain the connection's time statistics
      url: jdbc:mysql://xxx?maintainTimeStats=true
      # Idle-connection eviction time
      time-between-eviction-runs-millis: 300000
      # Must be greater than time-between-eviction-runs-millis
      keep-alive-between-time-millis: 450000
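The two timing constraints behind this configuration can be checked before deployment with a small helper. The following is only a sketch: PoolTimingCheck, problems, and the expectedStartupMillis parameter are hypothetical and not part of Druid or Spring. The first rule comes from the comment in the configuration above; the second restates this article's reasoning that the eviction interval should outlast the query-free startup period.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sanity check for the pool timing values; not a Druid API.
final class PoolTimingCheck {

    static List<String> problems(long timeBetweenEvictionRunsMillis,
                                 long keepAliveBetweenTimeMillis,
                                 long expectedStartupMillis) {
        List<String> found = new ArrayList<>();
        // Rule from the configuration comment: keep-alive interval must exceed the eviction interval.
        if (keepAliveBetweenTimeMillis <= timeBetweenEvictionRunsMillis) {
            found.add("keep-alive-between-time-millis must be greater than time-between-eviction-runs-millis");
        }
        // Rule from the article's reasoning: idle connections must survive the query-free startup phase.
        if (timeBetweenEvictionRunsMillis <= expectedStartupMillis) {
            found.add("time-between-eviction-runs-millis should exceed the expected startup time");
        }
        return found;
    }

    public static void main(String[] args) {
        // Values from the configuration above, with an assumed 60 s startup.
        System.out.println(problems(300_000L, 450_000L, 60_000L)); // prints []
        // A 15 s eviction interval with the same assumed startup time.
        System.out.println(problems(15_000L, 30_000L, 60_000L).size()); // prints 1
    }
}
```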
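// Add the system property in the startup code,
// or pass it as the command-line argument -Ddruid.mysql.usePingMethod=false,
// or supply it through an environment variable
public static void main(String[] args) {
    Properties properties = System.getProperties();
    // Use SELECT 1 instead of ping to check that connections are alive
    properties.setProperty("druid.mysql.usePingMethod", "false");
    SpringApplication.run(App.class, args);
}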
Copyright belongs to the original author, 咕咕咕zhou.