Parameter Tuning for Spring Cloud Netflix Eureka

The sections below briefly describe Eureka's core parameters, grouped into Client-side and Server-side settings.

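Client parameters

Core Client parameters

| Parameter | Default | Description |
| --- | --- | --- |
| eureka.client.availability-zones | | Tells the Client which regions and availability zones exist; changes to this setting take effect at runtime |
| eureka.client.filter-only-up-instances | true | Whether to keep only instances whose InstanceStatus is UP |
| eureka.client.region | us-east-1 | Region in which this application instance runs; applies to AWS datacenters |
| eureka.client.register-with-eureka | true | Whether to register this application instance with the Eureka Server |
| eureka.client.prefer-same-zone-eureka | true | Whether to prefer a Eureka Server in the same zone as this instance |
| eureka.client.on-demand-update-status-change | true | Whether local instance status changes are pushed to the Eureka Server immediately through ApplicationInfoManager |
| eureka.instance.metadata-map | | Metadata for the application instance |
| eureka.instance.prefer-ip-address | false | Whether to use the IP address instead of the host name as the value of the instance's hostName field |
| eureka.instance.lease-expiration-duration-in-seconds | 90 | How long the Eureka Server waits after the last heartbeat before it treats the instance's lease as expired and eligible for eviction |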

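Scheduled task parameters

| Parameter | Default | Description |
| --- | --- | --- |
| eureka.client.cache-refresh-executor-thread-pool-size | 2 | Thread pool size for the CacheRefreshThread that refreshes the cache |
| eureka.client.cache-refresh-executor-exponential-back-off-bound | 10 | Upper bound, as a multiplier of the normal interval, on the delay before the next run when the cache refresh task times out |
| eureka.client.heartbeat-executor-thread-pool-size | 2 | Thread pool size for the HeartBeatThread heartbeat thread |
| eureka.client.heartbeat-executor-exponential-back-off-bound | 10 | Upper bound, as a multiplier of the normal interval, on the delay before the next run when the heartbeat task times out |
| eureka.client.registry-fetch-interval-seconds | 30 | Scheduling frequency of the CacheRefreshThread |
| eureka.client.eureka-service-url-poll-interval-seconds | 5*60 | Interval at which AsyncResolver.updateTask refreshes the Eureka Server addresses |
| eureka.client.initial-instance-info-replication-interval-seconds | 40 | Initial delay before InstanceInfoReplicator syncs instance information changes to the Eureka Server |
| eureka.client.instance-info-replication-interval-seconds | 30 | Interval at which InstanceInfoReplicator syncs instance information changes to the Eureka Server |
| eureka.instance.lease-renewal-interval-in-seconds | 30 | Interval at which the Eureka Client sends heartbeats to the Eureka Server |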
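To make the two exponential-back-off-bound settings more concrete, here is a minimal sketch of how the delay might grow when a task keeps timing out. It assumes the delay doubles on every timeout and is capped at interval × bound; the class BackOffSketch and its method are hypothetical and are not part of Eureka.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of back-off growth; not Eureka's actual scheduler class. */
final class BackOffSketch {

    /**
     * Returns the delay (in seconds) used after each of {@code timeouts} consecutive
     * timeouts, assuming the delay doubles each time and is capped at interval * bound.
     */
    static List<Long> delaysAfterTimeouts(long intervalSeconds, int backOffBound, int timeouts) {
        long maxDelay = intervalSeconds * backOffBound;
        long delay = intervalSeconds;
        List<Long> delays = new ArrayList<>();
        for (int i = 0; i < timeouts; i++) {
            delay = Math.min(maxDelay, delay * 2);
            delays.add(delay);
        }
        return delays;
    }

    public static void main(String[] args) {
        // registry-fetch-interval-seconds = 30,
        // cache-refresh-executor-exponential-back-off-bound = 10
        System.out.println(delaysAfterTimeouts(30, 10, 5)); // [60, 120, 240, 300, 300]
    }
}
```

Under these assumptions the default bound of 10 means a 30s task never waits longer than 300s between attempts, however many timeouts occur in a row.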

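HTTP parameters

The Eureka Client talks to the Eureka Server through an underlying httpClient, which exposes the following parameters:

| Parameter | Default | Description |
| --- | --- | --- |
| eureka.client.eureka-server-connect-timeout-seconds | 5 | Connection timeout |
| eureka.client.eureka-server-read-timeout-seconds | 8 | Read timeout |
| eureka.client.eureka-server-total-connections | 200 | Maximum number of active connections in the pool |
| eureka.client.eureka-server-total-connections-per-host | 50 | Maximum number of connections each host can use |
| eureka.client.eureka-connection-idle-timeout-seconds | 30 | Idle time allowed for connections in the pool |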

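Server parameters

These fall into four groups: basic parameters, response cache parameters, peer parameters, and HTTP parameters.

Basic parameters

| Parameter | Default | Description |
| --- | --- | --- |
| eureka.server.enable-self-preservation | true | Whether self-preservation mode is enabled |
| eureka.server.renewal-percent-threshold | 0.85 | Fraction of the expected renewals per minute that must actually be received |
| eureka.instance.registry.expected-number-of-renews-per-min | 1 | Expected number of renewals per minute. In practice the value is hard-coded to count * 2 in the code, and it is also updated at runtime |
| eureka.server.renewal-threshold-update-interval-ms | 15 minutes | Scheduling frequency of the updateRenewalThreshold task, which dynamically updates expectedNumberOfRenewsPerMin and numberOfRenewsPerMinThreshold |
| eureka.server.eviction-interval-timer-in-ms | 60*1000 | Scheduling frequency of the EvictionTask, which removes expired instances |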

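Response cache parameters

To improve the performance of its REST API, Eureka Server provides two caches: readOnlyCacheMap, based on ConcurrentMap, and readWriteCacheMap, based on Guava Cache. The related parameters are:

| Parameter | Default | Description |
| --- | --- | --- |
| eureka.server.use-read-only-response-cache | true | Whether to use the read-only response cache |
| eureka.server.response-cache-update-interval-ms | 30*1000 | Scheduling interval of CacheUpdateTask, which copies data from readWriteCacheMap to readOnlyCacheMap. Only takes effect when eureka.server.use-read-only-response-cache is true |
| eureka.server.response-cache-auto-expiration-in-seconds | 180 | The expireAfterWrite setting of readWriteCacheMap: how long after being written an entry expires |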
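The way the two caches interact can be sketched roughly as follows. This is a simplified model rather than Eureka's actual implementation: the class TwoLevelCacheSketch and its methods are hypothetical, a plain ConcurrentHashMap stands in for the Guava cache, and time-based expiry is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

/** Hypothetical, simplified model of a read-only cache in front of a read-write cache. */
final class TwoLevelCacheSketch {
    private final Map<String, String> readOnlyCacheMap = new ConcurrentHashMap<>();
    private final Map<String, String> readWriteCacheMap = new ConcurrentHashMap<>();
    private final Function<String, String> registryLoader; // must not return null
    private final boolean useReadOnlyCache;

    TwoLevelCacheSketch(Function<String, String> registryLoader, boolean useReadOnlyCache) {
        this.registryLoader = registryLoader;
        this.useReadOnlyCache = useReadOnlyCache;
    }

    /** Serves a response: read-only cache first (if enabled), then the read-write cache. */
    String get(String key) {
        if (!useReadOnlyCache) {
            return readWriteCacheMap.computeIfAbsent(key, registryLoader);
        }
        return readOnlyCacheMap.computeIfAbsent(key,
                k -> readWriteCacheMap.computeIfAbsent(k, registryLoader));
    }

    /** Called when the registry changes (register, cancel, eviction). */
    void invalidate(String key) {
        readWriteCacheMap.remove(key);
    }

    /** Models the periodic update task: refresh read-only entries that have changed. */
    void syncReadOnlyCache() {
        for (String key : readOnlyCacheMap.keySet()) {
            String fresh = readWriteCacheMap.computeIfAbsent(key, registryLoader);
            if (!fresh.equals(readOnlyCacheMap.get(key))) {
                readOnlyCacheMap.put(key, fresh);
            }
        }
    }

    public static void main(String[] args) {
        AtomicReference<String> registry = new AtomicReference<>("v1");
        TwoLevelCacheSketch cache = new TwoLevelCacheSketch(k -> registry.get(), true);
        System.out.println(cache.get("apps")); // v1
        registry.set("v2");
        cache.invalidate("apps");
        System.out.println(cache.get("apps")); // v1 (stale until the next sync)
        cache.syncReadOnlyCache();
        System.out.println(cache.get("apps")); // v2
    }
}
```

In this model a registry change is visible immediately when the read-only cache is disabled, and only after the next sync when it is enabled, which is the trade-off that use-read-only-response-cache and response-cache-update-interval-ms control.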

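Peer parameters

| Parameter | Default | Description |
| --- | --- | --- |
| eureka.server.peer-eureka-nodes-update-interval-ms | 10 minutes | Scheduling interval of peersUpdateTask, which reloads the peerEurekaNodes configuration from the configuration file (the zone entries under eureka.client.serviceUrl) |
| eureka.server.peer-eureka-status-refresh-time-interval-ms | 30*1000 | Interval at which peer node status information is refreshed |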

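HTTP parameters

The Eureka Server has to communicate with its peer nodes to replicate instance information. It uses an httpClient underneath, which exposes the following parameters:

| Parameter | Default | Description |
| --- | --- | --- |
| eureka.server.peer-node-connect-timeout-ms | 200 | Connection timeout |
| eureka.server.peer-node-read-timeout-ms | 200 | Read timeout |
| eureka.server.peer-node-total-connections | 1000 | Maximum number of active connections in the pool |
| eureka.server.peer-node-total-connections-per-host | 500 | Maximum number of connections each host can use |
| eureka.server.peer-node-connection-idle-timeout-seconds | 30 | Idle time allowed for connections in the pool |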

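Parameter tuning

Common questions

1. Why does the Eureka Server API still return a service after it has gone offline?

2. Why can't the Eureka Client see a newly started service promptly?

3. Why does the following message appear: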

EMERGENCY!EUREKA MAY BE INCORRECTLY CLAIMING INSTANCES ARE UP WHEN THEY’RE NOT. RENEWALS ARE LESSER THAN THRESHOLD AND HENCE THE INSTANCES ARE NOT BEING EXPIRED JUST TO BE SAFE

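Solutions:

1. Eureka Server is not strongly consistent, so the registry can retain expired instance information. The causes are:

  • The application instance crashed and could not tell the Eureka Server to take its registration offline first. Removal then depends on the Eureka Server's EvictionTask.
  • The application instance did notify the Eureka Server when it went offline, but the Eureka Server's REST API has a response cache, so the change only becomes visible once the cache expires.
  • The Eureka Server has SELF PRESERVATION mode enabled and has entered it, so registry entries are not evicted when they expire until the server leaves SELF PRESERVATION mode.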

For Clients that go offline without notifying the Eureka Server, you can run the EvictionTask more often, for example by changing the default 60s interval to 5s:

eureka:
  server:
    eviction-interval-timer-in-ms: 5000

For the response cache, you can consider disabling readOnlyCacheMap, depending on your situation:

eureka:
  server:
    use-read-only-response-cache: false

Or adjust the expiration time of readWriteCacheMap:

eureka:
  server:
    response-cache-auto-expiration-in-seconds: 60
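To see how these settings add up for an instance that crashes without deregistering, here is a rough back-of-the-envelope sketch. It assumes the worst case is simply the sum of the lease expiration, one eviction interval, one read-only cache update interval, and one client fetch interval, and it ignores network latency, peer replication, and any client-side load balancer cache. The class StaleInstanceDelaySketch is hypothetical.

```java
/** Hypothetical worst-case estimate; a rough upper bound, not a measured value. */
final class StaleInstanceDelaySketch {

    /** Seconds until clients stop seeing a crashed instance, under the stated assumptions. */
    static long worstCaseSeconds(long leaseExpirationSeconds,
                                 long evictionIntervalMs,
                                 boolean useReadOnlyCache,
                                 long cacheUpdateIntervalMs,
                                 long registryFetchIntervalSeconds) {
        // The read-only cache only adds delay when it is enabled.
        long cacheDelayMs = useReadOnlyCache ? cacheUpdateIntervalMs : 0;
        return leaseExpirationSeconds
                + (evictionIntervalMs + cacheDelayMs) / 1000
                + registryFetchIntervalSeconds;
    }

    public static void main(String[] args) {
        // Defaults: 90s lease, 60s eviction, 30s cache update, 30s fetch
        System.out.println(worstCaseSeconds(90, 60_000, true, 30_000, 30)); // 210
        // Eviction every 5s and read-only cache disabled
        System.out.println(worstCaseSeconds(90, 5_000, false, 30_000, 30)); // 125
    }
}
```

Under this estimate the lease expiration itself remains the largest term after tuning the server-side intervals.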

For SELF PRESERVATION, you can set enable-self-preservation to false in a test environment:

eureka:
  server:
    enable-self-preservation: false

Once it is disabled, the following message appears:

THE SELF PRESERVATION MODE IS TURNED OFF. THIS MAY NOT PROTECT INSTANCE EXPIRY IN CASE OF NETWORK/OTHER PROBLEMS.

Or:

RENEWALS ARE LESSER THAN THE THRESHOLD. THE SELF PRESERVATION MODE IS TURNED OFF. THIS MAY NOT PROTECT INSTANCE EXPIRY IN CASE OF NETWORK/OTHER PROBLEMS.

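2. When the Eureka Client is slow to see a newly started service, you can raise how often the client pulls registry information from the Server in a test environment. The example below changes the default 30s to 5s:

eureka:
  client:
    registry-fetch-interval-seconds: 5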
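The client fetch interval is not the only delay involved: the server's read-only response cache can also hold back a new registration. A rough estimate, assuming the worst case is one read-only cache update interval plus one client fetch interval and ignoring network latency and peer replication, might look like this. The class NewInstanceVisibilitySketch is hypothetical.

```java
/** Hypothetical worst-case estimate for how long a new instance stays invisible to clients. */
final class NewInstanceVisibilitySketch {

    static long worstCaseSeconds(boolean useReadOnlyCache,
                                 long cacheUpdateIntervalMs,
                                 long registryFetchIntervalSeconds) {
        // The read-only cache only adds delay when it is enabled.
        long cacheDelaySeconds = useReadOnlyCache ? cacheUpdateIntervalMs / 1000 : 0;
        return cacheDelaySeconds + registryFetchIntervalSeconds;
    }

    public static void main(String[] args) {
        System.out.println(worstCaseSeconds(true, 30_000, 30));  // 60 (defaults)
        System.out.println(worstCaseSeconds(true, 30_000, 5));   // 35 (fetch every 5s)
        System.out.println(worstCaseSeconds(false, 30_000, 5));  // 5  (read-only cache off too)
    }
}
```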

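3. In production, network jitter and similar problems often cause a service instance to miss its heartbeats to the Eureka Server even though the instance itself is healthy. Evicting it under the lease expiry mechanism would then be a misjudgment, and if the misjudgment is widespread, most of the entries in the registry could be deleted, leaving no available services. To address this, Eureka introduced the SELF PRESERVATION mechanism: when the number of renewals received in the last minute is less than or equal to the specified threshold, lease-expiry eviction is switched off and the scheduled task is prevented from evicting expired instances, which protects the registry.

In production, you can lower the renewalPercentThreshold and leaseRenewalIntervalInSeconds parameters a little, which makes the SELF PRESERVATION mechanism harder to trigger.

eureka:
  instance:
    lease-renewal-interval-in-seconds: 10 # default is 30
  server:
    renewal-percent-threshold: 0.49 # default is 0.85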
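The effect of these two values can be worked through with a small sketch. It assumes the server expects two renewals per instance per minute (the hard-coded doubling that the basic-parameters table refers to) and computes the threshold as that expectation multiplied by renewal-percent-threshold, truncated to an integer. Eureka versions differ in how they compute this, so treat the class SelfPreservationSketch and its methods as a hypothetical illustration only.

```java
/** Hypothetical worked example of the self-preservation arithmetic. */
final class SelfPreservationSketch {

    /** Renewals per minute the server expects, assuming two per registered instance. */
    static int expectedRenewsPerMin(int instanceCount) {
        return instanceCount * 2;
    }

    /** Threshold at or below which self-preservation kicks in. */
    static int renewsPerMinThreshold(int instanceCount, double renewalPercentThreshold) {
        return (int) (expectedRenewsPerMin(instanceCount) * renewalPercentThreshold);
    }

    /** Heartbeats per minute that healthy instances actually send. */
    static int actualRenewsPerMin(int instanceCount, int leaseRenewalIntervalSeconds) {
        return instanceCount * (60 / leaseRenewalIntervalSeconds);
    }

    /** Triggered when last minute's renewals are less than or equal to the threshold. */
    static boolean selfPreservationTriggered(int renewsLastMin, int threshold) {
        return renewsLastMin <= threshold;
    }

    public static void main(String[] args) {
        int instances = 10;
        // Defaults: 30s heartbeat, 0.85 threshold
        System.out.println(actualRenewsPerMin(instances, 30));      // 20
        System.out.println(renewsPerMinThreshold(instances, 0.85)); // 17
        // Tuned: 10s heartbeat, 0.49 threshold
        System.out.println(actualRenewsPerMin(instances, 10));      // 60
        System.out.println(renewsPerMinThreshold(instances, 0.49)); // 9
    }
}
```

With the defaults in this model, losing the heartbeats of just two of ten instances (16 renewals against a threshold of 17) is enough to trigger self-preservation; with the tuned values, renewals would have to fall from 60 to 9 or fewer.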

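Monitoring metrics

Eureka has built-in metrics based on Servo, defined in com.netflix.eureka.util.EurekaMonitors. Spring Boot 2.x switched to Micrometer and no longer supports Netflix Servo, supporting its replacement, Netflix Spectator, instead. For Servo, however, you can call DefaultMonitorRegistry.getInstance().getRegisteredMonitors to obtain all registered Monitors and then read their metric values.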

//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//

package com.netflix.eureka.util;

import com.netflix.appinfo.AmazonInfo;
import com.netflix.appinfo.ApplicationInfoManager;
import com.netflix.appinfo.DataCenterInfo;
import com.netflix.appinfo.AmazonInfo.MetaDataKey;
import com.netflix.appinfo.DataCenterInfo.Name;
import com.netflix.servo.DefaultMonitorRegistry;
import com.netflix.servo.annotations.DataSourceType;
import com.netflix.servo.annotations.Monitor;
import com.netflix.servo.monitor.Monitors;
import java.util.concurrent.atomic.AtomicLong;

public enum EurekaMonitors {
    // Total renewals received since startup
    RENEW("renewCounter", "Number of total renews seen since startup"),
    // Total lease cancellations received since startup
    CANCEL("cancelCounter", "Number of total cancels seen since startup"),
    // Total registry queries since startup
    GET_ALL_CACHE_MISS("getAllCacheMissCounter", "Number of total registery queries seen since startup"),
    // Total delta registry queries since startup
    GET_ALL_CACHE_MISS_DELTA("getAllCacheMissDeltaCounter", "Number of total registery queries for delta seen since startup"),
    // Total registry queries using remote regions since startup
    GET_ALL_WITH_REMOTE_REGIONS_CACHE_MISS("getAllWithRemoteRegionCacheMissCounter", "Number of total registry with remote region queries seen since startup"),
    // Total delta registry queries using remote regions since startup
    GET_ALL_WITH_REMOTE_REGIONS_CACHE_MISS_DELTA("getAllWithRemoteRegionCacheMissDeltaCounter", "Number of total registry queries for delta with remote region seen since startup"),
    // Total delta queries since startup
    GET_ALL_DELTA("getAllDeltaCounter", "Number of total deltas since startup"),
    // Total delta queries that passed regions since startup
    GET_ALL_DELTA_WITH_REMOTE_REGIONS("getAllDeltaWithRemoteRegionCounter", "Number of total deltas with remote regions since startup"),
    // Number of '/{version}/apps' queries since startup
    GET_ALL("getAllCounter", "Number of total registry queries seen since startup"),
    // Number of '/{version}/apps' queries that passed the regions parameter since startup
    GET_ALL_WITH_REMOTE_REGIONS("getAllWithRemoteRegionCounter", "Number of total registry queries with remote regions, seen since startup"),
    // Total requests to /{version}/apps/{appId} since startup
    GET_APPLICATION("getApplicationCounter", "Number of total application queries seen since startup"),
    // Total registrations since startup
    REGISTER("registerCounter", "Number of total registers seen since startup"),
    // Total expired instances evicted since startup
    EXPIRED("expiredCounter", "Number of total expired leases since startup"),
    // Total statusUpdate calls since startup
    STATUS_UPDATE("statusUpdateCounter", "Number of total admin status updates since startup"),
    // Total deleteStatusOverride calls since startup
    STATUS_OVERRIDE_DELETE("statusOverrideDeleteCounter", "Number of status override removals"),
    // Cancel requests received since startup for an instance that could not be found
    CANCEL_NOT_FOUND("cancelNotFoundCounter", "Number of total cancel requests on non-existing instance since startup"),
    // Renew requests received since startup for an instance that could not be found
    RENEW_NOT_FOUND("renewNotFoundexpiredCounter", "Number of total renew on non-existing instance since startup"),
    REJECTED_REPLICATIONS("numOfRejectedReplications", "Number of replications rejected because of full queue"),
    FAILED_REPLICATIONS("numOfFailedReplications", "Number of failed replications - likely from timeouts"),
    // Requests discarded because the rate limiter is enabled
    RATE_LIMITED("numOfRateLimitedRequests", "Number of requests discarded by the rate limiter"),
    // Requests that would be discarded if the rate limiter were enabled
    RATE_LIMITED_CANDIDATES("numOfRateLimitedRequestCandidates", "Number of requests that would be discarded if the rate limiter's throttling is activated"),
    // Full registry fetch requests discarded while the rate limiter is enabled
    RATE_LIMITED_FULL_FETCH("numOfRateLimitedFullFetchRequests", "Number of full registry fetch requests discarded by the rate limiter"),
    // Full registry fetch requests that would be discarded if the rate limiter were enabled
    RATE_LIMITED_FULL_FETCH_CANDIDATES("numOfRateLimitedFullFetchRequestCandidates", "Number of full registry fetch requests that would be discarded if the rate limiter's throttling is activated");

    private final String name;
    private final String myZoneCounterName;
    private final String description;
    @Monitor(
        name = "count",
        type = DataSourceType.COUNTER
    )
    private final AtomicLong counter = new AtomicLong();
    @Monitor(
        name = "count-minus-replication",
        type = DataSourceType.COUNTER
    )
    private final AtomicLong myZoneCounter = new AtomicLong();

    private EurekaMonitors(String name, String description) {
        this.name = name;
        this.description = description;
        DataCenterInfo dcInfo = ApplicationInfoManager.getInstance().getInfo().getDataCenterInfo();
        if (dcInfo.getName() == Name.Amazon) {
            this.myZoneCounterName = ((AmazonInfo) dcInfo).get(MetaDataKey.availabilityZone) + "." + name;
        } else {
            this.myZoneCounterName = "dcmaster." + name;
        }
    }

    public void increment() {
        this.increment(false);
    }

    public void increment(boolean isReplication) {
        this.counter.incrementAndGet();
        if (!isReplication) {
            this.myZoneCounter.incrementAndGet();
        }
    }

    public String getName() {
        return this.name;
    }

    public String getZoneSpecificName() {
        return this.myZoneCounterName;
    }

    public String getDescription() {
        return this.description;
    }

    public long getCount() {
        return this.counter.get();
    }

    public long getZoneSpecificCount() {
        return this.myZoneCounter.get();
    }

    public static void registerAllStats() {
        EurekaMonitors[] var0 = values();
        int var1 = var0.length;

        for (int var2 = 0; var2 < var1; ++var2) {
            EurekaMonitors c = var0[var2];
            Monitors.registerObject(c.getName(), c);
        }
    }

    public static void shutdown() {
        EurekaMonitors[] var0 = values();
        int var1 = var0.length;

        for (int var2 = 0; var2 < var1; ++var2) {
            EurekaMonitors c = var0[var2];
            DefaultMonitorRegistry.getInstance().unregister(Monitors.newObjectMonitor(c.getName(), c));
        }
    }
}

This article is reposted from: https://blog.csdn.net/weixin_42008170/article/details/136026404
Copyright belongs to the original author, NullzzZ. In case of infringement, please contact us for removal.
