Nacos源码分析

it2023-10-02  73

Nacos源码分析

一、入口分析

切入点1:@EnableDiscoveryClient

如果需要将服务注册到注册中心,需要在启动类加上@EnableDiscoveryClient注解,该注解到底有什么作用?

@Target(ElementType.TYPE) @Retention(RetentionPolicy.RUNTIME) @Documented @Inherited @Import(EnableDiscoveryClientImportSelector.class) public @interface EnableDiscoveryClient { // 默认值为true // 如果为true,ServiceRegistry将会自动把服务注册到注册中心 boolean autoRegister() default true; }

从EnableDiscoveryClient源码可以看出该接口有一个autoRegister()方法默认返回值是true, 它还引用了EnableDiscoveryClientImportSelector类。

public class EnableDiscoveryClientImportSelector extends SpringFactoryImportSelector<EnableDiscoveryClient> { @Override public String[] selectImports(AnnotationMetadata metadata) { String[] imports = super.selectImports(metadata); AnnotationAttributes attributes = AnnotationAttributes.fromMap( metadata.getAnnotationAttributes(getAnnotationClass().getName(), true)); boolean autoRegister = attributes.getBoolean("autoRegister"); if (autoRegister) { List<String> importsList = new ArrayList<>(Arrays.asList(imports)); importsList.add( "org.springframework.cloud.client.serviceregistry.AutoServiceRegistrationConfiguration"); imports = importsList.toArray(new String[0]); } else { Environment env = getEnvironment(); if (ConfigurableEnvironment.class.isInstance(env)) { ConfigurableEnvironment configEnv = (ConfigurableEnvironment) env; LinkedHashMap<String, Object> map = new LinkedHashMap<>(); map.put("spring.cloud.service-registry.auto-registration.enabled", false); MapPropertySource propertySource = new MapPropertySource( "springCloudDiscoveryClient", map); configEnv.getPropertySources().addLast(propertySource); } } return imports; } }

主要查看selectImports方法,先获取@EnableDiscoveryClient注解的autoRegister属性。

当autoRegister=true 时,系统就会去自动装配AutoServiceRegistrationConfiguration类,利用了springboot自装配机制

而AutoServiceRegistrationConfiguration类作用则是注入一些我们配置到yml文件中的属性(比如是否自动注册)

@Configuration @EnableConfigurationProperties(AutoServiceRegistrationProperties.class) @ConditionalOnProperty(value = "spring.cloud.service-registry.auto-registration.enabled", matchIfMissing = true) public class AutoServiceRegistrationConfiguration { }

切入点二:META-INF/spring.factories

利用springboot自动注入的原理,以上bean都会被注入到spring容器中

我们看NacosDiscoveryAutoConfiguration

@Configuration @EnableConfigurationProperties @ConditionalOnNacosDiscoveryEnabled @ConditionalOnProperty(value = "spring.cloud.service-registry.auto-registration.enabled", matchIfMissing = true) @AutoConfigureAfter({ AutoServiceRegistrationConfiguration.class, AutoServiceRegistrationAutoConfiguration.class }) public class NacosDiscoveryAutoConfiguration { /** * NacosServiceRegistry:实现服务注册 */ @Bean public NacosServiceRegistry nacosServiceRegistry( NacosDiscoveryProperties nacosDiscoveryProperties) { return new NacosServiceRegistry(nacosDiscoveryProperties); } /** * NacosRegistration:保存服务的基本数据信息 * 这个Bean注入到spring容器的条件则为:容器中必须有AutoServiceRegistrationProperties这个bean */ @Bean @ConditionalOnBean(AutoServiceRegistrationProperties.class) public NacosRegistration nacosRegistration( NacosDiscoveryProperties nacosDiscoveryProperties, ApplicationContext context) { return new NacosRegistration(nacosDiscoveryProperties, context); } /** * NacosAutoServiceRegistration:实现服务自动注册 */ @Bean @ConditionalOnBean(AutoServiceRegistrationProperties.class) public NacosAutoServiceRegistration nacosAutoServiceRegistration( NacosServiceRegistry registry, AutoServiceRegistrationProperties autoServiceRegistrationProperties, NacosRegistration registration) { return new NacosAutoServiceRegistration(registry, autoServiceRegistrationProperties, registration); } }

二、主线分析

服务自动注册

1)原理分析

Spring Cloud 有 Euerka、ZK 等多种注册中心的实现,想要达到实现统一必须有一套规范,而Spring Cloud Commons 就是定义了这一规范。

Spring Cloud Commons里面的org.springframework.cloud.client.serviceregistry包下面有 AutoServiceRegistration、Registration、ServiceRegistry这三个接口,这是服务注册的核心接口。

1、AutoServiceRegistration

AutoServiceRegistration用于服务自动注册。自动注册的意思就是,服务启动后自动把服务信息注册到注册中心。

public interface AutoServiceRegistration { }

AutoServiceRegistration没有定义方法,它的存在就是要规范实现必须要有自动注册。

2、Registration

Registration存储服务信息,用于规范将什么信息注册到注册中心。

public interface Registration extends ServiceInstance { }

Registration继承 ServiceInstance,ServiceInstance定义了一个服务实例应该具有什么信息。

public interface ServiceInstance { /** * 实例的唯一标识 * @return The unique instance ID as registered. */ default String getInstanceId() { return null; } /** * 服务名称 * @return The service ID as registered. */ String getServiceId(); /** * 主机地址 * @return The hostname of the registered service instance. */ String getHost(); /** * 端口号 * @return The port of the registered service instance. */ int getPort(); /** * 是否是HTTPS * @return Whether the port of the registered service instance uses HTTPS. */ boolean isSecure(); /** * 服务URI地址 * @return The service URI address. */ URI getUri(); /** * 服务的元数据信息,如果我们需要给服务携带上其他额外的信息,就可以保存在这个里面 * @return The key / value pair metadata associated with the service instance. */ Map<String, String> getMetadata(); /** * @return The scheme of the service instance. */ default String getScheme() { return null; } }

3、ServiceRegistry

ServiceRegistry是服务注册接口,用来向注册中心注册服务。

public interface ServiceRegistry<R extends Registration> { // 注册服务 void register(R registration); // 反注册 void deregister(R registration); // 关闭 void close(); // 设置服务状态 void setStatus(R registration, String status); // 获取服务状态 <T> T getStatus(R registration); }

下面我们分析下Naocs是如何实现这一规范的?

通过以上入口代码可以发现有三个比较重要的bean,这三个bean也实现了以上规范,接下来主要分析下这三个bean的作用

NacosRegistration:

该类主要是管理服务的一些基本数据,如服务名,服务ip地址等信息。它实现了spring-cloud-commons 提供的Registration、ServiceInstance接口。

public class NacosRegistration implements Registration, ServiceInstance { public static final String MANAGEMENT_PORT = "management.port"; public static final String MANAGEMENT_CONTEXT_PATH = "management.context-path"; public static final String MANAGEMENT_ADDRESS = "management.address"; public static final String MANAGEMENT_ENDPOINT_BASE_PATH = "management.endpoints.web.base-path"; //保存着管理服务的基本数据 private NacosDiscoveryProperties nacosDiscoveryProperties; private ApplicationContext context; public NacosRegistration(NacosDiscoveryProperties nacosDiscoveryProperties, ApplicationContext context) { this.nacosDiscoveryProperties = nacosDiscoveryProperties; this.context = context; } .....省略 }

NacosServiceRegistry:

该类实现了 spring-cloud-commons 提供的 ServiceRegistry接口,在register方法中主要是将配置文件封装成Instance实例,调用了namingService.registerInstance(serviceId, instance)方法将服务注册到注册中心。

public class NacosServiceRegistry implements ServiceRegistry<Registration> { private static final Logger log = LoggerFactory.getLogger(NacosServiceRegistry.class); private final NacosDiscoveryProperties nacosDiscoveryProperties; // 这个是用来与注册中心进行通信的,如注册服务到nacos注册中心 private final NamingService namingService; public NacosServiceRegistry(NacosDiscoveryProperties nacosDiscoveryProperties) { this.nacosDiscoveryProperties = nacosDiscoveryProperties; this.namingService = nacosDiscoveryProperties.namingServiceInstance(); } // 服务注册 @Override public void register(Registration registration) { if (StringUtils.isEmpty(registration.getServiceId())) { log.warn("No service to register for nacos client..."); return; } String serviceId = registration.getServiceId(); String group = nacosDiscoveryProperties.getGroup(); // 将 Registration 转换成 Instance Instance instance = getNacosInstanceFromRegistration(registration); try { // 将服务注册到注册中心 // 这个方法里面的底层还是发送http请求完成注册(重点关注,其实是利用的HttpClient发送一个请求给服务端) namingService.registerInstance(serviceId, group, instance); log.info("nacos registry, {} {} {}:{} register finished", group, serviceId, instance.getIp(), instance.getPort()); } catch (Exception e) { log.error("nacos registry, {} register failed...{},", serviceId, registration.toString(), e); // rethrow a RuntimeException if the registration is failed. // issue : https://github.com/alibaba/spring-cloud-alibaba/issues/1132 rethrowRuntimeException(e); } } // 省略其它方法 }

NacosAutoServiceRegistration: 是用来触发服务注册行为的。

查看NacosAutoServiceRegistration源码可以发现,NacosAutoServiceRegistration实现了ApplicationListener接口,项目启动成功后会调用ApplicationListener的onApplicationEvent(WebServerInitializedEvent event)方法,通过这里最终会调用ServiceRegistry.register(R registration)方法将服务注册到注册中心

那为什么会调用onApplicationEvent方法呢? 是因为利用了观察者设计模式,也是spring的事件监听机制,实现了ApplicationListener接口的类会对一个事件感兴趣,当这个监听器监听到这个事件,则会触发onApplicationEvent方法

public abstract class AbstractAutoServiceRegistration<R extends Registration> implements AutoServiceRegistration, ApplicationContextAware, //实现ApplicationListener成为一个监听器,对WebServerInitializedEvent事件感兴趣 ApplicationListener<WebServerInitializedEvent> { //当监听到WebServerInitializedEvent会执行此方法 @Override @SuppressWarnings("deprecation") public void onApplicationEvent(WebServerInitializedEvent event) { //此处调用 bind(event); } @Deprecated public void bind(WebServerInitializedEvent event) { ApplicationContext context = event.getApplicationContext(); if (context instanceof ConfigurableWebServerApplicationContext) { if ("management".equals(((ConfigurableWebServerApplicationContext) context) .getServerNamespace())) { return; } } this.port.compareAndSet(0, event.getWebServer().getPort()); //此处调用 this.start(); } public void start() { if (!isEnabled()) { if (logger.isDebugEnabled()) { logger.debug("Discovery Lifecycle disabled. Not starting"); } return; } // only initialize if nonSecurePort is greater than 0 and it isn't already running // because of containerPortInitializer below if (!this.running.get()) { this.context.publishEvent( new InstancePreRegisteredEvent(this, getRegistration())); //这里会经过一系列调用链,会掉到NacosServiceRegistry中的register方法 register(); if (shouldRegisterManagement()) { registerManagement(); } this.context.publishEvent( new InstanceRegisteredEvent<>(this, getConfiguration())); this.running.compareAndSet(false, true); } } }

那么WebServerInitializedEvent事件是何时发布的呢?我们通过断点debug观察下:

2)服务注册细节分析

接下来我们看下客户端是如何向服务端注册服务的

客户端

NacosServiceRegistry#register

而NacosNamingService是NacosServiceRegistry实例化的时候创建的

public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException { if (instance.isEphemeral()) { //封装了心跳信息 BeatInfo beatInfo = new BeatInfo(); beatInfo.setServiceName(NamingUtils.getGroupedName(serviceName, groupName)); beatInfo.setIp(instance.getIp()); beatInfo.setPort(instance.getPort()); beatInfo.setCluster(instance.getClusterName()); beatInfo.setWeight(instance.getWeight()); beatInfo.setMetadata(instance.getMetadata()); beatInfo.setScheduled(false); long instanceInterval = instance.getInstanceHeartBeatInterval(); beatInfo.setPeriod(instanceInterval == 0L ? DEFAULT_HEART_BEAT_INTERVAL : instanceInterval); //心跳续约的方法 this.beatReactor.addBeatInfo( NamingUtils.getGroupedName(serviceName, groupName), beatInfo); } //服务注册 this.serverProxy.registerService( NamingUtils.getGroupedName(serviceName, groupName), groupName, instance); } public void registerService(String serviceName, String groupName, Instance instance) throws NacosException { Map<String, String> params = new HashMap(9); params.put("namespaceId", this.namespaceId); params.put("serviceName", serviceName); params.put("groupName", groupName); params.put("clusterName", instance.getClusterName()); params.put("ip", instance.getIp()); params.put("port", String.valueOf(instance.getPort())); params.put("weight", String.valueOf(instance.getWeight())); params.put("enable", String.valueOf(instance.isEnabled())); params.put("healthy", String.valueOf(instance.isHealthy())); params.put("ephemeral", String.valueOf(instance.isEphemeral())); params.put("metadata", JSON.toJSONString(instance.getMetadata())); //服务注册 this.reqAPI(UtilAndComs.NACOS_URL_INSTANCE, params, (String)"POST"); } public String reqAPI(String api, Map<String, String> params, List<String> servers, String method) { params.put("namespaceId", this.getNamespaceId()); if (CollectionUtils.isEmpty(servers) && StringUtils.isEmpty(this.nacosDomain)) { throw new IllegalArgumentException("no server available"); } else { Exception exception = new Exception(); if (servers != null && !servers.isEmpty()) { Random random = new Random(System.currentTimeMillis()); int index = random.nextInt(servers.size()); for(int i = 0; i < servers.size(); ++i) { String server = (String)servers.get(index); try { //重点关注,服务注册方法 return this.callServer(api, params, server, method); } catch (NacosException var11) { exception = var11; LogUtils.NAMING_LOGGER.error("request {} failed.", server, var11); } catch (Exception var12) { exception = var12; LogUtils.NAMING_LOGGER.error("request {} failed.", server, var12); } index = (index + 1) % servers.size(); } } else { //重试机制 int i = 0; while(i < 3) { try { return this.callServer(api, params, this.nacosDomain); } catch (Exception var13) { exception = var13; ++i; } } } } } public String callServer(String api, Map<String, String> params, String curServer, String method) throws NacosException { long start = System.currentTimeMillis(); long end = 0L; this.checkSignature(params); List<String> headers = this.builderHeaders(); String url; if (!curServer.startsWith("https://") && !curServer.startsWith("http://")) { if (!curServer.contains(":")) { curServer = curServer + ":" + this.serverPort; } url = HttpClient.getPrefix() + curServer + api; } else { url = curServer + api; } //向服务端发送请求,注册服务 HttpResult result = HttpClient.request(url, headers, params, "UTF-8", method); end = System.currentTimeMillis(); MetricsMonitor .getNamingRequestMonitor(method, url, String.valueOf(result.code)).observe((double)(end - start)); if (200 == result.code) { return result.content; } else if (304 == result.code) { return ""; } else { } }

服务端

接下来我们看下服务端的代码,其实就是一个Controller

我们只需找到接受上面请求的Controller即可(InstanceController#register)

@CanDistro @PostMapping @Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE) public String register(HttpServletRequest request) throws Exception { //从请求参数中获取namespaceId final String namespaceId = WebUtils .optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID); //从请求参数中获取服务名称 final String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME); //校验服务名称的格式 分组名称@@服务名 checkServiceNameFormat(serviceName); //获取要注册的实例 final Instance instance = parseInstance(request); //注册入口 serviceManager.registerInstance(namespaceId, serviceName, instance); return "ok"; }

1.从注册入口进入

public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException { //Map<namespaceId, Map<groupName@@serviceName, Service>> serviceMap //如果serviceMap没有此服务,则创建一个并放入serviceMap中 createEmptyService(namespaceId, serviceName, instance.isEphemeral()); //获取服务 Service service = getService(namespaceId, serviceName); if (service == null) { throw new NacosException(NacosException.INVALID_PARAM, "service not found, namespace: " + namespaceId + ", service: " + serviceName); } //服务中挂载实例 addInstance(namespaceId, serviceName, instance.isEphemeral(), instance); }

2.1 进入createEmptyService方法

public void createEmptyService(String namespaceId, String serviceName, boolean local) throws NacosException { createServiceIfAbsent(namespaceId, serviceName, local, null); }

主要做了三件事情:

把服务放入serviceMap服务注册表中

初始化服务,创建一个健康检查的任务(主线–健康检查代码)

向一个队列中添加一个监听器(RecordListener类型)当监听到某些事件时会执行里面的onChange方法,Nacos大量运用了观察者设计模式,比如实例的注册、剔除等会被抽象成一个个的任务放到一个阻塞队列中,当监听到有任务时进来时,监听器会处理这些任务,执行onChange方法

Service:主要关心实例的变动

ServiceManager:主要关心服务的变动

SwitchManager:主要关心AP模型、CP模型的切换

很多开源框架为了提升操作性能会大量使用这种异步任务及内存队列的操作,这些操作本身并不需要写入后立即成功,用这种方式对提高性能有很大的帮助

public void createServiceIfAbsent(String namespaceId, String serviceName, boolean local, Cluster cluster) throws NacosException { Service service = getService(namespaceId, serviceName); //第一次注册服务service为null if (service == null) { Loggers.SRV_LOG.info("creating empty service {}:{}", namespaceId, serviceName); service = new Service(); service.setName(serviceName); service.setNamespaceId(namespaceId); service.setGroupName(NamingUtils.getGroupName(serviceName)); // now validate the service. if failed, exception will be thrown service.setLastModifiedMillis(System.currentTimeMillis()); service.recalculateChecksum(); if (cluster != null) { cluster.setService(service); service.getClusterMap().put(cluster.getName(), cluster); } service.validate(); //主要做了三件事情:一把服务放入serviceMap服务注册表中, // 二初始化服务,创建一个健康检查的任务, // 三设置监听器 putServiceAndInit(service); //与持久化有关, if (!local) { addOrReplaceService(service); } } } private void putServiceAndInit(Service service) throws NacosException { //把服务放入serviceMap中 Map<namespaceId, Map<groupName@@serviceName, Service>> serviceMap putService(service); //初始化服务,创建一个健康检查的任务 service.init(); consistencyService .listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), true), service); consistencyService .listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), false), service); Loggers.SRV_LOG.info("[NEW-SERVICE] {}", service.toJson()); }

2.2 进入addInstance方法

public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips)throws NacosException { //service 唯一标识 String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral); //获取服务 Service service = getService(namespaceId, serviceName); synchronized (service) { //拿到service服务下面的所有实例 List<Instance> instanceList = addIpAddresses(service, ephemeral, ips); Instances instances = new Instances(); instances.setInstanceList(instanceList); //consistencyService 默认的实现是DelegateConsistencyServiceImpl 类 //而DelegateConsistencyServiceImpl属于一种静态代理模式,根据是否是临时节点选择AP或者CP模式 //默认是CP模式,默认实现类为DistroConsistencyServiceImpl consistencyService.put(key, instances); } }

我们主要分析DistroConsistencyServiceImpl类,AP模型

@DependsOn("ProtocolManager") @org.springframework.stereotype.Service("distroConsistencyService") public class DistroConsistencyServiceImpl implements EphemeralConsistencyService, DistroDataProcessor { //一个通知任务线程 private volatile Notifier notifier = new Notifier(); //存储监听器队列 private Map<String, ConcurrentLinkedQueue<RecordListener>> listeners = new ConcurrentHashMap<>(); //当Spring容器启动后执行初始化方法 @PostConstruct public void init() { //让Notifier任务跑起来 GlobalExecutor.submitDistroNotifyTask(notifier); } //上面的方法会进入到这里 @Override public void put(String key, Record value) throws NacosException { onPut(key, value); //集群同步 distroProtocol.sync(new DistroKey(key, KeyBuilder.INSTANCE_LIST_KEY_PREFIX), DataOperation.CHANGE, globalConfig.getTaskDispatchPeriod() / 2); } public void onPut(String key, Record value) { if (KeyBuilder.matchEphemeralInstanceListKey(key)) { //Datum中存储了实例列表 Datum<Instances> datum = new Datum<>(); datum.value = (Instances) value; datum.key = key; datum.timestamp.incrementAndGet(); //dataStore与集群同步有关 dataStore.put(key, datum); } //Map<String, ConcurrentLinkedQueue<RecordListener>> listeners if (!listeners.containsKey(key)) { return; } //添加一个change任务 notifier.addTask(key, DataOperation.CHANGE); } public class Notifier implements Runnable { //存储任务的队列 private BlockingQueue<Pair<String, DataOperation>> tasks = new ArrayBlockingQueue<>(1024 * 1024); //队列中添加任务 public void addTask(String datumKey, DataOperation action) { if (services.containsKey(datumKey) && action == DataOperation.CHANGE) { return; } if (action == DataOperation.CHANGE) { services.put(datumKey, StringUtils.EMPTY); } tasks.offer(Pair.with(datumKey, action)); } @Override public void run() { Loggers.DISTRO.info("distro notifier started"); for (; ; ) { try { //BlockingQueue<Pair<String, DataOperation>> tasks 阻塞队列 //会一直往阻塞队列中获取任务 Pair<String, DataOperation> pair = tasks.take(); //拿到任务后执行 handle(pair); } catch (Throwable e) { Loggers.DISTRO.error("[NACOS-DISTRO] Error while handling notifying task", e); } } } //处理任务 private void handle(Pair<String, DataOperation> pair) { try { String datumKey = pair.getValue0(); DataOperation action = pair.getValue1(); services.remove(datumKey); int count = 0; if (!listeners.containsKey(datumKey)) { return; } for (RecordListener listener : listeners.get(datumKey)) { count++; try { if (action == DataOperation.CHANGE) { //执行监听器中的onChange方法 listener.onChange(datumKey, dataStore.get(datumKey).value); continue; } if (action == DataOperation.DELETE) { //执行监听器中的onDelete方法 listener.onDelete(datumKey); continue; } } catch (Throwable e) { } } if (Loggers.DISTRO.isDebugEnabled()) { Loggers.DISTRO.debug(""); } } catch (Throwable e) { Loggers.DISTRO.error("[NACOS-DISTRO] Error while handling notifying task", e); } } } }

3.通过以上分析,服务注册实在Service中的onChange方法中完成的

//当监听到注册任务时,会执行onChange方法 @Override public void onChange(String key, Instances value) throws Exception { Loggers.SRV_LOG.info("[NACOS-RAFT] datum is changed, key: {}, value: {}", key, value); for (Instance instance : value.getInstanceList()) { if (instance == null) { // Reject this abnormal instance list: throw new RuntimeException("got null instance " + key); } if (instance.getWeight() > 10000.0D) { instance.setWeight(10000.0D); } if (instance.getWeight() < 0.01D && instance.getWeight() > 0.0D) { instance.setWeight(0.01D); } } //将注册实例更新到注册表内存中 updateIPs(value.getInstanceList(), KeyBuilder.matchEphemeralInstanceListKey(key)); recalculateChecksum(); }

集群同步

初始化全量拉取:Distro协议节点启动时会从其他节点全量同步数据(DistroLoadDataTask)

public DistroProtocol(ServerMemberManager memberManager, DistroComponentHolder distroComponentHolder, DistroTaskEngineHolder distroTaskEngineHolder, DistroConfig distroConfig) { this.memberManager = memberManager; this.distroComponentHolder = distroComponentHolder; this.distroTaskEngineHolder = distroTaskEngineHolder; this.distroConfig = distroConfig; //nacos启动时会启动一些校验任务 startVerifyTask(); } private void startVerifyTask() { DistroCallback loadCallback = new DistroCallback() { @Override public void onSuccess() { loadCompleted = true; } @Override public void onFailed(Throwable throwable) { loadCompleted = false; } }; GlobalExecutor.schedulePartitionDataTimedSync( new DistroVerifyTask(memberManager, distroComponentHolder), distroConfig.getVerifyIntervalMillis()); //开启DistroLoadDataTask线程调用load()方法加载数据 GlobalExecutor.submitLoadDataTask( new DistroLoadDataTask(memberManager, distroComponentHolder, distroConfig, loadCallback)); } public class DistroLoadDataTask implements Runnable { @Override public void run() { try { //加载其他节点数据 load(); if (!checkCompleted()) { GlobalExecutor.submitLoadDataTask(this, distroConfig.getLoadDataRetryDelayMillis()); } else { loadCallback.onSuccess(); Loggers.DISTRO.info("[DISTRO-INIT] load snapshot data success"); } } catch (Exception e) { loadCallback.onFailed(e); Loggers.DISTRO.error("[DISTRO-INIT] load snapshot data failed. ", e); } } private void load() throws Exception { while (memberManager.allMembersWithoutSelf().isEmpty()) { Loggers.DISTRO.info("[DISTRO-INIT] waiting server list init..."); TimeUnit.SECONDS.sleep(1); } while (distroComponentHolder.getDataStorageTypes().isEmpty()) { Loggers.DISTRO.info("[DISTRO-INIT] waiting distro data storage register..."); TimeUnit.SECONDS.sleep(1); } for (String each : distroComponentHolder.getDataStorageTypes()) { if (!loadCompletedMap.containsKey(each) || !loadCompletedMap.get(each)) { //真正加载数据方法 loadCompletedMap.put(each, loadAllDataSnapshotFromRemote(each)); } } } //从远程机器同步所有数据 private boolean loadAllDataSnapshotFromRemote(String resourceType) { DistroTransportAgent transportAgent = distroComponentHolder.findTransportAgent(resourceType); DistroDataProcessor dataProcessor = distroComponentHolder.findDataProcessor(resourceType); if (null == transportAgent || null == dataProcessor) { return false; } for (Member each : memberManager.allMembersWithoutSelf()) { try { DistroData distroData = transportAgent.getDatumSnapshot(each.getAddress()); boolean result = dataProcessor.processSnapshot(distroData); if (result) { return true; } } catch (Exception e) { } } return false; } }

增量拉取:通过以上分析,我们发现在做服务注册的时候会调用一个onPut方法,之后会调用以下方法进行集群同步的操作

public void sync(DistroKey distroKey, DataOperation action, long delay) { //向除自身外所有节点广播 for (Member each : memberManager.allMembersWithoutSelf()) { DistroKey distroKeyWithTarget = new DistroKey(distroKey.getResourceKey(), distroKey.getResourceType(),each.getAddress()); DistroDelayTask distroDelayTask = new DistroDelayTask(distroKeyWithTarget, action, delay); //通过distroTaskEngineHolder发布延时任务 distroTaskEngineHolder.getDelayTaskExecuteEngine() .addTask(distroKeyWithTarget, distroDelayTask); if (Loggers.DISTRO.isDebugEnabled()) { Loggers.DISTRO.debug("[DISTRO-SCHEDULE] {} to {}", distroKey, each.getAddress()); } } }

心跳续约

通过以上服务注册源码分析发现在服务注册的时候会开启一个心跳续约的定时任务

客户端代码

class BeatTask implements Runnable { BeatInfo beatInfo; public BeatTask(BeatInfo beatInfo) { this.beatInfo = beatInfo; } @Override public void run() { if (beatInfo.isStopped()) { return; } //发送心跳续约请求入口 long result = serverProxy.sendBeat(beatInfo); long nextTime = result > 0 ? result : beatInfo.getPeriod(); //schedule这定时任务的API只会调用一次,每隔一定时间又会调用这个线程执行任务 executorService.schedule(new BeatTask(beatInfo), nextTime, TimeUnit.MILLISECONDS); } } public long sendBeat(BeatInfo beatInfo) { try { if (NAMING_LOGGER.isDebugEnabled()) { ...... } Map<String, String> params = new HashMap<String, String>(4); params.put("beat", JSON.toJSONString(beatInfo)); params.put(CommonParams.NAMESPACE_ID, namespaceId); params.put(CommonParams.SERVICE_NAME, beatInfo.getServiceName()); //发送一个心跳续约的请求给服务端 String result = reqAPI( UtilAndComs.NACOS_URL_BASE + "/instance/beat", params, HttpMethod.PUT); JSONObject jsonObject = JSON.parseObject(result); if (jsonObject != null) { return jsonObject.getLong("clientBeatInterval"); } } catch (Exception e) { ... } return 0L; }

那么服务端是如何处理心跳续约的呢?

服务端代码:InstanceController#beat

@CanDistro @PutMapping("/beat") @Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE) public ObjectNode beat(HttpServletRequest request) throws Exception { ObjectNode result = JacksonUtils.createEmptyJsonNode(); result.put(SwitchEntry.CLIENT_BEAT_INTERVAL, switchDomain.getClientBeatInterval()); //从请求参数parameterMap中,获取key为beat的值 String beat = WebUtils.optional(request, "beat", StringUtils.EMPTY); RsInfo clientBeat = null; if (StringUtils.isNotBlank(beat)) { clientBeat = JacksonUtils.toObj(beat, RsInfo.class); } //从请求参数中获取clusterName的值 String clusterName = WebUtils .optional(request, CommonParams.CLUSTER_NAME, UtilsAndCommons.DEFAULT_CLUSTER_NAME); //从请求参数中获取ip的值 String ip = WebUtils.optional(request, "ip", StringUtils.EMPTY); //从请求参数中获取port的值 int port = Integer.parseInt(WebUtils.optional(request, "port", "0")); if (clientBeat != null) { if (StringUtils.isNotBlank(clientBeat.getCluster())) { //clusterName 默认为DEFAULT clusterName = clientBeat.getCluster(); } else { // fix #2533 clientBeat.setCluster(clusterName); } ip = clientBeat.getIp(); port = clientBeat.getPort(); } //命名空间 默认public String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID); //服务名称,格式为 组名@@服务名 String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME); //校验 serviceName 格式是否为nacos规定格式,避免手动向Server端发送错误心跳请求 checkServiceNameFormat(serviceName); //获取实例信息 Instance instance = serviceManager.getInstance(namespaceId, serviceName, clusterName, ip, port); if (instance == null) { if (clientBeat == null) { result.put(CommonParams.CODE, NamingResponseCode.RESOURCE_NOT_FOUND); return result; } instance = new Instance(); instance.setPort(clientBeat.getPort()); instance.setIp(clientBeat.getIp()); instance.setWeight(clientBeat.getWeight()); instance.setMetadata(clientBeat.getMetadata()); instance.setClusterName(clusterName); instance.setServiceName(serviceName); instance.setInstanceId(instance.getInstanceId()); instance.setEphemeral(clientBeat.isEphemeral()); //如果实例不存在,从新注册一个从clientBeat获取实例信息 //这里是应对如果网络不通导致实例在服务端被下架,或者服务端重启临时节点的服务实例丢失的问题 serviceManager.registerInstance(namespaceId, serviceName, instance); } //获取服务信息 //Nacos的数据模型为 service(服务) --> cluster(集群) --> instance(实例,封装了IP、端口、权重等信息) Service service = serviceManager.getService(namespaceId, serviceName); if (service == null) { throw new NacosException(NacosException.SERVER_ERROR, "service not found: " + serviceName + "@" + namespaceId); } if (clientBeat == null) { clientBeat = new RsInfo(); clientBeat.setIp(ip); clientBeat.setPort(port); clientBeat.setCluster(clusterName); } //开启一个心跳续约的线程,更新客户端实例的最后心跳时间 service.processClientBeat(clientBeat); result.put(CommonParams.CODE, NamingResponseCode.OK); if (instance.containsMetadata(PreservedMetadataKeys.HEART_BEAT_INTERVAL)) { result.put(SwitchEntry.CLIENT_BEAT_INTERVAL, instance.getInstanceHeartBeatInterval()); } result.put(SwitchEntry.LIGHT_BEAT_ENABLED, switchDomain.isLightBeatEnabled()); return result; } public void processClientBeat(final RsInfo rsInfo) { ClientBeatProcessor clientBeatProcessor = new ClientBeatProcessor(); clientBeatProcessor.setService(this); clientBeatProcessor.setRsInfo(rsInfo); HealthCheckReactor.scheduleNow(clientBeatProcessor); } public class ClientBeatProcessor implements Runnable { ... @Override public void run() { Service service = this.service; if (Loggers.EVT_LOG.isDebugEnabled()) { Loggers.EVT_LOG.debug("[CLIENT-BEAT] processing beat: {}", rsInfo.toString()); } //获取IP,端口,集群,及集群中所有实例 String ip = rsInfo.getIp(); String clusterName = rsInfo.getCluster(); int port = rsInfo.getPort(); Cluster cluster = service.getClusterMap().get(clusterName); List<Instance> instances = cluster.allIPs(true); for (Instance instance : instances) { if (instance.getIp().equals(ip) && instance.getPort() == port) { if (Loggers.EVT_LOG.isDebugEnabled()) { Loggers.EVT_LOG.debug("[CLIENT-BEAT] refresh beat: {}", rsInfo.toString()); } //设置最后心跳时间 instance.setLastBeat(System.currentTimeMillis()); if (!instance.isMarked()) { if (!instance.isHealthy()) { instance.setHealthy(true); getPushService().serviceChanged(service); } } } } } }

健康检查

通过分析自动注册的代码,在第一注册服务时,会开启一个健康检查的任务(ServiceManager#putServiceAndInit)

private void putServiceAndInit(Service service) throws NacosException { //把服务放入serviceMap中 Map<namespaceId, Map<groupName@@serviceName, Service>> serviceMap putService(service); //初始化服务,创建一个健康检查的任务 service.init(); consistencyService .listen(KeyBuilder.buildInstanceListKey( service.getNamespaceId(), service.getName(), true), service); consistencyService .listen(KeyBuilder.buildInstanceListKey( service.getNamespaceId(), service.getName(), false), service); Loggers.SRV_LOG.info("[NEW-SERVICE] {}", service.toJson()); } public void init() { //创建一个健康检查的任务 //如果某个实例超过15s没有收到心跳,则它的healthy属性则会被设为false //如果某个实例超过30s没有收到心跳,直接剔除该实例 HealthCheckReactor.scheduleCheck(clientBeatCheckTask); for (Map.Entry<String, Cluster> entry : clusterMap.entrySet()) { entry.getValue().setService(this); entry.getValue().init(); } }

ClientBeatCheckTask:这个就是健康检查的任务

public class ClientBeatCheckTask implements Runnable { @Override public void run() { try { if (!getDistroMapper().responsible(service.getName())) { return; } if (!getSwitchDomain().isHealthCheckEnabled()) { return; } List<Instance> instances = service.allIPs(true); // first set health status of instances: for (Instance instance : instances) { //判读最后心跳时间是否超过15s if (System.currentTimeMillis() - instance.getLastBeat() > instance.getInstanceHeartBeatTimeOut()) { if (!instance.isMarked()) { if (instance.isHealthy()) { //修改实例健康状态 instance.setHealthy(false); //发布一个ServiceChangeEvent事件, //PushService监听器中onApplicationEvent方法将会执行 getPushService().serviceChanged(service); //发布一个事件 ApplicationUtils .publishEvent(new InstanceHeartbeatTimeoutEvent(this, instance)); } } } } if (!getGlobalConfig().isExpireInstance()) { return; } // then remove obsolete instances: for (Instance instance : instances) { if (instance.isMarked()) { continue; } //判断最后心跳时间是否超过30s if (System.currentTimeMillis() - instance.getLastBeat() > instance.getIpDeleteTimeout()) { //删除实例 deleteIp(instance); } } } catch (Exception e) { Loggers.SRV_LOG.warn("Exception while processing client beat time out.", e); } } }

服务剔除

当最后心跳时间超过30s,会执行服务剔除方法

private void deleteIp(Instance instance) { try { NamingProxy.Request request = NamingProxy.Request.newRequest(); request.appendParam("ip", instance.getIp()) .appendParam("port", String.valueOf(instance.getPort())) .appendParam("ephemeral", "true") .appendParam("clusterName", instance.getClusterName()) .appendParam("serviceName", service.getName()) .appendParam("namespaceId", service.getNamespaceId()); String url = "http://127.0.0.1:" + ApplicationUtils.getPort() + ApplicationUtils.getContextPath() + UtilsAndCommons.NACOS_NAMING_CONTEXT + "/instance?" + request.toUrl(); // 发送剔除请求 HttpClient.asyncHttpDelete(url, null, null, new Callback<String>() { @Override public void onReceive(RestResult<String> result) { if (!result.ok()) { ... } } @Override public void onError(Throwable throwable) { ... } @Override public void onCancel() { ... } }); } catch (Exception e) { } } @CanDistro @DeleteMapping @Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE) public String deregister(HttpServletRequest request) throws Exception { Instance instance = getIpAddress(request); String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID); String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME); checkServiceNameFormat(serviceName); Service service = serviceManager.getService(namespaceId, serviceName); if (service == null) { Loggers.SRV_LOG.warn("remove instance from non-exist service: {}", serviceName); return "ok"; } //移除实例 serviceManager.removeInstance(namespaceId, serviceName, instance.isEphemeral(), instance); return "ok"; }
最新回复(0)