pos-system/docs/en/skills/service-discovery-registry.md
Ho Ngoc Hai 2640b351c3 Enhance documentation with detailed diagrams and structured flows
2026-01-01 23:22:54 +07:00

name: service-discovery-registry
description: Service discovery and registry patterns for GoodGo microservices, including service registry, health check orchestration, load balancing strategies, and service mesh integration.

Service Discovery & Registry Patterns

When to Use This Skill

Use this skill when:

  • Implementing service discovery mechanisms
  • Managing a service registry
  • Orchestrating health checks
  • Implementing load balancing strategies
  • Integrating with a service mesh
  • Managing dynamic service registration
  • Implementing DNS-based discovery
  • Building a service catalog

Core Concepts

Service Discovery Types

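  1. Client-Side Discovery: The client queries the service registry and chooses an instance itself
  2. Server-Side Discovery: A load balancer or proxy resolves the instance on the client's behalf (for example, a Kubernetes Service reached through cluster DNS)
  3. Service Mesh: Sidecar proxies handle discovery transparently to the application (Istio, Linkerd)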

Service Registry

  • Static Configuration: Hardcoded service addresses
  • Dynamic Registry: Services register/unregister dynamically
  • DNS-Based: Use DNS for service discovery (Kubernetes)

Service Registration Flow

The service registration lifecycle involves startup registration, periodic heartbeats, and graceful shutdown unregistration.

sequenceDiagram
    participant Service as Service Instance
    participant Registry as Service Registry
    participant Health as Health Check

    Note over Service,Registry: Service Startup
    Service->>Registry: Register service info<br/>(name, url, version)
    Registry->>Registry: Store service metadata
    Registry-->>Service: Registration confirmed

    Note over Service,Registry: Heartbeat Loop (every 30s)
    loop Every 30 seconds
        Service->>Health: Check own health status
        Health-->>Service: Health status
        Service->>Registry: Update registration<br/>(status, lastHeartbeat)
        Registry->>Registry: Update service record
    end

    Note over Service,Registry: Service Shutdown
    Service->>Registry: Unregister service
    Registry->>Registry: Remove service record
    Registry-->>Service: Unregistration confirmed
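
The lifecycle in the diagram can be sketched as a small class. This is a minimal sketch, not the project's actual implementation: `RegistryClient`, `ServiceIdentity`, `RegistrationLifecycle`, and the `checkHealth` callback are hypothetical names introduced for illustration. Only the `register`/`unregister` calls and the 30-second interval come from this document.

```typescript
// Hypothetical shapes; the document only shows register() and unregister() calls.
interface ServiceIdentity {
  name: string;
  url: string;
  version: string;
}

interface ServiceInfo extends ServiceIdentity {
  status: "healthy" | "unhealthy";
  lastHeartbeat: Date;
}

interface RegistryClient {
  register(info: ServiceInfo): Promise<void>;
  unregister(name: string): Promise<void>;
}

class RegistrationLifecycle {
  private timer: ReturnType<typeof setInterval> | undefined = undefined;
  private readonly registry: RegistryClient;
  private readonly identity: ServiceIdentity;
  private readonly checkHealth: () => Promise<boolean>;
  private readonly intervalMs: number;

  constructor(
    registry: RegistryClient,
    identity: ServiceIdentity,
    checkHealth: () => Promise<boolean>,
    intervalMs: number = 30_000,
  ) {
    this.registry = registry;
    this.identity = identity;
    this.checkHealth = checkHealth;
    this.intervalMs = intervalMs;
  }

  // Startup: register once, then begin the heartbeat loop.
  async start(): Promise<void> {
    await this.beat();
    this.timer = setInterval(() => {
      // A failed heartbeat is logged, not thrown, so the loop keeps running.
      this.beat().catch((err) => console.error("heartbeat failed", err));
    }, this.intervalMs);
  }

  // One heartbeat: check own health, then refresh the registry record.
  async beat(): Promise<void> {
    const healthy = await this.checkHealth();
    await this.registry.register({
      ...this.identity,
      status: healthy ? "healthy" : "unhealthy",
      lastHeartbeat: new Date(),
    });
  }

  // Shutdown: stop the loop first so no late heartbeat re-registers the service.
  async stop(): Promise<void> {
    if (this.timer !== undefined) {
      clearInterval(this.timer);
      this.timer = undefined;
    }
    await this.registry.unregister(this.identity.name);
  }
}
```

Stopping the timer before unregistering is a deliberate ordering choice in this sketch: it avoids a heartbeat firing between the unregister call and process exit.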

Health Check Orchestration

Health checks ensure services are available and functioning correctly. The system aggregates health status from multiple services to determine overall system health.

flowchart TD
    Start([Health Check Request]) --> GetServices[Get All Services from Registry]
    GetServices --> CheckEach{For Each Service}
    
    CheckEach --> CheckHealth[Check Service Health Endpoint]
    CheckHealth --> HealthOK{Health OK?}
    
    HealthOK -->|Yes| UpdateHealthy[Update Status: Healthy]
    HealthOK -->|No| UpdateUnhealthy[Update Status: Unhealthy]
    
    UpdateHealthy --> CheckTimeout{Last Heartbeat<br/>< 60s?}
    UpdateUnhealthy --> CheckTimeout
    
    CheckTimeout -->|Yes| MarkActive[Mark as Active]
    CheckTimeout -->|No| MarkStale[Mark as Stale]
    
    MarkActive --> NextService{More Services?}
    MarkStale --> NextService
    
    NextService -->|Yes| CheckEach
    NextService -->|No| AggregateStatus[Aggregate Overall Status]
    
    AggregateStatus --> CountUnhealthy[Count Unhealthy Services]
    CountUnhealthy --> DetermineStatus{Unhealthy Count}
    
    DetermineStatus -->|0| StatusHealthy[Status: Healthy]
    DetermineStatus -->|< 50%| StatusDegraded[Status: Degraded]
    DetermineStatus -->|>= 50%| StatusUnhealthy[Status: Unhealthy]
    
    StatusHealthy --> ReturnResult[Return Health Status]
    StatusDegraded --> ReturnResult
    StatusUnhealthy --> ReturnResult
    
    ReturnResult --> End([End])
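
The two decisions in the flowchart, staleness per service and the overall status, reduce to a pair of pure functions. The sketch below assumes a hypothetical `MonitoredService` shape and assumes that stale services are reported separately rather than counted as unhealthy, since the flowchart only counts unhealthy services.

```typescript
type OverallStatus = "healthy" | "degraded" | "unhealthy";

// Hypothetical record shape for illustration.
interface MonitoredService {
  name: string;
  health: "healthy" | "unhealthy";
  lastHeartbeat: Date;
}

// 60-second threshold taken from the flowchart.
const STALE_AFTER_MS = 60_000;

// A service is stale when its last heartbeat is 60 seconds old or older.
function isStale(service: MonitoredService, now: Date): boolean {
  return now.getTime() - service.lastHeartbeat.getTime() >= STALE_AFTER_MS;
}

// 0 unhealthy -> healthy; fewer than half -> degraded; half or more -> unhealthy.
function overallStatus(services: readonly MonitoredService[]): OverallStatus {
  const unhealthy = services.filter((s) => s.health === "unhealthy").length;
  if (unhealthy === 0) return "healthy";
  return unhealthy * 2 < services.length ? "degraded" : "unhealthy";
}
```

Note that an empty service list yields `healthy` under this rule; a real aggregator may prefer to treat an empty registry as a problem in its own right.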

Load Balancing Strategies

Load balancing distributes requests across multiple service instances. Different strategies are used based on service characteristics and requirements.

flowchart TD
    Start([Incoming Request]) --> GetInstances[Get Available Service Instances]
    GetInstances --> SelectStrategy{Load Balancing Strategy}
    
    SelectStrategy -->|Round Robin| RoundRobin[Round Robin Algorithm]
    SelectStrategy -->|Least Connections| LeastConn[Least Connections Algorithm]
    SelectStrategy -->|Weighted| Weighted[Weighted Round Robin]
    
    RoundRobin --> SelectNext[Select Next Instance in Order]
    SelectNext --> UseInstance[Use Selected Instance]
    
    LeastConn --> CompareConn[Compare Connection Counts]
    CompareConn --> SelectMin[Select Instance with<br/>Minimum Connections]
    SelectMin --> UseInstance
    
    Weighted --> CalculateWeight[Calculate Total Weight]
    CalculateWeight --> RandomSelect[Random Selection Based on Weights]
    RandomSelect --> UseInstance
    
    UseInstance --> ForwardRequest[Forward Request to Instance]
    ForwardRequest --> UpdateStats[Update Statistics]
    UpdateStats --> End([Request Completed])
    
    style RoundRobin fill:#e1f5ff
    style LeastConn fill:#e1f5ff
    style Weighted fill:#e1f5ff

Kubernetes DNS Discovery

Kubernetes provides built-in DNS-based service discovery. Services are automatically discoverable via DNS names.

// Use Kubernetes DNS for service discovery
const serviceUrl = `http://user-service.production.svc.cluster.local`;

DNS Patterns

  • Short form (same namespace): http://user-service
  • Full form (cross-namespace): http://user-service.production.svc.cluster.local
  • With port: http://user-service.production.svc.cluster.local:5000
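
The three patterns above differ only in whether the namespace and port are present, so they can be produced by one helper. `k8sServiceUrl` and `K8sServiceRef` are hypothetical names for this sketch; `cluster.local` is the default cluster domain and may differ in a customized cluster.

```typescript
interface K8sServiceRef {
  service: string;
  namespace?: string; // omit for a same-namespace lookup
  port?: number;
}

function k8sServiceUrl(ref: K8sServiceRef, clusterDomain: string = "cluster.local"): string {
  // Short form when no namespace is given, fully qualified form otherwise.
  const host =
    ref.namespace === undefined
      ? ref.service
      : `${ref.service}.${ref.namespace}.svc.${clusterDomain}`;
  const port = ref.port === undefined ? "" : `:${ref.port}`;
  return `http://${host}${port}`;
}

// Usage: reproduces the three forms listed above.
console.log(k8sServiceUrl({ service: "user-service" }));
console.log(k8sServiceUrl({ service: "user-service", namespace: "production" }));
console.log(k8sServiceUrl({ service: "user-service", namespace: "production", port: 5000 }));
```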

Service Registry Implementation

Register Service

// Register service
await serviceRegistry.register({
  name: 'user-service',
  version: '1.0.0',
  url: 'http://user-service:5000',
  healthCheckUrl: 'http://user-service:5000/health',
  status: 'healthy',
  lastHeartbeat: new Date(),
});

Discover Service

// Discover service
const service = await serviceRegistry.discover('user-service');
if (service?.status === 'healthy') {
  // Use service
}

List Healthy Services

// List all healthy services
const healthyServices = await serviceRegistry.listHealthyServices();
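
To make the three calls above concrete, here is a minimal in-memory stand-in for the registry. It is an assumption-laden sketch for local testing, not the actual `serviceRegistry`: the class name `InMemoryServiceRegistry` and the `RegisteredService` type are hypothetical, and a production registry would normally be backed by a shared store.

```typescript
// Field names mirror the register() example above.
interface RegisteredService {
  name: string;
  version: string;
  url: string;
  healthCheckUrl: string;
  status: "healthy" | "unhealthy";
  lastHeartbeat: Date;
}

class InMemoryServiceRegistry {
  // Keyed by service name, so this sketch tracks one instance per service.
  private readonly records = new Map<string, RegisteredService>();

  // Registering again with the same name overwrites the record (used as a heartbeat).
  async register(service: RegisteredService): Promise<void> {
    this.records.set(service.name, { ...service });
  }

  async unregister(name: string): Promise<void> {
    this.records.delete(name);
  }

  // Resolves to undefined when the service is not registered.
  async discover(name: string): Promise<RegisteredService | undefined> {
    return this.records.get(name);
  }

  async listHealthyServices(): Promise<RegisteredService[]> {
    return Array.from(this.records.values()).filter((s) => s.status === "healthy");
  }
}
```

Supporting several instances per service would require a composite key such as name plus instance ID, which this sketch leaves out.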

Health Check Aggregation

Aggregate health status from multiple services to determine overall system health.

// Aggregate health from multiple services
const health = await healthAggregator.getAggregatedHealth();
// Returns: { status: 'healthy' | 'degraded' | 'unhealthy', services: [...] }
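
One way an aggregator could produce that result is to probe every service concurrently and treat a failed probe as unhealthy. In this sketch the `probe` callback is a hypothetical injection point (for example, an HTTP GET against each service's health check URL), and the 50% threshold is borrowed from the flowchart in Health Check Orchestration.

```typescript
type AggregateStatus = "healthy" | "degraded" | "unhealthy";

interface ServiceHealth {
  name: string;
  healthy: boolean;
}

interface AggregatedHealth {
  status: AggregateStatus;
  services: ServiceHealth[];
}

async function getAggregatedHealth(
  names: readonly string[],
  probe: (name: string) => Promise<boolean>,
): Promise<AggregatedHealth> {
  // Probe all services concurrently; a probe that throws counts as unhealthy.
  const services = await Promise.all(
    names.map(async (name): Promise<ServiceHealth> => {
      try {
        return { name, healthy: await probe(name) };
      } catch {
        return { name, healthy: false };
      }
    }),
  );

  const unhealthy = services.filter((s) => !s.healthy).length;
  let status: AggregateStatus = "healthy";
  if (unhealthy > 0) {
    status = unhealthy * 2 < services.length ? "degraded" : "unhealthy";
  }
  return { status, services };
}
```

Catching per probe rather than around `Promise.all` keeps one unreachable service from failing the whole aggregation.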

Load Balancing Implementation

Round Robin

const instance = loadBalancer.roundRobin(instances);
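
Round robin needs a cursor that survives between calls, so the selector has to hold state. A minimal sketch, with the hypothetical class name `RoundRobinSelector` standing in for whatever `loadBalancer` does internally:

```typescript
class RoundRobinSelector {
  private cursor = 0;

  // Returns instances in order and wraps around at the end of the list.
  select<T>(instances: readonly T[]): T {
    if (instances.length === 0) {
      throw new Error("no instances available");
    }
    // The modulo also keeps the cursor valid if the list shrank since the last call.
    const index = this.cursor % instances.length;
    this.cursor = (index + 1) % instances.length;
    return instances[index] as T;
  }
}
```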

Least Connections

const instance = loadBalancer.leastConnections(instances);
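
Least connections only works if someone maintains the connection counts. The sketch below assumes each instance carries a hypothetical `activeConnections` counter and that requests are wrapped so the counter rises and falls with in-flight work.

```typescript
// Hypothetical instance shape for illustration.
interface TrackedInstance {
  url: string;
  activeConnections: number;
}

// Picks the instance with the fewest in-flight requests; ties keep the earliest one.
function leastConnections(instances: readonly TrackedInstance[]): TrackedInstance {
  if (instances.length === 0) {
    throw new Error("no instances available");
  }
  return instances.reduce((best, candidate) =>
    candidate.activeConnections < best.activeConnections ? candidate : best,
  );
}

// Wraps a request so the counter is decremented even when the request fails.
async function withConnection<T>(
  instance: TrackedInstance,
  request: () => Promise<T>,
): Promise<T> {
  instance.activeConnections += 1;
  try {
    return await request();
  } finally {
    instance.activeConnections -= 1;
  }
}
```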

Weighted Round Robin

const instance = loadBalancer.weightedRoundRobin(instances);
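
The flowchart in Load Balancing Strategies describes this strategy as computing a total weight and then selecting randomly in proportion to the weights, which is weighted random selection rather than a strict rotation. The sketch below follows the flowchart. `WeightedInstance`, `weightedPick`, and the injectable `random` parameter are hypothetical, and weights are assumed to be non-negative.

```typescript
// Hypothetical instance shape; weight is assumed to be >= 0.
interface WeightedInstance {
  url: string;
  weight: number;
}

function weightedPick(
  instances: readonly WeightedInstance[],
  random: () => number = Math.random,
): WeightedInstance {
  const total = instances.reduce((sum, instance) => sum + instance.weight, 0);
  if (total <= 0) {
    throw new Error("no instance with a positive weight");
  }
  // Pick a point in [0, total) and walk the instances until it falls inside one.
  let point = random() * total;
  for (const instance of instances) {
    point -= instance.weight;
    if (point < 0) return instance;
  }
  // Reached only through floating-point rounding; fall back to the last instance.
  return instances[instances.length - 1] as WeightedInstance;
}
```

Passing `random` as a parameter makes the selection deterministic in tests; an instance with weight 0 is never chosen.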

Service Mesh Integration

Service mesh solutions like Istio provide automatic service discovery and advanced routing capabilities.

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-service
spec:
  hosts:
    - external-api.example.com
  ports:
    - number: 443
      name: https
      protocol: HTTPS
  location: MESH_EXTERNAL
  resolution: DNS

Best Practices

  1. Kubernetes DNS: Use Kubernetes DNS for service discovery in K8s environments
  2. Health Checks: Implement comprehensive health checks (/health, /health/live, /health/ready)
  3. Service Registry: Use registry for dynamic services that need runtime discovery
  4. Load Balancing: Choose appropriate load balancing strategy based on service characteristics
  5. Monitoring: Monitor service discovery and health check metrics
  6. Heartbeat: Implement periodic heartbeat (every 30 seconds) to keep registry updated
  7. Graceful Shutdown: Always unregister services on shutdown to prevent stale entries
  8. Fallback: Provide fallback mechanisms when registry is unavailable

Common Mistakes

  1. Hardcoded Service URLs: Breaks in different environments

    // ❌ BAD: Hardcoded
    const url = 'http://user-service:5000';
    
    // ✅ GOOD: Use discovery or env vars
    const url = discovery.getServiceUrl('user-service');
    
  2. No Heartbeat: Stale registry entries

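    // ❌ BAD: Register once
    await registry.register(service);
    
    // ✅ GOOD: Register, then refresh with a periodic heartbeat
    await registry.register(service);
    setInterval(() => {
      // Log failures instead of leaving an unhandled promise rejection
      registry.register(service).catch((err) => console.error('Heartbeat failed', err));
    }, 30000);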
    
  3. Missing Graceful Shutdown: Orphaned registrations

    // ✅ Always unregister on shutdown
    process.on('SIGTERM', async () => {
      try {
        await registry.unregister(serviceName);
      } finally {
        // Exit even if the registry call fails, so shutdown never hangs
        process.exit(0);
      }
    });
    
  4. No Fallback: Fails when registry unavailable

    // ❌ BAD: No fallback
    const service = await registry.discover('service');
    
    // ✅ GOOD: Fall back to a default when the lookup fails or returns nothing
    const found = await registry.discover('service').catch(() => undefined);
    const url = found?.url ?? process.env.SERVICE_FALLBACK_URL;
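
A fuller version of the fallback idea keeps the last URL the registry returned and uses it before any static default. This is a sketch under assumptions: `DiscoveryClient`, `ResilientResolver`, and the static `fallbacks` map are hypothetical, and the sketch treats a registry error and a missing record the same way.

```typescript
// Hypothetical minimal view of the registry used by this sketch.
interface DiscoveryClient {
  discover(name: string): Promise<{ url: string } | undefined>;
}

class ResilientResolver {
  private readonly lastKnown = new Map<string, string>();
  private readonly registry: DiscoveryClient;
  private readonly fallbacks: ReadonlyMap<string, string>;

  constructor(registry: DiscoveryClient, fallbacks: ReadonlyMap<string, string>) {
    this.registry = registry;
    this.fallbacks = fallbacks;
  }

  // Resolution order: live registry, last known good URL, static fallback.
  async resolve(name: string): Promise<string> {
    try {
      const found = await this.registry.discover(name);
      if (found !== undefined) {
        this.lastKnown.set(name, found.url);
        return found.url;
      }
    } catch {
      // Registry unreachable: fall through to cached or static values.
    }
    const url = this.lastKnown.get(name) ?? this.fallbacks.get(name);
    if (url === undefined) {
      throw new Error(`cannot resolve service "${name}"`);
    }
    return url;
  }
}
```

The static map could be populated from environment variables at startup, in the spirit of `SERVICE_FALLBACK_URL` above.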
    

Quick Reference

Discovery Types

| Discovery Type | Implementation | Use Case |
| --- | --- | --- |
| K8s DNS | service.namespace.svc.cluster.local | Internal services |
| Service Registry | Database-backed | Dynamic services |
| Service Mesh | Istio/Linkerd | Complex routing |
| Environment Vars | process.env.SERVICE_URL | Simple/external |

Health Check Endpoints

| Endpoint | Purpose |
| --- | --- |
| /health | Basic health |
| /health/live | K8s liveness probe |
| /health/ready | K8s readiness probe |
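
The difference between the three endpoints is easiest to see as a routing function that is independent of any HTTP framework. The sketch below is illustrative only: `HealthState`, `resolveHealth`, and the exact semantics of `/health` are assumptions. The liveness and readiness semantics follow standard Kubernetes behavior, where a failing liveness probe restarts the container and a failing readiness probe removes the pod from Service endpoints.

```typescript
// Hypothetical state flags for illustration.
interface HealthState {
  started: boolean;           // the process has finished booting
  dependenciesReady: boolean; // e.g. database and registry are reachable
}

interface HealthResponse {
  statusCode: number;
  body: { status: string };
}

function resolveHealth(path: string, state: HealthState): HealthResponse {
  switch (path) {
    case "/health/live":
      // Liveness: answering at all proves the process is running.
      return { statusCode: 200, body: { status: "alive" } };
    case "/health/ready": {
      // Readiness: only accept traffic once dependencies are reachable.
      const ready = state.started && state.dependenciesReady;
      return ready
        ? { statusCode: 200, body: { status: "ready" } }
        : { statusCode: 503, body: { status: "not ready" } };
    }
    case "/health":
      // Basic health (assumed here to mean "the service has started").
      return state.started
        ? { statusCode: 200, body: { status: "ok" } }
        : { statusCode: 503, body: { status: "starting" } };
    default:
      return { statusCode: 404, body: { status: "not found" } };
  }
}
```

Keeping liveness free of dependency checks avoids restart loops when a downstream system, rather than the service itself, is the one that is failing.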

Load Balancing Strategies

| Strategy | When to Use |
| --- | --- |
| Round Robin | Equal capacity servers |
| Least Connections | Varying request durations |
| Weighted | Different server capacities |

Service Registration Lifecycle

Startup → Register → Heartbeat (30s) → ... → Shutdown → Unregister

Resources