Caching Architecture

Multi-layer caching strategy for optimal performance

Overview Diagram

graph TD
    Request[API Request] --> L1{L1 Cache<br/>Memory}
    
    L1 -->|Hit| Return1[Return<br/>< 1ms]
    L1 -->|Miss| L2{L2 Cache<br/>Redis}
    
    L2 -->|Hit| WarmL1[Warm L1]
    WarmL1 --> Return2[Return<br/>< 5ms]
    
    L2 -->|Miss| DB[(Database)]
    DB --> StoreL2[Store L2 + L1]
    StoreL2 --> Return3[Return<br/>< 50ms]
    
    style L1 fill:#d4edda
    style L2 fill:#fff4e1
    style DB fill:#f0e1ff

## System Context

```mermaid
C4Context
    title Caching System Context

    System(service, "Microservice", "Client service using cache")
    System_Ext(db, "Neon PostgreSQL", "Primary database")
    
    Boundary(caching, "Caching Layer") {
        System(l1, "L1 Cache", "In-memory NodeCache")
        System(l2, "L2 Cache", "Redis Cluster")
    }

    Rel(service, l1, "Reads/Writes", "In-process")
    Rel(service, l2, "Reads/Writes", "Redis Protocol")
    Rel(l1, l2, "Fills from", "On miss")
    Rel(service, db, "Cache-aside read", "On L2 miss")
```

Context Description

  • Service: Communicates directly with L1 Cache (in-memory) for lowest latency.
  • L1 Cache: Local cache, not shared, automatic expiration (short TTL).
  • L2 Cache: Shared Redis cluster, holds data longer and syncs across instances.
  • Database: Source of truth, accessed only on cache miss.

Architecture Description

Multi-Layer Caching

The GoodGo platform uses two-layer caching to optimize performance:

L1 Cache (Memory):

  • In-memory cache per service instance
  • Very fast access (< 1ms)
  • Limited capacity (10k keys default)
  • Short TTL (60 seconds default, max 5 minutes)
  • Not shared across instances

L2 Cache (Redis):

  • Shared distributed cache
  • Fast access (< 5ms)
  • Large capacity
  • Longer TTL (configurable, typically 5-15 minutes)
  • Shared across all service instances

Cache Flow:

Request → L1 (memory) → L2 (Redis) → Database

  • L1: 40-50% hit rate
  • L2: 80-90% hit rate (L1 and L2 combined)
  • Database: reached by the remaining 10-20% of requests (cache miss rate)
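
As a rough worked example of what these hit rates imply, the sketch below computes an expected read latency from the per-layer latency ceilings quoted in this document (1 ms, 5 ms, 50 ms). The function name and the choice of mid-range hit rates are illustrative assumptions, and the result is an estimate derived from the stated ceilings, not a measurement.

```typescript
// Hypothetical helper: expected read latency for a two-layer cache.
// l1HitRate: fraction of requests served by L1.
// cumulativeHitRate: fraction served by L1 or L2 together.
function expectedLatencyMs(
  l1HitRate: number,
  cumulativeHitRate: number,
  latency = { l1: 1, l2: 5, db: 50 } // latency ceilings from this document, in ms
): number {
  if (l1HitRate < 0 || cumulativeHitRate > 1 || l1HitRate > cumulativeHitRate) {
    throw new RangeError('expected 0 <= l1HitRate <= cumulativeHitRate <= 1');
  }
  const l2Share = cumulativeHitRate - l1HitRate; // served by L2 after an L1 miss
  const dbShare = 1 - cumulativeHitRate;         // missed both layers
  return l1HitRate * latency.l1 + l2Share * latency.l2 + dbShare * latency.db;
}

// Mid-range figures: 45% L1 hits, 85% combined hits, 15% reach the database.
// 0.45 * 1 + 0.40 * 5 + 0.15 * 50 = 0.45 + 2.0 + 7.5, about 9.95 ms.
const estimate = expectedLatencyMs(0.45, 0.85);
console.log(`${estimate.toFixed(2)} ms`);
```

Under these assumptions the database share contributes 7.5 ms of the total, so lowering the miss rate matters more than shaving L1 or L2 latency.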

Cache Implementation

Multi-Layer Cache Service

import NodeCache from 'node-cache';
import Redis from 'ioredis';
// `logger` is assumed to be the service's structured logger, imported from the application's own logging module.

// Multi-layer cache implementation
export class MultiLayerCache {
  private l1Cache: NodeCache;
  private l2Cache: Redis;

  constructor() {
    // L1: in-process memory cache
    this.l1Cache = new NodeCache({
      stdTTL: 60,        // 60 seconds default
      maxKeys: 10000,    // Max 10k keys; node-cache throws on set() once full
      checkperiod: 120   // Sweep expired keys every 2 minutes
    });

    // L2: Redis cache
    this.l2Cache = new Redis({
      host: process.env.REDIS_HOST,
      port: parseInt(process.env.REDIS_PORT ?? '6379', 10),
      db: 0
    });
  }

  async get<T>(key: string): Promise<T | null> {
    // Try L1 first; compare with undefined so cached 0, '' and false still count as hits
    const l1Value = this.l1Cache.get<T>(key);
    if (l1Value !== undefined) {
      logger.debug('L1 cache hit', { key });
      return l1Value;
    }

    // Try L2
    const l2Value = await this.l2Cache.get(key);
    if (l2Value !== null) {
      logger.debug('L2 cache hit', { key });
      const parsed = JSON.parse(l2Value) as T;

      // Warm L1 with the default 60-second TTL
      this.setL1(key, parsed);
      return parsed;
    }

    logger.debug('Cache miss', { key });
    return null;
  }

  async set(key: string, value: unknown, ttl: number = 300): Promise<void> {
    // Store in both L1 and L2; L1 is capped at 5 minutes
    this.setL1(key, value, Math.min(ttl, 300));
    await this.l2Cache.setex(key, ttl, JSON.stringify(value));
  }

  async del(key: string): Promise<void> {
    this.l1Cache.del(key);
    await this.l2Cache.del(key);
  }

  async invalidatePattern(pattern: string): Promise<void> {
    // L1: clear all (simple approach)
    this.l1Cache.flushAll();

    // L2: delete by pattern, iterating with SCAN because KEYS blocks Redis while it walks the keyspace.
    // This scans only the node the client is connected to; in cluster mode each master needs its own scan.
    const stream = this.l2Cache.scanStream({ match: pattern, count: 100 });
    for await (const keys of stream as AsyncIterable<string[]>) {
      if (keys.length > 0) {
        await this.l2Cache.del(...keys);
      }
    }
  }

  private setL1(key: string, value: unknown, ttl?: number): void {
    // A full L1 must not fail the request; L2 still holds the value
    try {
      if (ttl === undefined) this.l1Cache.set(key, value);
      else this.l1Cache.set(key, value, ttl);
    } catch (error) {
      logger.warn('L1 cache set failed', { key, error });
    }
  }
}
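
The class above returns `null` on a miss and leaves the database read to the caller. A minimal sketch of that cache-aside step follows, assuming a hypothetical `getOrLoad` helper and a reduced `ReadThroughCache` interface (neither appears in the original). It also shares one in-flight load per key within a process, which is one common way to avoid several concurrent misses issuing the same query; the source does not say whether the platform does this.

```typescript
// Minimal shape of the cache used here; a class with matching get/set methods satisfies it.
interface ReadThroughCache {
  get<T>(key: string): Promise<T | null>;
  set(key: string, value: unknown, ttl?: number): Promise<void>;
}

// Loads currently running, keyed by cache key, so concurrent callers
// in this process share one loader call per key.
const inFlight = new Map<string, Promise<unknown>>();

async function getOrLoad<T>(
  cache: ReadThroughCache,
  key: string,
  ttlSeconds: number,
  loader: () => Promise<T>
): Promise<T> {
  const cached = await cache.get<T>(key);
  if (cached !== null) return cached;

  const pending = inFlight.get(key);
  if (pending) return pending as Promise<T>;

  // Start the loader in a microtask so the map entry exists before it runs.
  const load = Promise.resolve()
    .then(loader)
    .then(async (value) => {
      await cache.set(key, value, ttlSeconds);
      return value;
    })
    .finally(() => inFlight.delete(key)); // allow a fresh load after success or failure
  inFlight.set(key, load);
  return load;
}
```

A caller would pass the repository query as `loader`, for example `getOrLoad(cache, 'iam:user:user_123', 600, () => userRepository.findById('user_123'))`, where `findById` is an assumed method name.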

Cache Key Naming

Pattern: {service}:{entity}:{identifier}:{sub-resource}

Examples:

// User cache keys
const cacheKeys = {
  user: (userId: string) => `iam:user:${userId}`,
  userPermissions: (userId: string) => `iam:user:${userId}:permissions`,
  userRoles: (userId: string) => `iam:user:${userId}:roles`,
  session: (sessionId: string) => `iam:session:${sessionId}`,
};

// Usage
const user = await cache.get(cacheKeys.user('user_123'));
const permissions = await cache.get(cacheKeys.userPermissions('user_123'));
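
The naming pattern can also be enforced in one place. The sketch below is a hypothetical `buildKey` helper (not part of the original code) that joins the segments and rejects values that would break the pattern or act as wildcards in a Redis match pattern.

```typescript
// Hypothetical helper: builds "{service}:{entity}:{identifier}[:{sub-resource}]".
function buildKey(
  service: string,
  entity: string,
  identifier: string,
  subResource?: string
): string {
  const segments = [service, entity, identifier];
  if (subResource !== undefined) segments.push(subResource);

  for (const segment of segments) {
    // ':' would add an extra segment; '*', '?', '[' and ']' are glob
    // characters in Redis patterns; whitespace is rejected for readability.
    if (segment.length === 0 || /[:*?\[\]\s]/.test(segment)) {
      throw new Error(`Invalid cache key segment: "${segment}"`);
    }
  }
  return segments.join(':');
}

console.log(buildKey('iam', 'user', 'user_123'));                // iam:user:user_123
console.log(buildKey('iam', 'user', 'user_123', 'permissions')); // iam:user:user_123:permissions
```

Rejecting glob characters in identifiers keeps pattern-based invalidation from matching more keys than intended.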

TTL Strategies

graph LR
    subgraph "TTL Tiers"
        Short[Short TTL<br/>60-300s<br/>Frequently changing]
        Medium[Medium TTL<br/>300-1800s<br/>Moderately changing]
        Long[Long TTL<br/>1800-3600s<br/>Rarely changing]
    end
    
    Short --> Permissions[User Permissions]
    Short --> Sessions[Session Data]
    
    Medium --> UserProfiles[User Profiles]
    Medium --> OrgData[Organization Data]
    
    Long --> Config[Static Config]
    Long --> RefData[Reference Data]
    
    %% style Short fill:#f8d7da
    %% style Medium fill:#fff3cd
    %% style Long fill:#d4edda

TTL Guidelines:

| Data Type | TTL | Reason |
| --- | --- | --- |
| User permissions | 5 min | Security-sensitive |
| Session data | Varies | Based on session length |
| User profiles | 10 min | Moderate update frequency |
| Organization data | 15 min | Infrequent updates |
| Static config | 30-60 min | Very stable |
| Reference data | 1-2 hours | Almost never changes |
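
The guideline values can be kept in one lookup so call sites do not hard-code seconds. The sketch below is an assumption layered on the table: the type names and the `ttlWithJitter` helper are hypothetical, and the downward-only jitter is a common technique for keeping keys written together from expiring together, not something the source prescribes.

```typescript
type DataType =
  | 'userPermissions'
  | 'userProfile'
  | 'organization'
  | 'staticConfig'
  | 'referenceData';

// Base TTLs in seconds from the guideline table (lower bound where a range is given).
const BASE_TTL_SECONDS: Record<DataType, number> = {
  userPermissions: 5 * 60,
  userProfile: 10 * 60,
  organization: 15 * 60,
  staticConfig: 30 * 60,
  referenceData: 60 * 60,
};

// Returns the base TTL shortened by up to `jitterRatio` (10% by default).
// Jitter only shortens the TTL, so the guideline value stays an upper bound.
// `random` is injectable so the behaviour can be tested deterministically.
function ttlWithJitter(
  type: DataType,
  jitterRatio = 0.1,
  random: () => number = Math.random
): number {
  const base = BASE_TTL_SECONDS[type];
  const reduction = random() * jitterRatio * base;
  return Math.max(1, Math.round(base - reduction));
}
```

Session data is left out because the table gives no fixed value for it.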

Cache Invalidation

sequenceDiagram
    participant API
    participant Service
    participant Cache
    participant DB
    
    API->>Service: Update User
    Service->>DB: UPDATE user
    DB-->>Service: Success
    
    Service->>Cache: Invalidate user:123
    Service->>Cache: Invalidate user:123:permissions
    Service->>Cache: Invalidate user:123:roles
    Cache-->>Service: Cleared
    
    Service-->>API: Success
    
    Note over Service,Cache: Next request will fetch fresh data

Invalidation Strategies:

// 1. Single key invalidation (method on the user service)
async updateUser(userId: string, data: UpdateUserDto): Promise<User> {
  const user = await userRepository.update(userId, data);

  // Invalidate user cache
  await cache.del(cacheKeys.user(userId));

  return user;
}

// 2. Pattern-based invalidation
async updateUserRole(userId: string, roleId: string): Promise<void> {
  await userRoleRepository.assign(userId, roleId);

  // Invalidate all user-related cache. The pattern matches only sub-resource keys
  // (permissions, roles), so the base key iam:user:{id} is deleted explicitly.
  await cache.invalidatePattern(`iam:user:${userId}:*`);
  await cache.del(cacheKeys.user(userId));
}

// 3. Time-based invalidation (TTL expiry)
// Handled automatically by the cache
Cache Warming

// Preload frequently accessed data
async function warmCache(): Promise<void> {
  logger.info('Starting cache warming');

  // Warm user permissions for active users
  const activeUsers = await userRepository.findActive({ limit: 1000 });

  for (const user of activeUsers) {
    const permissions = await rbacService.getUserPermissions(user.id);

    await cache.set(
      cacheKeys.userPermissions(user.id),
      permissions,
      300 // 5 minutes
    );
  }

  logger.info('Cache warming completed', { count: activeUsers.length });
}

// Run on service startup
warmCache().catch(err => logger.error('Cache warming failed', { err }));
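
The loop above awaits each user in turn, so warming 1000 users costs 1000 sequential round trips. One possible refinement, sketched here under the assumption that the permission lookups are independent, is to process users in bounded batches. `warmInBatches` is a hypothetical helper, and a suitable batch size depends on what the database and Redis can absorb.

```typescript
// Hypothetical helper: runs `worker` over `items` with at most `batchSize`
// calls in flight, and counts failures instead of aborting the whole run.
async function warmInBatches<T>(
  items: readonly T[],
  batchSize: number,
  worker: (item: T) => Promise<void>
): Promise<{ succeeded: number; failed: number }> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  let succeeded = 0;
  let failed = 0;
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // The async wrapper turns synchronous throws into rejections.
    const results = await Promise.allSettled(batch.map(async (item) => worker(item)));
    for (const result of results) {
      if (result.status === 'fulfilled') succeeded++;
      else failed++;
    }
  }
  return { succeeded, failed };
}
```

In the warming function this would replace the `for` loop, with the worker fetching one user's permissions and writing them to the cache.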

Design Decisions

Decision 1: Multi-layer Caching (L1 + L2)

Context: Need to reduce load on Redis and achieve ultra-low latency for hot data.

Decision: Use a combination of L1 (NodeCache) and L2 (Redis).

Consequences:

  • Latency < 1ms for 40-50% of requests.
  • Reduced network traffic to Redis.
  • Synchronization complexity: each instance's L1 is local, so an entry can stay stale on other instances until its TTL expires (60 seconds by default, 5 minutes at most).
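
One way to shorten the stale window, offered here as an option and not as something the platform is stated to do, is to broadcast invalidations so every instance drops its local copy. The sketch below uses an in-memory `InvalidationBus` as a stand-in for a real channel such as Redis pub/sub; all class names are hypothetical.

```typescript
type Listener = (key: string) => void;

// In-memory stand-in for a pub/sub channel shared by all service instances.
class InvalidationBus {
  private listeners = new Set<Listener>();

  subscribe(listener: Listener): void {
    this.listeners.add(listener);
  }

  publish(key: string): void {
    for (const listener of this.listeners) listener(key);
  }
}

// One instance's L1 cache, reduced to a Map for the sketch.
class LocalCache {
  private entries = new Map<string, unknown>();
  private bus: InvalidationBus;

  constructor(bus: InvalidationBus) {
    this.bus = bus;
    // Drop the local copy whenever any instance announces an invalidation.
    bus.subscribe((key) => {
      this.entries.delete(key);
    });
  }

  get(key: string): unknown {
    return this.entries.get(key);
  }

  set(key: string, value: unknown): void {
    this.entries.set(key, value);
  }

  invalidate(key: string): void {
    this.bus.publish(key); // reaches every subscriber, including this instance
  }
}
```

Delivery on a real channel is best-effort, so the short L1 TTL still acts as the backstop.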

Performance Characteristics

Performance Targets

| Metric | Target | Notes |
| --- | --- | --- |
| L1 Hit Latency | < 0.5ms | In-memory lookup |
| L2 Hit Latency | < 5ms | Network RTT + Redis processing |
| Combined Hit Rate | > 90% | L1 + L2 combined |
| L1 Capacity | 10k items | Per-instance limit to protect heap |
| Cache Warmup Time | < 30s | At service startup |

Security Considerations

Cache Security

  • Encryption: Sensitive data (PII) MUST be encrypted (AES-256) before it is stored in L2 Redis. L1 may hold plaintext because it lives in process memory, although a memory dump of the process would expose it.
  • Isolation: The Redis instance is protected by a password and a Network Policy that allows only internal K8s traffic.
  • TLS: Connect to Redis over TLS 1.2+.
  • Data Sanitization: Do not cache entire user objects if they contain password hashes or secrets.
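
The encryption rule can be sketched with Node's built-in `crypto` module. This is a minimal illustration, assuming AES-256 in GCM mode, a 32-byte key supplied by the caller, and a payload layout (IV, then tag, then ciphertext) chosen for this example; the function names are hypothetical, and key storage and rotation are out of scope.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

const IV_BYTES = 12;  // nonce size commonly used with GCM
const TAG_BYTES = 16; // GCM authentication tag size

// Encrypts a JSON-serializable value; output is base64(iv || tag || ciphertext).
function encryptForCache(value: unknown, key: Buffer): string {
  const iv = randomBytes(IV_BYTES); // fresh nonce per write; never reuse one with the same key
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(value), 'utf8'),
    cipher.final(),
  ]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString('base64');
}

// Reverses encryptForCache; throws if the payload was altered or the key is wrong.
function decryptFromCache<T>(payload: string, key: Buffer): T {
  const raw = Buffer.from(payload, 'base64');
  const iv = raw.subarray(0, IV_BYTES);
  const tag = raw.subarray(IV_BYTES, IV_BYTES + TAG_BYTES);
  const ciphertext = raw.subarray(IV_BYTES + TAG_BYTES);
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  return JSON.parse(plaintext.toString('utf8')) as T;
}
```

The string returned by `encryptForCache` is what would be written to Redis in place of the plain JSON. Because GCM authenticates the ciphertext, a value modified in the cache fails to decrypt and is never returned to the caller.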

Deployment

graph TD
    subgraph "Kubernetes Pod"
        Service[Microservice Container]
        L1["L1 Cache (RAM)"]
        Service --- L1
    end

    subgraph "Infrastructure"
        RedisMaster[Redis Master]
        RedisSlave1[Redis Slave 1]
        RedisSlave2[Redis Slave 2]
    end

    Service -->|Write| RedisMaster
    Service -->|Read| RedisSlave1
    Service -->|Read| RedisSlave2

    RedisMaster -.->|Replication| RedisSlave1
    RedisMaster -.->|Replication| RedisSlave2

    style Service fill:#e1f5ff
    style L1 fill:#d4edda
    style RedisMaster fill:#fff4e1

Deployment Description:

  • L1: Embedded directly in the microservice process; scales with the number of Pods.
  • L2: Redis Cluster (or Sentinel) with at least 3 nodes for high availability.
  • Connection Pooling: ioredis has no built-in connection pool; it pipelines commands over one connection per client, so share a single client per service instance and add a separate pooling layer only if measurements call for it.

Monitoring & Observability

Monitoring Metrics

  • Metrics: Prometheus metrics for hit rate, miss rate, latency, memory usage.
  • Logs: Log cache miss/hit at debug level (sampled), log connection errors at error level.
  • Health Checks: Readiness probe checks connection to Redis.

Monitoring Code

Cache Hit Rates:

import { Counter } from 'prom-client';

// Track cache performance
export class CacheMetrics {
  private hits = new Counter({
    name: 'cache_hits_total',
    help: 'Total cache hits',
    labelNames: ['layer', 'key_prefix']
  });
  
  private misses = new Counter({
    name: 'cache_misses_total',
    help: 'Total cache misses',
    labelNames: ['layer', 'key_prefix']
  });
  
  recordHit(layer: 'l1' | 'l2', key: string): void {
    const prefix = key.split(':')[0];
    this.hits.inc({ layer, key_prefix: prefix });
  }
  
  recordMiss(key: string): void {
    const prefix = key.split(':')[0];
    this.misses.inc({ layer: 'db', key_prefix: prefix });
  }
}

Expected Performance:

| Metric | L1 Cache | L2 Cache | Database |
| --- | --- | --- | --- |
| Latency | < 1ms | < 5ms | < 50ms |
| Hit Rate | 40-50% | 80-90% | - |
| Capacity | 10k keys | Bounded by maxmemory (2 GB per node) | - |

Best Practices

DO:

  • Use cache for frequently accessed data
  • Set appropriate TTLs based on data change frequency
  • Invalidate cache on data updates
  • Use cache key namespacing
  • Monitor cache hit rates
  • Warm cache on startup for critical data

DON'T:

  • Cache data that changes very frequently
  • Set TTL too long (stale data risk)
  • Set TTL too short (negates cache benefit)
  • Cache sensitive data without encryption
  • Ignore cache invalidation on updates
  • Use cache as primary data store

System Context

C4Context
    title Caching Architecture Context
    
    System(services, "Microservices", "Application services")
    
    System_Ext(redis, "Redis Cluster", "L2 distributed cache")
    System_Ext(db, "Neon PostgreSQL", "Primary data store")
    System_Ext(monitoring, "Monitoring", "Cache metrics & alerts")
    
    Rel(services, redis, "Cache operations", "Redis Protocol")
    Rel(services, db, "Data operations; reads on L2 cache miss", "PostgreSQL")
    Rel(redis, monitoring, "Sends metrics", "Prometheus")

Description:

  • Microservices: Use multi-layer cache (L1: Memory, L2: Redis)
  • Redis Cluster: L2 cache shared across all service instances
  • PostgreSQL: Primary data store, fallback on cache miss
  • Monitoring: Collects cache metrics (hit rate, latency, evictions)

Security Considerations

Access Control:

  • Redis AUTH password for authentication
  • Network isolation: Redis only accessible from service pods
  • Kubernetes Network Policies: Whitelist specific services

Encryption:

  • TLS for Redis connections (optional, recommended for production)
  • Encryption at rest: Redis persistence files encrypted
  • Sensitive data: Encrypt before caching (AES-256-GCM)

Data Sensitivity:

  • DON'T cache: Passwords, tokens, credit cards, SSN
  • Cache with encryption: PII (email, phone, address)
  • Cache plaintext: Non-sensitive data (public info, configs)

Cache Poisoning Prevention:

  • Validate data before caching
  • Use signed cache keys to prevent tampering
  • Implement cache key namespacing per service

TTL Management:

  • Short TTL (< 5 min) for security-sensitive data
  • Invalidate cache immediately when data changes
  • Auto-expire sessions on logout

Audit:

  • Log cache access for sensitive data
  • Monitor unusual cache patterns (high miss rate, frequent invalidations)
  • Alert on cache security events

Deployment

graph TD
    subgraph "Redis Cluster"
        subgraph "Masters"
            M1[Redis Master 1<br/>Slots: 0-5460]
            M2[Redis Master 2<br/>Slots: 5461-10922]
            M3[Redis Master 3<br/>Slots: 10923-16383]
        end
        
        subgraph "Slaves"
            S1[Redis Slave 1<br/>Replica of M1]
            S2[Redis Slave 2<br/>Replica of M2]
            S3[Redis Slave 3<br/>Replica of M3]
        end
        
        M1 --> S1
        M2 --> S2
        M3 --> S3
        
        Sentinel[Redis Sentinel<br/>3 nodes]
        
        Sentinel -.->|Monitor| M1
        Sentinel -.->|Monitor| M2
        Sentinel -.->|Monitor| M3
    end
    
    subgraph "Services"
        Service1[Service A]
        Service2[Service B]
        Service3[Service C]
    end
    
    Service1 --> M1
    Service1 --> M2
    Service1 --> M3
    
    Service2 --> M1
    Service2 --> M2
    Service2 --> M3
    
    Service3 --> M1
    Service3 --> M2
    Service3 --> M3
    
    style M1 fill:#e1f5ff
    style M2 fill:#fff4e1
    style M3 fill:#d4edda
    style Sentinel fill:#f0e1ff

Deployment Strategy

Redis Cluster Configuration:

  • Mode: Cluster mode with 3 masters + 3 slaves
  • Replication: Each master has 1 slave for high availability
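  • Failover: In cluster mode, Redis Cluster itself promotes a replica when a majority of masters mark its master as failed. Redis Sentinel provides failover for non-clustered master/replica deployments, so the 3-node Sentinel ensemble is needed only if the deployment runs without cluster mode.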
  • Sharding: 16384 hash slots distributed evenly across 3 masters
  • Persistence: RDB snapshots every 5 minutes, AOF disabled (performance)
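
To make the sharding rule concrete, the sketch below computes the hash slot for a key the way the Redis Cluster specification describes it: CRC16 of the key modulo 16384, hashing only the `{tag}` part when the key contains a non-empty one. The function names are illustrative. The slot ranges in the final comment are the ones shown in the deployment diagram.

```typescript
// CRC16, XMODEM variant (polynomial 0x1021, initial value 0), as used by Redis Cluster.
function crc16(bytes: Uint8Array): number {
  let crc = 0;
  for (let i = 0; i < bytes.length; i++) {
    crc ^= bytes[i] << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 0x8000 ? (crc << 1) ^ 0x1021 : crc << 1;
      crc &= 0xffff;
    }
  }
  return crc;
}

// Hash slot for a key. If the key contains "{...}" with at least one
// character between the braces, only that substring is hashed.
function keySlot(key: string): number {
  let hashed = key;
  const open = key.indexOf('{');
  if (open !== -1) {
    const close = key.indexOf('}', open + 1);
    if (close > open + 1) hashed = key.slice(open + 1, close);
  }
  return crc16(new TextEncoder().encode(hashed)) % 16384;
}

// With three masters owning 0-5460, 5461-10922 and 10923-16383,
// the key "foo" (slot 12182) lands on the third master.
console.log(keySlot('foo'));
```

Keys that share a hash tag map to the same slot, which is what lets a multi-key `DEL` succeed in cluster mode. Wrapping the user id in braces (for example `iam:user:{user_123}:roles`) would be one way to colocate a user's keys; the naming convention in this document does not currently do that.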

Resource Allocation:

| Component | CPU | Memory | Disk | Replicas |
| --- | --- | --- | --- | --- |
| Redis Master | 1 core | 2GB | 10GB SSD | 3 |
| Redis Slave | 1 core | 2GB | 10GB SSD | 3 |
| Sentinel | 500m | 512MB | 5GB | 3 |

Redis Configuration:

# redis.conf
# Comments sit on their own lines; redis.conf does not accept trailing comments after a directive.
maxmemory 2gb
# Evict least recently used keys
maxmemory-policy allkeys-lru
# Close idle connections after 5 minutes
timeout 300
tcp-keepalive 60
# RDB snapshot every 5 minutes if 10+ keys changed
save 300 10
# Disable AOF for performance
appendonly no

# Cluster config
cluster-enabled yes
cluster-node-timeout 5000
cluster-replica-validity-factor 0

High Availability:

  • Automatic failover (built into cluster mode; Redis Sentinel applies only to non-clustered deployments)
  • Replica promotion when a master fails
  • Client-side retry logic
  • Connection pooling (max 50 connections per service)
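
The client-side retry item can be sketched as a small wrapper with capped exponential backoff. The helper names, the option fields and the delay values are assumptions for illustration. ioredis also has its own `retryStrategy` option for reconnecting, which this wrapper does not replace.

```typescript
interface RetryOptions {
  attempts: number;    // total tries, including the first
  baseDelayMs: number; // delay before the second try
  maxDelayMs: number;  // cap for the exponential growth
}

// Delay before retry number `retry` (1-based): base * 2^(retry - 1), capped.
function backoffDelayMs(retry: number, opts: RetryOptions): number {
  return Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** (retry - 1));
}

// Runs `operation`, retrying on rejection until `attempts` is exhausted.
// `sleep` is injectable so tests do not have to wait for real timers.
async function withRetry<T>(
  operation: () => Promise<T>,
  opts: RetryOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.attempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < opts.attempts) await sleep(backoffDelayMs(attempt, opts));
    }
  }
  throw lastError;
}
```

A cache read could then be wrapped as `withRetry(() => cache.get(key), { attempts: 3, baseDelayMs: 50, maxDelayMs: 500 })`. Retries add latency, so for a cache it is often better to fall through to the database after a small number of attempts.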

Scaling Strategy:

  • Vertical: Increase memory per node (2GB → 4GB → 8GB)
  • Horizontal: Add master nodes (3 → 5 → 7)
  • Read Scaling: Route reads to slaves
  • Monitoring: Auto-alert when memory usage > 80%

Monitoring & Observability

Key Metrics

Cache Performance Metrics:

// Custom metrics for cache performance

import { Counter, Histogram, Gauge } from 'prom-client';

export const cacheHits = new Counter({
  name: 'cache_hits_total',
  help: 'Total cache hits',
  labelNames: ['layer', 'key_prefix'] // layer: l1/l2, key_prefix: user/session/etc
});

export const cacheMisses = new Counter({
  name: 'cache_misses_total',
  help: 'Total cache misses',
  labelNames: ['key_prefix']
});

export const cacheLatency = new Histogram({
  name: 'cache_operation_duration_seconds',
  help: 'Cache operation duration',
  labelNames: ['operation', 'layer'], // operation: get/set/del
  buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1]
});

export const cacheSize = new Gauge({
  name: 'cache_size_bytes',
  help: 'Cache size in bytes',
  labelNames: ['layer']
});

export const cacheEvictions = new Counter({
  name: 'cache_evictions_total',
  help: 'Total cache evictions',
  labelNames: ['layer', 'reason'] // reason: ttl_expired/memory_full
});

Redis Metrics:

  • redis_connected_clients - Connected clients
  • redis_used_memory_bytes - Memory usage
  • redis_memory_fragmentation_ratio - Memory fragmentation
  • redis_keyspace_hits_total - Cache hits
  • redis_keyspace_misses_total - Cache misses
  • redis_evicted_keys_total - Evicted keys
  • redis_expired_keys_total - Expired keys
  • redis_commands_processed_total - Commands processed

Calculated Metrics:

# Cache hit rate (sum() drops the differing labels so hits and misses can be combined)
sum(rate(cache_hits_total[5m])) / (sum(rate(cache_hits_total[5m])) + sum(rate(cache_misses_total[5m])))

# L1 share of all cache hits
sum(rate(cache_hits_total{layer="l1"}[5m])) / sum(rate(cache_hits_total[5m]))

# L2 share of all cache hits
sum(rate(cache_hits_total{layer="l2"}[5m])) / sum(rate(cache_hits_total[5m]))

# p95 cache latency
histogram_quantile(0.95, sum by (le) (rate(cache_operation_duration_seconds_bucket[5m])))

# Memory usage percentage
redis_used_memory_bytes / redis_maxmemory_bytes * 100

Alerting Rules:

# Alerting rules for cache

groups:
  - name: cache_alerts
    interval: 30s
    rules:
      # Low cache hit rate
      - alert: LowCacheHitRate
        expr: |
          sum(rate(cache_hits_total[5m])) /
          (sum(rate(cache_hits_total[5m])) + sum(rate(cache_misses_total[5m]))) < 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Low cache hit rate"
          description: "Cache hit rate is {{ $value | humanizePercentage }}"
      
      # High memory usage
      - alert: HighRedisMemoryUsage
        expr: redis_used_memory_bytes / redis_maxmemory_bytes > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High Redis memory usage"
          description: "Redis memory usage is {{ $value | humanizePercentage }}"
      
      # High eviction rate
      - alert: HighEvictionRate
        expr: rate(redis_evicted_keys_total[5m]) > 100
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High cache eviction rate"
          description: "Eviction rate is {{ $value }}/sec"
      
      # Redis down
      - alert: RedisDown
        expr: redis_up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Redis is down"
      
      # High replication lag
      - alert: HighReplicationLag
        expr: redis_replication_lag_seconds > 5
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High Redis replication lag"
          description: "Replication lag is {{ $value }}s"

Dashboards:

  • Cache Overview: Hit rate, miss rate, latency, size
  • Redis Cluster: Memory usage, connections, commands/sec
  • Performance: L1 vs L2 hit rates, operation latency
  • Evictions: Eviction rate, reasons, trends

Logging:

// Structured logging for cache operations

logger.debug('Cache operation', {
  operation: 'get',
  layer: 'l1',
  key: cacheKey,
  hit: true,
  latency: duration,
  correlationId: req.correlationId
});

logger.warn('Cache eviction', {
  layer: 'l2',
  reason: 'memory_full',
  evictedKeys: count,
  memoryUsage: usagePercent
});

logger.error('Cache error', {
  operation: 'set',
  layer: 'l2',
  error: error.message,
  key: cacheKey
});

Health Checks:

// Health check for Redis
async function checkRedisHealth(): Promise<boolean> {
  try {
    await redis.ping();
    const info = await redis.info('memory');
    const memoryUsage = parseMemoryUsage(info);
    
    return memoryUsage < 0.9; // Healthy if < 90% memory
  } catch (error) {
    logger.error('Redis health check failed', { error });
    return false;
  }
}
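
The health check calls `parseMemoryUsage`, which the document does not define. A possible implementation is sketched below, assuming the helper receives the raw text of `INFO memory` and should return used memory as a fraction of `maxmemory`. Treating an unset limit (`maxmemory:0`) as 0% usage is a choice made for this example.

```typescript
// Hypothetical implementation of the parseMemoryUsage helper used by the health check.
// Returns used_memory / maxmemory as a fraction, or 0 when maxmemory is 0
// (Redis reports 0 when no memory limit is configured).
function parseMemoryUsage(info: string): number {
  const fields = new Map<string, string>();
  for (const line of info.split(/\r?\n/)) {
    if (line === '' || line.startsWith('#')) continue; // skip blanks and section headers
    const separator = line.indexOf(':');
    if (separator > 0) {
      fields.set(line.slice(0, separator), line.slice(separator + 1));
    }
  }

  const used = Number(fields.get('used_memory'));
  const max = Number(fields.get('maxmemory'));
  if (!Number.isFinite(used) || !Number.isFinite(max)) {
    throw new Error('INFO memory output is missing used_memory or maxmemory');
  }
  return max > 0 ? used / max : 0;
}
```

With the 2 GB `maxmemory` from the configuration in this document, the check would report unhealthy once `used_memory` passes roughly 1.8 GB.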

Last Updated: 2026-01-07
Authors: GoodGo Architecture Team