Caching Architecture

Multi-layer caching strategy for optimal performance

Overview Diagram

graph TD
    Request[API Request] --> L1{L1 Cache<br/>Memory}
    
    L1 -->|Hit| Return1[Return<br/>< 1ms]
    L1 -->|Miss| L2{L2 Cache<br/>Redis}
    
    L2 -->|Hit| WarmL1[Warm L1]
    WarmL1 --> Return2[Return<br/>< 5ms]
    
    L2 -->|Miss| DB[(Database)]
    DB --> StoreL2[Store L2 + L1]
    StoreL2 --> Return3[Return<br/>< 50ms]
    
    style L1 fill:#d4edda
    style L2 fill:#fff4e1
    style DB fill:#f0e1ff

System Context

C4Context
    title Caching System Context

    System(service, "Microservice", "Client service using cache")
    System_Ext(db, "Neon PostgreSQL", "Primary database")
    
    Boundary(caching, "Caching Layer") {
        System(l1, "L1 Cache", "In-memory NodeCache")
        System(l2, "L2 Cache", "Redis Cluster")
    }

    Rel(service, l1, "Reads/Writes", "In-process")
    Rel(service, l2, "Reads/Writes", "Redis Protocol")
    Rel(l1, l2, "Fills from", "On miss")
    Rel(l2, db, "Cache aside", "On miss")

Context Description

  • Service: Communicates directly with L1 Cache (in-memory) for lowest latency.
  • L1 Cache: Local cache, not shared, automatic expiration (short TTL).
  • L2 Cache: Shared Redis cluster, holds data longer and syncs across instances.
  • Database: Source of truth, accessed only on cache miss.

Architecture Description

Multi-Layer Caching

The GoodGo platform uses two cache layers to improve read performance:

L1 Cache (Memory):

  • In-memory cache per service instance
  • Very fast access (< 1ms)
  • Limited capacity (10k keys default)
  • Short TTL (60 seconds default, max 5 minutes)
  • Not shared across instances

L2 Cache (Redis):

  • Shared distributed cache
  • Fast access (< 5ms)
  • Large capacity
  • Longer TTL (configurable, typically 5-15 minutes)
  • Shared across all service instances

Cache Flow:

Request → L1 → L2 → Database

  • L1: 40-50% hit rate
  • L2: 80-90% hit rate
  • Database: 10-20% cache miss rate
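
As a rough worked example of what these rates imply, the sketch below computes an upper bound on mean read latency. It assumes the L2 figure is cumulative (about 85% of requests are answered by L1 or L2), picks mid-range shares, and reuses the per-layer latency ceilings from the overview diagram. The name `expectedLatencyMs` and the chosen shares are illustrative assumptions, not measured values.

```typescript
// Share of all requests answered by each layer (assumed mid-range values).
interface LayerShare {
  name: string;
  share: number;     // fraction of all requests answered by this layer
  latencyMs: number; // latency ceiling for this layer
}

const layers: LayerShare[] = [
  { name: 'L1', share: 0.45, latencyMs: 1 },
  { name: 'L2', share: 0.40, latencyMs: 5 },
  { name: 'Database', share: 0.15, latencyMs: 50 },
];

// Weighted average of the per-layer latency ceilings.
function expectedLatencyMs(shares: LayerShare[]): number {
  const total = shares.reduce((sum, l) => sum + l.share, 0);
  if (Math.abs(total - 1) > 1e-9) {
    throw new Error('shares must sum to 1');
  }
  return shares.reduce((sum, l) => sum + l.share * l.latencyMs, 0);
}

// 0.45 * 1 + 0.40 * 5 + 0.15 * 50 = 9.95
console.log(expectedLatencyMs(layers).toFixed(2));
```

Under these assumptions the database term (7.5 of 9.95 ms) dominates, so reducing the miss rate matters more than shaving L1 or L2 latency.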

Cache Implementation

Multi-Layer Cache Service

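import NodeCache from 'node-cache';
import Redis from 'ioredis';
import { logger } from './logger'; // assumed path to the service's logger

export class MultiLayerCache {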
  private l1Cache: NodeCache;
  private l2Cache: Redis;
  
  constructor() {
    // L1: Memory cache
    this.l1Cache = new NodeCache({
      stdTTL: 60,        // 60 seconds default
      maxKeys: 10000,    // Max 10k keys
      checkperiod: 120   // Check for expired keys every 2min
    });
    
    // L2: Redis cache
    this.l2Cache = new Redis({
      host: process.env.REDIS_HOST,
      port: parseInt(process.env.REDIS_PORT ?? '6379', 10), // Redis default port if unset
      db: 0
    });
  }
  
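  async get<T>(key: string): Promise<T | null> {
    // Try L1 first. Compare against undefined so cached falsy values
    // (0, '', false) still count as hits.
    const l1Value = this.l1Cache.get<T>(key);
    if (l1Value !== undefined) {
      logger.debug('L1 cache hit', { key });
      return l1Value;
    }
    
    // Try L2 (ioredis returns null when the key is absent)
    const l2Value = await this.l2Cache.get(key);
    if (l2Value !== null) {
      logger.debug('L2 cache hit', { key });
      const parsed = JSON.parse(l2Value) as T;
      
      // Warm L1 cache. NodeCache throws once maxKeys is reached, so a full
      // L1 must not turn an L2 hit into an error.
      try {
        this.l1Cache.set(key, parsed);
      } catch (err) {
        logger.debug('L1 cache full, skipping warm', { key, err });
      }
      return parsed;
    }
    
    logger.debug('Cache miss', { key });
    return null;
  }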
  
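  async set(key: string, value: unknown, ttl: number = 300): Promise<void> {
    // Store in both L1 and L2. NodeCache throws once maxKeys is reached,
    // so an L1 failure is logged and the L2 write still happens.
    try {
      this.l1Cache.set(key, value, Math.min(ttl, 300)); // L1 max 5min
    } catch (err) {
      logger.debug('L1 cache full, skipping', { key, err });
    }
    await this.l2Cache.setex(key, ttl, JSON.stringify(value));
  }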
  
  async del(key: string): Promise<void> {
    this.l1Cache.del(key);
    await this.l2Cache.del(key);
  }
  
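  async invalidatePattern(pattern: string): Promise<void> {
    // L1: Clear all (simple approach; NodeCache has no pattern delete)
    this.l1Cache.flushAll();
    
    // L2: Delete by pattern. SCAN iterates incrementally, whereas KEYS
    // blocks Redis while it walks the whole keyspace.
    const stream = this.l2Cache.scanStream({ match: pattern, count: 100 });
    for await (const keys of stream) {
      if (keys.length > 0) {
        await this.l2Cache.del(...keys);
      }
    }
  }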
}
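
`MultiLayerCache.get` returns `null` on a miss, so the "Database → Store L2 + L1" step from the overview diagram is left to the caller. The following is a minimal sketch of that cache-aside read path. The `ReadWriteCache` interface and the `getOrLoad` helper are hypothetical names introduced for illustration; the interface only describes the two methods the helper needs.

```typescript
// Minimal view of the cache used by this sketch (hypothetical interface).
interface ReadWriteCache {
  get<T>(key: string): Promise<T | null>;
  set(key: string, value: unknown, ttl?: number): Promise<void>;
}

// Hypothetical helper: return the cached value, or load it from the
// source of truth and store it in the cache.
async function getOrLoad<T>(
  cache: ReadWriteCache,
  key: string,
  loader: () => Promise<T | null>,
  ttlSeconds = 300,
): Promise<T | null> {
  const cached = await cache.get<T>(key);
  if (cached !== null) {
    return cached;
  }

  const fresh = await loader();
  // Absent rows are not cached here, so a later insert is visible immediately.
  if (fresh !== null) {
    await cache.set(key, fresh, ttlSeconds);
  }
  return fresh;
}
```

A caller would pass a loader such as `() => userRepository.findById(userId)`, where `findById` is an assumed repository method.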

Cache Key Naming

Pattern: {service}:{entity}:{identifier}:{sub-resource}

Examples:

const cacheKeys = {
  user: (userId: string) => `iam:user:${userId}`,
  userPermissions: (userId: string) => `iam:user:${userId}:permissions`,
  userRoles: (userId: string) => `iam:user:${userId}:roles`,
  session: (sessionId: string) => `iam:session:${sessionId}`,
};

// Usage
const user = await cache.get(cacheKeys.user('user_123'));
const permissions = await cache.get(cacheKeys.userPermissions('user_123'));
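
To keep every key on the `{service}:{entity}:{identifier}:{sub-resource}` layout, the segments can be assembled by one function instead of hand-written template strings. The `buildKey` helper below is a hypothetical sketch; rejecting `:`, `*` and whitespace is an assumption made here so that a segment cannot break the delimiter or act as a wildcard in pattern-based invalidation.

```typescript
// Hypothetical helper: assemble a key in the documented segment order and
// reject segments that would break the colon-delimited layout.
function buildKey(
  service: string,
  entity: string,
  identifier: string,
  subResource?: string,
): string {
  const parts =
    subResource === undefined
      ? [service, entity, identifier]
      : [service, entity, identifier, subResource];

  for (const part of parts) {
    if (part.length === 0 || /[:*\s]/.test(part)) {
      throw new Error(`invalid cache key segment: "${part}"`);
    }
  }
  return parts.join(':');
}

console.log(buildKey('iam', 'user', 'user_123'));                // iam:user:user_123
console.log(buildKey('iam', 'user', 'user_123', 'permissions')); // iam:user:user_123:permissions
```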

TTL Strategies

graph LR
    subgraph "TTL Tiers"
        Short[Short TTL<br/>60-300s<br/>Frequently changing]
        Medium[Medium TTL<br/>300-1800s<br/>Moderately changing]
        Long[Long TTL<br/>1800-3600s<br/>Rarely changing]
    end
    
    Short --> Permissions[User Permissions]
    Short --> Sessions[Session Data]
    
    Medium --> UserProfiles[User Profiles]
    Medium --> OrgData[Organization Data]
    
    Long --> Config[Static Config]
    Long --> RefData[Reference Data]
    
    style Short fill:#f8d7da
    style Medium fill:#fff3cd
    style Long fill:#d4edda

TTL Guidelines:

| Data Type | TTL | Reason |
|---|---|---|
| User permissions | 5 min | Security-sensitive |
| Session data | Varies | Based on session length |
| User profiles | 10 min | Moderate update frequency |
| Organization data | 15 min | Infrequent updates |
| Static config | 30-60 min | Very stable |
| Reference data | 1-2 hours | Almost never changes |
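
One way to apply these guidelines consistently is to keep the TTLs in a single lookup instead of scattering literals across call sites. The sketch below is an assumption about how that could look: the `DataType` names and `ttlFor` are hypothetical, the lower bound is used where the table gives a range, session data is left out because its TTL varies, and the L1 value is capped at the 5-minute maximum stated for L1.

```typescript
type DataType =
  | 'userPermissions'
  | 'userProfile'
  | 'organization'
  | 'staticConfig'
  | 'referenceData';

// L2 TTLs in seconds, taken from the guideline table
// (lower bound where the table gives a range).
const L2_TTL_SECONDS: Record<DataType, number> = {
  userPermissions: 5 * 60,
  userProfile: 10 * 60,
  organization: 15 * 60,
  staticConfig: 30 * 60,
  referenceData: 60 * 60,
};

// L1 entries never live longer than 5 minutes.
const L1_MAX_TTL_SECONDS = 300;

function ttlFor(type: DataType): { l1: number; l2: number } {
  const l2 = L2_TTL_SECONDS[type];
  return { l1: Math.min(l2, L1_MAX_TTL_SECONDS), l2 };
}

console.log(ttlFor('userProfile')); // { l1: 300, l2: 600 }
```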

Cache Invalidation

sequenceDiagram
    participant API
    participant Service
    participant Cache
    participant DB
    
    API->>Service: Update User
    Service->>DB: UPDATE user
    DB-->>Service: Success
    
    Service->>Cache: Invalidate user:123
    Service->>Cache: Invalidate user:123:permissions
    Service->>Cache: Invalidate user:123:roles
    Cache-->>Service: Cleared
    
    Service-->>API: Success
    
    Note over Service,Cache: Next request will fetch fresh data

Invalidation Strategies:

// 1. Single key invalidation
async updateUser(userId: string, data: UpdateUserDto): Promise<User> {
  const user = await userRepository.update(userId, data);
  
  // Invalidate user cache
  await cache.del(cacheKeys.user(userId));
  
  return user;
}

// 2. Pattern-based invalidation
async updateUserRole(userId: string, roleId: string): Promise<void> {
  await userRoleRepository.assign(userId, roleId);
  
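  // Invalidate the user's sub-resource keys (permissions, roles).
  // The pattern does not match the base key iam:user:{userId};
  // delete that key separately if the user object must also be refreshed.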
  await cache.invalidatePattern(`iam:user:${userId}:*`);
}

// 3. Time-based invalidation (TTL expiry)
// Automatically handled by cache
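
When the affected keys are known in advance, as in the sequence diagram above, they can be deleted one by one instead of by pattern. This avoids scanning Redis and avoids flushing the whole L1 cache. The sketch below assumes only a `del` method on the cache; `DeletableCache`, `userKeys` and `invalidateUser` are hypothetical names, and the key formats are repeated from the naming section so the block is self-contained.

```typescript
// Minimal view of the cache used by this sketch (hypothetical interface).
interface DeletableCache {
  del(key: string): Promise<void>;
}

// The three keys the sequence diagram invalidates for one user.
const userKeys = (userId: string): string[] => [
  `iam:user:${userId}`,
  `iam:user:${userId}:permissions`,
  `iam:user:${userId}:roles`,
];

// Hypothetical helper: delete every known key for one user.
async function invalidateUser(cache: DeletableCache, userId: string): Promise<void> {
  await Promise.all(userKeys(userId).map((key) => cache.del(key)));
}
```

Like `del` on the multi-layer cache, this clears L1 only on the instance that runs it; other instances keep their L1 copy until its TTL expires.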

Cache Warming

// Preload frequently accessed data
async function warmCache(): Promise<void> {
  logger.info('Starting cache warming');
  
  // Warm user permissions for active users
  const activeUsers = await userRepository.findActive({ limit: 1000 });
  
  for (const user of activeUsers) {
    const permissions = await rbacService.getUserPermissions(user.id);
    
    await cache.set(
      cacheKeys.userPermissions(user.id),
      permissions,
      300 // 5 minutes
    );
  }
  
  logger.info('Cache warming completed', { count: activeUsers.length });
}

// Run on service startup
warmCache().catch(err => logger.error('Cache warming failed', { err }));
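
The loop above warms one user at a time and stops at the first failure. A possible variation, sketched here under the assumption that a bounded amount of parallelism is acceptable for the database and Redis, processes users in fixed-size batches and keeps going when a single item fails. `warmInBatches` is a hypothetical helper, not part of the existing code.

```typescript
// Hypothetical helper: run `task` over `items` in batches of `batchSize`,
// continuing past individual failures. Returns the number of successes.
async function warmInBatches<T>(
  items: T[],
  batchSize: number,
  task: (item: T) => Promise<void>,
): Promise<number> {
  if (batchSize < 1) {
    throw new Error('batchSize must be at least 1');
  }
  let succeeded = 0;
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // Each task reports success or failure instead of rejecting the batch.
    const outcomes = await Promise.all(
      batch.map(async (item) => {
        try {
          await task(item);
          return true;
        } catch {
          return false;
        }
      }),
    );
    succeeded += outcomes.filter((ok) => ok).length;
  }
  return succeeded;
}
```

Inside `warmCache`, the `for` loop could then become a call such as `warmInBatches(activeUsers, 50, warmOneUser)`, where `warmOneUser` wraps the existing `getUserPermissions` and `cache.set` calls and the batch size of 50 is an arbitrary starting point.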

Design Decisions

Decision 1: Multi-layer Caching (L1 + L2)

Context: Need to reduce load on Redis and achieve ultra-low latency for hot data.

Decision: Use a combination of L1 (NodeCache) and L2 (Redis).

Consequences:

  • Latency < 1ms for 40-50% of requests.
  • Reduced network traffic to Redis.
  • Added synchronization complexity: L1 may serve stale data for a short time.

Performance Characteristics

Performance Targets

| Metric | Target | Notes |
|---|---|---|
| L1 Hit Latency | < 0.5ms | In-memory lookup |
| L2 Hit Latency | < 5ms | Network RTT + Redis processing |
| Combined Hit Rate | > 90% | L1 + L2 combined |
| L1 Capacity | 10k items | Per-instance limit to protect heap |
| Cache Warmup Time | < 30s | At service startup |

Security Considerations

Cache Security

  • Encryption: Sensitive data (PII) MUST be encrypted with AES-256 before it is stored in L2 (Redis). L1 may hold plaintext because it lives in process memory and is exposed only through a memory dump.
  • Isolation: Redis instance protected by password and Network Policy (allow internal K8s traffic only).
  • TLS: Connect to Redis via TLS 1.2+.
  • Data Sanitization: Do not cache entire user objects if they contain password hashes or secrets.
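
The encryption requirement can be sketched with Node's built-in `crypto` module. The document specifies AES-256 but not a mode; GCM is an assumption made here because it also authenticates the ciphertext. The function names, the `iv | tag | ciphertext` payload layout and the way the 32-byte key is obtained are likewise illustrative assumptions.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Assumption: `key` is a 32-byte secret supplied by the service's secret store.
function encryptForCache(value: unknown, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, fresh for every entry
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(value), 'utf8'),
    cipher.final(),
  ]);
  const tag = cipher.getAuthTag(); // 16-byte authentication tag
  // Assumed payload layout: iv (12 bytes) | tag (16 bytes) | ciphertext
  return Buffer.concat([iv, tag, ciphertext]).toString('base64');
}

function decryptFromCache<T>(payload: string, key: Buffer): T {
  const raw = Buffer.from(payload, 'base64');
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the key is wrong or the payload was modified.
  const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  return JSON.parse(plaintext.toString('utf8')) as T;
}
```

The base64 string returned by `encryptForCache` is what would be written to Redis; a decryption failure can be treated as a cache miss.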

Deployment

graph TD
    subgraph "Kubernetes Pod"
        Service[Microservice Container]
        L1["L1 Cache (RAM)"]
        Service --- L1
    end

    subgraph "Infrastructure"
        RedisMaster[Redis Master]
        RedisSlave1[Redis Slave 1]
        RedisSlave2[Redis Slave 2]
    end

    Service -->|Write| RedisMaster
    Service -->|Read| RedisSlave1
    Service -->|Read| RedisSlave2

    RedisMaster -.->|Replication| RedisSlave1
    RedisMaster -.->|Replication| RedisSlave2

    style Service fill:#e1f5ff
    style L1 fill:#d4edda
    style RedisMaster fill:#fff4e1

Deployment Description:

  • L1: Embedded directly in Microservice process, scales with number of Pods.
  • L2: Redis Cluster (or Sentinel) with at least 3 nodes for High Availability.
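  • Connection Management: Share one ioredis client per process. ioredis sends commands over a single connection (one per node in cluster mode) and reconnects automatically, so a separate connection pool is normally unnecessary.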

Monitoring & Observability

Monitoring Metrics

  • Metrics: Prometheus metrics for hit rate, miss rate, latency, memory usage.
  • Logs: Log cache miss/hit at debug level (sampled), log connection errors at error level.
  • Health Checks: Readiness probe checks connection to Redis.
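
The readiness check can be as small as a `PING` with a deadline. The sketch below depends only on a `ping()` method, which ioredis clients provide; the `Pingable` interface, the `isRedisReady` name and the 500 ms default timeout are assumptions for illustration.

```typescript
// Minimal view of the Redis client used by this sketch (hypothetical interface).
interface Pingable {
  ping(): Promise<string>;
}

// Resolves to true only if Redis answers PONG before the deadline.
async function isRedisReady(client: Pingable, timeoutMs = 500): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('ping timed out')), timeoutMs);
  });
  try {
    const reply = await Promise.race([client.ping(), deadline]);
    return reply === 'PONG';
  } catch {
    // Connection errors and timeouts both mean "not ready".
    return false;
  } finally {
    if (timer !== undefined) {
      clearTimeout(timer);
    }
  }
}
```

A readiness endpoint would return HTTP 200 when this resolves to `true` and 503 otherwise.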

Monitoring Code

Cache Hit Rates:

// Track cache performance
export class CacheMetrics {
  // ... Prometheus Implementation ...
}
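
The Prometheus implementation is elided above. As an illustration of the bookkeeping behind the hit-rate metrics, the following in-memory sketch counts hits per layer and misses, then derives the rates. `CacheHitTracker` is a hypothetical stand-in and not the `CacheMetrics` class itself; in a real service the counters would be exported through the metrics library in use.

```typescript
type Layer = 'l1' | 'l2';

// In-memory stand-in for per-layer hit and miss counters (hypothetical class).
class CacheHitTracker {
  private hits: Record<Layer, number> = { l1: 0, l2: 0 };
  private misses = 0; // lookups that fell through to the database

  recordHit(layer: Layer): void {
    this.hits[layer] += 1;
  }

  recordMiss(): void {
    this.misses += 1;
  }

  private total(): number {
    return this.hits.l1 + this.hits.l2 + this.misses;
  }

  // Fraction of all lookups answered by the given layer.
  hitRate(layer: Layer): number {
    const total = this.total();
    return total === 0 ? 0 : this.hits[layer] / total;
  }

  // Fraction of all lookups answered by either cache layer.
  combinedHitRate(): number {
    const total = this.total();
    return total === 0 ? 0 : (this.hits.l1 + this.hits.l2) / total;
  }
}
```

`recordHit` and `recordMiss` would be called next to the existing `logger.debug` calls in `MultiLayerCache.get`.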

Expected Performance:

| Metric | L1 Cache | L2 Cache | Database |
|---|---|---|---|
| Latency | < 1ms | < 5ms | < 50ms |
| Hit Rate | 40-50% | 80-90% | - |
| Capacity | 10k keys | Large (bounded by Redis memory) | - |

Best Practices

DO:

  • Use cache for frequently accessed data
  • Set appropriate TTLs based on data change frequency
  • Invalidate cache on data updates
  • Use cache key namespacing
  • Monitor cache hit rates
  • Warm cache on startup for critical data

DON'T:

  • Cache data that changes very frequently
  • Set TTL too long (stale data risk)
  • Set TTL too short (negates cache benefit)
  • Cache sensitive data without encryption
  • Ignore cache invalidation on updates
  • Use cache as primary data store