# Caching Architecture

> Multi-layer caching strategy for optimal performance

## Overview Diagram

```mermaid
graph TD
    Request[API Request] --> L1{L1 Cache<br/>Memory}
    L1 -->|Hit| Return1[Return<br/>< 1ms]
    L1 -->|Miss| L2{L2 Cache<br/>Redis}
    L2 -->|Hit| WarmL1[Warm L1]
    WarmL1 --> Return2[Return<br/>< 5ms]
    L2 -->|Miss| DB[(Database)]
    DB --> StoreL2[Store L2 + L1]
    StoreL2 --> Return3[Return<br/>< 50ms]

    style L1 fill:#d4edda
    style L2 fill:#fff4e1
    style DB fill:#f0e1ff
```

## System Context

```mermaid
C4Context
    title Caching System Context

    System(service, "Microservice", "Client service using cache")
    System_Ext(db, "Neon PostgreSQL", "Primary database")

    Boundary(caching, "Caching Layer") {
        System(l1, "L1 Cache", "In-memory NodeCache")
        System(l2, "L2 Cache", "Redis Cluster")
    }

    Rel(service, l1, "Reads/Writes", "In-process")
    Rel(service, l2, "Reads/Writes", "Redis Protocol")
    Rel(l1, l2, "Fills from", "On miss")
    Rel(l2, db, "Cache aside", "On miss")
```

### Context Description

- **Service**: Talks to the L1 cache (in-memory) first for the lowest latency.
- **L1 Cache**: Local to each instance, not shared, expires automatically (short TTL).
- **L2 Cache**: Shared Redis cluster, holds data longer and is visible to all instances.
- **Database**: Source of truth, accessed only on a cache miss.

## Architecture Description

### Multi-Layer Caching

The GoodGo platform uses two cache layers for performance:

**L1 Cache (Memory)**:
- In-memory cache per service instance
- Very fast access (< 1ms)
- Limited capacity (10k keys default)
- Short TTL (60 seconds default, max 5 minutes)
- Not shared across instances

**L2 Cache (Redis)**:
- Shared distributed cache
- Fast access (< 5ms)
- Large capacity
- Longer TTL (configurable, typically 5-15 minutes)
- Shared across all service instances

**Cache Flow**:

```
Request  →    L1     →    L2     →  Database
              ↓           ↓            ↓
            40-50%      80-90%       10-20%
           hit rate    hit rate     miss rate
```

## Cache Implementation

### Multi-Layer Cache Service

```typescript
import NodeCache from 'node-cache';
import Redis from 'ioredis';
// `logger` is the service's shared logger; import it from wherever the project defines it.

export class MultiLayerCache {
  private l1Cache: NodeCache;
  private l2Cache: Redis;

  constructor() {
    // L1: Memory cache
    this.l1Cache = new NodeCache({
      stdTTL: 60,       // 60 seconds default
      maxKeys: 10000,   // Max 10k keys; set() throws once the cache is full
      checkperiod: 120  // Check for expired keys every 2min
    });

    // L2: Redis cache
    this.l2Cache = new Redis({
      host: process.env.REDIS_HOST,
      port: parseInt(process.env.REDIS_PORT ?? '6379', 10), // fall back to the Redis default port
      db: 0
    });
  }

  async get<T>(key: string): Promise<T | null> {
    // Try L1 first. Compare against undefined so cached falsy values (0, '', false) still count as hits.
    const l1Value = this.l1Cache.get<T>(key);
    if (l1Value !== undefined) {
      logger.debug('L1 cache hit', { key });
      return l1Value;
    }

    // Try L2
    const l2Value = await this.l2Cache.get(key);
    if (l2Value !== null) {
      logger.debug('L2 cache hit', { key });
      const parsed = JSON.parse(l2Value) as T;

      // Warm L1 cache (uses the default stdTTL)
      this.l1Cache.set(key, parsed);
      return parsed;
    }

    logger.debug('Cache miss', { key });
    return null;
  }

  async set(key: string, value: unknown, ttl: number = 300): Promise<void> {
    // Store in both L1 and L2
    this.l1Cache.set(key, value, Math.min(ttl, 300)); // L1 max 5min
    await this.l2Cache.setex(key, ttl, JSON.stringify(value));
  }

  async del(key: string): Promise<void> {
    // Clears L1 on this instance only; other instances keep their copy until its TTL expires
    this.l1Cache.del(key);
    await this.l2Cache.del(key);
  }

  async invalidatePattern(pattern: string): Promise<void> {
    // L1: Clear all (simple approach)
    this.l1Cache.flushAll();

    // L2: Delete by pattern. KEYS is O(N) and blocks Redis while it runs;
    // prefer SCAN for large keyspaces.
    const keys = await this.l2Cache.keys(pattern);
    if (keys.length > 0) {
      await this.l2Cache.del(...keys);
    }
  }
}
```

### Cache Key Naming

**Pattern**: `{service}:{entity}:{identifier}:{sub-resource}`

**Examples**:

```typescript
const cacheKeys = {
  user: (userId: string) => `iam:user:${userId}`,
  userPermissions: (userId: string) => `iam:user:${userId}:permissions`,
  userRoles: (userId: string) => `iam:user:${userId}:roles`,
  session: (sessionId: string) => `iam:session:${sessionId}`,
};

// Usage
const user = await cache.get(cacheKeys.user('user_123'));
const permissions = await cache.get(cacheKeys.userPermissions('user_123'));
```

## TTL Strategies

```mermaid
graph LR
    subgraph "TTL Tiers"
        Short[Short TTL<br/>60-300s<br/>Frequently changing]
        Medium[Medium TTL<br/>300-1800s<br/>Moderately changing]
        Long[Long TTL<br/>1800-3600s<br/>Rarely changing]
    end

    Short --> Permissions[User Permissions]
    Short --> Sessions[Session Data]
    Medium --> UserProfiles[User Profiles]
    Medium --> OrgData[Organization Data]
    Long --> Config[Static Config]
    Long --> RefData[Reference Data]

    style Short fill:#f8d7da
    style Medium fill:#fff3cd
    style Long fill:#d4edda
```

**TTL Guidelines**:

| Data Type | TTL | Reason |
|-----------|-----|--------|
| User permissions | 5 min | Security-sensitive |
| Session data | Varies | Based on session length |
| User profiles | 10 min | Moderate update frequency |
| Organization data | 15 min | Infrequent updates |
| Static config | 30-60 min | Very stable |
| Reference data | 1-2 hours | Almost never changes |

## Cache Invalidation

```mermaid
sequenceDiagram
    participant API
    participant Service
    participant Cache
    participant DB

    API->>Service: Update User
    Service->>DB: UPDATE user
    DB-->>Service: Success
    Service->>Cache: Invalidate user:123
    Service->>Cache: Invalidate user:123:permissions
    Service->>Cache: Invalidate user:123:roles
    Cache-->>Service: Cleared
    Service-->>API: Success

    Note over Service,Cache: Next request will fetch fresh data
```

**Invalidation Strategies**:

```typescript
// Methods of a service class (excerpt)

// 1. Single key invalidation
async updateUser(userId: string, data: UpdateUserDto): Promise<User> {
  const user = await userRepository.update(userId, data);

  // Invalidate user cache
  await cache.del(cacheKeys.user(userId));

  return user;
}

// 2. Pattern-based invalidation
async updateUserRole(userId: string, roleId: string): Promise<void> {
  await userRoleRepository.assign(userId, roleId);

  // Invalidate every sub-resource key for this user (permissions, roles).
  // The pattern does not match the bare `iam:user:${userId}` key.
  await cache.invalidatePattern(`iam:user:${userId}:*`);
}

// 3. Time-based invalidation (TTL expiry)
// Automatically handled by cache
```

## Cache Warming

```typescript
// Preload frequently accessed data
async function warmCache(): Promise<void> {
  logger.info('Starting cache warming');

  // Warm user permissions for active users
  const activeUsers = await userRepository.findActive({ limit: 1000 });

  for (const user of activeUsers) {
    const permissions = await rbacService.getUserPermissions(user.id);
    await cache.set(
      cacheKeys.userPermissions(user.id),
      permissions,
      300 // 5 minutes
    );
  }

  logger.info('Cache warming completed', { count: activeUsers.length });
}

// Run on service startup
warmCache().catch(err => logger.error('Cache warming failed', { err }));
```

## Design Decisions

### Decision 1: Multi-layer Caching (L1 + L2)

**Context**: Need to reduce load on Redis and achieve ultra-low latency for hot data.

**Decision**: Use a combination of L1 (NodeCache) and L2 (Redis).

**Consequences**:
- ✅ Latency < 1ms for 40-50% of requests.
- ✅ Reduced network traffic to Redis.
- ❌ Synchronization complexity (an instance's L1 can serve stale data until its TTL expires).

## Performance Characteristics

### Performance Targets

| Metric | Target | Notes |
|--------|--------|-------|
| **L1 Hit Latency** | < 0.5ms | In-memory lookup |
| **L2 Hit Latency** | < 5ms | Network RTT + Redis processing |
| **Combined Hit Rate** | > 90% | L1 + L2 combined |
| **L1 Capacity** | 10k items | Per instance limit to protect heap |
| **Cache Warmup Time** | < 30s | At service startup |

## Security Considerations

### Cache Security

- **Encryption**: Sensitive data (PII) MUST be encrypted (AES-256) before it is stored in L2 Redis. L1 can hold plaintext because it lives in process memory and is exposed only if that memory is dumped.
- **Isolation**: The Redis instance is protected by a password and a Network Policy (internal K8s traffic only).
- **TLS**: Connect to Redis via TLS 1.2+.
- **Data Sanitization**: Do not cache entire user objects if they contain password hashes or secrets.
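
The encryption and sanitization rules above can be sketched with Node's built-in `crypto` module. This is a minimal illustration, not the platform's implementation: the helper names `encryptForCache`, `decryptFromCache`, and `stripSecrets` are hypothetical, AES-256-GCM is assumed as the AES-256 mode, and key management (where the 32-byte key comes from) is out of scope.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical helpers; they are not part of MultiLayerCache.
const IV_LENGTH = 12;  // 96-bit nonce, the recommended size for GCM
const TAG_LENGTH = 16; // GCM authentication tag size in bytes

/** Encrypts a JSON-serializable value. Output layout (base64): iv | tag | ciphertext. */
export function encryptForCache(value: unknown, key: Buffer): string {
  const iv = randomBytes(IV_LENGTH); // fresh nonce for every write
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(value), "utf8"),
    cipher.final(),
  ]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString("base64");
}

/** Reverses encryptForCache; throws if the payload was tampered with or the key is wrong. */
export function decryptFromCache<T>(payload: string, key: Buffer): T {
  const raw = Buffer.from(payload, "base64");
  const iv = raw.subarray(0, IV_LENGTH);
  const tag = raw.subarray(IV_LENGTH, IV_LENGTH + TAG_LENGTH);
  const ciphertext = raw.subarray(IV_LENGTH + TAG_LENGTH);

  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  return JSON.parse(plaintext.toString("utf8")) as T;
}

/** Data sanitization: returns a copy of the object without the listed secret fields. */
export function stripSecrets(
  obj: Record<string, unknown>,
  secretFields: readonly string[],
): Record<string, unknown> {
  const copy = { ...obj };
  for (const field of secretFields) {
    delete copy[field];
  }
  return copy;
}
```

With helpers like these, a caller would pass the encrypted string to the cache's `set` for L2-bound PII and run `stripSecrets(user, ['passwordHash'])` before caching a user object. Whether L1 should hold the decrypted value or the ciphertext is a design choice the section leaves open.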
## Deployment

```mermaid
graph TD
    subgraph "Kubernetes Pod"
        Service[Microservice Container]
        L1["L1 Cache (RAM)"]
        Service --- L1
    end

    subgraph "Infrastructure"
        RedisMaster[Redis Master]
        RedisSlave1[Redis Slave 1]
        RedisSlave2[Redis Slave 2]
    end

    Service -->|Write| RedisMaster
    Service -->|Read| RedisSlave1
    Service -->|Read| RedisSlave2
    RedisMaster -.->|Replication| RedisSlave1
    RedisMaster -.->|Replication| RedisSlave2

    style Service fill:#e1f5ff
    style L1 fill:#d4edda
    style RedisMaster fill:#fff4e1
```

**Deployment Description**:
- **L1**: Embedded directly in the microservice process, scales with the number of Pods.
- **L2**: Redis Cluster (or Sentinel) with at least 3 nodes for High Availability.
- **Connection Pooling**: Use ioredis with connection pooling for efficient connection management.

## Monitoring & Observability

### Monitoring Metrics

- **Metrics**: Prometheus metrics for hit rate, miss rate, latency, memory usage.
- **Logs**: Log cache hits and misses at debug level (sampled); log connection errors at error level.
- **Health Checks**: The readiness probe checks the connection to Redis.

### Monitoring Code

**Cache Hit Rates**:

```typescript
// Track cache performance
export class CacheMetrics {
  // ... Prometheus Implementation ...
}
```

**Expected Performance**:

| Metric | L1 Cache | L2 Cache | Database |
|--------|----------|----------|----------|
| Latency | < 1ms | < 5ms | < 50ms |
| Hit Rate | 40-50% | 80-90% | - |
| Capacity | 10k keys | Large (bounded by Redis memory) | - |

## Best Practices

**DO**:
- ✅ Use cache for frequently accessed data
- ✅ Set appropriate TTLs based on data change frequency
- ✅ Invalidate cache on data updates
- ✅ Use cache key namespacing
- ✅ Monitor cache hit rates
- ✅ Warm cache on startup for critical data

**DON'T**:
- ❌ Cache data that changes very frequently
- ❌ Set TTL too long (stale data risk)
- ❌ Set TTL too short (negates cache benefit)
- ❌ Cache sensitive data without encryption
- ❌ Ignore cache invalidation on updates
- ❌ Use cache as primary data store
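
The "monitor cache hit rates" practice, and the combined hit-rate target of over 90%, can be checked with a small in-process counter. The sketch below is dependency-free and is not the Prometheus-backed `CacheMetrics` class; `HitRateTracker` and its method names are hypothetical, and it assumes the cache's `get` path calls `recordHit` or `recordMiss` once per lookup.

```typescript
type CacheLayer = "l1" | "l2";

export interface HitRateSnapshot {
  total: number;           // lookups recorded so far
  l1HitRate: number;       // share of all lookups served by L1
  l2HitRate: number;       // share of all lookups served by L2
  combinedHitRate: number; // share served by either cache layer
  missRate: number;        // share that fell through to the database
}

// Hypothetical in-memory tracker; counters reset when the process restarts.
export class HitRateTracker {
  private hits: Record<CacheLayer, number> = { l1: 0, l2: 0 };
  private misses = 0;

  recordHit(layer: CacheLayer): void {
    this.hits[layer] += 1;
  }

  recordMiss(): void {
    this.misses += 1;
  }

  snapshot(): HitRateSnapshot {
    const total = this.hits.l1 + this.hits.l2 + this.misses;
    // Report 0 rather than NaN before any lookup has been recorded.
    const ratio = (count: number): number => (total === 0 ? 0 : count / total);
    return {
      total,
      l1HitRate: ratio(this.hits.l1),
      l2HitRate: ratio(this.hits.l2),
      combinedHitRate: ratio(this.hits.l1 + this.hits.l2),
      missRate: ratio(this.misses),
    };
  }
}
```

All rates here use total lookups as the denominator, so the L2 figure is the share of every request that L2 served, not the share of L1 misses. A dashboard that reports L2 hit rate relative to L1 misses would need a different denominator.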