Caching Architecture
Multi-layer caching strategy for optimal performance
Overview Diagram
graph TD
Request[API Request] --> L1{L1 Cache<br/>Memory}
L1 -->|Hit| Return1[Return<br/>< 1ms]
L1 -->|Miss| L2{L2 Cache<br/>Redis}
L2 -->|Hit| WarmL1[Warm L1]
WarmL1 --> Return2[Return<br/>< 5ms]
L2 -->|Miss| DB[(Database)]
DB --> StoreL2[Store L2 + L1]
StoreL2 --> Return3[Return<br/>< 50ms]
style L1 fill:#d4edda
style L2 fill:#fff4e1
style DB fill:#f0e1ff
System Context
C4Context
title Caching System Context
System(service, "Microservice", "Client service using cache")
System_Ext(db, "Neon PostgreSQL", "Primary database")
Boundary(caching, "Caching Layer") {
System(l1, "L1 Cache", "In-memory NodeCache")
System(l2, "L2 Cache", "Redis Cluster")
}
Rel(service, l1, "Reads/Writes", "In-process")
Rel(service, l2, "Reads/Writes", "Redis Protocol")
Rel(l1, l2, "Fills from", "On miss")
Rel(l2, db, "Cache aside", "On miss")
Context Description
- Service: Communicates directly with L1 Cache (in-memory) for lowest latency.
- L1 Cache: Local cache, not shared, automatic expiration (short TTL).
- L2 Cache: Shared Redis cluster, holds data longer and syncs across instances.
- Database: Source of truth, accessed only on cache miss.
Architecture Description
Multi-Layer Caching
The GoodGo platform uses two cache layers for performance:
L1 Cache (Memory):
- In-memory cache per service instance
- Very fast access (< 1ms)
- Limited capacity (10k keys default)
- Short TTL (60 seconds default, max 5 minutes)
- Not shared across instances
L2 Cache (Redis):
- Shared distributed cache
- Fast access (< 5ms)
- Large capacity
- Longer TTL (configurable, typically 5-15 minutes)
- Shared across all service instances
Cache Flow:
Request → L1 → L2 → Database
- L1: 40-50% hit rate
- L2: 80-90% hit rate
- Database: reached only on a cache miss (10-20% miss rate)
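As a rough worked example of what these rates imply, the sketch below estimates the average lookup latency. It assumes the L2 figure (80-90%) is cumulative, meaning the share of requests served by L1 or L2 together, which is consistent with the 10-20% miss rate. The function name `expectedLatencyMs` and the use of the upper-bound latencies (1 ms, 5 ms, 50 ms) are illustrative choices, not measurements.

```typescript
// Hypothetical helper: estimates mean lookup latency from hit rates.
// Assumption: cumulativeHitRate counts requests served by L1 or L2 together.
interface LayerLatenciesMs {
  l1: number;
  l2: number;
  db: number;
}

function expectedLatencyMs(
  l1HitRate: number,
  cumulativeHitRate: number,
  latency: LayerLatenciesMs,
): number {
  if (l1HitRate < 0 || cumulativeHitRate > 1 || l1HitRate > cumulativeHitRate) {
    throw new RangeError('expected 0 <= l1HitRate <= cumulativeHitRate <= 1');
  }
  const l2Share = cumulativeHitRate - l1HitRate; // served by Redis
  const dbShare = 1 - cumulativeHitRate; // full miss, served by the database
  return l1HitRate * latency.l1 + l2Share * latency.l2 + dbShare * latency.db;
}

// Midpoints of the ranges above with upper-bound latencies:
// 0.45 * 1 + 0.40 * 5 + 0.15 * 50 = 9.95 ms on average
const estimate = expectedLatencyMs(0.45, 0.85, { l1: 1, l2: 5, db: 50 });
console.log(estimate.toFixed(2));
```

Even with most requests served from cache, the database path dominates the average, which is why the miss rate is the figure to watch.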
Cache Implementation
Multi-Layer Cache Service
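import NodeCache from 'node-cache';
import Redis from 'ioredis';
// `logger` is the service's own structured logger; import it from wherever the project defines it.
export class MultiLayerCache {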
private l1Cache: NodeCache;
private l2Cache: Redis;
constructor() {
// L1: Memory cache
this.l1Cache = new NodeCache({
stdTTL: 60, // 60 seconds default
maxKeys: 10000, // Max 10k keys
checkperiod: 120 // Check for expired keys every 2min
});
// L2: Redis cache
this.l2Cache = new Redis({
host: process.env.REDIS_HOST,
port: parseInt(process.env.REDIS_PORT ?? '6379', 10), // falls back to the default Redis port
db: 0
});
}
async get<T>(key: string): Promise<T | null> {
// Try L1 first
const l1Value = this.l1Cache.get<T>(key);
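if (l1Value !== undefined) { // node-cache returns undefined on a miss; falsy values such as 0 or '' are valid hits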
logger.debug('L1 cache hit', { key });
return l1Value;
}
// Try L2
const l2Value = await this.l2Cache.get(key);
if (l2Value !== null) {
logger.debug('L2 cache hit', { key });
const parsed = JSON.parse(l2Value) as T;
// Warm L1 cache
this.l1Cache.set(key, parsed);
return parsed;
}
logger.debug('Cache miss', { key });
return null;
}
async set(key: string, value: any, ttl: number = 300): Promise<void> {
// Store in both L1 and L2
this.l1Cache.set(key, value, Math.min(ttl, 300)); // L1 max 5min
await this.l2Cache.setex(key, ttl, JSON.stringify(value));
}
async del(key: string): Promise<void> {
this.l1Cache.del(key);
await this.l2Cache.del(key);
}
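async invalidatePattern(pattern: string): Promise<void> {
// L1: Clear all (simple approach)
this.l1Cache.flushAll();
// L2: Delete by pattern. SCAN iterates incrementally; KEYS would block Redis on a large keyspace.
let cursor = '0';
do {
const [next, keys] = await this.l2Cache.scan(cursor, 'MATCH', pattern, 'COUNT', 100);
if (keys.length > 0) {
await this.l2Cache.del(...keys);
}
cursor = next;
} while (cursor !== '0');
}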
}
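The class returns `null` on a full miss and leaves the database read to the caller. A minimal sketch of that cache-aside step follows, assuming a hypothetical `getOrLoad` helper and an in-memory `MapCache` stand-in so the example runs without Redis; neither is part of the original service.

```typescript
// Minimal shape of the cache used here; MultiLayerCache above satisfies it.
interface ReadWriteCache {
  get<T>(key: string): Promise<T | null>;
  set(key: string, value: unknown, ttl?: number): Promise<void>;
}

// Hypothetical helper (not part of the original class): cache-aside read.
async function getOrLoad<T>(
  cache: ReadWriteCache,
  key: string,
  loader: () => Promise<T>,
  ttlSeconds = 300,
): Promise<T> {
  const cached = await cache.get<T>(key);
  if (cached !== null) {
    return cached; // served from L1 or L2
  }
  const fresh = await loader(); // database (source of truth)
  await cache.set(key, fresh, ttlSeconds); // store in L2 + L1
  return fresh;
}

// In-memory stand-in so the sketch runs without Redis; it ignores TTLs.
class MapCache implements ReadWriteCache {
  private store = new Map<string, unknown>();
  async get<T>(key: string): Promise<T | null> {
    return this.store.has(key) ? (this.store.get(key) as T) : null;
  }
  async set(key: string, value: unknown): Promise<void> {
    this.store.set(key, value);
  }
}
```

With this shape, a call such as `getOrLoad(cache, 'iam:user:user_123', () => loadUserFromDb('user_123'))` would hit the database only on the first request, where `loadUserFromDb` stands for whatever repository call the service uses.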
Cache Key Naming
Pattern: {service}:{entity}:{identifier}:{sub-resource} (the sub-resource segment is optional)
Examples:
const cacheKeys = {
user: (userId: string) => `iam:user:${userId}`,
userPermissions: (userId: string) => `iam:user:${userId}:permissions`,
userRoles: (userId: string) => `iam:user:${userId}:roles`,
session: (sessionId: string) => `iam:session:${sessionId}`,
};
// Usage
const user = await cache.get(cacheKeys.user('user_123'));
const permissions = await cache.get(cacheKeys.userPermissions('user_123'));
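The naming pattern can also be enforced in one place. The sketch below assumes a hypothetical `buildCacheKey` helper that joins the segments and rejects characters that would clash with the `:` delimiter or with Redis glob patterns; the validation rule is an illustrative choice, not an existing project convention.

```typescript
// Hypothetical helper: builds keys in the {service}:{entity}:{identifier}:{sub-resource} format.
function buildCacheKey(
  service: string,
  entity: string,
  identifier: string,
  subResource?: string,
): string {
  const segments = [service, entity, identifier];
  if (subResource !== undefined) segments.push(subResource);
  for (const segment of segments) {
    // Reject the delimiter, glob characters, and whitespace so pattern invalidation stays predictable.
    if (segment.length === 0 || /[:*?\[\]\s]/.test(segment)) {
      throw new Error(`Invalid cache key segment: "${segment}"`);
    }
  }
  return segments.join(':');
}

console.log(buildCacheKey('iam', 'user', 'user_123')); // iam:user:user_123
console.log(buildCacheKey('iam', 'user', 'user_123', 'permissions')); // iam:user:user_123:permissions
```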
TTL Strategies
graph LR
subgraph "TTL Tiers"
Short[Short TTL<br/>60-300s<br/>Frequently changing]
Medium[Medium TTL<br/>300-1800s<br/>Moderately changing]
Long[Long TTL<br/>1800-3600s<br/>Rarely changing]
end
Short --> Permissions[User Permissions]
Short --> Sessions[Session Data]
Medium --> UserProfiles[User Profiles]
Medium --> OrgData[Organization Data]
Long --> Config[Static Config]
Long --> RefData[Reference Data]
style Short fill:#f8d7da
style Medium fill:#fff3cd
style Long fill:#d4edda
TTL Guidelines:
| Data Type | TTL | Reason |
|---|---|---|
| User permissions | 5 min | Security-sensitive |
| Session data | Varies | Based on session length |
| User profiles | 10 min | Moderate update frequency |
| Organization data | 15 min | Infrequent updates |
| Static config | 30-60 min | Very stable |
| Reference data | 1-2 hours | Almost never changes |
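One way to keep these guidelines out of scattered magic numbers is a single TTL map. The sketch below is an assumption about how that could look: the names `TTL_SECONDS` and `ttlFor` are hypothetical, ranges from the table are reduced to their lower bound, and session data is omitted because its TTL varies.

```typescript
// TTLs in seconds, taken from the guideline table. Where the table gives a range,
// the lower bound is used here as a conservative (assumed) default.
const TTL_SECONDS = {
  userPermissions: 5 * 60,
  userProfile: 10 * 60,
  organization: 15 * 60,
  staticConfig: 30 * 60,
  referenceData: 60 * 60,
} as const;

type CachedDataType = keyof typeof TTL_SECONDS;

const L1_MAX_TTL_SECONDS = 300; // the L1 cap applied by MultiLayerCache.set

// Hypothetical helper: the TTL each layer would end up with for a data type.
function ttlFor(dataType: CachedDataType): { l1: number; l2: number } {
  const l2 = TTL_SECONDS[dataType];
  return { l1: Math.min(l2, L1_MAX_TTL_SECONDS), l2 };
}

console.log(ttlFor('userProfile')); // { l1: 300, l2: 600 }
```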
Cache Invalidation
sequenceDiagram
participant API
participant Service
participant Cache
participant DB
API->>Service: Update User
Service->>DB: UPDATE user
DB-->>Service: Success
Service->>Cache: Invalidate iam:user:123
Service->>Cache: Invalidate iam:user:123:permissions
Service->>Cache: Invalidate iam:user:123:roles
Cache-->>Service: Cleared
Service-->>API: Success
Note over Service,Cache: Next request will fetch fresh data
Invalidation Strategies:
// 1. Single key invalidation
async updateUser(userId: string, data: UpdateUserDto): Promise<User> {
const user = await userRepository.update(userId, data);
// Invalidate user cache
await cache.del(cacheKeys.user(userId));
return user;
}
// 2. Pattern-based invalidation
async updateUserRole(userId: string, roleId: string): Promise<void> {
await userRoleRepository.assign(userId, roleId);
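// Invalidate the user's sub-resource keys (permissions, roles).
// The base key iam:user:{id} has no trailing segment, so this pattern does not match it.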
await cache.invalidatePattern(`iam:user:${userId}:*`);
}
// 3. Time-based invalidation (TTL expiry)
// Automatically handled by cache
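A glob such as `iam:user:123:*` matches only keys with a further segment, so clearing everything for a user takes a direct delete plus a pattern delete. The sketch below illustrates this with a hypothetical `invalidateUser` helper and an in-memory stand-in that supports only the `*` wildcard; real Redis glob matching accepts more syntax.

```typescript
interface InvalidatingCache {
  del(key: string): Promise<void>;
  invalidatePattern(pattern: string): Promise<void>;
}

// Hypothetical helper: removes a user's base entry and all of its sub-resources.
async function invalidateUser(cache: InvalidatingCache, userId: string): Promise<void> {
  await cache.del(`iam:user:${userId}`); // base key
  await cache.invalidatePattern(`iam:user:${userId}:*`); // permissions, roles, ...
}

// In-memory stand-in; supports only the `*` wildcard.
class MapCache implements InvalidatingCache {
  readonly store = new Map<string, unknown>();

  async del(key: string): Promise<void> {
    this.store.delete(key);
  }

  async invalidatePattern(pattern: string): Promise<void> {
    // Escape regex metacharacters in each literal part, then join the parts with `.*`.
    const escaped = pattern
      .split('*')
      .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
      .join('.*');
    const matcher = new RegExp(`^${escaped}$`);
    for (const key of Array.from(this.store.keys())) {
      if (matcher.test(key)) this.store.delete(key);
    }
  }
}
```

Because the pattern includes the trailing `:`, keys of a user whose ID merely starts with the same characters (for example `iam:user:1234:roles`) are left untouched.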
Cache Warming
// Preload frequently accessed data
async function warmCache(): Promise<void> {
logger.info('Starting cache warming');
// Warm user permissions for active users
const activeUsers = await userRepository.findActive({ limit: 1000 });
for (const user of activeUsers) {
const permissions = await rbacService.getUserPermissions(user.id);
await cache.set(
cacheKeys.userPermissions(user.id),
permissions,
300 // 5 minutes
);
}
logger.info('Cache warming completed', { count: activeUsers.length });
}
// Run on service startup
warmCache().catch(err => logger.error('Cache warming failed', { err }));
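The loop above warms one user at a time, which may be slow for 1000 users. A possible variation, sketched under the assumption that bounded parallelism is acceptable for the database and Redis, processes users in fixed-size batches. `warmInBatches` and its batch size are hypothetical, not part of the existing service.

```typescript
// Hypothetical helper: runs `task` over `items`, at most `batchSize` at a time.
// Returns the number of items whose task failed.
async function warmInBatches<T>(
  items: readonly T[],
  batchSize: number,
  task: (item: T) => Promise<void>,
): Promise<number> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  let failures = 0;
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    await Promise.all(
      batch.map(async (item) => {
        try {
          await task(item);
        } catch (err) {
          failures += 1; // one failed item should not abort the whole warm-up
        }
      }),
    );
  }
  return failures;
}
```

Inside `warmCache`, the `for` loop could then become a call such as `warmInBatches(activeUsers, 20, warmOneUser)`, where `warmOneUser` wraps the existing permission lookup and `cache.set` call.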
Design Decisions
Decision 1: Multi-layer Caching (L1 + L2)
Context: Need to reduce load on Redis and achieve sub-millisecond latency for hot data.
Decision: Use a combination of L1 (NodeCache) and L2 (Redis).
Consequences:
- ✅ Latency < 1ms for 40-50% of requests.
- ✅ Reduced network traffic to Redis.
- ❌ Synchronization complexity: an instance's L1 can serve stale data until the entry expires or is invalidated on that instance.
Performance Characteristics
Performance Targets
| Metric | Target | Notes |
|---|---|---|
| L1 Hit Latency | < 0.5ms | In-memory lookup |
| L2 Hit Latency | < 5ms | Network RTT + Redis processing |
| Combined Hit Rate | > 90% | L1 + L2 combined |
| L1 Capacity | 10k items | Per instance limit to protect heap |
| Cache Warmup Time | < 30s | At service startup |
Security Considerations
Cache Security
- Encryption: Sensitive data (PII) MUST be encrypted (AES-256) before it is stored in L2 Redis. L1 may hold plaintext because it lives in process memory, although a memory dump would expose it.
- Isolation: The Redis instance is protected by a password and a Kubernetes Network Policy that allows internal cluster traffic only.
- TLS: Connect to Redis via TLS 1.2+.
- Data Sanitization: Do not cache entire user objects if they contain password hashes or secrets.
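The encryption requirement could be met along the following lines. This is a minimal sketch, not the platform's implementation: it assumes AES-256 in GCM mode (the text above specifies only AES-256), a 32-byte key obtained from the service's secret management, and hypothetical helper names `encryptForCache` and `decryptFromCache`. It uses only Node's built-in `node:crypto` module.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Assumption: `key` is a 32-byte secret supplied by the service's secret management.
function encryptForCache(value: unknown, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, fresh for every entry
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(value), 'utf8'),
    cipher.final(),
  ]);
  const tag = cipher.getAuthTag(); // 16-byte authentication tag
  // Layout of the stored payload: iv (12 bytes) | tag (16 bytes) | ciphertext
  return Buffer.concat([iv, tag, ciphertext]).toString('base64');
}

function decryptFromCache<T>(payload: string, key: Buffer): T {
  const raw = Buffer.from(payload, 'base64');
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag); // final() throws if the payload was tampered with
  const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  return JSON.parse(plaintext.toString('utf8')) as T;
}
```

The returned base64 string is what would be written to Redis in place of the plain JSON. GCM is chosen here because it also detects modification of the stored value.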
Deployment
graph TD
subgraph "Kubernetes Pod"
Service[Microservice Container]
L1["L1 Cache (RAM)"]
Service --- L1
end
subgraph "Infrastructure"
RedisMaster[Redis Master]
RedisSlave1[Redis Slave 1]
RedisSlave2[Redis Slave 2]
end
Service -->|Write| RedisMaster
Service -->|Read| RedisSlave1
Service -->|Read| RedisSlave2
RedisMaster -.->|Replication| RedisSlave1
RedisMaster -.->|Replication| RedisSlave2
style Service fill:#e1f5ff
style L1 fill:#d4edda
style RedisMaster fill:#fff4e1
Deployment Description:
- L1: Embedded directly in Microservice process, scales with number of Pods.
- L2: Redis Cluster (or Sentinel) with at least 3 nodes for High Availability.
- Connection Pooling: ioredis multiplexes commands over one connection per client rather than maintaining a pool, so create the client once per process and reuse it for all requests.
Monitoring & Observability
Monitoring Metrics
- Metrics: Prometheus metrics for hit rate, miss rate, latency, memory usage.
- Logs: Log cache miss/hit at debug level (sampled), log connection errors at error level.
- Health Checks: Readiness probe checks connection to Redis.
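The readiness check could be as small as a bounded `PING`. The sketch below assumes a hypothetical `isRedisReady` helper and a one-second default timeout. It depends only on a `ping()` method, which ioredis clients provide, so a stub can stand in for Redis in tests.

```typescript
// Only the part of the Redis client this check needs.
interface Pingable {
  ping(): Promise<string>;
}

// Hypothetical readiness check: ready only if Redis answers PONG within timeoutMs.
async function isRedisReady(client: Pingable, timeoutMs = 1000): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('Redis ping timed out')), timeoutMs);
  });
  try {
    const reply = await Promise.race([client.ping(), timeout]);
    return reply === 'PONG';
  } catch (err) {
    return false; // connection error or timeout: report not ready
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

A readiness endpoint would return a success status when this resolves to `true` and a failure status otherwise, so Kubernetes stops routing traffic to a pod that cannot reach Redis.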
Monitoring Code
Cache Hit Rates:
// Track cache performance
export class CacheMetrics {
// ... Prometheus Implementation ...
}
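The Prometheus implementation is elided above. To show what the hit-rate figures mean in terms of raw counts, here is a plain in-memory sketch; `HitRateTracker` is a hypothetical stand-in and not the `CacheMetrics` class itself.

```typescript
type CacheOutcome = 'l1_hit' | 'l2_hit' | 'miss';

// In-memory stand-in, not the Prometheus implementation elided above.
class HitRateTracker {
  private counts: Record<CacheOutcome, number> = { l1_hit: 0, l2_hit: 0, miss: 0 };

  record(outcome: CacheOutcome): void {
    this.counts[outcome] += 1;
  }

  snapshot(): { total: number; l1HitRate: number; combinedHitRate: number } {
    const { l1_hit, l2_hit, miss } = this.counts;
    const total = l1_hit + l2_hit + miss;
    if (total === 0) return { total, l1HitRate: 0, combinedHitRate: 0 };
    return {
      total,
      l1HitRate: l1_hit / total,
      combinedHitRate: (l1_hit + l2_hit) / total, // share of requests that avoided the database
    };
  }
}
```

Calling `record` from the three branches of `MultiLayerCache.get` (L1 hit, L2 hit, miss) would feed it; the same three counters map naturally onto labelled Prometheus counters.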
Expected Performance:
| Metric | L1 Cache | L2 Cache | Database |
|---|---|---|---|
| Latency | < 1ms | < 5ms | < 50ms |
| Hit Rate | 40-50% | 80-90% | - |
| Capacity | 10k keys | Bounded by Redis memory | - |
Best Practices
DO:
- ✅ Use cache for frequently accessed data
- ✅ Set appropriate TTLs based on data change frequency
- ✅ Invalidate cache on data updates
- ✅ Use cache key namespacing
- ✅ Monitor cache hit rates
- ✅ Warm cache on startup for critical data
DON'T:
- ❌ Cache data that changes very frequently
- ❌ Set TTL too long (stale data risk)
- ❌ Set TTL too short (negates cache benefit)
- ❌ Cache sensitive data without encryption
- ❌ Ignore cache invalidation on updates
- ❌ Use cache as primary data store