Data Consistency Patterns
Patterns for maintaining data consistency in distributed microservices architecture
Overview Diagram
graph TD
subgraph "Consistency Patterns"
Saga[Saga Pattern<br/>Distributed Transactions]
Outbox[Outbox Pattern<br/>Reliable Events]
Idempotency[Idempotency<br/>Retry Safety]
OptimisticLock[Optimistic Locking<br/>Concurrent Updates]
CQRS[CQRS<br/>Read/Write Separation]
end
Service1[Service A] --> Saga
Service2[Service B] --> Outbox
Service3[Service C] --> Idempotency
Saga --> EventualConsistency[Eventual Consistency]
Outbox --> EventualConsistency
Idempotency --> EventualConsistency
OptimisticLock --> StrongConsistency[Strong Consistency]
CQRS --> EventualConsistency
style Saga fill:#e1f5ff
style Outbox fill:#fff4e1
style Idempotency fill:#f0e1ff
style CQRS fill:#d4edda
Architecture Description
Architecture Overview
The GoodGo platform combines several consistency patterns to keep distributed data consistent:
Core Challenges:
- No distributed transactions (2PC too slow)
- Services own their data (database per service)
- Network failures can cause partial completion
- Need to maintain data integrity across services
Pattern Selection:
- Saga: For multi-service workflows
- Outbox: For guaranteed event publishing
- Idempotency: For safe retries
- Optimistic Locking: For concurrent updates
- CQRS: For read/write optimization
System Context
C4Context
title System Context for Data Consistency in GoodGo Platform
Person(user, "User", "End user performing actions")
System_Boundary(goodgo, "GoodGo Microservices") {
System(order_service, "Order Service", "Manages orders with Saga")
System(payment_service, "Payment Service", "Processes payments")
System(inventory_service, "Inventory Service", "Manages stock")
System(saga_orchestrator, "Saga Orchestrator", "Coordinates distributed transactions")
System(outbox_processor, "Outbox Processor", "Publishes events reliably")
}
System_Ext(db_order, "Order DB", "PostgreSQL with Outbox table")
System_Ext(db_payment, "Payment DB", "PostgreSQL with version field")
System_Ext(db_inventory, "Inventory DB", "PostgreSQL")
System_Ext(kafka, "Event Bus", "Kafka - Event streaming")
System_Ext(redis, "Cache", "Redis - Idempotency keys")
Rel(user, order_service, "Places order", "HTTPS")
Rel(order_service, saga_orchestrator, "Starts saga", "Internal")
Rel(saga_orchestrator, payment_service, "Process payment", "HTTP")
Rel(saga_orchestrator, inventory_service, "Reserve stock", "HTTP")
Rel(order_service, db_order, "Writes + Outbox", "SQL")
Rel(payment_service, db_payment, "Updates with version", "SQL")
Rel(inventory_service, db_inventory, "Reads/Writes", "SQL")
Rel(outbox_processor, db_order, "Polls outbox", "SQL")
Rel(outbox_processor, kafka, "Publishes events", "Kafka Protocol")
Rel(order_service, redis, "Checks idempotency key", "Redis Protocol")
UpdateRelStyle(saga_orchestrator, payment_service, $lineColor="red", $textColor="red")
UpdateRelStyle(saga_orchestrator, inventory_service, $lineColor="red", $textColor="red")
The GoodGo platform uses a database-per-service architecture where each service owns its data. Data consistency across services is achieved through patterns like Saga (for coordinated workflows), Outbox (for reliable event publishing), Idempotency (for safe retries), and Optimistic Locking (for concurrent updates). These patterns enable eventual consistency while maintaining data integrity.
Saga Pattern
sequenceDiagram
participant Orchestrator
participant OrderService
participant PaymentService
participant InventoryService
Orchestrator->>OrderService: 1. Create Order
OrderService-->>Orchestrator: Order Created
Orchestrator->>PaymentService: 2. Process Payment
PaymentService-->>Orchestrator: Payment Success
Orchestrator->>InventoryService: 3. Reserve Inventory
alt Inventory Reserved
InventoryService-->>Orchestrator: Success
Orchestrator->>Orchestrator: Complete Saga ✓
else Inventory Failed
InventoryService-->>Orchestrator: Failed ✗
Orchestrator->>PaymentService: Compensate: Refund
PaymentService-->>Orchestrator: Refunded
Orchestrator->>OrderService: Compensate: Cancel Order
OrderService-->>Orchestrator: Cancelled
end
Description: A saga manages a distributed transaction as a sequence of local transactions, each paired with a compensating action that undoes it if a later step fails.
Implementation:
// Saga orchestrator
class OrderSaga {
async execute(orderData: OrderData): Promise<void> {
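// Explicit type so the null placeholders can later hold the IDs returned by each step
const sagaContext: SagaContext = {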
orderId: null,
paymentId: null,
inventoryId: null
};
try {
// Step 1: Create order
sagaContext.orderId = await orderService.create(orderData);
// Step 2: Process payment
sagaContext.paymentId = await paymentService.process(orderData.payment);
// Step 3: Reserve inventory
sagaContext.inventoryId = await inventoryService.reserve(orderData.items);
// All success - commit
await this.completeSaga(sagaContext);
} catch (error) {
// Compensate in reverse order
await this.compensate(sagaContext, error);
throw error;
}
}
private async compensate(context: SagaContext, error: Error): Promise<void> {
if (context.inventoryId) {
await inventoryService.release(context.inventoryId);
}
if (context.paymentId) {
await paymentService.refund(context.paymentId);
}
if (context.orderId) {
await orderService.cancel(context.orderId);
}
}
}
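The orchestrator above hard-codes its three steps, and a compensation that throws would stop the remaining compensations from running. A more general shape can be sketched as follows. This is a minimal illustration, not the GoodGo implementation: `SagaStep`, `SagaResult`, and `runSaga` are hypothetical names, and the runner returns a result object instead of rethrowing. Each step pairs an action with its compensation, completed steps are undone in reverse order, and compensation failures are collected instead of aborting the rollback.

```typescript
// Hypothetical types for illustration; not part of the GoodGo codebase.
interface SagaStep<C> {
  name: string;
  execute(ctx: C): Promise<void>;
  compensate(ctx: C): Promise<void>;
}

interface SagaResult {
  ok: boolean;
  compensated: string[];          // steps rolled back, in the order they were undone
  compensationFailures: string[]; // steps whose compensation threw (need manual follow-up)
}

async function runSaga<C>(steps: SagaStep<C>[], ctx: C): Promise<SagaResult> {
  const completed: SagaStep<C>[] = [];
  try {
    for (const step of steps) {
      await step.execute(ctx);
      completed.push(step); // only fully completed steps are eligible for compensation
    }
    return { ok: true, compensated: [], compensationFailures: [] };
  } catch {
    const compensated: string[] = [];
    const compensationFailures: string[] = [];
    // Undo in reverse order; keep going even if one compensation fails.
    for (const step of completed.reverse()) {
      try {
        await step.compensate(ctx);
        compensated.push(step.name);
      } catch {
        compensationFailures.push(step.name);
      }
    }
    return { ok: false, compensated, compensationFailures };
  }
}
```

With order, payment, and inventory steps where the inventory step fails, the runner compensates payment first and then the order, matching the sequence diagram.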
Outbox Pattern
sequenceDiagram
participant Service
participant DB as Database
participant OutboxTable as Outbox Table
participant Processor as Outbox Processor
participant Kafka
Service->>DB: Begin Transaction
Service->>DB: Update Business Data
Service->>OutboxTable: Insert Event
Service->>DB: Commit Transaction
loop Every 5 seconds
Processor->>OutboxTable: SELECT unpublished events
OutboxTable-->>Processor: Events
Processor->>Kafka: Publish Events
Kafka-->>Processor: Ack
Processor->>OutboxTable: Mark as published
end
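Description: Guarantees that events are eventually published by writing them to an outbox table in the same database transaction as the business data. Because the processor publishes first and marks the row afterwards, delivery is at-least-once, so consumers must tolerate duplicates.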
Implementation:
// Store event in outbox
async createUser(userData: CreateUserDto): Promise<User> {
return await prisma.$transaction(async (tx) => {
// Business operation
const user = await tx.user.create({ data: userData });
// Store event in outbox (same transaction)
await tx.outbox.create({
data: {
aggregateId: user.id,
aggregateType: 'User',
eventType: 'user.created.v1',
payload: JSON.stringify(user),
createdAt: new Date()
}
});
return user;
});
}
// Outbox processor (runs periodically)
async processOutbox(): Promise<void> {
const events = await prisma.outbox.findMany({
where: { publishedAt: null },
orderBy: { createdAt: 'asc' }, // publish oldest first to keep write order
take: 100
});
for (const event of events) {
try {
await kafkaProducer.send({
topic: event.eventType,
messages: [{ value: event.payload }]
});
await prisma.outbox.update({
where: { id: event.id },
data: { publishedAt: new Date() }
});
} catch (error) {
logger.error('Failed to publish event', { event, error });
}
}
}
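The processor above publishes to Kafka before it marks the row as published, so a crash between those two calls causes the same event to be published again on the next poll. One way consumers can cope is sketched below. It assumes each message carries a unique event id (for example the outbox row id, which the code above does not currently send) and that processed ids are remembered somewhere. The in-memory `Set` is a stand-in for a durable store, and `OutboxEvent` and `DedupingConsumer` are hypothetical names.

```typescript
// Hypothetical message shape; assumes the outbox row id travels with the event.
interface OutboxEvent {
  id: string;
  eventType: string;
  payload: string;
}

class DedupingConsumer {
  // Stand-in for a durable "processed events" table.
  private readonly seen = new Set<string>();
  readonly handled: OutboxEvent[] = [];

  // Returns true if the event was processed, false if it was a duplicate delivery.
  handle(event: OutboxEvent): boolean {
    if (this.seen.has(event.id)) return false;
    this.handled.push(event); // stand-in for the real side effect
    this.seen.add(event.id);
    return true;
  }
}
```

In a real consumer the side effect and the "seen" record would need to be committed together, otherwise the deduplication has the same gap as the publisher.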
Idempotency Pattern
graph LR
Request1[Request with<br/>Idempotency Key]
Request2[Retry with<br/>Same Key]
Request1 --> Check{Key Exists?}
Check -->|No| Process[Process Request]
Check -->|Yes| Return[Return Cached Result]
Process --> Store[Store Result<br/>with Key]
Store --> Response1[Response]
Request2 --> Check
Return --> Response2[Same Response]
style Check fill:#fff3cd
style Store fill:#d4edda
Description: Ensures a retried request does not repeat its side effects: the first result is stored under an idempotency key and returned for any later request that carries the same key.
Implementation:
// Idempotency middleware
async function idempotentOperation<T>(
key: string,
operation: () => Promise<T>,
ttl: number = 86400 // 24 hours
): Promise<T> {
// Check if already processed
const cached = await redis.get(`idempotency:${key}`);
if (cached) {
return JSON.parse(cached);
}
// Process operation
const result = await operation();
// Store result
await redis.setex(`idempotency:${key}`, ttl, JSON.stringify(result));
return result;
}
// Usage in controller
async createPayment(req: Request, res: Response): Promise<void> {
const idempotencyKey = req.headers['idempotency-key'] as string;
if (!idempotencyKey) {
res.status(400).json({ error: 'Idempotency-Key header required' });
return; // respond, then exit; the handler's return type is Promise<void>
}
const result = await idempotentOperation(
idempotencyKey,
() => paymentService.process(req.body)
);
res.json({ success: true, data: result });
}
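The check-then-store flow above has a gap: two requests with the same key that arrive at nearly the same time can both miss the cache and both run the operation. Closing the gap means reserving the key atomically before any work starts. The sketch below shows the idea with an in-memory `Map` as a stand-in for the key store; `inFlight` and `idempotentOnce` are hypothetical names. With Redis, the same reservation is commonly done with `SET key value NX EX ttl`, which only succeeds for the first caller.

```typescript
// In-memory stand-in for the idempotency store; for illustration only.
const inFlight = new Map<string, Promise<unknown>>();

function idempotentOnce<T>(key: string, operation: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) {
    // A request with this key is running or finished: share its outcome.
    return existing as Promise<T>;
  }
  const started = operation().catch((err) => {
    inFlight.delete(key); // a failed attempt releases the key so a retry can run
    throw err;
  });
  // The lookup and this set run in the same synchronous turn, so no other
  // caller can slip in between them.
  inFlight.set(key, started);
  return started;
}
```

Concurrent callers with the same key receive the same promise, so the operation body runs once.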
Optimistic Locking
sequenceDiagram
participant User1
participant User2
participant Service
participant DB
User1->>Service: Read (version=1)
User2->>Service: Read (version=1)
User1->>Service: Update (version=1)
Service->>DB: UPDATE WHERE version=1
DB-->>Service: Success, version→2
Service-->>User1: Success
User2->>Service: Update (version=1)
Service->>DB: UPDATE WHERE version=1
DB-->>Service: No rows updated
Service-->>User2: Conflict - version mismatch
User2->>Service: Read (version=2)
User2->>Service: Update (version=2)
Service-->>User2: Success
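Description: Prevents lost updates by making each write conditional on the version the writer originally read. A stale version matches no rows, so the update is rejected and the caller must re-read and retry.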
Implementation:
// Prisma schema
model User {
id String @id @default(cuid())
email String @unique
name String
version Int @default(1) // Version field
}
// Update with optimistic locking
async updateUser(userId: string, data: UpdateUserDto, currentVersion: number): Promise<User> {
const result = await prisma.user.updateMany({
where: {
id: userId,
version: currentVersion // Check version
},
data: {
...data,
version: { increment: 1 } // Increment version
}
});
if (result.count === 0) {
throw new ConflictError('Version mismatch - data was modified by another user');
}
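// findUnique returns User | null; the OrThrow variant matches the Promise<User> return type
return await prisma.user.findUniqueOrThrow({ where: { id: userId } });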
}
CQRS Pattern
graph LR
subgraph "Write Side"
Command[Command] --> WriteModel[Write Model<br/>Normalized]
WriteModel --> Events[Domain Events]
end
subgraph "Read Side"
Events --> Projection[Event Projection]
Projection --> ReadModel[Read Model<br/>Denormalized]
Query[Query] --> ReadModel
end
WriteModel --> DB1[(Write DB)]
ReadModel --> DB2[(Read DB<br/>Optimized)]
style WriteModel fill:#f0e1ff
style ReadModel fill:#d4edda
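Description: Separates the write model (normalized, handles commands) from the read model (denormalized, serves queries). The read model is updated by projecting domain events, so it is eventually consistent with the write side.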
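The projection step in the diagram can be sketched as a function that folds domain events into a denormalized row. This is an illustrative sketch only: the event names, the `OrderSummary` shape, and `applyEvent` are assumptions, and a `Map` stands in for the read database.

```typescript
// Hypothetical event and read-model shapes, for illustration only.
type OrderEvent =
  | { type: 'order.created'; orderId: string; customer: string }
  | { type: 'order.item_added'; orderId: string; priceCents: number }
  | { type: 'order.cancelled'; orderId: string };

interface OrderSummary {
  orderId: string;
  customer: string;
  itemCount: number;
  totalCents: number;
  status: 'open' | 'cancelled';
}

// Applies one domain event to the read model (the Map stands in for the read DB).
function applyEvent(readModel: Map<string, OrderSummary>, event: OrderEvent): void {
  if (event.type === 'order.created') {
    readModel.set(event.orderId, {
      orderId: event.orderId,
      customer: event.customer,
      itemCount: 0,
      totalCents: 0,
      status: 'open',
    });
    return;
  }
  const row = readModel.get(event.orderId);
  if (!row) return; // event for an unknown order: ignored in this sketch
  if (event.type === 'order.item_added') {
    row.itemCount += 1;
    row.totalCents += event.priceCents;
  } else {
    row.status = 'cancelled';
  }
}
```

Queries then read `OrderSummary` rows directly, with no joins or aggregation at query time.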
Performance Characteristics
Performance metrics and optimization strategies for data consistency patterns.
| Pattern | Latency Impact | Throughput | Notes |
|---|---|---|---|
| Saga Execution | 500ms - 2s | 100-500 sagas/s | Depends on number of steps and compensation |
| Outbox Processing | < 100ms | 10,000 events/s | Async processing, minimal user impact |
| Idempotency Check | < 10ms | 50,000 checks/s | Redis lookup, very fast |
| Optimistic Lock Update | < 50ms | 5,000 updates/s | Single DB operation with version check |
| CQRS Projection | 100ms - 1s | 1,000 events/s | Event processing to read model |
| Compensation Execution | 200ms - 1s | Varies | Rollback operations in saga |
Performance Optimization Strategies
Saga Pattern:
- Minimize number of steps (< 5 steps ideal)
- Parallel execution where possible
- Cache service responses
- Set appropriate timeouts (30s default)
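A per-step timeout can be sketched with `Promise.race`. The helper name `withTimeout` and its parameters are assumptions for illustration. Note that the timeout only bounds how long the orchestrator waits and does not cancel the remote call, so a step that timed out may still have succeeded. Compensation and idempotent steps remain necessary.

```typescript
// Rejects if `work` has not settled within `ms` milliseconds.
// The underlying work is NOT cancelled; only the wait is bounded.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => {
    // Clear the timer so a fast step does not keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

A saga step would then be wrapped as, for example, `withTimeout(paymentService.process(orderData.payment), 30_000, 'payment')`.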
Outbox Pattern:
- Batch process outbox events (100-500 per batch)
- Index the publishedAt column for performance
- Archive processed events periodically
- Reuse a long-lived Kafka producer connection instead of creating one per batch
Idempotency:
- Use Redis for fast key lookups
- Set TTL to 24-48 hours
- Hash long idempotency keys
- Clean expired keys regularly
Optimistic Locking:
- Works best for low-contention scenarios
- Implement retry with exponential backoff
- Monitor conflict rates (should be < 5%)
- Consider pessimistic locking if conflicts > 10%
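Retry with exponential backoff can be sketched as a read, modify, conditional-write loop. The example below is a minimal illustration under assumptions: `VersionedCell` is an in-memory stand-in for a row with a version column, `updateWithRetry` is a hypothetical helper, and the delay doubles on each attempt without jitter to keep the sketch deterministic.

```typescript
interface Versioned<T> {
  value: T;
  version: number;
}

// In-memory stand-in for a database row with a version column (hypothetical).
class VersionedCell<T> {
  private row: Versioned<T>;

  constructor(initial: Versioned<T>) {
    this.row = initial;
  }

  read(): Versioned<T> {
    return { ...this.row };
  }

  // Mirrors `UPDATE ... WHERE version = expectedVersion`; false means no rows matched.
  write(value: T, expectedVersion: number): boolean {
    if (this.row.version !== expectedVersion) return false;
    this.row = { value, version: expectedVersion + 1 };
    return true;
  }
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Re-reads and retries on conflict, doubling the delay each time.
// Returns the number of attempts that were needed.
async function updateWithRetry<T>(
  cell: VersionedCell<T>,
  mutate: (current: T) => T,
  maxAttempts = 5,
  baseDelayMs = 20,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const { value, version } = cell.read();
    if (cell.write(mutate(value), version)) return attempt;
    if (attempt < maxAttempts) await sleep(baseDelayMs * 2 ** (attempt - 1));
  }
  throw new Error(`update still conflicting after ${maxAttempts} attempts`);
}
```

In production, adding random jitter to the delay helps prevent conflicting writers from retrying in lockstep.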
Security Considerations
Security measures for protecting data consistency operations.
Saga Security
Compensation Protection:
- Validate saga execution permissions at each step
- Encrypt sensitive data in saga context
- Log all saga executions for audit
- Implement timeout to prevent hanging sagas
// Secure saga context
interface SecureSagaContext {
sagaId: string;
userId: string; // User who initiated
permissions: string[]; // Required permissions
encryptedData: string; // Encrypted sensitive data
auditLog: AuditEntry[]; // Audit trail
}
Outbox Security
Event Payload Encryption:
- Encrypt PII (Personally Identifiable Information) before storing in outbox
- Use AES-256-GCM for event payload encryption
- Decrypt only when publishing to Kafka
- Rotate encryption keys quarterly
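Payload encryption with AES-256-GCM can be sketched with Node's built-in `crypto` module. The function names and the `EncryptedPayload` shape are assumptions for illustration. Key storage and rotation are out of scope here, and the 32-byte key is assumed to come from a secrets manager.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Hypothetical storage shape: all fields are base64 strings.
interface EncryptedPayload {
  iv: string;
  tag: string;
  data: string;
}

function encryptPayload(plaintext: string, key: Buffer): EncryptedPayload {
  const iv = randomBytes(12); // 96-bit nonce; must be unique per message for a given key
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const data = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return {
    iv: iv.toString('base64'),
    tag: cipher.getAuthTag().toString('base64'), // authentication tag, checked on decrypt
    data: data.toString('base64'),
  };
}

function decryptPayload(enc: EncryptedPayload, key: Buffer): string {
  const decipher = createDecipheriv('aes-256-gcm', key, Buffer.from(enc.iv, 'base64'));
  decipher.setAuthTag(Buffer.from(enc.tag, 'base64'));
  // final() throws if the ciphertext or tag was modified.
  const plain = Buffer.concat([
    decipher.update(Buffer.from(enc.data, 'base64')),
    decipher.final(),
  ]);
  return plain.toString('utf8');
}
```

The outbox row would store the `EncryptedPayload` fields in place of the plain JSON payload, and the processor would call `decryptPayload` just before publishing.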
Access Control:
- Restrict outbox table access to outbox processor only
- Use database roles and permissions
- Monitor outbox table access patterns
Idempotency Security
Key Security:
- Use cryptographic hashing for idempotency keys (SHA-256)
- Include user context in key generation
- Validate key ownership before processing
- Clear keys on user logout for sensitive operations
// Secure idempotency key generation
import { createHash } from 'crypto';
function generateIdempotencyKey(
operation: string,
userId: string,
data: unknown
): string {
// Binding userId into the hash keeps one user's key from matching another's
const payload = JSON.stringify({ operation, userId, data });
return createHash('sha256').update(payload).digest('hex');
}
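One caveat with the function above: `JSON.stringify` output depends on property insertion order, so two logically identical request bodies can hash to different keys. A possible remedy, sketched under the assumption that `data` is plain JSON (no `Date`, `Map`, or class instances), is to sort object keys recursively before hashing. `canonicalize` and `stableIdempotencyKey` are hypothetical names.

```typescript
import { createHash } from 'node:crypto';

// Recursively rebuilds objects with sorted keys so equal payloads serialize identically.
// Assumes plain JSON data; other object types are not handled in this sketch.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const k of Object.keys(obj).sort()) {
      sorted[k] = canonicalize(obj[k]);
    }
    return sorted;
  }
  return value;
}

function stableIdempotencyKey(operation: string, userId: string, data: unknown): string {
  const payload = JSON.stringify(canonicalize({ operation, userId, data }));
  return createHash('sha256').update(payload).digest('hex');
}
```

With this variant, `{ amount: 100, currency: 'USD' }` and `{ currency: 'USD', amount: 100 }` produce the same key for the same user and operation.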
Optimistic Locking Security
Version Tampering Prevention:
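- Enforce the version check on the server; a client-supplied version is only a precondition for the update
- Never write a client-supplied version value to the database; always increment it server-side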
- Log version conflicts for security monitoring
- Rate limit update attempts per user
Deployment
How data consistency patterns are deployed and scaled.
graph TD
subgraph "Production Deployment"
subgraph "Order Service Cluster"
OS1[Order Service\nPod 1]
OS2[Order Service\nPod 2]
OS3[Order Service\nPod 3]
end
subgraph "Saga Orchestrator"
SO1[Saga Orchestrator\nPod 1]
SO2[Saga Orchestrator\nPod 2]
end
subgraph "Outbox Processor"
OP1[Outbox Processor\nPod 1]
OP2[Outbox Processor\nPod 2]
end
OS1 & OS2 & OS3 --> DB[(Order DB\nwith Outbox)]
OS1 & OS2 & OS3 --> Redis[(Redis\nIdempotency Keys)]
SO1 & SO2 --> PS[Payment Service]
SO1 & SO2 --> IS[Inventory Service]
OP1 & OP2 --> DB
OP1 & OP2 --> Kafka[Kafka Cluster\n5 brokers]
end
style SO1 fill:#e1f5ff
style SO2 fill:#e1f5ff
style OP1 fill:#fff4e1
style OP2 fill:#fff4e1
style DB fill:#d4edda
style Kafka fill:#ffe1e1
Deployment Configuration
| Component | Replicas | Resources | HA Strategy |
|---|---|---|---|
| Saga Orchestrator | 2-3 | 512Mi RAM, 500m CPU | Leader election with etcd |
| Outbox Processor | 2-5 | 256Mi RAM, 250m CPU | Distributed lock per event batch |
| Services with Outbox | 3+ | Varies | Standard service scaling |
| Redis (Idempotency) | 3 nodes | 1Gi RAM each | Redis Cluster with replication |
Scaling Strategy
Saga Orchestrator:
- Scale based on pending saga count
- Use queue-based load distribution
- Monitor saga execution duration
Outbox Processor:
- Scale with database sharding (1 processor per shard)
- Increase batch size before adding replicas
- Monitor outbox table size and age
Idempotency Store (Redis):
- Scale Redis cluster horizontally
- Use consistent hashing for key distribution
- Monitor memory usage (should be < 70%)
Monitoring & Observability
Monitoring strategies for data consistency patterns.
Key Metrics
Saga Metrics:
- saga_executions_total - Total saga executions (success/failure)
- saga_duration_seconds - Saga execution time histogram
- saga_compensations_total - Total compensation executions
- saga_timeout_total - Sagas that timed out
- saga_pending_count - Sagas currently executing
Outbox Metrics:
- outbox_events_total - Events written to outbox
- outbox_published_total - Events published to Kafka
- outbox_processing_lag_seconds - Time from write to publish
- outbox_table_size - Outbox table row count
- outbox_failed_events_total - Failed event publications
Idempotency Metrics:
- idempotency_checks_total - Total idempotency checks
- idempotency_hits_total - Duplicate requests prevented
- idempotency_key_ttl_seconds - Average key TTL
- idempotency_redis_errors_total - Redis failures
Optimistic Lock Metrics:
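- optimistic_lock_attempts_total - Total update attempts (denominator of the conflict-rate alert)
- optimistic_lock_conflicts_total - Version conflicts detected
- optimistic_lock_retries_total - Retry attempts after conflict
- optimistic_lock_success_rate - Update success percentage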
Alerts
Critical Alerts:
# Saga timeout rate too high
alert: HighSagaTimeoutRate
expr: rate(saga_timeout_total[5m]) > 0.05
for: 5m
severity: critical
# Outbox processing lag
alert: OutboxProcessingLag
expr: outbox_processing_lag_seconds > 300
for: 10m
severity: critical
# High optimistic lock conflict rate
alert: HighOptimisticLockConflicts
expr: rate(optimistic_lock_conflicts_total[5m]) / rate(optimistic_lock_attempts_total[5m]) > 0.1
for: 5m
severity: warning
Monitoring Dashboard
Grafana Panels:
- Saga Orchestration Overview:
  - Saga execution rate (success/failure)
  - Average saga duration
  - Compensation rate
  - Pending saga count
- Outbox Processing Health:
  - Outbox publishing rate
  - Processing lag (P95, P99)
  - Failed events
  - Table size trend
- Idempotency Effectiveness:
  - Duplicate prevention rate
  - Redis hit rate
  - Key distribution
- Data Consistency SLA:
  - Overall consistency rate (target: 99.9%)
  - Mean time to consistency (MTTC)
  - Conflict resolution success rate
Distributed Tracing
Trace Saga Execution:
// Traced saga step (uses the OpenTelemetry API package)
import { trace, SpanStatusCode } from '@opentelemetry/api';
async function executeStepWithTracing(
step: SagaStep,
context: SagaContext
): Promise<void> {
const tracer = trace.getTracer('saga-orchestrator');
const span = tracer.startSpan(`saga.step.${step.name}`, {
attributes: {
'saga.id': context.sagaId,
'saga.step': step.name,
'saga.attempt': context.currentAttempt
}
});
try {
await step.execute(context);
span.setStatus({ code: SpanStatusCode.OK });
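} catch (error) {
// `error` is `unknown` under strict TypeScript, so narrow it before reading fields
const err = error instanceof Error ? error : new Error(String(error));
span.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
span.recordException(err);
throw error;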
} finally {
span.end();
}
}
Related Documentation
- Event-Driven Architecture - Event sourcing and Kafka
- System Design - Overall architecture
- Microservices Communication - Service communication patterns
- Resilience Patterns - Circuit breaker, retry for saga steps
- Caching Patterns - Caching for idempotency keys
- Database Prisma - Prisma transactions for outbox pattern
Last Updated: 2026-01-07
Author: VelikHo (hongochai10@icloud.com)
Reviewers: To be assigned