---
name: resilience-patterns
description: Resilience patterns for GoodGo microservices including circuit breaker, retry strategies, timeout handling, and graceful degradation. Use when implementing fault tolerance, handling external service failures, or improving system reliability.
---
# Resilience Patterns
## When to Use This Skill
Use this skill when:
- Implementing circuit breaker patterns for external services
- Adding retry logic for transient failures
- Setting timeout handling for long-running operations
- Implementing graceful degradation strategies
- Handling external service failures
- Improving system fault tolerance
## Core Concepts
### Resilience Patterns
1. **Circuit Breaker**: Prevents cascading failures by stopping calls to failing services
2. **Retry**: Automatically retries failed operations with backoff
3. **Timeout**: Sets maximum time limits for operations
4. **Bulkhead**: Isolates failures to prevent spread
5. **Graceful Degradation**: Provides fallback behavior when services fail
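As a rough illustration of the bulkhead idea from the list above, the sketch below caps how many calls to one dependency may run at the same time, so a slow dependency cannot absorb every available request slot. The `Bulkhead` class and its `run` method are hypothetical names used only for this example; they are not part of the GoodGo codebase.

```typescript
// Hypothetical helper (not an existing GoodGo utility): limits concurrent calls.
class Bulkhead {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  async run<T>(fn: () => Promise<T>): Promise<T> {
    if (this.active >= this.maxConcurrent) {
      // At capacity: wait until a finishing call hands over its slot.
      await new Promise<void>(resolve => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await fn();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // hand the slot to the next waiter; active count is unchanged
      } else {
        this.active--; // nobody waiting: free the slot
      }
    }
  }
}
```

Giving each external dependency its own `Bulkhead` instance keeps a failure in one of them from starving calls to the others.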
## Patterns
### Circuit Breaker Pattern
Protects against cascading failures. The circuit breaker has three states, with transitions driven by error rates and timeouts:
```mermaid
stateDiagram-v2
[*] --> CLOSED: Initial State
CLOSED --> OPEN: Errors exceed threshold (errorThresholdPercentage 50%)
OPEN --> HALF_OPEN: Reset timeout expires (resetTimeout 30s)
HALF_OPEN --> CLOSED: Request succeeds
HALF_OPEN --> OPEN: Request fails
CLOSED --> [*]: Normal operation
OPEN --> [*]: Circuit open (rejecting requests)
HALF_OPEN --> [*]: Testing recovery
```
**Circuit Breaker States:**
- **CLOSED**: Normal operation, requests pass through
- **OPEN**: Circuit is open, requests are immediately rejected
- **HALF-OPEN**: Testing if service has recovered, allows limited requests
```typescript
import CircuitBreaker from 'opossum';
import { logger } from '@goodgo/logger';

// Wraps an async action in an opossum circuit breaker and logs state changes.
export const createCircuitBreaker = <TArgs extends unknown[], TResult>(
  action: (...args: TArgs) => Promise<TResult>,
  name: string,
  options: Partial<CircuitBreaker.Options> = {}
): CircuitBreaker<TArgs, TResult> => {
  const breaker = new CircuitBreaker(action, {
    timeout: 3000, // fail calls that take longer than 3s
    errorThresholdPercentage: 50, // open once 50% of calls fail
    resetTimeout: 30000, // move to half-open after 30s
    ...options,
    name,
  });

  breaker.on('open', () => {
    logger.warn(`Circuit Breaker OPEN: ${name}`);
  });
  breaker.on('halfOpen', () => {
    logger.info(`Circuit Breaker HALF-OPEN: ${name}`);
  });
  breaker.on('close', () => {
    logger.info(`Circuit Breaker CLOSED: ${name}`);
  });

  return breaker;
};

// Usage (externalApi and requestData are placeholders)
const externalApiBreaker = createCircuitBreaker(
  async (data) => await externalApi.call(data),
  'external-api'
);

try {
  const result = await externalApiBreaker.fire(requestData);
} catch (error) {
  // Handle circuit breaker error or fallback
}
```
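To make the three states and their transitions concrete, here is a minimal hand-rolled sketch. It is deliberately simplified and is not how opossum works internally: it trips after a number of consecutive failures rather than an error percentage over a rolling window, and it does not limit concurrent trial calls in the half-open state. `SimpleBreaker`, `failureThreshold`, and the injectable `now` clock are hypothetical names for this illustration.

```typescript
type State = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

// Hypothetical, simplified breaker for illustration only.
class SimpleBreaker<T> {
  state: State = 'CLOSED';
  private failures = 0;
  private openedAt = 0;
  private readonly action: () => Promise<T>;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(
    action: () => Promise<T>,
    failureThreshold = 3,
    resetTimeoutMs = 30000,
    now: () => number = () => Date.now()
  ) {
    this.action = action;
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  async fire(): Promise<T> {
    if (this.state === 'OPEN') {
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        // OPEN: reject immediately without calling the action
        throw new Error('Circuit is open');
      }
      // Reset timeout expired: let a trial call through
      this.state = 'HALF_OPEN';
    }
    try {
      const result = await this.action();
      // Any success closes the circuit and clears the failure count
      this.failures = 0;
      this.state = 'CLOSED';
      return result;
    } catch (error) {
      this.failures++;
      // A failed trial call, or too many consecutive failures, opens the circuit
      if (this.state === 'HALF_OPEN' || this.failures >= this.failureThreshold) {
        this.state = 'OPEN';
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

Passing the clock in as a parameter is a design choice that lets the reset timeout be exercised in tests without waiting 30 seconds.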
### Retry Pattern
Retry transient failures with exponential backoff. The operation is attempted multiple times, with the delay doubling between attempts:
```mermaid
flowchart TD
Start([Start Operation]) --> Attempt[Attempt Operation]
Attempt --> Success{Success?}
Success -->|Yes| Return([Return Result])
Success -->|No| CheckRetries{Attempt < Max Retries?}
CheckRetries -->|No| ThrowError([Throw Error])
CheckRetries -->|Yes| CalculateDelay["Calculate Delay: baseDelay × 2^attempt"]
CalculateDelay --> Wait[Wait for Delay]
Wait --> IncrementAttempt[Increment Attempt]
IncrementAttempt --> Attempt
style Start fill:#e1f5e1
style Return fill:#e1f5e1
style ThrowError fill:#ffe1e1
style CalculateDelay fill:#fff4e1
```
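**Exponential Backoff Example** (with `baseDelay` = 1000 ms; the delay is applied after each failed attempt):
- After attempt 1: 1s delay
- After attempt 2: 2s delay
- After attempt 3: 4s delay
- After attempt 4: 8s delay (only reached when `maxRetries` is 4 or more)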
```typescript
// Retries fn up to maxRetries times, doubling the delay after each failure.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxRetries: number = 3,
  baseDelay: number = 1000
): Promise<T> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Out of retries: surface the last error to the caller
      if (attempt === maxRetries) throw error;
      const delay = baseDelay * Math.pow(2, attempt);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  // Only reached if maxRetries is negative
  throw new Error('Retry exhausted');
}
```
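`retryWithBackoff` above retries every error and uses deterministic delays. The sketch below is one possible variation that retries only errors judged transient and randomizes each delay ("full jitter") so that many clients do not retry in lockstep. The names `statusOf`, `isRetryable`, and `retryTransient` are hypothetical, and so is the assumption that failed calls carry a numeric `status` property; which status codes count as transient is also an assumption to adjust per service.

```typescript
// Hypothetical: read an HTTP-style status code off an unknown error, if present.
function statusOf(error: unknown): number | undefined {
  if (typeof error === 'object' && error !== null && 'status' in error) {
    const status = (error as { status: unknown }).status;
    return typeof status === 'number' ? status : undefined;
  }
  return undefined;
}

// Assumption: 5xx, 408 and 429 are transient; other 4xx are not.
// Errors without a status (for example network failures) are treated as transient.
function isRetryable(error: unknown): boolean {
  const status = statusOf(error);
  if (status === undefined) return true;
  return status >= 500 || status === 408 || status === 429;
}

async function retryTransient<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelay = 1000,
  maxDelay = 10000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Give up on the last attempt or on errors a retry cannot fix
      if (attempt >= maxRetries || !isRetryable(error)) throw error;
      // Exponential ceiling, capped at maxDelay, then a random delay below it
      const ceiling = Math.min(maxDelay, baseDelay * Math.pow(2, attempt));
      const delay = Math.random() * ceiling;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}
```

Capping the delay with `maxDelay` keeps the worst-case wait bounded even when `maxRetries` is raised.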
### Timeout Pattern
Set maximum time limits. The timeout pattern uses `Promise.race` to enforce a maximum execution time:
```mermaid
sequenceDiagram
participant Client
participant TimeoutWrapper
participant Operation
participant TimeoutTimer
Client->>TimeoutWrapper: Execute with timeout
TimeoutWrapper->>Operation: Start operation
TimeoutWrapper->>TimeoutTimer: Start timeout timer
alt Operation completes first
Operation-->>TimeoutWrapper: Return result
TimeoutWrapper-->>Client: Return result
TimeoutWrapper->>TimeoutTimer: Cancel timer
else Timeout expires first
TimeoutTimer-->>TimeoutWrapper: Timeout error
TimeoutWrapper->>Operation: (Operation may continue)
TimeoutWrapper-->>Client: Reject with timeout error
end
```
**Timeout Behavior:**
- Uses `Promise.race()` to race the operation against a timer
- Whichever settles first (resolves or rejects) wins
- The operation is not cancelled on timeout; it may keep running, but its result is ignored
```typescript
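// Rejects with 'Operation timeout' if the promise does not settle within timeoutMs.
async function withTimeout<T>(
  promise: Promise<T>,
  timeoutMs: number
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('Operation timeout')), timeoutMs);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer); // cancel the timer once either side settles
  }
}

// Usage
try {
  const result = await withTimeout(
    externalService.call(),
    5000 // 5 second timeout
  );
} catch (error) {
  if (error instanceof Error && error.message === 'Operation timeout') {
    // Handle timeout
  }
}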
```
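Because `withTimeout` only stops waiting, the underlying work keeps consuming resources after a timeout. Where the operation supports cancellation, one option is to pass it an `AbortSignal` from the standard `AbortController` API, as in the sketch below. `withAbortableTimeout` is a hypothetical name, and the approach assumes the operation actually honors the signal.

```typescript
// Hypothetical helper: like withTimeout, but also signals the operation to stop.
async function withAbortableTimeout<T>(
  operation: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);

  // Rejects when the signal fires, so callers get a timeout error
  // even if the operation itself ignores the signal.
  const timeout = new Promise<never>((_, reject) => {
    controller.signal.addEventListener('abort', () =>
      reject(new Error('Operation timeout'))
    );
  });

  try {
    return await Promise.race([operation(controller.signal), timeout]);
  } finally {
    clearTimeout(timer); // no-op if the timer already fired
  }
}

// Usage sketch: fetch accepts a signal and aborts the request when it fires.
// const response = await withAbortableTimeout(
//   signal => fetch('https://example.com/health', { signal }),
//   5000
// );
```

The race is kept so that the caller still gets a prompt rejection when an operation is slow to react to the abort.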
### Graceful Degradation
Provide fallback behavior:
```typescript
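import { logger } from '@goodgo/logger';

// primaryDataSource and fallbackDataSource are placeholders for real clients.
async function getDataWithFallback() {
  try {
    return await primaryDataSource.get();
  } catch (error) {
    logger.warn('Primary source failed, using fallback', { error });
    return await fallbackDataSource.get();
  }
}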
```
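When no second data source exists, a common form of degradation is to serve the last value that loaded successfully, or a static default if nothing has loaded yet. The sketch below shows that idea; `withLastKnownGood` and the `degraded` flag are hypothetical names, and a real service would likely also bound how stale the served value may be.

```typescript
// Hypothetical helper: remembers the last successful result of a loader.
function withLastKnownGood<T>(
  load: () => Promise<T>,
  defaultValue: T
): () => Promise<{ value: T; degraded: boolean }> {
  // Starts as the default and is overwritten by every successful load
  let lastGood = defaultValue;

  return async () => {
    try {
      lastGood = await load();
      return { value: lastGood, degraded: false };
    } catch (error) {
      // Loader failed: serve the stale (or default) value and flag it
      return { value: lastGood, degraded: true };
    }
  };
}
```

Returning the `degraded` flag alongside the value lets callers log or surface that they are operating on stale data instead of hiding the failure.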
## Best Practices
1. **Circuit Breaker**: Use for external service calls
2. **Retry**: Retry only transient failures (network, timeout)
3. **Timeout**: Set appropriate timeouts for all external calls
4. **Fallback**: Always provide fallback behavior
5. **Monitoring**: Monitor circuit breaker states and retry rates
6. **Logging**: Log all resilience actions for debugging
## Common Mistakes
1. **Retrying Non-Retryable Errors**: Retrying 4xx errors (client errors)
2. **No Timeout**: Missing timeouts on external calls
3. **No Fallback**: No graceful degradation strategy
4. **Too Many Retries**: Excessive retries that add load to an already failing service and delay the error response to the caller
## Resources
- [Circuit Breaker](../../services/iam-service/src/modules/common/circuit-breaker.ts) - Circuit breaker implementation