| name | description |
|---|---|
| resilience-patterns | Resilience patterns for GoodGo microservices including circuit breaker, retry strategies, timeout handling, and graceful degradation. Use when implementing fault tolerance, handling external service failures, or improving system reliability. |
Resilience Patterns
When to Use This Skill
Use this skill when:
- Implementing circuit breaker patterns for external services
- Adding retry logic for transient failures
- Setting timeout handling for long-running operations
- Implementing graceful degradation strategies
- Handling external service failures
- Improving system fault tolerance
Core Concepts
Resilience Patterns
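- Circuit Breaker: Prevents cascading failures by failing fast instead of calling a service that is already failing, then probing for recovery after a reset timeout
- Retry: Re-runs operations that failed transiently, waiting longer between attempts (backoff)
- Timeout: Sets a maximum time limit on an operation so callers are never blocked indefinitely
- Bulkhead: Isolates resources (such as connection pools or concurrency slots) per dependency so that one failing dependency cannot exhaust capacity the others need
- Graceful Degradation: Provides fallback behavior when a service fails, so users get a reduced result rather than an error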
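The list above names Bulkhead, but the patterns that follow do not show one. The block below is a minimal sketch, assuming an in-process concurrency cap per dependency is what is wanted. The `Bulkhead` class and its `maxConcurrent` and `maxQueue` parameters are hypothetical names, not part of any GoodGo package.

```typescript
// Hypothetical helper: caps how many calls to one dependency run at once.
class Bulkhead {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly maxConcurrent: number;
  private readonly maxQueue: number;

  constructor(maxConcurrent: number, maxQueue: number = 0) {
    this.maxConcurrent = maxConcurrent;
    this.maxQueue = maxQueue;
  }

  async run<T>(fn: () => Promise<T>): Promise<T> {
    if (this.active >= this.maxConcurrent) {
      // Reject immediately when both the slots and the queue are full.
      if (this.waiting.length >= this.maxQueue) {
        throw new Error('Bulkhead full');
      }
      // Wait until a finishing call hands over its slot.
      await new Promise<void>(resolve => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await fn();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // Slot passes straight to the next waiter; active count is unchanged.
      } else {
        this.active--;
      }
    }
  }
}
```

Creating one instance per downstream dependency keeps a slow dependency from consuming the capacity reserved for the others.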
Patterns
Circuit Breaker Pattern
Protects against cascading failures:
import CircuitBreaker from 'opossum';
import { logger } from '@goodgo/logger';

export const createCircuitBreaker = <TArgs extends any[], TResult>(
  action: (...args: TArgs) => Promise<TResult>,
  name: string,
  options: Partial<CircuitBreaker.Options> = {}
): CircuitBreaker<TArgs, TResult> => {
  const breaker = new CircuitBreaker(action, {
    timeout: 3000, // Treat calls slower than 3 s as failures
    errorThresholdPercentage: 50, // Open the circuit once 50% of calls fail
    resetTimeout: 30000, // Allow a trial call 30 s after opening
    ...options, // Caller overrides for the defaults above
    name, // Set last so the explicit name always wins
  });

  // Log every state transition so breaker behavior is visible in the logs
  breaker.on('open', () => {
    logger.warn(`Circuit Breaker OPEN: ${name}`);
  });
  breaker.on('halfOpen', () => {
    logger.info(`Circuit Breaker HALF-OPEN: ${name}`);
  });
  breaker.on('close', () => {
    logger.info(`Circuit Breaker CLOSED: ${name}`);
  });

  return breaker;
};

// Usage
const externalApiBreaker = createCircuitBreaker(
  async (data) => await externalApi.call(data),
  'external-api'
);

try {
  const result = await externalApiBreaker.fire(requestData);
} catch (error) {
  // Handle circuit breaker error or fallback
}
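To make the three logged states (open, half-open, closed) concrete, here is a dependency-free sketch of the state machine behind a circuit breaker. It is a simplified illustration, not opossum's implementation: it counts consecutive failures instead of an error percentage over a rolling window, and `SimpleBreaker` and its parameters are hypothetical names.

```typescript
type BreakerState = 'closed' | 'open' | 'halfOpen';

// Simplified illustration only; opossum tracks a rolling error percentage instead.
class SimpleBreaker<T> {
  state: BreakerState = 'closed';
  private failures = 0;
  private openedAt = 0;
  private readonly action: () => Promise<T>;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(
    action: () => Promise<T>,
    failureThreshold: number = 3,
    resetTimeoutMs: number = 30000,
    now: () => number = () => Date.now() // Injectable clock, useful in tests
  ) {
    this.action = action;
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  async fire(): Promise<T> {
    if (this.state === 'open') {
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error('Breaker is open'); // Fail fast without calling the service
      }
      this.state = 'halfOpen'; // Reset timeout elapsed: allow one trial call
    }
    try {
      const result = await this.action();
      this.state = 'closed'; // Any success closes the circuit
      this.failures = 0;
      return result;
    } catch (error) {
      this.failures++;
      // A failed trial call, or too many consecutive failures, opens the circuit.
      if (this.state === 'halfOpen' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

While the breaker is open, callers get an immediate error rather than waiting on a failing service, which is what stops the failure from cascading.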
Retry Pattern
Retry transient failures with exponential backoff:
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxRetries: number = 3,
  baseDelay: number = 1000
): Promise<T> {
  // One initial attempt plus up to maxRetries retries
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (attempt === maxRetries) throw error;
      // Delay doubles on each attempt: baseDelay, 2x, 4x, ...
      const delay = baseDelay * 2 ** attempt;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  // Unreachable; satisfies the compiler's return-path check
  throw new Error('Retry exhausted');
}
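The function above retries every error and uses a fixed delay schedule. A possible variant, sketched here under the assumption that callers can tell transient errors apart, adds a retry predicate, a delay cap, and random jitter so that many clients do not retry in lockstep. `retryWithJitter`, `RetryOptions`, and the option names are hypothetical.

```typescript
// Hypothetical options; defaults mirror the example above where possible.
interface RetryOptions {
  maxRetries?: number;
  baseDelayMs?: number;
  maxDelayMs?: number;
  isRetryable?: (error: unknown) => boolean;
}

async function retryWithJitter<T>(
  fn: () => Promise<T>,
  options: RetryOptions = {}
): Promise<T> {
  const {
    maxRetries = 3,
    baseDelayMs = 1000,
    maxDelayMs = 30000,
    isRetryable = () => true,
  } = options;

  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Give up when retries are exhausted or the error is not transient.
      if (attempt >= maxRetries || !isRetryable(error)) throw error;
      // Exponential ceiling, capped, with a random delay in [0, ceiling).
      const ceiling = Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
      const delay = Math.random() * ceiling;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}
```

Passing an `isRetryable` function keeps the decision about which failures are transient with the caller, who knows the error types of the service being called.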
Timeout Pattern
Set maximum time limits:
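async function withTimeout<T>(
  promise: Promise<T>,
  timeoutMs: number
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('Operation timeout')), timeoutMs);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    // Clear the timer so it does not linger after the race has settled
    clearTimeout(timer);
  }
}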
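// Usage
try {
  const result = await withTimeout(
    externalService.call(),
    5000 // 5-second timeout
  );
} catch (error) {
  // The caught value is `unknown` under strict TypeScript, so narrow it first
  if (error instanceof Error && error.message === 'Operation timeout') {
    // Handle timeout
  }
}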
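Racing against a timer rejects the caller on time, but the underlying operation keeps running in the background. Where the operation accepts an `AbortSignal`, a sketch like the following can cancel it as well. `withAbortableTimeout` is a hypothetical name, and it only helps if `fn` actually honors the signal it is given.

```typescript
// Hypothetical helper: aborts the operation itself when the time limit passes.
async function withAbortableTimeout<T>(
  fn: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // fn must pass the signal on to whatever does the real work.
    return await fn(controller.signal);
  } finally {
    clearTimeout(timer); // Avoid a stray abort after fn has settled
  }
}
```

For example, `withAbortableTimeout(signal => fetch(url, { signal }), 5000)` would cancel the HTTP request after 5 seconds, where `url` stands in for whatever endpoint is being called.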
Graceful Degradation
Provide fallback behavior:
async function getDataWithFallback() {
  try {
    return await primaryDataSource.get();
  } catch (error) {
    logger.warn('Primary source failed, using fallback', { error });
    return await fallbackDataSource.get();
  }
}
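The example above assumes a second data source exists. When there is none, one alternative is to serve the last value that was fetched successfully. The following is a minimal sketch of that idea; `withLastKnownGood` is a hypothetical helper, and it deliberately ignores cache expiry.

```typescript
// Hypothetical helper: falls back to the most recent successful result.
function withLastKnownGood<T>(fetchFresh: () => Promise<T>): () => Promise<T> {
  // Wrapped in an object so a cached `undefined` value is still a hit.
  let cached: { value: T } | undefined;

  return async () => {
    try {
      const value = await fetchFresh();
      cached = { value }; // Remember every successful result
      return value;
    } catch (error) {
      if (cached) return cached.value; // Degraded: possibly stale data
      throw error; // Nothing cached yet, so the failure must surface
    }
  };
}
```

Whether stale data is acceptable depends on the use case, so a real implementation would likely add a maximum age and log when a stale value is served.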
Best Practices
- Circuit Breaker: Wrap calls to external services in a circuit breaker
- Retry: Retry only transient failures (network errors, timeouts)
- Timeout: Set an explicit timeout on every external call
- Fallback: Provide fallback behavior wherever a failure would otherwise reach the user
- Monitoring: Monitor circuit breaker states and retry rates
- Logging: Log breaker state changes, retries, timeouts, and fallbacks for debugging
Common Mistakes
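- Retrying Non-Retryable Errors: Retrying 4xx (client) errors, which usually fail the same way on every attempt
- No Timeout: Leaving external calls without a timeout, so a hung dependency blocks the caller
- No Fallback: Having no graceful degradation strategy when a dependency is down
- Too Many Retries: Retrying so often that requests slow down and extra load lands on a service that is already struggling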
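To avoid the first mistake in the list, retry decisions for HTTP calls can be based on the response status. The sketch below follows one common convention and is an assumption, not a GoodGo rule: `isRetryableStatus` is a hypothetical helper, and the exact set of retryable codes should be tuned per service.

```typescript
// Hypothetical classifier: one common convention for HTTP retry decisions.
function isRetryableStatus(status: number): boolean {
  // 408 (Request Timeout) and 429 (Too Many Requests) are 4xx codes worth retrying.
  if (status === 408 || status === 429) return true;
  // Server-side errors are treated as possibly transient.
  return status >= 500 && status <= 599;
}
```

Other 4xx responses, such as 400 or 404, indicate a problem with the request itself, so repeating the same request is unlikely to succeed.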
Resources
- Circuit Breaker - Circuit breaker implementation