Microservices Development Process - Detailed Reference
This document contains detailed phase descriptions, diagrams, verification steps, and rollback strategies for microservices development.
Table of Contents
- Process Flow Diagrams
- Phase 1: Planning and Impact Analysis
- Phase 2: Foundation Setup
- Phase 3: Core Implementation
- Phase 4: Integration
- Phase 5: Testing
- Phase 6: Documentation
- Phase 7: Cleanup and Verification
- Phase 8: Deployment
- Rollback Strategies
- Common Pitfalls
Process Flow Diagrams
Main Process Flow
graph TD
Start([Start: New Service Requirements]) --> Phase1[Phase 1: Planning & Impact Analysis]
Phase1 --> ImpactCheck{Impact Analysis<br/>Complete?}
ImpactCheck -->|No| Phase1
ImpactCheck -->|Yes| Phase2[Phase 2: Foundation Setup]
Phase2 --> FoundationCheck{Service Starts<br/>& Health Check Passes?}
FoundationCheck -->|No| Phase2
FoundationCheck -->|Yes| Phase3[Phase 3: Core Implementation]
Phase3 --> ImplementationCheck{Business Logic<br/>Implemented?}
ImplementationCheck -->|No| Phase3
ImplementationCheck -->|Yes| Phase4[Phase 4: Integration]
Phase4 --> IntegrationCheck{Routes & Middleware<br/>Working?}
IntegrationCheck -->|No| Phase4
IntegrationCheck -->|Yes| Phase5[Phase 5: Testing]
Phase5 --> TestCheck{Tests Pass<br/>& Coverage Met?}
TestCheck -->|No| Phase5
TestCheck -->|Yes| Phase6[Phase 6: Documentation]
Phase6 --> DocCheck{Docs<br/>Complete?}
DocCheck -->|No| Phase6
DocCheck -->|Yes| Phase7[Phase 7: Cleanup & Verification]
Phase7 --> VerificationCheck{All Checks<br/>Pass?}
VerificationCheck -->|No| Phase7
VerificationCheck -->|Yes| Phase8[Phase 8: Deployment]
Phase8 --> DeployCheck{Staging<br/>Deployed?}
DeployCheck -->|No| Phase8
DeployCheck -->|Yes| Production{Deploy to<br/>Production?}
Production -->|Yes| ProdDeploy[Production Deployment]
Production -->|No| Complete([Complete])
ProdDeploy --> Complete
style Phase1 fill:#e1f5ff
style Phase2 fill:#fff4e1
style Phase3 fill:#f0e1ff
style Phase4 fill:#e1ffe1
style Phase5 fill:#ffe1e1
style Phase6 fill:#e1ffff
style Phase7 fill:#fff0e1
style Phase8 fill:#ffe1f5
style Complete fill:#d4edda
Detailed Phase Flow
graph LR
subgraph Planning["Phase 1: Planning"]
P1A[Define Scope] --> P1B[Impact Analysis]
P1B --> P1C[Dependencies Map]
P1C --> P1D[Acceptance Criteria]
end
subgraph Foundation["Phase 2: Foundation"]
F2A[Copy Template] --> F2B[Configure Package]
F2B --> F2C[Setup Database]
F2C --> F2D[Configure Docker]
F2D --> F2E[Setup Traefik]
end
subgraph Implementation["Phase 3: Implementation"]
I3A[DTOs] --> I3B[Repository]
I3B --> I3C[Service]
I3C --> I3D[Controller]
I3D --> I3E[Module]
end
subgraph Integration["Phase 4: Integration"]
IN4A[Register Routes] --> IN4B[Setup Middleware]
IN4B --> IN4C[External Services]
IN4C --> IN4D[Health Checks]
end
subgraph Testing["Phase 5: Testing"]
T5A[Unit Tests] --> T5B[Integration Tests]
T5B --> T5C[E2E Tests]
T5C --> T5D[Coverage Check]
end
subgraph Documentation["Phase 6: Documentation"]
D6A[README] --> D6B[API Docs]
D6B --> D6C[Architecture Docs]
end
subgraph Cleanup["Phase 7: Cleanup"]
C7A[Remove Temp Files] --> C7B[Update References]
C7B --> C7C[Verify Everything]
end
subgraph Deployment["Phase 8: Deployment"]
DEP8A[Staging] --> DEP8B[Verification]
DEP8B --> DEP8C[Production]
end
Planning --> Foundation
Foundation --> Implementation
Implementation --> Integration
Integration --> Testing
Testing --> Documentation
Documentation --> Cleanup
Cleanup --> Deployment
style Planning fill:#e1f5ff
style Foundation fill:#fff4e1
style Implementation fill:#f0e1ff
style Integration fill:#e1ffe1
style Testing fill:#ffe1e1
style Documentation fill:#e1ffff
style Cleanup fill:#fff0e1
style Deployment fill:#ffe1f5
Phase 1: Planning and Impact Analysis
Scope Definition
Define clearly before starting any implementation:
- Service Purpose: What business capability does it provide?
- API Surface: What endpoints are needed?
- Data Models: What data structures are required?
- Dependencies: What services/packages does it depend on?
- Breaking Changes: Any backward compatibility concerns?
Impact Analysis Checklist
Before starting implementation, identify all affected areas:
Files to Create
- Service directory: services/service-name/
- Prisma schema: services/service-name/prisma/schema.prisma
- Dockerfile: services/service-name/Dockerfile
- Service README: services/service-name/README.md
- Source files: services/service-name/src/
- Test files: services/service-name/src/__tests__/
- Configuration: services/service-name/src/config/
Files to Update
- Root package.json workspace config
- deployments/local/docker-compose.yml - Add service
- infra/traefik/dynamic/routes.yml - Add routes
- .github/workflows/ci-*.yml - Add CI workflow (if needed)
- Documentation: docs/en/guides/, docs/vi/guides/
- Scripts: scripts/db/*.sh, scripts/dev/*.sh (if service-specific)
Infrastructure Changes
- Database: New schema/tables
- Redis: New cache keys/patterns (if needed)
- Traefik: New routes and services
- Observability: New service metrics/traces
- Kubernetes: Deployment manifests (if deploying to K8s)
Dependencies Mapping
- External: Database, Redis, third-party APIs
- Internal: Shared packages (@goodgo/logger, @goodgo/types, etc.)
- Other Services: List dependent services
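Once the "Other Services" list exists, it can be kept as data and queried during impact analysis. The following is a minimal sketch, not part of the project: the service names and the `findDependents` helper are hypothetical, and the map records which services each service calls.

```typescript
// Hypothetical dependency map: each service lists the services it calls.
// Service names are illustrative, not taken from the real repository.
type DependencyMap = Record<string, string[]>;

const dependencies: DependencyMap = {
  'order-service': ['user-service', 'payment-service'],
  'payment-service': ['user-service'],
  'notification-service': ['order-service'],
  'user-service': [],
};

/**
 * Returns every service that depends on `target`, directly or transitively.
 * These are the services to re-test when `target` changes its API.
 */
function findDependents(map: DependencyMap, target: string): string[] {
  const dependents = new Set<string>();
  const queue: string[] = [target];

  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const service of Object.keys(map)) {
      if (map[service].includes(current) && !dependents.has(service)) {
        dependents.add(service);
        queue.push(service);
      }
    }
  }
  return Array.from(dependents).sort();
}

console.log(findDependents(dependencies, 'payment-service'));
// → [ 'notification-service', 'order-service' ]
```

A breaking change to `payment-service` in this made-up map would therefore call for re-testing both services listed in the output.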
Acceptance Criteria for Phase 1
- Service purpose clearly defined
- All endpoints identified
- Data models designed
- Dependencies mapped
- Files to create/update listed
- Infrastructure changes identified
Phase 2: Foundation Setup
Service Structure Creation
Template Usage
# Copy from template
cp -r services/_template services/new-service-name
cd services/new-service-name
# Update package.json name to @goodgo/new-service-name
# Update other package.json fields as needed
Required Files
| File | Purpose |
|---|---|
| package.json | Package configuration with correct name and dependencies |
| src/config/app.config.ts | Configuration with Zod validation |
| .env.example | Environment variables template |
| prisma/schema.prisma | Database schema |
| Dockerfile | Container configuration |
| jest.config.ts | Test configuration |
| tsconfig.json | TypeScript configuration |
Database Setup
# Navigate to service directory
cd services/service-name
# Create initial migration
pnpm prisma migrate dev --name init
# Generate Prisma client
pnpm prisma generate
# Verify database connection and migration state
pnpm prisma migrate status
Docker and Infrastructure Setup
Docker Compose Integration
Add service to deployments/local/docker-compose.yml:
services:
new-service-name:
build:
context: ../../services/new-service-name
dockerfile: Dockerfile
environment:
- DATABASE_URL=postgresql://user:pass@postgres:5432/db
- NODE_ENV=development
labels:
- "traefik.enable=true"
- "traefik.http.routers.new-service-name.rule=PathPrefix(`/api/v1/new-service-name`)"
- "traefik.http.services.new-service-name.loadbalancer.server.port=5000"
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:5000/health"]
interval: 30s
timeout: 10s
retries: 3
depends_on:
postgres:
condition: service_healthy
Traefik Routes
Update infra/traefik/dynamic/routes.yml:
http:
routers:
new-service-name:
rule: "PathPrefix(`/api/v1/new-service-name`)"
service: new-service-name
middlewares:
- cors
- rate-limit
- auth
services:
new-service-name:
loadBalancer:
servers:
- url: "http://new-service-name:5000"
Verification Steps for Phase 2
# 1. Verify service starts (runs in the background; stop it with `kill %1` when done)
cd services/new-service-name
pnpm dev &
sleep 5
# 2. Check health endpoint
curl -f http://localhost:5000/health
# 3. TypeScript check
pnpm typecheck
# 4. Docker build
docker build -t new-service-name .
# 5. Full stack with Docker Compose (path is relative to the service directory)
cd ../../deployments/local
docker-compose up -d new-service-name
# 6. Verify Traefik routing
curl -f http://localhost/api/v1/new-service-name/health
Acceptance Criteria for Phase 2
- Service directory created from template
- package.json configured correctly
- Environment variables defined in .env.example
- Prisma schema created and migration run
- Service starts: pnpm dev (health check passes)
- Docker build succeeds
- Service accessible via Traefik
- No TypeScript errors: pnpm typecheck
Phase 3: Core Implementation
Module Structure
Each feature module follows this pattern:
modules/feature-name/
├── feature.controller.ts # HTTP handlers
├── feature.service.ts # Business logic
├── feature.repository.ts # Data access
├── feature.dto.ts # Validation schemas (Zod)
├── feature.module.ts # Module registration
├── feature.test.ts # Unit tests
└── index.ts # Public exports
Implementation Flow
graph TD
Start[Start Implementation] --> DTOs[1. Create DTOs<br/>Zod Validation Schemas]
DTOs --> Repo[2. Create Repository<br/>Prisma Data Access]
Repo --> Service[3. Create Service<br/>Business Logic]
Service --> Controller[4. Create Controller<br/>HTTP Handlers]
Controller --> Module[5. Create Module<br/>Wire Up Components]
Module --> Test[Manual Testing]
Test --> Pass{Tests Pass?}
Pass -->|No| Repo
Pass -->|Yes| Next[Next Feature Module]
style DTOs fill:#e1f5ff
style Repo fill:#fff4e1
style Service fill:#f0e1ff
style Controller fill:#e1ffe1
style Module fill:#ffe1e1
Implementation Details
1. DTOs (Data Transfer Objects)
Create Zod schemas for request/response validation:
// feature.dto.ts
import { z } from 'zod';
export const CreateFeatureSchema = z.object({
name: z.string().min(1).max(100),
description: z.string().optional(),
isActive: z.boolean().default(true),
});
export const UpdateFeatureSchema = CreateFeatureSchema.partial();
export const FeatureResponseSchema = z.object({
id: z.string().uuid(),
name: z.string(),
description: z.string().nullable(),
isActive: z.boolean(),
createdAt: z.date(),
updatedAt: z.date(),
});
export type CreateFeatureDto = z.infer<typeof CreateFeatureSchema>;
export type UpdateFeatureDto = z.infer<typeof UpdateFeatureSchema>;
export type FeatureResponse = z.infer<typeof FeatureResponseSchema>;
2. Repository
Prisma-based data access layer:
// feature.repository.ts
import { PrismaClient, Feature } from '@prisma/client';
import { CreateFeatureDto, UpdateFeatureDto } from './feature.dto';
export class FeatureRepository {
constructor(private prisma: PrismaClient) {}
async findAll(): Promise<Feature[]> {
return this.prisma.feature.findMany();
}
async findById(id: string): Promise<Feature | null> {
return this.prisma.feature.findUnique({ where: { id } });
}
async create(data: CreateFeatureDto): Promise<Feature> {
return this.prisma.feature.create({ data });
}
async update(id: string, data: UpdateFeatureDto): Promise<Feature> {
return this.prisma.feature.update({ where: { id }, data });
}
async delete(id: string): Promise<void> {
await this.prisma.feature.delete({ where: { id } });
}
}
3. Service
Business logic layer:
// feature.service.ts
import { Logger } from '@goodgo/logger';
import { FeatureRepository } from './feature.repository';
import { CreateFeatureDto, UpdateFeatureDto } from './feature.dto';
import { NotFoundError } from '@goodgo/errors';
export class FeatureService {
private logger = new Logger('FeatureService');
constructor(private repository: FeatureRepository) {}
async getAll() {
this.logger.info('Fetching all features');
return this.repository.findAll();
}
async getById(id: string) {
const feature = await this.repository.findById(id);
if (!feature) {
throw new NotFoundError(`Feature with id ${id} not found`);
}
return feature;
}
async create(data: CreateFeatureDto) {
this.logger.info('Creating feature', { name: data.name });
return this.repository.create(data);
}
async update(id: string, data: UpdateFeatureDto) {
await this.getById(id); // Ensure exists
return this.repository.update(id, data);
}
async delete(id: string) {
await this.getById(id); // Ensure exists
await this.repository.delete(id);
}
}
4. Controller
HTTP request handlers:
// feature.controller.ts
import { Router, Request, Response } from 'express';
import { FeatureService } from './feature.service';
import { CreateFeatureSchema, UpdateFeatureSchema } from './feature.dto';
import { validateBody } from '@goodgo/middleware';
export class FeatureController {
public router = Router();
constructor(private service: FeatureService) {
this.initializeRoutes();
}
private initializeRoutes() {
this.router.get('/', this.getAll.bind(this));
this.router.get('/:id', this.getById.bind(this));
this.router.post('/', validateBody(CreateFeatureSchema), this.create.bind(this));
this.router.patch('/:id', validateBody(UpdateFeatureSchema), this.update.bind(this));
this.router.delete('/:id', this.delete.bind(this));
}
async getAll(req: Request, res: Response) {
const features = await this.service.getAll();
res.json({ success: true, data: features });
}
async getById(req: Request, res: Response) {
const feature = await this.service.getById(req.params.id);
res.json({ success: true, data: feature });
}
async create(req: Request, res: Response) {
const feature = await this.service.create(req.body);
res.status(201).json({ success: true, data: feature });
}
async update(req: Request, res: Response) {
const feature = await this.service.update(req.params.id, req.body);
res.json({ success: true, data: feature });
}
async delete(req: Request, res: Response) {
await this.service.delete(req.params.id);
res.status(204).send();
}
}
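One caveat about the async handlers above: the Express version is not stated in this document, and in Express 4 a rejected promise from an async handler is not forwarded to error middleware (Express 5 forwards it automatically). If the service runs on Express 4, a small wrapper is one common way to make sure errors such as NotFoundError reach the error handler. The sketch below uses local structural types in place of the Express types so it runs on its own; `asyncHandler` is a hypothetical helper, not something exported by `@goodgo/middleware`.

```typescript
// Minimal structural types standing in for Express's Request, Response and
// NextFunction so the sketch runs without dependencies (an assumption; in the
// service you would import the real types from 'express').
type Req = { params: Record<string, string> };
type Res = { json(body: unknown): void };
type Next = (err?: unknown) => void;
type AsyncHandler = (req: Req, res: Res, next: Next) => Promise<void>;

/** Wraps an async handler so a rejected promise is passed to next(err). */
function asyncHandler(handler: AsyncHandler) {
  return (req: Req, res: Res, next: Next): void => {
    handler(req, res, next).catch(next);
  };
}

// Hypothetical handler that fails the way service.getById() does on a miss.
const getByIdHandler = asyncHandler(async (req, _res) => {
  throw new Error(`Feature with id ${req.params.id} not found`);
});
```

With the real Express types, a route registration would then look like `this.router.get('/:id', asyncHandler(this.getById.bind(this)))`.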
5. Module
Wire up components:
// feature.module.ts
import { PrismaClient } from '@prisma/client';
import { FeatureRepository } from './feature.repository';
import { FeatureService } from './feature.service';
import { FeatureController } from './feature.controller';
export function createFeatureModule(prisma: PrismaClient) {
const repository = new FeatureRepository(prisma);
const service = new FeatureService(repository);
const controller = new FeatureController(service);
return {
repository,
service,
controller,
router: controller.router,
};
}
Acceptance Criteria for Phase 3
- All DTOs defined with Zod validation
- Repository methods implemented (CRUD operations)
- Service business logic implemented
- Controllers handle requests correctly
- Modules configured properly
- No TypeScript errors
- Manual API testing successful
Phase 4: Integration
Route Registration
Update src/routes/index.ts:
import { Router } from 'express';
import { createFeatureModule } from '../modules/feature';
import { prisma } from '../lib/prisma';
export function createRoutes(): Router {
const router = Router();
// Create modules
const featureModule = createFeatureModule(prisma);
// Register routes
router.use('/features', featureModule.router);
return router;
}
// In app.ts
app.use('/api/v1/service-name', createRoutes());
Middleware Setup
Required middlewares in order:
// app.ts
import express from 'express';
import {
correlationMiddleware,
loggingMiddleware,
metricsMiddleware,
corsMiddleware,
rateLimitMiddleware,
authMiddleware,
errorMiddleware,
} from '@goodgo/middleware';
const app = express();
// 1. Correlation ID (first - sets up request context)
app.use(correlationMiddleware());
// 2. Logging (early - logs all requests)
app.use(loggingMiddleware());
// 3. Metrics (early - tracks all requests)
app.use(metricsMiddleware());
// 4. CORS (before routes)
app.use(corsMiddleware());
// 5. Rate limiting (before routes)
app.use(rateLimitMiddleware());
// 6. Body parsing
app.use(express.json());
// 7. Authentication (for protected routes)
app.use('/api/v1', authMiddleware());
// 8. Routes
app.use('/api/v1/service-name', createRoutes());
// 9. Error handling (always last)
app.use(errorMiddleware());
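To illustrate why the order matters, here is a dependency-free model of a middleware chain. It is not Express itself, and `correlation` and `logging` are hypothetical stand-ins for `correlationMiddleware()` and `loggingMiddleware()`, whose real behavior is not shown in this document. The point is only that a middleware can use what earlier middlewares have set up, and nothing else.

```typescript
// Dependency-free model of a middleware chain (not Express itself): each
// middleware receives a shared context and a next() callback.
type Context = { correlationId?: string; log: string[] };
type Middleware = (ctx: Context, next: () => void) => void;

/** Runs the middlewares in array order, like successive app.use() calls. */
function runChain(middlewares: Middleware[], ctx: Context): Context {
  const dispatch = (index: number): void => {
    if (index < middlewares.length) {
      middlewares[index](ctx, () => dispatch(index + 1));
    }
  };
  dispatch(0);
  return ctx;
}

// Hypothetical stand-ins for correlationMiddleware() and loggingMiddleware().
const correlation: Middleware = (ctx, next) => {
  ctx.correlationId = 'req-123';
  next();
};
const logging: Middleware = (ctx, next) => {
  ctx.log.push(`request id=${ctx.correlationId ?? 'missing'}`);
  next();
};

const correctOrder = runChain([correlation, logging], { log: [] });
const wrongOrder = runChain([logging, correlation], { log: [] });

console.log(correctOrder.log); // [ 'request id=req-123' ]
console.log(wrongOrder.log);   // [ 'request id=missing' ]
```

The same reasoning explains why error handling is registered last: it can only handle errors raised by what ran before it.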
External Service Integration
HTTP Client Setup
import { HttpClient } from '@goodgo/http-client';
const externalService = new HttpClient({
baseUrl: process.env.EXTERNAL_SERVICE_URL,
timeout: 5000,
retries: 3,
});
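How `@goodgo/http-client` interprets `retries: 3` is not documented here. As an illustration of one common reading (up to three extra attempts with exponential backoff), the sketch below wraps any promise-returning operation. `withRetry`, `sleep`, and `baseDelayMs` are hypothetical names and do not describe the real client.

```typescript
/** Resolves after `ms` milliseconds. */
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * Calls `operation` up to `retries + 1` times, doubling the delay after each
 * failure. Rethrows the last error once the attempts are exhausted.
 */
async function withRetry<T>(
  operation: () => Promise<T>,
  retries = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // Delays grow as baseDelayMs * 1, 2, 4, ...
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

Retrying is generally only safe for idempotent requests (for example GET), since a retried POST may create a duplicate record.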
Redis Caching Pattern
import { Redis } from 'ioredis';
const redis = new Redis(process.env.REDIS_URL);
async function getCachedData<T>(key: string, fetcher: () => Promise<T>, ttl = 300): Promise<T> {
const cached = await redis.get(key);
if (cached) {
return JSON.parse(cached);
}
const data = await fetcher();
await redis.setex(key, ttl, JSON.stringify(data));
return data;
}
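The helper above depends on a module-level Redis client, which makes it awkward to unit test. One possible variation, sketched here under assumptions, takes the store as a parameter. `KeyValueStore`, `getCached`, and `MemoryStore` are hypothetical names; the interface mirrors the shape of the `get` and `setex` calls used above.

```typescript
// Hypothetical minimal store interface mirroring the get/setex calls used in
// the Redis example, so either a Redis client or a fake can be supplied.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  setex(key: string, ttlSeconds: number, value: string): Promise<unknown>;
}

/** Cache-aside read: return the cached value, or fetch, store and return it. */
async function getCached<T>(
  store: KeyValueStore,
  key: string,
  fetcher: () => Promise<T>,
  ttlSeconds = 300,
): Promise<T> {
  const cached = await store.get(key);
  // A missing key comes back as null; anything else is a cache hit.
  if (cached !== null) {
    return JSON.parse(cached) as T;
  }
  const data = await fetcher();
  await store.setex(key, ttlSeconds, JSON.stringify(data));
  return data;
}

// In-memory fake for tests. It ignores the TTL, so entries never expire.
class MemoryStore implements KeyValueStore {
  private entries = new Map<string, string>();

  async get(key: string): Promise<string | null> {
    return this.entries.get(key) ?? null;
  }

  async setex(key: string, _ttlSeconds: number, value: string): Promise<void> {
    this.entries.set(key, value);
  }
}
```

In a test, passing `new MemoryStore()` lets you assert that the fetcher runs once for two reads of the same key, without a running Redis instance.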
Acceptance Criteria for Phase 4
- All routes registered and accessible
- Middlewares applied in correct order
- Error handling works for all scenarios
- External services integrated (if any)
- Caching implemented (if needed)
- Health check endpoint works: /health
- Metrics endpoint works: /metrics
Phase 5: Testing
Test Structure
| Type | Location | Purpose |
|---|---|---|
| Unit Tests | Next to source (*.test.ts) | Test isolated components with mocks |
| Integration Tests | src/__tests__/*.integration.ts | Test component interactions |
| E2E Tests | src/__tests__/*.e2e.ts | Test full API workflows |
Test Coverage Targets
| Component | Minimum | Recommended |
|---|---|---|
| Overall | 70% | 80% |
| Critical Paths | 90% | 95% |
| Repositories | 80% | 90% |
| Services | 80% | 90% |
| Controllers | 70% | 80% |
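The minimums in this table can be enforced rather than checked by hand. Below is a sketch of how they might be expressed with Jest's `coverageThreshold` option in jest.config.ts. The glob paths are assumptions based on the module layout from Phase 3, and the "Critical Paths" row is left out because this document does not say which files it covers.

```typescript
// jest.config.ts (sketch). The glob paths are assumptions based on the
// module layout described in Phase 3; adjust them to the real source tree.
const config = {
  coverageThreshold: {
    // "Overall" row: minimum 70%
    global: { lines: 70, statements: 70 },
    // Per-component minimums from the table
    './src/modules/**/*.repository.ts': { lines: 80 },
    './src/modules/**/*.service.ts': { lines: 80 },
    './src/modules/**/*.controller.ts': { lines: 70 },
  },
};

export default config;
```

With thresholds configured this way, a coverage run fails when a threshold is not met, which turns the table into a CI gate.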
Testing Checklist
Unit Tests
- Repository tests: All CRUD operations
- Service tests: Business logic, error handling, edge cases
- Controller tests: Request/response handling, validation
- DTO tests: Validation rules, edge cases
- Utility tests: Helper functions
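As an illustration of the "Service tests" item, the sketch below isolates a service from the database with an in-memory fake instead of Prisma. The classes are simplified stand-ins for the Phase 3 code so the example runs without Prisma, Jest, or the `@goodgo` packages; `FeatureStore`, `FakeFeatureStore`, and `runServiceChecks` are hypothetical names.

```typescript
// Simplified stand-ins for the Phase 3 classes so the sketch runs without
// Prisma or the @goodgo packages (assumption: the real service behaves alike).
interface Feature {
  id: string;
  name: string;
}

interface FeatureStore {
  findById(id: string): Promise<Feature | null>;
}

class NotFoundError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'NotFoundError';
  }
}

class FeatureService {
  constructor(private readonly repository: FeatureStore) {}

  async getById(id: string): Promise<Feature> {
    const feature = await this.repository.findById(id);
    if (!feature) {
      throw new NotFoundError(`Feature with id ${id} not found`);
    }
    return feature;
  }
}

// In-memory fake: no database, fully deterministic.
class FakeFeatureStore implements FeatureStore {
  constructor(private readonly rows: Feature[]) {}

  async findById(id: string): Promise<Feature | null> {
    return this.rows.find((row) => row.id === id) ?? null;
  }
}

/** Runs a happy-path check and an error-path check against the fake. */
async function runServiceChecks(): Promise<string[]> {
  const service = new FeatureService(
    new FakeFeatureStore([{ id: 'a1', name: 'Search' }]),
  );
  const results: string[] = [];

  // Happy path: an existing id resolves to the stored row.
  const found = await service.getById('a1');
  results.push(found.name === 'Search' ? 'found: pass' : 'found: fail');

  // Error path: a missing id rejects with NotFoundError.
  try {
    await service.getById('missing');
    results.push('not found: fail');
  } catch (error) {
    const isNotFound = error instanceof Error && error.name === 'NotFoundError';
    results.push(isNotFound ? 'not found: pass' : 'not found: fail');
  }
  return results;
}
```

In a Jest test file the same two checks would become `it(...)` blocks, with the error path written as `await expect(service.getById('missing')).rejects.toThrow(...)`.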
Integration Tests
- Module integration: Controller -> Service -> Repository
- Database operations: Real Prisma client with test DB
- Middleware chain: Request flow through middlewares
- External service mocks: HTTP client integrations
E2E Tests
- API endpoints: Full request/response cycle
- Authentication: Protected routes, token validation
- Error scenarios: 400, 401, 403, 404, 500 responses
- Health checks: /health endpoint
- Concurrent requests: Race conditions
Running Tests
# Run all tests
pnpm test
# Run with coverage
pnpm test:coverage
# Run specific test file
pnpm test -- feature.test.ts
# Run E2E tests
pnpm test:e2e
# Watch mode
pnpm test:watch
Acceptance Criteria for Phase 5
- All unit tests pass: pnpm test
- Integration tests pass
- E2E tests pass
- Coverage meets thresholds: pnpm test:coverage
- No test warnings or errors
- Tests run in CI pipeline successfully
Phase 6: Documentation
Required Documentation
Service README
Required sections:
- Service overview (bilingual EN/VI)
- Features list
- Prerequisites
- Quick start guide
- Configuration reference (environment variables table)
- API endpoints overview
- Development guide
- Testing instructions
API Documentation (Swagger/OpenAPI)
// src/docs/swagger.ts
import swaggerJSDoc from 'swagger-jsdoc';
const options = {
definition: {
openapi: '3.0.0',
info: {
title: 'Service Name API',
version: '1.0.0',
description: 'API documentation for Service Name',
},
servers: [
{ url: '/api/v1/service-name' },
],
},
apis: ['./src/modules/**/*.controller.ts'],
};
export const swaggerSpec = swaggerJSDoc(options);
Architecture Documentation (if complex)
- ARCHITECTURE.en.md / ARCHITECTURE.vi.md
- System design diagrams
- Data flow descriptions
- Component interactions
Documentation Checklist
- README is comprehensive and bilingual
- Swagger docs accessible: /api-docs
- All endpoints appear in Swagger
- Request/response examples provided
- Environment variables documented
- Error responses documented
- Architecture docs created (if needed)
Acceptance Criteria for Phase 6
- README is comprehensive and bilingual
- Swagger docs accessible: /api-docs
- All endpoints documented with examples
- Documentation reviewed and accurate
Phase 7: Cleanup and Verification
Verification Process Flow
graph TD
Start[Start Cleanup] --> Remove[Remove Temporary Files]
Remove --> Update{Is Migration?}
Update -->|Yes| RefUpdate[Update References<br/>grep & replace]
Update -->|No| Verify[Run Verification]
RefUpdate --> Verify
Verify --> TypeCheck[TypeScript Check]
TypeCheck --> LintCheck[Lint Check]
LintCheck --> TestCheck[Test Check]
TestCheck --> BuildCheck[Build Check]
BuildCheck --> DockerCheck[Docker Build]
DockerCheck --> HealthCheck[Health Check]
HealthCheck --> TraefikCheck[Traefik Check]
TraefikCheck --> AllPass{All Pass?}
AllPass -->|No| Fix[Fix Issues]
Fix --> Verify
AllPass -->|Yes| Complete[Phase Complete]
style Remove fill:#ffe1e1
style RefUpdate fill:#fff4e1
style Verify fill:#e1ffe1
style Complete fill:#d4edda
Cleanup Checklist
Remove Temporary Files
- Remove backup directories (e.g., service-name.backup/)
- Remove temporary status files (e.g., *_STATUS.md, *_CHECKLIST.md)
- Remove debug/scratch files
- Clean up unused imports
- Remove commented-out code
- Remove console.log statements (use logger instead)
Reference Updates (for migrations/renames)
# Find all references
grep -r "old-service-name" . --exclude-dir=node_modules --exclude-dir=.git
# Update checklist:
- [ ] Package names: `@goodgo/old-name` -> `@goodgo/new-name`
- [ ] Service paths: `services/old-name` -> `services/new-name`
- [ ] Docker images: `goodgo/old-name` -> `goodgo/new-name`
- [ ] Deployment names: `old-name` -> `new-name`
- [ ] Environment variables updated
- [ ] CI/CD workflows updated
- [ ] Scripts updated (if needed)
- [ ] Documentation updated (except historical context)
Comprehensive Verification Steps
# 1. Service starts successfully
cd services/service-name
pnpm dev &
sleep 5
curl http://localhost:5000/health
# 2. Type checking passes
pnpm typecheck
# 3. Linting passes
pnpm lint
# 4. Tests pass with coverage
pnpm test
pnpm test:coverage
# 5. Build succeeds
pnpm build
# 6. Docker build succeeds
docker build -t service-name .
# 7. Docker Compose works
cd ../../deployments/local
docker-compose up -d service-name
docker-compose ps
# 8. Service accessible via Traefik
curl http://localhost/api/v1/service-name/health
# 9. No broken references (if migration)
grep -r "old-reference" . --exclude-dir=node_modules --exclude-dir=.git
Final Verification Checklist
Code Quality
- No TypeScript errors
- No linting errors
- No unused imports/variables
- Code follows project conventions
- Comments are clear (bilingual if needed)
Functionality
- Service starts without errors
- Health check works
- All API endpoints functional
- Database operations work
- External integrations work (if any)
Testing
- All tests pass
- Coverage meets requirements
- E2E tests verify full workflows
Documentation
- README is complete and accurate
- API documentation is up-to-date
- Code comments are helpful
Infrastructure
- Docker image builds
- Service works in Docker Compose
- Traefik routes configured correctly
- Environment variables documented
Cleanup
- Temporary files removed
- All references updated (if migration)
- No orphaned files
Acceptance Criteria for Phase 7
- All cleanup tasks completed
- All verification steps pass
- No broken references or links
- Code is production-ready
- Documentation is complete
Phase 8: Deployment
Staging Deployment
Pre-deployment Checklist
- All Phase 7 verification passed
- Database migrations tested: pnpm prisma migrate deploy
- Environment variables configured in staging
- Kubernetes manifests reviewed
- Secrets configured in Kubernetes
- Health checks configured
- Resource limits set appropriately
Deployment Steps
# 1. Build and push Docker image
docker build -t goodgo/service-name:latest .
docker push goodgo/service-name:latest
# 2. Apply Kubernetes configs
kubectl apply -f deployments/staging/kubernetes/service-name.yaml
kubectl apply -f deployments/staging/kubernetes/service-name-configmap.yaml
# 3. Wait for rollout
kubectl rollout status deployment/service-name -n staging
# 4. Verify deployment
kubectl get pods -n staging -l app=service-name
kubectl logs -f deployment/service-name -n staging
# 5. Health check
curl https://staging-api.example.com/api/v1/service-name/health
# 6. Run smoke tests
pnpm test:smoke -- --env=staging
Production Deployment
Pre-production Checklist
- Staging has run with all tests passing for at least 24 hours
- Database backup created
- Rollback plan documented and tested
- Monitoring dashboards ready
- Alerting configured
- On-call team notified
- Deployment window approved
Production Deployment Steps
# 1. Create database backup
kubectl exec -n production deployment/postgres -- pg_dump -U postgres db > backup.sql
# 2. Tag release
git tag v1.0.0
docker tag goodgo/service-name:latest goodgo/service-name:v1.0.0
docker push goodgo/service-name:v1.0.0
# 3. Update Kubernetes manifest with new image tag
# Edit deployments/production/kubernetes/service-name.yaml
# 4. Apply to production
kubectl apply -f deployments/production/kubernetes/service-name.yaml
# 5. Monitor rollout
kubectl rollout status deployment/service-name -n production
# 6. Verify
curl https://api.example.com/api/v1/service-name/health
Acceptance Criteria for Phase 8
- Service deployed to staging successfully
- All staging tests pass
- Monitoring shows healthy metrics
- Production deployment completed (if applicable)
- Post-deployment verification successful
Rollback Strategies
When to Rollback
Trigger a rollback when:
- Critical errors in staging/production
- Performance degradation (response time > 2x normal)
- Data integrity issues detected
- Security vulnerabilities discovered
- Error rate exceeds threshold (> 1%)
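The two numeric triggers in this list (response time above 2x normal, error rate above 1%) can be written down as an explicit check. The sketch below is illustrative only: `ServiceMetrics` and `rollbackReasons` are hypothetical names, and the metric fields would need to map to whatever the monitoring stack actually exposes.

```typescript
// Hypothetical metric snapshot; field names are illustrative and would map to
// whatever the monitoring stack exposes.
interface ServiceMetrics {
  errorRate: number;              // fraction of failed requests, e.g. 0.004 = 0.4%
  responseTimeMs: number;         // current response time
  baselineResponseTimeMs: number; // response time before the deployment
}

/** Returns the reasons a rollback is warranted; an empty list means none apply. */
function rollbackReasons(metrics: ServiceMetrics): string[] {
  const reasons: string[] = [];
  if (metrics.errorRate > 0.01) {
    reasons.push('error rate above 1%');
  }
  if (metrics.responseTimeMs > 2 * metrics.baselineResponseTimeMs) {
    reasons.push('response time more than 2x normal');
  }
  return reasons;
}

console.log(
  rollbackReasons({ errorRate: 0.03, responseTimeMs: 120, baselineResponseTimeMs: 100 }),
);
// → [ 'error rate above 1%' ]
```

The remaining triggers (data integrity issues, security vulnerabilities) need human judgment and are deliberately not modeled here.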
Quick Rollback Steps
# 1. Identify previous working version
kubectl rollout history deployment/service-name -n staging
# 2. Rollback to previous version
kubectl rollout undo deployment/service-name -n staging
# 3. Verify rollback
kubectl rollout status deployment/service-name -n staging
# 4. Check health
curl https://staging-api.example.com/api/v1/service-name/health
Database Rollback
If schema changes were made:
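# 1. Check which migrations have been applied
pnpm prisma migrate status
# 2. Restore from backup (if data changes); -i passes backup.sql to psql via stdin
kubectl exec -i -n staging deployment/postgres -- psql -U postgres db < backup.sql
# 3. Reset migrations (development only: drops all data and reapplies migrations)
pnpm prisma migrate reset
# 4. Or mark a failed migration as rolled back in the migration history
#    (this updates Prisma's records only; it does not undo schema changes)
pnpm prisma migrate resolve --rolled-back <migration-name>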
Rollback Verification
After rollback, verify:
- Service health check passes
- All endpoints responding
- No error spikes in logs
- Database connectivity restored
- External service integrations working
Common Pitfalls
1. Skipping Impact Analysis
Problem: Missing updates in scripts, configs, or documentation.
Solution: Always complete the impact analysis checklist before coding:
# Find all files that might reference the service
grep -r "service-name" . --exclude-dir=node_modules --exclude-dir=.git
2. No Phase Verification
Problem: Accumulated issues that are hard to debug later.
Solution: Complete phase checklist before moving on:
pnpm typecheck && pnpm lint && pnpm test
3. Deferring Cleanup
Problem: Technical debt accumulates, temporary files forgotten.
Solution: Clean up as you go, not at the end:
# Regular cleanup
rm -rf *.backup/ *_STATUS.md *_CHECKLIST.md
4. Incomplete Testing
Problem: Untested edge cases and error scenarios surface for the first time in production.
Solution: Write tests alongside implementation:
- Test happy path AND error cases
- Test boundary conditions
- Test concurrent access
5. Poor Documentation
Problem: Difficult maintenance, onboarding issues.
Solution: Document as you implement:
- Add JSDoc comments to functions
- Update README with new features
- Add Swagger annotations to endpoints
6. No Rollback Plan
Problem: Difficult recovery from deployment failures.
Solution: Always prepare before deployment:
- Database backup
- Previous image tag recorded
- Rollback commands documented and tested
7. Hardcoded Configuration
Problem: Values hardcoded for one environment cause wrong or inconsistent behavior in others.
Solution: Use environment variables with Zod validation:
import { z } from 'zod';
// Fails fast at startup if a variable is missing or malformed
const config = z.object({
DATABASE_URL: z.string().url(),
PORT: z.coerce.number().default(5000),
}).parse(process.env);
8. Missing Health Checks
Problem: Unhealthy services not detected by load balancer.
Solution: Implement comprehensive health checks:
app.get('/health', async (req, res) => {
const dbHealthy = await checkDatabase();
const redisHealthy = await checkRedis();
if (dbHealthy && redisHealthy) {
res.json({ status: 'healthy' });
} else {
res.status(503).json({ status: 'unhealthy' });
}
});
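The handler above calls `checkDatabase()` and `checkRedis()` without defining them. One way they might be built, sketched here under assumptions, is on top of a helper that turns any probe into a boolean and gives up after a timeout, so a hung dependency cannot hang `/health`. `checkWithTimeout` is a hypothetical helper and the probes are stubs. In the service they might be a `SELECT 1` query through Prisma and a Redis ping.

```typescript
/**
 * Runs a probe and resolves to true only if it succeeds within `timeoutMs`.
 * Errors and timeouts both resolve to false so /health never hangs or throws.
 */
function checkWithTimeout(
  probe: () => Promise<unknown>,
  timeoutMs = 2000,
): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    // If the probe has not settled in time, report unhealthy.
    const timer = setTimeout(() => resolve(false), timeoutMs);
    Promise.resolve()
      .then(probe)
      .then(
        () => resolve(true),
        () => resolve(false),
      )
      .finally(() => clearTimeout(timer));
  });
}

// Hypothetical stub probes standing in for real database and Redis calls.
const checkDatabase = (): Promise<boolean> =>
  checkWithTimeout(async () => 'ok');
const checkRedis = (): Promise<boolean> =>
  checkWithTimeout(() => new Promise<never>(() => {}), 50); // never settles
```

With these stubs, `checkDatabase()` resolves to `true` and `checkRedis()` resolves to `false` after 50 ms, so the handler would answer 503.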