pos-system/.cursor/skills/microservices-development-process/references/REFERENCE.md
Ho Ngoc Hai ed1eb2b6e4 Update skill metadata and enhance documentation across multiple skills
- Change 'dependencies' to 'compatibility' in various skills for consistency
- Add detailed examples and best practices to improve clarity in api-design, api-gateway-advanced, data-consistency-patterns, database-prisma, deployment-kubernetes, event-driven-architecture, inter-service-communication, observability-monitoring, security, and testing-patterns
- Refine Common Mistakes sections with BAD/GOOD code examples for better learning

All skills now feature improved structure and comprehensive guidance for developers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-01-01 16:14:11 +07:00

Microservices Development Process - Detailed Reference

This document contains detailed phase descriptions, diagrams, verification steps, and rollback strategies for microservices development.

Process Flow Diagrams

Main Process Flow

graph TD
    Start([Start: New Service Requirements]) --> Phase1[Phase 1: Planning & Impact Analysis]
    Phase1 --> ImpactCheck{Impact Analysis<br/>Complete?}
    ImpactCheck -->|No| Phase1
    ImpactCheck -->|Yes| Phase2[Phase 2: Foundation Setup]

    Phase2 --> FoundationCheck{Service Starts<br/>& Health Check Passes?}
    FoundationCheck -->|No| Phase2
    FoundationCheck -->|Yes| Phase3[Phase 3: Core Implementation]

    Phase3 --> ImplementationCheck{Business Logic<br/>Implemented?}
    ImplementationCheck -->|No| Phase3
    ImplementationCheck -->|Yes| Phase4[Phase 4: Integration]

    Phase4 --> IntegrationCheck{Routes & Middleware<br/>Working?}
    IntegrationCheck -->|No| Phase4
    IntegrationCheck -->|Yes| Phase5[Phase 5: Testing]

    Phase5 --> TestCheck{Tests Pass<br/>& Coverage Met?}
    TestCheck -->|No| Phase5
    TestCheck -->|Yes| Phase6[Phase 6: Documentation]

    Phase6 --> DocCheck{Docs<br/>Complete?}
    DocCheck -->|No| Phase6
    DocCheck -->|Yes| Phase7[Phase 7: Cleanup & Verification]

    Phase7 --> VerificationCheck{All Checks<br/>Pass?}
    VerificationCheck -->|No| Phase7
    VerificationCheck -->|Yes| Phase8[Phase 8: Deployment]

    Phase8 --> DeployCheck{Staging<br/>Deployed?}
    DeployCheck -->|No| Phase8
    DeployCheck -->|Yes| Production{Deploy to<br/>Production?}
    Production -->|Yes| ProdDeploy[Production Deployment]
    Production -->|No| Complete([Complete])
    ProdDeploy --> Complete

    style Phase1 fill:#e1f5ff
    style Phase2 fill:#fff4e1
    style Phase3 fill:#f0e1ff
    style Phase4 fill:#e1ffe1
    style Phase5 fill:#ffe1e1
    style Phase6 fill:#e1ffff
    style Phase7 fill:#fff0e1
    style Phase8 fill:#ffe1f5
    style Complete fill:#d4edda

Detailed Phase Flow

graph LR
    subgraph Planning["Phase 1: Planning"]
        P1A[Define Scope] --> P1B[Impact Analysis]
        P1B --> P1C[Dependencies Map]
        P1C --> P1D[Acceptance Criteria]
    end

    subgraph Foundation["Phase 2: Foundation"]
        F2A[Copy Template] --> F2B[Configure Package]
        F2B --> F2C[Setup Database]
        F2C --> F2D[Configure Docker]
        F2D --> F2E[Setup Traefik]
    end

    subgraph Implementation["Phase 3: Implementation"]
        I3A[DTOs] --> I3B[Repository]
        I3B --> I3C[Service]
        I3C --> I3D[Controller]
        I3D --> I3E[Module]
    end

    subgraph Integration["Phase 4: Integration"]
        IN4A[Register Routes] --> IN4B[Setup Middleware]
        IN4B --> IN4C[External Services]
        IN4C --> IN4D[Health Checks]
    end

    subgraph Testing["Phase 5: Testing"]
        T5A[Unit Tests] --> T5B[Integration Tests]
        T5B --> T5C[E2E Tests]
        T5C --> T5D[Coverage Check]
    end

    subgraph Documentation["Phase 6: Documentation"]
        D6A[README] --> D6B[API Docs]
        D6B --> D6C[Architecture Docs]
    end

    subgraph Cleanup["Phase 7: Cleanup"]
        C7A[Remove Temp Files] --> C7B[Update References]
        C7B --> C7C[Verify Everything]
    end

    subgraph Deployment["Phase 8: Deployment"]
        DEP8A[Staging] --> DEP8B[Verification]
        DEP8B --> DEP8C[Production]
    end

    Planning --> Foundation
    Foundation --> Implementation
    Implementation --> Integration
    Integration --> Testing
    Testing --> Documentation
    Documentation --> Cleanup
    Cleanup --> Deployment

    style Planning fill:#e1f5ff
    style Foundation fill:#fff4e1
    style Implementation fill:#f0e1ff
    style Integration fill:#e1ffe1
    style Testing fill:#ffe1e1
    style Documentation fill:#e1ffff
    style Cleanup fill:#fff0e1
    style Deployment fill:#ffe1f5

Phase 1: Planning and Impact Analysis

Scope Definition

Define clearly before starting any implementation:

  • Service Purpose: What business capability does it provide?
  • API Surface: What endpoints are needed?
  • Data Models: What data structures are required?
  • Dependencies: What services/packages does it depend on?
  • Breaking Changes: Any backward compatibility concerns?

Impact Analysis Checklist

Before starting implementation, identify all affected areas:

Files to Create

  • Service directory: services/service-name/
  • Prisma schema: services/service-name/prisma/schema.prisma
  • Dockerfile: services/service-name/Dockerfile
  • Service README: services/service-name/README.md
  • Source files: services/service-name/src/
  • Test files: services/service-name/src/__tests__/
  • Configuration: services/service-name/src/config/

Files to Update

  • Root package.json workspace config
  • deployments/local/docker-compose.yml - Add service
  • infra/traefik/dynamic/routes.yml - Add routes
  • .github/workflows/ci-*.yml - Add CI workflow (if needed)
  • Documentation: docs/en/guides/, docs/vi/guides/
  • Scripts: scripts/db/*.sh, scripts/dev/*.sh (if service-specific)

Infrastructure Changes

  • Database: New schema/tables
  • Redis: New cache keys/patterns (if needed)
  • Traefik: New routes and services
  • Observability: New service metrics/traces
  • Kubernetes: Deployment manifests (if deploying to K8s)

Dependencies Mapping

  • External: Database, Redis, third-party APIs
  • Internal: Shared packages (@goodgo/logger, @goodgo/types, etc.)
  • Other Services: List dependent services

Acceptance Criteria for Phase 1

  • Service purpose clearly defined
  • All endpoints identified
  • Data models designed
  • Dependencies mapped
  • Files to create/update listed
  • Infrastructure changes identified

Phase 2: Foundation Setup

Service Structure Creation

Template Usage

# Copy from template
cp -r services/_template services/new-service-name
cd services/new-service-name

# Update package.json name to @goodgo/new-service-name
# Update other package.json fields as needed

Required Files

| File | Purpose |
| --- | --- |
| package.json | Package configuration with correct name and dependencies |
| src/config/app.config.ts | Configuration with Zod validation |
| .env.example | Environment variables template |
| prisma/schema.prisma | Database schema |
| Dockerfile | Container configuration |
| jest.config.ts | Test configuration |
| tsconfig.json | TypeScript configuration |

Database Setup

# Navigate to service directory
cd services/service-name

# Create initial migration
pnpm prisma migrate dev --name init

# Generate Prisma client
pnpm prisma generate

# Verify database connection and migration state
pnpm prisma migrate status

Docker and Infrastructure Setup

Docker Compose Integration

Add service to deployments/local/docker-compose.yml:

services:
  new-service-name:
    build:
      context: ../../services/new-service-name
      dockerfile: Dockerfile
    environment:
      - DATABASE_URL=postgresql://user:pass@postgres:5432/db
      - NODE_ENV=development
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.new-service-name.rule=PathPrefix(`/api/v1/new-service-name`)"
      - "traefik.http.services.new-service-name.loadbalancer.server.port=5000"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:5000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    depends_on:
      postgres:
        condition: service_healthy

Traefik Routes

Update infra/traefik/dynamic/routes.yml:

http:
  routers:
    new-service-name:
      rule: "PathPrefix(`/api/v1/new-service-name`)"
      service: new-service-name
      middlewares:
        - cors
        - rate-limit
        - auth

  services:
    new-service-name:
      loadBalancer:
        servers:
          - url: "http://new-service-name:5000"

Verification Steps for Phase 2

# 1. Verify service starts
cd services/service-name
pnpm dev

# 2. Check health endpoint
curl http://localhost:5000/health

# 3. TypeScript check
pnpm typecheck

# 4. Docker build
docker build -t service-name .

# 5. Full stack with Docker Compose (path is relative to services/service-name from step 1)
cd ../../deployments/local
docker-compose up new-service-name

# 6. Verify Traefik routing
curl http://localhost/api/v1/new-service-name/health

Acceptance Criteria for Phase 2

  • Service directory created from template
  • package.json configured correctly
  • Environment variables defined in .env.example
  • Prisma schema created and migration run
  • Service starts: pnpm dev (health check passes)
  • Docker build succeeds
  • Service accessible via Traefik
  • No TypeScript errors: pnpm typecheck

Phase 3: Core Implementation

Module Structure

Each feature module follows this pattern:

modules/feature-name/
├── feature.controller.ts    # HTTP handlers
├── feature.service.ts       # Business logic
├── feature.repository.ts    # Data access
├── feature.dto.ts          # Validation schemas (Zod)
├── feature.module.ts       # Module registration
├── feature.test.ts         # Unit tests
└── index.ts                # Public exports

Implementation Flow

graph TD
    Start[Start Implementation] --> DTOs[1. Create DTOs<br/>Zod Validation Schemas]
    DTOs --> Repo[2. Create Repository<br/>Prisma Data Access]
    Repo --> Service[3. Create Service<br/>Business Logic]
    Service --> Controller[4. Create Controller<br/>HTTP Handlers]
    Controller --> Module[5. Create Module<br/>Wire Up Components]
    Module --> Test[Manual Testing]
    Test --> Pass{Tests Pass?}
    Pass -->|No| Repo
    Pass -->|Yes| Next[Next Feature Module]

    style DTOs fill:#e1f5ff
    style Repo fill:#fff4e1
    style Service fill:#f0e1ff
    style Controller fill:#e1ffe1
    style Module fill:#ffe1e1

Implementation Details

1. DTOs (Data Transfer Objects)

Create Zod schemas for request/response validation:

// feature.dto.ts
import { z } from 'zod';

export const CreateFeatureSchema = z.object({
  name: z.string().min(1).max(100),
  description: z.string().optional(),
  isActive: z.boolean().default(true),
});

export const UpdateFeatureSchema = CreateFeatureSchema.partial();

export const FeatureResponseSchema = z.object({
  id: z.string().uuid(),
  name: z.string(),
  description: z.string().nullable(),
  isActive: z.boolean(),
  createdAt: z.date(),
  updatedAt: z.date(),
});

export type CreateFeatureDto = z.infer<typeof CreateFeatureSchema>;
export type UpdateFeatureDto = z.infer<typeof UpdateFeatureSchema>;
export type FeatureResponse = z.infer<typeof FeatureResponseSchema>;

2. Repository

Prisma-based data access layer:

// feature.repository.ts
import { PrismaClient, Feature } from '@prisma/client';
import { CreateFeatureDto, UpdateFeatureDto } from './feature.dto';

export class FeatureRepository {
  constructor(private prisma: PrismaClient) {}

  async findAll(): Promise<Feature[]> {
    return this.prisma.feature.findMany();
  }

  async findById(id: string): Promise<Feature | null> {
    return this.prisma.feature.findUnique({ where: { id } });
  }

  async create(data: CreateFeatureDto): Promise<Feature> {
    return this.prisma.feature.create({ data });
  }

  async update(id: string, data: UpdateFeatureDto): Promise<Feature> {
    return this.prisma.feature.update({ where: { id }, data });
  }

  async delete(id: string): Promise<void> {
    await this.prisma.feature.delete({ where: { id } });
  }
}

3. Service

Business logic layer:

// feature.service.ts
import { Logger } from '@goodgo/logger';
import { FeatureRepository } from './feature.repository';
import { CreateFeatureDto, UpdateFeatureDto } from './feature.dto';
import { NotFoundError } from '@goodgo/errors';

export class FeatureService {
  private logger = new Logger('FeatureService');

  constructor(private repository: FeatureRepository) {}

  async getAll() {
    this.logger.info('Fetching all features');
    return this.repository.findAll();
  }

  async getById(id: string) {
    const feature = await this.repository.findById(id);
    if (!feature) {
      throw new NotFoundError(`Feature with id ${id} not found`);
    }
    return feature;
  }

  async create(data: CreateFeatureDto) {
    this.logger.info('Creating feature', { name: data.name });
    return this.repository.create(data);
  }

  async update(id: string, data: UpdateFeatureDto) {
    await this.getById(id); // Ensure exists
    return this.repository.update(id, data);
  }

  async delete(id: string) {
    await this.getById(id); // Ensure exists
    await this.repository.delete(id);
  }
}

4. Controller

HTTP request handlers:

// feature.controller.ts
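import { Router, Request, Response, NextFunction } from 'express';
import { FeatureService } from './feature.service';
import { CreateFeatureSchema, UpdateFeatureSchema } from './feature.dto';
import { validateBody } from '@goodgo/middleware';

type AsyncHandler = (req: Request, res: Response) => Promise<void>;

export class FeatureController {
  public router = Router();

  constructor(private service: FeatureService) {
    this.initializeRoutes();
  }

  // Express 4 does not forward rejected promises to the error middleware,
  // so async handlers are wrapped to pass failures (e.g. NotFoundError) to next().
  private wrap(handler: AsyncHandler) {
    return (req: Request, res: Response, next: NextFunction) => {
      handler.call(this, req, res).catch(next);
    };
  }

  private initializeRoutes() {
    this.router.get('/', this.wrap(this.getAll));
    this.router.get('/:id', this.wrap(this.getById));
    this.router.post('/', validateBody(CreateFeatureSchema), this.wrap(this.create));
    this.router.patch('/:id', validateBody(UpdateFeatureSchema), this.wrap(this.update));
    this.router.delete('/:id', this.wrap(this.delete));
  }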

  async getAll(req: Request, res: Response) {
    const features = await this.service.getAll();
    res.json({ success: true, data: features });
  }

  async getById(req: Request, res: Response) {
    const feature = await this.service.getById(req.params.id);
    res.json({ success: true, data: feature });
  }

  async create(req: Request, res: Response) {
    const feature = await this.service.create(req.body);
    res.status(201).json({ success: true, data: feature });
  }

  async update(req: Request, res: Response) {
    const feature = await this.service.update(req.params.id, req.body);
    res.json({ success: true, data: feature });
  }

  async delete(req: Request, res: Response) {
    await this.service.delete(req.params.id);
    res.status(204).send();
  }
}

5. Module

Wire up components:

// feature.module.ts
import { PrismaClient } from '@prisma/client';
import { FeatureRepository } from './feature.repository';
import { FeatureService } from './feature.service';
import { FeatureController } from './feature.controller';

export function createFeatureModule(prisma: PrismaClient) {
  const repository = new FeatureRepository(prisma);
  const service = new FeatureService(repository);
  const controller = new FeatureController(service);

  return {
    repository,
    service,
    controller,
    router: controller.router,
  };
}

Acceptance Criteria for Phase 3

  • All DTOs defined with Zod validation
  • Repository methods implemented (CRUD operations)
  • Service business logic implemented
  • Controllers handle requests correctly
  • Modules configured properly
  • No TypeScript errors
  • Manual API testing successful

Phase 4: Integration

Route Registration

Update src/routes/index.ts:

import { Router } from 'express';
import { createFeatureModule } from '../modules/feature';
import { prisma } from '../lib/prisma';

export function createRoutes(): Router {
  const router = Router();

  // Create modules
  const featureModule = createFeatureModule(prisma);

  // Register routes
  router.use('/features', featureModule.router);

  return router;
}

// In app.ts
app.use('/api/v1/service-name', createRoutes());

Middleware Setup

Required middlewares in order:

// app.ts
import express from 'express';
import {
  correlationMiddleware,
  loggingMiddleware,
  metricsMiddleware,
  corsMiddleware,
  rateLimitMiddleware,
  authMiddleware,
  errorMiddleware,
} from '@goodgo/middleware';

const app = express();

// 1. Correlation ID (first - sets up request context)
app.use(correlationMiddleware());

// 2. Logging (early - logs all requests)
app.use(loggingMiddleware());

// 3. Metrics (early - tracks all requests)
app.use(metricsMiddleware());

// 4. CORS (before routes)
app.use(corsMiddleware());

// 5. Rate limiting (before routes)
app.use(rateLimitMiddleware());

// 6. Body parsing
app.use(express.json());

// 7. Authentication (for protected routes)
app.use('/api/v1', authMiddleware());

// 8. Routes
app.use('/api/v1/service-name', createRoutes());

// 9. Error handling (always last)
app.use(errorMiddleware());
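
The error middleware registered last is what turns errors such as the NotFoundError thrown by FeatureService into HTTP responses. The internals of errorMiddleware in @goodgo/middleware are not shown in this document, so the following is only a minimal sketch of the idea. The HttpError base class, the statusCode field, the ValidationError class, and the response body shape are all assumptions made for illustration.

```typescript
// Hypothetical stand-ins for the error classes in @goodgo/errors.
class HttpError extends Error {
  constructor(message: string, public readonly statusCode: number) {
    super(message);
    // Keeps instanceof working when compiled to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
  }
}

class NotFoundError extends HttpError {
  constructor(message: string) {
    super(message, 404);
  }
}

class ValidationError extends HttpError {
  constructor(message: string) {
    super(message, 400);
  }
}

interface ErrorResponse {
  status: number;
  body: { success: false; error: { name: string; message: string } };
}

// Map any thrown value to a status code and a JSON body.
// Unknown errors become a generic 500 so internal details are not leaked.
function toErrorResponse(err: unknown): ErrorResponse {
  if (err instanceof HttpError) {
    return {
      status: err.statusCode,
      body: { success: false, error: { name: err.name, message: err.message } },
    };
  }
  return {
    status: 500,
    body: { success: false, error: { name: 'InternalError', message: 'Internal server error' } },
  };
}
```

In an Express error handler (a four-argument function registered after the routes), the result could be sent with res.status(result.status).json(result.body).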

External Service Integration

HTTP Client Setup

import { HttpClient } from '@goodgo/http-client';

const externalService = new HttpClient({
  baseUrl: process.env.EXTERNAL_SERVICE_URL,
  timeout: 5000,
  retries: 3,
});

Redis Caching Pattern

import { Redis } from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

async function getCachedData<T>(key: string, fetcher: () => Promise<T>, ttl = 300): Promise<T> {
  const cached = await redis.get(key);
  if (cached) {
    return JSON.parse(cached);
  }

  const data = await fetcher();
  await redis.setex(key, ttl, JSON.stringify(data));
  return data;
}
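
The helper above is an instance of the cache-aside pattern: read from the cache first, and on a miss fetch the data and store it with a TTL. To unit-test that logic without a running Redis, one option is to depend on a narrow store interface instead of the Redis client directly. The sketch below assumes this approach; CacheStore, InMemoryStore, and cacheAside are hypothetical names, and the interface is shaped after the get and setex commands used above.

```typescript
// Narrow interface shaped after the two Redis commands the helper uses.
interface CacheStore {
  get(key: string): Promise<string | null>;
  setex(key: string, seconds: number, value: string): Promise<unknown>;
}

// Hypothetical in-memory store for tests; expiry is checked lazily on read.
class InMemoryStore implements CacheStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (Date.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }

  async setex(key: string, seconds: number, value: string): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + seconds * 1000 });
  }
}

// Cache-aside: return the cached value if present, otherwise fetch and store it.
async function cacheAside<T>(
  store: CacheStore,
  key: string,
  fetcher: () => Promise<T>,
  ttlSeconds = 300,
): Promise<T> {
  const cached = await store.get(key);
  if (cached !== null) {
    return JSON.parse(cached) as T;
  }
  const data = await fetcher();
  await store.setex(key, ttlSeconds, JSON.stringify(data));
  return data;
}
```

With this shape, a test can count how often the fetcher runs, while production code passes the Redis client as the store.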

Acceptance Criteria for Phase 4

  • All routes registered and accessible
  • Middlewares applied in correct order
  • Error handling works for all scenarios
  • External services integrated (if any)
  • Caching implemented (if needed)
  • Health check endpoint works: /health
  • Metrics endpoint works: /metrics

Phase 5: Testing

Test Structure

| Type | Location | Purpose |
| --- | --- | --- |
| Unit Tests | Next to source (*.test.ts) | Test isolated components with mocks |
| Integration Tests | src/__tests__/*.integration.ts | Test component interactions |
| E2E Tests | src/__tests__/*.e2e.ts | Test full API workflows |

Test Coverage Targets

| Component | Minimum | Recommended |
| --- | --- | --- |
| Overall | 70% | 80% |
| Critical Paths | 90% | 95% |
| Repositories | 80% | 90% |
| Services | 80% | 90% |
| Controllers | 70% | 80% |
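
The minimum targets can be enforced by Jest so that pnpm test:coverage fails when coverage drops below them. The following is a sketch of the relevant part of jest.config.ts, not the project's actual configuration. It assumes the thresholds apply to line and statement coverage (the table does not name a metric) and that source files follow the modules/feature-name/ layout from Phase 3. "Critical Paths" is omitted because the document does not say which files it covers.

```typescript
// jest.config.ts (coverage-related settings only)
const config = {
  collectCoverageFrom: ['src/**/*.ts', '!src/**/*.test.ts'],
  coverageThreshold: {
    // "Overall" minimum from the table.
    global: { lines: 70, statements: 70 },
    // Per-layer minimums; the globs assume the Phase 3 module layout.
    './src/modules/**/*.repository.ts': { lines: 80 },
    './src/modules/**/*.service.ts': { lines: 80 },
    './src/modules/**/*.controller.ts': { lines: 70 },
  },
};

export default config;
```

If the project types its Jest configuration, the object can additionally be annotated with the Config type exported by the jest package.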

Testing Checklist

Unit Tests

  • Repository tests: All CRUD operations
  • Service tests: Business logic, error handling, edge cases
  • Controller tests: Request/response handling, validation
  • DTO tests: Validation rules, edge cases
  • Utility tests: Helper functions
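
For service tests, one way to isolate business logic from the database is a hand-written in-memory fake that exposes the same methods as FeatureRepository. The sketch below is a hypothetical example of such a fake. The local Feature interface stands in for the Prisma-generated type (its fields follow FeatureResponseSchema), and ids are sequential strings rather than UUIDs.

```typescript
// Local stand-in for the Prisma-generated Feature type.
interface Feature {
  id: string;
  name: string;
  description: string | null;
  isActive: boolean;
  createdAt: Date;
  updatedAt: Date;
}

type CreateInput = { name: string; description?: string; isActive?: boolean };
type UpdateInput = Partial<CreateInput>;

// Hypothetical in-memory fake with the same method names as FeatureRepository.
class InMemoryFeatureRepository {
  private rows = new Map<string, Feature>();
  private seq = 0;

  async findAll(): Promise<Feature[]> {
    return Array.from(this.rows.values());
  }

  async findById(id: string): Promise<Feature | null> {
    return this.rows.get(id) ?? null;
  }

  async create(data: CreateInput): Promise<Feature> {
    const now = new Date();
    this.seq += 1;
    const row: Feature = {
      id: `feature-${this.seq}`,
      name: data.name,
      description: data.description ?? null,
      isActive: data.isActive ?? true,
      createdAt: now,
      updatedAt: now,
    };
    this.rows.set(row.id, row);
    return row;
  }

  async update(id: string, data: UpdateInput): Promise<Feature> {
    const existing = this.rows.get(id);
    if (!existing) throw new Error(`Feature ${id} not found`);
    const updated: Feature = {
      ...existing,
      name: data.name ?? existing.name,
      description: data.description ?? existing.description,
      isActive: data.isActive ?? existing.isActive,
      updatedAt: new Date(),
    };
    this.rows.set(id, updated);
    return updated;
  }

  // Unlike Prisma's delete, this does not throw when the row is missing.
  async delete(id: string): Promise<void> {
    this.rows.delete(id);
  }
}
```

A service test can then construct the service with this fake in place of the Prisma-backed repository, for example by having the service depend on an interface that both classes satisfy.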

Integration Tests

  • Module integration: Controller -> Service -> Repository
  • Database operations: Real Prisma client with test DB
  • Middleware chain: Request flow through middlewares
  • External service mocks: HTTP client integrations

E2E Tests

  • API endpoints: Full request/response cycle
  • Authentication: Protected routes, token validation
  • Error scenarios: 400, 401, 403, 404, 500 responses
  • Health checks: /health endpoint
  • Concurrent requests: Race conditions

Running Tests

# Run all tests
pnpm test

# Run with coverage
pnpm test:coverage

# Run specific test file
pnpm test -- feature.test.ts

# Run E2E tests
pnpm test:e2e

# Watch mode
pnpm test:watch

Acceptance Criteria for Phase 5

  • All unit tests pass: pnpm test
  • Integration tests pass
  • E2E tests pass
  • Coverage meets thresholds: pnpm test:coverage
  • No test warnings or errors
  • Tests run in CI pipeline successfully

Phase 6: Documentation

Required Documentation

Service README

Required sections:

  • Service overview (bilingual EN/VI)
  • Features list
  • Prerequisites
  • Quick start guide
  • Configuration reference (environment variables table)
  • API endpoints overview
  • Development guide
  • Testing instructions

API Documentation (Swagger/OpenAPI)

// src/docs/swagger.ts
import swaggerJSDoc from 'swagger-jsdoc';

const options = {
  definition: {
    openapi: '3.0.0',
    info: {
      title: 'Service Name API',
      version: '1.0.0',
      description: 'API documentation for Service Name',
    },
    servers: [
      { url: '/api/v1/service-name' },
    ],
  },
  apis: ['./src/modules/**/*.controller.ts'],
};

export const swaggerSpec = swaggerJSDoc(options);

Architecture Documentation (if complex)

  • ARCHITECTURE.en.md / ARCHITECTURE.vi.md
  • System design diagrams
  • Data flow descriptions
  • Component interactions

Documentation Checklist

  • README is comprehensive and bilingual
  • Swagger docs accessible: /api-docs
  • All endpoints appear in Swagger
  • Request/response examples provided
  • Environment variables documented
  • Error responses documented
  • Architecture docs created (if needed)

Acceptance Criteria for Phase 6

  • README is comprehensive and bilingual
  • Swagger docs accessible: /api-docs
  • All endpoints documented with examples
  • Documentation reviewed and accurate

Phase 7: Cleanup and Verification

Verification Process Flow

graph TD
    Start[Start Cleanup] --> Remove[Remove Temporary Files]
    Remove --> Update{Is Migration?}
    Update -->|Yes| RefUpdate[Update References<br/>grep & replace]
    Update -->|No| Verify[Run Verification]
    RefUpdate --> Verify

    Verify --> TypeCheck[TypeScript Check]
    TypeCheck --> LintCheck[Lint Check]
    LintCheck --> TestCheck[Test Check]
    TestCheck --> BuildCheck[Build Check]
    BuildCheck --> DockerCheck[Docker Build]
    DockerCheck --> HealthCheck[Health Check]
    HealthCheck --> TraefikCheck[Traefik Check]
    TraefikCheck --> AllPass{All Pass?}

    AllPass -->|No| Fix[Fix Issues]
    Fix --> Verify
    AllPass -->|Yes| Complete[Phase Complete]

    style Remove fill:#ffe1e1
    style RefUpdate fill:#fff4e1
    style Verify fill:#e1ffe1
    style Complete fill:#d4edda

Cleanup Checklist

Remove Temporary Files

  • Remove backup directories (e.g., service-name.backup/)
  • Remove temporary status files (e.g., *_STATUS.md, *_CHECKLIST.md)
  • Remove debug/scratch files
  • Clean up unused imports
  • Remove commented-out code
  • Remove console.log statements (use logger instead)

Reference Updates (for migrations/renames)

# Find all references
grep -r "old-service-name" . --exclude-dir=node_modules --exclude-dir=.git

Update checklist:

  • Package names: @goodgo/old-name -> @goodgo/new-name
  • Service paths: services/old-name -> services/new-name
  • Docker images: goodgo/old-name -> goodgo/new-name
  • Deployment names: old-name -> new-name
  • Environment variables updated
  • CI/CD workflows updated
  • Scripts updated (if needed)
  • Documentation updated (except historical context)

Comprehensive Verification Steps

# 1. Service starts successfully
cd services/service-name
pnpm dev &
sleep 5
curl http://localhost:5000/health

# 2. Type checking passes
pnpm typecheck

# 3. Linting passes
pnpm lint

# 4. Tests pass with coverage
pnpm test
pnpm test:coverage

# 5. Build succeeds
pnpm build

# 6. Docker build succeeds
docker build -t service-name .

# 7. Docker Compose works (path is relative to services/service-name from step 1)
cd ../../deployments/local
docker-compose up -d service-name
docker-compose ps

# 8. Service accessible via Traefik
curl http://localhost/api/v1/service-name/health

# 9. No broken references (if migration); run from the repository root
cd ../..
grep -r "old-reference" . --exclude-dir=node_modules --exclude-dir=.git

Final Verification Checklist

Code Quality

  • No TypeScript errors
  • No linting errors
  • No unused imports/variables
  • Code follows project conventions
  • Comments are clear (bilingual if needed)

Functionality

  • Service starts without errors
  • Health check works
  • All API endpoints functional
  • Database operations work
  • External integrations work (if any)

Testing

  • All tests pass
  • Coverage meets requirements
  • E2E tests verify full workflows

Documentation

  • README is complete and accurate
  • API documentation is up-to-date
  • Code comments are helpful

Infrastructure

  • Docker image builds
  • Service works in Docker Compose
  • Traefik routes configured correctly
  • Environment variables documented

Cleanup

  • Temporary files removed
  • All references updated (if migration)
  • No orphaned files

Acceptance Criteria for Phase 7

  • All cleanup tasks completed
  • All verification steps pass
  • No broken references or links
  • Code is production-ready
  • Documentation is complete

Phase 8: Deployment

Staging Deployment

Pre-deployment Checklist

  • All Phase 7 verification passed
  • Database migrations tested: pnpm prisma migrate deploy
  • Environment variables configured in staging
  • Kubernetes manifests reviewed
  • Secrets configured in Kubernetes
  • Health checks configured
  • Resource limits set appropriately

Deployment Steps

# 1. Build and push Docker image
docker build -t goodgo/service-name:latest .
docker push goodgo/service-name:latest

# 2. Apply Kubernetes configs
kubectl apply -f deployments/staging/kubernetes/service-name.yaml
kubectl apply -f deployments/staging/kubernetes/service-name-configmap.yaml

# 3. Wait for rollout
kubectl rollout status deployment/service-name -n staging

# 4. Verify deployment
kubectl get pods -n staging -l app=service-name
kubectl logs -f deployment/service-name -n staging

# 5. Health check
curl https://staging-api.example.com/api/v1/service-name/health

# 6. Run smoke tests
pnpm test:smoke -- --env=staging
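
The test:smoke script itself is not shown in this document. As a rough sketch of what such a check might do, the function below requests the health endpoint and treats any non-2xx response or network failure as a failed check. smokeCheck, HttpGet, and SmokeResult are hypothetical names, and the HTTP function is passed in so the logic can be tested without network access.

```typescript
// Minimal response shape the check needs; the global fetch in Node 18+ provides it.
type HttpGet = (url: string) => Promise<{ ok: boolean; status: number }>;

interface SmokeResult {
  url: string;
  passed: boolean;
  detail: string;
}

// Hypothetical smoke check: request the health endpoint and require a 2xx response.
async function smokeCheck(baseUrl: string, httpGet: HttpGet): Promise<SmokeResult> {
  // Strip trailing slashes so the path is not doubled.
  const url = `${baseUrl.replace(/\/+$/, '')}/health`;
  try {
    const res = await httpGet(url);
    return { url, passed: res.ok, detail: `HTTP ${res.status}` };
  } catch (err) {
    // Network failures (DNS, timeout, refused connection) count as a failed check.
    const message = err instanceof Error ? err.message : String(err);
    return { url, passed: false, detail: message };
  }
}
```

Against staging this could be called as smokeCheck('https://staging-api.example.com/api/v1/service-name', fetch), with the process exiting non-zero when passed is false.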

Production Deployment

Pre-production Checklist

  • Service has run in staging for at least 24 hours with all tests passing
  • Database backup created
  • Rollback plan documented and tested
  • Monitoring dashboards ready
  • Alerting configured
  • On-call team notified
  • Deployment window approved

Production Deployment Steps

# 1. Create database backup
kubectl exec -n production deployment/postgres -- pg_dump -U postgres db > backup.sql

# 2. Tag release
git tag v1.0.0
docker tag goodgo/service-name:latest goodgo/service-name:v1.0.0
docker push goodgo/service-name:v1.0.0

# 3. Update Kubernetes manifest with new image tag
# Edit deployments/production/kubernetes/service-name.yaml

# 4. Apply to production
kubectl apply -f deployments/production/kubernetes/service-name.yaml

# 5. Monitor rollout
kubectl rollout status deployment/service-name -n production

# 6. Verify
curl https://api.example.com/api/v1/service-name/health

Acceptance Criteria for Phase 8

  • Service deployed to staging successfully
  • All staging tests pass
  • Monitoring shows healthy metrics
  • Production deployment completed (if applicable)
  • Post-deployment verification successful

Rollback Strategies

When to Rollback

Trigger a rollback when:

  • Critical errors in staging/production
  • Performance degradation (response time > 2x normal)
  • Data integrity issues detected
  • Security vulnerabilities discovered
  • Error rate exceeds threshold (> 1%)
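
The two numeric triggers in the list above can be checked automatically; the others need human judgment. The following sketch encodes only those two thresholds. The ServiceMetrics shape and the evaluateRollback name are assumptions, and where the metrics come from (for example a monitoring query) is outside the scope of this example.

```typescript
interface ServiceMetrics {
  errorRate: number;              // fraction of failed requests, e.g. 0.02 for 2%
  responseTimeMs: number;         // current response time
  baselineResponseTimeMs: number; // "normal" response time for the same measure
}

interface RollbackDecision {
  rollback: boolean;
  reasons: string[];
}

const MAX_ERROR_RATE = 0.01;  // error rate threshold: > 1%
const MAX_LATENCY_FACTOR = 2; // response time threshold: > 2x normal

// Compare current metrics against both thresholds and collect every violation.
function evaluateRollback(m: ServiceMetrics): RollbackDecision {
  const reasons: string[] = [];
  if (m.errorRate > MAX_ERROR_RATE) {
    reasons.push(`error rate ${(m.errorRate * 100).toFixed(2)}% exceeds 1%`);
  }
  if (m.responseTimeMs > MAX_LATENCY_FACTOR * m.baselineResponseTimeMs) {
    reasons.push(
      `response time ${m.responseTimeMs}ms exceeds 2x baseline ${m.baselineResponseTimeMs}ms`,
    );
  }
  return { rollback: reasons.length > 0, reasons };
}
```

Both comparisons are strict, so metrics exactly at the threshold do not trigger a rollback.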

Quick Rollback Steps

# 1. Identify previous working version
kubectl rollout history deployment/service-name -n staging

# 2. Rollback to previous version
kubectl rollout undo deployment/service-name -n staging

# 3. Verify rollback
kubectl rollout status deployment/service-name -n staging

# 4. Check health
curl https://staging-api.example.com/api/v1/service-name/health

Database Rollback

If schema changes were made:

# 1. Identify the migration to revert to
pnpm prisma migrate status

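# 2. Restore from backup (if data changes); -i forwards backup.sql on stdin
kubectl exec -i -n staging deployment/postgres -- psql -U postgres db < backup.sql

# 3. Reset migrations (development only; drops and recreates the database)
pnpm prisma migrate reset

# 4. Or mark a failed migration as rolled back in the migration history
#    (this only updates Prisma's records; it does not undo schema changes)
pnpm prisma migrate resolve --rolled-back <migration-name>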

Rollback Verification

After rollback, verify:

  • Service health check passes
  • All endpoints responding
  • No error spikes in logs
  • Database connectivity restored
  • External service integrations working

Common Pitfalls

1. Skipping Impact Analysis

Problem: Missing updates in scripts, configs, or documentation.

Solution: Always complete the impact analysis checklist before coding:

# Find all files that might reference the service
grep -r "service-name" . --exclude-dir=node_modules --exclude-dir=.git

2. No Phase Verification

Problem: Accumulated issues that are hard to debug later.

Solution: Complete phase checklist before moving on:

pnpm typecheck && pnpm lint && pnpm test

3. Deferring Cleanup

Problem: Technical debt accumulates, temporary files forgotten.

Solution: Clean up as you go, not at the end:

# Regular cleanup
rm -rf *.backup/ *_STATUS.md *_CHECKLIST.md

4. Incomplete Testing

Problem: Edge cases and error scenarios go untested and surface only in production.

Solution: Write tests alongside implementation:

  • Test happy path AND error cases
  • Test boundary conditions
  • Test concurrent access

5. Poor Documentation

Problem: Difficult maintenance, onboarding issues.

Solution: Document as you implement:

  • Add JSDoc comments to functions
  • Update README with new features
  • Add Swagger annotations to endpoints

6. No Rollback Plan

Problem: Difficult recovery from deployment failures.

Solution: Always prepare before deployment:

  • Database backup
  • Previous image tag recorded
  • Rollback commands documented and tested

7. Hardcoded Configuration

Problem: Different behavior in different environments.

Solution: Use environment variables with Zod validation:

import { z } from 'zod';

const config = z.object({
  DATABASE_URL: z.string().url(),
  PORT: z.coerce.number().default(5000),
}).parse(process.env);

8. Missing Health Checks

Problem: Unhealthy services not detected by load balancer.

Solution: Implement comprehensive health checks:

app.get('/health', async (req, res) => {
  const dbHealthy = await checkDatabase();
  const redisHealthy = await checkRedis();

  if (dbHealthy && redisHealthy) {
    res.json({ status: 'healthy' });
  } else {
    res.status(503).json({ status: 'unhealthy' });
  }
});
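
The handler above relies on checkDatabase and checkRedis, which are not defined in this document. One possible building block for them is a helper that runs a probe with a time limit and reports a boolean, so that a slow or failing dependency yields a 503 instead of an unhandled rejection. isHealthy is a hypothetical name and the 2000 ms default is an arbitrary assumption.

```typescript
// Run a dependency probe and report whether it succeeded within the time limit.
// A rejected or slow probe counts as unhealthy instead of throwing.
async function isHealthy(
  probe: () => Promise<unknown>,
  timeoutMs = 2000,
): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('health probe timed out')), timeoutMs);
  });
  try {
    await Promise.race([probe(), timeout]);
    return true;
  } catch {
    return false;
  } finally {
    // Always clear the timer so it cannot fire after the probe has settled.
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Possible wiring, assuming the prisma and redis instances from earlier sections:
//   const checkDatabase = () => isHealthy(() => prisma.$queryRaw`SELECT 1`);
//   const checkRedis = () => isHealthy(() => redis.ping());
```

Because the helper never rejects, the health handler can await both checks without its own try/catch.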