
# Kubernetes Deployment

Kubernetes deployment patterns for GoodGo microservices. Use when deploying to staging/production, creating K8s manifests, configuring HPA, setting up ingress, or troubleshooting K8s deployments.

## Overview

This skill covers Kubernetes deployment patterns and best practices for GoodGo microservices. It includes creating deployment manifests, configuring autoscaling, managing secrets and configmaps, setting up ingress, and implementing health checks.

## When to Use

Use this skill when:
- Deploying services to staging/production environments
- Creating or updating Kubernetes manifests
- Configuring autoscaling (HPA/VPA)
- Setting up ingress and load balancing
- Managing secrets and configmaps
- Troubleshooting deployment issues
- Implementing health checks and probes
- Setting up monitoring and logging

## Key Concepts

### Kubernetes Architecture

The following diagram illustrates the main Kubernetes components and how they relate to each other in a typical GoodGo service deployment:

```mermaid
graph TB
    subgraph External["External Traffic"]
        Client[Client Request]
    end

    subgraph IngressLayer["Ingress Layer"]
        Ingress[Ingress<br/>api.goodgo.com]
    end

    subgraph ServiceLayer["Service Layer"]
        Service[Service<br/>ClusterIP]
    end

    subgraph DeploymentLayer["Deployment Layer"]
        Deployment[Deployment<br/>auth-service]
        HPA[HorizontalPodAutoscaler<br/>HPA]
    end

    subgraph PodLayer["Pod Layer"]
        Pod1[Pod 1<br/>Container]
        Pod2[Pod 2<br/>Container]
        Pod3[Pod 3<br/>Container]
    end

    subgraph ConfigLayer["Configuration Layer"]
        ConfigMap[ConfigMap<br/>app-config]
        Secret[Secret<br/>database-secrets]
    end

    Client -->|HTTPS| Ingress
    Ingress -->|Route /auth| Service
    Service -->|Load Balance| Pod1
    Service -->|Load Balance| Pod2
    Service -->|Load Balance| Pod3

    Deployment -->|Manages| Pod1
    Deployment -->|Manages| Pod2
    Deployment -->|Manages| Pod3

    HPA -->|Scales| Deployment

    Pod1 -->|Reads| ConfigMap
    Pod2 -->|Reads| ConfigMap
    Pod3 -->|Reads| ConfigMap

    Pod1 -->|Reads| Secret
    Pod2 -->|Reads| Secret
    Pod3 -->|Reads| Secret
```

### Deployment Strategy

- Rolling updates for zero-downtime deployments
- Resource limits and requests for stability
- Health checks (liveness/readiness probes)
- Horizontal Pod Autoscaler (HPA) for automatic scaling
- ConfigMaps for configuration
- Secrets for sensitive data

### Pod Lifecycle

Pods move through several states during their lifecycle. Health checks (liveness and readiness probes) determine whether a pod is restarted and whether it receives traffic:

```mermaid
stateDiagram-v2
    [*] --> Pending: Pod Created

    Pending --> ContainerCreating: Scheduler Assigned
    ContainerCreating --> Running: Containers Started

    Running --> Running: Liveness Check Pass
    Running --> Restarting: Liveness Check Fail (3x)
    Restarting --> Running: Container Restarted

    Running --> Ready: Readiness Check Pass
    Ready --> Running: Readiness Check Fail (3x)

    Ready --> Terminating: Pod Deleted
    Terminating --> [*]: Cleanup Complete

    note right of Ready
        Pod receives traffic
        from Service
    end note

    note right of Running
        Liveness probe checks
        if container is alive
    end note

    note right of Restarting
        Container restarted
        after 3 failures
    end note
```

### Service Discovery Flow

Kubernetes provides built-in service discovery through DNS. The following diagram shows how requests flow from an external client to the pods:

```mermaid
sequenceDiagram
    participant Client
    participant Ingress
    participant Service
    participant Pod1
    participant Pod2
    participant Pod3

    Client->>Ingress: HTTPS Request<br/>api.goodgo.com/auth/login
    Ingress->>Ingress: TLS Termination
    Ingress->>Ingress: Path Routing<br/>/auth → auth-service

    Ingress->>Service: HTTP Request<br/>auth-service:80
    Service->>Service: DNS Resolution<br/>auth-service.goodgo.svc.cluster.local

    Service->>Service: Endpoint Selection<br/>Load Balancing

    alt Pod1 Selected
        Service->>Pod1: Forward Request
        Pod1->>Pod1: Process Request
        Pod1->>Service: Response
    else Pod2 Selected
        Service->>Pod2: Forward Request
        Pod2->>Pod2: Process Request
        Pod2->>Service: Response
    else Pod3 Selected
        Service->>Pod3: Forward Request
        Pod3->>Pod3: Process Request
        Pod3->>Service: Response
    end

    Service->>Ingress: Response
    Ingress->>Client: HTTPS Response
```
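
Inside the cluster, the same DNS mechanism lets one service reach another by name. The following is a minimal sketch, assuming a hypothetical consumer ConfigMap `order-service-config` and key `IAM_SERVICE_URL` (neither comes from the codebase), together with the `production` namespace and port 5001 used by the `iam-service` manifests in this document:

```yaml
# Hypothetical ConfigMap for a consumer service (name and key are illustrative).
apiVersion: v1
kind: ConfigMap
metadata:
  name: order-service-config
  namespace: production
data:
  # Fully qualified form: <service>.<namespace>.svc.cluster.local
  # (assumes the default cluster domain, cluster.local).
  IAM_SERVICE_URL: "http://iam-service.production.svc.cluster.local:5001"
```

Pods in the same namespace can typically use the short name `iam-service`. The fully qualified name is the safer choice when the caller may live in another namespace.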

## Common Patterns

### Service Deployment Manifest

Standard deployment manifest structure for GoodGo services.

**Example from codebase**: [`deployments/production/kubernetes/iam-service.yaml`](../../../deployments/production/kubernetes/iam-service.yaml)

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: iam-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: iam-service
  template:
    metadata:
      labels:
        app: iam-service
    spec:
      containers:
      - name: iam-service
        image: goodgo/iam-service:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 5001
        envFrom:
        - configMapRef:
            name: iam-service-config
        - secretRef:
            name: iam-service-secrets
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /health/live
            port: 5001
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 5001
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
---
apiVersion: v1
kind: Service
metadata:
  name: iam-service
  namespace: production
spec:
  selector:
    app: iam-service
  ports:
  - protocol: TCP
    port: 5001
    targetPort: 5001
  type: ClusterIP
```

### Pod Autoscaling

Configure HPA to automatically scale pods based on CPU and memory utilization.

**Example from codebase**: [`deployments/production/kubernetes/iam-service.yaml`](../../../deployments/production/kubernetes/iam-service.yaml)

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: iam-service-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: iam-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
```
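
Utilization targets are percentages of each container's resource request, not of its limit. With the requests in the Deployment manifest above (`cpu: "500m"`, `memory: "512Mi"`), the 70% CPU target corresponds to an average of roughly 350m per pod, and the 80% memory target to roughly 410Mi. To keep the replica count from dropping too quickly after a spike, a `behavior` section can be added. This is a sketch only: the window and policy values below are illustrative assumptions and do not come from the codebase.

```yaml
# Fragment of the HorizontalPodAutoscaler spec (autoscaling/v2).
# All values here are illustrative assumptions.
spec:
  behavior:
    scaleDown:
      # Act on the highest recommendation seen in the last 5 minutes,
      # so a brief dip in load does not trigger a scale-down.
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 1            # remove at most one pod...
        periodSeconds: 60   # ...per 60-second period
```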

### ConfigMap and Secrets

Use ConfigMaps for non-sensitive configuration and Secrets for sensitive data.

**Example from codebase**:
- ConfigMap: [`deployments/production/kubernetes/iam-service-configmap.yaml`](../../../deployments/production/kubernetes/iam-service-configmap.yaml)
- Secrets: [`deployments/production/kubernetes/secrets.yaml.example`](../../../deployments/production/kubernetes/secrets.yaml.example)

```yaml
# ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: iam-service-config
  namespace: production
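data:
  # Key missing in the source for this entry; value: "production"
  PORT: "5001"
  API_VERSION: "v1"
  CORS_ORIGIN: "https://goodgo.vn"
  LOG_LEVEL: "warn"
  SERVICE_NAME: "iam-service"
  # Key missing in the source for this entry; value: "true"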

---
# Secret (example - use sealed-secrets in production)
apiVersion: v1
kind: Secret
metadata:
  name: iam-service-secrets
  namespace: production
type: Opaque
stringData:
  database-url: "postgresql://user:password@ep-xxx.region.neon.tech/dbname?sslmode=require&pgbouncer=true"
  jwt-secret: "your-production-jwt-secret-min-32-chars"
  jwt-refresh-secret: "your-production-refresh-secret-min-32-chars"
  redis-password: ""
```

### Ingress Configuration

Configure ingress for external access with TLS and path-based routing.

**Example from codebase**: [`deployments/production/kubernetes/ingress.yaml`](../../../deployments/production/kubernetes/ingress.yaml)

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: production
  annotations:
    traefik.ingress.kubernetes.io/rule-type: PathPrefix
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  ingressClassName: traefik
  tls:
  - hosts:
    - api.goodgo.vn
    secretName: api-tls-cert
  rules:
  - host: api.goodgo.vn
    http:
      paths:
      - path: /api/v1/auth
        pathType: Prefix
        backend:
          service:
            name: iam-service
            port:
              number: 5001
```

## Best Practices

### Resource Management

- Always set resource requests and limits
- Monitor actual usage and adjust accordingly
- Use HPA for automatic scaling
- Set appropriate CPU and memory based on service requirements

### Configuration

- Use ConfigMaps for non-sensitive config
- Use Secrets for sensitive data
- Never hardcode configuration in images
- Use `envFrom` to load entire ConfigMap/Secret

### Health Checks

- Implement both liveness and readiness probes
- Set appropriate timeouts and thresholds
- Include dependency checks in readiness probe
- Use HTTP probes for web services

**Example from codebase**: [`services/iam-service/src/modules/health/health.controller.ts`](../../../services/iam-service/src/modules/health/health.controller.ts)

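```typescript
import { Request, Response } from 'express';
import { PrismaClient } from '@prisma/client';

// Assumed setup: the excerpt omits its imports and enclosing class,
// so the Prisma client instance and the class name are illustrative.
const prisma = new PrismaClient();

export class HealthController {
  // Liveness probe - is the service alive?
  health = async (_req: Request, res: Response): Promise<void> => {
    res.json({
      success: true,
      data: { status: 'ok', timestamp: new Date().toISOString() },
    });
  };

  // Readiness probe - is the service ready to accept traffic?
  ready = async (_req: Request, res: Response): Promise<void> => {
    try {
      // Check database connection
      await prisma.$queryRaw`SELECT 1`;
      res.json({
        success: true,
        data: { status: 'ready' },
      });
    } catch {
      // Any database error returns 503 so the pod is taken out of rotation
      res.status(503).json({
        success: false,
        error: { code: 'HEALTH_001', message: 'Service not ready' },
      });
    }
  };
}
```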

### Deployment

- Use rolling updates for zero-downtime
- Set maxSurge and maxUnavailable appropriately
- Test deployments in staging first
- Use image tags instead of `latest` in production
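
The rolling-update settings and pinned image tag above can be sketched as a fragment to merge into the Deployment `spec`. The values `maxSurge: 1` and `maxUnavailable: 0`, and the tag `v1.2.3`, are illustrative assumptions and are not taken from the production manifest:

```yaml
# Fragment of a Deployment (apps/v1); merge into the existing iam-service manifest.
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # at most one extra pod above the desired count during a rollout
      maxUnavailable: 0  # never go below the desired number of available pods
  template:
    spec:
      containers:
      - name: iam-service
        image: goodgo/iam-service:v1.2.3  # pinned tag instead of latest (illustrative version)
```

With `maxUnavailable: 0`, an old pod is removed only after its replacement passes the readiness probe. This trades a slower rollout for uninterrupted capacity.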

### Security

- Run containers as non-root user
- Use network policies to restrict traffic
- Regularly update base images
- Use sealed-secrets or external secret manager
- Never commit secrets to Git
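
The first two points might look like the sketch below. It rests on assumptions: UID 1000 is illustrative and the image must be able to run under it, Traefik is assumed to run in a namespace named `traefik`, and the policy name is hypothetical. NetworkPolicy objects only take effect when the cluster's network plugin enforces them.

```yaml
# Fragment of the Deployment pod template: run as a non-root user.
spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000          # assumed UID
      containers:
      - name: iam-service
        securityContext:
          allowPrivilegeEscalation: false
---
# Restrict ingress to iam-service pods on port 5001.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: iam-service-allow-ingress   # hypothetical name
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: iam-service
  policyTypes:
  - Ingress
  ingress:
  - from:
    # Pods in the namespace assumed to host Traefik
    # (kubernetes.io/metadata.name is set automatically on namespaces).
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: traefik
    # Any pod in the production namespace (service-to-service calls).
    - podSelector: {}
    ports:
    - protocol: TCP
      port: 5001
```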

### Monitoring

- Expose metrics endpoint (`/metrics`)
- Set up alerts for critical issues
- Monitor resource usage and performance
- Use ServiceMonitor for Prometheus integration
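
A ServiceMonitor for the IAM service could look like the sketch below. It assumes the Prometheus Operator CRDs are installed, that the Service carries an `app: iam-service` label in its own `metadata.labels`, and that its port is named `http`. The Service manifest shown earlier defines neither, so both would need to be added. The `release: prometheus` label is hypothetical and must match whatever selector the Prometheus instance uses.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: iam-service
  namespace: production
  labels:
    release: prometheus     # hypothetical; match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: iam-service      # assumes this label exists on the Service itself
  endpoints:
  - port: http              # assumes the Service port is named "http"
    path: /metrics
    interval: 30s           # illustrative scrape interval
```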

## Examples from the Project

### Production Deployment

- **IAM Service**: [`deployments/production/kubernetes/iam-service.yaml`](../../../deployments/production/kubernetes/iam-service.yaml)
- **ConfigMap**: [`deployments/production/kubernetes/iam-service-configmap.yaml`](../../../deployments/production/kubernetes/iam-service-configmap.yaml)
- **Ingress**: [`deployments/production/kubernetes/ingress.yaml`](../../../deployments/production/kubernetes/ingress.yaml)

### Staging Deployment

- **IAM Service**: [`deployments/staging/kubernetes/iam-service.yaml`](../../../deployments/staging/kubernetes/iam-service.yaml)
- **ConfigMap**: [`deployments/staging/kubernetes/iam-service-configmap.yaml`](../../../deployments/staging/kubernetes/iam-service-configmap.yaml)

### Health Check Implementation

- **Health Controller**: [`services/iam-service/src/modules/health/health.controller.ts`](../../../services/iam-service/src/modules/health/health.controller.ts)

## Quick Reference

### Common Commands

```bash
# Deploy to production
kubectl apply -f deployments/production/kubernetes/ -n production

# Check deployment status
kubectl get deployments -n production
kubectl get pods -n production
kubectl get svc -n production

# View logs
kubectl logs -f deployment/iam-service -n production
kubectl logs -f pod-name -n production --tail=100

# Scale manually
kubectl scale deployment iam-service --replicas=5 -n production

# Update image
kubectl set image deployment/iam-service iam-service=goodgo/iam-service:v1.2.3 -n production

# Rollback
kubectl rollout undo deployment/iam-service -n production

# Port forward for debugging
kubectl port-forward service/iam-service 5001:5001 -n production

# Execute command in pod
kubectl exec -it pod-name -n production -- /bin/sh

# View HPA status
kubectl get hpa -n production
kubectl describe hpa iam-service-hpa -n production

# View resource usage
kubectl top nodes
kubectl top pods -n production
```

### Troubleshooting

**Pod Not Starting**:
```bash
# Check pod status
kubectl describe pod pod-name -n production

# Check events
kubectl get events -n production --sort-by='.lastTimestamp'

# Check logs
kubectl logs pod-name -n production --previous
```

**ImagePullBackOff**:
```bash
# Check image name and tag
kubectl describe pod pod-name -n production | grep -i image

# Check image pull secrets
kubectl get secrets -n production
```

**CrashLoopBackOff**:
```bash
# Check logs of crashed container
kubectl logs pod-name -n production --previous

# Check resource limits
kubectl describe pod pod-name -n production | grep -A 5 Limits
```

## Related Skills

- [Observability & Monitoring](./observability-monitoring.md) - For monitoring deployed services
- [Security](./security.md) - For securing Kubernetes deployments
- [Project Rules](./project-rules.md) - For service structure and standards

## Resources

### Official Documentation

- [Kubernetes Documentation](https://kubernetes.io/docs/)
- [Kubernetes API Reference](https://kubernetes.io/docs/reference/kubernetes-api/)
- [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)

### GoodGo Resources

- [Deployment Guide](../guides/deployment.md)
- [Local Deployment Guide](../guides/local-deployment.md)
- [Troubleshooting Guide](../guides/troubleshooting.md)