---
name: deployment-kubernetes
description: Kubernetes deployment patterns for GoodGo microservices. Use when deploying to staging/production, creating K8s manifests, configuring HPA, setting up ingress, or troubleshooting K8s deployments.
---

# Kubernetes Deployment Patterns

## When to Use This Skill

Use this skill when:

- Deploying services to staging/production environments
- Creating or updating Kubernetes manifests
- Configuring autoscaling (HPA/VPA)
- Setting up ingress and load balancing
- Managing secrets and configmaps
- Troubleshooting deployment issues
- Implementing health checks and probes
- Setting up monitoring and logging

## Core Concepts

### Kubernetes Architecture

The following diagram illustrates the key Kubernetes components and their relationships in a typical GoodGo service deployment:

```mermaid
graph TB
    subgraph External["External Traffic"]
        Client[Client Request]
    end

    subgraph IngressLayer["Ingress Layer"]
        Ingress[Ingress<br/>api.goodgo.com]
    end

    subgraph ServiceLayer["Service Layer"]
        Service[Service<br/>ClusterIP]
    end

    subgraph DeploymentLayer["Deployment Layer"]
        Deployment[Deployment<br/>auth-service]
        HPA[HorizontalPodAutoscaler<br/>HPA]
    end

    subgraph PodLayer["Pod Layer"]
        Pod1[Pod 1<br/>Container]
        Pod2[Pod 2<br/>Container]
        Pod3[Pod 3<br/>Container]
    end

    subgraph ConfigLayer["Configuration Layer"]
        ConfigMap[ConfigMap<br/>app-config]
        Secret[Secret<br/>database-secrets]
    end

    Client -->|HTTPS| Ingress
    Ingress -->|Route /auth| Service
    Service -->|Load Balance| Pod1
    Service -->|Load Balance| Pod2
    Service -->|Load Balance| Pod3

    Deployment -->|Manages| Pod1
    Deployment -->|Manages| Pod2
    Deployment -->|Manages| Pod3

    HPA -->|Scales| Deployment

    Pod1 -->|Reads| ConfigMap
    Pod2 -->|Reads| ConfigMap
    Pod3 -->|Reads| ConfigMap

    Pod1 -->|Reads| Secret
    Pod2 -->|Reads| Secret
    Pod3 -->|Reads| Secret
```

### Deployment Strategy

- Rolling updates for zero-downtime deployments
- Resource limits and requests for stability
- Health checks (liveness/readiness probes)
- Horizontal Pod Autoscaler (HPA) for auto-scaling
- ConfigMaps for configuration
- Secrets for sensitive data
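
To make the rolling-update item above concrete, here is a minimal sketch of the bounds a Deployment keeps during a rollout. It assumes the documented Kubernetes rules: `maxSurge` and `maxUnavailable` both default to 25%, percentages of `maxSurge` round up, and percentages of `maxUnavailable` round down. The function names are hypothetical and the sketch ignores edge cases such as both values resolving to zero.

```typescript
// Illustrative helper names; not part of any Kubernetes client library.
type IntOrPercent = number | string; // e.g. 2 or "25%"

interface RolloutBounds {
  maxTotalPods: number;     // upper bound on old + new pods during the rollout
  minAvailablePods: number; // lower bound on pods that stay available
}

// Resolve an absolute number or a percentage of `replicas` to a pod count.
function resolveIntOrPercent(value: IntOrPercent, replicas: number, roundUp: boolean): number {
  if (typeof value === "number") {
    return value;
  }
  const scaled = (parseFloat(value) * replicas) / 100;
  return roundUp ? Math.ceil(scaled) : Math.floor(scaled);
}

function rolloutBounds(
  replicas: number,
  maxSurge: IntOrPercent = "25%",
  maxUnavailable: IntOrPercent = "25%"
): RolloutBounds {
  const surge = resolveIntOrPercent(maxSurge, replicas, true);             // rounds up
  const unavailable = resolveIntOrPercent(maxUnavailable, replicas, false); // rounds down
  return {
    maxTotalPods: replicas + surge,
    minAvailablePods: replicas - unavailable,
  };
}
```

With 3 replicas and the default 25% values, the surge resolves to 1 pod and the unavailable budget to 0, so a rollout may run up to 4 pods and never drops below 3 available.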

### Pod Lifecycle

A pod moves through several states during its lifecycle. The liveness probe decides when a container is restarted, and the readiness probe decides whether the pod receives traffic from the Service. The diagram is simplified: `ContainerCreating` and `Terminating` are statuses shown by `kubectl`, and readiness is a pod condition rather than a separate phase:

```mermaid
stateDiagram-v2
    [*] --> Pending: Pod Created

    Pending --> ContainerCreating: Scheduler Assigned
    ContainerCreating --> Running: Containers Started

    Running --> Running: Liveness Check Pass
    Running --> Restarting: Liveness Check Fail (3x)
    Restarting --> Running: Container Restarted

    Running --> Ready: Readiness Check Pass
    Ready --> Running: Readiness Check Fail (3x)

    Ready --> Terminating: Pod Deleted
    Terminating --> [*]: Cleanup Complete

    note right of Ready
        Pod receives traffic
        from Service
    end note

    note right of Running
        Liveness probe checks
        if container is alive
    end note

    note right of Restarting
        Container restarted
        after 3 failures
    end note
```
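
The "3x" in the diagram corresponds to the probe's `failureThreshold`, which defaults to 3. As a rough sketch, the worst-case time for a probe to react to a broken container can be estimated from the probe settings. The formula is an approximation (threshold times period, plus one timeout for the last attempt), and the type and function names are hypothetical.

```typescript
// Subset of the probe fields that affect detection time.
interface ProbeTiming {
  periodSeconds?: number;    // Kubernetes default: 10
  failureThreshold?: number; // Kubernetes default: 3
  timeoutSeconds?: number;   // Kubernetes default: 1
}

// Approximate upper bound on how long a container can be unhealthy before
// the probe has failed `failureThreshold` times in a row.
function worstCaseDetectionSeconds(probe: ProbeTiming): number {
  const period = probe.periodSeconds ?? 10;
  const threshold = probe.failureThreshold ?? 3;
  const timeout = probe.timeoutSeconds ?? 1;
  return threshold * period + timeout;
}
```

For a liveness probe with `periodSeconds: 10` and default thresholds this gives about 31 seconds before a restart; a readiness probe with `periodSeconds: 5` removes the pod from traffic after about 16 seconds.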

### Service Discovery Flow

Kubernetes provides built-in service discovery through DNS. The following diagram shows how requests flow from external clients to pods:

```mermaid
sequenceDiagram
    participant Client
    participant Ingress
    participant Service
    participant Pod1
    participant Pod2
    participant Pod3

    Client->>Ingress: HTTPS Request<br/>api.goodgo.com/auth/login
    Ingress->>Ingress: TLS Termination
    Ingress->>Ingress: Path Routing<br/>/auth → auth-service

    Ingress->>Service: HTTP Request<br/>auth-service:80
    Service->>Service: DNS Resolution<br/>auth-service.goodgo.svc.cluster.local

    Service->>Service: Endpoint Selection<br/>Load Balancing

    alt Pod1 Selected
        Service->>Pod1: Forward Request
        Pod1->>Pod1: Process Request
        Pod1->>Service: Response
    else Pod2 Selected
        Service->>Pod2: Forward Request
        Pod2->>Pod2: Process Request
        Pod2->>Service: Response
    else Pod3 Selected
        Service->>Pod3: Forward Request
        Pod3->>Pod3: Process Request
        Pod3->>Service: Response
    end

    Service->>Ingress: Response
    Ingress->>Client: HTTPS Response
```
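
The DNS name in the diagram follows the pattern `<service>.<namespace>.svc.<cluster-domain>`. A small sketch of how a client could build that name is shown below; the helper names are hypothetical, and `cluster.local` is assumed as the cluster domain because it appears in the diagram.

```typescript
// Fully qualified in-cluster DNS name of a Service.
function serviceDnsName(service: string, namespace: string, clusterDomain: string = "cluster.local"): string {
  return `${service}.${namespace}.svc.${clusterDomain}`;
}

// Shortest name that resolves from a pod in `callerNamespace`:
// the bare Service name inside the same namespace, otherwise `<service>.<namespace>`.
function shortestServiceName(service: string, namespace: string, callerNamespace: string): string {
  return namespace === callerNamespace ? service : `${service}.${namespace}`;
}

// Base URL for plain HTTP calls between services inside the cluster.
function serviceBaseUrl(service: string, namespace: string, port: number): string {
  return `http://${serviceDnsName(service, namespace)}:${port}`;
}
```

For example, `serviceDnsName("auth-service", "goodgo")` returns `auth-service.goodgo.svc.cluster.local`, while a pod in the `goodgo` namespace can simply use `auth-service`.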

## Service Deployment Manifest

```yaml
# kubernetes/auth-service.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth-service
  namespace: goodgo
  labels:
    app: auth-service
    version: v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: auth-service
  template:
    metadata:
      labels:
        app: auth-service
        version: v1
    spec:
      containers:
        - name: auth-service
          # Prefer a pinned tag over :latest outside development
          image: goodgo/auth-service:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http
          env:
            - name: NODE_ENV
              value: "production"
            - name: PORT
              value: "3000"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: database-secrets
                  key: url
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: auth-secrets
                  key: jwt-secret
            - name: REDIS_URL
              valueFrom:
                # Reads the REDIS_URL key from the app-config ConfigMap
                configMapKeyRef:
                  name: app-config
                  key: REDIS_URL
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: auth-service
  namespace: goodgo
  labels:
    app: auth-service # label a ServiceMonitor selector can match
spec:
  type: ClusterIP
  selector:
    app: auth-service
  ports:
    - name: http # named so a ServiceMonitor endpoint can reference it
      port: 80
      targetPort: 3000
      protocol: TCP
```
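
The manifest injects `NODE_ENV`, `PORT`, `DATABASE_URL`, `JWT_SECRET`, and `REDIS_URL` as environment variables. A minimal sketch of how the service might validate them at startup follows, so that a missing Secret or ConfigMap key fails fast instead of surfacing later at request time. The `AuthServiceConfig` shape and the `loadConfig` function are hypothetical, not taken from the GoodGo codebase.

```typescript
// Hypothetical typed view of the variables set in the Deployment manifest.
interface AuthServiceConfig {
  nodeEnv: string;
  port: number;
  databaseUrl: string;
  jwtSecret: string;
  redisUrl: string;
}

type EnvVars = Record<string, string | undefined>;

const REQUIRED_KEYS = ["NODE_ENV", "PORT", "DATABASE_URL", "JWT_SECRET", "REDIS_URL"];

// Throws with a list of every missing variable rather than only the first one.
function loadConfig(env: EnvVars): AuthServiceConfig {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  const port = Number(env["PORT"]);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env["PORT"]}"`);
  }

  return {
    nodeEnv: env["NODE_ENV"] as string,
    port,
    databaseUrl: env["DATABASE_URL"] as string,
    jwtSecret: env["JWT_SECRET"] as string,
    redisUrl: env["REDIS_URL"] as string,
  };
}
```

In a Node.js service this would typically be called once as `loadConfig(process.env)` before the HTTP server starts; a thrown error then shows up in `kubectl logs` and the pod never becomes ready.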

## Horizontal Pod Autoscaler

```yaml
# kubernetes/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: auth-service-hpa
  namespace: goodgo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: auth-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
```
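
To reason about what the HPA above will do, it helps to sketch the documented scaling rule: `desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric)`, evaluated per metric, with the largest result winning and the final value clamped to `minReplicas`/`maxReplicas`. The sketch below also assumes the default 10% tolerance and deliberately ignores the `behavior` policies and stabilization windows. All function and type names are hypothetical.

```typescript
interface MetricReading {
  currentUtilization: number; // observed average utilization, in percent of requests
  targetUtilization: number;  // averageUtilization from the HPA spec
}

// Replica count suggested by a single metric.
function desiredForMetric(currentReplicas: number, metric: MetricReading, tolerance: number = 0.1): number {
  const ratio = metric.currentUtilization / metric.targetUtilization;
  if (Math.abs(ratio - 1) <= tolerance) {
    return currentReplicas; // close enough to target: no change
  }
  return Math.ceil(currentReplicas * ratio);
}

// Combine all metrics (largest wins) and clamp to the HPA bounds.
function recommendReplicas(
  currentReplicas: number,
  metrics: MetricReading[],
  minReplicas: number,
  maxReplicas: number
): number {
  let desired = minReplicas;
  for (const metric of metrics) {
    desired = Math.max(desired, desiredForMetric(currentReplicas, metric));
  }
  return Math.min(maxReplicas, desired);
}
```

With 3 replicas, CPU at 140% against the 70% target and memory at 60% against the 80% target, CPU asks for 6 replicas and memory for 3, so the recommendation is 6. The same load at 8 replicas would ask for 16 and be capped at `maxReplicas: 10`.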

## ConfigMap & Secrets

```yaml
# kubernetes/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: goodgo
data:
  NODE_ENV: "production"
  LOG_LEVEL: "info"
  REDIS_URL: "redis://redis-service:6379"
  METRICS_ENABLED: "true"

---
# kubernetes/secrets.yaml (example - use sealed-secrets in production)
apiVersion: v1
kind: Secret
metadata:
  name: database-secrets
  namespace: goodgo
type: Opaque
stringData:
  url: "postgresql://user:pass@postgres:5432/db"

---
apiVersion: v1
kind: Secret
metadata:
  name: auth-secrets
  namespace: goodgo
type: Opaque
stringData:
  jwt-secret: "your-secret-key"
  refresh-secret: "your-refresh-secret"
```

## Ingress Configuration

```yaml
# kubernetes/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: goodgo
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    # ingress-nginx rate limit: requests per second accepted from one client IP
    nginx.ingress.kubernetes.io/limit-rps: "100"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  # Replaces the deprecated kubernetes.io/ingress.class annotation
  ingressClassName: nginx
  tls:
    - hosts:
        - api.goodgo.com
      secretName: api-tls-secret
  rules:
    - host: api.goodgo.com
      http:
        paths:
          - path: /auth
            pathType: Prefix
            backend:
              service:
                name: auth-service
                port:
                  number: 80
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: user-service
                port:
                  number: 80
```
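
`pathType: Prefix` matches whole path segments, not raw string prefixes: `/auth` matches `/auth` and `/auth/login` but not `/authx`. The following sketch models that rule together with the "longest matching path wins" precedence, as a way to predict which backend a request reaches. It is a simplified model of the Ingress specification, not the ingress-nginx implementation, and the names are hypothetical.

```typescript
interface IngressPathRule {
  path: string;    // rule path, e.g. "/auth"
  service: string; // backend Service name
}

// Split a path into its non-empty segments: "/auth/login" -> ["auth", "login"].
function pathSegments(path: string): string[] {
  return path.split("/").filter((segment) => segment.length > 0);
}

// Prefix match, compared segment by segment.
function matchesPrefix(rulePath: string, requestPath: string): boolean {
  const ruleParts = pathSegments(rulePath);
  const requestParts = pathSegments(requestPath);
  if (ruleParts.length > requestParts.length) {
    return false;
  }
  return ruleParts.every((part, index) => part === requestParts[index]);
}

// Pick the backend for a request; the longest matching rule path takes precedence.
function routeRequest(rules: IngressPathRule[], requestPath: string): string | undefined {
  let best: IngressPathRule | undefined;
  for (const rule of rules) {
    if (matchesPrefix(rule.path, requestPath) && (best === undefined || rule.path.length > best.path.length)) {
      best = rule;
    }
  }
  return best === undefined ? undefined : best.service;
}
```

Applied to the two rules in the manifest, `/auth/login` routes to `auth-service`, `/users/42` routes to `user-service`, and `/authx` matches nothing.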

## Database Deployment (Development Only)

```yaml
# kubernetes/postgres.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: goodgo
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:14-alpine
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: goodgo
            - name: POSTGRES_USER
              value: postgres
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: postgres-storage
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

## Deployment Scripts

```bash
#!/bin/bash
# scripts/deploy-k8s.sh
# Usage: scripts/deploy-k8s.sh [environment]   (defaults to staging)

# Stop on the first failed command, unset variable, or failed pipeline stage
set -euo pipefail

# Set namespace and target environment
NAMESPACE="goodgo"
ENVIRONMENT="${1:-staging}"

# Create namespace if it does not exist
kubectl create namespace "$NAMESPACE" --dry-run=client -o yaml | kubectl apply -f -

# Apply configurations
echo "Applying ConfigMaps..."
kubectl apply -f "kubernetes/configmap-$ENVIRONMENT.yaml"

echo "Applying Secrets..."
kubectl apply -f "kubernetes/secrets-$ENVIRONMENT.yaml"

echo "Deploying services..."
kubectl apply -f kubernetes/auth-service.yaml
kubectl apply -f kubernetes/user-service.yaml

echo "Configuring autoscaling..."
kubectl apply -f kubernetes/hpa.yaml

echo "Setting up ingress..."
kubectl apply -f kubernetes/ingress.yaml

# Wait for rollout
kubectl rollout status deployment/auth-service -n "$NAMESPACE"
kubectl rollout status deployment/user-service -n "$NAMESPACE"

echo "Deployment complete!"
```

## Health Check Implementation

```typescript
// src/modules/health/health.controller.ts
// Assumption: Express-style handlers and the ioredis client; adjust the
// imports if the service uses a different framework or Redis library.
import { Request, Response } from 'express';
import { PrismaClient } from '@prisma/client';
import { Redis } from 'ioredis';

export class HealthController {
  constructor(
    private prisma: PrismaClient,
    private redis: Redis
  ) {}

  // Liveness probe - is the service alive?
  // Kept free of dependency checks so a database outage does not restart the pod.
  async liveness(req: Request, res: Response) {
    res.status(200).json({ status: 'ok' });
  }

  // Readiness probe - is the service ready to accept traffic?
  async readiness(req: Request, res: Response) {
    try {
      // Check database connection
      await this.prisma.$queryRaw`SELECT 1`;

      // Check Redis connection
      await this.redis.ping();

      res.status(200).json({
        status: 'ready',
        checks: {
          database: 'ok',
          redis: 'ok'
        }
      });
    } catch (error) {
      // `error` is typed `unknown` under strict settings, so narrow it first
      const message = error instanceof Error ? error.message : String(error);
      res.status(503).json({
        status: 'not ready',
        error: message
      });
    }
  }
}
```
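
One possible refinement of the readiness handler is to report each dependency separately, so a 503 response says which check failed. The sketch below is framework-independent and uses hypothetical names (`runChecks`, `buildReadinessReport`); it is an illustration of the idea rather than the controller's actual implementation.

```typescript
type CheckStatus = "ok" | "failed";

interface ReadinessReport {
  statusCode: number; // 200 when every check passed, otherwise 503
  body: {
    status: "ready" | "not ready";
    checks: Record<string, CheckStatus>;
  };
}

// Run every dependency probe and record whether it resolved or threw.
async function runChecks(probes: Record<string, () => Promise<unknown>>): Promise<Record<string, boolean>> {
  const results: Record<string, boolean> = {};
  for (const dependency of Object.keys(probes)) {
    try {
      await probes[dependency]();
      results[dependency] = true;
    } catch {
      results[dependency] = false;
    }
  }
  return results;
}

// Turn per-dependency results into an HTTP status code and response body.
function buildReadinessReport(results: Record<string, boolean>): ReadinessReport {
  const checks: Record<string, CheckStatus> = {};
  let allOk = true;
  for (const dependency of Object.keys(results)) {
    const ok = results[dependency] === true;
    checks[dependency] = ok ? "ok" : "failed";
    if (!ok) {
      allOk = false;
    }
  }
  return {
    statusCode: allOk ? 200 : 503,
    body: { status: allOk ? "ready" : "not ready", checks },
  };
}
```

Inside the controller, the probes map could wrap the same two calls used above, for example `database: () => this.prisma.$queryRaw` with the `SELECT 1` query and `redis: () => this.redis.ping()`, and the handler would send `report.body` with `report.statusCode`.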

## Monitoring with Prometheus

```yaml
# kubernetes/servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: auth-service-monitor
  namespace: goodgo
spec:
  selector:
    matchLabels:
      app: auth-service # must match labels on the Service object, not the pods
  endpoints:
    - port: http # name of a port on the Service, so that port needs a name
      path: /metrics
      interval: 30s
```
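
The ServiceMonitor scrapes `/metrics`, which is expected to return the Prometheus text exposition format. As a rough illustration of what that payload looks like, the sketch below renders a single counter by hand. The metric and label names are made up for the example, and a real service would more likely rely on a Prometheus client library than on hand-written formatting.

```typescript
interface CounterSample {
  labels: Record<string, string>;
  value: number;
}

// Escape backslash, double quote, and newline as the text format requires.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Render one counter with HELP and TYPE headers followed by its samples.
function renderCounter(metricName: string, help: string, samples: CounterSample[]): string {
  const lines: string[] = [
    `# HELP ${metricName} ${help}`,
    `# TYPE ${metricName} counter`,
  ];
  for (const sample of samples) {
    const labelText = Object.keys(sample.labels)
      .map((key) => `${key}="${escapeLabelValue(sample.labels[key])}"`)
      .join(",");
    lines.push(labelText.length > 0
      ? `${metricName}{${labelText}} ${sample.value}`
      : `${metricName} ${sample.value}`);
  }
  return lines.join("\n") + "\n";
}
```

Calling `renderCounter("http_requests_total", "Total HTTP requests.", [{ labels: { method: "GET" }, value: 42 }])` yields the two comment lines followed by `http_requests_total{method="GET"} 42`.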

## Common Commands

```bash
# Deploy to staging
# Note: kubectl rejects manifests whose metadata.namespace differs from -n,
# so the manifests must not hard-code another namespace for this to work.
kubectl apply -f kubernetes/ -n goodgo-staging

# Check deployment status
kubectl get deployments -n goodgo
kubectl get pods -n goodgo
kubectl get svc -n goodgo

# View logs
kubectl logs -f deployment/auth-service -n goodgo
kubectl logs -f pod-name -n goodgo --tail=100

# Scale manually
kubectl scale deployment auth-service --replicas=5 -n goodgo

# Update image
kubectl set image deployment/auth-service auth-service=goodgo/auth-service:v1.2.3 -n goodgo

# Rollback
kubectl rollout undo deployment/auth-service -n goodgo

# Port forward for debugging
kubectl port-forward service/auth-service 3000:80 -n goodgo

# Execute command in pod
kubectl exec -it pod-name -n goodgo -- /bin/sh

# View HPA status
kubectl get hpa -n goodgo
kubectl describe hpa auth-service-hpa -n goodgo

# View resource usage
kubectl top nodes
kubectl top pods -n goodgo
```

## Troubleshooting

### Pod Not Starting

```bash
# Check pod status
kubectl describe pod pod-name -n goodgo

# Check events
kubectl get events -n goodgo --sort-by='.lastTimestamp'

# Check logs
kubectl logs pod-name -n goodgo --previous
```

### ImagePullBackOff

```bash
# Check image name and tag
kubectl describe pod pod-name -n goodgo | grep -i image

# Check image pull secrets
kubectl get secrets -n goodgo
```

### CrashLoopBackOff

```bash
# Check logs of crashed container
kubectl logs pod-name -n goodgo --previous

# Check resource limits
kubectl describe pod pod-name -n goodgo | grep -A 5 Limits
```
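
When reading `CrashLoopBackOff` events it helps to know how long the kubelet waits between restarts. The documented behavior is an exponential back-off of 10s, 20s, 40s, and so on, capped at 300 seconds. The sketch below reproduces that sequence; the function name is hypothetical and `step` is simply the zero-based position in the documented sequence.

```typescript
// Delay at a given position of the back-off sequence (step 0 -> 10s, step 1 -> 20s, ...).
function crashLoopDelaySeconds(step: number, baseSeconds: number = 10, capSeconds: number = 300): number {
  if (step < 0) {
    throw new Error("step must be zero or greater");
  }
  return Math.min(capSeconds, baseSeconds * Math.pow(2, step));
}

// First `count` delays of the sequence, useful for estimating total wait time.
function crashLoopSchedule(count: number): number[] {
  const delays: number[] = [];
  for (let step = 0; step < count; step++) {
    delays.push(crashLoopDelaySeconds(step));
  }
  return delays;
}
```

`crashLoopSchedule(6)` returns `[10, 20, 40, 80, 160, 300]`, which is why a pod that keeps crashing settles into a restart attempt roughly every five minutes.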

## Best Practices

1. **Resource Management**
   - Always set resource requests and limits
   - Monitor actual usage and adjust accordingly
   - Use HPA for automatic scaling

2. **Configuration**
   - Use ConfigMaps for non-sensitive config
   - Use Secrets for sensitive data
   - Never hardcode configuration in images

3. **Health Checks**
   - Implement both liveness and readiness probes
   - Set appropriate timeouts and thresholds
   - Include dependency checks in readiness probe

4. **Deployment**
   - Use rolling updates for zero-downtime
   - Set maxSurge and maxUnavailable appropriately
   - Test deployments in staging first

5. **Security**
   - Run containers as non-root user
   - Use network policies to restrict traffic
   - Regularly update base images
   - Use sealed-secrets or external secret manager

6. **Monitoring**
   - Expose metrics endpoint
   - Set up alerts for critical issues
   - Monitor resource usage and performance