System Design Architecture
Overall architecture of the GoodGo Microservices platform
Overview Diagram
%%{init: {'theme':'base', 'themeVariables': {
'primaryTextColor':'#000',
'secondaryTextColor':'#000',
'tertiaryTextColor':'#000',
'textColor':'#000',
'mainBkg':'#fff',
'secondBkg':'#fff',
'lineColor':'#333',
'border1':'#000',
'border2':'#000',
'clusterBkg':'#fff',
'clusterBorder':'#000',
'titleColor':'#000',
'edgeLabelBackground':'#fff',
'nodeTextColor':'#fff'
}}}%%
graph TD
subgraph "Client Layer"
Web[Web App<br/>Next.js]
Mobile[Mobile App<br/>Flutter]
end
subgraph "API Gateway Layer"
Traefik[Traefik<br/>API Gateway]
end
subgraph "Services Layer"
IAM[IAM Service<br/>Auth & RBAC]
Future1[Future Service 1]
Future2[Future Service 2]
end
subgraph "Infrastructure Layer"
DB[(Neon PostgreSQL<br/>Primary Database)]
Cache[(Redis<br/>Cache & Session)]
Kafka[Apache Kafka<br/>Event Streaming]
end
subgraph "Observability Layer"
Prom[Prometheus<br/>Metrics]
Loki[Loki<br/>Logs]
Jaeger[Jaeger<br/>Tracing]
Grafana[Grafana<br/>Dashboards]
end
Web --> Traefik
Mobile --> Traefik
Traefik --> IAM
Traefik --> Future1
Traefik --> Future2
IAM --> DB
IAM --> Cache
IAM --> Kafka
Future1 --> DB
Future1 --> Cache
Future1 --> Kafka
Future2 --> DB
Future2 --> Cache
Future2 --> Kafka
IAM -.->|metrics| Prom
Future1 -.->|metrics| Prom
Future2 -.->|metrics| Prom
IAM -.->|logs| Loki
Future1 -.->|logs| Loki
Future2 -.->|logs| Loki
IAM -.->|traces| Jaeger
Future1 -.->|traces| Jaeger
Future2 -.->|traces| Jaeger
Prom --> Grafana
Loki --> Grafana
Jaeger --> Grafana
style Web fill:#1565c0,stroke:#fff,stroke-width:2px,color:#fff
style Mobile fill:#1565c0,stroke:#fff,stroke-width:2px,color:#fff
style Traefik fill:#0f4c81,stroke:#fff,stroke-width:2px,color:#fff
style IAM fill:#283593,stroke:#fff,stroke-width:2px,color:#fff
style Future1 fill:#4527a0,stroke:#fff,stroke-width:2px,color:#fff
style Future2 fill:#4527a0,stroke:#fff,stroke-width:2px,color:#fff
style DB fill:#5e35b1,stroke:#fff,stroke-width:2px,color:#fff
style Cache fill:#ef6c00,stroke:#fff,stroke-width:2px,color:#fff
style Kafka fill:#2e7d32,stroke:#fff,stroke-width:2px,color:#fff
style Prom fill:#c62828,stroke:#fff,stroke-width:2px,color:#fff
style Loki fill:#d84315,stroke:#fff,stroke-width:2px,color:#fff
style Jaeger fill:#e65100,stroke:#fff,stroke-width:2px,color:#fff
style Grafana fill:#b71c1c,stroke:#fff,stroke-width:2px,color:#fff
Architecture Description
GoodGo Platform is built on a microservices architecture that follows these principles:
Core Principles:
- Service Independence: Each service has its own database and can be deployed independently
- API Gateway Pattern: Traefik handles routing, load balancing, and cross-cutting concerns
- Clean Architecture: Each service follows Clean Architecture (API, Domain, Infrastructure)
- Infrastructure as Code: All infrastructure configuration is kept under version control
- Observability First: Comprehensive metrics, logging, and health checks
Technology Stack:
- Frontend: Next.js 14+ (App Router), Flutter 3.x
- Backend: .NET 10, ASP.NET Core, MediatR (CQRS)
- Database: Neon PostgreSQL (serverless), Entity Framework Core
- Cache: Redis (StackExchange.Redis)
- Message Broker: MediatR Domain Events (RabbitMQ planned)
- API Gateway: Traefik v3
- Observability: Prometheus, Grafana, Loki, Serilog
System Context
C4Context
title GoodGo Platform System Context Diagram
Person(user, "User", "End users accessing the platform")
Person(admin, "Admin", "System administrators")
Person(developer, "Developer", "Platform developers")
System(platform, "GoodGo Platform", "Microservices platform for business applications")
System_Ext(neon, "Neon PostgreSQL", "Serverless PostgreSQL database")
System_Ext(redis, "Redis", "In-memory cache and session store")
System_Ext(kafka, "Apache Kafka", "Event streaming platform")
System_Ext(monitoring, "Monitoring Stack", "Prometheus + Grafana + Loki + Jaeger")
Rel(user, platform, "Uses", "HTTPS")
Rel(admin, platform, "Manages", "HTTPS")
Rel(developer, platform, "Develops & Deploys", "Git, CI/CD")
Rel(platform, neon, "Stores data", "PostgreSQL Protocol")
Rel(platform, redis, "Caches data", "Redis Protocol")
Rel(platform, kafka, "Publishes/Consumes events", "Kafka Protocol")
Rel(platform, monitoring, "Sends metrics, logs, traces", "HTTP, gRPC")
Components
Frontend Layer
Web App (Next.js)
Description: Web application built with Next.js 14+ and the App Router
Key features:
- Server-side rendering (SSR) and Static Site Generation (SSG)
- API routes for the BFF (Backend for Frontend) pattern
- Optimized image loading with next/image
- Built-in routing and code splitting
Technologies used:
- Next.js 14+, React 18+, TypeScript
- Tailwind CSS, Zustand (state management)
- `@goodgo/http-client`, `@goodgo/types`
File location: apps/web-client/
Mobile App (Flutter)
Description: Cross-platform mobile application built with Flutter
Key features:
- Cross-platform (iOS, Android)
- Native performance
- Provider pattern for state management
- Offline-first with local storage
Technologies used:
- Flutter 3.x, Dart
- Provider, Dio (HTTP client)
File location: apps/mobile-client/
API Gateway Layer
Traefik
Description: Reverse proxy and API gateway that handles routing, load balancing, and SSL termination
Key features:
- Dynamic service discovery
- Automatic HTTPS with Let's Encrypt
- Load balancing and health checks
- Rate limiting and circuit breaker
- Middleware chains (CORS, auth, logging)
Technologies used:
- Traefik 2.x
- Docker labels for dynamic configuration
File location: infra/traefik/
Services Layer
IAM Service (.NET)
Description: Identity and Access Management service that handles authentication and authorization
Key features:
- OAuth2/OpenID Connect with OpenIddict
- JWT authentication (RS256)
- RBAC (Role-Based Access Control)
- ASP.NET Core Identity for user management
- MFA support (TOTP)
Technologies used:
- .NET 10, ASP.NET Core, MediatR
- Entity Framework Core, OpenIddict
- Serilog, FluentValidation
File location: services/iam-service-net/
Deployed Services
| Service | Description | Location |
|---|---|---|
| Storage Service | File storage with MinIO/Aliyun OSS | services/storage-service-net/ |
| Membership Service | Membership and subscription management | services/membership-service-net/ |
| Organization Service | Organization management | services/organization-service-net/ |
| Chat Service | Chat and messaging | services/chat-service-net/ |
| Social Service | Social features | services/social-service-net/ |
| Wallet Service | E-wallet | services/wallet-service-net/ |
Infrastructure Layer
Neon PostgreSQL
Description: Serverless PostgreSQL database with auto-scaling
Key features:
- Serverless with auto-scaling
- Branching for development/staging
- Point-in-time recovery
- Connection pooling
File location: Database schemas live in each service (services/*/prisma/schema.prisma)
Redis
Description: In-memory cache and session store
Key features:
- Multi-layer caching (L1: Memory, L2: Redis)
- Session storage
- Rate limiting counters
- Pub/Sub for real-time features
File location: infra/redis/
Apache Kafka
Description: Event streaming platform for asynchronous communication
Key features:
- Event-driven architecture
- Event sourcing
- Eventual consistency
- Dead letter queue (DLQ)
File location: infra/kafka/
Data Flow
sequenceDiagram
participant Client
participant Traefik as API Gateway
participant Service
participant Cache as Redis
participant DB as PostgreSQL
participant Kafka
Client->>Traefik: HTTPS Request
Traefik->>Traefik: Rate Limiting
Traefik->>Traefik: JWT Validation
Traefik->>Service: Route to Service
Service->>Cache: Check Cache
alt Cache Hit
Cache-->>Service: Return Cached Data
else Cache Miss
Service->>DB: Query Database
DB-->>Service: Return Data
Service->>Cache: Store in Cache (TTL: 5min)
end
Service->>Service: Process Business Logic
Service->>DB: Update Data (if needed)
Service->>Kafka: Publish Event (async)
Service-->>Traefik: Response
Traefik-->>Client: HTTPS Response
Note over Kafka: Event consumers process asynchronously
Detailed explanation:
- Request: The client sends an HTTPS request to Traefik
- Gateway Processing: Traefik performs rate limiting and JWT validation
- Routing: Traefik routes the request to the appropriate service
- Cache Check: The service checks the L1 (memory) cache, then the L2 (Redis) cache
- Database Query: On a cache miss, the service queries PostgreSQL
- Cache Update: The result is stored in the cache with an appropriate TTL
- Business Logic: The service runs its business logic
- Event Publishing: Domain events are published to Kafka (async)
- Response: The response is returned to the client through Traefik
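The cache check, database query, and cache update steps above can be sketched as a two-level read-through lookup. This is a minimal illustration, not the platform's implementation: `TtlCache` and `read_through` are hypothetical names, a plain dictionary stands in for Redis, and the 60-second L1 TTL is an assumption (only the 5-minute Redis TTL appears in the sequence diagram).

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlCache:
    """Tiny TTL cache; stands in for the in-process L1 or for Redis (L2)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (self._clock() + self._ttl, value)


def read_through(key: str, l1: TtlCache, l2: TtlCache,
                 load_from_db: Callable[[str], Any]) -> Tuple[Any, str]:
    """Check L1, then L2, then the database; backfill the faster layers.

    Returns the value and the layer that served it. A stored value of None
    is treated as a miss in this sketch.
    """
    value = l1.get(key)
    if value is not None:
        return value, "l1"
    value = l2.get(key)
    if value is not None:
        l1.set(key, value)
        return value, "l2"
    value = load_from_db(key)
    l2.set(key, value)
    l1.set(key, value)
    return value, "db"


l1_cache = TtlCache(ttl_seconds=60)    # assumed L1 TTL
l2_cache = TtlCache(ttl_seconds=300)   # 5 minutes, as in the diagram
```

Giving L1 a shorter TTL than L2 limits how long an instance can serve stale data from its own memory while still absorbing most repeated reads.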
Database Architecture
erDiagram
User ||--o{ Session : has
User ||--o{ UserRole : has
User ||--o{ UserPermission : has
User ||--o{ MFADevice : has
User ||--o{ AuditEvent : triggers
Role ||--o{ UserRole : assigned_to
Role ||--o{ RolePermission : has
Permission ||--o{ RolePermission : granted_to
Permission ||--o{ UserPermission : granted_to
Organization ||--o{ User : contains
Organization ||--o{ Role : defines
User {
string id PK
string email UK
string passwordHash
string organizationId FK
boolean mfaEnabled
datetime createdAt
datetime updatedAt
}
Session {
string id PK
string userId FK
string refreshTokenHash
string deviceFingerprint
string ipAddress
datetime expiresAt
datetime createdAt
}
Role {
string id PK
string name
string organizationId FK
int hierarchy
datetime createdAt
}
Permission {
string id PK
string resource
string action
string scope
datetime createdAt
}
AuditEvent {
string id PK
string userId FK
string eventType
json eventData
datetime timestamp
}
Description:
- Database per Service: Each service has its own database schema
- Shared Database: A shared Neon PostgreSQL instance is currently used, with schemas isolated by Prisma
- Event Sourcing: Audit events record all important changes
- Soft Delete: A `deletedAt` field is used instead of hard deletes
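As a rough illustration of the soft-delete convention, the sketch below marks a row with a `deletedAt` timestamp and filters it out of normal reads. SQLite is used only so the example is self-contained; the table layout and the function names `soft_delete_user` and `active_user_ids` are assumptions, not the platform's schema or API.

```python
import sqlite3
from datetime import datetime, timezone

# In-memory SQLite stands in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id TEXT PRIMARY KEY, email TEXT NOT NULL, deletedAt TEXT)"
)
conn.executemany(
    "INSERT INTO users (id, email) VALUES (?, ?)",
    [("u1", "a@example.com"), ("u2", "b@example.com")],
)


def soft_delete_user(conn: sqlite3.Connection, user_id: str) -> bool:
    """Set deletedAt instead of removing the row; False if nothing changed."""
    now = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "UPDATE users SET deletedAt = ? WHERE id = ? AND deletedAt IS NULL",
        (now, user_id),
    )
    return cur.rowcount == 1


def active_user_ids(conn: sqlite3.Connection) -> list:
    """Normal reads only see rows that have not been soft-deleted."""
    rows = conn.execute("SELECT id FROM users WHERE deletedAt IS NULL ORDER BY id")
    return [row[0] for row in rows]
```

The row stays in the table after `soft_delete_user`, so it remains available for auditing or recovery; every read path has to remember the `deletedAt IS NULL` filter.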
Design Decisions
Decision 1: Microservices Architecture
Context: Each business domain needs to scale independently and be deployed separately
Decision: Use a microservices architecture with the database-per-service pattern
Consequences:
- ✅ Positive:
  - Each service scales independently on demand
  - Separate deployments reduce release risk
  - Fault isolation: a failure in one service does not take down the whole system
  - Technology flexibility: each service can use a different tech stack
- ❌ Negative:
  - More complex than a monolith (distributed systems challenges)
  - Eventual consistency instead of strong consistency
  - Complex distributed transactions (Saga pattern)
  - Operational overhead (monitoring, deployment)
Alternatives considered: Monolith, Modular Monolith
Decision 2: Traefik as API Gateway
Context: A reverse proxy, load balancing, SSL termination, and service discovery are required
Decision: Use Traefik instead of Kong, NGINX, or AWS API Gateway
Consequences:
- ✅ Positive:
  - Automatic service discovery with Docker labels
  - Dynamic configuration without restarts
  - Built-in Let's Encrypt support
  - Native Kubernetes integration
  - Built-in metrics and tracing
- ❌ Negative:
  - Steeper learning curve than NGINX
  - Smaller plugin ecosystem than Kong
  - Smaller community than NGINX
Alternatives considered: Kong, NGINX, AWS API Gateway, Envoy
Decision 3: Neon PostgreSQL (Serverless)
Context: The database must offer auto-scaling and branching, and be cost-effective for development
Decision: Use Neon PostgreSQL (serverless) instead of self-hosted PostgreSQL or AWS RDS
Consequences:
- ✅ Positive:
  - Auto-scaling based on usage
  - Database branching for dev/staging
  - Pay-per-use pricing model
  - Automatic backups and point-in-time recovery
  - No infrastructure management
- ❌ Negative:
  - Vendor lock-in
  - Cold start latency (mitigated by connection pooling)
  - Limited control over database configuration
Alternatives considered: Self-hosted PostgreSQL, AWS RDS, Google Cloud SQL
Performance Characteristics
| Metric | Target | Notes |
|---|---|---|
| API Response Time (P95) | < 200ms | Excluding external API calls |
| API Response Time (P99) | < 500ms | Peak load scenarios |
| Throughput | 1000 req/s | Per service instance |
| Database Query Time (P95) | < 50ms | Simple queries with indexes |
| Cache Hit Rate (L1) | > 40% | In-memory cache |
| Cache Hit Rate (L2) | > 80% | Redis cache |
| Event Publish Latency (P95) | < 10ms | Kafka fire-and-forget |
| Service Availability | > 99.9% | Monthly uptime target |
| Error Rate | < 1% | 4xx + 5xx errors |
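To make the availability target concrete, the arithmetic below converts a monthly uptime percentage into a downtime budget. It assumes a 30-day month, which the table does not specify; `downtime_budget_minutes` is a name chosen for this sketch.

```python
def downtime_budget_minutes(availability_percent: float, days: int = 30) -> float:
    """Minutes of allowed downtime in a period of `days` days."""
    total_minutes = days * 24 * 60          # 43,200 minutes in a 30-day month
    return total_minutes * (1 - availability_percent / 100)


# 99.9% over 30 days leaves roughly 43.2 minutes of downtime.
budget = downtime_budget_minutes(99.9)
print(f"{budget:.1f} minutes")
```

Each additional nine shrinks the budget tenfold: under the same assumption, 99.99% would leave about 4.3 minutes per month.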
Performance Optimizations:
- Multi-layer caching (L1: Memory, L2: Redis)
- Connection pooling for the database
- Pagination for list endpoints (max 100 items)
- Database indexes for frequently queried fields
- Async event publishing (fire-and-forget)
- CDN for static assets (Next.js)
Security Considerations
Authentication:
- JWT with RS256 (asymmetric signing)
- Access token: 15-minute expiry
- Refresh token: 7-day expiry, rotation on use
- httpOnly cookies for token storage
- MFA support (TOTP, backup codes)
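Refresh-token rotation on use can be sketched as follows: every refresh consumes the presented token and issues a new one, so a token can never be replayed. `RefreshTokenStore` is a hypothetical in-memory stand-in for the session table. For brevity it keeps raw tokens, whereas the Data Protection list in this document calls for storing only a hash.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple

REFRESH_TTL_SECONDS = 7 * 24 * 60 * 60  # 7-day expiry from the list above


class RefreshTokenStore:
    """In-memory stand-in for persisted sessions (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock
        self._tokens: Dict[str, Tuple[str, float]] = {}  # token -> (user_id, expires_at)

    def issue(self, user_id: str) -> str:
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (user_id, self._clock() + REFRESH_TTL_SECONDS)
        return token

    def rotate(self, token: str) -> Optional[str]:
        """Consume a refresh token and return its replacement, or None if invalid."""
        entry = self._tokens.pop(token, None)  # single use: removed even if expired
        if entry is None:
            return None
        user_id, expires_at = entry
        if self._clock() >= expires_at:
            return None
        return self.issue(user_id)
```

A caller would exchange the old token with `store.rotate(old_token)` and treat a `None` result as a signal to force re-authentication.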
Authorization:
- RBAC (Role-Based Access Control)
- ABAC (Attribute-Based Access Control)
- Permission format: `resource:action:scope`
- Permission caching (5 min TTL)
- Zero-trust device validation
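A permission check against the `resource:action:scope` format might look like the sketch below. The document does not define matching rules, so treating `*` as a wildcard segment is purely an assumption here, as are the names `parse_permission` and `is_allowed`.

```python
from typing import Iterable, NamedTuple


class Permission(NamedTuple):
    resource: str
    action: str
    scope: str


def parse_permission(text: str) -> Permission:
    """Split 'resource:action:scope' into its three non-empty segments."""
    parts = text.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected resource:action:scope, got {text!r}")
    return Permission(*parts)


def is_allowed(granted: Iterable[str], required: str) -> bool:
    """True if any granted permission covers the required one.

    Assumption of this sketch: a '*' segment in a grant matches anything.
    """
    need = parse_permission(required)
    for text in granted:
        have = parse_permission(text)
        if all(h in ("*", n) for h, n in zip(have, need)):
            return True
    return False


grants = ["user:read:own", "order:*:org"]
print(is_allowed(grants, "order:delete:org"))  # covered by the wildcard action
```

Parsing into a named tuple keeps the three segments explicit, which makes it harder to confuse scope with action when rules grow more complex.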
Network Security:
- TLS 1.2+ enforcement
- HTTPS-only (HSTS headers)
- Rate limiting: 100 req/15min (standard), 10 req/hour (strict)
- CORS whitelist from environment variables
- Network policies (Kubernetes)
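The rate limits above (100 requests per 15 minutes, 10 per hour) can be illustrated with a fixed-window counter. The document does not say which algorithm is used, so fixed windows are an assumption, and `FixedWindowLimiter` is a hypothetical name. In practice the counters would more likely live in Redis or in the gateway than in process memory.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Counts requests per client within fixed, aligned time windows."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        # (client_id, window index) -> request count; old windows are never
        # evicted in this sketch, a real store would expire them.
        self._counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, client_id: str) -> bool:
        window_index = int(self._clock() // self.window)
        key = (client_id, window_index)
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True


standard = FixedWindowLimiter(limit=100, window_seconds=15 * 60)
strict = FixedWindowLimiter(limit=10, window_seconds=60 * 60)
```

Fixed windows are simple but allow a burst of up to twice the limit across a window boundary; sliding-window or token-bucket variants smooth that out at the cost of more state.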
Data Protection:
- AES-256-GCM encryption for PII at rest
- bcrypt (cost 12) for password hashing
- SHA-256 hashing for tokens before storage
- Database encryption at rest (Neon)
- TLS in transit for all connections
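Hashing tokens before storage means the database only ever holds a digest, so a leaked table cannot be replayed as credentials. A minimal sketch using only the standard library follows; the function names are illustrative.

```python
import hashlib
import hmac
import secrets


def new_token() -> str:
    """Generate a high-entropy random token to hand to the client."""
    return secrets.token_urlsafe(32)


def hash_token(token: str) -> str:
    """Hex SHA-256 digest; this is what gets persisted, never the raw token."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def verify_token(presented: str, stored_hash: str) -> bool:
    """Compare in constant time to avoid leaking digest prefixes via timing."""
    return hmac.compare_digest(hash_token(presented), stored_hash)


token = new_token()
stored = hash_token(token)
print(verify_token(token, stored))
```

An unsalted fast hash is acceptable here only because the tokens are long random strings; passwords are low-entropy and need a slow, salted function, which is why the list pairs them with bcrypt instead.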
Secrets Management:
- Kubernetes secrets for production
- Environment variable validation with Zod
- No hardcoded secrets in code
- Quarterly secret rotation
Audit Trail:
- Event sourcing for all auth events
- 7-year retention for compliance
- Immutable audit logs
- Correlation IDs for request tracing
Deployment
graph TD
subgraph "Kubernetes Cluster"
subgraph "Ingress"
LB[Load Balancer<br/>External IP]
Traefik[Traefik Pods<br/>Replicas: 2]
end
subgraph "Services"
IAM[IAM Service Pods<br/>Replicas: 2-10 HPA]
Service1[Service 1 Pods<br/>Replicas: 2-10 HPA]
Service2[Service 2 Pods<br/>Replicas: 2-10 HPA]
end
subgraph "Infrastructure"
Redis[Redis Cluster<br/>3 Masters + 3 Slaves]
Kafka[Kafka Cluster<br/>3 Brokers]
end
subgraph "Observability"
Prom[Prometheus<br/>Replicas: 2]
Loki[Loki<br/>Replicas: 2]
Jaeger[Jaeger<br/>Replicas: 2]
Grafana[Grafana<br/>Replicas: 2]
end
end
subgraph "External"
DB[(Neon PostgreSQL<br/>Serverless)]
end
LB --> Traefik
Traefik --> IAM
Traefik --> Service1
Traefik --> Service2
IAM --> Redis
IAM --> Kafka
IAM --> DB
Service1 --> Redis
Service1 --> Kafka
Service1 --> DB
Service2 --> Redis
Service2 --> Kafka
Service2 --> DB
IAM -.->|metrics| Prom
Service1 -.->|metrics| Prom
Service2 -.->|metrics| Prom
IAM -.->|logs| Loki
Service1 -.->|logs| Loki
Service2 -.->|logs| Loki
IAM -.->|traces| Jaeger
Service1 -.->|traces| Jaeger
Service2 -.->|traces| Jaeger
Prom --> Grafana
Loki --> Grafana
Jaeger --> Grafana
style LB fill:#1565c0,stroke:#fff,stroke-width:2px,color:#fff
style Traefik fill:#0f4c81,stroke:#fff,stroke-width:2px,color:#fff
style IAM fill:#283593,stroke:#fff,stroke-width:2px,color:#fff
style Service1 fill:#4527a0,stroke:#fff,stroke-width:2px,color:#fff
style Service2 fill:#4527a0,stroke:#fff,stroke-width:2px,color:#fff
style DB fill:#5e35b1,stroke:#fff,stroke-width:2px,color:#fff
style Redis fill:#ef6c00,stroke:#fff,stroke-width:2px,color:#fff
style Kafka fill:#2e7d32,stroke:#fff,stroke-width:2px,color:#fff
style Prom fill:#c62828,stroke:#fff,stroke-width:2px,color:#fff
style Loki fill:#d84315,stroke:#fff,stroke-width:2px,color:#fff
style Jaeger fill:#e65100,stroke:#fff,stroke-width:2px,color:#fff
style Grafana fill:#b71c1c,stroke:#fff,stroke-width:2px,color:#fff
Deployment Strategy
Deployment Strategy:
- Rolling updates (maxSurge: 1, maxUnavailable: 0)
- Zero-downtime deployments
- Blue-green deployment for major releases
- Canary deployment for high-risk changes
Auto-scaling:
- Horizontal Pod Autoscaler (HPA)
- Min replicas: 2
- Max replicas: 10
- Target CPU: 70%
- Target Memory: 80%
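The replica count implied by these HPA settings can be estimated with the scaling rule from the Kubernetes documentation, `desired = ceil(current * currentMetric / targetMetric)`, taking the largest proposal across metrics and clamping it to the min/max bounds. The sketch below is a simplification: it ignores the controller's tolerance band and stabilization windows, and `desired_replicas` is a name chosen here.

```python
import math
from typing import Dict

TARGETS = {"cpu": 70, "memory": 80}  # target utilization in percent, from the list above


def desired_replicas(current_replicas: int, utilization: Dict[str, float],
                     targets: Dict[str, float] = TARGETS,
                     min_replicas: int = 2, max_replicas: int = 10) -> int:
    """Largest per-metric proposal, clamped to [min_replicas, max_replicas]."""
    proposals = [
        math.ceil(current_replicas * utilization[name] / target)
        for name, target in targets.items()
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))


# 4 pods at 90% CPU and 60% memory: CPU proposes 6, memory proposes 3.
print(desired_replicas(4, {"cpu": 90, "memory": 60}))  # 6
```

Because the largest proposal wins, a single hot metric is enough to trigger a scale-up even when the other is well under target.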
Resource Allocation:
| Service | Requests | Limits |
|---|---|---|
| Microservices | 256Mi RAM, 250m CPU | 512Mi RAM, 500m CPU |
| Traefik | 512Mi RAM, 500m CPU | 1Gi RAM, 1000m CPU |
| Redis | 2Gi RAM, 1 CPU | 4Gi RAM, 2 CPU |
| Prometheus | 4Gi RAM, 2 CPU | 8Gi RAM, 4 CPU |
Health Checks:
- Liveness probe: `/health/live` (K8s restarts the pod if it fails)
- Readiness probe: `/health/ready` (K8s removes the pod from the LB if it fails)
- Startup probe: `/health/live` (initial delay 30s)
Environments:
- Local: Docker Compose
- Staging: Kubernetes cluster (shared)
- Production: Kubernetes cluster (dedicated)
Monitoring & Observability
Key Metrics
Application Metrics:
- `http_requests_total` - Total HTTP requests (counter)
- `http_request_duration_seconds` - Request duration (histogram)
- `http_requests_active` - Active requests (gauge)
- `cache_hits_total` / `cache_misses_total` - Cache performance
- `db_query_duration_seconds` - Database query duration
Infrastructure Metrics:
- CPU usage, Memory usage per pod
- Network I/O, Disk I/O
- Pod restart count
- Node resource utilization
Business Metrics:
- User registrations per day
- Login success/failure rate
- API usage by endpoint
- Error rate by service
Health Checks:
- `/health/live` - Liveness probe (service running?)
- `/health/ready` - Readiness probe (ready for traffic?)
- `/metrics` - Prometheus metrics endpoint
Alerting Rules:
# High error rate: more than 0.05 5xx responses per second, per series
- alert: HighErrorRate
  expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
  for: 2m
  labels:
    severity: warning
# High latency: P95 request duration above 500ms over the last 5 minutes
- alert: HighLatency
  expr: histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 0.5
  for: 5m
  labels:
    severity: warning
# Service down: the scrape target is unreachable
- alert: ServiceDown
  expr: up == 0
  for: 1m
  labels:
    severity: critical
# High memory usage: container above 85% of its memory limit
- alert: HighMemoryUsage
  expr: container_memory_usage_bytes / container_spec_memory_limit_bytes > 0.85
  for: 5m
  labels:
    severity: warning
Logging:
- Structured JSON logging with Winston
- Correlation IDs for request tracing
- Log levels: error, warn, info, debug
- Log aggregation with Loki
- 7 days retention
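Structured JSON logging with a correlation ID can be sketched with the standard `logging` module. This only illustrates the shape of the log lines; the field names and the helpers `JsonFormatter` and `build_logger` are assumptions, and the services themselves would use their own logging library.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname.lower(),
            "message": record.getMessage(),
            "logger": record.name,
            # Supplied per call via `extra=`; None when the caller omits it.
            "correlation_id": getattr(record, "correlation_id", None),
        }
        return json.dumps(payload)


def build_logger(name: str, stream) -> logging.Logger:
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


buffer = io.StringIO()
log = build_logger("iam-service", buffer)
log.info("user logged in", extra={"correlation_id": "req-123"})
print(buffer.getvalue(), end="")
```

Carrying the same `correlation_id` on every line a request produces is what lets a log aggregator such as Loki stitch together one request's activity across services.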
Distributed Tracing:
- OpenTelemetry instrumentation
- Jaeger backend
- Trace sampling: 10% in production, 100% in staging
- Span attributes: service, operation, user_id, correlation_id
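The 10% / 100% sampling split can be illustrated with a deterministic, trace-ID-based decision: hashing the trace ID means every service reaches the same verdict for the same trace. This shows the idea only and is not OpenTelemetry's sampler implementation; `should_sample` and `SAMPLING_RATES` are names made up for the sketch.

```python
import hashlib

SAMPLING_RATES = {"production": 0.10, "staging": 1.0}  # from the list above


def should_sample(trace_id: str, rate: float) -> bool:
    """Deterministic head sampling: the same trace ID always gets the same answer."""
    if rate >= 1.0:
        return True
    if rate <= 0.0:
        return False
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # roughly uniform in [0, 1)
    return bucket < rate


trace_ids = [f"trace-{i}" for i in range(10_000)]
kept = sum(should_sample(t, SAMPLING_RATES["production"]) for t in trace_ids)
print(f"kept {kept} of {len(trace_ids)} traces")
```

Deciding from the trace ID rather than a random draw avoids partial traces, where one service records its spans and the next one drops them.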
Related Documents
- Event-Driven Architecture
- Caching Architecture - Caching strategy
- Security Architecture
- Observability Architecture
- Data Consistency Patterns
- Microservices Communication
References
- Microservices Patterns - Microservices pattern catalog
- Twelve-Factor App - Best practices for cloud-native apps
- C4 Model - Software architecture diagrams
- Kubernetes Documentation - Kubernetes official docs
- Traefik Documentation - Traefik official docs
Last updated: 2026-01-14
Author: GoodGo Architecture Team
Reviewer: GoodGo Development Team
Quick Tips
Mermaid Common Issues
- Arrow Syntax: Use `-->` for solid arrows, `-.->` for dotted arrows.
- Node IDs: Avoid spaces/special chars in IDs (e.g., `Node-A` not `Node A`).
- Subgraphs: Ensure `subgraph` names are unique and descriptive.
Color Pattern Quick Reference
| Element | Dark Color | Text Color |
|---|---|---|
| Blue (Primary) | `#0f4c81` | `#ffffff` |
| Purple (DB) | `#5e35b1` | `#ffffff` |
| Orange (Cache) | `#ef6c00` | `#ffffff` |
| Green (Success) | `#2e7d32` | `#ffffff` |
| Red (Alert) | `#c62828` | `#ffffff` |
Visual Indicators
- ✅ Recommended
- ❌ Not recommended
- ⚠️ Warning