Kiến trúc Bảo mật / Security Architecture
VI: Kiến trúc bảo mật toàn diện cho nền tảng GoodGo với mô hình zero-trust, RBAC và compliance
EN: Comprehensive security architecture for the GoodGo platform with a zero-trust model, RBAC, and compliance
Sơ đồ Tổng quan / Overview Diagram
graph TD
Request[Client Request] --> TLS[TLS/HTTPS Layer]
TLS --> RateLimit[Rate Limiting]
RateLimit --> JWT[JWT Validation]
JWT --> RBAC[RBAC Authorization]
RBAC --> ZeroTrust[Zero-Trust Checks]
ZeroTrust --> Service[Service Logic]
Service --> Encrypt[Data Encryption<br/>AES-256-GCM]
Encrypt --> DB[(Encrypted Data)]
Service --> Audit[Audit Logging]
Audit --> AuditDB[(Audit Trail<br/>7-year retention)]
style TLS fill:#15803d,stroke:#fff,stroke-width:2px,color:#fff
style JWT fill:#1d4ed8,stroke:#fff,stroke-width:2px,color:#fff
style Encrypt fill:#b91c1c,stroke:#fff,stroke-width:2px,color:#fff
style Audit fill:#c2410c,stroke:#fff,stroke-width:2px,color:#fff
Mô tả Kiến trúc / Architecture Description
VI: Phần Tiếng Việt
Kiến trúc Bảo mật GoodGo triển khai defense-in-depth với nhiều tầng bảo mật:
Nguyên tắc Bảo mật:
- Zero Trust: Không bao giờ tin tưởng, luôn xác minh
- Least Privilege: Quyền tối thiểu cần thiết
- Defense in Depth: Nhiều tầng bảo mật
- Audit Everything: Audit trail hoàn chỉnh
- Encryption: Mã hóa dữ liệu at rest và in transit
Thành phần Chính:
- ASP.NET Core Identity (User Management)
- OpenIddict (OAuth2/OIDC Server)
- JWT Authentication (15min access, 7 ngày refresh)
- RBAC Authorization
- MFA Support (TOTP)
- Compliance (GDPR, SOC2, ISO27001)
EN: English Section
The GoodGo Security Architecture implements defense-in-depth with multiple security layers:
Security Principles:
- Zero Trust: Never trust, always verify
- Least Privilege: Minimum required permissions
- Defense in Depth: Multiple security layers
- Audit Everything: Complete audit trail
- Encryption: Data encrypted at rest and in transit
Key Components:
- ASP.NET Core Identity (User Management)
- OpenIddict (OAuth2/OIDC Server)
- JWT Authentication (15min access, 7d refresh)
- RBAC Authorization
- MFA Support (TOTP)
- Compliance (GDPR, SOC2, ISO27001)
Luồng Xác thực / Authentication Flow
%%{init: {'theme': 'dark'}}%%
sequenceDiagram
participant Client
participant API as API Gateway
participant IAM as IAM Service
participant DB as Database
participant Cache as Redis
Client->>API: Login Request<br/>(email + password)
API->>IAM: Forward Request
IAM->>DB: Verify Credentials
DB-->>IAM: User + Hash
IAM->>IAM: bcrypt.compare()<br/>(cost 12)
alt Valid Credentials
IAM->>IAM: Generate Tokens<br/>(Access + Refresh)
IAM->>DB: Store Refresh Token<br/>(hashed SHA-256)
IAM->>Cache: Cache Permissions<br/>(5min TTL)
IAM-->>API: Tokens + User
API-->>Client: Set httpOnly Cookies
else Invalid
IAM-->>Client: 401 Unauthorized
end
VI: Chi tiết Xác thực
1. Password Hashing:
- Thuật toán: ASP.NET Core Identity (PBKDF2 với HMAC-SHA256)
- Cost factor: 100,000 iterations
- Password tối thiểu: 8 ký tự với quy tắc phức tạp
2. JWT Tokens (OpenIddict):
- Access Token: 15 phút expiry
- Refresh Token: 7 ngày expiry
- Thuật toán: RS256 (asymmetric signing)
- Payload: sub, name, email, roles
3. Token Storage:
- Access: Bearer token trong Authorization header
- Refresh: Database SHA-256 hash (OpenIddict stores)
4. MFA Support (Xác thực Hai yếu tố):
- TOTP (RFC 6238) cho authenticator apps
- QR code để thiết lập (Google Authenticator, Authy)
- Recovery codes (10 mã dùng một lần)
- Secret key lưu qua UserManager.SetAuthenticationTokenAsync
5. Email Verification (Xác minh Email):
- Gửi email xác minh qua SMTP (MailKit)
- Token generation: UserManager.GenerateEmailConfirmationTokenAsync
- Link xác minh với token và userId
- Đặt EmailConfirmed = true khi xác nhận
6. Social Login (Đăng nhập Mạng xã hội):
- Tích hợp Google OAuth 2.0
- Tích hợp Facebook OAuth
- Liên kết tài khoản cho users hiện có (theo email)
- Tự động xác nhận email cho social logins
- Lưu provider info qua UserManager.AddLoginAsync
EN: Authentication Details
1. Password Hashing:
- Algorithm: bcrypt with cost factor 12
- Never store plaintext passwords
- Minimum password: 8 chars with complexity rules
2. JWT Tokens:
- Access Token: 15 minutes expiry
- Refresh Token: 7 days expiry
- Algorithm: RS256 (asymmetric signing)
- Payload: userId, roles, permissions
3. Token Storage:
- Access: httpOnly cookie (secure, sameSite)
- Refresh: Database SHA-256 hash
- Rotation: New refresh token on each use
4. MFA Support (Two-Factor Authentication):
- TOTP (Time-based One-Time Password) using RFC 6238
- QR code generation for authenticator apps (Google Authenticator, Authy)
- Recovery codes (10 single-use codes)
- Secret key storage: UserManager.SetAuthenticationTokenAsync
5. Email Verification:
- SMTP-based verification emails via MailKit
- Token generation using UserManager.GenerateEmailConfirmationTokenAsync
- Verification link with token and userId
- EmailConfirmed flag set true upon confirmation
6. Social Login (OAuth2 Providers):
- Google OAuth 2.0 integration
- Facebook OAuth integration
- Account linking for existing users (by email match)
- Auto email confirmation for social logins
- Provider info stored via UserManager.AddLoginAsync
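The TOTP step listed under MFA Support can be illustrated with a minimal sketch of RFC 6238 on top of Node's built-in crypto module. The function names (`hotp`, `totp`, `verifyTotp`) and the one-step drift window are assumptions made for illustration; the platform's actual verification path through ASP.NET Core Identity is not shown here.

```typescript
import { createHmac } from "node:crypto";

// HOTP (RFC 4226): HMAC-SHA1 over an 8-byte big-endian counter, then dynamic truncation.
export function hotp(secret: Buffer, counter: number, digits: number = 6): string {
  const message = Buffer.alloc(8);
  message.writeUInt32BE(Math.floor(counter / 2 ** 32), 0); // high 32 bits
  message.writeUInt32BE(counter >>> 0, 4); // low 32 bits
  const mac = createHmac("sha1", secret).update(message).digest();
  const offset = mac[mac.length - 1]! & 0x0f; // low nibble of the last byte picks the window
  const binary = mac.readUInt32BE(offset) & 0x7fffffff; // drop the sign bit
  return (binary % 10 ** digits).toString().padStart(digits, "0");
}

// TOTP (RFC 6238): HOTP where the counter is the number of elapsed time steps.
export function totp(
  secret: Buffer,
  unixSeconds: number,
  stepSeconds: number = 30,
  digits: number = 6,
): string {
  return hotp(secret, Math.floor(unixSeconds / stepSeconds), digits);
}

// Accept the current step plus/minus `window` steps to tolerate clock drift.
// The window size is a hypothetical policy choice, not taken from the document.
export function verifyTotp(
  secret: Buffer,
  code: string,
  unixSeconds: number,
  window: number = 1,
  stepSeconds: number = 30,
): boolean {
  for (let i = -window; i <= window; i++) {
    if (totp(secret, unixSeconds + i * stepSeconds, stepSeconds, code.length) === code) {
      return true;
    }
  }
  return false;
}
```

A production check would also compare codes in constant time and reject a code that has already been used within its time step.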
Mô hình Phân quyền / Authorization Model
graph TD
User[User] --> Roles[Roles]
User --> DirectPerms[Direct Permissions]
Roles --> RolePerms[Role Permissions]
RolePerms --> Check{Permission Check}
DirectPerms --> Check
Check -->|Granted| Resource[Access Resource]
Check -->|Denied| Reject[403 Forbidden]
subgraph "Permission Model"
Perm[Permission<br/>resource:action:scope]
end
style Check fill:#1d4ed8,stroke:#fff,stroke-width:2px,color:#fff
style Perm fill:#c2410c,stroke:#fff,stroke-width:2px,color:#fff
VI: RBAC (Role-Based Access Control)
1. Cấp bậc Role:
SuperAdmin > OrgAdmin > Manager > User > Guest
2. Format Permission: resource:action:scope
- Resource: users, roles, permissions
- Action: create, read, update, delete
- Scope: own, org, global
Ví dụ:
- users:read:own - Đọc profile của chính mình
- users:update:org - Update users trong organization
- roles:create:global - Tạo roles globally
3. Permission Caching:
Cache key: user:{userId}:permissions
TTL: 5 phút
Invalidate khi: role change, permission change
EN: RBAC (Role-Based Access Control)
1. Role Hierarchy:
SuperAdmin > OrgAdmin > Manager > User > Guest
2. Permission Format: resource:action:scope
- Resource: users, roles, permissions
- Action: create, read, update, delete
- Scope: own, org, global
Examples:
- users:read:own - Read own user profile
- users:update:org - Update users in organization
- roles:create:global - Create roles globally
3. Permission Caching:
Cache key: user:{userId}:permissions
TTL: 5 minutes
Invalidate on: role change, permission change
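The resource:action:scope format and the union of role and direct permissions can be sketched as follows. This is an illustration only: the helper names are hypothetical, and treating a broader scope (global) as covering a narrower one (org, own) is an assumption of the sketch rather than a rule stated in this document.

```typescript
type Scope = "own" | "org" | "global";

interface Permission {
  resource: string;
  action: string;
  scope: Scope;
}

// Ordered from narrowest to broadest (assumed ordering, see lead-in).
const SCOPES: readonly Scope[] = ["own", "org", "global"];

export function parsePermission(value: string): Permission {
  const parts = value.split(":");
  const [resource = "", action = "", scope = ""] = parts;
  if (parts.length !== 3 || !resource || !action || !SCOPES.includes(scope as Scope)) {
    throw new Error(`Invalid permission: ${value}`);
  }
  return { resource, action, scope: scope as Scope };
}

// Effective permissions = permissions from all roles plus direct grants, de-duplicated.
export function effectivePermissions(rolePermissions: string[][], direct: string[]): string[] {
  const all = ([] as string[]).concat(...rolePermissions, direct);
  return Array.from(new Set(all));
}

// Granted when resource and action match and the granted scope is at least as broad.
export function hasPermission(granted: string[], required: string): boolean {
  const need = parsePermission(required);
  return granted.map(parsePermission).some(
    (have) =>
      have.resource === need.resource &&
      have.action === need.action &&
      SCOPES.indexOf(have.scope) >= SCOPES.indexOf(need.scope),
  );
}
```

A caller would typically map a `false` result to the 403 Forbidden branch of the diagram above. The `own` and `org` scopes still need an ownership or organization check against the target resource, which this sketch leaves out.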
Kiến trúc Zero-Trust / Zero-Trust Architecture
graph TD
Request[Request] --> Device[Device Fingerprint]
Device --> IP[IP Address Validation]
IP --> Behavior[Behavioral Analysis]
Behavior --> Session[Session Binding]
Session -->|Valid| Allow[Allow Request]
Session -->|Suspicious| MFA[Require MFA]
Session -->|Anomaly| Block[Block + Alert]
style Block fill:#b91c1c,stroke:#fff,stroke-width:2px,color:#fff
style MFA fill:#c2410c,stroke:#fff,stroke-width:2px,color:#fff
style Allow fill:#15803d,stroke:#fff,stroke-width:2px,color:#fff
VI: Thành phần Zero-Trust
1. Device Fingerprinting:
- Browser: User-Agent, Canvas, WebGL
- Screen resolution, timezone, language
- Phát hiện plugin, fonts có sẵn
- Hash fingerprint → Lưu với session
2. IP Address Validation:
- Whitelist IPs đã biết cho user
- Alert với IP mới + require MFA
- Block IPs đáng ngờ (VPN, Tor)
3. Behavioral Analysis:
- Login patterns (time, location)
- API usage patterns
- Failed auth attempts
- Alert với anomalies
4. Session Binding:
- Bind session với device fingerprint
- Bind session với IP address
- Invalidate khi mismatch
EN: Zero-Trust Components
1. Device Fingerprinting:
- Browser: User-Agent, Canvas, WebGL
- Screen resolution, timezone, language
- Plugin detection, fonts available
- Hash fingerprint → Store with session
2. IP Address Validation:
- Whitelist known IPs per user
- Alert on new IP + require MFA
- Block suspicious IPs (VPN, Tor)
3. Behavioral Analysis:
- Login patterns (time, location)
- API usage patterns
- Failed auth attempts
- Alert on anomalies
4. Session Binding:
- Bind session to device fingerprint
- Bind session to IP address
- Invalidate on mismatch
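The four checks above can be combined into a single decision function. The sketch below is one possible arrangement under stated assumptions: the signal set is reduced to four fields, and the mapping of outcomes (fingerprint mismatch blocks, a changed or unknown IP requires MFA) is a hypothetical policy chosen to mirror the diagram, not the platform's confirmed rules.

```typescript
import { createHash } from "node:crypto";

type Decision = "allow" | "require_mfa" | "block";

interface DeviceSignals {
  userAgent: string;
  screen: string;
  timezone: string;
  language: string;
}

interface SessionBinding {
  fingerprintHash: string; // stored with the session at login
  ipAddress: string;
}

interface RequestContext {
  signals: DeviceSignals;
  ipAddress: string;
  knownIps: string[]; // IPs previously seen for this user
  blockedIps: string[]; // e.g. flagged VPN or Tor exit nodes
}

// Hash the collected signals so only a digest is stored with the session.
export function hashFingerprint(s: DeviceSignals): string {
  const canonical = JSON.stringify([s.userAgent, s.screen, s.timezone, s.language]);
  return createHash("sha256").update(canonical).digest("hex");
}

export function evaluateRequest(session: SessionBinding, request: RequestContext): Decision {
  if (request.blockedIps.includes(request.ipAddress)) return "block";
  // Session binding: a different device invalidates the session.
  if (hashFingerprint(request.signals) !== session.fingerprintHash) return "block";
  const ipChanged = request.ipAddress !== session.ipAddress;
  const ipUnknown = !request.knownIps.includes(request.ipAddress);
  if (ipChanged || ipUnknown) return "require_mfa";
  return "allow";
}
```

Behavioral analysis is omitted here because it depends on historical data; in a fuller version it would feed an additional signal into the same decision.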
Bảo vệ Dữ liệu / Data Protection
VI: Chiến lược Mã hóa
1. Data at Rest:
- PII: AES-256-GCM encryption
- Passwords: bcrypt (cost 12)
- Tokens: SHA-256 hash
- Keys: Environment variables + K8s secrets
2. Data in Transit:
- TLS 1.2+ cho mọi giao tiếp
- HTTPS enforcement
- Certificate pinning (mobile clients)
3. Key Management:
- Unique key per encryption operation
- 32+ character ENCRYPTION_KEY
- Rotate keys hàng quý / quarterly
- Không bao giờ hardcode secrets
EN: Encryption Strategy
1. Data at Rest:
- PII: AES-256-GCM encryption
- Passwords: bcrypt (cost 12)
- Tokens: SHA-256 hash
- Keys: Environment variables + K8s secrets
2. Data in Transit:
- TLS 1.2+ for all communications
- HTTPS enforcement
- Certificate pinning (mobile clients)
3. Key Management:
- Unique key per encryption operation
- 32+ character ENCRYPTION_KEY
- Rotate keys quarterly
- Never hardcode secrets
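As a rough illustration of the PII encryption bullets, the sketch below uses Node's crypto module with AES-256-GCM. It assumes that "unique key per encryption operation" is achieved by deriving a fresh key from ENCRYPTION_KEY with a random salt via scrypt; that derivation, the payload layout, and the function names are assumptions, not details confirmed by this document.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Encrypts a value and returns "salt.iv.tag.ciphertext", each part base64-encoded.
export function encryptPii(plaintext: string, encryptionKey: string): string {
  const salt = randomBytes(16); // fresh salt => fresh derived key per operation
  const key = scryptSync(encryptionKey, salt, 32); // 32 bytes = AES-256
  const iv = randomBytes(12); // 96-bit nonce, the recommended size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // integrity tag checked on decrypt
  return [salt, iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Reverses encryptPii; throws if the key is wrong or the payload was tampered with.
export function decryptPii(payload: string, encryptionKey: string): string {
  const parts = payload.split(".").map((part) => Buffer.from(part, "base64"));
  if (parts.length !== 4) throw new Error("Malformed payload");
  const [salt, iv, tag, ciphertext] = parts as [Buffer, Buffer, Buffer, Buffer];
  const key = scryptSync(encryptionKey, salt, 32);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

In a deployment the second argument would come from the environment (for example `process.env.ENCRYPTION_KEY`, injected from a Kubernetes secret) rather than from source code.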
Tuân thủ & Kiểm toán / Compliance & Audit
VI: Yêu cầu Tuân thủ
1. GDPR:
- Right to erasure (soft delete + hard delete sau 90 ngày)
- Data portability (export dữ liệu user)
- Quản lý consent
- Thông báo breach (72 giờ)
2. SOC2:
- Access controls (RBAC)
- Encryption at rest và in transit
- Audit logging (7 năm retention)
- Incident response plan
3. Audit Trail:
{
  eventType: 'auth.login.success',
  userId: 'user_123',
  timestamp: '2024-01-15T10:30:00Z',
  ipAddress: '192.168.1.1',
  deviceFingerprint: 'fp_xyz',
  metadata: {...}
}
EN: Compliance Requirements
1. GDPR:
- Right to erasure (soft delete + hard delete after 90 days)
- Data portability (export user data)
- Consent management
- Breach notification (72 hours)
2. SOC2:
- Access controls (RBAC)
- Encryption at rest and in transit
- Audit logging (7-year retention)
- Incident response plan
3. Audit Trail:
{
  eventType: 'auth.login.success',
  userId: 'user_123',
  timestamp: '2024-01-15T10:30:00Z',
  ipAddress: '192.168.1.1',
  deviceFingerprint: 'fp_xyz',
  metadata: {...}
}
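The right-to-erasure rule (soft delete first, hard delete after 90 days) reduces to a date comparison. The sketch below shows that selection logic only; the record shape and function names are hypothetical, and the actual purge job is not described in this document.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
const HARD_DELETE_AFTER_DAYS = 90;

interface UserRecord {
  id: string;
  email: string;
  deletedAt: Date | null; // null means the account is active
}

// Soft delete: mark the record instead of removing it.
export function softDelete(user: UserRecord, now: Date): UserRecord {
  return { ...user, deletedAt: now };
}

// Select records whose soft delete is at least 90 days old and can now be purged.
export function dueForHardDelete(users: UserRecord[], now: Date): UserRecord[] {
  const cutoff = now.getTime() - HARD_DELETE_AFTER_DAYS * DAY_MS;
  return users.filter((u) => u.deletedAt !== null && u.deletedAt.getTime() <= cutoff);
}
```

A scheduled job could call `dueForHardDelete` once a day and remove or anonymize the returned records.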
Bối cảnh Hệ thống / System Context
%%{init: {'theme': 'dark'}}%%
C4Context
title Sơ đồ Bối cảnh Security Architecture
Person(user, "Người dùng / User", "End user accessing platform")
Person(admin, "Quản trị viên / Admin", "System administrator")
Person(attacker, "Kẻ tấn công / Attacker", "Potential threat actor")
System(iam, "IAM Service", "Authentication & Authorization")
System_Ext(db, "Neon PostgreSQL", "Encrypted user credentials & sessions")
System_Ext(cache, "Redis", "Permission & session cache")
System_Ext(audit, "Audit Service", "Security event logging")
System_Ext(mfa, "MFA Provider", "TOTP verification")
System_Ext(monitoring, "Security Monitoring", "SIEM & alerting")
Rel(user, iam, "Authenticates", "HTTPS + TLS 1.2+")
Rel(admin, iam, "Manages permissions", "HTTPS + TLS 1.2+")
Rel(attacker, iam, "Blocked by security layers", "")
Rel(iam, db, "Stores credentials", "PostgreSQL + TLS")
Rel(iam, cache, "Caches permissions", "Redis + TLS")
Rel(iam, audit, "Logs security events", "Kafka")
Rel(iam, mfa, "Verifies MFA", "HTTPS")
Rel(iam, monitoring, "Sends security metrics", "Prometheus + Loki")
VI Mô tả:
- IAM Service: Trung tâm xác thực và phân quyền
- Database: Lưu trữ credentials đã mã hóa, sessions, permissions
- Cache: Cache permissions và sessions để giảm database load
- Audit Service: Nhận và lưu trữ tất cả security events
- MFA Provider: External TOTP verification service (Google Authenticator compatible)
- Security Monitoring: SIEM (Security Information and Event Management) và alerting
EN Description:
- IAM Service: Central authentication and authorization
- Database: Stores encrypted credentials, sessions, permissions
- Cache: Caches permissions and sessions to reduce database load
- Audit Service: Receives and stores all security events
- MFA Provider: External TOTP verification service (Google Authenticator compatible)
- Security Monitoring: SIEM (Security Information and Event Management) and alerting
Kiến trúc Database / Database Architecture
%%{init: {'theme': 'dark'}}%%
erDiagram
User ||--o{ Session : has
User ||--o{ UserRole : has
User ||--o{ UserPermission : has
User ||--o{ MFADevice : has
User ||--o{ LoginHistory : has
User ||--o{ DeviceFingerprint : has
Role ||--o{ UserRole : assigned_to
Role ||--o{ RolePermission : has
Permission ||--o{ RolePermission : granted_to
Permission ||--o{ UserPermission : granted_to
Organization ||--o{ User : contains
Organization ||--o{ Role : defines
User {
string id PK "CUID"
string email UK "Unique, indexed"
string passwordHash "bcrypt cost 12"
string organizationId FK
boolean mfaEnabled "MFA required?"
datetime lastLoginAt "Tracking"
datetime createdAt "Timestamp"
datetime updatedAt "Timestamp"
datetime deletedAt "Soft delete"
}
Session {
string id PK "CUID"
string userId FK
string refreshTokenHash "SHA-256"
string deviceFingerprint "Hashed"
string ipAddress "IPv4/IPv6"
string userAgent "Browser info"
datetime expiresAt "7 days TTL"
datetime lastActivityAt "Tracking"
datetime createdAt "Timestamp"
}
Role {
string id PK "CUID"
string name "role-name"
string organizationId FK
int hierarchy "Priority level"
boolean isSystem "Built-in?"
datetime createdAt "Timestamp"
}
Permission {
string id PK "CUID"
string resource "users, roles, etc"
string action "create, read, update, delete"
string scope "own, org, global"
datetime createdAt "Timestamp"
}
MFADevice {
string id PK "CUID"
string userId FK
string type "totp, backup"
string secret "Encrypted TOTP secret"
boolean verified "Verified?"
datetime lastUsedAt "Tracking"
datetime createdAt "Timestamp"
}
LoginHistory {
string id PK "CUID"
string userId FK
boolean success "Success/Failure"
string ipAddress "IPv4/IPv6"
string deviceFingerprint "Hashed"
string failureReason "If failed"
datetime timestamp "Event time"
}
DeviceFingerprint {
string id PK "CUID"
string userId FK
string fingerprint "Hashed"
boolean trusted "Auto-approved?"
datetime firstSeenAt "First use"
datetime lastSeenAt "Last use"
}
VI Mô tả:
- User: Lưu credentials đã hash, MFA settings, organization membership
- Session: Lưu refresh tokens đã hash, device fingerprint, IP tracking
- Role & Permission: RBAC hierarchy với system roles và custom roles
- MFADevice: TOTP secrets (encrypted), backup codes
- LoginHistory: Audit trail cho tất cả login attempts (success/failure)
- DeviceFingerprint: Trusted device tracking cho zero-trust model
Bảo mật Database:
- Password hashes: bcrypt với cost factor 12
- Token hashes: SHA-256
- MFA secrets: AES-256-GCM encryption
- Soft deletes: deletedAt field, hard delete sau 90 ngày (GDPR)
- Indexes: email (unique), userId (foreign keys), timestamps
EN Description:
- User: Stores hashed credentials, MFA settings, organization membership
- Session: Stores hashed refresh tokens, device fingerprint, IP tracking
- Role & Permission: RBAC hierarchy with system roles and custom roles
- MFADevice: TOTP secrets (encrypted), backup codes
- LoginHistory: Audit trail for all login attempts (success/failure)
- DeviceFingerprint: Trusted device tracking for zero-trust model
Database Security:
- Password hashes: bcrypt with cost factor 12
- Token hashes: SHA-256
- MFA secrets: AES-256-GCM encryption
- Soft deletes: deletedAt field, hard delete after 90 days (GDPR)
- Indexes: email (unique), userId (foreign keys), timestamps
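Storing only the SHA-256 hash of a refresh token, combined with rotation on each use, can be sketched as below. An in-memory map stands in for the Session table, and the class and method names are hypothetical; the 7-day lifetime matches the expiresAt note in the schema above.

```typescript
import { createHash, randomBytes } from "node:crypto";

const REFRESH_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

interface StoredToken {
  userId: string;
  expiresAt: number;
}

function sha256Hex(value: string): string {
  return createHash("sha256").update(value).digest("hex");
}

export class RefreshTokenStore {
  // Keyed by token hash; the raw token is never persisted.
  private readonly byHash = new Map<string, StoredToken>();

  issue(userId: string, now: number = Date.now()): string {
    const token = randomBytes(32).toString("hex");
    this.byHash.set(sha256Hex(token), { userId, expiresAt: now + REFRESH_TTL_MS });
    return token; // returned to the client once
  }

  // Rotation: a presented token is consumed and replaced by a new one.
  rotate(token: string, now: number = Date.now()): string | null {
    const hash = sha256Hex(token);
    const entry = this.byHash.get(hash);
    if (!entry) return null; // unknown or already used
    this.byHash.delete(hash);
    if (entry.expiresAt <= now) return null; // expired
    return this.issue(entry.userId, now);
  }
}
```

Because the old hash is deleted on use, replaying a stolen refresh token after the legitimate client has rotated it returns `null`.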
Quyết định Thiết kế / Design Decisions
Quyết định 1: JWT với RS256 (Asymmetric) / Decision 1: JWT with RS256 (Asymmetric)
VI Bối cảnh: Cần stateless authentication với khả năng verify tokens ở multiple services
VI Quyết định: Sử dụng JWT với RS256 (RSA asymmetric signing) thay vì HS256 (HMAC symmetric)
VI Hậu quả:
- ✅ Tích cực:
- Services có thể verify tokens với public key, không cần secret
- Key rotation dễ dàng hơn (chỉ cần distribute public key mới)
- Bảo mật cao hơn (private key chỉ ở IAM service)
- Compliance: Audit trail rõ ràng về ai sign tokens
- ❌ Tiêu cực:
- Chậm hơn HS256 một chút (~10-20% slower)
- Phức tạp hơn trong key management
- Private key phải được bảo vệ cẩn thận (public key có thể chia sẻ cho các services)
VI Các lựa chọn thay thế: HS256 (symmetric), EdDSA, OAuth 2.0 with Opaque Tokens
EN Context: Need stateless authentication with ability to verify tokens in multiple services
EN Decision: Use JWT with RS256 (RSA asymmetric signing) instead of HS256 (HMAC symmetric)
EN Consequences:
- ✅ Positive:
- Services can verify tokens with public key, don't need secret
- Easier key rotation (only distribute new public key)
- Higher security (private key only in IAM service)
- Compliance: Clear audit trail of who signs tokens
- ❌ Negative:
- Slightly slower than HS256 (~10-20% slower)
- More complex key management
- The private key must be carefully protected (the public key can be shared with services)
EN Alternatives: HS256 (symmetric), EdDSA, OAuth 2.0 with Opaque Tokens
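The property this decision relies on, signing with a private key held by one service and verifying anywhere with the public key, can be demonstrated with Node's crypto module. This is a simplified sketch: `signJwt`, `verifyJwt`, and `generateSigningKeys` are hypothetical helpers, only the `exp` claim is validated, and the platform itself issues tokens through OpenIddict rather than hand-rolled code like this.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

function base64Url(data: Buffer): string {
  return data.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

export function generateSigningKeys(): { publicKey: KeyObject; privateKey: KeyObject } {
  return generateKeyPairSync("rsa", { modulusLength: 2048 });
}

// IAM side: sign header.payload with the private key (RSASSA-PKCS1-v1_5 + SHA-256 = RS256).
export function signJwt(payload: Record<string, unknown>, privateKey: KeyObject): string {
  const header = base64Url(Buffer.from(JSON.stringify({ alg: "RS256", typ: "JWT" })));
  const body = base64Url(Buffer.from(JSON.stringify(payload)));
  const signature = sign("sha256", Buffer.from(`${header}.${body}`), privateKey);
  return `${header}.${body}.${base64Url(signature)}`;
}

// Any service: verify with the public key only. Returns the claims or null.
export function verifyJwt(
  token: string,
  publicKey: KeyObject,
  nowSeconds: number,
): Record<string, unknown> | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, body, signature] = parts as [string, string, string];
  try {
    // Pin the algorithm so a token cannot choose how it is verified.
    const parsedHeader = JSON.parse(Buffer.from(header, "base64").toString("utf8"));
    if (parsedHeader.alg !== "RS256") return null;
    const valid = verify(
      "sha256",
      Buffer.from(`${header}.${body}`),
      publicKey,
      Buffer.from(signature, "base64"),
    );
    if (!valid) return null;
    const claims = JSON.parse(Buffer.from(body, "base64").toString("utf8"));
    if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) return null;
    return claims;
  } catch (e) {
    return null; // malformed token
  }
}
```

With HS256, every verifying service would need the shared secret and could therefore also mint tokens; here a verifier holding only `publicKey` cannot.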
Quyết định 2: Zero-Trust Model với Device Fingerprinting / Decision 2: Zero-Trust Model with Device Fingerprinting
VI Bối cảnh: Cần bảo vệ chống lại credential theft, session hijacking và unauthorized access
VI Quyết định: Triển khai zero-trust model với device fingerprinting, IP validation, behavioral analysis
VI Hậu quả:
- ✅ Tích cực:
- Phát hiện được anomalies (new device, new IP, unusual behavior)
- Tăng security khi detect và block suspicious activities
- Compliance: SOC2, ISO27001 requirements
- User experience: Auto-approve trusted devices
- ❌ Tiêu cực:
- Complexity cao hơn
- Potential false positives (legitimate users blocked)
- Performance overhead (fingerprint hash, IP check)
- Privacy concerns (tracking devices, IPs)
VI Các lựa chọn thay thế: Basic authentication only, IP whitelist only, MFA required for all
EN Context: Need to protect against credential theft, session hijacking, and unauthorized access
EN Decision: Implement zero-trust model with device fingerprinting, IP validation, behavioral analysis
EN Consequences:
- ✅ Positive:
- Detect anomalies (new device, new IP, unusual behavior)
- Increased security by detecting and blocking suspicious activities
- Compliance: SOC2, ISO27001 requirements
- User experience: Auto-approve trusted devices
- ❌ Negative:
- Higher complexity
- Potential false positives (legitimate users blocked)
- Performance overhead (fingerprint hash, IP check)
- Privacy concerns (tracking devices, IPs)
EN Alternatives: Basic authentication only, IP whitelist only, MFA required for all
Quyết định 3: Event Sourcing cho Audit Trail / Decision 3: Event Sourcing for Audit Trail
VI Bối cảnh: Cần immutable audit trail cho compliance (GDPR, SOC2, HIPAA) và security forensics
VI Quyết định: Sử dụng event sourcing pattern để lưu tất cả auth/security events
VI Hậu quả:
- ✅ Tích cực:
- Immutable audit trail (không thể modify/delete)
- Complete history của tất cả security events
- Compliance: GDPR, SOC2, HIPAA (audit retention 7 năm)
- Security forensics: Trace back attacks, breaches
- Replay events để reconstruct state
- ❌ Tiêu cực:
- Storage cost cao (retain 7 years)
- Complexity trong event schema versioning
- Performance: Event publishing overhead
- Data privacy: Must anonymize PII after retention period
VI Các lựa chọn thay thế: Database audit logs only, External SIEM only, No audit trail
EN Context: Need immutable audit trail for compliance (GDPR, SOC2, HIPAA) and security forensics
EN Decision: Use event sourcing pattern to store all auth/security events
EN Consequences:
- ✅ Positive:
- Immutable audit trail (cannot modify/delete)
- Complete history of all security events
- Compliance: GDPR, SOC2, HIPAA (7-year audit retention)
- Security forensics: Trace back attacks, breaches
- Replay events to reconstruct state
- ❌ Negative:
- High storage cost (retain 7 years)
- Complexity in event schema versioning
- Performance: Event publishing overhead
- Data privacy: Must anonymize PII after retention period
EN Alternatives: Database audit logs only, External SIEM only, No audit trail
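An append-only event log with replay can be sketched in a few lines. The hash chain used here to make tampering detectable is an added assumption, one common way to approximate immutability in application code; the document does not say how the audit store enforces it. The class and field names are illustrative.

```typescript
import { createHash } from "node:crypto";

interface AuditEvent {
  eventType: string; // e.g. "auth.login.success"
  userId: string;
  timestamp: string; // ISO 8601
}

interface StoredEvent extends AuditEvent {
  prevHash: string;
  hash: string;
}

// Each hash covers the previous hash, so editing or removing an event breaks the chain.
function chainHash(prevHash: string, e: AuditEvent): string {
  const canonical = JSON.stringify([prevHash, e.eventType, e.userId, e.timestamp]);
  return createHash("sha256").update(canonical).digest("hex");
}

export class AuditLog {
  private readonly events: StoredEvent[] = [];

  // Events are only ever appended; there is no update or delete operation.
  append(event: AuditEvent): StoredEvent {
    const last = this.events[this.events.length - 1];
    const prevHash = last ? last.hash : "genesis";
    const stored: StoredEvent = Object.freeze({
      ...event,
      prevHash,
      hash: chainHash(prevHash, event),
    });
    this.events.push(stored);
    return stored;
  }

  verifyChain(): boolean {
    let prevHash = "genesis";
    for (const e of this.events) {
      if (e.prevHash !== prevHash || e.hash !== chainHash(prevHash, e)) return false;
      prevHash = e.hash;
    }
    return true;
  }

  // Replay: fold over the full history to reconstruct any derived state.
  replay<S>(initial: S, reducer: (state: S, event: AuditEvent) => S): S {
    return this.events.reduce(reducer, initial);
  }
}
```

For example, replaying with a reducer that counts `auth.login.failure` events per user reconstructs failed-login totals without a separate counter table.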
Đặc điểm Hiệu suất / Performance Characteristics
| Chỉ số / Metric | Mục tiêu / Target | Ghi chú / Notes |
|---|---|---|
| Login Time (P95) | < 500ms | Including bcrypt verification |
| Login Time (P99) | < 1s | Peak load |
| Token Generation (P95) | < 50ms | JWT sign with RS256 |
| Token Verification (P95) | < 10ms | JWT verify with public key |
| Permission Check (P95) | < 5ms | From cache (L1 or L2) |
| Permission Check (Cache Miss) | < 50ms | Database query |
| MFA Verification (P95) | < 100ms | TOTP validation |
| Session Lookup (P95) | < 10ms | Redis cache |
| Password Hash (P95) | < 200ms | bcrypt cost 12 |
| Device Fingerprint Hash | < 5ms | SHA-256 |
| Failed Login Rate Limit | 5 attempts / 15min | Per user |
| Auth Throughput | 500 req/s | Per IAM instance |
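The failed-login limit in the table (5 attempts per 15 minutes, per user) can be expressed as a sliding-window counter. The sketch below keeps timestamps in memory for clarity; the class name and the choice of a sliding rather than fixed window are assumptions, and a multi-instance deployment would need shared storage such as Redis.

```typescript
export class FailedLoginLimiter {
  private readonly failures = new Map<string, number[]>(); // userId -> failure timestamps (ms)
  private readonly maxAttempts: number;
  private readonly windowMs: number;

  constructor(maxAttempts: number = 5, windowMs: number = 15 * 60 * 1000) {
    this.maxAttempts = maxAttempts;
    this.windowMs = windowMs;
  }

  // Drop timestamps that have fallen out of the window and return the rest.
  private recent(userId: string, now: number): number[] {
    const cutoff = now - this.windowMs;
    const kept = (this.failures.get(userId) || []).filter((t) => t > cutoff);
    this.failures.set(userId, kept);
    return kept;
  }

  recordFailure(userId: string, now: number = Date.now()): void {
    this.recent(userId, now).push(now);
  }

  // Locked once the number of failures inside the window reaches the limit.
  isLocked(userId: string, now: number = Date.now()): boolean {
    return this.recent(userId, now).length >= this.maxAttempts;
  }

  // Call after a successful login to clear the counter.
  reset(userId: string): void {
    this.failures.delete(userId);
  }
}
```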
VI Tối ưu hóa Hiệu suất:
- Permission Caching: L1 (memory) + L2 (Redis), TTL 5 phút
- Token Caching: Cache public key in memory for JWT verification
- Connection Pooling: Reuse database connections
- Async Operations: Event publishing, audit logging (fire-and-forget)
- Rate Limiting: Prevent brute force attacks, reduce load
- Horizontal Scaling: Multiple IAM service instances
EN Performance Optimizations:
- Permission Caching: L1 (memory) + L2 (Redis), TTL 5 minutes
- Token Caching: Cache public key in memory for JWT verification
- Connection Pooling: Reuse database connections
- Async Operations: Event publishing, audit logging (fire-and-forget)
- Rate Limiting: Prevent brute force attacks, reduce load
- Horizontal Scaling: Multiple IAM service instances
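The L1 (memory) plus L2 (Redis) permission cache can be sketched as a read-through lookup with a 5-minute TTL. To keep the example self-contained and synchronous, a shared `Map` stands in for Redis and the database loader is a plain function; both are assumptions for illustration, and a real Redis client would be asynchronous.

```typescript
interface CacheEntry {
  permissions: string[];
  expiresAt: number; // epoch ms
}

type SharedStore = Map<string, CacheEntry>; // stand-in for Redis (L2)

export class PermissionCache {
  private readonly l1 = new Map<string, CacheEntry>(); // per-instance memory (L1)
  private readonly l2: SharedStore;
  private readonly loadFromDb: (userId: string) => string[];
  private readonly ttlMs: number;

  constructor(
    l2: SharedStore,
    loadFromDb: (userId: string) => string[],
    ttlMs: number = 5 * 60 * 1000,
  ) {
    this.l2 = l2;
    this.loadFromDb = loadFromDb;
    this.ttlMs = ttlMs;
  }

  private static key(userId: string): string {
    return `user:${userId}:permissions`;
  }

  // Read-through: L1, then L2, then the database; fill the faster tiers on the way back.
  get(userId: string, now: number = Date.now()): string[] {
    const key = PermissionCache.key(userId);
    const local = this.l1.get(key);
    if (local && local.expiresAt > now) return local.permissions;
    const shared = this.l2.get(key);
    if (shared && shared.expiresAt > now) {
      this.l1.set(key, shared);
      return shared.permissions;
    }
    const entry: CacheEntry = {
      permissions: this.loadFromDb(userId),
      expiresAt: now + this.ttlMs,
    };
    this.l1.set(key, entry);
    this.l2.set(key, entry);
    return entry.permissions;
  }

  // Call on role or permission change.
  invalidate(userId: string): void {
    const key = PermissionCache.key(userId);
    this.l1.delete(key);
    this.l2.delete(key);
  }
}
```

Note that `invalidate` clears only the calling instance's L1; other instances keep their local copy until the TTL expires unless an invalidation message is broadcast to them.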
Triển khai / Deployment
graph TD
subgraph "Security Layer"
LB[Load Balancer<br/>TLS Termination]
WAF[WAF / Firewall<br/>Rate Limiting<br/>DDoS Protection]
end
subgraph "IAM Service Layer"
IAM1[IAM Service Pod 1<br/>Stateless]
IAM2[IAM Service Pod 2<br/>Stateless]
IAM3[IAM Service Pod 3<br/>Stateless]
end
subgraph "Data Layer"
DB[(Neon PostgreSQL<br/>Encrypted at Rest)]
Cache[(Redis Cluster<br/>TLS Enabled)]
Vault[Secrets Manager<br/>K8s Secrets]
end
subgraph "Security Monitoring"
SIEM[SIEM / Security Monitoring]
Alerts[Alerting System]
end
Client[Clients] --> LB
LB --> WAF
WAF --> IAM1
WAF --> IAM2
WAF --> IAM3
IAM1 --> DB
IAM1 --> Cache
IAM1 --> Vault
IAM2 --> DB
IAM2 --> Cache
IAM2 --> Vault
IAM3 --> DB
IAM3 --> Cache
IAM3 --> Vault
IAM1 -.->|Security Events| SIEM
IAM2 -.->|Security Events| SIEM
IAM3 -.->|Security Events| SIEM
SIEM -.->|Alerts| Alerts
style LB fill:#15803d,stroke:#fff,stroke-width:2px,color:#fff
style WAF fill:#c2410c,stroke:#fff,stroke-width:2px,color:#fff
style DB fill:#7e22ce,stroke:#fff,stroke-width:2px,color:#fff
style Cache fill:#1f2937,stroke:#fff,stroke-width:2px,color:#fff
style Vault fill:#b91c1c,stroke:#fff,stroke-width:2px,color:#fff
style SIEM fill:#1d4ed8,stroke:#fff,stroke-width:2px,color:#fff
EN: Deployment Strategy
Security Deployment:
- TLS 1.2+ Enforcement: All connections require TLS
- Network Policies (K8s): Deny all by default, whitelist specific services
- Pod Security Policies: Non-root user, read-only filesystem, no privilege escalation
- Secrets Management: Kubernetes secrets with encryption at rest
- Image Scanning: Trivy/Clair scan before deployment
- RBAC (K8s): Least privilege for service accounts
Resource Allocation:
| Component | CPU | Memory | Replicas |
|---|---|---|---|
| IAM Service | 500m | 1GB | 3-10 (HPA) |
| Redis | 1 core | 2GB | 3 masters + 3 slaves |
Security Configuration:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: iam-service-policy
spec:
  podSelector:
    matchLabels:
      app: iam-service
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api-gateway
      ports:
        - protocol: TCP
          port: 5000
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgresql
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - podSelector:
            matchLabels:
              app: redis
      ports:
        - protocol: TCP
          port: 6379
Deployment Security Checklist:
- TLS 1.2+ enforced
- Network policies configured
- Pod security policies applied
- Secrets encrypted at rest
- Container images scanned
- Non-root user in containers
- Read-only root filesystem
- Resource limits set
- Health checks configured
- Security monitoring enabled
Giám sát & Khả năng quan sát / Monitoring & Observability
VI: Chỉ số Chính
Authentication Metrics:
- auth_login_attempts_total - Total login attempts (counter, labels: status=success/failure)
- auth_login_duration_seconds - Login duration (histogram)
- auth_token_generations_total - Token generations (counter)
- auth_token_verifications_total - Token verifications (counter, labels: status=valid/invalid/expired)
- auth_mfa_verifications_total - MFA verifications (counter, labels: status=success/failure)
Authorization Metrics:
- auth_permission_checks_total - Permission checks (counter, labels: result=granted/denied)
- auth_permission_cache_hits_total - Permission cache hits (counter)
- auth_permission_cache_misses_total - Permission cache misses (counter)
Security Metrics:
- auth_failed_login_rate - Failed login rate per user (gauge)
- auth_account_lockouts_total - Account lockouts (counter)
- auth_suspicious_activities_total - Suspicious activities detected (counter, labels: type)
- auth_anomalies_detected_total - Anomalies detected (counter, labels: anomaly_type)
- auth_password_reset_requests_total - Password reset requests (counter)
Session Metrics:
- auth_active_sessions - Active sessions (gauge)
- auth_session_creations_total - Session creations (counter)
- auth_session_invalidations_total - Session invalidations (counter, labels: reason)
Application Code:
import { Counter, Histogram } from 'prom-client';

// Counter of login attempts, split by outcome.
export const loginAttempts = new Counter({
  name: 'auth_login_attempts_total',
  help: 'Total login attempts',
  labelNames: ['status']
});

// Histogram of login latency; buckets are in seconds.
export const loginDuration = new Histogram({
  name: 'auth_login_duration_seconds',
  help: 'Login duration in seconds',
  buckets: [0.1, 0.3, 0.5, 0.7, 1, 2, 5]
});

export const permissionChecks = new Counter({
  name: 'auth_permission_checks_total',
  help: 'Total permission checks',
  labelNames: ['result']
});

export const suspiciousActivities = new Counter({
  name: 'auth_suspicious_activities_total',
  help: 'Suspicious activities detected',
  labelNames: ['type']
});

// Usage inside request handlers; `duration` is the measured login time in seconds.
const duration = 0.42; // example value
loginAttempts.inc({ status: 'success' });
loginDuration.observe(duration);
permissionChecks.inc({ result: 'granted' });
suspiciousActivities.inc({ type: 'new_device' });
Alerting Rules:
groups:
  - name: security_alerts
    interval: 30s
    rules:
      - alert: HighFailedLoginRate
        expr: rate(auth_login_attempts_total{status="failure"}[5m]) > 10
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High failed login rate detected"
          description: "Failed login rate is {{ $value }}/sec"
      - alert: BruteForceAttack
        # increase() counts failures over the last minute, matching the per-minute
        # threshold in the description. This rule assumes the counter is exported
        # with a user_id label; the sample application code only declares `status`.
        expr: |
          sum by (user_id) (
            increase(auth_login_attempts_total{status="failure"}[1m])
          ) > 5
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Potential brute force attack"
          description: "User {{ $labels.user_id }} has > 5 failed logins/min"
      - alert: AccountLockoutSpike
        expr: rate(auth_account_lockouts_total[5m]) > 5
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Account lockout spike detected"
          description: "Lockout rate is {{ $value }}/sec"
      - alert: SuspiciousActivity
        expr: rate(auth_suspicious_activities_total[5m]) > 10
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Suspicious activity detected"
          description: "Suspicious activity rate: {{ $value }}/sec"
      - alert: AnomalyDetected
        # A raw counter stays above 0 forever after the first anomaly;
        # increase() fires only while new anomalies are being recorded.
        expr: increase(auth_anomalies_detected_total[5m]) > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Security anomaly detected"
          description: "{{ $labels.anomaly_type }} detected"
      - alert: PermissionDeniedSpike
        expr: rate(auth_permission_checks_total{result="denied"}[5m]) > 50
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High permission denied rate"
          description: "Permission denied rate: {{ $value }}/sec"
Security Dashboards:
- Authentication Overview: Login success/failure rate, login duration, MFA usage
- Authorization Overview: Permission checks, cache hit rate, denied requests
- Security Events: Suspicious activities, anomalies, account lockouts
- Session Management: Active sessions, session creations/invalidations
- Compliance: Audit trail completeness, retention policy compliance
Logging:
logger.info('Login successful', {
  eventType: 'auth.login.success',
  userId: user.id,
  email: user.email,
  ipAddress: req.ip,
  deviceFingerprint: fingerprint,
  mfaUsed: user.mfaEnabled,
  correlationId: req.correlationId
});

logger.warn('Suspicious activity detected', {
  eventType: 'security.suspicious_activity',
  userId: user.id,
  activityType: 'new_device',
  ipAddress: req.ip,
  deviceFingerprint: newFingerprint,
  correlationId: req.correlationId
});

logger.error('Login failed', {
  eventType: 'auth.login.failure',
  email: email,
  reason: 'invalid_credentials',
  ipAddress: req.ip,
  attemptCount: failedAttempts,
  correlationId: req.correlationId
});
Audit Trail Monitoring:
- Event publishing rate and latency
- Event consumption lag
- Audit log completeness (no gaps)
- Retention policy compliance
- Anonymization after retention period
EN: Key Metrics
Authentication Metrics:
- auth_login_attempts_total - Total login attempts (counter, labels: status=success/failure)
- auth_login_duration_seconds - Login duration (histogram)
- auth_token_generations_total - Token generations (counter)
- auth_token_verifications_total - Token verifications (counter, labels: status=valid/invalid/expired)
- auth_mfa_verifications_total - MFA verifications (counter, labels: status=success/failure)
Authorization Metrics:
- auth_permission_checks_total - Permission checks (counter, labels: result=granted/denied)
- auth_permission_cache_hits_total - Permission cache hits (counter)
- auth_permission_cache_misses_total - Permission cache misses (counter)
Security Metrics:
- auth_failed_login_rate - Failed login rate per user (gauge)
- auth_account_lockouts_total - Account lockouts (counter)
- auth_suspicious_activities_total - Suspicious activities detected (counter, labels: type)
- auth_anomalies_detected_total - Anomalies detected (counter, labels: anomaly_type)
- auth_password_reset_requests_total - Password reset requests (counter)
Session Metrics:
- auth_active_sessions - Active sessions (gauge)
- auth_session_creations_total - Session creations (counter)
- auth_session_invalidations_total - Session invalidations (counter, labels: reason)
Application Code:
import { Counter, Histogram, Gauge } from 'prom-client';

// Metric definitions (register once at module load)
export const loginAttempts = new Counter({
  name: 'auth_login_attempts_total',
  help: 'Total login attempts',
  labelNames: ['status']
});
export const loginDuration = new Histogram({
  name: 'auth_login_duration_seconds',
  help: 'Login duration in seconds',
  buckets: [0.1, 0.3, 0.5, 0.7, 1, 2, 5]
});
export const permissionChecks = new Counter({
  name: 'auth_permission_checks_total',
  help: 'Total permission checks',
  labelNames: ['result']
});
export const suspiciousActivities = new Counter({
  name: 'auth_suspicious_activities_total',
  help: 'Suspicious activities detected',
  labelNames: ['type']
});

// Usage inside request handlers
loginAttempts.inc({ status: 'success' });
// duration: elapsed login time in seconds, measured by the caller
loginDuration.observe(duration);
permissionChecks.inc({ result: 'granted' });
suspiciousActivities.inc({ type: 'new_device' });
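The usage lines above leave open where `duration` comes from. A minimal sketch of one way to derive it, assuming the handler captures start and end timestamps in milliseconds: `recordLogin`, `CounterLike`, and `HistogramLike` are hypothetical names introduced here, and the two interfaces only mirror the `inc` and `observe` calls already used above.

```typescript
// Minimal shapes covering the two metric calls used in this section.
interface CounterLike {
  inc(labels: { status: 'success' | 'failure' }): void;
}
interface HistogramLike {
  observe(value: number): void;
}

// Hypothetical helper: records the outcome and the duration of one login attempt.
// Timestamps are in milliseconds; the histogram is declared in seconds.
function recordLogin(
  status: 'success' | 'failure',
  startedAtMs: number,
  finishedAtMs: number,
  attempts: CounterLike,
  duration: HistogramLike,
): number {
  const seconds = (finishedAtMs - startedAtMs) / 1000;
  attempts.inc({ status });
  duration.observe(seconds);
  return seconds;
}
```

Passing the metric objects in as parameters keeps the helper testable without a metrics registry; in the service itself the exported `loginAttempts` and `loginDuration` would be the natural arguments.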
Alerting Rules:
groups:
  - name: security_alerts
    interval: 30s
    rules:
      - alert: HighFailedLoginRate
        expr: rate(auth_login_attempts_total{status="failure"}[5m]) > 10
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High failed login rate detected"
          description: "Failed login rate is {{ $value }}/sec"
      # Requires a user_id label on auth_login_attempts_total; the counter in
      # Application Code declares only `status`, so user_id would be empty as written.
      # Note that rate() is per second, so "> 5" means more than 5 failures/sec.
      - alert: BruteForceAttack
        expr: |
          sum by (user_id) (
            rate(auth_login_attempts_total{status="failure"}[1m])
          ) > 5
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Potential brute force attack"
          description: "User {{ $labels.user_id }} failed login rate is {{ $value }}/sec"
      - alert: AccountLockoutSpike
        expr: rate(auth_account_lockouts_total[5m]) > 5
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Account lockout spike detected"
          description: "Lockout rate is {{ $value }}/sec"
      - alert: SuspiciousActivity
        expr: rate(auth_suspicious_activities_total[5m]) > 10
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Suspicious activity detected"
          description: "Suspicious activity rate: {{ $value }}/sec"
      # A counter never decreases, so compare its recent increase rather than
      # its raw value; otherwise the alert would fire permanently after the first anomaly.
      - alert: AnomalyDetected
        expr: increase(auth_anomalies_detected_total[5m]) > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Security anomaly detected"
          description: "{{ $labels.anomaly_type }} detected"
      - alert: PermissionDeniedSpike
        expr: rate(auth_permission_checks_total{result="denied"}[5m]) > 50
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High permission denied rate"
          description: "Permission denied rate: {{ $value }}/sec"
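The thresholds in these rules are per-second averages, so `rate(...[5m]) > 10` corresponds to more than 3,000 failed logins within five minutes. The sketch below illustrates that arithmetic with a simplified rate calculation over counter samples. It is not Prometheus's exact algorithm (PromQL's `rate()` also extrapolates to the window boundaries), and `Sample` and `perSecondRate` are hypothetical names used only for illustration.

```typescript
// One scraped counter value at a point in time.
interface Sample {
  timestampSec: number;
  value: number;
}

// Simplified per-second rate of a counter across the given samples.
// A drop between consecutive samples is treated as a counter reset
// (for example a process restart), so counting resumes from zero.
function perSecondRate(samples: Sample[]): number {
  if (samples.length < 2) return 0;
  let increase = 0;
  for (let i = 1; i < samples.length; i++) {
    const delta = samples[i].value - samples[i - 1].value;
    increase += delta >= 0 ? delta : samples[i].value;
  }
  const elapsed = samples[samples.length - 1].timestampSec - samples[0].timestampSec;
  return elapsed > 0 ? increase / elapsed : 0;
}

// 3,600 failures spread over 300 seconds average 12 per second,
// which would exceed the HighFailedLoginRate threshold of 10.
const failureRate = perSecondRate([
  { timestampSec: 0, value: 0 },
  { timestampSec: 150, value: 1800 },
  { timestampSec: 300, value: 3600 },
]);
```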
Security Dashboards:
- Authentication Overview: Login success/failure rate, login duration, MFA usage
- Authorization Overview: Permission checks, cache hit rate, denied requests
- Security Events: Suspicious activities, anomalies, account lockouts
- Session Management: Active sessions, session creations/invalidations
- Compliance: Audit trail completeness, retention policy compliance
Logging:
logger.info('Login successful', {
  eventType: 'auth.login.success',
  userId: user.id,
  email: user.email,
  ipAddress: req.ip,
  deviceFingerprint: fingerprint,
  mfaUsed: user.mfaEnabled,
  correlationId: req.correlationId
});
logger.warn('Suspicious activity detected', {
  eventType: 'security.suspicious_activity',
  userId: user.id,
  activityType: 'new_device',
  ipAddress: req.ip,
  deviceFingerprint: newFingerprint,
  correlationId: req.correlationId
});
logger.error('Login failed', {
  eventType: 'auth.login.failure',
  email: email,
  reason: 'invalid_credentials',
  ipAddress: req.ip,
  attemptCount: failedAttempts,
  correlationId: req.correlationId
});
Audit Trail Monitoring:
- Event publishing rate and latency
- Event consumption lag
- Audit log completeness (no gaps)
- Retention policy compliance
- Anonymization after retention period
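Two of the checks above, completeness and retention, can be sketched as small pure functions. This is an illustration under assumptions rather than the platform's actual mechanism: it assumes each audit event carries a monotonically increasing sequence number (not stated in this document), and it uses the 7-year retention figure from the overview diagram. `findSequenceGaps` and `isPastRetention` are hypothetical names.

```typescript
// Returns the inclusive [from, to] ranges of sequence numbers that are missing.
// Assumes audit events carry a monotonically increasing sequence number.
function findSequenceGaps(sequences: number[]): Array<[number, number]> {
  const sorted = sequences.slice().sort((a, b) => a - b);
  const gaps: Array<[number, number]> = [];
  for (let i = 1; i < sorted.length; i++) {
    // A difference of 0 is a duplicate and 1 is contiguous; anything larger is a gap.
    if (sorted[i] - sorted[i - 1] > 1) {
      gaps.push([sorted[i - 1] + 1, sorted[i] - 1]);
    }
  }
  return gaps;
}

// True when an event is older than the retention period and is therefore
// due for anonymization. Defaults to the 7-year retention shown in the overview.
function isPastRetention(occurredAt: Date, now: Date, retentionYears: number = 7): boolean {
  const cutoff = new Date(now.getTime());
  cutoff.setUTCFullYear(cutoff.getUTCFullYear() - retentionYears);
  return occurredAt.getTime() < cutoff.getTime();
}
```

A monitoring job could export the number of gaps and the number of overdue events as gauges, so the Compliance dashboard can show them alongside the other audit metrics.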
Tài liệu Liên quan / Related Documentation
- System Design - Kiến trúc tổng thể / Overall architecture
- IAM Architecture - Triển khai IAM service / IAM service implementation
- Event-Driven Architecture - Audit event streaming
Cập nhật Lần cuối / Last Updated: 2026-01-07
Tác giả / Authors: GoodGo Security Team
Quick Tips
🎨 Color Palette Reference (Dark Theme)
| Node Type | Color | Hex | Tailwind | Usage | Example |
|---|---|---|---|---|---|
| Primary | Blue | #1d4ed8 | bg-blue-700 | Core components, Identity, IAM, Permission Checks | JWT Validation, Auth Services |
| Secondary | Purple | #7e22ce | bg-purple-700 | Data stores, Database, Queues | PostgreSQL, Redis |
| Success | Green | #15803d | bg-green-700 | Valid, Allowed, Safe, Completed, TLS | Allow Request, Secure Connection |
| Error | Red | #b91c1c | bg-red-700 | Blocked, Invalid, Failed, Critical, Encryption Keys | Block + Alert, Vault, Critical Errors |
| Warning | Orange | #c2410c | bg-orange-700 | MFA, Suspicious, Latency, Cache, Alerts | Require MFA, WAF, SIEM |
| Base | Grey | #1f2937 | bg-gray-800 | External systems, Infrastructure, Logs | Cache, Monitoring |
🔧 Mermaid Common Issues
| Issue | Sign | Solution |
|---|---|---|
| Parse Error | Unexpected PIPE/character | Check for missing spaces after graph TD |
| Box Not Showing | Node missing in diagram | Verify node syntax: Node[Label] |
| Color Not Applied | Node has no color | Add style: style Node fill:#1d4ed8,stroke:#fff,stroke-width:2px,color:#fff |
| Arrow Issues | Connection not visible | Check arrow syntax: --> (solid), -.-> (dashed) |
| Text Not Readable | Dark text on dark bg | Always use color:#fff (white text) |
| Subgraph Issues | Broken layout | Ensure proper indentation and end statement |
📊 Color Pattern Quick Reference
graph LR
A[Input] --> B[Process]
B --> C{Decision}
C -->|Yes| D[Success]
C -->|No| E[Error]
style A fill:#1f2937,stroke:#fff,stroke-width:2px,color:#fff
style B fill:#1d4ed8,stroke:#fff,stroke-width:2px,color:#fff
style C fill:#c2410c,stroke:#fff,stroke-width:2px,color:#fff
style D fill:#15803d,stroke:#fff,stroke-width:2px,color:#fff
style E fill:#b91c1c,stroke:#fff,stroke-width:2px,color:#fff
Pattern Template:
style NodeName fill:#color,stroke:#fff,stroke-width:2px,color:#fff
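The pattern template can be generated from the palette table rather than typed by hand. A minimal sketch, assuming the six node types and hex values listed in the Color Palette Reference: `PALETTE`, `NodeType`, and `mermaidStyle` are hypothetical names.

```typescript
// Hex values copied from the Color Palette Reference table.
const PALETTE = {
  primary: '#1d4ed8',
  secondary: '#7e22ce',
  success: '#15803d',
  error: '#b91c1c',
  warning: '#c2410c',
  base: '#1f2937',
} as const;

type NodeType = keyof typeof PALETTE;

// Builds one Mermaid style line following the pattern template:
// palette fill, white stroke, 2px stroke width, white text.
function mermaidStyle(nodeId: string, nodeType: NodeType): string {
  return `style ${nodeId} fill:${PALETTE[nodeType]},stroke:#fff,stroke-width:2px,color:#fff`;
}

// Example: the JWT node from the overview diagram uses the primary color.
const jwtStyle = mermaidStyle('JWT', 'primary');
```

Restricting `nodeType` to the palette keys means a misspelled node type fails at compile time instead of producing an unstyled node.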
🎯 Visual Indicators
| Emoji | Meaning | Color | Usage |
|---|---|---|---|
| ✅ | Secure/Allowed/Valid | Green (#15803d) | Successful auth, allowed access |
| ❌ | Blocked/Denied/Invalid | Red (#b91c1c) | Failed login, access denied |
| ⚠️ | Warning/MFA/Alert | Orange (#c2410c) | Require MFA, suspicious activity |
| 🔒 | Encrypted/Secure | Blue/Purple (#1d4ed8, #7e22ce) | Encrypted data, secure channel |
| ☁️ | Cloud/External | Grey (#1f2937) | External services, cloud resources |
| 🔑 | Authentication | Orange (#c2410c) | Auth tokens, keys, credentials |
| 🛡️ | Security Layer | Green (#15803d) | Security controls, protection |
| 📊 | Monitoring | Blue (#1d4ed8) | Metrics, dashboards, logs |
🚀 Diagram Best Practices
- Always use dark palette with white text (`color:#fff`)
- Consistent stroke: `stroke:#fff,stroke-width:2px`
- Logical color mapping:
  - Blue = Core processes
  - Green = Success/Allow
  - Red = Error/Block
  - Orange = Warning/MFA
  - Purple = Data stores
  - Grey = External systems
- Readable labels: Use `<br/>` for line breaks in labels
- Arrow clarity: Solid (`-->`) for main flow, dashed (`-.->`) for secondary/async
- Subgraph organization: Group related components
🔍 Mermaid Debugging Checklist
- Graph type declared? (`graph TD`, `sequenceDiagram`, `erDiagram`)
- All nodes have unique IDs?
- Arrows have proper syntax? (`-->` and `-.->` in flowcharts, `-->>` in sequence diagrams)
- Style definitions after graph content?
- All subgraphs have `end` statement?
- Labels escaped properly? (use quotes for special chars)
- Color values correct? (hex with `#`, e.g. `#1d4ed8` or the shorthand `#fff`)
- White text applied? (`color:#fff`)