
# Kiến trúc Bảo mật / Security Architecture

> **VI**: Kiến trúc bảo mật toàn diện cho nền tảng GoodGo với mô hình zero-trust, RBAC và compliance
> **EN**: Comprehensive security architecture for GoodGo platform with zero-trust model, RBAC, and compliance

## Sơ đồ Tổng quan / Overview Diagram

```mermaid
graph TD
    Request[Client Request] --> TLS[TLS/HTTPS Layer]
    TLS --> RateLimit[Rate Limiting]
    RateLimit --> JWT[JWT Validation]
    JWT --> RBAC[RBAC Authorization]
    RBAC --> ZeroTrust[Zero-Trust Checks]
    ZeroTrust --> Service[Service Logic]

    Service --> Encrypt[Data Encryption<br/>AES-256-GCM]
    Encrypt --> DB[(Encrypted Data)]

    Service --> Audit[Audit Logging]
    Audit --> AuditDB[(Audit Trail<br/>7-year retention)]

    style TLS fill:#15803d,stroke:#fff,stroke-width:2px,color:#fff
    style JWT fill:#1d4ed8,stroke:#fff,stroke-width:2px,color:#fff
    style Encrypt fill:#b91c1c,stroke:#fff,stroke-width:2px,color:#fff
    style Audit fill:#c2410c,stroke:#fff,stroke-width:2px,color:#fff
```

## Mô tả Kiến trúc / Architecture Description

### VI: Phần Tiếng Việt

Kiến trúc Bảo mật GoodGo triển khai defense-in-depth với nhiều tầng bảo mật:

**Nguyên tắc Bảo mật**:
1. **Zero Trust**: Không bao giờ tin tưởng, luôn xác minh
2. **Least Privilege**: Quyền tối thiểu cần thiết
3. **Defense in Depth**: Nhiều tầng bảo mật
4. **Audit Everything**: Audit trail hoàn chỉnh
5. **Encryption**: Mã hóa dữ liệu at rest và in transit

**Thành phần Chính**:
- ASP.NET Core Identity (User Management)
- OpenIddict (OAuth2/OIDC Server)
- JWT Authentication (15min access, 7 ngày refresh)
- RBAC Authorization
- MFA Support (TOTP)
- Compliance (GDPR, SOC2, ISO27001)

### EN: English Section

The GoodGo Security Architecture implements defense-in-depth with multiple security layers:

**Security Principles**:
1. **Zero Trust**: Never trust, always verify
2. **Least Privilege**: Minimum required permissions
3. **Defense in Depth**: Multiple security layers
4. **Audit Everything**: Complete audit trail
5. **Encryption**: Data encrypted at rest and in transit

**Key Components**:
- ASP.NET Core Identity (User Management)
- OpenIddict (OAuth2/OIDC Server)
- JWT Authentication (15min access, 7d refresh)
- RBAC Authorization
- MFA Support (TOTP)
- Compliance (GDPR, SOC2, ISO27001)

## Luồng Xác thực / Authentication Flow

```mermaid
%%{init: {'theme': 'dark'}}%%
sequenceDiagram
    participant Client
    participant API as API Gateway
    participant IAM as IAM Service
    participant DB as Database
    participant Cache as Redis

    Client->>API: Login Request<br/>(email + password)
    API->>IAM: Forward Request
    IAM->>DB: Verify Credentials
    DB-->>IAM: User + Hash
    IAM->>IAM: bcrypt.compare()<br/>(cost 12)

    alt Valid Credentials
        IAM->>IAM: Generate Tokens<br/>(Access + Refresh)
        IAM->>DB: Store Refresh Token<br/>(hashed SHA-256)
        IAM->>Cache: Cache Permissions<br/>(5min TTL)
        IAM-->>API: Tokens + User
        API-->>Client: Set httpOnly Cookies
    else Invalid
        IAM-->>API: 401 Unauthorized
        API-->>Client: 401 Unauthorized
    end
```

### VI: Chi tiết Xác thực

**1. Password Hashing**:
- Thuật toán: ASP.NET Core Identity (PBKDF2 với HMAC-SHA256)
- Cost factor: 100,000 iterations
- Password tối thiểu: 8 ký tự với quy tắc phức tạp

**2. JWT Tokens (OpenIddict)**:
- Access Token: 15 phút expiry
- Refresh Token: 7 ngày expiry
- Thuật toán: RS256 (asymmetric signing)
- Payload: sub, name, email, roles

**3. Token Storage**:
- Access: Bearer token trong Authorization header
- Refresh: Database SHA-256 hash (OpenIddict stores)

**4. MFA Support (Xác thực Hai yếu tố)**:
- TOTP (RFC 6238) cho authenticator apps
- QR code để thiết lập (Google Authenticator, Authy)
- Recovery codes (10 mã dùng một lần)
- Secret key lưu qua UserManager.SetAuthenticationTokenAsync

**5. Email Verification (Xác minh Email)**:
- Gửi email xác minh qua SMTP (MailKit)
- Token generation: UserManager.GenerateEmailConfirmationTokenAsync
- Link xác minh với token và userId
- Đặt EmailConfirmed = true khi xác nhận

**6. Social Login (Đăng nhập Mạng xã hội)**:
- Tích hợp Google OAuth 2.0
- Tích hợp Facebook OAuth
- Liên kết tài khoản cho users hiện có (theo email)
- Tự động xác nhận email cho social logins
- Lưu provider info qua UserManager.AddLoginAsync

### EN: Authentication Details

**1. Password Hashing**:
- Algorithm: bcrypt with cost factor 12
- Never store plaintext passwords
- Minimum password: 8 chars with complexity rules

**2. JWT Tokens**:
- Access Token: 15 minutes expiry
- Refresh Token: 7 days expiry
- Algorithm: RS256 (asymmetric signing)
- Payload: userId, roles, permissions

**3. Token Storage**:
- Access: httpOnly cookie (secure, sameSite)
- Refresh: Database SHA-256 hash
- Rotation: New refresh token on each use
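
The storage and rotation rules above can be sketched as follows. This is a minimal illustration, not the IAM service's code: the in-memory `tokensByHash` map stands in for the refresh-token table, and the names `issueRefreshToken` and `rotateRefreshToken` are hypothetical.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical in-memory stand-in for the refresh-token table.
interface StoredToken {
  userId: string;
  expiresAt: number; // epoch milliseconds
}
const tokensByHash = new Map<string, StoredToken>();

const REFRESH_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days, as listed above

function sha256Hex(value: string): string {
  return createHash("sha256").update(value).digest("hex");
}

// Issue an opaque refresh token; only its SHA-256 hash is persisted.
function issueRefreshToken(userId: string, now: number = Date.now()): string {
  const token = randomBytes(32).toString("hex");
  tokensByHash.set(sha256Hex(token), { userId, expiresAt: now + REFRESH_TTL_MS });
  return token;
}

// Redeem a token once: the old hash is deleted and a new token is issued.
function rotateRefreshToken(token: string, now: number = Date.now()): string | null {
  const hash = sha256Hex(token);
  const stored = tokensByHash.get(hash);
  if (!stored) return null; // unknown or already used
  tokensByHash.delete(hash); // single use, even when expired
  if (stored.expiresAt <= now) return null;
  return issueRefreshToken(stored.userId, now);
}
```

Because the old hash is removed before the new token is issued, replaying a previously used refresh token returns `null` in this sketch.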

**4. MFA Support (Two-Factor Authentication)**:
- TOTP (Time-based One-Time Password) using RFC 6238
- QR code generation for authenticator apps (Google Authenticator, Authy)
- Recovery codes (10 single-use codes)
- Secret key storage: UserManager.SetAuthenticationTokenAsync
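
For reference, TOTP code generation per RFC 6238 can be sketched with Node's built-in `crypto` module. The 30-second step, 6 digits, HMAC-SHA1, and the ±1 step verification window are common authenticator-app defaults assumed here; the document does not state which values the service uses, and `totp` / `verifyTotp` are hypothetical names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the TOTP code for a given Unix time (seconds).
function totp(secret: Buffer, unixSeconds: number, stepSeconds = 30, digits = 6): string {
  const counter = Math.floor(unixSeconds / stepSeconds);
  // Encode the counter as an 8-byte big-endian integer.
  const msg = Buffer.alloc(8);
  msg.writeUInt32BE(Math.floor(counter / 0x100000000), 0);
  msg.writeUInt32BE(counter >>> 0, 4);
  const mac = createHmac("sha1", secret).update(msg).digest();
  // Dynamic truncation (RFC 4226): the low nibble of the last byte picks the offset.
  const offset = mac.readUInt8(mac.length - 1) & 0x0f;
  const binary = mac.readUInt32BE(offset) & 0x7fffffff;
  return (binary % Math.pow(10, digits)).toString().padStart(digits, "0");
}

// Accept codes from adjacent time steps to tolerate clock drift.
function verifyTotp(secret: Buffer, code: string, unixSeconds: number, window = 1): boolean {
  for (let drift = -window; drift <= window; drift++) {
    const t = unixSeconds + drift * 30;
    if (t < 0) continue;
    const expected = totp(secret, t);
    if (expected.length === code.length &&
        timingSafeEqual(Buffer.from(expected), Buffer.from(code))) {
      return true;
    }
  }
  return false;
}
```

With the RFC test secret `12345678901234567890` (ASCII) and time 59, `totp` yields `287082`, the 6-digit form of the published test vector.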

**5. Email Verification**:
- SMTP-based verification emails via MailKit
- Token generation using UserManager.GenerateEmailConfirmationTokenAsync
- Verification link with token and userId
- EmailConfirmed flag set true upon confirmation

**6. Social Login (OAuth2 Providers)**:
- Google OAuth 2.0 integration
- Facebook OAuth integration
- Account linking for existing users (by email match)
- Auto email confirmation for social logins
- Provider info stored via UserManager.AddLoginAsync

## Mô hình Phân quyền / Authorization Model

```mermaid
graph TD
    User[User] --> Roles[Roles]
    User --> DirectPerms[Direct Permissions]

    Roles --> RolePerms[Role Permissions]

    RolePerms --> Check{Permission Check}
    DirectPerms --> Check

    Check -->|Granted| Resource[Access Resource]
    Check -->|Denied| Reject[403 Forbidden]

    subgraph "Permission Model"
        Perm[Permission<br/>resource:action:scope]
    end

    style Check fill:#1d4ed8,stroke:#fff,stroke-width:2px,color:#fff
    style Perm fill:#c2410c,stroke:#fff,stroke-width:2px,color:#fff
```

### VI: RBAC (Role-Based Access Control)

**1. Cấp bậc Role**:
```
SuperAdmin > OrgAdmin > Manager > User > Guest
```

**2. Format Permission**: `resource:action:scope`
- Resource: `users`, `roles`, `permissions`
- Action: `create`, `read`, `update`, `delete`
- Scope: `own`, `org`, `global`

**Ví dụ**:
- `users:read:own` - Đọc profile của chính mình
- `users:update:org` - Update users trong organization
- `roles:create:global` - Tạo roles globally

**3. Permission Caching**:
```
Cache key: user:{userId}:permissions
TTL: 5 phút
Invalidate khi: role change, permission change
```

### EN: RBAC (Role-Based Access Control)

**1. Role Hierarchy**:
```
SuperAdmin > OrgAdmin > Manager > User > Guest
```

**2. Permission Format**: `resource:action:scope`
- Resource: `users`, `roles`, `permissions`
- Action: `create`, `read`, `update`, `delete`
- Scope: `own`, `org`, `global`

**Examples**:
- `users:read:own` - Read own user profile
- `users:update:org` - Update users in organization
- `roles:create:global` - Create roles globally

**3. Permission Caching**:
```
Cache key: user:{userId}:permissions
TTL: 5 minutes
Invalidate on: role change, permission change
```
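
A permission check over the `resource:action:scope` format might look like the sketch below. One assumption is made that the document does not state: a wider scope implies the narrower ones (`global` covers `org`, which covers `own`). The function names are hypothetical.

```typescript
type Scope = "own" | "org" | "global";

interface Permission {
  resource: string;
  action: string;
  scope: Scope;
}

// Assumption: a wider scope implies the narrower ones.
const SCOPE_RANK: Record<Scope, number> = { own: 0, org: 1, global: 2 };

function parsePermission(value: string): Permission {
  const parts = value.split(":");
  if (parts.length !== 3) throw new Error(`Malformed permission: ${value}`);
  const [resource, action, scope] = parts as [string, string, string];
  if (scope !== "own" && scope !== "org" && scope !== "global") {
    throw new Error(`Unknown scope: ${scope}`);
  }
  return { resource, action, scope };
}

// Effective permissions are the union of role permissions and direct permissions.
function effectivePermissions(rolePermissions: string[], directPermissions: string[]): string[] {
  return Array.from(new Set([...rolePermissions, ...directPermissions]));
}

function isGranted(granted: string[], required: string): boolean {
  const need = parsePermission(required);
  return granted.map(parsePermission).some(
    (have) =>
      have.resource === need.resource &&
      have.action === need.action &&
      SCOPE_RANK[have.scope] >= SCOPE_RANK[need.scope],
  );
}
```

For example, a user holding `users:update:org` passes a check for `users:update:own` under this assumption, while `users:read:own` does not satisfy `users:read:org`.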

## Kiến trúc Zero-Trust / Zero-Trust Architecture

```mermaid
graph TD
    Request[Request] --> Device[Device Fingerprint]
    Device --> IP[IP Address Validation]
    IP --> Behavior[Behavioral Analysis]
    Behavior --> Session[Session Binding]

    Session -->|Valid| Allow[Allow Request]
    Session -->|Suspicious| MFA[Require MFA]
    Session -->|Anomaly| Block[Block + Alert]

    style Block fill:#b91c1c,stroke:#fff,stroke-width:2px,color:#fff
    style MFA fill:#c2410c,stroke:#fff,stroke-width:2px,color:#fff
    style Allow fill:#15803d,stroke:#fff,stroke-width:2px,color:#fff
```

### VI: Thành phần Zero-Trust

**1. Device Fingerprinting**:
- Browser: User-Agent, Canvas, WebGL
- Screen resolution, timezone, language
- Phát hiện plugin, fonts có sẵn
- Hash fingerprint → Lưu với session

**2. IP Address Validation**:
- Whitelist IPs đã biết cho user
- Alert với IP mới + require MFA
- Block IPs đáng ngờ (VPN, Tor)

**3. Behavioral Analysis**:
- Login patterns (time, location)
- API usage patterns
- Failed auth attempts
- Alert với anomalies

**4. Session Binding**:
- Bind session với device fingerprint
- Bind session với IP address
- Invalidate khi mismatch

### EN: Zero-Trust Components

**1. Device Fingerprinting**:
- Browser: User-Agent, Canvas, WebGL
- Screen resolution, timezone, language
- Plugin detection, fonts available
- Hash fingerprint → Store with session

**2. IP Address Validation**:
- Whitelist known IPs per user
- Alert on new IP + require MFA
- Block suspicious IPs (VPN, Tor)

**3. Behavioral Analysis**:
- Login patterns (time, location)
- API usage patterns
- Failed auth attempts
- Alert on anomalies

**4. Session Binding**:
- Bind session to device fingerprint
- Bind session to IP address
- Invalidate on mismatch
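
The decision flow in the diagram can be sketched as a single function. The mapping from signals to outcomes is an assumption made for illustration: a behavioral anomaly or a fingerprint mismatch blocks the request, an IP not previously seen for the user triggers MFA, and everything else is allowed. The behavioral analysis itself is not modeled; it is represented by a boolean input.

```typescript
import { createHash } from "node:crypto";

type Decision = "allow" | "require_mfa" | "block";

interface BoundSession {
  fingerprintHash: string; // stored when the session was created
  knownIps: string[];      // IPs previously seen for this user
}

interface IncomingRequest {
  fingerprintSignals: string[]; // e.g. user agent, timezone, language
  ipAddress: string;
  anomalyDetected: boolean;     // result of behavioral analysis (not modeled here)
}

// Hash the collected signals so only the digest is stored with the session.
function hashFingerprint(signals: string[]): string {
  return createHash("sha256").update(signals.join("\n")).digest("hex");
}

function evaluateRequest(session: BoundSession, request: IncomingRequest): Decision {
  if (request.anomalyDetected) return "block";
  // Session binding: a different device fingerprint invalidates the session.
  if (hashFingerprint(request.fingerprintSignals) !== session.fingerprintHash) return "block";
  // New IP for this user: step up to MFA instead of rejecting outright.
  if (!session.knownIps.includes(request.ipAddress)) return "require_mfa";
  return "allow";
}
```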

## Bảo vệ Dữ liệu / Data Protection

### VI: Chiến lược Mã hóa

**1. Data at Rest**:
- PII: AES-256-GCM encryption
- Passwords: bcrypt (cost 12)
- Tokens: SHA-256 hash
- Keys: Environment variables + K8s secrets

**2. Data in Transit**:
- TLS 1.2+ cho mọi giao tiếp
- HTTPS enforcement
- Certificate pinning (mobile clients)

**3. Key Management**:
- Unique key per encryption operation
- 32+ character ENCRYPTION_KEY
- Rotate keys hàng quý
- Không bao giờ hardcode secrets

### EN: Encryption Strategy

**1. Data at Rest**:
- PII: AES-256-GCM encryption
- Passwords: bcrypt (cost 12)
- Tokens: SHA-256 hash
- Keys: Environment variables + K8s secrets

**2. Data in Transit**:
- TLS 1.2+ for all communications
- HTTPS enforcement
- Certificate pinning (mobile clients)

**3. Key Management**:
- Unique key per encryption operation
- 32+ character ENCRYPTION_KEY
- Rotate keys quarterly
- Never hardcode secrets
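
A minimal sketch of AES-256-GCM field encryption with Node's `crypto` module follows. It assumes a 32-byte key is already available as a `Buffer` (how it is derived from `ENCRYPTION_KEY` is out of scope here) and uses a fresh random 12-byte IV per call. The payload layout (IV, then auth tag, then ciphertext, base64-encoded) and the names `encryptPii` / `decryptPii` are illustrative choices, not the service's actual format.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const IV_LENGTH = 12;  // recommended IV size for GCM
const TAG_LENGTH = 16; // default GCM authentication tag size

function encryptPii(plaintext: string, key: Buffer): string {
  const iv = randomBytes(IV_LENGTH); // never reuse an IV with the same key
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

function decryptPii(payload: string, key: Buffer): string {
  const raw = Buffer.from(payload, "base64");
  const iv = raw.subarray(0, IV_LENGTH);
  const tag = raw.subarray(IV_LENGTH, IV_LENGTH + TAG_LENGTH);
  const ciphertext = raw.subarray(IV_LENGTH + TAG_LENGTH);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the tag does not match, i.e. the data was tampered with.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Because GCM is authenticated, a modified ciphertext or tag makes `decryptPii` throw rather than return corrupted plaintext.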

## Tuân thủ & Kiểm toán / Compliance & Audit

### VI: Yêu cầu Tuân thủ

**1. GDPR**:
- Right to erasure (soft delete + hard delete sau 90 ngày)
- Data portability (export dữ liệu user)
- Quản lý consent
- Thông báo breach (72 giờ)

**2. SOC2**:
- Access controls (RBAC)
- Encryption at rest và in transit
- Audit logging (7 năm retention)
- Incident response plan

**3. Audit Trail**:
```typescript
{
  eventType: 'auth.login.success',
  userId: 'user_123',
  timestamp: '2024-01-15T10:30:00Z',
  ipAddress: '192.168.1.1',
  deviceFingerprint: 'fp_xyz',
  metadata: {...}
}
```

### EN: Compliance Requirements

**1. GDPR**:
- Right to erasure (soft delete + hard delete after 90 days)
- Data portability (export user data)
- Consent management
- Breach notification (72 hours)

**2. SOC2**:
- Access controls (RBAC)
- Encryption at rest and in transit
- Audit logging (7-year retention)
- Incident response plan

**3. Audit Trail**:

```typescript
{
  eventType: 'auth.login.success',
  userId: 'user_123',
  timestamp: '2024-01-15T10:30:00Z',
  ipAddress: '192.168.1.1',
  deviceFingerprint: 'fp_xyz',
  metadata: {...}
}
```

## Bối cảnh Hệ thống / System Context

```mermaid
%%{init: {'theme': 'dark'}}%%
C4Context
    title Sơ đồ Bối cảnh / Security Architecture Context Diagram

    Person(user, "Người dùng / User", "End user accessing platform")
    Person(admin, "Quản trị viên / Admin", "System administrator")
    Person(attacker, "Kẻ tấn công / Attacker", "Potential threat actor")

    System(iam, "IAM Service", "Authentication & Authorization")

    System_Ext(db, "Neon PostgreSQL", "Encrypted user credentials & sessions")
    System_Ext(cache, "Redis", "Permission & session cache")
    System_Ext(audit, "Audit Service", "Security event logging")
    System_Ext(mfa, "MFA Provider", "TOTP verification")
    System_Ext(monitoring, "Security Monitoring", "SIEM & alerting")

    Rel(user, iam, "Authenticates", "HTTPS + TLS 1.2+")
    Rel(admin, iam, "Manages permissions", "HTTPS + TLS 1.2+")
    Rel(attacker, iam, "Blocked by security layers", "")

    Rel(iam, db, "Stores credentials", "PostgreSQL + TLS")
    Rel(iam, cache, "Caches permissions", "Redis + TLS")
    Rel(iam, audit, "Logs security events", "Kafka")
    Rel(iam, mfa, "Verifies MFA", "HTTPS")
    Rel(iam, monitoring, "Sends security metrics", "Prometheus + Loki")
```

**VI Mô tả**:
- **IAM Service**: Trung tâm xác thực và phân quyền
- **Database**: Lưu trữ credentials đã mã hóa, sessions, permissions
- **Cache**: Cache permissions và sessions để giảm database load
- **Audit Service**: Nhận và lưu trữ tất cả security events
- **MFA Provider**: External TOTP verification service (Google Authenticator compatible)
- **Security Monitoring**: SIEM (Security Information and Event Management) và alerting

**EN Description**:
- **IAM Service**: Central authentication and authorization
- **Database**: Stores encrypted credentials, sessions, permissions
- **Cache**: Caches permissions and sessions to reduce database load
- **Audit Service**: Receives and stores all security events
- **MFA Provider**: External TOTP verification service (Google Authenticator compatible)
- **Security Monitoring**: SIEM (Security Information and Event Management) and alerting

## Kiến trúc Database / Database Architecture

```mermaid
%%{init: {'theme': 'dark'}}%%
erDiagram
    User ||--o{ Session : has
    User ||--o{ UserRole : has
    User ||--o{ UserPermission : has
    User ||--o{ MFADevice : has
    User ||--o{ LoginHistory : has
    User ||--o{ DeviceFingerprint : has

    Role ||--o{ UserRole : assigned_to
    Role ||--o{ RolePermission : has

    Permission ||--o{ RolePermission : granted_to
    Permission ||--o{ UserPermission : granted_to

    Organization ||--o{ User : contains
    Organization ||--o{ Role : defines

    User {
        string id PK "CUID"
        string email UK "Unique, indexed"
        string passwordHash "bcrypt cost 12"
        string organizationId FK
        boolean mfaEnabled "MFA required?"
        datetime lastLoginAt "Tracking"
        datetime createdAt "Timestamp"
        datetime updatedAt "Timestamp"
        datetime deletedAt "Soft delete"
    }

    Session {
        string id PK "CUID"
        string userId FK
        string refreshTokenHash "SHA-256"
        string deviceFingerprint "Hashed"
        string ipAddress "IPv4/IPv6"
        string userAgent "Browser info"
        datetime expiresAt "7 days TTL"
        datetime lastActivityAt "Tracking"
        datetime createdAt "Timestamp"
    }

    Role {
        string id PK "CUID"
        string name "role-name"
        string organizationId FK
        int hierarchy "Priority level"
        boolean isSystem "Built-in?"
        datetime createdAt "Timestamp"
    }

    Permission {
        string id PK "CUID"
        string resource "users, roles, etc"
        string action "create, read, update, delete"
        string scope "own, org, global"
        datetime createdAt "Timestamp"
    }

    MFADevice {
        string id PK "CUID"
        string userId FK
        string type "totp, backup"
        string secret "Encrypted TOTP secret"
        boolean verified "Verified?"
        datetime lastUsedAt "Tracking"
        datetime createdAt "Timestamp"
    }

    LoginHistory {
        string id PK "CUID"
        string userId FK
        boolean success "Success/Failure"
        string ipAddress "IPv4/IPv6"
        string deviceFingerprint "Hashed"
        string failureReason "If failed"
        datetime timestamp "Event time"
    }

    DeviceFingerprint {
        string id PK "CUID"
        string userId FK
        string fingerprint "Hashed"
        boolean trusted "Auto-approved?"
        datetime firstSeenAt "First use"
        datetime lastSeenAt "Last use"
    }
```

**VI Mô tả**:
- **User**: Lưu credentials đã hash, MFA settings, organization membership
- **Session**: Lưu refresh tokens đã hash, device fingerprint, IP tracking
- **Role & Permission**: RBAC hierarchy với system roles và custom roles
- **MFADevice**: TOTP secrets (encrypted), backup codes
- **LoginHistory**: Audit trail cho tất cả login attempts (success/failure)
- **DeviceFingerprint**: Trusted device tracking cho zero-trust model

**Bảo mật Database**:
- Password hashes: bcrypt với cost factor 12
- Token hashes: SHA-256
- MFA secrets: AES-256-GCM encryption
- Soft deletes: `deletedAt` field, hard delete sau 90 ngày (GDPR)
- Indexes: email (unique), userId (foreign keys), timestamps

**EN Description**:
- **User**: Stores hashed credentials, MFA settings, organization membership
- **Session**: Stores hashed refresh tokens, device fingerprint, IP tracking
- **Role & Permission**: RBAC hierarchy with system roles and custom roles
- **MFADevice**: TOTP secrets (encrypted), backup codes
- **LoginHistory**: Audit trail for all login attempts (success/failure)
- **DeviceFingerprint**: Trusted device tracking for zero-trust model

**Database Security**:
- Password hashes: bcrypt with cost factor 12
- Token hashes: SHA-256
- MFA secrets: AES-256-GCM encryption
- Soft deletes: `deletedAt` field, hard delete after 90 days (GDPR)
- Indexes: email (unique), userId (foreign keys), timestamps

## Quyết định Thiết kế / Design Decisions

### Quyết định 1: JWT với RS256 (Asymmetric)

**VI Bối cảnh**: Cần stateless authentication với khả năng verify tokens ở multiple services

**VI Quyết định**: Sử dụng JWT với RS256 (RSA asymmetric signing) thay vì HS256 (HMAC symmetric)

**VI Hậu quả**:
- ✅ **Tích cực**:
  - Services có thể verify tokens với public key, không cần secret
  - Key rotation dễ dàng hơn (chỉ cần distribute public key mới)
  - Bảo mật cao hơn (private key chỉ ở IAM service)
  - Compliance: Audit trail rõ ràng về ai sign tokens
- ❌ **Tiêu cực**:
  - Chậm hơn HS256 một chút (~10-20%)
  - Phức tạp hơn trong key management
  - Private key phải được bảo vệ cẩn thận

**VI Các lựa chọn thay thế**: HS256 (symmetric), EdDSA, OAuth 2.0 with Opaque Tokens

**EN Context**: Need stateless authentication with ability to verify tokens in multiple services

**EN Decision**: Use JWT with RS256 (RSA asymmetric signing) instead of HS256 (HMAC symmetric)

**EN Consequences**:
- ✅ **Positive**:
  - Services can verify tokens with public key, don't need secret
  - Easier key rotation (only distribute new public key)
  - Higher security (private key only in IAM service)
  - Compliance: Clear audit trail of who signs tokens
- ❌ **Negative**:
  - Slightly slower than HS256 (~10-20%)
  - More complex key management
  - Private key must be carefully protected

**EN Alternatives**: HS256 (symmetric), EdDSA, OAuth 2.0 with Opaque Tokens
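
To illustrate the split between signing and verification, here is a hand-rolled RS256 sketch using only Node's `crypto` module. It is for illustration only; the platform issues tokens through OpenIddict, and a production service would use a maintained JWT library. The helper names and the demo key pair are hypothetical.

```typescript
import { createSign, createVerify, generateKeyPairSync } from "node:crypto";

const toBase64Url = (data: Buffer): string =>
  data.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
const fromBase64Url = (text: string): Buffer =>
  Buffer.from(text.replace(/-/g, "+").replace(/_/g, "/"), "base64");

// Only the IAM service holds the private key and can sign.
function signJwt(claims: Record<string, unknown>, privateKeyPem: string): string {
  const header = toBase64Url(Buffer.from(JSON.stringify({ alg: "RS256", typ: "JWT" })));
  const payload = toBase64Url(Buffer.from(JSON.stringify(claims)));
  const signature = createSign("RSA-SHA256").update(`${header}.${payload}`).sign(privateKeyPem);
  return `${header}.${payload}.${toBase64Url(signature)}`;
}

// Any service can verify with the public key alone.
function verifyJwt(token: string, publicKeyPem: string, nowSeconds: number): Record<string, unknown> | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts as [string, string, string];
  try {
    const parsedHeader = JSON.parse(fromBase64Url(header).toString("utf8")) as { alg?: string };
    if (parsedHeader.alg !== "RS256") return null; // reject algorithm substitution
    const valid = createVerify("RSA-SHA256")
      .update(`${header}.${payload}`)
      .verify(publicKeyPem, fromBase64Url(signature));
    if (!valid) return null;
    const claims = JSON.parse(fromBase64Url(payload).toString("utf8")) as Record<string, unknown>;
    const exp = claims["exp"];
    if (typeof exp !== "number" || exp <= nowSeconds) return null; // expired or missing exp
    return claims;
  } catch {
    return null; // malformed token
  }
}

// Demo key pair; a real deployment would load keys from the secrets manager.
const { publicKey, privateKey } = generateKeyPairSync("rsa", {
  modulusLength: 2048,
  publicKeyEncoding: { type: "spki", format: "pem" },
  privateKeyEncoding: { type: "pkcs8", format: "pem" },
});
```

A 15-minute access token would be signed as `signJwt({ sub: "user_123", exp: now + 900 }, privateKey)` and checked by downstream services with `verifyJwt(token, publicKey, now)`.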

---

### Quyết định 2: Zero-Trust Model với Device Fingerprinting

**VI Bối cảnh**: Cần bảo vệ chống lại credential theft, session hijacking và unauthorized access

**VI Quyết định**: Triển khai zero-trust model với device fingerprinting, IP validation, behavioral analysis

**VI Hậu quả**:
- ✅ **Tích cực**:
  - Phát hiện được anomalies (new device, new IP, unusual behavior)
  - Tăng security khi detect và block suspicious activities
  - Compliance: SOC2, ISO27001 requirements
  - User experience: Auto-approve trusted devices
- ❌ **Tiêu cực**:
  - Complexity cao hơn
  - Potential false positives (legitimate users blocked)
  - Performance overhead (fingerprint hash, IP check)
  - Privacy concerns (tracking devices, IPs)

**VI Các lựa chọn thay thế**: Basic authentication only, IP whitelist only, MFA required for all

**EN Context**: Need to protect against credential theft, session hijacking, and unauthorized access

**EN Decision**: Implement zero-trust model with device fingerprinting, IP validation, behavioral analysis

**EN Consequences**:
- ✅ **Positive**:
  - Detect anomalies (new device, new IP, unusual behavior)
  - Increased security by detecting and blocking suspicious activities
  - Compliance: SOC2, ISO27001 requirements
  - User experience: Auto-approve trusted devices
- ❌ **Negative**:
  - Higher complexity
  - Potential false positives (legitimate users blocked)
  - Performance overhead (fingerprint hash, IP check)
  - Privacy concerns (tracking devices, IPs)

**EN Alternatives**: Basic authentication only, IP whitelist only, MFA required for all

---

### Quyết định 3: Event Sourcing cho Audit Trail

**VI Bối cảnh**: Cần immutable audit trail cho compliance (GDPR, SOC2, HIPAA) và security forensics

**VI Quyết định**: Sử dụng event sourcing pattern để lưu tất cả auth/security events

**VI Hậu quả**:
- ✅ **Tích cực**:
  - Immutable audit trail (không thể modify/delete)
  - Complete history của tất cả security events
  - Compliance: GDPR, SOC2, HIPAA (audit retention 7 năm)
  - Security forensics: Trace back attacks, breaches
  - Replay events để reconstruct state
- ❌ **Tiêu cực**:
  - Storage cost cao (retain 7 years)
  - Complexity trong event schema versioning
  - Performance: Event publishing overhead
  - Data privacy: Must anonymize PII after retention period

**VI Các lựa chọn thay thế**: Database audit logs only, External SIEM only, No audit trail

**EN Context**: Need immutable audit trail for compliance (GDPR, SOC2, HIPAA) and security forensics

**EN Decision**: Use event sourcing pattern to store all auth/security events

**EN Consequences**:
- ✅ **Positive**:
  - Immutable audit trail (cannot modify/delete)
  - Complete history of all security events
  - Compliance: GDPR, SOC2, HIPAA (7-year audit retention)
  - Security forensics: Trace back attacks, breaches
  - Replay events to reconstruct state
- ❌ **Negative**:
  - High storage cost (retain 7 years)
  - Complexity in event schema versioning
  - Performance: Event publishing overhead
  - Data privacy: Must anonymize PII after retention period

**EN Alternatives**: Database audit logs only, External SIEM only, No audit trail
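
The "replay events to reconstruct state" consequence can be sketched as a fold over an append-only event list. The event shape follows the audit record shown earlier (`eventType`, `userId`, `timestamp`); the `AccountState` type and the lock threshold of 5 failures are assumptions made for this example.

```typescript
type AuthEvent =
  | { eventType: "auth.login.success"; userId: string; timestamp: string }
  | { eventType: "auth.login.failure"; userId: string; timestamp: string };

interface AccountState {
  failedAttempts: number;
  locked: boolean;
  lastLoginAt: string | null;
}

const MAX_FAILED_ATTEMPTS = 5; // assumed threshold for this sketch
const INITIAL_STATE: AccountState = { failedAttempts: 0, locked: false, lastLoginAt: null };

// Pure transition function: the same events always produce the same state.
function applyEvent(state: AccountState, event: AuthEvent): AccountState {
  switch (event.eventType) {
    case "auth.login.success":
      return { failedAttempts: 0, locked: false, lastLoginAt: event.timestamp };
    case "auth.login.failure": {
      const failedAttempts = state.failedAttempts + 1;
      return { ...state, failedAttempts, locked: failedAttempts >= MAX_FAILED_ATTEMPTS };
    }
    default:
      return state;
  }
}

// Rebuild a user's state by replaying their events in order.
function replay(events: ReadonlyArray<AuthEvent>, userId: string): AccountState {
  return events
    .filter((event) => event.userId === userId)
    .reduce(applyEvent, INITIAL_STATE);
}
```

Since events are never modified, the current state is always derivable from the log, and a past state can be reconstructed by replaying only the events up to a given timestamp.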

## Đặc điểm Hiệu suất / Performance Characteristics

| Chỉ số / Metric | Mục tiêu / Target | Ghi chú / Notes |
|-----------------|-------------------|-----------------|
| **Login Time (P95)** | < 500ms | Including bcrypt verification |
| **Login Time (P99)** | < 1s | Peak load |
| **Token Generation (P95)** | < 50ms | JWT sign with RS256 |
| **Token Verification (P95)** | < 10ms | JWT verify with public key |
| **Permission Check (P95)** | < 5ms | From cache (L1 or L2) |
| **Permission Check (Cache Miss)** | < 50ms | Database query |
| **MFA Verification (P95)** | < 100ms | TOTP validation |
| **Session Lookup (P95)** | < 10ms | Redis cache |
| **Password Hash (P95)** | < 200ms | bcrypt cost 12 |
| **Device Fingerprint Hash** | < 5ms | SHA-256 |
| **Failed Login Rate Limit** | 5 attempts / 15min | Per user |
| **Auth Throughput** | 500 req/s | Per IAM instance |
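
The failed-login limit in the table (5 attempts per 15 minutes, per user) could be enforced with a sliding-window counter like the sketch below. It keeps state in process memory for simplicity; with several IAM instances the counters would need to live in a shared store such as Redis. The class and method names are hypothetical.

```typescript
class FailedLoginLimiter {
  private failures = new Map<string, number[]>(); // userId -> failure timestamps (ms)
  private maxAttempts: number;
  private windowMs: number;

  constructor(maxAttempts = 5, windowMs = 15 * 60 * 1000) {
    this.maxAttempts = maxAttempts;
    this.windowMs = windowMs;
  }

  // Drop timestamps that fell out of the window and return the rest.
  private recent(userId: string, now: number): number[] {
    const kept = (this.failures.get(userId) ?? []).filter((t) => now - t < this.windowMs);
    this.failures.set(userId, kept);
    return kept;
  }

  recordFailure(userId: string, now: number = Date.now()): void {
    this.recent(userId, now).push(now);
  }

  isBlocked(userId: string, now: number = Date.now()): boolean {
    return this.recent(userId, now).length >= this.maxAttempts;
  }

  // Call after a successful login to clear the user's history.
  reset(userId: string): void {
    this.failures.delete(userId);
  }
}
```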

**VI Tối ưu hóa Hiệu suất**:
- **Permission Caching**: L1 (memory) + L2 (Redis), TTL 5 phút
- **Token Caching**: Cache public key in memory for JWT verification
- **Connection Pooling**: Reuse database connections
- **Async Operations**: Event publishing, audit logging (fire-and-forget)
- **Rate Limiting**: Prevent brute force attacks, reduce load
- **Horizontal Scaling**: Multiple IAM service instances

**EN Performance Optimizations**:
- **Permission Caching**: L1 (memory) + L2 (Redis), TTL 5 minutes
- **Token Caching**: Cache public key in memory for JWT verification
- **Connection Pooling**: Reuse database connections
- **Async Operations**: Event publishing, audit logging (fire-and-forget)
- **Rate Limiting**: Prevent brute force attacks, reduce load
- **Horizontal Scaling**: Multiple IAM service instances

## Triển khai / Deployment

```mermaid
graph TD
    subgraph "Security Layer"
        LB[Load Balancer<br/>TLS Termination]
        WAF[WAF / Firewall<br/>Rate Limiting<br/>DDoS Protection]
    end

    subgraph "IAM Service Layer"
        IAM1[IAM Service Pod 1<br/>Stateless]
        IAM2[IAM Service Pod 2<br/>Stateless]
        IAM3[IAM Service Pod 3<br/>Stateless]
    end

    subgraph "Data Layer"
        DB[(Neon PostgreSQL<br/>Encrypted at Rest)]
        Cache[(Redis Cluster<br/>TLS Enabled)]
        Vault[Secrets Manager<br/>K8s Secrets]
    end

    subgraph "Security Monitoring"
        SIEM[SIEM / Security Monitoring]
        Alerts[Alerting System]
    end

    Client[Clients] --> LB
    LB --> WAF
    WAF --> IAM1
    WAF --> IAM2
    WAF --> IAM3

    IAM1 --> DB
    IAM1 --> Cache
    IAM1 --> Vault

    IAM2 --> DB
    IAM2 --> Cache
    IAM2 --> Vault

    IAM3 --> DB
    IAM3 --> Cache
    IAM3 --> Vault

    IAM1 -.->|Security Events| SIEM
    IAM2 -.->|Security Events| SIEM
    IAM3 -.->|Security Events| SIEM

    SIEM -.->|Alerts| Alerts

    style LB fill:#15803d,stroke:#fff,stroke-width:2px,color:#fff
    style WAF fill:#c2410c,stroke:#fff,stroke-width:2px,color:#fff
    style DB fill:#7e22ce,stroke:#fff,stroke-width:2px,color:#fff
    style Cache fill:#1f2937,stroke:#fff,stroke-width:2px,color:#fff
    style Vault fill:#b91c1c,stroke:#fff,stroke-width:2px,color:#fff
    style SIEM fill:#1d4ed8,stroke:#fff,stroke-width:2px,color:#fff
```

### VI: Chiến lược Triển khai

**Security Deployment**:
- **TLS 1.2+ Enforcement**: All connections require TLS
- **Network Policies (K8s)**: Deny all by default, whitelist specific services
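- **Pod Security Policies**: Non-root user, read-only filesystem, no privilege escalation (PodSecurityPolicy was removed in Kubernetes 1.25; on newer clusters enforce these via Pod Security Admission)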
- **Secrets Management**: Kubernetes secrets with encryption at rest
- **Image Scanning**: Trivy/Clair scan before deployment
- **RBAC (K8s)**: Least privilege for service accounts

**Resource Allocation**:
| Component | CPU | Memory | Replicas |
|-----------|-----|--------|----------|
| **IAM Service** | 500m | 1GB | 3-10 (HPA) |
| **Redis** | 1 core | 2GB | 3 masters + 3 slaves |

**Security Configuration**:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: iam-service-policy
spec:
  podSelector:
    matchLabels:
      app: iam-service
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api-gateway
    ports:
    - protocol: TCP
      port: 5000
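  egress:
  # Only PostgreSQL and Redis are allowed below; DNS (port 53), Kafka, and any
  # externally hosted database need their own egress rules or traffic is dropped.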
  - to:
    - podSelector:
        matchLabels:
          app: postgresql
    ports:
    - protocol: TCP
      port: 5432
  - to:
    - podSelector:
        matchLabels:
          app: redis
    ports:
    - protocol: TCP
      port: 6379
```

**Deployment Security Checklist**:
- [ ] TLS 1.2+ enforced
- [ ] Network policies configured
- [ ] Pod security policies applied
- [ ] Secrets encrypted at rest
- [ ] Container images scanned
- [ ] Non-root user in containers
- [ ] Read-only root filesystem
- [ ] Resource limits set
- [ ] Health checks configured
- [ ] Security monitoring enabled

### EN: Deployment Strategy

**Security Deployment**:
- **TLS 1.2+ Enforcement**: All connections require TLS
- **Network Policies (K8s)**: Deny all by default, whitelist specific services
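- **Pod Security Policies**: Non-root user, read-only filesystem, no privilege escalation (PodSecurityPolicy was removed in Kubernetes 1.25; on newer clusters enforce these via Pod Security Admission)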
- **Secrets Management**: Kubernetes secrets with encryption at rest
- **Image Scanning**: Trivy/Clair scan before deployment
- **RBAC (K8s)**: Least privilege for service accounts

**Resource Allocation**:
| Component | CPU | Memory | Replicas |
|-----------|-----|--------|----------|
| **IAM Service** | 500m | 1GB | 3-10 (HPA) |
| **Redis** | 1 core | 2GB | 3 masters + 3 slaves |

**Security Configuration**:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: iam-service-policy
spec:
  podSelector:
    matchLabels:
      app: iam-service
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api-gateway
    ports:
    - protocol: TCP
      port: 5000
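  egress:
  # Only PostgreSQL and Redis are allowed below; DNS (port 53), Kafka, and any
  # externally hosted database need their own egress rules or traffic is dropped.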
  - to:
    - podSelector:
        matchLabels:
          app: postgresql
    ports:
    - protocol: TCP
      port: 5432
  - to:
    - podSelector:
        matchLabels:
          app: redis
    ports:
    - protocol: TCP
      port: 6379
```

**Deployment Security Checklist**:
- [ ] TLS 1.2+ enforced
- [ ] Network policies configured
- [ ] Pod security policies applied
- [ ] Secrets encrypted at rest
- [ ] Container images scanned
- [ ] Non-root user in containers
- [ ] Read-only root filesystem
- [ ] Resource limits set
- [ ] Health checks configured
- [ ] Security monitoring enabled

## Giám sát & Khả năng quan sát / Monitoring & Observability

### VI: Chỉ số Chính

**Authentication Metrics**:
- `auth_login_attempts_total` - Total login attempts (counter, labels: status=success/failure)
- `auth_login_duration_seconds` - Login duration (histogram)
- `auth_token_generations_total` - Token generations (counter)
- `auth_token_verifications_total` - Token verifications (counter, labels: status=valid/invalid/expired)
- `auth_mfa_verifications_total` - MFA verifications (counter, labels: status=success/failure)

**Authorization Metrics**:
- `auth_permission_checks_total` - Permission checks (counter, labels: result=granted/denied)
- `auth_permission_cache_hits_total` - Permission cache hits (counter)
- `auth_permission_cache_misses_total` - Permission cache misses (counter)

**Security Metrics**:
- `auth_failed_login_rate` - Failed login rate per user (gauge)
- `auth_account_lockouts_total` - Account lockouts (counter)
- `auth_suspicious_activities_total` - Suspicious activities detected (counter, labels: type)
- `auth_anomalies_detected_total` - Anomalies detected (counter, labels: anomaly_type)
- `auth_password_reset_requests_total` - Password reset requests (counter)

**Session Metrics**:
- `auth_active_sessions` - Active sessions (gauge)
- `auth_session_creations_total` - Session creations (counter)
- `auth_session_invalidations_total` - Session invalidations (counter, labels: reason)

**Application Code**:
```typescript
import { Counter, Histogram, Gauge } from 'prom-client';

export const loginAttempts = new Counter({
  name: 'auth_login_attempts_total',
  help: 'Total login attempts',
  labelNames: ['status']
});

export const loginDuration = new Histogram({
  name: 'auth_login_duration_seconds',
  help: 'Login duration in seconds',
  buckets: [0.1, 0.3, 0.5, 0.7, 1, 2, 5]
});

export const permissionChecks = new Counter({
  name: 'auth_permission_checks_total',
  help: 'Total permission checks',
  labelNames: ['result']
});

export const suspiciousActivities = new Counter({
  name: 'auth_suspicious_activities_total',
  help: 'Suspicious activities detected',
  labelNames: ['type']
});

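// Example usage at the call sites
const stopLoginTimer = loginDuration.startTimer(); // start before verifying credentials
// ... verify credentials ...
stopLoginTimer(); // records the elapsed time in seconds
loginAttempts.inc({ status: 'success' });
permissionChecks.inc({ result: 'granted' });
suspiciousActivities.inc({ type: 'new_device' });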
```

**Alerting Rules**:
```yaml
groups:
  - name: security_alerts
    interval: 30s
    rules:
      - alert: HighFailedLoginRate
        expr: rate(auth_login_attempts_total{status="failure"}[5m]) > 10
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High failed login rate detected"
          description: "Failed login rate is {{ $value }}/sec"

      # Requires a user_id label on the counter; increase() counts failures per minute
      - alert: BruteForceAttack
        expr: |
          sum by (user_id) (
            increase(auth_login_attempts_total{status="failure"}[1m])
          ) > 5
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Potential brute force attack"
          description: "User {{ $labels.user_id }} has > 5 failed logins/min"

      - alert: AccountLockoutSpike
        expr: rate(auth_account_lockouts_total[5m]) > 5
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Account lockout spike detected"
          description: "Lockout rate is {{ $value }}/sec"

      - alert: SuspiciousActivity
        expr: rate(auth_suspicious_activities_total[5m]) > 10
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Suspicious activity detected"
          description: "Suspicious activity rate: {{ $value }}/sec"

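      # A counter never decreases, so alert on its recent increase rather
      # than on its absolute value.
      - alert: AnomalyDetected
        expr: increase(auth_anomalies_detected_total[5m]) > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Security anomaly detected"
          description: "{{ $labels.anomaly_type }} detected"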

      - alert: PermissionDeniedSpike
        expr: rate(auth_permission_checks_total{result="denied"}[5m]) > 50
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High permission denied rate"
          description: "Permission denied rate: {{ $value }}/sec"
```

**Security Dashboards**:
- **Authentication Overview**: Login success/failure rate, login duration, MFA usage
- **Authorization Overview**: Permission checks, cache hit rate, denied requests
- **Security Events**: Suspicious activities, anomalies, account lockouts
- **Session Management**: Active sessions, session creations/invalidations
- **Compliance**: Audit trail completeness, retention policy compliance

**Logging**:
```typescript
logger.info('Login successful', {
  eventType: 'auth.login.success',
  userId: user.id,
  email: user.email,
  ipAddress: req.ip,
  deviceFingerprint: fingerprint,
  mfaUsed: user.mfaEnabled,
  correlationId: req.correlationId
});

logger.warn('Suspicious activity detected', {
  eventType: 'security.suspicious_activity',
  userId: user.id,
  activityType: 'new_device',
  ipAddress: req.ip,
  deviceFingerprint: newFingerprint,
  correlationId: req.correlationId
});

logger.error('Login failed', {
  eventType: 'auth.login.failure',
  email: email,
  reason: 'invalid_credentials',
  ipAddress: req.ip,
  attemptCount: failedAttempts,
  correlationId: req.correlationId
});
```

**Audit Trail Monitoring**:
- Event publishing rate and latency
- Event consumption lag
- Audit log completeness (no gaps)
- Retention policy compliance
- Anonymization after retention period

### EN: Key Metrics

**Authentication Metrics**:
- `auth_login_attempts_total` - Total login attempts (counter, labels: status=success/failure)
- `auth_login_duration_seconds` - Login duration (histogram)
- `auth_token_generations_total` - Token generations (counter)
- `auth_token_verifications_total` - Token verifications (counter, labels: status=valid/invalid/expired)
- `auth_mfa_verifications_total` - MFA verifications (counter, labels: status=success/failure)

**Authorization Metrics**:
- `auth_permission_checks_total` - Permission checks (counter, labels: result=granted/denied)
- `auth_permission_cache_hits_total` - Permission cache hits (counter)
- `auth_permission_cache_misses_total` - Permission cache misses (counter)

**Security Metrics**:
- `auth_failed_login_rate` - Failed login rate per user (gauge)
- `auth_account_lockouts_total` - Account lockouts (counter)
- `auth_suspicious_activities_total` - Suspicious activities detected (counter, labels: type)
- `auth_anomalies_detected_total` - Anomalies detected (counter, labels: anomaly_type)
- `auth_password_reset_requests_total` - Password reset requests (counter)

**Session Metrics**:
- `auth_active_sessions` - Active sessions (gauge)
- `auth_session_creations_total` - Session creations (counter)
- `auth_session_invalidations_total` - Session invalidations (counter, labels: reason)

**Application Code**:
```typescript
import { Counter, Histogram } from 'prom-client';

export const loginAttempts = new Counter({
  name: 'auth_login_attempts_total',
  help: 'Total login attempts',
  labelNames: ['status']
});

export const loginDuration = new Histogram({
  name: 'auth_login_duration_seconds',
  help: 'Login duration in seconds',
  buckets: [0.1, 0.3, 0.5, 0.7, 1, 2, 5]
});

export const permissionChecks = new Counter({
  name: 'auth_permission_checks_total',
  help: 'Total permission checks',
  labelNames: ['result']
});

export const suspiciousActivities = new Counter({
  name: 'auth_suspicious_activities_total',
  help: 'Suspicious activities detected',
  labelNames: ['type']
});

// Example call sites; `duration` is the measured login time in seconds.
loginAttempts.inc({ status: 'success' });
loginDuration.observe(duration);
permissionChecks.inc({ result: 'granted' });
suspiciousActivities.inc({ type: 'new_device' });
```
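The call `loginDuration.observe(duration)` assumes a duration already measured in seconds. A minimal sketch of one way to obtain it follows. `startTimer` and `Observe` are hypothetical names introduced for illustration, and the observer is injected so the snippet runs without `prom-client`. (prom-client histograms also expose their own `startTimer()` method, which may be preferable in the real service.)

```typescript
// Hypothetical helper: `startTimer` and `Observe` are illustrative names,
// not part of the GoodGo codebase.
type Observe = (seconds: number) => void;

function startTimer(observe: Observe, nowMs: () => number = Date.now): () => number {
  const startedAt = nowMs();
  // The returned function stops the timer and reports the elapsed time.
  return () => {
    const seconds = (nowMs() - startedAt) / 1000; // histogram buckets are in seconds
    observe(seconds);
    return seconds;
  };
}

// Stand-in observer; in the service this could be (s) => loginDuration.observe(s).
const observed: number[] = [];
const stop = startTimer((s) => observed.push(s));
// ... perform the login ...
stop();
```

Passing the clock as a parameter keeps the helper deterministic under test.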

**Alerting Rules**:
```yaml
groups:
  - name: security_alerts
    interval: 30s
    rules:
      - alert: HighFailedLoginRate
        expr: rate(auth_login_attempts_total{status="failure"}[5m]) > 10
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High failed login rate detected"
          description: "Failed login rate is {{ $value }}/sec"

      # Requires a `user_id` label on auth_login_attempts_total; the counter
      # declared in the application code only carries `status`.
      - alert: BruteForceAttack
        expr: |
          sum by (user_id) (
            increase(auth_login_attempts_total{status="failure"}[1m])
          ) > 5
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Potential brute force attack"
          description: "User {{ $labels.user_id }} has > 5 failed logins/min"

      - alert: AccountLockoutSpike
        expr: rate(auth_account_lockouts_total[5m]) > 5
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Account lockout spike detected"
          description: "Lockout rate is {{ $value }}/sec"

      - alert: SuspiciousActivity
        expr: rate(auth_suspicious_activities_total[5m]) > 10
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Suspicious activity detected"
          description: "Suspicious activity rate: {{ $value }}/sec"

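      # A counter never decreases, so alert on its recent increase rather
      # than on its absolute value.
      - alert: AnomalyDetected
        expr: increase(auth_anomalies_detected_total[5m]) > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Security anomaly detected"
          description: "{{ $labels.anomaly_type }} detected"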

      - alert: PermissionDeniedSpike
        expr: rate(auth_permission_checks_total{result="denied"}[5m]) > 50
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High permission denied rate"
          description: "Permission denied rate: {{ $value }}/sec"
```
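The `BruteForceAttack` rule groups by `user_id`, which would add one time series per user to the login counter. An alternative is to count failures per user inside the service and export only the aggregate signal. The sketch below assumes that approach. `FailedLoginWindow`, the 60-second window, and the threshold of 5 are illustrative choices that mirror the rule, not the actual GoodGo implementation.

```typescript
// Hypothetical in-process tracker; names and defaults are assumptions.
class FailedLoginWindow {
  private readonly attempts = new Map<string, number[]>();
  private readonly windowMs: number;
  private readonly threshold: number;

  constructor(windowMs: number = 60_000, threshold: number = 5) {
    this.windowMs = windowMs;
    this.threshold = threshold;
  }

  // Records one failed login and returns true once the user has more than
  // `threshold` failures inside the sliding window.
  recordFailure(userId: string, nowMs: number = Date.now()): boolean {
    const cutoff = nowMs - this.windowMs;
    const recent = (this.attempts.get(userId) ?? []).filter((t) => t > cutoff);
    recent.push(nowMs);
    this.attempts.set(userId, recent);
    return recent.length > this.threshold;
  }
}

const failedLogins = new FailedLoginWindow();
if (failedLogins.recordFailure('user-123')) {
  // e.g. suspiciousActivities.inc({ type: 'brute_force' });
}
```

A production version would also evict idle users from the map and, with several service instances, keep the window in a shared store.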

**Security Dashboards**:
- **Authentication Overview**: Login success/failure rate, login duration, MFA usage
- **Authorization Overview**: Permission checks, cache hit rate, denied requests
- **Security Events**: Suspicious activities, anomalies, account lockouts
- **Session Management**: Active sessions, session creations/invalidations
- **Compliance**: Audit trail completeness, retention policy compliance
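The "cache hit rate" panel is not a metric of its own; it can be derived from the `auth_permission_cache_hits_total` and `auth_permission_cache_misses_total` counters. A minimal sketch of that arithmetic, with `cacheHitRate` as a hypothetical helper name:

```typescript
// Hypothetical helper: derives the hit rate from the two counter values
// (or from their increases over the same time window).
function cacheHitRate(hits: number, misses: number): number {
  const total = hits + misses;
  // With no lookups yet, report 0 instead of dividing by zero.
  return total === 0 ? 0 : hits / total;
}

const hitRate = cacheHitRate(3, 1); // 3 hits out of 4 lookups
```

On a dashboard the same ratio would normally be computed from the rates of the two counters over a window rather than from their lifetime totals.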

**Logging**:
```typescript
logger.info('Login successful', {
  eventType: 'auth.login.success',
  userId: user.id,
  email: user.email,
  ipAddress: req.ip,
  deviceFingerprint: fingerprint,
  mfaUsed: user.mfaEnabled,
  correlationId: req.correlationId
});

logger.warn('Suspicious activity detected', {
  eventType: 'security.suspicious_activity',
  userId: user.id,
  activityType: 'new_device',
  ipAddress: req.ip,
  deviceFingerprint: newFingerprint,
  correlationId: req.correlationId
});

logger.error('Login failed', {
  eventType: 'auth.login.failure',
  email: email,
  reason: 'invalid_credentials',
  ipAddress: req.ip,
  attemptCount: failedAttempts,
  correlationId: req.correlationId
});
```
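The log entries above record the full `email`. Where log storage falls under the GDPR obligations this document lists, one option is to mask the address before it is logged. The sketch below assumes that policy. `maskEmail` is a hypothetical helper, and the masking format is an illustrative choice.

```typescript
// Hypothetical helper: keeps the first character and the domain so entries
// stay correlatable without storing the full address.
function maskEmail(email: string): string {
  const at = email.lastIndexOf('@');
  if (at <= 0) return '***'; // no local part or no '@': log nothing identifying
  const local = email.slice(0, at);
  const domain = email.slice(at + 1);
  return `${local.charAt(0)}***@${domain}`;
}

const safeEmail = maskEmail('alice@example.com');
// e.g. logger.error('Login failed', { email: safeEmail, ... });
```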

**Audit Trail Monitoring**:
- Event publishing rate and latency
- Event consumption lag
- Audit log completeness (no gaps)
- Retention policy compliance
- Anonymization after retention period
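The "no gaps" check can be sketched as follows, assuming each audit event carries a monotonically increasing sequence number. That field, the `Gap` type, and `findSequenceGaps` are assumptions for illustration; the source does not describe how completeness is verified.

```typescript
// Hypothetical completeness check over audit event sequence numbers.
type Gap = { from: number; to: number };

function findSequenceGaps(sequenceNumbers: number[]): Gap[] {
  // De-duplicate and sort so redelivered or out-of-order events are tolerated.
  const sorted = Array.from(new Set(sequenceNumbers)).sort((a, b) => a - b);
  const gaps: Gap[] = [];
  let prev: number | undefined;
  for (const curr of sorted) {
    if (prev !== undefined && curr - prev > 1) {
      gaps.push({ from: prev + 1, to: curr - 1 }); // inclusive range of missing numbers
    }
    prev = curr;
  }
  return gaps;
}

const missing = findSequenceGaps([1, 2, 4, 7, 7]);
// missing covers 3 and 5..6
```

A non-empty result could feed a gauge or an alert so that missing audit events surface alongside the other security alerts.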

## Tài liệu Liên quan / Related Documentation

- [System Design](./system-design.md) - Kiến trúc tổng thể / Overall architecture
- [IAM Architecture](./iam-proposal.md) - Triển khai IAM service / IAM service implementation
- [Event-Driven Architecture](./event-driven-architecture.md) - Audit event streaming

---

**Cập nhật Lần cuối / Last Updated**: 2026-01-07
**Tác giả / Authors**: GoodGo Security Team

## Quick Tips

### 🎨 Color Palette Reference (Dark Theme)

| Node Type | Color | Hex | Tailwind | Usage | Example |
|-----------|-------|-----|----------|-------|---------|
| **Primary** | Blue | `#1d4ed8` | `bg-blue-700` | Core components, Identity, IAM, Permission Checks | JWT Validation, Auth Services |
| **Secondary** | Purple | `#7e22ce` | `bg-purple-700` | Data stores, Database, Queues | PostgreSQL, Redis |
| **Success** | Green | `#15803d` | `bg-green-700` | Valid, Allowed, Safe, Completed, TLS | Allow Request, Secure Connection |
| **Error** | Red | `#b91c1c` | `bg-red-700` | Blocked, Invalid, Failed, Critical, Encryption Keys | Block + Alert, Vault, Critical Errors |
| **Warning** | Orange | `#c2410c` | `bg-orange-700` | MFA, Suspicious, Latency, Cache, Alerts | Require MFA, WAF, SIEM |
| **Base** | Grey | `#1f2937` | `bg-gray-800` | External systems, Infrastructure, Logs | Cache, Monitoring |

### 🔧 Mermaid Common Issues

| Issue | Sign | Solution |
|-------|------|----------|
| **Parse Error** | Unexpected PIPE/character | Start the diagram body on a new line after `graph TD` and wrap labels containing special characters in quotes |
| **Box Not Showing** | Node missing in diagram | Verify node syntax: `Node[Label]` |
| **Color Not Applied** | Node has no color | Add style: `style Node fill:#1d4ed8,stroke:#fff,stroke-width:2px,color:#fff` |
| **Arrow Issues** | Connection not visible | Check arrow syntax: `-->` (solid), `-.->` (dashed) |
| **Text Not Readable** | Dark text on dark bg | Always use `color:#fff` (white text) |
| **Subgraph Issues** | Broken layout | Ensure proper indentation and `end` statement |

### 📊 Color Pattern Quick Reference

```mermaid
graph LR
    A[Input] --> B[Process]
    B --> C{Decision}
    C -->|Yes| D[Success]
    C -->|No| E[Error]

    style A fill:#1f2937,stroke:#fff,stroke-width:2px,color:#fff
    style B fill:#1d4ed8,stroke:#fff,stroke-width:2px,color:#fff
    style C fill:#c2410c,stroke:#fff,stroke-width:2px,color:#fff
    style D fill:#15803d,stroke:#fff,stroke-width:2px,color:#fff
    style E fill:#b91c1c,stroke:#fff,stroke-width:2px,color:#fff
```

**Pattern Template**:
```
style NodeName fill:#color,stroke:#fff,stroke-width:2px,color:#fff
```

### 🎯 Visual Indicators

| Emoji | Meaning | Color | Usage |
|-------|---------|-------|-------|
| ✅ | Secure/Allowed/Valid | Green (#15803d) | Successful auth, allowed access |
| ❌ | Blocked/Denied/Invalid | Red (#b91c1c) | Failed login, access denied |
| ⚠️ | Warning/MFA/Alert | Orange (#c2410c) | Require MFA, suspicious activity |
| 🔒 | Encrypted/Secure | Blue/Purple (#1d4ed8, #7e22ce) | Encrypted data, secure channel |
| ☁️ | Cloud/External | Grey (#1f2937) | External services, cloud resources |
| 🔑 | Authentication | Orange (#c2410c) | Auth tokens, keys, credentials |
| 🛡️ | Security Layer | Green (#15803d) | Security controls, protection |
| 📊 | Monitoring | Blue (#1d4ed8) | Metrics, dashboards, logs |

### 🚀 Diagram Best Practices

1. **Always use dark palette** with white text (`color:#fff`)
2. **Consistent stroke**: `stroke:#fff,stroke-width:2px`
3. **Logical color mapping**:
   - Blue = Core processes
   - Green = Success/Allow
   - Red = Error/Block
   - Orange = Warning/MFA
   - Purple = Data stores
   - Grey = External systems

4. **Readable labels**: Use `<br/>` for line breaks in labels
5. **Arrow clarity**: Solid (`-->`) for main flow, dashed (`-.->`) for secondary/async
6. **Subgraph organization**: Group related components

### 🔍 Mermaid Debugging Checklist

- [ ] Graph type declared? (`graph TD`, `sequenceDiagram`, `erDiagram`)
- [ ] All nodes have unique IDs?
- [ ] Arrows have proper syntax? (`-->`, `-.->` in flowcharts; `->>`, `-->>` in sequence diagrams)
- [ ] Style definitions after graph content?
- [ ] All subgraphs have `end` statement?
- [ ] Labels escaped properly? (use quotes for special chars)
- [ ] Color values correct? (3- or 6-digit hex with `#`, e.g. `#fff` or `#1d4ed8`)
- [ ] White text applied? (`color:#fff`)