Ho Ngoc Hai d8411abd24 Revise IAM Service Architecture documentation for clarity and comprehensiveness

- Updated the document title to reflect the focus on IAM Service Architecture.
- Expanded the overview section to provide a clearer description of the IAM Service's capabilities.
- Enhanced the table of contents for better navigation.
- Added detailed architecture diagrams illustrating the layered architecture and key components.
- Included comprehensive sections on authentication flows, authorization models, caching strategies, and security architecture.
- Improved overall structure and readability to facilitate understanding of the IAM Service's design and functionality.

These changes aim to provide developers with a thorough understanding of the IAM Service architecture and its components.

2026-01-02 00:30:26 +07:00

IAM Service Architecture

Enterprise-Grade Identity & Access Management Platform

This document describes the complete architecture of the IAM Service, including system components, data flows, security model, and integration patterns.

Overview
Overall Architecture
Authentication Flows
Authorization Model
Caching Strategy
Module Dependencies
Data Architecture
Security Architecture
Observability
Deployment Architecture

Overview

The IAM Service is a comprehensive Identity and Access Management platform providing:

Core Capabilities

Capability	Description	Modules
Authentication	User identity verification	Auth, Social, OIDC, MFA
Authorization	Access control and permissions	RBAC, ABAC, Policy Engine
Identity Management	User lifecycle and profiles	Users, Profiles, Organizations, Groups
Access Governance	Request workflows and reviews	Requests, Reviews, Analytics
Compliance	Regulatory compliance reporting	GDPR, SOC2, ISO27001, Risk Management
Security	Zero-trust and threat protection	Zero-Trust, Audit, Encryption
Performance	High-speed access with caching	Multi-layer Cache, Connection Pool

Key Metrics

50+ API Endpoints across 10 feature modules
30+ Database Models with comprehensive relationships
7 Authentication Methods (password, social x3, OIDC, MFA x2)
Multi-layer Caching (Memory → Redis → PostgreSQL)
Zero-Trust Security with device fingerprinting
Enterprise Compliance (GDPR, SOC2, ISO27001, HIPAA)

Overall Architecture

The IAM Service follows a layered architecture pattern with clear separation of concerns:

graph TB
    subgraph "Client Layer"
        WebApp[Web Application<br/>React/Vue/Angular]
        MobileApp[Mobile App<br/>iOS/Android/React Native]
        Service[Other Microservices<br/>Product/Order/Payment]
    end

    subgraph "API Gateway"
        Traefik[Traefik Gateway<br/>Load Balancer + Routing<br/>Port 80/443]
    end

    subgraph "IAM Service - Port 3001"
        subgraph "Middleware Stack"
            Correlation[Correlation ID<br/>Request Tracking]
            Logger[Request Logger<br/>Winston]
            ZeroTrust[Zero-Trust Validator<br/>Device + Location + Behavior]
            RateLimit[Dynamic Rate Limiter<br/>Redis-backed<br/>50-1000 req/15min]
            Auth[Authentication<br/>JWT Verification]
            RBAC[Authorization<br/>RBAC + ABAC]
        end

        subgraph "Core Authentication Modules"
            AuthModule[Auth Module<br/>Login/Register/Logout]
            TokenModule[Token Module<br/>JWT + Cookie Management]
            SessionModule[Session Module<br/>Device Fingerprinting]
            SocialModule[Social Auth<br/>Google/FB/GitHub]
            OIDCModule[OIDC Provider<br/>OpenID Connect]
            MFAModule[MFA Module<br/>TOTP/WebAuthn]
            RBACModule[RBAC Module<br/>Roles + Permissions]
        end

        subgraph "Identity Management Modules"
            UserModule[User Lifecycle<br/>CRUD + Bulk Ops]
            ProfileModule[Profile Management<br/>Avatar + Custom Fields]
            VerificationModule[Identity Verification<br/>Email/Phone/Document]
            OrgModule[Organizations<br/>Multi-tenant Hierarchy]
            GroupModule[Groups<br/>Group-based Access]
        end

        subgraph "Access Management Modules"
            RequestModule[Access Requests<br/>Approval Workflows]
            ReviewModule[Access Reviews<br/>Certification Campaigns]
            AnalyticsModule[Access Analytics<br/>Usage + Risk Analysis]
        end

        subgraph "Governance Modules"
            ComplianceModule[Compliance<br/>GDPR/SOC2/ISO27001]
            PolicyModule[Policy Governance<br/>Templates + Versioning]
            RiskModule[Risk Management<br/>Scoring + Detection]
            ReportingModule[Reporting<br/>Dashboards + Exports]
        end

        subgraph "Core Services"
            CacheService[Multi-layer Cache<br/>Memory → Redis<br/>TTL: 60s-15min]
            AuditService[Audit Logging<br/>Event Sourcing<br/>AuthEvent Model]
            SecurityService[Security Services<br/>Encryption + Hashing]
        end
    end

    subgraph "Data Layer"
        PostgreSQL[(PostgreSQL 14+<br/>Primary Database<br/>Connection Pool<br/>30+ Models)]
        Redis[(Redis 6+<br/>Cache + Sessions<br/>Rate Limits + Locks)]
    end

    subgraph "External Services (Future)"
        EmailService[Email Service<br/>Verification + Notifications]
        SMSService[SMS Service<br/>OTP + Phone Verification]
        FileStorage[File Storage<br/>S3/GCS for Avatars]
    end

    subgraph "Observability Stack"
        Prometheus[Prometheus<br/>Metrics Collection]
        Jaeger[Jaeger<br/>Distributed Tracing]
        Logs[Structured Logs<br/>Winston + Loki]
    end

    %% Client to Gateway
    WebApp --> Traefik
    MobileApp --> Traefik
    Service --> Traefik

    %% Gateway to Middleware
    Traefik --> Correlation
    Correlation --> Logger
    Logger --> ZeroTrust
    ZeroTrust --> RateLimit
    RateLimit --> Auth
    Auth --> RBAC

    %% Middleware to Modules
    RBAC --> AuthModule
    RBAC --> UserModule
    RBAC --> RequestModule
    RBAC --> ComplianceModule

    %% Module Dependencies
    AuthModule --> TokenModule
    AuthModule --> SessionModule
    AuthModule --> SocialModule
    AuthModule --> OIDCModule
    AuthModule --> MFAModule
    AuthModule --> RBACModule

    UserModule --> ProfileModule
    UserModule --> VerificationModule
    UserModule --> OrgModule
    UserModule --> GroupModule

    RequestModule --> ReviewModule
    RequestModule --> AnalyticsModule

    ComplianceModule --> PolicyModule
    ComplianceModule --> RiskModule
    ComplianceModule --> ReportingModule

    %% Core Services
    AuthModule --> CacheService
    UserModule --> CacheService
    RBACModule --> CacheService

    AuthModule --> AuditService
    RequestModule --> AuditService
    ComplianceModule --> AuditService

    AuthModule --> SecurityService

    %% Data Layer
    CacheService --> Redis
    CacheService --> PostgreSQL

    AuthModule --> PostgreSQL
    UserModule --> PostgreSQL
    RequestModule --> PostgreSQL
    ComplianceModule --> PostgreSQL

    %% External Services (dashed - not yet integrated)
    VerificationModule -.-> EmailService
    VerificationModule -.-> SMSService
    ProfileModule -.-> FileStorage

    %% Observability
    AuthModule -.-> Prometheus
    UserModule -.-> Prometheus
    RequestModule -.-> Prometheus
    ComplianceModule -.-> Prometheus

    CacheService -.-> Jaeger
    AuditService -.-> Logs

    style ZeroTrust fill:#ff9999
    style CacheService fill:#99ccff
    style PostgreSQL fill:#cc99ff
    style Redis fill:#ffcc99
    style AuditService fill:#99ff99

Architecture Highlights

Layered Middleware: Every request passes through 6 middleware layers (correlation, logging, zero-trust, rate limiting, authentication, authorization)
Modular Design: 10 independent feature modules with clear boundaries
Multi-layer Caching: Memory (60s) → Redis (5-15min) → PostgreSQL for optimal performance
Event Sourcing: All authentication and authorization events logged for audit compliance
Zero-Trust Security: Continuous validation of device, location, and behavior
Dynamic Rate Limiting: Role-based limits (50-1000 req/15min)
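
The role-based rate limiting listed above can be sketched as a fixed-window counter. This is a minimal illustration rather than the service's actual middleware: the role names, the intermediate `USER` limit, and the in-memory `Map` standing in for Redis are assumptions; only the 50 and 1000 bounds and the 15-minute window come from this document.

```typescript
// Hypothetical role-to-limit table; only the 50 and 1000 bounds are documented.
type Role = "GUEST" | "USER" | "ADMIN";

const WINDOW_MS = 15 * 60 * 1000; // 15-minute window

const LIMITS: Record<Role, number> = {
  GUEST: 50,
  USER: 300, // assumed intermediate value
  ADMIN: 1000,
};

interface WindowState {
  windowStart: number;
  count: number;
}

// In-memory stand-in for the Redis counters a real deployment would use.
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, WindowState>();

  /** Returns true if the request is allowed, false once the role's limit is reached. */
  allow(clientId: string, role: Role, now: number = Date.now()): boolean {
    const state = this.windows.get(clientId);
    // Start a fresh window for new clients or when the previous window has elapsed.
    if (state === undefined || now - state.windowStart >= WINDOW_MS) {
      this.windows.set(clientId, { windowStart: now, count: 1 });
      return true;
    }
    if (state.count >= LIMITS[role]) {
      return false;
    }
    state.count += 1;
    return true;
  }
}
```

With several service replicas the counter has to live in shared storage, which is why the architecture keeps rate-limit state in Redis rather than in process memory.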

Authentication Flows

Password-Based Authentication

Standard email/password authentication with bcrypt hashing and JWT token generation:

sequenceDiagram
    participant Client
    participant Middleware
    participant AuthService
    participant RBACService
    participant Database
    participant Redis
    participant AuditService

    Client->>Middleware: POST /api/v1/auth/login<br/>{email, password}

    Middleware->>Middleware: 1. Correlation ID
    Middleware->>Middleware: 2. Zero-Trust Validation<br/>(Device + IP + Behavior)
    Middleware->>Middleware: 3. Rate Limit Check<br/>(5 login/15min)

    Middleware->>AuthService: login(email, password)

    AuthService->>Database: Find user by email
    Database-->>AuthService: User record

    AuthService->>AuthService: Verify password<br/>(bcrypt compare, cost=12)

    alt Password Invalid
        AuthService->>AuditService: Log LOGIN_FAILED event
        AuditService->>Database: Save AuthEvent
        AuthService-->>Client: 401 Unauthorized<br/>"Invalid credentials"
    else Password Valid & MFA Disabled
        AuthService->>RBACService: Get user roles + permissions
        RBACService->>Database: Query UserRole, UserPermission
        Database-->>RBACService: Roles + Permissions
        RBACService-->>AuthService: ["ADMIN"], ["users:read:all"]

        AuthService->>AuthService: Generate JWT tokens<br/>Access: 15min<br/>Refresh: 7 days

        AuthService->>Database: Save refresh token
        AuthService->>Redis: Cache user data<br/>TTL: 15min
        AuthService->>Redis: Cache permissions<br/>TTL: 5min
        AuthService->>Database: Create session<br/>(device fingerprint)

        AuthService->>Database: Update lastLoginAt, loginCount
        AuthService->>AuditService: Log LOGIN_SUCCESS event

        AuthService-->>Client: 200 OK<br/>{user, tokens}<br/>Set-Cookie: refresh_token
    else Password Valid & MFA Enabled
        AuthService-->>Client: 200 OK<br/>{mfaRequired: true}
        Note over Client: Prompt for MFA code
        Client->>Middleware: POST /api/v1/mfa/verify<br/>{token: "123456"}
        Note over Middleware,AuthService: MFA verification flow...
    end

Key Security Features:

Bcrypt password hashing (cost factor 12 in production)
Token family tracking for rotation security
Device fingerprinting for session management
Zero-trust validation before authentication
Comprehensive audit logging
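
Token family tracking, mentioned in the list above, can be illustrated with a small rotation sketch. The class, method names, and sequential IDs below are hypothetical and the store is an in-memory `Map`; the document only states that refresh tokens live for 7 days and are saved to the database.

```typescript
// Sketch of refresh-token rotation with token-family reuse detection.
interface RefreshTokenRecord {
  familyId: string;
  userId: string;
  used: boolean;
  expiresAt: number;
}

const REFRESH_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days, as in the login flow

class RefreshTokenStore {
  private readonly tokens = new Map<string, RefreshTokenRecord>();
  private readonly revokedFamilies = new Set<string>();
  private counter = 0;

  // Hypothetical ID generator; production code needs cryptographically random tokens.
  private nextId(prefix: string): string {
    this.counter += 1;
    return `${prefix}-${this.counter}`;
  }

  /** Issues a token; a new family is started unless familyId is supplied. */
  issue(userId: string, now: number, familyId?: string): string {
    const token = this.nextId("rt");
    this.tokens.set(token, {
      familyId: familyId ?? this.nextId("fam"),
      userId,
      used: false,
      expiresAt: now + REFRESH_TTL_MS,
    });
    return token;
  }

  /** Exchanges a refresh token for a new one, or returns null if it is rejected. */
  rotate(token: string, now: number): string | null {
    const record = this.tokens.get(token);
    if (record === undefined || now >= record.expiresAt) return null;
    if (this.revokedFamilies.has(record.familyId)) return null;
    if (record.used) {
      // Reuse of an already-rotated token suggests theft: revoke the whole family.
      this.revokedFamilies.add(record.familyId);
      return null;
    }
    record.used = true;
    return this.issue(record.userId, now, record.familyId);
  }
}
```

Revoking the whole family on reuse means a stolen refresh token stops working as soon as either the attacker or the legitimate client presents an already-rotated token.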

Social Authentication

OAuth 2.0 integration with Google, Facebook, and GitHub:

sequenceDiagram
    participant Client
    participant IAM
    participant Google as Google OAuth
    participant Database
    participant Redis

    Client->>IAM: GET /api/v1/auth/social/google
    IAM->>IAM: Generate state token<br/>(CSRF protection)
    IAM->>Redis: Store state token<br/>TTL: 10min
    IAM-->>Client: 302 Redirect to Google<br/>with state param

    Client->>Google: OAuth consent screen
    Google-->>Client: Authorization code + state

    Client->>IAM: GET /api/v1/auth/social/google/callback<br/>?code=xxx&state=yyy

    IAM->>Redis: Verify state token
    alt State Invalid
        IAM-->>Client: 401 CSRF token invalid
    else State Valid
        IAM->>Google: Exchange code for tokens
        Google-->>IAM: Access token + User profile

        IAM->>Database: Find or create user<br/>by provider + providerId

        alt User Found
            IAM->>Database: Update social account tokens
        else New User
            IAM->>Database: Create user + social account
            IAM->>Database: Assign default role (USER)
        end

        IAM->>IAM: Generate IAM JWT tokens
        IAM->>Redis: Cache user + permissions
        IAM->>Database: Create session

        IAM-->>Client: 302 Redirect to app<br/>with tokens in URL/cookie
    end

Supported Providers:

Google OAuth 2.0
Facebook OAuth
GitHub OAuth
Apple Sign-In (future)
Microsoft OAuth (future)
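
The state-token step in the OAuth flow (generate, store for 10 minutes, verify on callback) might look like the following sketch. The class and method names are hypothetical, and an in-memory `Map` stands in for Redis.

```typescript
import { randomBytes } from "node:crypto";

const STATE_TTL_MS = 10 * 60 * 1000; // 10 minutes, matching the flow above

// In-memory stand-in for the Redis store used by the service.
class OAuthStateStore {
  private readonly states = new Map<string, number>(); // state -> expiry timestamp

  /** Creates an unguessable state value and remembers it until it expires. */
  create(now: number = Date.now()): string {
    const state = randomBytes(32).toString("hex");
    this.states.set(state, now + STATE_TTL_MS);
    return state;
  }

  /** Single-use check: a known state is deleted whether or not it is still valid. */
  consume(state: string, now: number = Date.now()): boolean {
    const expiresAt = this.states.get(state);
    if (expiresAt === undefined) return false;
    this.states.delete(state);
    return now < expiresAt;
  }
}
```

Deleting the state on first use prevents a captured callback URL from being replayed within the TTL.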

MFA (Multi-Factor Authentication) Flow

TOTP-based two-factor authentication using authenticator apps:

sequenceDiagram
    participant Client
    participant IAM
    participant Authenticator as Authenticator App
    participant Database

    Note over Client,Database: MFA Enrollment Phase

    Client->>IAM: POST /api/v1/mfa/totp/enable
    Note over Client: User must be authenticated

    IAM->>IAM: Generate TOTP secret (32 chars)
    IAM->>IAM: Generate QR code<br/>(otpauth://totp/...)
    IAM-->>Client: {secret, qrCode, backupCodes}

    Client->>Authenticator: Scan QR code
    Authenticator-->>Client: Display 6-digit TOTP

    Client->>IAM: POST /api/v1/mfa/totp/verify<br/>{token: "123456"}

    IAM->>IAM: Verify TOTP token<br/>(30s window, ±1 interval)

    alt Token Valid
        IAM->>Database: Create MFADevice record<br/>(type: TOTP, secret: encrypted)
        IAM->>Database: Update user.mfaEnabled = true
        IAM-->>Client: 200 OK MFA enabled
    else Token Invalid
        IAM-->>Client: 401 Invalid TOTP token
    end

    Note over Client,Database: MFA Login Phase

    Client->>IAM: POST /api/v1/auth/login<br/>{email, password}
    IAM->>Database: Verify credentials

    alt MFA Required
        IAM-->>Client: 200 OK {mfaRequired: true}

        Client->>Authenticator: Get current TOTP
        Authenticator-->>Client: Current 6-digit code

        Client->>IAM: POST /api/v1/mfa/totp/validate<br/>{userId, token}

        IAM->>Database: Get MFADevice for user
        IAM->>IAM: Verify TOTP token

        alt Token Valid
            IAM->>IAM: Generate full JWT tokens
            IAM->>Database: Create session
            IAM->>Database: Update device.lastUsedAt
            IAM-->>Client: 200 OK {user, tokens}
        else Token Invalid
            IAM-->>Client: 401 Invalid MFA token
        end
    end

MFA Features:

TOTP (Time-based One-Time Password) via authenticator apps
QR code generation for easy enrollment
Backup codes for account recovery
Multiple MFA devices per user
WebAuthn/FIDO2 framework (future implementation)
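
The TOTP check used in both MFA phases (30-second step, ±1 interval) follows RFC 6238. The sketch below is a minimal version built on Node's `crypto` module; the function names are illustrative and are not taken from `totp.service.ts`. It assumes HMAC-SHA1 and 6 digits, the defaults most authenticator apps use.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const STEP_SECONDS = 30;
const DIGITS = 6;

/** HOTP (RFC 4226) value for a given counter. */
function hotp(secret: Buffer, counter: number): string {
  // Encode the counter as an 8-byte big-endian integer.
  const counterBytes = Buffer.alloc(8);
  counterBytes.writeUInt32BE(Math.floor(counter / 2 ** 32), 0);
  counterBytes.writeUInt32BE(counter >>> 0, 4);

  const digest = createHmac("sha1", secret).update(counterBytes).digest();
  // Dynamic truncation: the low nibble of the last byte selects a 4-byte slice.
  const offset = digest[digest.length - 1] & 0x0f;
  const binary = digest.readUInt32BE(offset) & 0x7fffffff;
  return (binary % Math.pow(10, DIGITS)).toString().padStart(DIGITS, "0");
}

/** Accepts the token for the current time step or `window` steps either side. */
function verifyTotp(secret: Buffer, token: string, nowSeconds: number, window = 1): boolean {
  const currentStep = Math.floor(nowSeconds / STEP_SECONDS);
  const candidate = Buffer.from(token);
  for (let drift = -window; drift <= window; drift++) {
    const counter = currentStep + drift;
    if (counter < 0) continue;
    const expected = Buffer.from(hotp(secret, counter));
    // Constant-time comparison avoids leaking how many digits matched.
    if (candidate.length === expected.length && timingSafeEqual(candidate, expected)) {
      return true;
    }
  }
  return false;
}
```

A production implementation would also record the last accepted time step per device so that a code cannot be replayed inside its validity window.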

Authorization Model

The IAM Service implements a hybrid authorization model combining RBAC (Role-Based Access Control) and ABAC (Attribute-Based Access Control):

Authorization Decision Flow

flowchart TD
    Start[Request to Protected Resource] --> Auth{Authenticated?}

    Auth -->|No| Deny401[401 Unauthorized<br/>Authentication required]
    Auth -->|Yes| Cache{Permission<br/>in Cache?}

    Cache -->|Hit| CacheValue{Cache<br/>Value?}
    CacheValue -->|Allow| Allow[Access Granted]
    CacheValue -->|Deny| Deny403Cache[403 Forbidden<br/>Cached denial]

    Cache -->|Miss| LoadPerms[Load Permissions<br/>from Database]

    LoadPerms --> DirectPerms{Has Direct<br/>User Permission?}

    DirectPerms -->|Explicit Deny| Deny403A[403 Forbidden<br/>Explicit permission denial]
    DirectPerms -->|Explicit Allow| Allow
    DirectPerms -->|None| RolePerms{Has Role<br/>Permission?}

    RolePerms -->|Yes| CheckExpiry{Role<br/>Expired?}
    RolePerms -->|No| GroupPerms{Has Group<br/>Permission?}

    CheckExpiry -->|Yes| GroupPerms
    CheckExpiry -->|No| Allow

    GroupPerms -->|Yes| Allow
    GroupPerms -->|No| ABACCheck{ABAC Policy<br/>Exists?}

    ABACCheck -->|No Policies| DefaultDeny[403 Forbidden<br/>Default deny - No permissions]
    ABACCheck -->|Has Policies| EvaluatePolicies[Evaluate Policies<br/>by Priority]

    EvaluatePolicies --> EvalConditions{Policy<br/>Conditions?}

    EvalConditions -->|Time-based| CheckTime{Current time<br/>in range?}
    EvalConditions -->|Location-based| CheckLocation{IP in<br/>allowed range?}
    EvalConditions -->|Attribute-based| CheckAttrs{Attributes<br/>match?}

    CheckTime -->|Yes| PolicyEffect{Policy<br/>Effect?}
    CheckTime -->|No| NextPolicy{More<br/>Policies?}

    CheckLocation -->|Yes| PolicyEffect
    CheckLocation -->|No| NextPolicy

    CheckAttrs -->|Yes| PolicyEffect
    CheckAttrs -->|No| NextPolicy

    PolicyEffect -->|ALLOW| Allow
    PolicyEffect -->|DENY| Deny403Policy[403 Forbidden<br/>Policy denial]

    NextPolicy -->|Yes| EvaluatePolicies
    NextPolicy -->|No| DefaultDeny

    Allow --> CacheResult[Cache Result<br/>TTL: 5min]
    CacheResult --> AuditLog[Log Access Event<br/>to AuthEvent]
    AuditLog --> Success[200 OK<br/>Process request]

    style Auth fill:#e1f5fe
    style DirectPerms fill:#fff3e0
    style RolePerms fill:#f3e5f5
    style GroupPerms fill:#e8f5e9
    style ABACCheck fill:#fce4ec
    style Allow fill:#c8e6c9
    style Deny401 fill:#ffcdd2
    style Deny403A fill:#ffcdd2
    style Deny403Cache fill:#ffcdd2
    style Deny403Policy fill:#ffcdd2
    style DefaultDeny fill:#ffcdd2

Permission Model

The IAM Service uses a hierarchical permission model:

Permission Format: resource:action:scope

Resource: The entity being accessed (e.g., users, products, orders)
Action: The operation being performed (e.g., read, create, update, delete)
Scope: The access boundary (e.g., own, team, all)

Examples:

users:read:all - Read all users
users:update:own - Update own user profile
products:create:team - Create products for team
orders:delete:all - Delete any order (admin)
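
A minimal sketch of parsing and matching `resource:action:scope` strings follows. The `*` wildcard appears in the RBAC diagram later in this document; the scope ordering (`own` < `team` < `all`) and the function names are assumptions made for illustration.

```typescript
// Assumed ordering: a wider scope implies the narrower ones.
const SCOPES = ["own", "team", "all"] as const;
type Scope = (typeof SCOPES)[number];

interface Permission {
  resource: string;
  action: string;
  scope: Scope;
}

function isScope(value: string): value is Scope {
  return (SCOPES as readonly string[]).includes(value);
}

/** Parses "resource:action:scope" and rejects malformed strings. */
function parsePermission(value: string): Permission {
  const parts = value.split(":");
  if (parts.length !== 3) {
    throw new Error(`Expected resource:action:scope, got "${value}"`);
  }
  const [resource, action, scope] = parts;
  if (resource === "" || action === "" || !isScope(scope)) {
    throw new Error(`Invalid permission "${value}"`);
  }
  return { resource, action, scope };
}

/** True if the granted permission covers the required one. */
function grants(granted: Permission, required: Permission): boolean {
  const resourceOk = granted.resource === "*" || granted.resource === required.resource;
  const actionOk = granted.action === "*" || granted.action === required.action;
  const scopeOk = SCOPES.indexOf(granted.scope) >= SCOPES.indexOf(required.scope);
  return resourceOk && actionOk && scopeOk;
}
```

For example, a role holding `users:*:all` would satisfy a check for `users:update:own`, while `orders:read:team` would not satisfy `orders:read:all`.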

Authorization Hierarchy

1. Direct User Permissions (HIGHEST PRIORITY)
   ↓ (if no explicit allow or deny is set)
2. Role Permissions
   ↓ (if not found)
3. Group Permissions
   ↓ (if not found)
4. ABAC Policies (evaluated by priority)
   ↓ (if no policies match)
5. Default Deny (LOWEST PRIORITY)
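
The five-step hierarchy above can be expressed as a single decision function. This is a simplified sketch: all type and function names are hypothetical, permissions are compared by exact string match (no wildcards), and caching and audit logging are left out.

```typescript
type Effect = "ALLOW" | "DENY";
type Source = "direct" | "role" | "group" | "policy" | "default";

interface AccessContext {
  permission: string; // e.g. "orders:read:team"
  now: number; // epoch milliseconds
}

interface RoleGrant {
  permissions: string[];
  expiresAt?: number; // set for temporary role assignments
}

interface Policy {
  priority: number;
  effect: Effect;
  applies: (ctx: AccessContext) => boolean;
}

interface Subject {
  direct: Map<string, Effect>; // explicit per-user allow or deny
  roles: RoleGrant[];
  groupPermissions: string[];
}

interface Decision {
  allowed: boolean;
  source: Source;
}

function authorize(subject: Subject, policies: Policy[], ctx: AccessContext): Decision {
  // 1. Direct user permissions: an explicit allow or deny ends the evaluation.
  const direct = subject.direct.get(ctx.permission);
  if (direct !== undefined) return { allowed: direct === "ALLOW", source: "direct" };

  // 2. Role permissions, ignoring expired role assignments.
  const viaRole = subject.roles.some(
    (role) =>
      (role.expiresAt === undefined || role.expiresAt > ctx.now) &&
      role.permissions.includes(ctx.permission),
  );
  if (viaRole) return { allowed: true, source: "role" };

  // 3. Group permissions.
  if (subject.groupPermissions.includes(ctx.permission)) {
    return { allowed: true, source: "group" };
  }

  // 4. ABAC policies, highest priority first; the first applicable policy decides.
  const ordered = [...policies].sort((a, b) => b.priority - a.priority);
  for (const policy of ordered) {
    if (policy.applies(ctx)) return { allowed: policy.effect === "ALLOW", source: "policy" };
  }

  // 5. Default deny.
  return { allowed: false, source: "default" };
}
```

Returning the `source` alongside the decision makes it straightforward to record in the audit log which level of the hierarchy granted or denied access.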

RBAC (Role-Based Access Control)

graph LR
    User[User] --> UserRole1[UserRole]
    User --> UserRole2[UserRole]
    User --> UserPerm[UserPermission<br/>Direct Override]

    UserRole1 --> Role1[ADMIN Role]
    UserRole2 --> Role2[MANAGER Role]

    Role1 --> RolePerm1[RolePermission]
    Role1 --> RolePerm2[RolePermission]
    Role2 --> RolePerm3[RolePermission]

    RolePerm1 --> Perm1[users:*:all]
    RolePerm2 --> Perm2[products:*:all]
    RolePerm3 --> Perm3[orders:read:team]

    UserPerm --> Perm4[analytics:read:all]

    Group[Group] --> GroupMember[GroupMember]
    GroupMember --> User
    Group --> GroupPerm[GroupPermission]
    GroupPerm --> Perm5[reports:read:all]

    style User fill:#e1f5fe
    style Role1 fill:#f3e5f5
    style Role2 fill:#f3e5f5
    style Group fill:#e8f5e9
    style Perm1 fill:#fff3e0
    style Perm2 fill:#fff3e0
    style Perm3 fill:#fff3e0
    style Perm4 fill:#ffccbc
    style Perm5 fill:#c8e6c9

Features:

Multiple roles per user
Temporary role assignments with expiration
Direct user permissions (can override roles)
Group-based permissions
Permission caching (5 minutes TTL)

ABAC (Attribute-Based Access Control)

graph TD
    Request[Access Request] --> PolicyEngine[Policy Engine]

    PolicyEngine --> LoadPolicies[Load Policies<br/>for Resource]
    LoadPolicies --> SortPolicies[Sort by Priority<br/>Descending]

    SortPolicies --> EvaluatePolicy[Evaluate Policy<br/>Conditions]

    EvaluatePolicy --> TimeCondition{Time-based<br/>Condition?}
    EvaluatePolicy --> LocationCondition{Location-based<br/>Condition?}
    EvaluatePolicy --> AttributeCondition{Attribute-based<br/>Condition?}

    TimeCondition --> CheckTime[Check current time<br/>vs allowed hours]
    LocationCondition --> CheckIP[Check IP address<br/>vs allowed ranges]
    AttributeCondition --> CheckAttrs[Check user attributes<br/>vs required values]

    CheckTime --> ConditionMet{All Conditions<br/>Met?}
    CheckIP --> ConditionMet
    CheckAttrs --> ConditionMet

    ConditionMet -->|Yes| ApplyEffect{Policy<br/>Effect?}
    ConditionMet -->|No| NextPolicy{More<br/>Policies?}

    ApplyEffect -->|ALLOW| Allow[Access Granted]
    ApplyEffect -->|DENY| Deny[Access Denied]

    NextPolicy -->|Yes| EvaluatePolicy
    NextPolicy -->|No| DefaultDeny[Default Deny]

    style PolicyEngine fill:#f3e5f5
    style ConditionMet fill:#fff3e0
    style Allow fill:#c8e6c9
    style Deny fill:#ffcdd2
    style DefaultDeny fill:#ffcdd2

Policy Example (JSON Logic):

{
  "name": "Business Hours Only",
  "resource": "sensitive_data",
  "condition": {
    "and": [
      {
        ">=": [{"var": "hour"}, 9]
      },
      {
        "<=": [{"var": "hour"}, 17]
      },
      {
        "in": [{"var": "day"}, [1, 2, 3, 4, 5]]
      }
    ]
  },
  "effect": "ALLOW",
  "priority": 100
}
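
To show how a condition like the one above is evaluated, here is a small evaluator covering only the operators that appear in the example (`var`, `and`, `>=`, `<=`, `in`). It is a simplified sketch, not the policy engine itself: `and` returns a plain boolean rather than following the full JSON Logic truthiness rules, and the `Json` type and `evaluate` function are illustrative names.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function evaluate(rule: Json, data: { [key: string]: Json }): Json {
  // Primitives evaluate to themselves.
  if (rule === null || typeof rule !== "object") return rule;
  // Arrays are evaluated element by element.
  if (Array.isArray(rule)) return rule.map((item) => evaluate(item, data));

  const entries = Object.entries(rule);
  if (entries.length !== 1) throw new Error("A rule must have exactly one operator");
  const [operator, rawArgs] = entries[0];

  // "var" looks up a value in the request context; missing values become null.
  if (operator === "var") {
    const value = data[String(rawArgs)];
    return value === undefined ? null : value;
  }

  const evaluated = evaluate(rawArgs, data);
  const args = Array.isArray(evaluated) ? evaluated : [evaluated];

  switch (operator) {
    case "and":
      return args.every((arg) => Boolean(arg));
    case ">=":
      return Number(args[0]) >= Number(args[1]);
    case "<=":
      return Number(args[0]) <= Number(args[1]);
    case "in": {
      const haystack = args[1];
      return Array.isArray(haystack) && haystack.includes(args[0]);
    }
    default:
      throw new Error(`Unsupported operator: ${operator}`);
  }
}

// The "Business Hours Only" condition from the policy example.
const businessHours: Json = {
  and: [
    { ">=": [{ var: "hour" }, 9] },
    { "<=": [{ var: "hour" }, 17] },
    { in: [{ var: "day" }, [1, 2, 3, 4, 5]] },
  ],
};
```

Evaluating `businessHours` against `{ hour: 10, day: 3 }` yields `true`, so the policy's `ALLOW` effect would apply; with `hour: 18` or `day: 6` the condition is `false` and evaluation moves on to the next policy.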

Caching Strategy

The IAM Service implements a multi-layer caching strategy for optimal performance:

graph TB
    Request[Incoming Request] --> CheckL1{L1 Cache<br/>Node.js Memory}

    CheckL1 -->|Cache Hit| L1Hit[Fast Response<br/>< 1ms latency]
    CheckL1 -->|Cache Miss| CheckL2{L2 Cache<br/>Redis}

    CheckL2 -->|Cache Hit| L2Hit[Medium Response<br/>< 10ms latency]
    CheckL2 -->|Cache Miss| Database[(PostgreSQL<br/>Source of Truth)]

    Database --> UpdateL2[Update L2 Cache<br/>Write to Redis]
    UpdateL2 --> UpdateL1[Update L1 Cache<br/>Write to Memory]

    UpdateL1 --> Response[Return Response]
    L2Hit --> UpdateL1
    L1Hit --> Response

    subgraph "Cache Layers"
        L1Cache[L1: In-Memory Cache<br/>node-cache<br/>TTL: 60 seconds<br/>Hot data only]
        L2Cache[L2: Redis Cache<br/>ioredis<br/>TTL: 5-15 minutes<br/>Distributed]
        DBLayer[L3: PostgreSQL<br/>Prisma ORM<br/>Source of truth<br/>Persistent]
    end

    subgraph "Cached Data Types"
        UserData[User Data<br/>TTL: 15min]
        Permissions[Permissions<br/>TTL: 5min]
        Tokens[Token Validation<br/>TTL: Token lifetime]
        Sessions[Sessions<br/>TTL: Session lifetime]
        Roles[User Roles<br/>TTL: 5min]
    end

    style L1Hit fill:#90EE90
    style L2Hit fill:#87CEEB
    style Database fill:#FFB6C1
    style L1Cache fill:#c8e6c9
    style L2Cache fill:#bbdefb
    style DBLayer fill:#f8bbd0

Cache Configuration

Data Type	L1 TTL	L2 TTL	Invalidation Strategy
User Data	60s	15min	On update/delete
Permissions	60s	5min	On role/permission change
User Roles	60s	5min	On role assignment/revocation
Token Validation	N/A	Token lifetime	On logout/revocation
Sessions	N/A	Session lifetime	On logout
Rate Limit Counters	N/A	15min window	Time-based expiry
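
The read path described above (L1 memory, then L2 Redis, then PostgreSQL) is a read-through cache. The sketch below models both cache layers with in-memory maps and keeps everything synchronous for brevity; real Redis and database calls are asynchronous. All class names are hypothetical, and the `load` callback stands in for the database query.

```typescript
interface Entry<V> {
  value: V;
  expiresAt: number;
}

// Minimal TTL store used for both layers in this sketch.
class TtlStore<V> {
  private readonly entries = new Map<string, Entry<V>>();

  get(key: string, now: number): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (now >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V, ttlMs: number, now: number): void {
    this.entries.set(key, { value, expiresAt: now + ttlMs });
  }

  delete(key: string): void {
    this.entries.delete(key);
  }
}

const L1_TTL_MS = 60 * 1000; // 60 seconds for every L1 entry

class LayeredCache<V> {
  private readonly l1 = new TtlStore<V>(); // stands in for node-cache
  private readonly l2 = new TtlStore<V>(); // stands in for Redis

  /** Reads L1, then L2, then the source of truth, refilling the layers on the way back. */
  get(key: string, l2TtlMs: number, load: () => V, now: number = Date.now()): V {
    const fromL1 = this.l1.get(key, now);
    if (fromL1 !== undefined) return fromL1;

    const fromL2 = this.l2.get(key, now);
    if (fromL2 !== undefined) {
      this.l1.set(key, fromL2, L1_TTL_MS, now);
      return fromL2;
    }

    const loaded = load();
    this.l2.set(key, loaded, l2TtlMs, now);
    this.l1.set(key, loaded, L1_TTL_MS, now);
    return loaded;
  }

  /** Removes the key from both layers, for example after an update or delete. */
  invalidate(key: string): void {
    this.l1.delete(key);
    this.l2.delete(key);
  }
}
```

Because L1 lives in each process, an invalidation issued on one replica does not clear the L1 copy on another; the short 60-second L1 TTL bounds how long such a stale entry can survive.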

Cache Invalidation

// Example: Invalidate user caches when permissions change
await rbacService.grantPermission(userId, permissionId);

// Automatically invalidates:
// - cacheService.keys.userPermissions(userId)
// - cacheService.keys.userRoles(userId)
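
One way the automatic invalidation could be wired is to delete the affected keys immediately after the write. The sketch below is an assumption about that wiring, not the contents of `rbac.service.ts`: the key formats, the `KeyValueCache` interface, and the `RbacWriter` class are hypothetical.

```typescript
// Hypothetical key builders; the real key format is not shown in this document.
const cacheKeys = {
  userPermissions: (userId: string): string => `user:${userId}:permissions`,
  userRoles: (userId: string): string => `user:${userId}:roles`,
};

interface KeyValueCache {
  delete(key: string): void;
}

class RbacWriter {
  private readonly cache: KeyValueCache;
  private readonly grantsByUser = new Map<string, Set<string>>(); // stands in for the database

  constructor(cache: KeyValueCache) {
    this.cache = cache;
  }

  grantPermission(userId: string, permissionId: string): void {
    const current = this.grantsByUser.get(userId) ?? new Set<string>();
    current.add(permissionId);
    this.grantsByUser.set(userId, current);

    // Invalidate after the write so the next read reloads fresh data.
    this.cache.delete(cacheKeys.userPermissions(userId));
    this.cache.delete(cacheKeys.userRoles(userId));
  }

  permissionsOf(userId: string): string[] {
    return [...(this.grantsByUser.get(userId) ?? [])];
  }
}
```

Keeping the invalidation inside the write method means callers cannot forget it, at the cost of coupling the RBAC service to the cache key layout.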

Module Dependencies

The IAM Service is organized into 10 feature modules with clear dependencies:

graph TD
    subgraph "Presentation Layer"
        Routes[Routes/Controllers<br/>50+ Endpoints]
    end

    subgraph "Business Logic Layer"
        subgraph "Core Auth"
            AuthModule[Auth Module<br/>Login/Register/Logout]
            TokenModule[Token Module<br/>JWT + Cookies]
            SessionModule[Session Module<br/>Device Tracking]
        end

        subgraph "Extended Auth"
            RBACModule[RBAC Module<br/>Roles + Permissions]
            SocialModule[Social Auth<br/>Google/FB/GitHub]
            OIDCModule[OIDC Module<br/>Provider + Client]
            MFAModule[MFA Module<br/>TOTP/WebAuthn]
        end

        subgraph "Identity"
            IdentityModule[Identity Module<br/>Users + Profiles]
            OrgModule[Organization Module<br/>Multi-tenancy]
            GroupModule[Group Module<br/>Group Permissions]
        end

        subgraph "Access Governance"
            AccessModule[Access Module<br/>Requests + Reviews]
            AnalyticsModule[Analytics Module<br/>Usage Analysis]
        end

        subgraph "Compliance"
            GovernanceModule[Governance Module<br/>Compliance + Risk]
        end
    end

    subgraph "Core Services Layer"
        CacheService[Cache Service<br/>Multi-layer]
        AuditService[Audit Service<br/>Event Sourcing]
        SecurityService[Security Service<br/>Encryption]
        FeatureService[Feature Flags<br/>Runtime Config]
    end

    subgraph "Data Access Layer"
        Repositories[Repositories<br/>Data Access<br/>Prisma ORM]
    end

    subgraph "Infrastructure"
        Database[(PostgreSQL<br/>30+ Models)]
        Redis[(Redis<br/>Cache + Locks)]
    end

    %% Routes to Modules
    Routes --> AuthModule
    Routes --> IdentityModule
    Routes --> AccessModule
    Routes --> GovernanceModule

    %% Core Auth Dependencies
    AuthModule --> TokenModule
    AuthModule --> SessionModule
    AuthModule --> SocialModule
    AuthModule --> OIDCModule
    AuthModule --> MFAModule
    AuthModule --> RBACModule

    %% Identity Dependencies
    IdentityModule --> OrgModule
    IdentityModule --> GroupModule
    IdentityModule --> RBACModule

    %% Access Dependencies
    AccessModule --> AnalyticsModule
    AccessModule --> RBACModule

    %% Governance Dependencies
    GovernanceModule --> RBACModule

    %% Core Services
    AuthModule --> CacheService
    IdentityModule --> CacheService
    RBACModule --> CacheService

    AuthModule --> AuditService
    AccessModule --> AuditService
    GovernanceModule --> AuditService

    AuthModule --> SecurityService

    AuthModule --> FeatureService

    %% Data Access
    AuthModule --> Repositories
    IdentityModule --> Repositories
    AccessModule --> Repositories
    GovernanceModule --> Repositories
    RBACModule --> Repositories

    Repositories --> Database
    CacheService --> Redis
    CacheService --> Database

    style AuthModule fill:#e1f5fe
    style RBACModule fill:#f3e5f5
    style IdentityModule fill:#fff3e0
    style AccessModule fill:#e8f5e9
    style GovernanceModule fill:#fce4ec
    style CacheService fill:#bbdefb
    style AuditService fill:#c8e6c9
    style SecurityService fill:#ffccbc

Module Descriptions

Module	Responsibility	Key Files
Auth	User authentication and token management	`auth.service.ts`, `auth.controller.ts`
Token	JWT generation, validation, and refresh	`jwt.service.ts`, `cookie.service.ts`
Session	Session lifecycle and device management	`session.service.ts`
RBAC	Role and permission management	`rbac.service.ts`, `rbac.middleware.ts`
Social	OAuth integration with external providers	`social.service.ts`, `google.strategy.ts`
OIDC	OpenID Connect provider and client	`oidc-provider.service.ts`
MFA	Multi-factor authentication	`mfa.service.ts`, `totp.service.ts`
Identity	User lifecycle and profile management	`user.service.ts`, `profile.service.ts`
Organization	Multi-tenant organization support	`organization.service.ts`
Group	Group-based access control	`group.service.ts`
Access	Access request workflows and reviews	`request.service.ts`, `review.service.ts`
Analytics	Access analytics and reporting	`analytics.service.ts`
Governance	Compliance, policy, and risk management	`compliance.service.ts`, `risk.service.ts`
Cache	Multi-layer caching (Memory + Redis)	`cache.service.ts`
Audit	Event sourcing and audit logging	`audit.service.ts`
Security	Encryption, hashing, zero-trust	`zero-trust.validator.ts`

Data Architecture

The IAM Service uses PostgreSQL with 30+ Prisma models. See DATA-MODEL.md for the complete entity-relationship diagram.

Model Categories

Core Authentication (7 models):
- User, Session, RefreshToken, AuthEvent, SocialAccount, MFADevice, Policy
Authorization (6 models):
- Role, Permission, UserRole, RolePermission, UserPermission, GroupPermission
Identity Management (6 models):
- Organization, Group, GroupMember, UserProfile, IdentityVerification
Access Management (4 models):
- AccessRequest, AccessRequestApprover, AccessReview, AccessReviewItem
Governance (3 models):
- ComplianceReport, PolicyTemplate, RiskScore

Key Relationships

User (1) ─── (*) UserRole ─── (*) Role ─── (*) RolePermission ─── (*) Permission
User (1) ─── (*) UserPermission ─── (*) Permission
User (1) ─── (*) Session
User (1) ─── (1) UserProfile
User (1) ─── (*) AccessRequest
Organization (1) ─── (*) User
Organization (1) ─── (*) Group ─── (*) GroupMember ─── (*) User

Security Architecture

The IAM Service implements defense-in-depth security with multiple layers:

Security Layers

graph TB
    Request[Incoming Request] --> Layer1[Layer 1: Network Security<br/>Traefik Gateway + TLS]

    Layer1 --> Layer2[Layer 2: Zero-Trust Validation<br/>Device + Location + Behavior]

    Layer2 --> Layer3[Layer 3: Rate Limiting<br/>Dynamic by Role<br/>50-1000 req/15min]

    Layer3 --> Layer4[Layer 4: Authentication<br/>JWT Validation<br/>Token Expiry: 15min]

    Layer4 --> Layer5[Layer 5: Authorization<br/>RBAC + ABAC<br/>Permission Checking]

    Layer5 --> Layer6[Layer 6: Input Validation<br/>Zod Schemas<br/>Sanitization]

    Layer6 --> Layer7[Layer 7: Audit Logging<br/>Event Sourcing<br/>All Actions Logged]

    Layer7 --> ProcessRequest[Process Request]

    style Layer1 fill:#ffccbc
    style Layer2 fill:#ff9999
    style Layer3 fill:#ffb74d
    style Layer4 fill:#fff176
    style Layer5 fill:#aed581
    style Layer6 fill:#4dd0e1
    style Layer7 fill:#9575cd
    style ProcessRequest fill:#c8e6c9

Security Features

Feature	Implementation	Location
Zero-Trust	Device fingerprinting, location, behavior analysis	`zero-trust.validator.ts`
Password Hashing	bcrypt (cost factor 12)	`auth.service.ts:43`
Token Security	JWT with HS256, 15min expiry, token rotation	`jwt.service.ts`
CSRF Protection	State tokens for OAuth, SameSite cookies	`cookie.service.ts`
Rate Limiting	Redis-backed, dynamic by role	`rate-limit.middleware.ts`
Input Validation	Zod schemas, sanitization	`validation.middleware.ts`
Audit Logging	Event sourcing (AuthEvent model)	`audit.service.ts`
Session Security	Device fingerprinting, IP tracking	`session.service.ts`
MFA	TOTP with 30s window, backup codes	`mfa.service.ts`

Threat Mitigation

Threat	Mitigation
Brute Force	Login rate limiting (5 attempts/15min), account lockout
Token Theft	Short token lifetime (15min), token rotation, device binding
CSRF	SameSite cookies, state tokens for OAuth
XSS	Content Security Policy, HttpOnly cookies
SQL Injection	Prisma ORM parameterized queries
Session Hijacking	Device fingerprinting, IP validation
Privilege Escalation	Strict permission checks, audit logging
Replay Attacks	Token expiry, nonce for OAuth

Observability

The IAM Service provides comprehensive observability with metrics, logs, and traces:

Observability Stack

graph TB
    subgraph "IAM Service"
        Application[Application Code]

        Application --> MetricsCollector[Metrics Collector<br/>Prometheus Format]
        Application --> Logger[Structured Logger<br/>Winston]
        Application --> Tracer[Distributed Tracer<br/>Jaeger Client]
    end

    subgraph "Collection Layer"
        Prometheus[Prometheus<br/>Metrics Storage]
        Loki[Loki<br/>Log Aggregation]
        Jaeger[Jaeger<br/>Trace Storage]
    end

    subgraph "Visualization Layer"
        Grafana[Grafana<br/>Dashboards + Alerts]
    end

    MetricsCollector --> Prometheus
    Logger --> Loki
    Tracer --> Jaeger

    Prometheus --> Grafana
    Loki --> Grafana
    Jaeger --> Grafana

    Grafana --> Alerts[Alert Manager<br/>Notifications]

    style Application fill:#e1f5fe
    style Prometheus fill:#f3e5f5
    style Loki fill:#fff3e0
    style Jaeger fill:#e8f5e9
    style Grafana fill:#fce4ec

Metrics (Prometheus)

Collected Metrics:

HTTP request duration (histogram)
HTTP request count (counter)
HTTP response status codes (counter)
Active sessions (gauge)
Cache hits and misses (counters; hit ratio derived at query time)
Database query duration (histogram)
Authentication successes and failures (counters; rates derived at query time)
Permission check duration (histogram)

Endpoints:

/metrics - Prometheus metrics endpoint
/health/live - Liveness probe
/health/ready - Readiness probe

Logging (Winston)

Log Levels: ERROR, WARN, INFO, DEBUG

Structured Log Format:

{
  "level": "info",
  "message": "User logged in",
  "timestamp": "2024-01-01T00:00:00.000Z",
  "correlationId": "req-123-456",
  "userId": "user-789",
  "email": "user@example.com",
  "service": "iam-service"
}

Tracing (Jaeger)

Trace Spans:

HTTP request handling
Database queries
Cache operations
External API calls
Authentication flow
Authorization checks

Correlation IDs:

Every request gets a unique correlation ID
Propagated across service calls
Included in all logs and traces
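
The correlation ID behaviour above can be sketched in two small functions: one that reuses an incoming ID or generates a new one, and one that stamps it onto a log entry in the structured format shown earlier. The header name `x-correlation-id` and both function names are assumptions; the document does not specify them.

```typescript
import { randomUUID } from "node:crypto";

const CORRELATION_HEADER = "x-correlation-id"; // assumed header name

/** Reuses the caller's correlation ID when present, otherwise creates a new one. */
function resolveCorrelationId(headers: Record<string, string | undefined>): string {
  const incoming = headers[CORRELATION_HEADER];
  return incoming !== undefined && incoming.trim() !== "" ? incoming : randomUUID();
}

interface LogEntry {
  level: "error" | "warn" | "info" | "debug";
  message: string;
  timestamp: string;
  correlationId: string;
  service: string;
  [extra: string]: unknown;
}

/** Builds a log entry in the structured format used by the service. */
function buildLogEntry(
  level: LogEntry["level"],
  message: string,
  correlationId: string,
  fields: Record<string, unknown> = {},
  now: Date = new Date(),
): LogEntry {
  return {
    ...fields,
    level,
    message,
    timestamp: now.toISOString(),
    correlationId,
    service: "iam-service",
  };
}
```

Forwarding the same header on outbound calls to other services is what lets logs and traces from several services be joined on one ID.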

Deployment Architecture

The IAM Service can be deployed in multiple configurations:

Local Development

graph LR
    Developer[Developer<br/>Localhost] --> LocalIAM[IAM Service<br/>pnpm dev<br/>Port 3001]

    LocalIAM --> LocalDB[(PostgreSQL<br/>Docker<br/>Port 5432)]
    LocalIAM --> LocalRedis[(Redis<br/>Docker<br/>Port 6379)]

    style LocalIAM fill:#e1f5fe
    style LocalDB fill:#f3e5f5
    style LocalRedis fill:#fff3e0

Docker Compose (Multi-Service)

graph TB
    subgraph "Docker Compose Network"
        Traefik[Traefik Gateway<br/>Port 80/443]

        Traefik --> IAMService[IAM Service<br/>Port 3001]
        Traefik --> ProductService[Product Service<br/>Port 3002]
        Traefik --> OrderService[Order Service<br/>Port 3003]

        IAMService --> SharedDB[(PostgreSQL<br/>Port 5432)]
        IAMService --> SharedRedis[(Redis<br/>Port 6379)]

        ProductService --> SharedDB
        ProductService --> SharedRedis

        OrderService --> SharedDB
        OrderService --> SharedRedis
    end

    style Traefik fill:#ffecb3
    style IAMService fill:#e1f5fe
    style SharedDB fill:#f3e5f5
    style SharedRedis fill:#fff3e0

Kubernetes (Production)

graph TB
    subgraph "Ingress Layer"
        Ingress[Ingress Controller<br/>NGINX/Traefik<br/>TLS Termination]
    end

    subgraph "Application Layer"
        IAMPod1[IAM Pod 1<br/>Replica 1]
        IAMPod2[IAM Pod 2<br/>Replica 2]
        IAMPod3[IAM Pod 3<br/>Replica 3]

        IAMService[IAM Service<br/>ClusterIP]
    end

    subgraph "Data Layer"
        PostgreSQL[(PostgreSQL<br/>StatefulSet<br/>Persistent Volume)]
        Redis[(Redis<br/>StatefulSet<br/>Sentinel HA)]
    end

    subgraph "Observability"
        Prometheus[Prometheus<br/>Metrics]
        Jaeger[Jaeger<br/>Tracing]
        Loki[Loki<br/>Logs]
    end

    Ingress --> IAMService
    IAMService --> IAMPod1
    IAMService --> IAMPod2
    IAMService --> IAMPod3

    IAMPod1 --> PostgreSQL
    IAMPod1 --> Redis
    IAMPod2 --> PostgreSQL
    IAMPod2 --> Redis
    IAMPod3 --> PostgreSQL
    IAMPod3 --> Redis

    IAMPod1 -.-> Prometheus
    IAMPod1 -.-> Jaeger
    IAMPod1 -.-> Loki

    style Ingress fill:#ffecb3
    style IAMPod1 fill:#e1f5fe
    style IAMPod2 fill:#e1f5fe
    style IAMPod3 fill:#e1f5fe
    style PostgreSQL fill:#f3e5f5
    style Redis fill:#fff3e0

Production Best Practices

High Availability:
- Multiple IAM service replicas (3+)
- PostgreSQL replication (primary + standby)
- Redis Sentinel for failover

Security:
- TLS/SSL for all connections
- Network policies for pod-to-pod communication
- Secrets management (HashiCorp Vault, AWS Secrets Manager)
- Non-root containers

Resource Limits:

resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 2000m
    memory: 2Gi

Health Checks:

livenessProbe:
  httpGet:
    path: /health/live
    port: 3001
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /health/ready
    port: 3001
  initialDelaySeconds: 10
  periodSeconds: 5

Horizontal Pod Autoscaling:

minReplicas: 3
maxReplicas: 10
targetCPUUtilizationPercentage: 70
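
To make the autoscaling settings concrete, the sketch below applies the replica formula documented for the Kubernetes Horizontal Pod Autoscaler, `ceil(currentReplicas × currentMetric / targetMetric)`, clamped to the configured bounds. It omits the autoscaler's tolerance band and stabilization window, and the type and function names are illustrative.

```typescript
interface HpaConfig {
  minReplicas: number;
  maxReplicas: number;
  targetCpuPercent: number;
}

// Values from the autoscaling settings above.
const HPA: HpaConfig = { minReplicas: 3, maxReplicas: 10, targetCpuPercent: 70 };

/** Replica count the autoscaler would aim for at the given average CPU utilization. */
function desiredReplicas(
  currentReplicas: number,
  currentCpuPercent: number,
  config: HpaConfig = HPA,
): number {
  const raw = Math.ceil(currentReplicas * (currentCpuPercent / config.targetCpuPercent));
  return Math.min(config.maxReplicas, Math.max(config.minReplicas, raw));
}
```

For example, 3 replicas averaging 140% of their CPU request would scale to 6, while 8 replicas at 105% would be capped at the `maxReplicas` value of 10.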

Next Steps

User Guides: See GETTING-STARTED.md for setup instructions
API Reference: See API_REFERENCE.md for complete endpoint documentation
Security Model: See SECURITY-MODEL.md for security details
Data Model: See DATA-MODEL.md for database schema
Deployment: See DEPLOYMENT-GUIDE.md for deployment instructions

References

Security Skill: .cursor/skills/security/SKILL.md
IAM Proposal: docs/en/architecture/iam-proposal.md
Migration Guide: docs/en/guides/iam-migration.md
Prisma Schema: prisma/schema.prisma
Routes Definition: src/routes/index.ts

Last Updated: January 2026
Version: 1.0.0
Status: Production Ready
