Security Architecture
Comprehensive security architecture for GoodGo platform with zero-trust model, RBAC, and compliance
Overview Diagram
graph TD
Request[Client Request] --> TLS[TLS/HTTPS Layer]
TLS --> RateLimit[Rate Limiting]
RateLimit --> JWT[JWT Validation]
JWT --> RBAC[RBAC Authorization]
RBAC --> ZeroTrust[Zero-Trust Checks]
ZeroTrust --> Service[Service Logic]
Service --> Encrypt[Data Encryption<br/>AES-256-GCM]
Encrypt --> DB[(Encrypted Data)]
Service --> Audit[Audit Logging]
Audit --> AuditDB[(Audit Trail<br/>7-year retention)]
style TLS fill:#d4edda
style JWT fill:#e1f5ff
style Encrypt fill:#f8d7da
style Audit fill:#fff4e1
Architecture Description
The GoodGo Security Architecture implements defense-in-depth with multiple security layers:
Security Principles:
- Zero Trust: Never trust, always verify
- Least Privilege: Minimum required permissions
- Defense in Depth: Multiple security layers
- Audit Everything: Complete audit trail
- Encryption: Data encrypted at rest and in transit
Key Components:
- JWT Authentication (15min access, 7d refresh)
- RBAC + ABAC Authorization
- Zero-Trust Device Validation
- AES-256-GCM Encryption
- Event Sourcing for Audit Trail
- Compliance (GDPR, SOC2, ISO27001, HIPAA)
Authentication Flow
sequenceDiagram
participant Client
participant API as API Gateway
participant IAM as IAM Service
participant DB as Database
participant Cache as Redis
Client->>API: Login Request<br/>(email + password)
API->>IAM: Forward Request
IAM->>DB: Verify Credentials
DB-->>IAM: User + Hash
IAM->>IAM: bcrypt.compare()<br/>(cost 12)
alt Valid Credentials
IAM->>IAM: Generate Tokens<br/>(Access + Refresh)
IAM->>DB: Store Refresh Token<br/>(hashed SHA-256)
IAM->>Cache: Cache Permissions<br/>(5min TTL)
IAM-->>API: Tokens + User
API-->>Client: Set httpOnly Cookies
else Invalid
IAM-->>Client: 401 Unauthorized
end
Authentication Details:
1. Password Hashing:
- Algorithm: bcrypt with cost factor 12
- Never store plaintext passwords
- Minimum password length: 8 characters, with complexity rules
2. JWT Tokens:
- Access Token: 15 minutes expiry
- Refresh Token: 7 days expiry
- Algorithm: RS256 (asymmetric signing)
- Payload: userId, roles, permissions
3. Token Storage:
- Access: httpOnly cookie (secure, sameSite)
- Refresh: stored in the database only as a SHA-256 hash
- Rotation: New refresh token on each use
4. MFA Support:
- TOTP (Time-based One-Time Password)
- Backup codes (10 single-use)
- Recovery email verification
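The refresh-token storage and rotation rules above can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: it assumes the refresh token is an opaque random string, and the in-memory `sessions` map and the function names are hypothetical stand-ins for the Session table and IAM service code.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical in-memory stand-in for the Session table:
// maps SHA-256(refresh token) -> session record.
interface SessionRecord {
  userId: string;
  expiresAt: number; // epoch milliseconds
}

const REFRESH_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days, as stated above
const sessions = new Map<string, SessionRecord>();

// Only the hash is persisted, so a database leak does not expose usable tokens.
function hashToken(token: string): string {
  return createHash("sha256").update(token).digest("hex");
}

// Issue a new random refresh token and store its hash.
function issueRefreshToken(userId: string, now: number = Date.now()): string {
  const token = randomBytes(32).toString("hex");
  sessions.set(hashToken(token), { userId, expiresAt: now + REFRESH_TTL_MS });
  return token;
}

// Rotation: a presented token is consumed and replaced by a fresh one.
// Returns null if the token is unknown, already used, or expired.
function rotateRefreshToken(token: string, now: number = Date.now()): string | null {
  const key = hashToken(token);
  const record = sessions.get(key);
  if (record === undefined) return null;
  sessions.delete(key); // single use, even when the token turns out to be expired
  if (record.expiresAt <= now) return null;
  return issueRefreshToken(record.userId, now);
}
```

Because the old hash is deleted on every use, replaying a stolen refresh token after the legitimate client has rotated it yields `null`.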
Authorization Model
graph TD
User[User] --> Roles[Roles]
User --> DirectPerms[Direct Permissions]
Roles --> RolePerms[Role Permissions]
RolePerms --> Check{Permission Check}
DirectPerms --> Check
Check -->|Granted| Resource[Access Resource]
Check -->|Denied| Reject[403 Forbidden]
subgraph "Permission Model"
Perm[Permission<br/>resource:action:scope]
end
style Check fill:#e1f5ff
style Perm fill:#fff4e1
RBAC (Role-Based Access Control):
1. Role Hierarchy:
SuperAdmin > OrgAdmin > Manager > User > Guest
2. Permission Format: resource:action:scope
- Resource: users, roles, permissions
- Action: create, read, update, delete
- Scope: own, org, global
Examples:
- users:read:own - Read own user profile
- users:update:org - Update users in organization
- roles:create:global - Create roles globally
3. Permission Caching:
// Cache key: user:{userId}:permissions
// TTL: 5 minutes
// Invalidate on: role change, permission change
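A permission check over the `resource:action:scope` format might look like the sketch below. It rests on one assumption the text does not state: that a broader scope implies the narrower ones (`global` covers `org`, which covers `own`). The function names are illustrative only.

```typescript
type Scope = "own" | "org" | "global";

interface Permission {
  resource: string;
  action: string;
  scope: Scope;
}

// Assumption: a broader scope implies the narrower ones (global > org > own).
const SCOPE_RANK: Record<Scope, number> = { own: 0, org: 1, global: 2 };

// Parse "resource:action:scope" and reject anything malformed.
function parsePermission(text: string): Permission {
  const parts = text.split(":");
  if (parts.length !== 3) throw new Error(`Malformed permission: ${text}`);
  const [resource, action, scope] = parts;
  if (scope !== "own" && scope !== "org" && scope !== "global") {
    throw new Error(`Unknown scope in permission: ${text}`);
  }
  return { resource, action, scope };
}

// Effective permissions are the union of role permissions and direct permissions.
function effectivePermissions(rolePerms: string[], directPerms: string[]): string[] {
  return Array.from(new Set(rolePerms.concat(directPerms)));
}

// True when any granted permission matches the resource and action
// with a scope at least as broad as the one required.
function hasPermission(granted: string[], required: string): boolean {
  const need = parsePermission(required);
  return granted.map(parsePermission).some(
    (p) =>
      p.resource === need.resource &&
      p.action === need.action &&
      SCOPE_RANK[p.scope] >= SCOPE_RANK[need.scope],
  );
}
```

For example, a user holding `users:read:org` passes a check for `users:read:own` but fails one for `users:read:global`; a failed check maps to the 403 branch in the diagram.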
Zero-Trust Architecture
graph TD
Request[Request] --> Device[Device Fingerprint]
Device --> IP[IP Address Validation]
IP --> Behavior[Behavioral Analysis]
Behavior --> Session[Session Binding]
Session -->|Valid| Allow[Allow Request]
Session -->|Suspicious| MFA[Require MFA]
Session -->|Anomaly| Block[Block + Alert]
style Block fill:#f8d7da
style MFA fill:#fff3cd
style Allow fill:#d4edda
Zero-Trust Components:
1. Device Fingerprinting:
- Browser: User-Agent, Canvas, WebGL
- Screen resolution, timezone, language
- Plugin detection, fonts available
- Hash fingerprint → Store with session
2. IP Address Validation:
- Whitelist known IPs per user
- Alert on new IP + require MFA
- Block suspicious IPs (VPN, Tor)
3. Behavioral Analysis:
- Login patterns (time, location)
- API usage patterns
- Failed auth attempts
- Alert on anomalies
4. Session Binding:
- Bind session to device fingerprint
- Bind session to IP address
- Invalidate on mismatch
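The fingerprint hashing and session-binding checks can be sketched as below. The mapping of outcomes is an assumption made for illustration: a fingerprint mismatch is treated as an anomaly (block), while an IP change alone triggers MFA. The field names in `DeviceSignals` are hypothetical, and behavioral analysis is left out.

```typescript
import { createHash } from "node:crypto";

type Decision = "allow" | "require_mfa" | "block";

// Hypothetical subset of the browser signals listed above.
interface DeviceSignals {
  userAgent: string;
  screen: string;
  timezone: string;
  language: string;
}

// What was stored with the session when it was created.
interface BoundSession {
  fingerprintHash: string;
  ipAddress: string;
}

// Hash the signals in a fixed order so the same device yields the same value.
function fingerprint(signals: DeviceSignals): string {
  const canonical = [signals.userAgent, signals.screen, signals.timezone, signals.language].join("|");
  return createHash("sha256").update(canonical).digest("hex");
}

// Assumed policy: device mismatch -> block, new IP on a known device -> MFA.
function evaluateRequest(session: BoundSession, signals: DeviceSignals, ip: string): Decision {
  if (fingerprint(signals) !== session.fingerprintHash) return "block";
  if (ip !== session.ipAddress) return "require_mfa";
  return "allow";
}
```

Strict IP binding produces false positives for mobile users who change networks, which is why this sketch steps up to MFA on an IP change instead of blocking outright.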
Data Protection
Encryption Strategy:
1. Data at Rest:
- PII: AES-256-GCM encryption
- Passwords: bcrypt (cost 12)
- Tokens: SHA-256 hash
- Keys: Environment variables + K8s secrets
2. Data in Transit:
- TLS 1.2+ for all communications
- HTTPS enforcement
- Certificate pinning (mobile clients)
3. Key Management:
- Unique IV (nonce) per encryption operation; AES-GCM must never reuse an IV under the same key
- 32+ character ENCRYPTION_KEY
- Rotate keys quarterly
- Never hardcode secrets
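A minimal sketch of AES-256-GCM field encryption with Node's built-in `crypto` module follows. The payload layout (IV, then auth tag, then ciphertext, base64-encoded) is an assumption for illustration, and how `ENCRYPTION_KEY` is turned into the 32 key bytes is not specified in this document, so the key is simply passed in as a `Buffer`.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const IV_BYTES = 12;  // 96-bit IV, the size recommended for GCM
const TAG_BYTES = 16; // 128-bit authentication tag

// Encrypt with a fresh random IV; output is base64(iv || tag || ciphertext).
function encrypt(plaintext: string, key: Buffer): string {
  const iv = randomBytes(IV_BYTES);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

// Decrypt and authenticate; throws if the payload or key is wrong.
function decrypt(payload: string, key: Buffer): string {
  const raw = Buffer.from(payload, "base64");
  const iv = raw.subarray(0, IV_BYTES);
  const tag = raw.subarray(IV_BYTES, IV_BYTES + TAG_BYTES);
  const ciphertext = raw.subarray(IV_BYTES + TAG_BYTES);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Because the IV is random per call, encrypting the same value twice produces different ciphertexts, and any modification of the stored payload makes `decrypt` throw instead of returning corrupted data.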
Compliance & Audit
Compliance Requirements:
1. GDPR:
- Right to erasure (soft delete + hard delete after 90 days)
- Data portability (export user data)
- Consent management
- Breach notification (72 hours)
2. SOC2:
- Access controls (RBAC)
- Encryption at rest and in transit
- Audit logging (7-year retention)
- Incident response plan
// Event sourcing for all auth events
{
  eventType: 'auth.login.success',
  userId: 'user_123',
  timestamp: '2024-01-15T10:30:00Z',
  ipAddress: '192.168.1.1',
  deviceFingerprint: 'fp_xyz',
  metadata: {...}
}
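To illustrate how such events support an append-only audit trail and state reconstruction, here is a small in-memory sketch. The `auth.login.failure` event type, the `AuditLog` class, and `replayLoginState` are hypothetical; a real deployment would persist events durably instead of holding them in an array.

```typescript
// "auth.login.failure" is an assumed counterpart to the success event shown above.
type AuthEventType = "auth.login.success" | "auth.login.failure";

interface AuthEvent {
  eventType: AuthEventType;
  userId: string;
  timestamp: string; // ISO 8601, UTC
  ipAddress: string;
  deviceFingerprint: string;
  metadata: Record<string, unknown>;
}

// Append-only store: events can be added and read, never updated or removed.
class AuditLog {
  private readonly events: AuthEvent[] = [];

  append(event: AuthEvent): void {
    this.events.push(Object.freeze({ ...event }));
  }

  forUser(userId: string): readonly AuthEvent[] {
    return this.events.filter((e) => e.userId === userId);
  }
}

interface LoginState {
  lastSuccessAt: string | null;
  consecutiveFailures: number;
}

// Replay: fold the event history into the current login state.
function replayLoginState(events: readonly AuthEvent[]): LoginState {
  let state: LoginState = { lastSuccessAt: null, consecutiveFailures: 0 };
  for (const event of events) {
    if (event.eventType === "auth.login.success") {
      state = { lastSuccessAt: event.timestamp, consecutiveFailures: 0 };
    } else {
      state = { ...state, consecutiveFailures: state.consecutiveFailures + 1 };
    }
  }
  return state;
}
```

Derived values such as the consecutive-failure count are never stored directly here; they are recomputed from the event history, which is what makes the trail usable for forensics.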
System Context
C4Context
title Security Architecture Context
Person(user, "User", "End user accessing platform")
Person(admin, "Admin", "System administrator")
Person(attacker, "Attacker", "Potential threat actor")
System(iam, "IAM Service", "Authentication & Authorization")
System_Ext(db, "Neon PostgreSQL", "Encrypted user credentials & sessions")
System_Ext(cache, "Redis", "Permission & session cache")
System_Ext(audit, "Audit Service", "Security event logging")
System_Ext(mfa, "MFA Provider", "TOTP verification")
System_Ext(monitoring, "Security Monitoring", "SIEM & alerting")
Rel(user, iam, "Authenticates", "HTTPS + TLS 1.2+")
Rel(admin, iam, "Manages permissions", "HTTPS + TLS 1.2+")
Rel(attacker, iam, "Blocked by security layers", "")
Rel(iam, db, "Stores credentials", "PostgreSQL + TLS")
Rel(iam, cache, "Caches permissions", "Redis + TLS")
Rel(iam, audit, "Logs security events", "Kafka")
Rel(iam, mfa, "Verifies MFA", "HTTPS")
Rel(iam, monitoring, "Sends security metrics", "Prometheus + Loki")
Context Description:
- IAM Service: Central authentication and authorization
- Database: Stores encrypted credentials, sessions, permissions
- Cache: Caches permissions and sessions to reduce database load
- Audit Service: Receives and stores all security events
- MFA Provider: External TOTP verification service (Google Authenticator compatible)
- Security Monitoring: SIEM (Security Information and Event Management) and alerting
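For reference, Google Authenticator compatible TOTP follows RFC 6238 (HMAC-SHA1, 30-second steps). The sketch below shows the computation with Node's built-in `crypto`; the one-step drift window in `verifyTotp` is an assumption, not something this document specifies.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// RFC 6238 TOTP: HOTP (RFC 4226) applied to the current time step.
function totp(secret: Buffer, unixSeconds: number, digits: number = 6, stepSeconds: number = 30): string {
  const counter = Math.floor(unixSeconds / stepSeconds);
  const message = Buffer.alloc(8); // 64-bit big-endian counter
  message.writeUInt32BE(Math.floor(counter / 2 ** 32), 0);
  message.writeUInt32BE(counter >>> 0, 4);
  const hmac = createHmac("sha1", secret).update(message).digest();
  // Dynamic truncation: the low 4 bits of the last byte select the offset.
  const offset = hmac[hmac.length - 1] & 0x0f;
  const binary =
    ((hmac[offset] & 0x7f) << 24) |
    (hmac[offset + 1] << 16) |
    (hmac[offset + 2] << 8) |
    hmac[offset + 3];
  return (binary % 10 ** digits).toString().padStart(digits, "0");
}

// Accept the current step and one step either side (assumed clock-drift window).
function verifyTotp(secret: Buffer, code: string, unixSeconds: number): boolean {
  for (const drift of [-1, 0, 1]) {
    const expected = totp(secret, unixSeconds + drift * 30);
    if (expected.length === code.length && timingSafeEqual(Buffer.from(expected), Buffer.from(code))) {
      return true;
    }
  }
  return false;
}
```

The secret passed in is the raw key bytes; authenticator apps usually exchange it Base32-encoded, so it would need decoding first.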
Database Architecture
erDiagram
User ||--o{ Session : has
User ||--o{ UserRole : has
User ||--o{ UserPermission : has
User ||--o{ MFADevice : has
User ||--o{ LoginHistory : has
User ||--o{ DeviceFingerprint : has
Role ||--o{ UserRole : assigned_to
Role ||--o{ RolePermission : has
Permission ||--o{ RolePermission : granted_to
Permission ||--o{ UserPermission : granted_to
Organization ||--o{ User : contains
Organization ||--o{ Role : defines
User {
string id PK "CUID"
string email UK "Unique, indexed"
string passwordHash "bcrypt cost 12"
string organizationId FK
boolean mfaEnabled "MFA required?"
datetime lastLoginAt "Tracking"
datetime createdAt "Timestamp"
datetime updatedAt "Timestamp"
datetime deletedAt "Soft delete"
}
Session {
string id PK "CUID"
string userId FK
string refreshTokenHash "SHA-256"
string deviceFingerprint "Hashed"
string ipAddress "IPv4/IPv6"
string userAgent "Browser info"
datetime expiresAt "7 days TTL"
datetime lastActivityAt "Tracking"
datetime createdAt "Timestamp"
}
Role {
string id PK "CUID"
string name "role-name"
string organizationId FK
int hierarchy "Priority level"
boolean isSystem "Built-in?"
datetime createdAt "Timestamp"
}
Permission {
string id PK "CUID"
string resource "users, roles, etc"
string action "create, read, update, delete"
string scope "own, org, global"
datetime createdAt "Timestamp"
}
MFADevice {
string id PK "CUID"
string userId FK
string type "totp, backup"
string secret "Encrypted TOTP secret"
boolean verified "Verified?"
datetime lastUsedAt "Tracking"
datetime createdAt "Timestamp"
}
LoginHistory {
string id PK "CUID"
string userId FK
boolean success "Success/Failure"
string ipAddress "IPv4/IPv6"
string deviceFingerprint "Hashed"
string failureReason "If failed"
datetime timestamp "Event time"
}
DeviceFingerprint {
string id PK "CUID"
string userId FK
string fingerprint "Hashed"
boolean trusted "Auto-approved?"
datetime firstSeenAt "First use"
datetime lastSeenAt "Last use"
}
Description:
- User: Stores hashed credentials, MFA settings, organization membership
- Session: Stores hashed refresh tokens, device fingerprint, IP tracking
- Role & Permission: RBAC hierarchy with system roles and custom roles
- MFADevice: TOTP secrets (encrypted), backup codes
- LoginHistory: Audit trail for all login attempts (success/failure)
- DeviceFingerprint: Trusted device tracking for zero-trust model
Database Security:
- Password hashes: bcrypt with cost factor 12
- Token hashes: SHA-256
- MFA secrets: AES-256-GCM encryption
- Soft deletes: deletedAt field, hard delete after 90 days (GDPR)
- Indexes: email (unique), userId (foreign keys), timestamps
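The soft-delete retention rule can be expressed as a small predicate that a scheduled purge job could use. This is a sketch only: the `SoftDeletable` shape and function names are hypothetical, and the 90-day window is measured as elapsed time from `deletedAt`.

```typescript
const HARD_DELETE_AFTER_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical minimal view of a row that carries a soft-delete marker.
interface SoftDeletable {
  id: string;
  deletedAt: Date | null; // null means the row is live
}

// A row is due for hard deletion once 90 full days have passed since soft delete.
function isDueForHardDelete(row: SoftDeletable, now: Date): boolean {
  if (row.deletedAt === null) return false;
  return now.getTime() - row.deletedAt.getTime() >= HARD_DELETE_AFTER_DAYS * MS_PER_DAY;
}

// Select the ids a purge job would permanently remove.
function selectForPurge(rows: SoftDeletable[], now: Date): string[] {
  return rows.filter((row) => isDueForHardDelete(row, now)).map((row) => row.id);
}
```

Passing `now` explicitly keeps the rule deterministic and easy to test around the 90-day boundary.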
Design Decisions
Decision 1: JWT with RS256 (Asymmetric)
Context: Need stateless authentication with ability to verify tokens in multiple services
Decision: Use JWT with RS256 (RSA asymmetric signing) instead of HS256 (HMAC symmetric)
Consequences:
- ✅ Positive:
- Services can verify tokens with the public key and never need the signing secret
- Easier key rotation (only distribute new public key)
- Higher security (private key only in IAM service)
- Compliance: Clear audit trail of who signs tokens
- ❌ Negative:
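- Slower than HS256, mainly when signing (RSA private-key operations cost far more than an HMAC); verification with the public key is much cheaper than signing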
- More complex key management
- Private key must be carefully protected; the public key must be distributed reliably to every verifying service
Alternatives: HS256 (symmetric), EdDSA, OAuth 2.0 with Opaque Tokens
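To make the RS256 trade-off concrete, the sketch below signs a token with the private key and verifies it with only the public key, using Node's built-in `crypto`. It is hand-rolled for illustration; a production service would more likely use a maintained JWT library. The `Claims` shape follows the payload fields listed earlier plus an `exp` claim, and the function names are hypothetical.

```typescript
import { createSign, createVerify, generateKeyPairSync, KeyObject } from "node:crypto";

interface Claims {
  userId: string;
  roles: string[];
  permissions: string[];
  exp: number; // expiry, seconds since the Unix epoch
}

function encodeJson(value: object): string {
  return Buffer.from(JSON.stringify(value), "utf8").toString("base64url");
}

// Sign with the private key (held only by the IAM service).
function signJwt(claims: Claims, privateKey: KeyObject): string {
  const header = encodeJson({ alg: "RS256", typ: "JWT" });
  const payload = encodeJson(claims);
  const signature = createSign("RSA-SHA256")
    .update(`${header}.${payload}`)
    .sign(privateKey)
    .toString("base64url");
  return `${header}.${payload}.${signature}`;
}

// Verify with the public key only; returns null for any invalid or expired token.
function verifyJwt(token: string, publicKey: KeyObject, nowSeconds: number): Claims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  try {
    const decodedHeader = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
    if (decodedHeader.alg !== "RS256") return null; // never trust other algorithms
    const valid = createVerify("RSA-SHA256")
      .update(`${header}.${payload}`)
      .verify(publicKey, Buffer.from(signature, "base64url"));
    if (!valid) return null;
    const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as Claims;
    return claims.exp > nowSeconds ? claims : null;
  } catch {
    return null; // malformed base64url or JSON
  }
}
```

Pinning the accepted algorithm to RS256 in `verifyJwt` matters: accepting whatever `alg` the token header declares is a well-known source of JWT vulnerabilities.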
Decision 2: Zero-Trust Model with Device Fingerprinting
Context: Need to protect against credential theft, session hijacking, and unauthorized access
Decision: Implement zero-trust model with device fingerprinting, IP validation, behavioral analysis
Consequences:
- ✅ Positive:
- Detect anomalies (new device, new IP, unusual behavior)
- Increased security by detecting and blocking suspicious activities
- Compliance: SOC2, ISO27001 requirements
- User experience: Auto-approve trusted devices
- ❌ Negative:
- Higher complexity
- Potential false positives (legitimate users blocked)
- Performance overhead (fingerprint hash, IP check)
- Privacy concerns (tracking devices, IPs)
Alternatives: Basic authentication only, IP whitelist only, MFA required for all
Decision 3: Event Sourcing for Audit Trail
Context: Need immutable audit trail for compliance (GDPR, SOC2, HIPAA) and security forensics
Decision: Use event sourcing pattern to store all auth/security events
Consequences:
- ✅ Positive:
- Immutable audit trail (cannot modify/delete)
- Complete history of all security events
- Compliance: GDPR, SOC2 (7-year audit log retention), HIPAA
- Security forensics: Trace back attacks, breaches
- Replay events to reconstruct state
- ❌ Negative:
- High storage cost (retain 7 years)
- Complexity in event schema versioning
- Performance: Event publishing overhead
- Data privacy: Must anonymize PII after retention period
Alternatives: Database audit logs only, External SIEM only, No audit trail
Performance Characteristics
| Metric | Target | Notes |
|---|---|---|
| Login Time (P95) | < 500ms | Including bcrypt verification |
| Login Time (P99) | < 1s | Peak load |
| Token Generation (P95) | < 50ms | JWT sign with RS256 |
| Token Verification (P95) | < 10ms | JWT verify with public key |
| Permission Check (P95) | < 5ms | From cache (L1 or L2) |
| Permission Check (Cache Miss) | < 50ms | Database query |
| MFA Verification (P95) | < 100ms | TOTP validation |
| Session Lookup (P95) | < 10ms | Redis cache |
| Password Hash (P95) | < 200ms | bcrypt cost 12 |
| Device Fingerprint Hash | < 5ms | SHA-256 |
| Failed Login Rate Limit | 5 attempts / 15min | Per user |
| Auth Throughput | 500 req/s | Per IAM instance |
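The failed-login limit in the table (5 attempts per 15 minutes, per user) could be enforced with a sliding window such as the sketch below. Whether the real limiter uses a sliding or fixed window is not stated here, so treat the window type, the in-memory storage, and the class name as assumptions.

```typescript
const MAX_FAILED_ATTEMPTS = 5;
const WINDOW_MS = 15 * 60 * 1000; // 15 minutes

// Sliding-window limiter keyed by user id (in-memory; per-process only).
class FailedLoginLimiter {
  private readonly failures = new Map<string, number[]>();

  // Failure timestamps that still fall inside the window.
  private recent(userId: string, now: number): number[] {
    const cutoff = now - WINDOW_MS;
    return (this.failures.get(userId) ?? []).filter((t) => t > cutoff);
  }

  recordFailure(userId: string, now: number): void {
    this.failures.set(userId, [...this.recent(userId, now), now]);
  }

  // True once the user has reached the limit within the window.
  isBlocked(userId: string, now: number): boolean {
    return this.recent(userId, now).length >= MAX_FAILED_ATTEMPTS;
  }

  // Call after a successful login to clear the counter.
  reset(userId: string): void {
    this.failures.delete(userId);
  }
}
```

With several IAM instances behind the load balancer, the counters would need to live in shared storage such as Redis for the limit to hold across pods.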
Performance Optimizations:
- Permission Caching: L1 (memory) + L2 (Redis), TTL 5 minutes
- Token Caching: Cache public key in memory for JWT verification
- Connection Pooling: Reuse database connections
- Async Operations: Event publishing, audit logging (fire-and-forget)
- Rate Limiting: Prevent brute force attacks, reduce load
- Horizontal Scaling: Multiple IAM service instances
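The two-tier permission cache can be sketched as a read-through lookup: L1, then L2, then the database. To keep the example self-contained and synchronous, a second in-memory `TtlCache` stands in for Redis (a real Redis client would be asynchronous), and `loadFromDb` is a hypothetical loader. The key format and 5-minute TTL follow the values given earlier.

```typescript
const PERMISSION_TTL_MS = 5 * 60 * 1000; // 5 minutes

// Minimal TTL cache; used for L1 and as a stand-in for the Redis L2 tier.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();

  get(key: string, now: number): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V, now: number): void {
    this.entries.set(key, { value, expiresAt: now + PERMISSION_TTL_MS });
  }

  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

const cacheKey = (userId: string): string => `user:${userId}:permissions`;

// Read-through lookup: L1 hit, else L2 hit (promoted to L1), else database.
function getPermissions(
  userId: string,
  l1: TtlCache<string[]>,
  l2: TtlCache<string[]>,
  loadFromDb: (userId: string) => string[],
  now: number,
): string[] {
  const key = cacheKey(userId);
  const fromL1 = l1.get(key, now);
  if (fromL1 !== undefined) return fromL1;
  const fromL2 = l2.get(key, now);
  if (fromL2 !== undefined) {
    l1.set(key, fromL2, now);
    return fromL2;
  }
  const fresh = loadFromDb(userId);
  l2.set(key, fresh, now);
  l1.set(key, fresh, now);
  return fresh;
}

// Called on role or permission changes so stale grants are not served.
function invalidatePermissions(userId: string, l1: TtlCache<string[]>, l2: TtlCache<string[]>): void {
  l1.invalidate(cacheKey(userId));
  l2.invalidate(cacheKey(userId));
}
```

One design caveat: promoting an L2 hit into L1 restarts the L1 TTL, so without explicit invalidation an entry can be served for close to twice the nominal 5 minutes. L1 is also per-pod, so invalidation has to reach every IAM instance.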
Deployment
graph TD
subgraph "Security Layer"
LB[Load Balancer<br/>TLS Termination]
WAF[WAF / Firewall<br/>Rate Limiting<br/>DDoS Protection]
end
subgraph "IAM Service Layer"
IAM1[IAM Service Pod 1<br/>Stateless]
IAM2[IAM Service Pod 2<br/>Stateless]
IAM3[IAM Service Pod 3<br/>Stateless]
end
subgraph "Data Layer"
DB[(Neon PostgreSQL<br/>Encrypted at Rest)]
Cache[(Redis Cluster<br/>TLS Enabled)]
Vault[Secrets Manager<br/>K8s Secrets]
end
subgraph "Security Monitoring"
SIEM[SIEM / Security Monitoring]
Alerts[Alerting System]
end
Client[Clients] --> LB
LB --> WAF
WAF --> IAM1
WAF --> IAM2
WAF --> IAM3
IAM1 --> DB
IAM1 --> Cache
IAM1 --> Vault
IAM2 --> DB
IAM2 --> Cache
IAM2 --> Vault
IAM3 --> DB
IAM3 --> Cache
IAM3 --> Vault
IAM1 -.->|Security Events| SIEM
IAM2 -.->|Security Events| SIEM
IAM3 -.->|Security Events| SIEM
SIEM -.->|Alerts| Alerts
style LB fill:#d4edda
style WAF fill:#fff3cd
style DB fill:#f0e1ff
style Cache fill:#fff4e1
style Vault fill:#f8d7da
style SIEM fill:#e1f5ff
Deployment Strategy:
Security Deployment:
- TLS 1.2+ Enforcement: All connections require TLS
- Network Policies (K8s): Deny all by default, whitelist specific services
- Pod Security Standards (successor to Pod Security Policies, which were removed in Kubernetes 1.25): Non-root user, read-only filesystem, no privilege escalation
- Secrets Management: Kubernetes secrets with encryption at rest
- Image Scanning: Trivy/Clair scan before deployment
- RBAC (K8s): Least privilege for service accounts
Resource Allocation:
| Component | CPU | Memory | Replicas |
|---|---|---|---|
| IAM Service | 500m | 1GB | 3-10 (HPA) |
| Redis | 1 core | 2GB | 3 masters + 3 slaves |
Security Configuration:
# K8s Network Policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: iam-service-policy
spec:
  podSelector:
    matchLabels:
      app: iam-service
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api-gateway
      ports:
        - protocol: TCP
          port: 5000
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgresql
      ports:
        - protocol: TCP
          port: 5432