goodgo-platform/monitoring/prometheus/prometheus.yml
Ho Ngoc Hai abeb8fd322 feat(auth): complete MFA grace period for required roles + ops monitoring
Finishes the half-implemented MFA enforcement work and ships the SLO
monitoring rules at the same time.

MFA grace period (auth):
- New `mfa-policy.ts` central source of truth: `MFA_REQUIRED_ROLES = [ADMIN]`,
  `MFA_GRACE_PERIOD_DAYS = 14`, `MFA_REAUTH_WINDOW_MINUTES = 15`.
- New columns `User.mfaGraceStartedAt` + `User.mfaLastVerifiedAt`
  (migration `20260429000000_add_mfa_grace_columns`).
- `JwtPayload.mfa: 'none' | 'grace' | 'enrollment_required'` claim now
  carried in every access token so the FE + admin guards can react.
- `LoginUserHandler.resolveMfaGraceClaim()`:
  * If role requires MFA and user has not enrolled, lazy-stamp
    `mfaGraceStartedAt` on first login (returns `mfa: 'grace'`,
    `remainingDays: 14`).
  * After window expires → `mfa: 'enrollment_required'`, `remainingDays: 0`
    (callers must force enrolment on sensitive routes).
  * Otherwise → `mfa: 'none'`.
- `LocalStrategy` now passes `totpEnabled` + `mfaGraceStartedAt` through
  to the command so the handler can branch without an extra query.
- `IUserRepository` + `PrismaUserRepository` get
  `updateMfaGraceStartedAt` / `updateMfaLastVerifiedAt`.
- `UserEntity` carries the two new fields end-to-end (props, getters,
  `createNew` + `createPasswordless` factories). Fixed an orphan-property
  syntax bug in `createPasswordless` that was breaking typecheck.
- `oauth.service.ts` `UserEntity` construction now includes `deletedAt`
  + the two MFA fields (was missing required props).
- Add missing `jsonwebtoken` + `@types/jsonwebtoken` to `apps/api`
  (transitively pulled in via `jwt-rotation.ts` from commit 3705193 but
  never declared, so `tsc --noEmit` was failing).
- Update `login-user.handler.spec.ts` + `local.strategy.spec.ts` to cover
  grace-window + enrolment-required branches. 338/338 auth tests pass.
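
The branching described for `resolveMfaGraceClaim()` above can be sketched roughly as follows. This is an illustrative reconstruction, not the code from this commit: only the constant names, the claim values, and the function name come from the message. The `MfaUserSnapshot` and `MfaGraceResult` shapes, the `stampGraceStartedAt` field, the `Math.ceil` rounding, and `remainingDays: 0` for the `'none'` case are all assumptions.

```typescript
// Hypothetical sketch of the grace-period decision. Names not mentioned in
// the commit message (MfaUserSnapshot, MfaGraceResult, stampGraceStartedAt)
// are invented for illustration.
const MFA_REQUIRED_ROLES: readonly string[] = ['ADMIN'];
const MFA_GRACE_PERIOD_DAYS = 14;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

type MfaClaim = 'none' | 'grace' | 'enrollment_required';

interface MfaUserSnapshot {
  role: string;
  totpEnabled: boolean;
  mfaGraceStartedAt: Date | null;
}

interface MfaGraceResult {
  mfa: MfaClaim;
  remainingDays: number;
  // Present only when the caller should persist a new grace start
  // (for example via updateMfaGraceStartedAt on the repository).
  stampGraceStartedAt?: Date;
}

function resolveMfaGraceClaim(
  user: MfaUserSnapshot,
  now: Date = new Date(),
): MfaGraceResult {
  // Role does not require MFA, or the user has already enrolled.
  if (!MFA_REQUIRED_ROLES.includes(user.role) || user.totpEnabled) {
    return { mfa: 'none', remainingDays: 0 }; // remainingDays here is assumed
  }

  // First login for a required role: start the grace window lazily.
  if (user.mfaGraceStartedAt === null) {
    return {
      mfa: 'grace',
      remainingDays: MFA_GRACE_PERIOD_DAYS,
      stampGraceStartedAt: now,
    };
  }

  // Whole days left in the window, rounded up (rounding is an assumption).
  const elapsedMs = now.getTime() - user.mfaGraceStartedAt.getTime();
  const remainingDays = Math.ceil(
    (MFA_GRACE_PERIOD_DAYS * MS_PER_DAY - elapsedMs) / MS_PER_DAY,
  );

  if (remainingDays <= 0) {
    return { mfa: 'enrollment_required', remainingDays: 0 };
  }
  return { mfa: 'grace', remainingDays };
}
```

Keeping the function pure and returning the timestamp to persist, rather than writing to the database inside it, is one way to make the three branches easy to unit-test. The real handler may well perform the write directly.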

Ops monitoring:
- New `monitoring/prometheus/slo-rules.yml` with recording + alerting
  rules for the agreed SLOs.
- Wire it into `prometheus.yml` + alertmanager routing.
- Capture the SLO soak-test results in
  `docs/audits/slo-soak-test-log.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:00:23 +07:00

31 lines
716 B
YAML

global:
  scrape_interval: 15s
  evaluation_interval: 15s
rule_files:
  - 'alert-rules.yml'
  - 'slo-rules.yml'
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
scrape_configs:
  - job_name: 'goodgo-api'
    metrics_path: '/metrics'
    static_configs:
      # host.docker.internal for dev (API on host), api:3001 for prod (API in container)
      - targets: ['host.docker.internal:3001']
        labels:
          service: 'goodgo-api'
          environment: 'development'
      - targets: ['api:3001']
        labels:
          service: 'goodgo-api'
          environment: 'production'
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']