Extract shared `verifyWithRotation` helper and `makeSecretOrKeyProvider` into `jwt-rotation.ts` so both REST (passport-jwt strategy) and WebSocket (TokenService.verifyAccessToken) paths honour JWT_SECRET_PREVIOUS during secret rotation. Add env-validation for optional previous secrets and document the rotation policy for WebSocket sessions. Resolves GOO-237 Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
193 lines
8.7 KiB
Markdown
193 lines
8.7 KiB
Markdown
# Payment Gateway Secret Rotation Policy
|
|
|
|
> **Status:** Active — GOO-197 / parent [GOO-102](/GOO/issues/GOO-102) (CLO data-security work).
|
|
> **Owner:** Security Engineer + Platform on-call.
|
|
> **Last reviewed:** 2026-04-24.
|
|
|
|
This document is the canonical policy for rotating all secrets that gate
|
|
access to GoodGo's payment gateways and adjacent integrations (OAuth,
|
|
storage, webhook signing, JWT). It ships alongside the `SecretProvider`
|
|
abstraction in
|
|
`apps/api/src/modules/shared/domain/ports/secret-provider.port.ts` and the
|
|
env-backed implementation in
|
|
`apps/api/src/modules/shared/infrastructure/env-secret-provider.service.ts`.
|
|
|
|
---
|
|
|
|
## 1. Why rotate
|
|
|
|
A stolen or leaked HMAC key for a payment gateway is the most direct path
|
|
to financial fraud against GoodGo. Rotation reduces the **window of abuse**
|
|
when a key is exposed (insider misuse, accidental git commit, third-party
|
|
breach, log scraping, etc.). It also forces us to verify that every
|
|
runtime that relies on the key can still read it — i.e. that we have not
|
|
lost the ability to rotate.
|
|
|
|
## 2. Scope (rotation-sensitive secrets)
|
|
|
|
The following secrets are in scope. Each is registered with the
|
|
`SecretProvider` by default (see `DEFAULT_REGISTERED_SECRETS`) and has a
|
|
matching entry in `env-validation.ts`.
|
|
|
|
| Secret env var | Purpose | Cadence | Owner |
|
|
| ------------------------------ | ------------------------------- | -------- | -------------- |
|
|
| `JWT_SECRET` | Access-token HMAC | 90 days | Auth |
|
|
| `JWT_REFRESH_SECRET` | Refresh-token HMAC | 90 days | Auth |
|
|
| `VNPAY_HASH_SECRET` | VNPay request/callback HMAC | 90 days | Payments |
|
|
| `MOMO_SECRET_KEY` | MoMo request/callback HMAC | 90 days | Payments |
|
|
| `ZALOPAY_KEY1` | ZaloPay order signing | 90 days | Payments |
|
|
| `ZALOPAY_KEY2` | ZaloPay callback signing | 90 days | Payments |
|
|
| `BANK_TRANSFER_WEBHOOK_SECRET` | Bank-transfer webhook signature | 90 days | Payments |
|
|
| `GOOGLE_CLIENT_SECRET` | Google OAuth | 180 days | Auth |
|
|
| `ZALO_APP_SECRET` | Zalo OAuth | 180 days | Auth |
|
|
| `ZALO_OA_ACCESS_TOKEN` | Zalo Official Account API token | 90 days | Notifications |
|
|
| `MINIO_SECRET_KEY` | Object-storage access key | 180 days | Platform |
|
|
| `FIELD_ENCRYPTION_KEY` | At-rest PII encryption key | annually | Platform + CLO |
|
|
|
|
Secrets **not** in this table (e.g. `DATABASE_URL` password, `REDIS_HOST`)
|
|
follow the platform-credential rotation policy and are out of scope here.
|
|
|
|
## 3. Cadence and triggers
|
|
|
|
- **Routine rotation:** every 90 days for HMAC/signing keys, 180 days for
|
|
OAuth client secrets, annually for the field-encryption key (which has
|
|
expensive data-rewrap implications).
|
|
- **Event-driven rotation (always immediately):**
|
|
- any commit accidentally containing a real value of one of the secrets
|
|
above (regardless of how briefly);
|
|
- departure of any individual with production access to the secret store;
|
|
- downstream provider notification that the credential may be exposed;
|
|
- confirmed or strongly suspected breach of any system that handled the
|
|
secret in plaintext (CI runner, dev laptop, log aggregator, …).
|
|
|
|
## 4. Operator workflow (env-backed backend)
|
|
|
|
1. **Generate** a new high-entropy value:
|
|
|
|
```bash
|
|
openssl rand -base64 48
|
|
```
|
|
|
|
2. **Stage the dual-key grace period.** Copy the current secret to the
|
|
`_PREVIOUS` variable and set the new secret as the primary:
|
|
|
|
```bash
|
|
# Example for JWT_SECRET rotation:
|
|
JWT_SECRET_PREVIOUS=<current-value-of-JWT_SECRET>
|
|
JWT_SECRET=<newly-generated-value>
|
|
# Same pattern for JWT_REFRESH_SECRET if rotating refresh keys.
|
|
```
|
|
|
|
The auth layer automatically tries the primary key first and falls
|
|
back to `_PREVIOUS`, so tokens signed with the old key continue to
|
|
validate during the grace period (≤ access-token TTL, typically 15 m).
|
|
|
|
3. **Deploy** the change. On boot, every API instance logs:
|
|
|
|
```
|
|
[EnvSecretProvider] Secret versions at boot: VNPAY_HASH_SECRET=2026-04-24, …
|
|
```
|
|
|
|
Verify the version field matches the staged version on every instance.
|
|
The raw value **must never** appear in this or any other log line.
|
|
|
|
4. **Smoke-test** payment flows for the rotated provider:
|
|
- issue one sandbox payment
|
|
- confirm callback verification succeeds
|
|
- confirm refund signing succeeds
|
|
Record the rotation in the security audit log
|
|
(`docs/security/secret-rotation-log.md` — append-only).
|
|
|
|
5. **Decommission** the old credential in the gateway's merchant portal.
|
|
|
|
6. **Remove the previous secret.** After the grace period (at least one
|
|
full access-token TTL cycle, typically 15 minutes), remove
|
|
`JWT_SECRET_PREVIOUS` (and/or `JWT_REFRESH_SECRET_PREVIOUS`) from the
|
|
environment and redeploy. This closes the dual-key window.
|
|
|
|
## 5. SecretProvider abstraction (developer workflow)
|
|
|
|
All new and existing code that consumes a rotation-sensitive secret MUST
|
|
go through the `SecretProvider` port:
|
|
|
|
```ts
|
|
import { Inject, Injectable } from '@nestjs/common';
|
|
import { SECRET_PROVIDER, type ISecretProvider } from '@modules/shared/domain/ports';
|
|
|
|
@Injectable()
|
|
export class VnpayService {
|
|
constructor(@Inject(SECRET_PROVIDER) private readonly secrets: ISecretProvider) {}
|
|
|
|
async sign(payload: string): Promise<string> {
|
|
const { value } = await this.secrets.getSecret('VNPAY_HASH_SECRET');
|
|
// … HMAC with `value`, never store it on `this`, never log it.
|
|
}
|
|
}
|
|
```
|
|
|
|
Rules:
|
|
|
|
- **Never** capture the raw value into a service field. Always re-read on
|
|
the request path so a rotation takes effect at the next request.
|
|
- **Never** include `material.value` in log messages, error messages, or
|
|
exception payloads. `material.version` is safe to log.
|
|
- **Never** stringify a `SecretMaterial` directly into a response body.
|
|
- For bootstrap-only contexts where `await` is awkward, use
|
|
`getSecretSync` — but note that a future remote backend may throw
|
|
`UnsupportedSyncReadError`.
|
|
|
|
## 6. Backends
|
|
|
|
- **Short term — `EnvSecretProvider` (current).** Reads from `process.env`
|
|
via `ConfigService`. Operationally identical to the pre-existing
|
|
`getOrThrow('VNPAY_HASH_SECRET')` calls, but with a stable audit surface
|
|
(versions logged, port-based DI).
|
|
- **Mid term — `AwsSecretsManagerSecretProvider` / `VaultSecretProvider`.**
|
|
Same port. Adds:
|
|
- automatic refresh from the remote store
|
|
- per-secret IAM / Vault-policy scoping
|
|
- native version ids (`AWSCURRENT` / `AWSPREVIOUS` etc.) surfaced as
|
|
`material.version`
|
|
- `getSecretSync` may throw `UnsupportedSyncReadError`; bootstrap
|
|
callers must migrate to `getSecret`.
|
|
|
|
Switching backends is a one-line change in `SharedModule` (replace
|
|
`EnvSecretProvider` with the new implementation under the
|
|
`SECRET_PROVIDER` token). No call sites change.
|
|
|
|
## 7. Logging discipline
|
|
|
|
- The `EnvSecretProvider` logs only `name=version` pairs at boot.
|
|
- The `version` is either an operator-provided `<NAME>_SECRET_VERSION` env
|
|
var, or a 10-char SHA-256 fingerprint of the value (40 bits of entropy;
|
|
non-invertible; useful for distinguishing rotations across instances).
|
|
- Negative tests in
|
|
`apps/api/src/modules/shared/infrastructure/__tests__/env-secret-provider.service.spec.ts`
|
|
assert the raw value never appears in logger output, error messages, or
|
|
serialized provider state.
|
|
- The repo also has a global `pii-masker` and `GlobalExceptionFilter` —
|
|
those are defence-in-depth, not the primary control. The primary control
|
|
is "never put the value into a string in the first place."
|
|
|
|
## 8. Incident response (suspected leak)
|
|
|
|
1. Open a P1 incident in `#sec-incident`. Page Security on-call.
|
|
2. Rotate the affected secret immediately following §4 — do not wait for
|
|
forensic confirmation.
|
|
3. Search logs / CI artifacts / git history for the leaked value
|
|
fingerprint (NOT the value itself; use `fingerprint()` from
|
|
`env-secret-provider.service.ts`).
|
|
4. Coordinate with the gateway's anti-fraud team where applicable (VNPay,
|
|
MoMo, ZaloPay merchant support).
|
|
5. File a post-mortem within 5 business days; update this policy if
|
|
process gaps were found.
|
|
|
|
## 9. References
|
|
|
|
- Source port: `apps/api/src/modules/shared/domain/ports/secret-provider.port.ts`
|
|
- Env-backed impl: `apps/api/src/modules/shared/infrastructure/env-secret-provider.service.ts`
|
|
- Env validation: `apps/api/src/modules/shared/infrastructure/env-validation.ts`
|
|
- Negative tests: `apps/api/src/modules/shared/infrastructure/__tests__/env-secret-provider.service.spec.ts`
|
|
- Parent issue: [GOO-102](/GOO/issues/GOO-102)
|
|
- This issue: [GOO-197](/GOO/issues/GOO-197)
|