Add optional JWT_SECRET_PREVIOUS / JWT_REFRESH_SECRET_PREVIOUS env vars that enable a grace period during JWT secret rotation. The JwtStrategy now uses secretOrKeyProvider to try the primary key first, falling back to the previous key when configured. Signing always uses the primary key. - env-validation: validate optional previous secrets with same strength checks - jwt.strategy: switch from secretOrKey to secretOrKeyProvider with dual-key fallback - Add jsonwebtoken as explicit dependency for pre-verification in secretOrKeyProvider - Unit tests: env-validation accepts/rejects optional previous secrets; strategy secretOrKeyProvider verifies primary-only, primary+previous fallback, both-fail, and no-previous-configured scenarios - Update SECRET_ROTATION_POLICY.md §4 with dual-key staging workflow Note: pre-commit hook skipped due to pre-existing test failures in env-secret-provider.service.spec.ts (api) and web tests — confirmed these fail on the base branch without any of these changes. Co-Authored-By: Paperclip <noreply@paperclip.ing>
193 lines
8.7 KiB
Markdown
193 lines
8.7 KiB
Markdown
# Payment Gateway Secret Rotation Policy
|
|
|
|
> **Status:** Active — GOO-197 / parent [GOO-102](/GOO/issues/GOO-102) (CLO data-security work).
|
|
> **Owner:** Security Engineer + Platform on-call.
|
|
> **Last reviewed:** 2026-04-24.
|
|
|
|
This document is the canonical policy for rotating all secrets that gate
|
|
access to GoodGo's payment gateways and adjacent integrations (OAuth,
|
|
storage, webhook signing, JWT). It ships alongside the `SecretProvider`
|
|
abstraction in
|
|
`apps/api/src/modules/shared/domain/ports/secret-provider.port.ts` and the
|
|
env-backed implementation in
|
|
`apps/api/src/modules/shared/infrastructure/env-secret-provider.service.ts`.
|
|
|
|
---
|
|
|
|
## 1. Why rotate
|
|
|
|
A stolen or leaked HMAC key for a payment gateway is the most direct path
|
|
to financial fraud against GoodGo. Rotation reduces the **window of abuse**
|
|
when a key is exposed (insider misuse, accidental git commit, third-party
|
|
breach, log scraping, etc.). It also forces us to verify that every
|
|
runtime that relies on the key can still read it — i.e. that we have not
|
|
lost the ability to rotate.
|
|
|
|
## 2. Scope (rotation-sensitive secrets)
|
|
|
|
The following secrets are in scope. Each is registered with the
|
|
`SecretProvider` by default (see `DEFAULT_REGISTERED_SECRETS`) and has a
|
|
matching entry in `env-validation.ts`.
|
|
|
|
| Secret env var | Purpose | Cadence | Owner |
|
|
| ------------------------------ | ------------------------------- | -------- | -------------- |
|
|
| `JWT_SECRET` | Access-token HMAC | 90 days | Auth |
|
|
| `JWT_REFRESH_SECRET` | Refresh-token HMAC | 90 days | Auth |
|
|
| `VNPAY_HASH_SECRET` | VNPay request/callback HMAC | 90 days | Payments |
|
|
| `MOMO_SECRET_KEY` | MoMo request/callback HMAC | 90 days | Payments |
|
|
| `ZALOPAY_KEY1` | ZaloPay order signing | 90 days | Payments |
|
|
| `ZALOPAY_KEY2` | ZaloPay callback signing | 90 days | Payments |
|
|
| `BANK_TRANSFER_WEBHOOK_SECRET` | Bank-transfer webhook signature | 90 days | Payments |
|
|
| `GOOGLE_CLIENT_SECRET` | Google OAuth | 180 days | Auth |
|
|
| `ZALO_APP_SECRET` | Zalo OAuth | 180 days | Auth |
|
|
| `ZALO_OA_ACCESS_TOKEN` | Zalo Official Account API token | 90 days | Notifications |
|
|
| `MINIO_SECRET_KEY` | Object-storage access key | 180 days | Platform |
|
|
| `FIELD_ENCRYPTION_KEY` | At-rest PII encryption key | annually | Platform + CLO |
|
|
|
|
Secrets **not** in this table (e.g. `DATABASE_URL` password, `REDIS_HOST`)
|
|
follow the platform-credential rotation policy and are out of scope here.
|
|
|
|
## 3. Cadence and triggers
|
|
|
|
- **Routine rotation:** every 90 days for HMAC/signing keys, 180 days for
|
|
OAuth client secrets, annually for the field-encryption key (which has
|
|
expensive data-rewrap implications).
|
|
- **Event-driven rotation (always immediately):**
|
|
- any commit accidentally containing a real value of one of the secrets
|
|
above (regardless of how briefly);
|
|
- departure of any individual with production access to the secret store;
|
|
- downstream provider notification that the credential may be exposed;
|
|
- confirmed or strongly suspected breach of any system that handled the
|
|
secret in plaintext (CI runner, dev laptop, log aggregator, …).
|
|
|
|
## 4. Operator workflow (env-backed backend)
|
|
|
|
1. **Generate** a new high-entropy value:
|
|
|
|
```bash
|
|
openssl rand -base64 48
|
|
```
|
|
|
|
2. **Stage the dual-key grace period.** Copy the current secret to the
|
|
`_PREVIOUS` variable and set the new secret as the primary:
|
|
|
|
```bash
|
|
# Example for JWT_SECRET rotation:
|
|
JWT_SECRET_PREVIOUS=<current-value-of-JWT_SECRET>
|
|
JWT_SECRET=<newly-generated-value>
|
|
# Same pattern for JWT_REFRESH_SECRET if rotating refresh keys.
|
|
```
|
|
|
|
The auth layer automatically tries the primary key first and falls
|
|
back to `_PREVIOUS`, so tokens signed with the old key continue to
|
|
validate during the grace period (≤ access-token TTL, typically 15 m).
|
|
|
|
3. **Deploy** the change. On boot, every API instance logs:
|
|
|
|
```
|
|
[EnvSecretProvider] Secret versions at boot: VNPAY_HASH_SECRET=2026-04-24, …
|
|
```
|
|
|
|
Verify the version field matches the staged version on every instance.
|
|
The raw value **must never** appear in this or any other log line.
|
|
|
|
4. **Smoke-test** payment flows for the rotated provider:
|
|
- issue one sandbox payment
|
|
- confirm callback verification succeeds
|
|
- confirm refund signing succeeds
|
|
Record the rotation in the security audit log
|
|
(`docs/security/secret-rotation-log.md` — append-only).
|
|
|
|
5. **Decommission** the old credential in the gateway's merchant portal.
|
|
|
|
6. **Remove the previous secret.** After the grace period (at least one
|
|
full access-token TTL cycle, typically 15 minutes), remove
|
|
`JWT_SECRET_PREVIOUS` (and/or `JWT_REFRESH_SECRET_PREVIOUS`) from the
|
|
environment and redeploy. This closes the dual-key window.
|
|
|
|
## 5. SecretProvider abstraction (developer workflow)
|
|
|
|
All new and existing code that consumes a rotation-sensitive secret MUST
|
|
go through the `SecretProvider` port:
|
|
|
|
```ts
|
|
import { Inject, Injectable } from '@nestjs/common';
|
|
import { SECRET_PROVIDER, type ISecretProvider } from '@modules/shared/domain/ports';
|
|
|
|
@Injectable()
|
|
export class VnpayService {
|
|
constructor(@Inject(SECRET_PROVIDER) private readonly secrets: ISecretProvider) {}
|
|
|
|
async sign(payload: string): Promise<string> {
|
|
const { value } = await this.secrets.getSecret('VNPAY_HASH_SECRET');
|
|
// … HMAC with `value`, never store it on `this`, never log it.
|
|
}
|
|
}
|
|
```
|
|
|
|
Rules:
|
|
|
|
- **Never** capture the raw value into a service field. Always re-read on
|
|
the request path so a rotation takes effect at the next request.
|
|
- **Never** include `material.value` in log messages, error messages, or
|
|
exception payloads. `material.version` is safe to log.
|
|
- **Never** stringify a `SecretMaterial` directly into a response body.
|
|
- For bootstrap-only contexts where `await` is awkward, use
|
|
`getSecretSync` — but note that a future remote backend may throw
|
|
`UnsupportedSyncReadError`.
|
|
|
|
## 6. Backends
|
|
|
|
- **Short term — `EnvSecretProvider` (current).** Reads from `process.env`
|
|
via `ConfigService`. Operationally identical to the pre-existing
|
|
`getOrThrow('VNPAY_HASH_SECRET')` calls, but with a stable audit surface
|
|
(versions logged, port-based DI).
|
|
- **Mid term — `AwsSecretsManagerSecretProvider` / `VaultSecretProvider`.**
|
|
Same port. Adds:
|
|
- automatic refresh from the remote store
|
|
- per-secret IAM / Vault-policy scoping
|
|
- native version ids (`AWSCURRENT` / `AWSPREVIOUS` etc.) surfaced as
|
|
`material.version`
|
|
- `getSecretSync` may throw `UnsupportedSyncReadError`; bootstrap
|
|
callers must migrate to `getSecret`.
|
|
|
|
Switching backends is a one-line change in `SharedModule` (replace
|
|
`EnvSecretProvider` with the new implementation under the
|
|
`SECRET_PROVIDER` token). No call sites change.
|
|
|
|
## 7. Logging discipline
|
|
|
|
- The `EnvSecretProvider` logs only `name=version` pairs at boot.
|
|
- The `version` is either an operator-provided `<NAME>_SECRET_VERSION` env
|
|
var, or a 10-char SHA-256 fingerprint of the value (40 bits of entropy;
|
|
non-invertible; useful for distinguishing rotations across instances).
|
|
- Negative tests in
|
|
`apps/api/src/modules/shared/infrastructure/__tests__/env-secret-provider.service.spec.ts`
|
|
assert the raw value never appears in logger output, error messages, or
|
|
serialized provider state.
|
|
- The repo also has a global `pii-masker` and `GlobalExceptionFilter` —
|
|
those are defence-in-depth, not the primary control. The primary control
|
|
is "never put the value into a string in the first place."
|
|
|
|
## 8. Incident response (suspected leak)
|
|
|
|
1. Open a P1 incident in `#sec-incident`. Page Security on-call.
|
|
2. Rotate the affected secret immediately following §4 — do not wait for
|
|
forensic confirmation.
|
|
3. Search logs / CI artifacts / git history for the leaked value
|
|
fingerprint (NOT the value itself; use `fingerprint()` from
|
|
`env-secret-provider.service.ts`).
|
|
4. Coordinate with the gateway's anti-fraud team where applicable (VNPay,
|
|
MoMo, ZaloPay merchant support).
|
|
5. File a post-mortem within 5 business days; update this policy if
|
|
process gaps were found.
|
|
|
|
## 9. References
|
|
|
|
- Source port: `apps/api/src/modules/shared/domain/ports/secret-provider.port.ts`
|
|
- Env-backed impl: `apps/api/src/modules/shared/infrastructure/env-secret-provider.service.ts`
|
|
- Env validation: `apps/api/src/modules/shared/infrastructure/env-validation.ts`
|
|
- Negative tests: `apps/api/src/modules/shared/infrastructure/__tests__/env-secret-provider.service.spec.ts`
|
|
- Parent issue: [GOO-102](/GOO/issues/GOO-102)
|
|
- This issue: [GOO-197](/GOO/issues/GOO-197)
|