feat(notifications): production-ready WebSocket gateway (TEC-2766)

- Add RedisIoAdapter (shared/infra) for multi-instance Socket.IO fan-out
  with graceful fallback to the in-memory IoAdapter when Redis is
  unreachable.
- Pin Socket.IO heartbeat (pingInterval/pingTimeout/connectTimeout)
  via env-tunable gateway options for reconnect stability.
- Expose Prometheus metrics on /notifications: goodgo_ws_connected_clients
  (Gauge) and goodgo_ws_messages_total (Counter) with namespace/event/
  direction labels. Wired through MetricsService and tracked across
  connect/disconnect + emits.
- Unit tests: RedisIoAdapter connect/fallback/close, new MetricsService
  WS helpers, and gateway metric increments/decrements on auth paths.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
This commit is contained in:
Ho Ngoc Hai
2026-04-18 15:06:25 +07:00
parent 5d4ecdeb2f
commit 329a821b4a
13 changed files with 410 additions and 5 deletions

View File

@@ -15,6 +15,8 @@ import {
DB_QUERY_DURATION,
DB_POOL_ACTIVE_CONNECTIONS,
SEARCH_QUERY_DURATION,
GOODGO_WS_CONNECTED_CLIENTS,
GOODGO_WS_MESSAGES_TOTAL,
WEB_VITALS_LCP,
WEB_VITALS_FCP,
WEB_VITALS_CLS,
@@ -83,6 +85,18 @@ import { HttpMetricsInterceptor } from './presentation/interceptors/http-metrics
labelNames: ['plan'],
}),
// ── WebSocket Metrics ──
makeGaugeProvider({
name: GOODGO_WS_CONNECTED_CLIENTS,
help: 'Number of active WebSocket clients',
labelNames: ['namespace'],
}),
makeCounterProvider({
name: GOODGO_WS_MESSAGES_TOTAL,
help: 'Total number of WebSocket messages emitted/received',
labelNames: ['namespace', 'event', 'direction'],
}),
// ── Services & Interceptors ──
MetricsService,
HttpMetricsInterceptor,