feat(auth): add row/size caps + streaming to export-user-data

- Add per-collection row cap (default 10k, env EXPORT_ROW_CAP) via Prisma take on all findMany calls
- Add total size cap (default 100MB, env EXPORT_SIZE_CAP_MB); throws PayloadTooLargeException (413) when exceeded
- Convert response to Node.js Readable stream piped via NestJS StreamableFile to avoid large in-memory buffers
- Export ExportUserDataResult interface (stream + truncated flag) from handler
- Update controller to set Content-Type/Content-Disposition headers and return StreamableFile
- Document EXPORT_ROW_CAP and EXPORT_SIZE_CAP_MB env vars in Swagger
- Extend tests: row-cap assertion (take arg), size-cap 413 path, stream assertions

Fixes GOO-223 (M-1 from GOO-200 audit).

Co-Authored-By: Paperclip <noreply@paperclip.ing>
50
apps/api/docs/observability/README.md
Normal file
@@ -0,0 +1,50 @@
# Observability — Read-Model / Projector (RFC-003 Phase 0)

Grafana dashboards and wiring notes for the read-model observability stack
introduced in [GOO-192](/GOO/issues/GOO-192) under [GOO-94](/GOO/issues/GOO-94) §6 Phase 0.

## Metrics

All metrics live in the existing NestJS `metrics/` module
(`apps/api/src/modules/metrics/`) and are scraped via the standard `/metrics`
endpoint.

| Metric                                  | Type      | Labels    | Purpose                                                   |
| --------------------------------------- | --------- | --------- | --------------------------------------------------------- |
| `read_model_projector_lag_seconds`      | Gauge     | `handler` | Seconds between latest source event and projector cursor. |
| `read_model_refresh_duration_seconds`   | Histogram | `view`    | Duration of read-model / materialised view refreshes.     |
| `read_model_reconciliation_drift_total` | Counter   | `model`   | Count of drift discrepancies found during reconciliation. |
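
As a rough illustration of the lag gauge's definition in the table above, the value can be derived from two timestamps. This is only a sketch of how a caller might compute it: `computeLagSeconds` and its parameters are hypothetical names, not part of the `metrics/` module, and clamping negative values to zero is an assumption.

```ts
// Hypothetical helper, not part of the metrics module.
// Lag = timestamp of the latest source event minus the timestamp of the
// event at the projector cursor, clamped at zero and expressed in seconds.
function computeLagSeconds(latestEventAt: Date, cursorAt: Date): number {
  const deltaMs = latestEventAt.getTime() - cursorAt.getTime();
  return Math.max(0, deltaMs) / 1000;
}
```

The result could then be passed as the `lagSeconds` argument of `metrics.setProjectorLag`.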

### Emit points

Inject `MetricsService` and call:

```ts
metrics.setProjectorLag(handler, lagSeconds);
metrics.recordReadModelRefresh(view, durationSeconds);
metrics.recordReconciliationDrift(model, count); // `count` is optional
```
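
One way to feed `recordReadModelRefresh` is to wrap the refresh in a small timing helper. The sketch below is an assumption rather than code from this change: `RefreshRecorder` and `timedRefresh` are hypothetical names, and the `string` / `number` parameter types are guessed from the call shape above.

```ts
// Minimal structural type for the one method this sketch needs; the real
// MetricsService is assumed (not confirmed) to satisfy it.
interface RefreshRecorder {
  recordReadModelRefresh(view: string, durationSeconds: number): void;
}

// Hypothetical wrapper: runs `refresh` and records its duration in seconds,
// whether the refresh resolves or throws.
async function timedRefresh<T>(
  metrics: RefreshRecorder,
  view: string,
  refresh: () => Promise<T>,
): Promise<T> {
  // Date.now() keeps the sketch dependency-free; a monotonic clock is
  // preferable for real measurements.
  const startedAtMs = Date.now();
  try {
    return await refresh();
  } finally {
    metrics.recordReadModelRefresh(view, (Date.now() - startedAtMs) / 1000);
  }
}
```

Recording in `finally` means failed refreshes still contribute to the histogram, which may or may not be what the dashboard should show.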

## Dashboard

- File: `read-models-dashboard.json` (Grafana schema v38).
- Import into Grafana (`Dashboards → Import → Upload JSON`), pick the Prometheus
  data source.
- Variables: `handler`, `view`, `model` — derived from Prometheus label values.
- Panels:
  1. Projector lag by handler (time series + thresholded)
  2. Max projector lag (stat, RAG 30s / 120s)
  3. Refresh duration p50/p95 by view
  4. Refresh throughput (refreshes/sec) by view
  5. Reconciliation drift rate by model (15m rate)
  6. Total drift events in last 24h (stat, RAG 1 / 10)
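
The RAG pairs in the panel list (30s / 120s for lag, 1 / 10 for drift) are absolute thresholds: a value takes the colour of the highest step it has reached. A small sketch of that mapping follows; `ragStatus` is a hypothetical helper for illustration only and is not used by the dashboard itself.

```ts
type Rag = "green" | "yellow" | "red";

// Returns the colour of the highest threshold step that `value` has reached.
// `yellowAt` and `redAt` are the two step values, e.g. 30 and 120 for lag.
function ragStatus(value: number, yellowAt: number, redAt: number): Rag {
  if (value >= redAt) return "red";
  if (value >= yellowAt) return "yellow";
  return "green";
}
```

For example, `ragStatus(45, 30, 120)` yields `"yellow"`, and a single drift event in 24h already turns panel 6 yellow under the 1 / 10 thresholds.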

## Local verification

```bash
pnpm --filter @goodgo/api dev
curl -s http://localhost:3001/metrics | grep read_model_
```

All three metric families should appear with `# HELP` / `# TYPE` headers even
before any samples are recorded.
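
The same check can be scripted instead of eyeballing the `grep` output. The sketch below assumes the endpoint returns the standard Prometheus text exposition format; `missingFamilies` is a hypothetical helper, not something shipped in this change.

```ts
const FAMILIES: string[] = [
  "read_model_projector_lag_seconds",
  "read_model_refresh_duration_seconds",
  "read_model_reconciliation_drift_total",
];

// Returns the metric families that lack a `# HELP` or a `# TYPE` line in a
// Prometheus text-format payload.
function missingFamilies(body: string, families: string[] = FAMILIES): string[] {
  const lines = body.split("\n");
  return families.filter((name) => {
    const hasHelp = lines.some((line) => line.startsWith(`# HELP ${name} `));
    const hasType = lines.some((line) => line.startsWith(`# TYPE ${name} `));
    return !(hasHelp && hasType);
  });
}
```

Passing the response body of `/metrics` to `missingFamilies` should give an empty array when all three families are exposed.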
77
apps/api/docs/observability/read-models-dashboard.json
Normal file
@@ -0,0 +1,77 @@
{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "graphTooltip": 1,
  "id": null,
  "uid": "goodgo-read-models",
  "title": "GoodGo · Read-Model Observability (RFC-003 Phase 0)",
  "tags": ["goodgo", "rfc-003", "read-models", "observability"],
  "timezone": "browser",
  "schemaVersion": 38,
  "version": 1,
  "refresh": "30s",
  "time": { "from": "now-6h", "to": "now" },
  "templating": {
    "list": [
      { "name": "datasource", "type": "datasource", "query": "prometheus", "current": { "text": "Prometheus", "value": "Prometheus" } },
      { "name": "handler", "type": "query", "datasource": "${datasource}", "query": "label_values(read_model_projector_lag_seconds, handler)", "includeAll": true, "multi": true, "refresh": 2 },
      { "name": "view", "type": "query", "datasource": "${datasource}", "query": "label_values(read_model_refresh_duration_seconds_bucket, view)", "includeAll": true, "multi": true, "refresh": 2 },
      { "name": "model", "type": "query", "datasource": "${datasource}", "query": "label_values(read_model_reconciliation_drift_total, model)", "includeAll": true, "multi": true, "refresh": 2 }
    ]
  },
  "panels": [
    {
      "id": 1, "type": "timeseries", "title": "Projector lag (seconds) — by handler",
      "datasource": "${datasource}", "gridPos": { "h": 8, "w": 12, "x": 0, "y": 0 },
      "fieldConfig": { "defaults": { "unit": "s", "thresholds": { "mode": "absolute", "steps": [{ "color": "green", "value": null }, { "color": "yellow", "value": 30 }, { "color": "red", "value": 120 }] } } },
      "targets": [{ "expr": "read_model_projector_lag_seconds{handler=~\"$handler\"}", "legendFormat": "{{handler}}", "refId": "A" }]
    },
    {
      "id": 2, "type": "stat", "title": "Max projector lag (current)",
      "datasource": "${datasource}", "gridPos": { "h": 8, "w": 12, "x": 12, "y": 0 },
      "fieldConfig": { "defaults": { "unit": "s", "thresholds": { "mode": "absolute", "steps": [{ "color": "green", "value": null }, { "color": "yellow", "value": 30 }, { "color": "red", "value": 120 }] } } },
      "options": { "reduceOptions": { "calcs": ["lastNotNull"] } },
      "targets": [{ "expr": "max(read_model_projector_lag_seconds{handler=~\"$handler\"})", "refId": "A" }]
    },
    {
      "id": 3, "type": "timeseries", "title": "Refresh duration p50/p95 — by view",
      "datasource": "${datasource}", "gridPos": { "h": 8, "w": 12, "x": 0, "y": 8 },
      "fieldConfig": { "defaults": { "unit": "s" } },
      "targets": [
        { "expr": "histogram_quantile(0.95, sum by (view, le) (rate(read_model_refresh_duration_seconds_bucket{view=~\"$view\"}[5m])))", "legendFormat": "p95 · {{view}}", "refId": "A" },
        { "expr": "histogram_quantile(0.50, sum by (view, le) (rate(read_model_refresh_duration_seconds_bucket{view=~\"$view\"}[5m])))", "legendFormat": "p50 · {{view}}", "refId": "B" }
      ]
    },
    {
      "id": 4, "type": "timeseries", "title": "Refresh throughput (refreshes/sec) — by view",
      "datasource": "${datasource}", "gridPos": { "h": 8, "w": 12, "x": 12, "y": 8 },
      "fieldConfig": { "defaults": { "unit": "ops" } },
      "targets": [{ "expr": "sum by (view) (rate(read_model_refresh_duration_seconds_count{view=~\"$view\"}[5m]))", "legendFormat": "{{view}}", "refId": "A" }]
    },
    {
      "id": 5, "type": "timeseries", "title": "Reconciliation drift rate — by model",
      "datasource": "${datasource}", "gridPos": { "h": 8, "w": 12, "x": 0, "y": 16 },
      "fieldConfig": { "defaults": { "unit": "ops" } },
      "targets": [{ "expr": "sum by (model) (rate(read_model_reconciliation_drift_total{model=~\"$model\"}[15m]))", "legendFormat": "{{model}}", "refId": "A" }]
    },
    {
      "id": 6, "type": "stat", "title": "Total drift events (last 24h)",
      "datasource": "${datasource}", "gridPos": { "h": 8, "w": 12, "x": 12, "y": 16 },
      "fieldConfig": { "defaults": { "unit": "short", "thresholds": { "mode": "absolute", "steps": [{ "color": "green", "value": null }, { "color": "yellow", "value": 1 }, { "color": "red", "value": 10 }] } } },
      "options": { "reduceOptions": { "calcs": ["lastNotNull"] } },
      "targets": [{ "expr": "sum by (model) (increase(read_model_reconciliation_drift_total{model=~\"$model\"}[24h]))", "legendFormat": "{{model}}", "refId": "A" }]
    }
  ]
}