- Move 8 stray .md (+5 .txt) from ~/Desktop into docs/explorations/from-desktop/
- Reorganize 27 .md/.txt at workspace root:
  - audit reports -> docs/audits/
  - exploration reports -> docs/explorations/
  - design system -> docs/design-system/
- Keep only README/CHANGELOG/CONTRIBUTING/CLAUDE at repo root
- Refresh docs/README.md as canonical index with links to all groups
- Note: pre-existing docs/audits/AUDIT_INDEX.md and AUDIT_SUMMARY.md were
overwritten by the newer root-level versions during the move
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Complete the final batch of the task to convert all documentation to Vietnamese.
Translated the 22 remaining `.md` files (~9.7k lines), including RUNBOOK, audits,
docs/architecture, docs/load-testing, libs READMEs, and the quick references.
Code blocks, paths, technical identifiers, URLs, and environment variables are left unchanged.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
The .md files (CLAUDE.md, architecture docs) already referenced Next.js 15
correctly. Fixed the two remaining .txt audit files that still said Next.js 14.
libs/ai-services and libs/mcp-servers were already documented in CLAUDE.md
and both had comprehensive READMEs.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
- Update stale Next.js 14 references to 15 in audit docs
- Add libs/ai-services and libs/mcp-servers to CLAUDE.md project structure
Resolves TEC-2259
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Previously, `docker image prune` ran immediately after deploying new
containers, potentially deleting the old images needed for rollback
if smoke tests subsequently failed. Now the deploy pipeline:
1. Tags current images as :rollback before pulling new versions
2. Only runs `docker image prune` after smoke tests pass
3. Uses explicit :rollback tags for rollback instead of relying on
Docker layer cache (which is fragile)
Applied to:
- scripts/deploy-production.sh (manual deploy script)
- .github/workflows/deploy.yml (staging + production CI jobs)
- docs/deployment.md (updated rollback documentation)
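
The ordering described above can be sketched as a small planner that emits the
docker commands in sequence. This is only an illustration of the sequencing, not
the contents of scripts/deploy-production.sh: the image names, the `:latest` tag,
and the `docker compose up -d` step are assumptions.

```python
from typing import List


def plan_deploy(images: List[str], smoke_tests_pass: bool) -> List[str]:
    """Return docker commands in the order the pipeline is described to run them.

    Image names, the ':latest' tag, and 'docker compose up -d' are hypothetical.
    """
    cmds: List[str] = []
    # 1. Keep the currently running images under an explicit :rollback tag
    #    before anything new is pulled.
    for image in images:
        cmds.append(f"docker tag {image}:latest {image}:rollback")
    for image in images:
        cmds.append(f"docker pull {image}:latest")
    cmds.append("docker compose up -d")
    if smoke_tests_pass:
        # 2. Prune only after the smoke tests have passed.
        cmds.append("docker image prune -f")
    else:
        # 3. Roll back via the explicit :rollback tags; no prune has run yet,
        #    so the old images are still present.
        for image in images:
            cmds.append(f"docker tag {image}:rollback {image}:latest")
        cmds.append("docker compose up -d")
    return cmds
```

On the failure path no prune command is emitted at all, which is the property
the change is meant to guarantee.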
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Expand production monitoring with full alert coverage for database connections,
Redis memory/connections, container resources, disk usage, service health, and
backup integrity. Add Alertmanager service with Slack routing for critical and
warning alerts, and add automated backup verification to the pg-backup cron
schedule. Update runbook with DR validation procedures and quarterly checklist.
- Expand Prometheus alert rules from 4 to 24 alerts across 7 groups
- Add Alertmanager container (prom/alertmanager:v0.27.0) with Slack routing
- Configure inhibition rules (critical suppresses warning for same service)
- Schedule automated backup verification at 04:00 UTC daily
- Add Alertmanager datasource to Grafana provisioning
- Update runbook with Section 9: DR Validation (automated + manual procedures)
- Add SLACK_WEBHOOK_URL and Grafana vars to .env.example
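
The inhibition behaviour (critical suppresses warning for the same service) can
be sketched as a filter over firing alerts. This is a simplified model of what
an Alertmanager inhibit rule does, not the actual configuration; the label names
`severity` and `service` and the alert names in the usage are assumptions.

```python
from typing import Dict, List

Alert = Dict[str, str]


def apply_inhibition(alerts: List[Alert]) -> List[Alert]:
    """Drop warning alerts for any service that also has a critical alert firing.

    Assumes each alert carries 'severity' and 'service' labels (hypothetical names).
    """
    critical_services = {
        alert["service"] for alert in alerts if alert["severity"] == "critical"
    }
    return [
        alert
        for alert in alerts
        if not (
            alert["severity"] == "warning"
            and alert["service"] in critical_services
        )
    ]
```

A warning for `redis` would still be delivered while `postgres` has a critical
alert firing, since inhibition only applies when the service label matches.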
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Move remaining root-level audit and CQRS handler analysis files
to the centralized docs/audits/ directory for consistency.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Move 36 root-level audit/analysis documents and 7 web app audit documents
into docs/audits/ directory to declutter the project root. Remove stale
EXPLORATION_SUMMARY.txt.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
- Trigger deploy workflow on push to `develop` branch (in addition to `master`)
- Add `staging-latest` Docker image tag for develop branch builds
- Add `rollback-staging` job: auto-reverts to previous images on smoke test failure
- Add Slack success notification for staging deploys (previously only failures were reported)
- Record pre-deploy image digests for rollback capability
- Update deployment docs with CI/CD pipeline details, rollback procedures, and required secrets
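
Recording digests before a deploy makes rollback independent of mutable tags
such as `staging-latest`. A minimal sketch of turning recorded digests into
pinned image references follows; the image names and digest values are
hypothetical, and the workflow's real storage format is not shown here.

```python
from typing import Dict, List


def rollback_references(pre_deploy_digests: Dict[str, str]) -> List[str]:
    """Build 'image@sha256:...' references from digests recorded before deploy."""
    refs: List[str] = []
    for image, digest in sorted(pre_deploy_digests.items()):
        if not digest.startswith("sha256:"):
            raise ValueError(f"unexpected digest format for {image}: {digest}")
        # A reference of the form name@digest always resolves to the same image,
        # even after the tag has moved to a newer build.
        refs.append(f"{image}@{digest}")
    return refs
```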
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Create comprehensive docs/RUNBOOK.md covering all 14 production services,
health checks, 10 common incident scenarios with diagnosis/resolution,
recovery procedures (DB restore, Redis flush, rolling restart, rollback),
escalation matrix, monitoring dashboards, and PromQL queries.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
- Fix port numbers across all docs (API :3001, Web :3000)
- Add 6 missing modules to README, CLAUDE.md, and architecture doc
(agents, health, inquiries, leads, reviews, metrics/web-vitals)
- Add Swagger UI reference and /api/v1 prefix notes
- Create docs/api-endpoints.md with complete REST API reference
- Create docs/README.md as documentation index
- Update deployment guide with Loki, Promtail, pg-backup services
- Update health check table with all current endpoints
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Root directory had accumulated audit/exploration markdown files cluttering
the project root. Moved all audit-related files to docs/audits/ with a
README.md index, and updated cross-references in K6_LOAD_TESTING_GUIDE.md
and README_FRONTEND_DOCS.md.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Document all 33 structured errorCode values from DomainException/ErrorCode
enum across all modules (auth, user, listing, property, media, payment,
subscription, course). Includes HTTP status mapping, Vietnamese error
messages, usage examples per module, alphabetical quick-reference table,
and TypeScript integration guide for frontend error handling.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
- Remove hardcoded minioadmin/minioadmin_secret fallback from docker-compose.yml and
require MINIO_ACCESS_KEY/MINIO_SECRET_KEY env vars (fail-fast with :? syntax)
- Align docker-compose.yml env var names with .env.example (MINIO_ACCESS_KEY/SECRET_KEY)
- Update CI e2e workflow to use GitHub vars with non-default fallbacks
- Update .env.test to use non-default test credentials
- Add @aws-sdk/s3-request-presigner and getPresignedUploadUrl() method to
MinioMediaStorageService for properly signed client-side uploads
- Remove hardcoded credentials from dev-environment docs
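
The `:?` interpolation form makes Compose abort when a variable is unset or
empty instead of silently falling back to a default. The same fail-fast idea,
sketched for application code under the assumption that the service reads the
same variable names; `require_env` is a hypothetical helper, not part of the
codebase.

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the variable's value, or raise if it is unset or empty."""
    value = env.get(name, "")
    if not value:
        # Mirrors the ':?' behaviour: no default credential is ever substituted.
        raise RuntimeError(f"{name} must be set and non-empty")
    return value
```

Calling `require_env("MINIO_ACCESS_KEY")` at startup surfaces a missing
credential immediately rather than at the first storage request.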
Co-Authored-By: Paperclip <noreply@paperclip.ing>
- Add pg-backup container with daily automated pg_dump (02:00 UTC) and 7-day retention
- Add backup/restore scripts with documented recovery procedure
- Add Loki + Promtail for centralized log aggregation from all Docker containers
- Add Loki as Grafana datasource with correlation ID derived fields
- Add Grafana logs dashboard with volume, error rate, HTTP request, and log viewer panels
- Configure Promtail to parse Pino structured JSON logs with level/context labels
- Enhance LoggerService with string-level formatter and service base field
- Configure 15-day log retention in Loki
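
The 7-day backup retention can be illustrated as a selection of dump files older
than a cutoff. This is a sketch of the policy only; the pg-backup container may
implement it differently, and the file names in the usage are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def expired_backups(
    backups: Dict[str, datetime], now: datetime, retention_days: int = 7
) -> List[str]:
    """Return names of backups created strictly before the retention cutoff."""
    cutoff = now - timedelta(days=retention_days)
    return sorted(name for name, created in backups.items() if created < cutoff)
```

A dump created exactly `retention_days` ago is kept under this comparison; only
strictly older dumps are selected for deletion.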
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Set up code quality tooling for the monorepo:
- ESLint 9 flat config with TypeScript, import ordering, and NestJS rules
- Prettier with consistent formatting across all files
- dependency-cruiser enforcing module boundary rules (no cross-module internals, no circular deps)
- Husky + lint-staged for pre-commit hooks
- Auto-fixed existing files for type imports and import ordering
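
The "no circular deps" rule amounts to cycle detection on the module import
graph. The sketch below shows that check with a depth-first search; it is not
how dependency-cruiser is implemented or configured, and the module names in
the usage are only illustrative.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one import cycle as a list of modules, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state: Dict[str, int] = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = GREY
        path.append(node)
        for dep in graph.get(node, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path, so the path loops back.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for node in graph:
        if state[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

For example, `find_cycle({"listing": ["property"], "property": ["listing"]})`
reports the loop between the two modules, while an acyclic graph yields `None`.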
Co-Authored-By: Paperclip <noreply@paperclip.ing>