feat: Add functional tests for MktZaloService, new contract and load tests, and audit documentation, while removing a legacy infrastructure project and updating service configurations.

This commit is contained in:
Ho Ngoc Hai
2026-03-25 15:00:05 +07:00
parent 7a752f4a82
commit 36a0a9c256
82 changed files with 9323 additions and 5892 deletions

206
docs/audit/FIX-PLAN.md Normal file
View File

@@ -0,0 +1,206 @@
# GoodGo POS System — Audit Fix Plan
**Date:** 2026-03-23
**Owner:** CEO Agent
**Source:** 14 agent audit reports (94 total findings)
**Status:** Active
---
## Summary
| Category | Critical | High | Medium | Low | Total |
|---|:---:|:---:|:---:|:---:|:---:|
| Security | 5 | 10 | 5 | 1 | **21** |
| Backend | 4 | 5 | 3 | 0 | **12** |
| Frontend | 5 | 9 | 5 | 4 | **23** |
| DevOps | 4 | 12 | 5 | 0 | **21** |
| Testing | 4 | 7 | 3 | 1 | **15** |
| Documentation | 0 | 2 | 0 | 0 | **2** |
| **Total** | **22** | **45** | **21** | **6** | **94** |
---
## Wave 1 — P0 Blockers (Target: 24-48h)
### Security Blockers (assign: Security Engineer)
| ID | Finding | File | Fix |
|---|---|---|---|
| SEC-C-01 | DB credentials hardcoded in git (19 services) | All `appsettings.json` | Replace with env vars, add to `.gitignore` |
| SEC-C-02 | JWT token in MCP server `.env` committed | `services/goodgo-mcp-server/.env` | Revoke, remove from git, purge history |
| SEC-C-03 | `AddDeveloperSigningCredential()` in all envs | `iam-service-net/.../DependencyInjection.cs:142` | Wrap in `if (env.IsDevelopment())` |
| SEC-C-04 | Debug endpoints `[AllowAnonymous]` — privilege escalation | `merchant-service-net/.../StaffController.cs:249-390` | Delete or restrict to dev + SuperAdmin |
| SEC-C-05 | SQL injection via string interpolation | `merchant-service-net/.../StaffController.cs:307,367` | Use parameterized queries |
### DevOps Blockers (assign: DevOps Engineer)
| ID | Finding | File | Fix |
|---|---|---|---|
| DEVOPS-C-01 | K8s `:latest` image tag in production | All `production/kubernetes/*.yaml` | Use `IMAGE_TAG` placeholder + SHA |
| DEVOPS-C-02 | Alertmanager not configured — alerts silent | `prometheus/prometheus.yml:29` | Configure Alertmanager + receivers |
| DEVOPS-C-03 | CI pushes `:latest` to Docker Hub | `.github/workflows/docker-build.yml:99-103` | Remove `:latest`, use SHA only |
| DEVOPS-C-04 | 4 mkt-* services port 5000 conflict | `docker-compose.yml` | Assign ports 5021-5024 |
---
## Wave 2 — P1 Urgent (Target: 1 week)
### Security High (assign: Security Engineer)
| ID | Finding | Fix |
|---|---|---|
| SEC-W-02 | No Content-Security-Policy header | Add CSP to Traefik `middlewares.yml` |
| SEC-W-03 | CORS `allowCredentials: true` with dev origins | Separate per-env CORS config |
| SEC-W-04 | `sslRedirect: false` in shared config | Set `true` in staging/prod |
| SEC-W-05 | `Jwt__RequireHttpsMetadata=false` in docker-compose | Verify K8s ConfigMaps don't have this |
| SEC-W-14 | BFF CORS wildcard `AllowAnyOrigin()` | Whitelist specific origins |
| SEC-W-15 | JWT validation skipped in dev (4 services) | Always validate signatures |
### Backend Critical (assign: Senior Backend Engineer)
| ID | Finding | Fix |
|---|---|---|
| BACK-C-01 | `AllowAnyOrigin()` on all 26 services | Restrict origins in production |
| BACK-C-02 | Idempotency missing in 23/26 services | Implement `IRequestManager` (wallet, booking first) |
| BACK-C-03 | Error response format inconsistent | Standardize to `{ success, error: { code, message } }` |
| BACK-C-04 | ProblemDetails mapping incomplete in template | Update template with full exception mapping |
| BACK-W-02 | TenantMiddleware SQL string interpolation | Parameterized queries in 5 services |
### Frontend Critical (assign: Senior Frontend Engineer)
| ID | Finding | Fix |
|---|---|---|
| SEC-W-11 | Client secret in WASM (extractable) | Move to BFF server-side |
| SEC-W-12 | Password grant deprecated | Migrate to PKCE flow |
| SEC-W-01 | JWT in localStorage (XSS risk) | Migrate to httpOnly cookies via BFF |
| FRONT-C-04 | No route guards for auth pages | Add `[Authorize]` + `AuthorizeView` |
| FRONT-C-05 | shopId not validated against permissions | Backend verification call |
| FRONT-W-01 | Token refresh not implemented | Add background refresh timer |
| FRONT-W-02 | Global HttpClient header mutation (race) | Per-request headers via `DelegatingHandler` |
| SEC-W-13 | No CDN SRI for Lucide icons | Add SRI hash, pin version |
### DevOps High (assign: DevOps Engineer)
| ID | Finding | Fix |
|---|---|---|
| DEVOPS-W-02 | 15+ services missing CI/CD pipelines | Generate CI workflows from template |
| DEVOPS-W-03 | `pr-checks.yml` no .NET build/test | Add matrix build for .NET |
| DEVOPS-W-10 | `RequireHttpsMetadata=false` in staging K8s | Set `true` in staging/prod |
| DEVOPS-W-11 | booking-service missing K8s manifest | Create staging manifest |
| DEVOPS-W-12 | 13 Traefik routes missing | Add routes for all missing services |
### Testing Critical (assign: QA Engineer)
| ID | Finding | Fix |
|---|---|---|
| TEST-C-01 | Only 1/26 services has CI test pipeline | Generate CI for 25 services |
| TEST-C-02 | MCP server zero tests | Add Vitest test suite |
| TEST-C-03 | No coverage thresholds enforced | Add `.runsettings` with 80% threshold |
---
## Wave 3 — P2 High (Target: 2 weeks)
### Architecture (assign: Architect)
| ID | Finding | Fix |
|---|---|---|
| FRONT-I-01 | No shared UI component package | Extract shared Razor Class Library |
| FRONT-I-02 | ARIA/accessibility gaps | Add ARIA attributes to all components |
| FRONT-I-03 | No design-to-code token sync | Style Dictionary pipeline |
| FRONT-I-04 | `eval()` in OtpInput | Create JS module for focus |
### Backend Architecture (assign: Senior Backend Engineer)
| ID | Finding | Fix |
|---|---|---|
| BACK-I-01 | No OpenAPI specs in repo | Add `dotnet swagger tofile` to CI |
| BACK-I-02 | Missing Prometheus `/metrics` | Add OpenTelemetry + Prometheus exporter |
| BACK-W-01 | HttpContextAccessor in handlers | Inject contextual data from Controller |
| BACK-W-03 | Dapper no `commandTimeout` | Set explicit timeout on all queries |
### Frontend Improvements (assign: Senior Frontend Engineer)
| ID | Finding | Fix |
|---|---|---|
| FRONT-W-03 | ~20% POS pages incomplete backend integration | Implement 21 missing API integrations |
| FRONT-W-04 | Fragile multi-format deserialization | Standardize API response envelope |
| FRONT-W-06 | MudBlazor providers duplicated | Move to `App.razor` once |
| FRONT-W-07 | localStorage logic duplicated 5 files | Extract `LocalStorageService` |
### DevOps Improvements (assign: DevOps Engineer)
| ID | Finding | Fix |
|---|---|---|
| DEVOPS-W-01 | redis-exporter missing from compose | Add or remove scrape job |
| DEVOPS-W-04 | Redis single instance (SPOF) | Redis Sentinel or Cluster |
| DEVOPS-W-05 | No K8s NetworkPolicy | Add default-deny + whitelist |
| DEVOPS-M-01 | No image vulnerability scanning | Add Trivy to CI |
### Testing Improvements (assign: QA Engineer)
| ID | Finding | Fix |
|---|---|---|
| TEST-C-04 | No contract testing | Implement Pact.io for top 5 boundaries |
| TEST-W-01 | Shared packages zero tests | Add unit tests for 6 packages |
| TEST-W-04 | No performance/load testing | Add k6 load tests |
| TEST-W-05 | No frontend component tests | Add unit tests for key components |
### Documentation (assign: Technical Writer)
| ID | Finding | Fix |
|---|---|---|
| DOC-W-01 | Test credentials in ROADMAP.md | Remove credentials |
| DOC-W-02 | No ADR for Marketing dual-theme | Create ADR |
---
## Wave 4 — P3 Medium (Target: 1 month)
Lower priority items — tracked but deferred:
- FRONT-W-05: Lucide re-init on every render
- FRONT-W-08: Incomplete vi-VN translations
- FRONT-W-09: No IFormatProvider in JsonStringLocalizer
- FRONT-W-10: Event handler leak (no IAsyncDisposable)
- FRONT-W-11: Hardcoded Vietnamese in AuthInput
- FRONT-I-05 through FRONT-I-09: Component library expansion
- BACK-I-03: Outbox pattern (5d effort)
- BACK-I-04: Saga pattern (5d effort)
- DEVOPS-I-01 through DEVOPS-I-04: GitOps, PDB, Secrets Manager
- SEC-W-06 through SEC-W-10: Medium security items
---
## Agent Assignment Matrix
| Agent | Wave 1 | Wave 2 | Wave 3 | Total Items |
|---|:---:|:---:|:---:|:---:|
| **Security Engineer** | 5 | 6 | 0 | **11** |
| **Senior Backend Engineer** | 0 | 5 | 4 | **9** |
| **Senior Frontend Engineer** | 0 | 8 | 4 | **12** |
| **DevOps Engineer** | 4 | 5 | 4 | **13** |
| **QA Engineer** | 0 | 3 | 4 | **7** |
| **Architect** | 0 | 0 | 4 | **4** |
| **Technical Writer** | 0 | 0 | 2 | **2** |
| **CTO** | — | — | — | Review all |
---
## QA Verification Plan
After each wave completes:
1. Docker Compose rebuild: `docker-compose down && docker-compose up --build -d`
2. Health check all services: `curl http://localhost:{port}/health/live`
3. Run E2E tests: verify 38/41+ pass rate maintained
4. Security scan: verify hardcoded credentials removed
5. K8s dry-run: `kubectl apply --dry-run=server -f deployments/staging/kubernetes/`
---
## Success Criteria
- **Wave 1**: All 9 P0 blockers resolved, zero hardcoded credentials in git
- **Wave 2**: All 22 P1 items resolved, CI pipelines for all services
- **Wave 3**: Architecture improvements in place, test coverage >50%
- **Overall**: Project health score from 6.5/10 to 8.5/10

641
docs/audit/REPORT.md Normal file
View File

@@ -0,0 +1,641 @@
# Audit Report — POS System - AI
**Date**: 2026-03-20
**Compiled from**: 15 specialist audit reports (CEO, CTO, Architect, API Architect, Backend, Frontend, Database Architect, DevOps, Security, Product Manager, QA, UX/UI Designer, Founding Engineer, Research Analyst, Technical Writer)
**Platform**: GoodGo POS — 26 .NET 10 microservices, 5 frontend apps, MCP AI server
**Branch**: master (d0211e5)
---
## Executive Summary
GoodGo POS is a multi-vertical point-of-sale platform in **late-MVP / pre-production** phase. The architecture is exemplary — Clean Architecture + CQRS enforced across all 26 services, bilingual documentation (EN/VI), and a comprehensive observability stack. The core POS flow works end-to-end with 93% E2E test pass rate (38/41). The MCP AI server (12 tools) is a unique market differentiator no competitor offers.
However, the platform has **severe security vulnerabilities** that must be addressed before any deployment: production database credentials are committed to git (CVSS 9.8), debug endpoints allow unauthenticated privilege escalation, JWT tokens are stored in localStorage, and the IdentityServer uses developer signing credentials in all environments. Test coverage sits at ~15% against a 70% target. Five services remain incomplete, and the entire marketing/CRM suite is demo-only with hardcoded fake data.
**Overall Project Health Score: 5.5 / 10**
| Category | Score | Key Factor |
|----------|:-----:|------------|
| Architecture & Patterns | 9/10 | Clean Architecture + CQRS exemplary across all services |
| Code Quality | 8/10 | Strong DDD, bilingual docs, consistent patterns |
| Security | 2/10 | Production credentials in git, debug endpoints, no CSP |
| Test Coverage | 3/10 | ~15% coverage, 1/26 services has CI tests |
| Infrastructure & DevOps | 6/10 | Good stack, K8s gaps, alerting non-functional |
| Frontend & UX | 5/10 | Solid foundation, critical a11y failures, 2,316 inline styles |
| Documentation | 7/10 | 102+ docs, but 96% OpenAPI specs missing |
| Production Readiness | 4/10 | POS core ready, platform incomplete |
| Mobile Readiness | 2/10 | iOS partial, Android template only |
| Product Completeness | 5/10 | Core POS works, marketing/analytics are stubs |
---
## Critical Issues
Issues that are **showstoppers** and must be fixed before any staging or production deployment.
### SEC-1: Production Database Credentials Committed to Git (CVSS 9.8)
**Source**: Security Audit (CRIT-01)
**Severity**: BLOCKER
**Affected**: All 19 .NET microservices `appsettings.json`
The Neon PostgreSQL production password (`npg_Ssfy6HKO0cXI`), Redis production password (`Velik@2026` with public IP `167.114.174.113`), SMTP credentials, and JWT signing secret are all hardcoded in `appsettings.json` files tracked by git. Anyone with repository read access can authenticate directly to the production database and Redis. All customer data, merchant data, orders, wallets, and PII are at risk.
**Files**: `services/*/src/*/appsettings.json` (all 19 services)
**Fix**: Rotate ALL credentials immediately (within 24 hours). Replace with environment variable placeholders. Purge from git history with `git filter-repo` or BFG Repo Cleaner. Adopt Kubernetes External Secrets Operator or HashiCorp Vault.
---
### SEC-2: Active JWT Bearer Token Committed in MCP Server .env
**Source**: Security Audit (CRIT-02)
**Severity**: BLOCKER
**File**: `services/goodgo-mcp-server/.env:3`
A live, signed JWT bearer token is committed to git. Any party with repository access can replay this token to authenticate as the associated service account. The MCP server has 12 operational tools that can read/write F&B data.
**Fix**: Revoke token immediately. Remove `.env` from git tracking. Purge from history.
---
### SEC-3: Debug Endpoints Allow Unauthenticated Privilege Escalation
**Source**: CTO (C1), Backend (C1), CEO (referenced)
**Severity**: BLOCKER
**File**: `services/merchant-service-net/src/MerchantService.API/Controllers/StaffController.cs:249-390`
Five `[AllowAnonymous]` debug endpoints in production code:
- `POST /api/v1/staff/debug/seed` — creates arbitrary staff data
- `POST /api/v1/staff/debug/update-userid` — overwrites any staff's userId via reflection
- `POST /api/v1/staff/debug/update-merchant` — overwrites any merchantId
Any attacker can escalate privileges or tamper with merchant data.
**Fix**: Delete these endpoints entirely or wrap with `if (env.IsDevelopment())` + `[Authorize(Roles = "SuperAdmin")]`.
---
### SEC-4: SQL Injection in StaffController
**Source**: CTO (C2), Backend (C2)
**Severity**: BLOCKER
**File**: `services/merchant-service-net/src/MerchantService.API/Controllers/StaffController.cs:307, 367`
String interpolation used directly in SQL queries. While current `Guid` types self-sanitize, any future refactor changing types to `string` creates an immediate injection vulnerability.
```csharp
cmd.CommandText = $"UPDATE merchants SET user_id = '{userId}' WHERE id = '{merchantId}'";
```
**Fix**: Use parameterized queries (`cmd.Parameters.AddWithValue()`).
---
### SEC-5: IdentityServer Using Developer Signing Credential in All Environments
**Source**: Security Audit (CRIT-03)
**Severity**: CRITICAL
**File**: `services/iam-service-net/src/IamService.Infrastructure/DependencyInjection.cs:142`
`AddDeveloperSigningCredential()` is called unconditionally — no environment check. The signing key is regenerated on every restart, invalidating all active sessions. No certificate-based signing exists for production.
**Fix**: Use cert-based signing for non-development environments. Store RSA key in Vault or K8s TLS Secret.
---
### SEC-6: JWT Stored in localStorage (XSS Risk)
**Source**: Frontend (CRIT-01), Security (WARN-01)
**Severity**: CRITICAL
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Services/AuthService.cs:147-148`
JWT tokens stored in `localStorage` are accessible to any JavaScript on the same origin. A single XSS vulnerability allows full token theft and account impersonation.
**Fix**: Migrate to `httpOnly` cookies via the BFF server. Add CSP headers.
---
### SEC-7: Client Secret Hardcoded in Browser-Delivered WASM Code
**Source**: Frontend (CRIT-02)
**Severity**: CRITICAL
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Services/AuthService.cs:39-40`
```csharp
private const string ClientId = "password-client";
private const string ClientSecret = "password-client-secret";
```
Blazor WASM compiles to WebAssembly served to browsers. Constants are extractable via developer tools.
**Fix**: Move OAuth2 token exchange to the BFF. The BFF holds secrets server-side.
---
### SEC-8: Deprecated Password Grant (OAuth2) Used
**Source**: Frontend (CRIT-03)
**Severity**: CRITICAL
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Services/AuthService.cs:136`
Resource Owner Password Credentials grant is deprecated in OAuth 2.1. It cannot support MFA properly and exposes credentials to the client application.
**Fix**: Migrate to Authorization Code Flow with PKCE.
---
### SEC-9: JWT Signature Validation Disabled in Development (4 Services)
**Source**: Backend (C3), Founding Engineer (C4)
**Severity**: CRITICAL
**Files**: `merchant-service-net`, `ads-serving-service-net`, `ads-tracking-service-net`, `ads-billing-service-net` — all `Program.cs`
```csharp
ValidateIssuerSigningKey = builder.Environment.IsDevelopment() ? false : true,
```
If `ASPNETCORE_ENVIRONMENT` is misconfigured on a server, all JWT signatures are bypassed. Any self-crafted token is accepted.
**Fix**: Use a shared deterministic development signing key via environment variable. Always validate signatures.
---
### SEC-10: BFF CORS Wildcard + All 26 Services Use `AllowAnyOrigin()`
**Source**: CTO (C3), Backend (W1), Founding Engineer (C1)
**Severity**: CRITICAL
**Files**: All 26 services `Program.cs`, BFF `Program.cs:74-78`
Every service allows any origin to make cross-origin requests. Enables CSRF attacks and violates same-origin policy.
**Fix**: Configure per-environment origin whitelists. Production: only `https://goodgo.vn`, `https://admin.goodgo.vn`.
---
### SEC-11: No Content-Security-Policy Header
**Source**: Security (WARN-02), Frontend (IMP-02)
**Severity**: CRITICAL
**File**: `infra/traefik/dynamic/middlewares.yml`
The `secure-headers` middleware is missing CSP, Referrer-Policy, and Permissions-Policy headers. Without CSP, the browser has no defense against XSS exploitation.
**Fix**: Add CSP with `default-src 'self'; script-src 'self' 'wasm-unsafe-eval';` and related policies.
---
### INFRA-1: Production K8s Manifests Hardcode `:latest` Image Tag
**Source**: DevOps (CRIT-1), CEO (referenced)
**Severity**: CRITICAL
**Files**: `deployments/production/kubernetes/*.yaml`
All production Kubernetes deployments use `goodgo/*:latest`. Not reproducible, no rollback capability.
**Fix**: Use commit SHA tags. Inject via Kustomize/Helm or pipeline sed-replace.
---
### INFRA-2: Alertmanager Not Configured — All Alert Rules Silent
**Source**: DevOps (CRIT-2)
**Severity**: CRITICAL
**File**: `infra/observability/prometheus/prometheus.yml:29`
Prometheus `alerting.alertmanagers.targets` is empty (`[]`). All defined alert rules (ServiceDown, HighErrorRate, HighLatencyP95) fire into the void. Production incidents will have zero notification.
**Fix**: Deploy Alertmanager with Slack/PagerDuty receivers. Update targets to `['alertmanager:9093']`.
---
### INFRA-3: Health Check Endpoints Require Authentication (18/23 Services)
**Source**: Founding Engineer (C2)
**Severity**: CRITICAL
**Files**: 18 services missing `.AllowAnonymous()` on health endpoints
Kubernetes kubelet cannot authenticate. Liveness/readiness probes return 401 Unauthorized, causing pod restart loops.
**Fix**: Add `.AllowAnonymous()` to all `MapHealthChecks()` calls.
---
### DATA-1: InventoryItem.CreatedAt Backing Field Not Mapped — Silent Data Loss
**Source**: Database Architect (C-1)
**Severity**: CRITICAL
**File**: `services/inventory-service-net/src/InventoryService.Infrastructure/EntityConfigurations/InventoryItemEntityTypeConfiguration.cs:34`
`builder.Ignore(i => i.CreatedAt)` discards the property but never maps the private backing field `_createdAt`. Every entity loaded from the database has `CreatedAt == DateTime.MinValue`.
**Fix**: Map `_createdAt` backing field to `created_at` column. No migration needed.
---
### DATA-2: No Optimistic Concurrency on Wallet Balances — Race Condition
**Source**: Database Architect (C-2)
**Severity**: CRITICAL
**File**: `services/wallet-service-net/src/WalletService.Infrastructure/EntityConfigurations/WalletItemEntityTypeConfiguration.cs`
Two concurrent POS transactions can read the same balance and overwrite each other. Money disappears silently.
**Fix**: Add `builder.UseXminAsConcurrencyToken()` (Npgsql built-in). No migration needed.
---
### SVC-1: 5 Services Still In-Progress (22% Incomplete)
**Source**: CEO (C2), Research Analyst (CRIT-01)
**Severity**: CRITICAL
| Service | Status |
|---------|--------|
| `promotion-service-net` | 0 commands, 0 queries — controllers reference non-existent logic |
| `mission-service-net` | Domain model done, 0 CQRS handlers |
| `inventory-service-net` | 1 command handler out of 12+ endpoints needed |
| `ads-analytics-service-net` | Minimal commands, incomplete aggregation |
| `ads-billing-service-net` | 3 of 8+ command handlers |
**Impact**: Cannot deliver full feature coverage. Order fulfillment blocked (no stock validation). Voucher campaigns non-functional.
**Fix**: 3 backend engineers, 2-3 weeks dedicated effort.
---
### SVC-2: ads-serving-service-net is READ-ONLY
**Source**: CEO (C3), Research Analyst (WARN-02)
**Severity**: CRITICAL
Zero command handlers. Auction logic is query-only. Cannot create or manage ad placements programmatically. Ads platform is non-functional for write operations.
---
### PROD-1: Marketing Suite is 100% Demo — Zero Backend Integration
**Source**: Product Manager (CI-1)
**Severity**: CRITICAL
**Files**: 6 Marketing pages (`AiChatbot.razor`, `CustomerCrm.razor`, `LivechatConsole.razor`, `AiContentStudio.razor`, `ChatbotAutomation.razor`, `SocialHub.razor`)
All 6 pages use hardcoded demo data arrays with fake customer records. This is marketed as a key differentiator vs. KiotViet/Sapo/iPOS but delivers zero actual functionality. Merchants who sign up for Growth/Pro plans expecting CRM and AI chatbot will experience immediate trust damage.
**Fix**: Hide Marketing section with "Coming Soon" label, or implement real backend integration (2+ weeks).
---
### PROD-2: Analytics & Reporting — 0% Wired to Real Data
**Source**: Product Manager (CI-4)
**Severity**: CRITICAL
Revenue Dashboard, Staff Performance, and EOD Report pages exist with full UI but all KPI values (revenue, transactions, avg order value, growth %) are hardcoded demo values. Merchants making business decisions based on fake numbers will lose real money.
**Fix**: Wire to order-service data. Add `GetDailyRevenueQuery` and `GetTopProductsQuery`.
---
### API-1: Response Format Fragmentation (3 Incompatible Patterns)
**Source**: API Architect (CRIT-1), Frontend (WARN-08)
**Severity**: CRITICAL
Three response patterns exist: IAM uses typed `ApiResponse<T>`, Order/Catalog use anonymous objects, and Merchant returns raw entities. Frontend compensates with a "4-format smart deserializer" — a fragile workaround. SDK auto-generation is blocked.
**Fix**: Extract `ApiResponse<T>` to shared NuGet package. Enforce across all services.
---
### TEST-1: Only 1 of 26 Services Has CI Test Pipeline
**Source**: QA (C1), CTO (W2)
**Severity**: CRITICAL
Only `iam-service-net` runs tests in CI. Regressions in 25 other services are undetected until staging deployment. The `deploy-staging.yml` runs migrations but NOT tests before deployment.
**Fix**: Generate CI pipelines for all 25 missing services using `ci-iam-service.yml` as template.
---
### CROSS-1: Cross-Service Messaging Not Implemented
**Source**: CTO (C5)
**Severity**: CRITICAL
All `IntegrationEvents/EventHandlers/` directories across 10+ services are empty. RabbitMQ is provisioned in docker-compose but no service publishes or consumes messages. System works via HTTP tight coupling, which will fail at scale.
**Fix**: Implement `IEventBus` abstraction with RabbitMQ backend. Start with `OrderCreatedIntegrationEvent` to inventory deduction.
---
## Warnings
Issues that should be addressed within the next 2-4 weeks.
### Security Warnings
| ID | Issue | Source | Files |
|----|-------|--------|-------|
| W-SEC-1 | CORS `allowCredentials: true` with localhost origins in shared middleware config | Security (WARN-03) | `infra/traefik/dynamic/middlewares.yml:27` |
| W-SEC-2 | `sslRedirect: false` in shared middleware — applies to all environments | Security (WARN-04), CTO | `infra/traefik/dynamic/middlewares.yml:5` |
| W-SEC-3 | `Jwt__RequireHttpsMetadata=false` in staging K8s ConfigMap | DevOps (WARN-10), Security (WARN-05) | `deployments/staging/kubernetes/configmap.yaml:21` |
| W-SEC-4 | Test credentials hardcoded in ROADMAP.md (checked into git) | CTO (W8) | `ROADMAP.md` Section IX |
| W-SEC-5 | K8s staging secrets file contains literal placeholder strings | Security (WARN-09) | `deployments/staging/kubernetes/secrets.yaml` |
| W-SEC-6 | `AllowedHosts: "*"` in IAM service — DNS rebinding risk | Security (WARN-08) | `services/iam-service-net/appsettings.json:79` |
| W-SEC-7 | No CDN Subresource Integrity on Lucide script from unpkg.com | Frontend (CRIT-04) | `index.html:19` |
| W-SEC-8 | TOTP verification window allows 90-second replay (no used-code tracking) | Security (WARN-10) | `TotpTwoFactorService.cs:86` |
| W-SEC-9 | Unauthenticated ad tracking endpoints without rate limiting | Security (WARN-07) | Routes for `/api/v1/pixels`, `/api/v1/conversions` |
| W-SEC-10 | Traefik dashboard exposed without authentication in local dev | Security (WARN-06), DevOps (WARN-8) | `docker-compose.yml:121` |
### Infrastructure Warnings
| ID | Issue | Source | Files |
|----|-------|--------|-------|
| W-INF-1 | Docker Compose port conflicts — 4 mkt-* services all bind port 5000 | CEO (W1), CTO (C4) | `docker-compose.yml` |
| W-INF-2 | Missing K8s manifests — Staging: 9 missing, Production: 11 missing | CEO (W2), CTO (W3) | `deployments/*/kubernetes/` |
| W-INF-3 | 13 services missing Traefik gateway routes — bypass security middleware | CTO (W4), API Architect (CRIT-3) | `infra/traefik/dynamic/routes.yml` |
| W-INF-4 | 15+ services not in CI/CD deployment pipeline | DevOps (WARN-2) | `.github/workflows/deploy-*.yml` |
| W-INF-5 | `redis-exporter` scraped by Prometheus but missing from docker-compose | DevOps (WARN-1) | `prometheus.yml:132` |
| W-INF-6 | Redis deployed as single instance in K8s — SPOF for real-time features | DevOps (WARN-4) | `deployments/staging/kubernetes/redis.yaml` |
| W-INF-7 | No Kubernetes NetworkPolicy manifests — all pods can communicate freely | DevOps (WARN-5) | Missing from all K8s dirs |
| W-INF-8 | Distributed tracing (Jaeger) commented out — no cross-service request correlation | DevOps (WARN-7) | `docker-compose.yml:1230-1241` |
| W-INF-9 | `docker-build.yml` pushes `:latest` tag to Docker Hub from `main` branch | DevOps (CRIT-3) | `.github/workflows/docker-build.yml:99-103` |
| W-INF-10 | PR checks only validate Node.js — no .NET build/test in PR pipeline | DevOps (WARN-3) | `.github/workflows/pr-checks.yml` |
| W-INF-11 | MinIO uses default credentials `minioadmin/minioadmin123` | DevOps (WARN-6) | `docker-compose.yml:78-79` |
### Frontend Warnings
| ID | Issue | Source | Files |
|----|-------|--------|-------|
| W-FE-1 | Token refresh not implemented — silent 401 failures on token expiry | Frontend (WARN-01) | `AuthStateService.cs` |
| W-FE-2 | Global HttpClient header mutation — race condition under concurrent requests | Frontend (WARN-02) | `PosDataService.cs:40-47` |
| W-FE-3 | No route guards — admin layout renders before auth check | Frontend (WARN-06) | All layouts |
| W-FE-4 | `shopId` not validated against user permissions — URL manipulation possible | Frontend (WARN-07) | `AdminLayout.razor:246-286` |
| W-FE-5 | ~20% of POS pages have incomplete backend integration (21 TODOs) | Frontend (WARN-10) | Multiple POS pages |
| W-FE-6 | MudBlazor ThemeProvider declared in multiple layouts — duplicate instances | Frontend (WARN-04) | All layout files |
| W-FE-7 | localStorage logic duplicated across 5 files | Frontend (WARN-05) | `AuthService.cs`, layouts, `LanguageSwitcher.razor` |
| W-FE-8 | `eval()` used for DOM focus management in OTP input | Architect (W-2) | `OtpInput.razor` |
| W-FE-9 | Incomplete vi-VN translations vs en-US.json | Frontend (WARN-11) | `wwwroot/locales/vi-VN.json` |
### UX/Accessibility Warnings
| ID | Issue | Source | Files |
|----|-------|--------|-------|
| W-UX-1 | No `:focus-visible` styles — keyboard navigation invisible (WCAG 2.4.7) | UX/UI (Issue 1) | All CSS files |
| W-UX-2 | Clickable `<div>` elements instead of `<button>` in POS pages (WCAG 4.1.2) | UX/UI (Issue 2) | All `*Desktop.razor` |
| W-UX-3 | No ARIA labels on icon-only interactive elements (WCAG 4.1.2) | UX/UI (Issue 3), Architect (C-2) | Auth components, layouts |
| W-UX-4 | Error/success messages missing `role="alert"` (WCAG 4.1.3) | UX/UI (Issue 4) | Login pages, POS |
| W-UX-5 | Hardcoded Vietnamese strings in POS UI — breaks English localization | UX/UI (Issue 5) | All POS vertical pages, layouts |
| W-UX-6 | 2,316 inline style attributes undermine design system | UX/UI (Issue 7) | All pages |
| W-UX-7 | Hardcoded color values instead of CSS variables | UX/UI (Issue 6) | POS pages, Dashboard |
| W-UX-8 | No focus trap in modal overlays (WCAG 2.1.2) | UX/UI (Issue 13) | Layouts, dialogs |
| W-UX-9 | Secondary text contrast may fail WCAG AA | UX/UI (Issue 9) | `admin.css` |
### Database Warnings
| ID | Issue | Source | Files |
|----|-------|--------|-------|
| W-DB-1 | UUID v4 used everywhere instead of UUID v7 — index fragmentation at scale | DB Architect (C-3) | All 23 service domain entities |
| W-DB-2 | IAM Phase4A migration created PascalCase tables — breaks snake_case convention | DB Architect (C-4) | IAM migration `Phase4A_AuditAndCompliance.cs` |
| W-DB-3 | Missing global query filter for soft-deleted Merchant/Shop records | DB Architect (W-1) | `MerchantServiceContext.cs` |
| W-DB-4 | Missing composite indexes on high-traffic query patterns | DB Architect (W-2) | order, catalog, booking, inventory services |
| W-DB-5 | `Debug.WriteLine` left in OrderContext production code | DB Architect (W-4) | `OrderContext.cs:59,71` |
| W-DB-6 | ads-analytics and ads-billing bypass Repository pattern | DB Architect (W-6) | `ads-analytics-service-net`, `ads-billing-service-net` |
### API & Backend Warnings
| ID | Issue | Source | Files |
|----|-------|--------|-------|
| W-API-1 | API versioning inconsistency — 3 different patterns across services | API Architect (CRIT-2) | Multiple `Program.cs`, controllers |
| W-API-2 | Authorization pattern fragmentation — no policy-based auth, audience validation disabled | API Architect (WARN-2) | All services |
| W-API-3 | Health check implementation non-standard across services | API Architect (WARN-3) | Multiple services |
| W-API-4 | Idempotency only in 3/26 services — wallet and booking at risk of double-charge | Backend (W2) | `wallet-service-net`, `booking-service-net` |
| W-API-5 | TenantMiddleware uses string interpolation in SQL SET (5 services) | CTO (W1) | order, inventory, wallet, catalog, fnb-engine |
| W-API-6 | Fire-and-forget Task in mkt-whatsapp — errors silently swallowed | CTO (W5) | `WebhooksController.cs:88` |
### Testing Warnings
| ID | Issue | Source |
|----|-------|--------|
| W-TST-1 | MCP Server has zero tests (12 production tools) | QA (C2) |
| W-TST-2 | No coverage thresholds enforced — Coverlet installed but unconfigured | QA (C3) |
| W-TST-3 | No contract testing between microservices | QA (C4) |
| W-TST-4 | All 6 shared Node.js packages have zero tests | QA (W1) |
| W-TST-5 | E2E tests require live backend with no mocking — brittle in CI | QA (W2) |
| W-TST-6 | Playwright tests are read-only — no mutation/transaction coverage | QA (W3) |
| W-TST-7 | No performance/load testing anywhere in the stack | QA (W4) |
| W-TST-8 | Frontend has no component-level tests | QA (W5) |
### Product Warnings
| ID | Issue | Source |
|----|-------|--------|
| W-PRD-1 | Voucher redemption broken — backend exists but no UI in payment flow | Product Manager (CI-2) |
| W-PRD-2 | Payment gateway live/sandbox status unclear | Product Manager (CI-3) |
| W-PRD-3 | QR Menu customer ordering — post-cart flow has gaps | Product Manager (W-4) |
| W-PRD-4 | Spa vertical — appointment walk-in flow unclear | Product Manager (W-2) |
| W-PRD-5 | Mobile apps incomplete — iOS partial, Android template only | CEO (W3), Research (WARN-03) |
### Documentation Warnings
| ID | Issue | Source |
|----|-------|--------|
| W-DOC-1 | OpenAPI specs missing for 24/25 services | Technical Writer (C1) |
| W-DOC-2 | Postman collections directory is empty | Technical Writer (C2) |
| W-DOC-3 | README.md missing for 13/25 services | Technical Writer (W1) |
| W-DOC-4 | No CHANGELOG or release notes | Technical Writer (W4) |
### Design System Warnings
| ID | Issue | Source |
|----|-------|--------|
| W-DS-1 | No shared UI component package — components trapped in single app | Architect (C-1) |
| W-DS-2 | Theme token duplication between `AppTheme.cs` and `app.css` | Architect (W-1) |
| W-DS-3 | No responsive breakpoint tokens — hardcoded media queries | Architect (W-3) |
| W-DS-4 | Marketing module uses separate theme with no documented rationale | Architect (W-6) |
| W-DS-5 | No shared NuGet for DTOs — duplicated across web apps | Architect (W-7) |
---
## Improvements
Enhancements that would improve quality, maintainability, and competitive positioning.
### Architecture & Code Quality
- **Extract MediatR behaviors to shared NuGet** — 3,450+ lines duplicated across 23 services (Founding Engineer C3)
- **Extract `ApiResponse<T>` to shared package** — enforce uniform response format (API Architect IMP-1)
- **Implement transactional outbox pattern** for domain events — prevents event loss on process crash (Backend I1)
- **Implement Saga orchestration** for order placement flow — prevent inconsistent state across wallet/inventory/promotion (Backend I4)
- **Create shared Blazor Razor Class Library** (`@goodgo/blazor-ui`) — enable cross-app component reuse (Architect I-1)
- **Implement Style Dictionary token pipeline** — single source of truth for CSS vars and C# constants (Architect I-2)
- **Replace `Guid.NewGuid()` with `Guid.CreateVersion7()`** across all domain entities — prevents B-tree index fragmentation (DB Architect C-3)
### Infrastructure & DevOps
- **Adopt GitOps with ArgoCD** — drift detection, automatic rollback, audit trail (DevOps IMP-1)
- **Add PodDisruptionBudgets** for production services (DevOps IMP-2)
- **Enable distributed tracing** (Jaeger or Grafana Tempo) — critical for debugging 26-service interactions (DevOps WARN-7)
- **Add image vulnerability scanning** (`trivy-action`) to CI pipeline (DevOps IMP-5)
- **Implement External Secrets Operator** for K8s production secrets (DevOps IMP-8)
- **Add PgBouncer** for local development — 25+ services can exhaust PostgreSQL connections (DB Architect I-2)
- **Enable `pg_stat_statements`** in PostgreSQL `init.sql` for query monitoring (DB Architect I-1)
- **Add Prometheus business metrics** — orders/hour, conversion rate, revenue per vertical (Research IMP-04)
### Security Hardening
- **Add Serilog sensitive data destructuring** — prevent accidental PII/credential logging (Security IMP-02)
- **Add security scanning to CI** — GitLeaks, CodeQL SAST, `dotnet list package --vulnerable` (Security IMP-03)
- **Implement refresh token rotation** — one-time-use refresh tokens (Security IMP-04)
- **Add `Referrer-Policy` and `Permissions-Policy` headers** (Security IMP-05)
### Testing
- **Add Vitest test suite to MCP server** — 12 untested production tools (QA I2)
- **Enforce coverage thresholds** via `.runsettings` — 80% minimum (QA I3)
- **Add Pact.io contract tests** for top 5 service boundaries (QA I4)
- **Add k6 performance tests** for critical paths — order creation, catalog listing, ad events (QA I7)
- **Add mutation E2E tests** — test actual order creation, payment, booking flows (QA I5)
### Product & Market
- **Integrate MoMo QR + ZaloPay** — 60% of Vietnam digital payment market share, all competitors support them (Research IMP-01)
- **E-invoice compliance** (Nghi Dinh 123/2020) — legal requirement for all VN businesses since 2022 (Research IMP-02)
- **Wire analytics dashboards to real data** — replace demo arrays with actual order-service queries (Product IMP-2)
- **Implement voucher redemption in POS payment flow** — highest ROI product fix, RICE score 1,350 (Product IMP-1)
- **Expand MCP AI tools** to Karaoke and Retail verticals — strengthen unique differentiator (Research IMP-03)
- **Add customer feedback loop** post-payment (QR rating) — feature no competitor has (Product IMP-4)
- **Implement Redis application-level caching** for catalog, shop config — reduce DB load (Research IMP-05)
### UX/Accessibility
- **WCAG 2.1 AA accessibility pass** — focus-visible styles, semantic buttons, ARIA attributes, focus traps (UX/UI)
- **Extract inline styles to CSS classes** — 2,316 instances undermining design system (UX/UI Issue 7)
- **Add responsive breakpoint tokens** and standardize spacing scale (Architect W-3, UX/UI Issue 8)
- **Add password strength indicator** to registration (UX/UI Improvement B)
- **Lazy-load POS vertical assemblies** — reduce initial Blazor WASM download size (Frontend IMP-04)
### Documentation
- **Export and commit OpenAPI specs** for all 24 missing services (Technical Writer A1)
- **Create Postman collections** from Swagger specs (Technical Writer A2)
- **Add system architecture Mermaid diagram** to `system-design.md` (Technical Writer I1)
- **Deploy VitePress documentation site** via CI to `docs.goodgo.vn` (Technical Writer I2)
- **Add CHANGELOG.md** with automated generation (Technical Writer W4)
---
## Action Items
Prioritized checklist with assignee suggestions and effort estimates.
### P0 — IMMEDIATE (Within 24-48 Hours)
| # | Action | Assignee | Effort | Source |
|---|--------|----------|--------|--------|
| 1 | **Rotate ALL production credentials** (PostgreSQL, Redis, SMTP, JWT secret) | CTO + DevOps | 4h | SEC-1 |
| 2 | **Revoke MCP server JWT token**, remove `.env` from git, purge history | DevOps | 1h | SEC-2 |
| 3 | **Delete debug endpoints** from `StaffController.cs:249-390` | Backend Dev | 30m | SEC-3 |
| 4 | **Fix SQL injection** — parameterized queries in `StaffController.cs` | Backend Dev | 30m | SEC-4 |
| 5 | **Replace `AllowAnyOrigin()`** with environment-based whitelist in all 26 services | Backend Dev | 2h | SEC-10 |
| 6 | **Add `.AllowAnonymous()`** to health check endpoints in 18 services | Founding Eng | 1h | INFRA-3 |
| 7 | **Replace `AddDeveloperSigningCredential()`** with cert-based signing for non-dev | Backend Lead | 2h | SEC-5 |
| 8 | **Remove `:latest`** from production K8s manifests — use SHA tags | DevOps | 2h | INFRA-1 |
| 9 | **Configure Alertmanager** with Slack/PagerDuty receivers | DevOps | 4h | INFRA-2 |
| 10 | **Remove credentials from ROADMAP.md** | CTO | 5m | W-SEC-4 |
### P0 — THIS WEEK (Within 7 Days)
| # | Action | Assignee | Effort | Source |
|---|--------|----------|--------|--------|
| 11 | **Fix Docker Compose port conflicts** (mkt-* services: assign 5021-5024) | DevOps | 2h | W-INF-1 |
| 12 | **Add CSP header** to Traefik `secure-headers` middleware | DevOps | 1h | SEC-11 |
| 13 | **Fix BFF CORS** — replace `AllowAnyOrigin()` with whitelist | Backend Dev | 15m | SEC-10 |
| 14 | **Fix InventoryItem._createdAt** backing field mapping | Backend Dev | 30m | DATA-1 |
| 15 | **Add optimistic concurrency token** to wallet_items | Backend Dev | 1h | DATA-2 |
| 16 | **Add 13 missing Traefik routes** (chat, social, mining, ads-*, mkt-*, promotion, mission) | DevOps | 2h | W-INF-3 |
| 17 | **Enable SSL redirect** in staging/production Traefik config | DevOps | 15m | W-SEC-2 |
| 18 | **Hide Marketing section** with "Coming Soon" label | Frontend Dev | 1d | PROD-1 |
| 19 | **Remove `:latest`** push from docker-build.yml | DevOps | 30m | W-INF-9 |
### P1 — THIS MONTH (Within 30 Days)
| # | Action | Assignee | Effort | Source |
|---|--------|----------|--------|--------|
| 20 | **Complete 5 in-progress services** (promotion, mission, inventory, ads-analytics, ads-billing) | 3 Backend Devs | 2-3 weeks | SVC-1 |
| 21 | **Migrate JWT storage to httpOnly cookies** via BFF | Frontend Dev | 2d | SEC-6 |
| 22 | **Replace password grant with Authorization Code + PKCE** | Frontend + Backend | 3d | SEC-8 |
| 23 | **Generate CI pipelines** for 25 missing services | DevOps | 1d | TEST-1 |
| 24 | **Implement voucher redemption** in POS payment flow | Backend + Frontend | 1 week | W-PRD-1 |
| 25 | **Wire analytics dashboards** to real order data | Backend + Frontend | 1 week | PROD-2 |
| 26 | **Add .NET build/test** to PR checks pipeline | DevOps | 2h | W-INF-10 |
| 27 | **Complete remaining K8s manifests** (9 staging + 11 production) | DevOps | 1 week | W-INF-2 |
| 28 | **Add route guards** / AuthorizeView to all layouts | Frontend Dev | 1d | W-FE-3 |
| 29 | **Validate shopId against user permissions** in AdminLayout | Frontend Dev | 4h | W-FE-4 |
| 30 | **Implement token refresh** logic | Frontend Dev | 1d | W-FE-1 |
| 31 | **Add `:focus-visible` styles** and semantic buttons to POS pages | Frontend Dev | 3d | W-UX-1, W-UX-2 |
| 32 | **Add ARIA attributes** to all interactive custom components | Frontend Dev | 2d | W-UX-3 |
| 33 | **Implement IEventBus** skeleton + first integration event (OrderCreated to inventory) | Backend Dev | 3-5d | CROSS-1 |
| 34 | **Extract MediatR behaviors** to shared NuGet package | Founding Eng | 4h | Founding Eng C3 |
| 35 | **Add Vitest test suite** to MCP server (12 tools) | Backend Dev | 3-5d | W-TST-1 |
| 36 | **Increase test coverage to 50%** — focus on wallet, catalog, booking | QA + Devs | 3 weeks | CEO P0 |
| 37 | **Add HasQueryFilter** for soft-deleted Merchant + Shop | Backend Dev | 1h | W-DB-3 |
| 38 | **Migrate IAM AuditLogs** to snake_case table names | Backend Dev | 2h | W-DB-2 |
| 39 | **Implement idempotency** for wallet-service and booking-service | Backend Dev | 2d | W-API-4 |
| 40 | **Separate CORS config** by environment — remove localhost from production | DevOps | 1h | W-SEC-1 |
| 41 | **Confirm payment gateway status** (VNPay live/sandbox, plan for MoMo/ZaloPay) | CTO + Backend | 3d | W-PRD-2 |
### P2 — NEXT QUARTER (Within 90 Days)
| # | Action | Assignee | Effort | Source |
|---|--------|----------|--------|--------|
| 42 | Implement MoMo QR + ZaloPay payment integration | Backend Dev | 2-3 weeks | Research IMP-01 |
| 43 | E-invoice integration (Nghi Dinh 123/2020) | Backend Dev | 2 weeks | Research IMP-02 |
| 44 | Replace `Guid.NewGuid()` with `Guid.CreateVersion7()` globally | Backend Dev | 4h | W-DB-1 |
| 45 | Add composite indexes on high-traffic query patterns | Backend Dev | 1h | W-DB-4 |
| 46 | Deploy PgBouncer for local development | DevOps | 2h | DB I-2 |
| 47 | iOS app feature development (SwiftUI, core POS screens) | Mobile Dev | 4-6 weeks | W-PRD-5 |
| 48 | Performance baseline (k6 load testing) on staging | QA | 3d | W-TST-7 |
| 49 | Add Pact.io contract tests for top 5 service boundaries | Backend Dev | 5-7d | W-TST-3 |
| 50 | Export and commit OpenAPI specs for all 24 services | Backend + DevOps | 2d | W-DOC-1 |
| 51 | Add security scanning to CI (GitLeaks, CodeQL, vulnerability audit) | DevOps | 1d | Security IMP-03 |
| 52 | Standardize API versioning across all services | Backend Lead | 3d | W-API-1 |
| 53 | Implement policy-based authorization with scope validation | IAM Dev | 1 week | W-API-2 |
| 54 | Redis HA (Sentinel) for staging/production | DevOps | 4h | W-INF-6 |
| 55 | Add NetworkPolicy manifests to K8s | DevOps | 4h | W-INF-7 |
| 56 | Enable distributed tracing (Jaeger/Grafana Tempo) | DevOps | 2h | W-INF-8 |
| 57 | Implement secrets management pipeline (Vault / External Secrets) | DevOps | 1 week | Security IMP-01 |
| 58 | Extract inline styles to CSS classes (2,316 instances) | Frontend Dev | 1-2 weeks | W-UX-6 |
| 59 | Move hardcoded Vietnamese strings to localization keys | Frontend Dev | 1 week | W-UX-5 |
| 60 | Production pilot with 1-2 enterprise merchants (Cafe/Restaurant) | Product + CTO | Ongoing | CEO recommendation |
### P3 — BACKLOG
| # | Action | Assignee | Effort | Source |
|---|--------|----------|--------|--------|
| 61 | Create shared Blazor Razor Class Library (`blazor-ui`) | Architect + Frontend | 1 week | Architect I-1 |
| 62 | Implement Style Dictionary token pipeline | Architect | 3d | Architect I-2 |
| 63 | Add README.md to 13 missing services | Tech Lead | 1d | W-DOC-3 |
| 64 | Create CHANGELOG.md + automate generation | DevOps | 2h | W-DOC-4 |
| 65 | Deploy VitePress docs site via CI | DevOps | 2h | Tech Writer I2 |
| 66 | Expand MCP AI tools to Karaoke + Retail verticals | Backend Dev | 4-6 weeks | Research IMP-03 |
| 67 | MAUI Android app feature development | Mobile Dev | 6-8 weeks | W-PRD-5 |
| 68 | Add customer feedback loop post-payment (QR rating) | Backend + Frontend | 2 weeks | Product IMP-4 |
| 69 | Implement Saga orchestration for order placement flow | Backend Dev | 2 weeks | Backend I4 |
| 70 | Add materialized view for daily sales reporting | Backend Dev | 2d | DB I-5 |
| 71 | Adopt ArgoCD for GitOps | DevOps | 2-3d | DevOps IMP-1 |
| 72 | Create system architecture Mermaid diagram | Technical Writer | 2h | Tech Writer I1 |
---
## Competitive Positioning
| Feature | GoodGo Status | KiotViet | Sapo POS | iPOS |
|---------|:-------------:|:--------:|:--------:|:----:|
| Multi-vertical POS (5+) | **LEAD** | Retail only | Limited | 2 verticals |
| AI-powered operations (MCP) | **UNIQUE** | None | None | None |
| KDS Kitchen Display | Production | None | Yes | Yes |
| Microservices scalability | **LEAD** | Monolith | Monolith | Monolith |
| Real-time analytics | STUB (demo data) | **Production** | **Production** | **Production** |
| Marketing CRM | STUB (demo data) | Basic | **Production** | None |
| Mobile app (production) | **GAP** | Production | Production | Production |
| Payment (MoMo/ZaloPay) | **GAP** | Production | Production | Production |
| E-invoice compliance | **GAP** | Production | Production | Production |
| Booking/Scheduling | Production | None | None | Yes |
| Loyalty stamps/levels | Production | Basic | Basic | Yes |
**Assessment**: GoodGo has architectural and AI advantages no competitor matches. The critical gaps are: (1) real analytics/reporting, (2) payment gateway breadth, (3) mobile apps, and (4) e-invoice compliance. Fixing items 1 and 2 alone would make GoodGo competitive for enterprise SMB acquisition.
---
## Recommendation
**Phase 1 (Week 1-2): Security Lockdown**
Rotate all credentials, fix debug endpoints, fix SQL injection, add CSP, remove secrets from git history. This is non-negotiable before any deployment.
**Phase 2 (Week 2-4): Staging Deployment**
Deploy the 15 production-ready services to staging. Fix health check auth, complete K8s manifests, configure Alertmanager. Add CI test pipelines.
**Phase 3 (Month 2): Service Completion + Product Gaps**
Complete the 5 in-progress services. Wire analytics to real data. Implement voucher redemption. Confirm payment gateway status. Increase test coverage to 50%.
**Phase 4 (Month 3): Production Pilot**
Production pilot with 1-2 enterprise merchants in Cafe/Restaurant verticals. Integrate MoMo/ZaloPay. Begin e-invoice compliance work.
**Success probability: 75%** given current momentum, architecture quality, and the severity of security issues that need immediate resolution. The architecture is excellent — the gaps are in security hygiene, test coverage, and last-mile product completeness.
---
*Report compiled from 15 specialist audits on 2026-03-20*
*Overall Health Score: **5.5 / 10** — Strong architecture, critical security debt, incomplete features*

255
docs/audit/api-architect.md Normal file
View File

@@ -0,0 +1,255 @@
# API Architecture Audit — GoodGo POS System
**Auditor:** API Architect
**Date:** 2026-03-20
**Scope:** API design consistency, OpenAPI specs, versioning, Traefik routing, endpoint conventions
**Services Audited:** 26 microservices, 121 controllers
---
## Executive Summary
The GoodGo platform has a solid architectural foundation rooted in Clean Architecture and CQRS, but the API layer suffers from **fragmented versioning, inconsistent response formats, and incomplete gateway routing**. The IAM service sets a high standard that the majority of other services have not followed. These gaps create compounding developer experience issues: the POS frontend already compensates with a "4-format smart deserializer" — a symptom that the API contract is not being enforced. Fixing response format and versioning consistency is the highest-leverage action available.
---
## Critical Issues
### CRIT-1: Response Format Fragmentation
Three incompatible response patterns exist across the platform, making contract-first SDK generation impossible and forcing clients to implement defensive deserialization.
**Pattern A — IAM Service (correct):**
```json
{ "success": true, "data": { ... } }
{ "success": false, "error": { "code": "INVALID_TOKEN", "message": "...", "details": { ... } } }
```
Source: `services/iam-service-net/src/IamService.API/Application/Common/ApiResponse.cs`
**Pattern B — Order / Catalog / Template (partially correct):**
```json
{ "success": true, "data": { ... } }
{ "success": false, "error": { "code": "...", "message": "..." } }
```
Using anonymous objects rather than typed `ApiResponse<T>`.
- `services/order-service-net/src/OrderService.API/Controllers/OrdersController.cs:136,139`
- `services/_template_dot_net/src/MyService.API/Controllers/SamplesController.cs:38-87`
**Pattern C — Merchant Service (broken):**
```json
{ ... } // raw entity, no wrapper
{ "message": "Merchant profile not found..." } // no success/error structure
```
Source: `services/merchant-service-net/src/MerchantService.API/Controllers/MerchantsController.cs:44,46`
**Impact:** Frontend `PosDataService` implements 4-format deserialization to compensate. SDK auto-generation (openapi-generator/orval) is blocked. Error codes are not surfaced to the UI consistently.
---
### CRIT-2: API Versioning Inconsistency
Three versioning patterns found across the platform. Only IAM and Catalog services are version-future-proof.
**Pattern A — `Asp.Versioning` with dynamic URL + header (correct):**
```csharp
[ApiVersion("1.0")]
[Route("api/v{version:apiVersion}/auth")]
```
Services: `iam-service-net`, `catalog-service-net`, `storage-service-net`
Config: `services/iam-service-net/src/IamService.API/Program.cs:46-59`
**Pattern B — Hardcoded `api/v1` without `[ApiVersion]` attribute:**
```csharp
[Route("api/v1/merchants")] // no version negotiation
```
Services: `merchant-service-net`, `order-service-net`, `inventory-service-net`, `wallet-service-net`, `booking-service-net`
- `services/merchant-service-net/src/MerchantService.API/Controllers/MerchantsController.cs:18-19`
- `services/order-service-net/src/OrderService.API/Controllers/OrdersController.cs:17-18`
- `services/inventory-service-net/src/InventoryService.API/Controllers/InventoryController.cs:18`
**Pattern C — No version at all:**
```csharp
[Route("api/[controller]")]
```
Services: `chat-service-net`, some `fnb-engine-net` controllers
- `services/chat-service-net/src/ChatService.API/Controllers/MessagesController.cs:14`
**Impact:** No backward-compatibility guarantee. A v2 rollout requires touching all clients simultaneously. `X-Api-Version` header negotiation only works for Pattern A services.
---
### CRIT-3: Missing Traefik Gateway Routes
11 services (out of 26) have no entry in `infra/traefik/dynamic/routes.yml`. These services are either inaccessible from outside the Docker network or require clients to hardcode internal URLs — both are unacceptable.
**Missing services:**
- `chat-service-net` (SignalR `/hubs/chat` also not routed)
- `promotion-service-net`
- `social-service-net`
- `mining-service-net`
- `mission-service-net`
- `ads-manager-service-net`
- `ads-serving-service-net`
- `ads-tracking-service-net`
- `ads-billing-service-net`
- `ads-analytics-service-net`
- `mkt-facebook-service-net`, `mkt-whatsapp-service-net`, `mkt-x-service-net`, `mkt-zalo-service-net`
**Note:** The `CLAUDE.md` lists these services with full implementation. Their absence from the gateway config suggests the routes file is out of sync with actual deployment.
---
## Warnings
### WARN-1: OpenAPI Spec Coverage is Incomplete
No `.yaml` or `.json` OpenAPI spec files exist in the repository. Swagger UI is configured on some services but not all:
| Service | Swagger Configured |
|---|---|
| `iam-service-net` | ✅ Full (with SwaggerTag, SwaggerOperation, SwaggerResponse) |
| `order-service-net` | ✅ Basic |
| `catalog-service-net` | ✅ With versioning support |
| `merchant-service-net` | ❌ Not found |
| `inventory-service-net` | ❌ Not found |
| `chat-service-net` | ❌ Not found |
| Ads/Marketing services | ❌ Not found |
No central API documentation aggregator is configured. Traefik does not expose a unified Swagger UI. As a result, there is no way for frontend teams or external integrators to discover the full API surface without reading source code.
---
### WARN-2: Authorization Pattern Fragmentation
Three authorization patterns are in use with no consistent policy layer:
- **Attribute-based:** `[Authorize]` — checks token existence only
`services/merchant-service-net/.../MerchantsController.cs:20`
- **Role-based:** `[Authorize(Roles = "Admin,SuperAdmin")]`
`services/merchant-service-net/.../Admin/AdminMerchantsController.cs:18`
- **AllowAnonymous:** health endpoints
`services/wallet-service-net/.../HealthController.cs:13`
**No Policy-Based Authorization** (`[Authorize(Policy = "...")]`) is used anywhere. JWT scope validation is disabled in the IAM service configuration:
```csharp
ValidateAudience = false, // services/iam-service-net/src/IamService.API/Program.cs
```
This means a token issued for one service's scope is silently accepted by all services.
---
### WARN-3: Health Check Implementation is Non-Standard
Only `wallet-service-net` has an explicit `HealthController` with `/api/v1/health/live` and `/api/v1/health/ready`. The majority of services rely on implicit framework behavior and register health checks in `Program.cs` without exposing a documented endpoint.
The docker-compose healthcheck command (`curl -f http://localhost:8080/health/live`) works only if each service individually exposes that path — which is inconsistent.
No standardized health check response format exists. Kubernetes readiness/liveness probes in staging (`deployments/staging/kubernetes/`) point to `/health/ready` and `/health/live` — if a service doesn't expose these, probes will fail silently.
---
### WARN-4: SignalR Hub Route Fragmentation
The Order service registers a WebSocket hub (`/hubs/pos`) and includes query-string token handling:
```csharp
// services/order-service-net/src/OrderService.API/Program.cs:216-226
var accessToken = context.Request.Query["access_token"];
if (!string.IsNullOrEmpty(accessToken) && path.StartsWithSegments("/hubs"))
context.Token = accessToken;
```
Traefik routes `/hubs/pos` to the order-service with a dedicated `hub-ratelimit` middleware. However, `chat-service-net` likely has its own hub (`/hubs/chat`) which is **not routed through Traefik at all**.
CORS headers on the `hub-ratelimit` middleware also need to be verified — WebSocket upgrades require permissive CORS preflight handling.
---
### WARN-5: Rate Limit Middlewares Not Verified
`routes.yml` references three rate limit middleware names (`auth-ratelimit`, `api-ratelimit`, `payment-ratelimit`, `hub-ratelimit`) but the corresponding definitions in `infra/traefik/dynamic/middlewares.yml` were not verified to be complete or consistent with the referenced names. A typo or missing definition causes Traefik to silently drop the middleware.
---
## Improvements
### IMP-1: Extract `ApiResponse<T>` to Shared Package
Move `services/iam-service-net/src/IamService.API/Application/Common/ApiResponse.cs` to a NuGet package (`GoodGo.Shared.ApiResponse`) referenced by all services. This enforces uniform serialization at compile time.
```
packages/
GoodGo.Shared.ApiResponse/
ApiResponse.cs
ApiError.cs
PaginationInfo.cs
```
### IMP-2: Standardize API Versioning via `_template_dot_net`
Update `services/_template_dot_net/src/MyService.API/Program.cs` to include `Asp.Versioning` configuration as the reference default. All services should include the same versioning setup during scaffolding:
```csharp
builder.Services.AddApiVersioning(options => {
options.DefaultApiVersion = new ApiVersion(1, 0);
options.AssumeDefaultVersionWhenUnspecified = true;
options.ReportApiVersions = true;
options.ApiVersionReader = ApiVersionReader.Combine(
new UrlSegmentApiVersionReader(),
new HeaderApiVersionReader("X-Api-Version"));
});
```
### IMP-3: Add a HealthController to `_template_dot_net`
Create a standard `HealthController` in the template that all services inherit:
- `GET /health/live` — always returns 200 (container alive)
- `GET /health/ready` — checks PostgreSQL + Redis connections, returns 200/503
### IMP-4: OpenAPI Spec Export in CI/CD
Add a CI job that runs `dotnet swagger tofile` for each service and stores the output as a build artifact. This creates the foundation for:
- Client SDK generation via `orval` or `openapi-generator`
- Contract testing via `pact.js`
- Central API documentation via Redoc or Scalar
### IMP-5: Add Policy-Based Authorization
Define authorization policies in a shared middleware and register them per-service. Minimum set:
- `RequireAuthenticated` — replaces bare `[Authorize]`
- `RequireMerchantAccess` — validates `merchant_id` claim matches route param
- `RequireAdminRole` — replaces `[Authorize(Roles = "Admin,SuperAdmin")]`
Enable audience validation once scopes are defined per service.
### IMP-6: Complete Traefik Routes with Route Priority Planning
Add all 11 missing services to `infra/traefik/dynamic/routes.yml`. Follow the existing pattern and assign route prefixes:
| Service | Suggested Prefix |
|---|---|
| `promotion-service-net` | `/api/v1/promotions`, `/api/v1/vouchers` |
| `social-service-net` | `/api/v1/social`, `/api/v1/feed` |
| `mission-service-net` | `/api/v1/missions`, `/api/v1/rewards` |
| `mining-service-net` | `/api/v1/analytics` (internal only, consider IP allowlist) |
| `chat-service-net` | `/api/v1/chat`, `/hubs/chat` |
| `ads-*-service-net` | `/api/v1/ads/*` (sub-paths per service) |
| `mkt-*-service-net` | `/api/v1/mkt/*` (sub-paths per channel) |
---
## Action Items
| Priority | Action | Owner | Files |
|---|---|---|---|
| P0 | Unify response format — extract `ApiResponse<T>` to shared package | Backend Lead | `services/iam-service-net/.../ApiResponse.cs``packages/GoodGo.Shared.ApiResponse/` |
| P0 | Fix `merchant-service-net` controllers to use `ApiResponse<T>` wrapper | Backend Dev | `services/merchant-service-net/.../MerchantsController.cs` |
| P0 | Add missing Traefik routes for all 11 unrouted services | DevOps | `infra/traefik/dynamic/routes.yml` |
| P1 | Standardize API versioning — update `_template_dot_net` and migrate non-IAM services | Backend Lead | `services/_template_dot_net/src/MyService.API/Program.cs` |
| P1 | Add `HealthController` to `_template_dot_net`, backfill all services | Backend Dev | `services/_template_dot_net/src/MyService.API/Controllers/HealthController.cs` |
| P1 | Add `AddSwaggerGen` to all services missing it; export specs in CI | DevOps | All missing `Program.cs` files |
| P2 | Define and implement Policy-Based Authorization with scope validation | IAM Dev | `services/iam-service-net/` + all service `Program.cs` |
| P2 | Verify `middlewares.yml` has all 4 rate limit middleware definitions | DevOps | `infra/traefik/dynamic/middlewares.yml` |
| P2 | Route `/hubs/chat` through Traefik with correct CORS/upgrade config | DevOps | `infra/traefik/dynamic/routes.yml` |
| P3 | Set up OpenAPI spec export in CI and configure Redoc/Scalar for central docs | DevOps | `.github/workflows/` |
| P3 | Add contract tests (Pact.js) for cross-service boundaries | QA | `tests/ContractTests/` |

228
docs/audit/architect.md Normal file
View File

@@ -0,0 +1,228 @@
# Audit Report — Architecture Design System (Architect Agent)
**Date**: 2026-03-20
**Auditor**: Architect (Architecture Design System Specialist)
**Scope**: Design system patterns, architecture consistency, component library
**Codebase**: `/Users/velikho/Desktop/WORKING/pos-system`
---
## Executive Summary
The GoodGo POS System has a **solid foundational design system** built on MudBlazor 8.15 + CSS Custom Properties with a well-structured Primitives → Semantics → Components token architecture. The main app (`web-client-tpos-net`) demonstrates mature theming, bilingual i18n, and responsive layout patterns. However, the design system lives entirely within a single app — there is **no shared UI package**, no Storybook, and no cross-platform token synchronization. Gaps in accessibility (ARIA), no light/dark mode flexibility, and fragmented component ownership across apps are the primary concerns requiring remediation.
---
## Critical Issues
### C-1: No Shared UI Component Package
**Impact**: Blockers for cross-app reuse and design consistency.
Components in `apps/web-client-tpos-net/src/WebClientTpos.Client/Components/` are **not exported** as a shared package. The enterprise portal (`web-client-base-net`) and any future Blazor apps must re-implement their own versions of `AuthButton`, `AuthCard`, `OtpInput`, etc.
- Components folder: `apps/web-client-tpos-net/src/WebClientTpos.Client/Components/` (only 2 subdirs: `Auth/`, `Pos/`)
- No `packages/@goodgo/ui-kit` or equivalent exists
- TypeScript packages at `packages/` do not include any Blazor/UI package
**Fix**: Extract reusable components into a shared Blazor component library (Razor Class Library), published as a NuGet package or mono-repo project reference.
---
### C-2: ARIA / Accessibility Gaps in Custom Components
**Impact**: WCAG 2.1 AA violation risk.
The custom component `AuthButton.razor` renders a `<button>` without `aria-label` when `ChildContent` is absent and only an icon is rendered. The `OtpInput.razor` renders 6 `<input>` fields without `aria-label` or `aria-describedby`. The `PosLayout.razor` sidebar toggle buttons use only `title=` for label — insufficient for screen readers.
Evidence:
- `AuthButton.razor`: `<button ... @onclick="OnClick">` — no `aria-label` parameter, no `aria-busy` for loading state
- `OtpInput.razor`: `<input id="otp-@index" ...>` — no `<label>`, no `aria-label="Digit @(index+1) of 6"`
- `PosLayout.razor:30`: `<button class="pos-mobile-toggle" ... title="Menu">` — only `title=`, no `aria-expanded`, no `aria-controls`
- Only `MainLayout.razor:31` has a single `aria-label="Toggle menu"` — the only ARIA attribute in all layouts
**Fix**: Add ARIA attributes to all interactive custom components. Minimum: `aria-label` on icon-only buttons, `aria-label` on OTP inputs, `aria-expanded`/`aria-controls` on toggles.
---
### C-3: No Design-to-Code Token Synchronization
**Impact**: Token drift between design files and implementation.
The CSS comment in `app.css` references `pencil-design/src/pages/aPOS/landing/` tokens, but there is no automated pipeline to sync Figma/Pencil variables → CSS custom properties. Design tokens exist only in:
- `app.css` (CSS custom properties, 1983 lines)
- `AppTheme.cs` (MudBlazor PaletteDark, partially duplicated)
The two token stores are **manually maintained and can drift**. `AppTheme.cs` references `#1A1A1D` for `Surface` while `app.css` defines `--bg-elevated: #1A1A1D` — same value, different names, no automated cross-check.
**Fix**: Implement Style Dictionary or Tokens Studio pipeline. Single source of truth → generates both CSS vars and C# constants.
---
## Warnings
### W-1: Theme Architecture Duplication Between `AppTheme.cs` and `app.css`
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/AppTheme.cs` + `wwwroot/css/app.css`
The same color values are defined twice: once in `PaletteDark` for MudBlazor and once as CSS custom properties. When a brand color changes (e.g., accent from `#FF5C00` to a new value), it must be updated in **two places** manually.
Count of duplicated values:
- Background: `#0A0A0B` appears in both `AppTheme.cs:Background` and `app.css:--bg-page`
- Surface: `#1A1A1D` appears in `AppTheme.cs:Surface` AND `app.css:--bg-elevated`
- Primary: `#FF5C00` in `AppTheme.cs:Primary` and `app.css:--accent-primary`
---
### W-2: `eval()` Usage in OtpInput for DOM Focus Management
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Components/Auth/OtpInput.razor`
```csharp
await JS.InvokeVoidAsync("eval", $"document.getElementById('otp-{index + 1}')?.focus()");
```
Using `eval()` for focus management is a security smell (Content Security Policy violation risk) and bad practice. Should use a dedicated JS interop function in a `*.js` or `*.ts` file.
---
### W-3: No Responsive Breakpoint Tokens
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/wwwroot/css/app.css`
The design token system in `app.css` defines spacing, colors, typography, and border radius — but **no breakpoint variables**. Responsive behavior in `pos.css`, `admin.css`, etc. uses hardcoded `@media (max-width: 768px)` values, not token references.
This makes it impossible to update the tablet breakpoint globally without a grep-and-replace across all CSS files.
---
### W-4: Component Library Scope Is Too Narrow
**Current**: Only 8 custom components across 2 subdirectories (`Auth/`, `Pos/`).
The 23+ pages in `Pages/Admin/Shop/` (e.g., `ShopMenu.razor`, `ShopTables.razor`, `ShopStaff.razor`) rely entirely on raw MudBlazor primitives with no intermediate abstraction layer. There are no shared:
- Data table with pagination pattern
- Confirmation dialog component
- Status badge component
- Empty state component
- Loading skeleton component
This leads to duplicated MudBlazor boilerplate across all admin pages.
---
### W-5: Scoped CSS Is Underused
**Evidence**: Only 2 scoped CSS files exist:
- `Layout/MainLayout.razor.css`
- `Layout/PosLayout.razor.css`
All other components use global CSS class names in `app.css`, `admin.css`, `pos.css`, `auth.css`, creating a fragile global namespace. Class naming follows BEM-like patterns (`auth-btn`, `auth-btn--@Variant`) but without `.razor.css` enforcement, naming conflicts are a maintenance risk.
---
### W-6: Marketing Module Uses a Separate Theme — Risk of Visual Drift
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/AppTheme.cs`
`MarketingDark` uses a completely different primary color (`#FACC15` yellow) and slightly different background (`#18181B`). The marketing module is visually disconnected from the rest of aPOS. This may be intentional branding, but there is no documented rationale or design system ADR explaining this split.
The `--bg-page` CSS variable is `#0A0A0B` (shared across all layouts), while `MarketingDark.Background = "#18181B"` — MudBlazor components under `MarketingLayout` will use `#18181B`, but any custom CSS components using `var(--bg-page)` will still render `#0A0A0B`.
---
### W-7: No TypeScript Shared Packages for Blazor Apps
**File**: `packages/` directory
The 6 shared TypeScript packages (`@goodgo/types`, `@goodgo/http-client`, `@goodgo/auth-sdk`, etc.) serve the MCP server and Node.js ecosystem. There is no equivalent shared infrastructure for Blazor:
- No shared NuGet package for DTOs
- `WebClientTpos.Shared` is app-scoped and not importable by `web-client-base-net`
- `WebClientBase.Shared` is a separate, parallel DTOs structure
Both apps duplicate `UserDto`, `ApiResponse<T>`, etc. independently.
---
## Improvements
### I-1: Establish `@goodgo/blazor-ui` Razor Class Library
Extract into a shared Razor Class Library (RCL):
- `AuthButton` → parameterized variant system
- `AuthCard`, `AuthInput`, `OtpInput` → auth primitives
- `ResponsiveOrderPanel` → already abstracted
- New: `DataTable<T>`, `ConfirmDialog`, `StatusBadge`, `EmptySate`, `LoadingSkeleton`
This enables `web-client-base-net` and future apps to import the same atoms.
### I-2: Style Dictionary Token Pipeline
Implement Style Dictionary with two outputs:
- `wwwroot/css/tokens.css` — CSS custom properties (replaces the token section of `app.css`)
- `DesignTokens.cs` — C# static class with all token values (replaces `AppTheme.cs` hardcoded strings)
This eliminates duplication between CSS vars and MudBlazor palette config.
### I-3: Replace `eval()` Focus Management with JS Module
Create `wwwroot/js/otp-helpers.js`:
```js
export function focusOtpInput(index) {
document.getElementById(`otp-${index}`)?.focus();
}
```
Then use `JS.InvokeVoidAsync("focusOtpInput", index + 1)` via JSImport or standard Blazor JS interop.
### I-4: Add Responsive Breakpoint Tokens
Extend `app.css :root {}` with:
```css
--breakpoint-sm: 640px;
--breakpoint-md: 768px;
--breakpoint-lg: 1024px;
--breakpoint-xl: 1280px;
```
Replace all hardcoded `@media (max-width: 768px)` occurrences with references.
### I-5: Add Storybook (or Blazorise Gallery)
Since Storybook does not natively support Blazor, consider:
- **Option A**: Storybook with `@storybook/web-components` for design token showcase (colors, typography, spacing)
- **Option B**: A `/gallery` route within the app itself that renders all component variants (similar to `/storybook` pattern for Blazor)
- **Option C**: Blazor Story tool (`BlazorStories` NuGet)
This documents variants, states (loading, disabled, error) and serves as living style guide.
### I-6: WCAG 2.1 AA Accessibility Pass
Systematic pass over all custom components:
- `AuthButton`: Add `[Parameter] string? AriaLabel`, `aria-busy="@Loading"`
- `OtpInput`: Add `aria-label="Mã OTP, chữ số @(index+1) trên @DigitCount"` to each input
- All layout toggle buttons: Add `aria-expanded`, `aria-controls`, `aria-label`
- Add `axe-core` Playwright accessibility assertions to E2E test suite
### I-7: Document the Marketing Dual-Theme Decision
Create ADR (Architecture Decision Record) at `docs/adr/001-marketing-dual-theme.md` explaining:
- Why `MarketingDark` deviates from `DefaultDark`
- How CSS variable conflicts between layouts should be resolved
- When/if a light mode should be considered
---
## Action Items
| # | Priority | Task | Owner | File |
|---|----------|------|-------|------|
| 1 | **P0** | Fix `eval()` in `OtpInput.razor` — replace with proper JS interop module | Frontend Dev | `Components/Auth/OtpInput.razor` |
| 2 | **P0** | Add ARIA attributes to `AuthButton`, `OtpInput`, layout toggle buttons | Frontend Dev | `Components/Auth/*.razor`, `Layout/PosLayout.razor` |
| 3 | **P1** | Extract `AuthButton`, `AuthCard`, `AuthInput`, `OtpInput` into shared RCL | Architect + Frontend Dev | New: `packages/blazor-ui/` |
| 4 | **P1** | Add responsive breakpoint tokens to `app.css` and refactor media queries | Frontend Dev | `wwwroot/css/app.css`, `*.css` files |
| 5 | **P1** | Unify DTO sharing: Create `GoodGo.Shared` NuGet accessible to both web apps | Backend Dev | New: `packages/dotnet-shared/` |
| 6 | **P2** | Implement Style Dictionary pipeline for token single source of truth | Architect | New: `packages/design-tokens/` |
| 7 | **P2** | Create shared admin atom components (`DataTable`, `StatusBadge`, `EmptyState`) | Frontend Dev | `Components/Admin/` (new) |
| 8 | **P2** | Add `axe-core` a11y assertions to Playwright E2E suite | QA | `tests/WebClientTpos.E2ETests/` |
| 9 | **P2** | Write ADR for Marketing dual-theme decision | Architect | `docs/adr/001-marketing-dual-theme.md` |
| 10 | **P3** | Set up Blazor component gallery page at `/gallery` for living style guide | Frontend Dev | New: `Pages/Gallery.razor` |
---
## Appendix: Key File Inventory
| Category | File Path |
|----------|-----------|
| Design Tokens (CSS) | `apps/web-client-tpos-net/src/WebClientTpos.Client/wwwroot/css/app.css` |
| MudBlazor Theme Config | `apps/web-client-tpos-net/src/WebClientTpos.Client/AppTheme.cs` |
| Auth Components | `apps/web-client-tpos-net/src/WebClientTpos.Client/Components/Auth/` |
| POS Components | `apps/web-client-tpos-net/src/WebClientTpos.Client/Components/Pos/` |
| Layout System | `apps/web-client-tpos-net/src/WebClientTpos.Client/Layout/` |
| Admin Pages | `apps/web-client-tpos-net/src/WebClientTpos.Client/Pages/Admin/` |
| Localization | `apps/web-client-tpos-net/src/WebClientTpos.Client/wwwroot/locales/` |
| i18n Implementation | `apps/web-client-tpos-net/src/WebClientTpos.Client/Localization/` |
| Shared TypeScript Packages | `packages/` (types, http-client, auth-sdk, logger, tracing, config) |
| E2E Tests | `apps/web-client-tpos-net/tests/WebClientTpos.E2ETests/` |
| VitePress Docs | `apps/web-docs/` |

283
docs/audit/backend.md Normal file
View File

@@ -0,0 +1,283 @@
# Backend Audit Report — GoodGo POS System
> **Date**: 2026-03-20
> **Auditor**: Senior Backend Engineer (Agent 98eaaecb)
> **Scope**: .NET service implementation, CQRS patterns, MediatR usage, EF Core, domain model
> **Branch**: master (d0211e5)
---
## Executive Summary
Platform có 26 .NET 10 microservices với kiến trúc Clean Architecture + CQRS nhất quán và chất lượng code cao. Naming conventions, domain patterns, và bilingual documentation được áp dụng đồng bộ trên tất cả services. **Tuy nhiên, 3 vấn đề critical về security và 4 vấn đề high về consistency cần xử lý trước khi deploy staging.** Điểm nổi bật nhất là debug endpoints không có auth (privilege escalation) và SQL injection tồn tại trong merchant-service-net — cả hai đều là BLOCKER.
---
## Critical Issues
### C1 — Debug Endpoints `[AllowAnonymous]` trong MerchantService
**File**: `services/merchant-service-net/src/MerchantService.API/Controllers/StaffController.cs:249390`
**Severity**: BLOCKER — bất kỳ ai cũng có thể gọi
5 endpoints không có authentication:
- `POST /api/v1/staff/debug/seed` (line 249) — tạo staff data tùy ý
- `POST /api/v1/staff/debug/update-userid` (line 340) — ghi đè userId của bất kỳ staff qua reflection
- `POST /api/v1/staff/debug/update-merchant` (line 360) — ghi đè merchantId tùy ý
Đây là **privilege escalation + data tampering vulnerability** cực kỳ nghiêm trọng. Bất kỳ attacker nào cũng có thể leo thang quyền hoặc phá vỡ dữ liệu merchant trong production.
**Fix ngay**: Xóa hoàn toàn các debug endpoints này khỏi production code, hoặc wrap trong `if (env.IsDevelopment())` và thêm `[Authorize(Roles = "SuperAdmin")]`.
---
### C2 — SQL Injection trong StaffController
**File**: `services/merchant-service-net/src/MerchantService.API/Controllers/StaffController.cs:307, 367`
**Severity**: HIGH — string interpolation trực tiếp vào SQL
```csharp
// Line 307
cmd.CommandText = $"SELECT id FROM shops WHERE merchant_id = '{merchant.Id}' LIMIT 1";
// Line 367 — userId, merchantId từ [FromQuery] — SQL INJECTION RISK
cmd.CommandText = $"UPDATE merchants SET user_id = '{userId}' WHERE id = '{merchantId}'";
```
`Guid` tự sanitize ở thời điểm hiện tại, pattern string interpolation trong SQL là vi phạm security baseline. Bất kỳ refactor nào đổi type sang string sẽ tức thì trở thành lỗ hổng.
**Fix**: Dùng parameterized queries:
```csharp
cmd.CommandText = "UPDATE merchants SET user_id = @userId WHERE id = @merchantId";
cmd.Parameters.AddWithValue("@userId", userId);
cmd.Parameters.AddWithValue("@merchantId", merchantId);
```
---
### C3 — JWT Signature Validation Bị Tắt trong Development — 4 Services
**Files** (confirmed bằng grep toàn repo):
- `services/merchant-service-net/src/MerchantService.API/Program.cs`
- `services/ads-serving-service-net/src/AdsServingService.API/Program.cs`
- `services/ads-tracking-service-net/src/AdsTrackingService.API/Program.cs`
- `services/ads-billing-service-net/src/AdsBillingService.API/Program.cs`
**Severity**: HIGH — token forgery risk nếu `ASPNETCORE_ENVIRONMENT` bị misconfigure
```csharp
// Pattern này tồn tại trong cả 4 services
ValidateIssuerSigningKey = builder.Environment.IsDevelopment() ? false : true,
SignatureValidator = builder.Environment.IsDevelopment() ? ... : null,
```
Nếu `ASPNETCORE_ENVIRONMENT` không được set đúng trên server (hoặc bị override sai), tất cả JWT signatures đều bị bỏ qua — bất kỳ token tự tạo nào đều được chấp nhận. Nguy hiểm đặc biệt với `ads-billing-service-net` (xử lý billing).
> **Note**: `iam-service-net/tests/CustomWebApplicationFactory.cs` cũng có `ValidateIssuerSigningKey = false` nhưng đây là test context — chấp nhận được.
**Fix**: Dùng test signing key trong development thay vì skip validation hoàn toàn:
```csharp
ValidateIssuerSigningKey = true, // Luôn validate — kể cả development
// Development: dùng shared test key từ env var DEV_JWT_SIGNING_KEY
```
---
## Warnings
### W1 — CORS Policy `AllowAnyOrigin()` trên tất cả Services
**File**: Mọi service `Program.cs`, ví dụ `services/_template_dot_net/src/MyService.API/Program.cs:93`
```csharp
policy.AllowAnyOrigin().AllowAnyMethod().AllowAnyHeader();
```
Tất cả 26 services đều cấu hình CORS wildcard. Trong production, điều này cho phép bất kỳ domain nào gọi API — vi phạm same-origin policy intent.
**Fix**: Restrict origins trong production environment:
```csharp
if (env.IsProduction())
policy.WithOrigins("https://goodgo.vn", "https://admin.goodgo.vn")
.WithMethods("GET", "POST", "PUT", "DELETE", "PATCH")
.WithHeaders("Content-Type", "Authorization");
else
policy.AllowAnyOrigin().AllowAnyMethod().AllowAnyHeader();
```
---
### W2 — Idempotency Chỉ Có ở 3/26 Services
**Services với idempotency**: order-service-net, inventory-service-net, ads-serving-service-net
**Services thiếu**: wallet-service-net, booking-service-net, promotion-service-net, và 19 services khác
Các write operations trong wallet, booking, promotion có nguy cơ cao bị duplicate execution (network retry, user double-click). Thiếu `IRequestManager` idempotency cho phép double-charge, double-booking.
**Fix**: Implement `IRequestManager` pattern (đã có trong `_template_dot_net`) cho tất cả write-heavy services:
- Ưu tiên cao nhất: `wallet-service-net`, `booking-service-net`
- Ưu tiên trung bình: `promotion-service-net`, `membership-service-net`
---
### W3 — API Error Response Format Không Nhất Quán
**File**: `services/order-service-net/src/OrderService.API/Controllers/OrdersController.cs:73`
Một số controllers trả về raw message object thay vì format chuẩn:
```csharp
// Inconsistent — dùng ở một số controllers
return NotFound(new { Message = "Order not found" });
// Standard theo CLAUDE.md — cần dùng nhất quán
return NotFound(new { success = false, error = new { code = "ORDER_NOT_FOUND", message = "..." } });
```
**Fix**: Đồng nhất tất cả error response theo format: `{ success: false, error: { code, message } }`. Tốt nhất dùng exception middleware để map `DomainException` → ProblemDetails tự động, tránh manual error construction trong controllers.
---
### W4 — ProblemDetails Exception Mapping Không Đầy Đủ ở Một Số Services
**Reference (tốt)**: `services/iam-service-net/src/IamService.API/Program.cs` — map 5+ exception types
**Template thiếu**: `services/_template_dot_net/src/MyService.API/Program.cs` — chỉ map cơ bản
Các services mới tạo từ template sẽ không có exception mapping đầy đủ, dẫn đến stack trace leak hoặc generic 500 errors.
**Fix**: Cập nhật `_template_dot_net` với exception mappings đầy đủ. Sau đó sync cho các services đã tạo từ template.
---
### W5 — HttpContextAccessor Inject Trực Tiếp vào Handler
**File**: `services/merchant-service-net/src/MerchantService.API/Program.cs:31`
```csharp
builder.Services.AddHttpContextAccessor(); // OK
// Nhưng một số handlers inject IHttpContextAccessor trực tiếp
```
Truy cập HttpContext trong MediatR handler vi phạm clean boundary giữa application layer và web layer. Handler không nên biết về HTTP context.
**Fix**: Inject contextual data (userId, tenantId) vào Command/Query parameter khi dispatch từ Controller:
```csharp
// Controller
var userId = User.FindFirstValue(ClaimTypes.NameIdentifier);
await _mediator.Send(new CreateOrderCommand(userId, shopId, items), cancellationToken);
```
---
## Improvements
### I1 — Outbox Pattern cho Domain Events
**Hiện tại**: Domain events được dispatch in-process trong `SaveEntitiesAsync()` trước SaveChanges. Nếu process crash sau SaveChanges, events đã bị lost.
**Recommendation**: Implement transactional outbox pattern — persist events vào `outbox_messages` table trong cùng transaction, sau đó poll/relay sang RabbitMQ. Tham khảo: MassTransit Outbox, Wolverine Outbox.
**Impact**: Đảm bảo at-least-once delivery cho cross-service events — quan trọng cho wallet-service, booking-service, promotion-service.
---
### I2 — Missing OpenAPI Specs (openapi.yaml)
Theo `CLAUDE.md`, mỗi service PHẢI có `openapi.yaml`. Hiện tại chỉ có Swagger UI từ Swashbuckle (runtime generated), không có static `openapi.yaml` file được commit vào repo.
**Impact**: Không thể dùng API-first contract testing (Pact), không có versioning cho API spec.
**Fix**: Thêm script generate `openapi.yaml` vào CI pipeline:
```bash
dotnet swagger tofile --output openapi.yaml ./bin/Release/net10.0/ServiceName.API.dll v1
```
---
### I3 — Missing Prometheus Metrics Endpoint
Theo `CLAUDE.md`, mỗi service PHẢI có `/metrics` endpoint cho Prometheus. Chưa thấy `prometheus-net` hoặc `OpenTelemetry.Exporter.Prometheus.AspNetCore` được configure trong services.
**Fix**: Thêm vào `_template_dot_net` và sync:
```csharp
builder.Services.AddOpenTelemetry()
.WithMetrics(metrics => metrics
.AddAspNetCoreInstrumentation()
.AddPrometheusExporter());
app.MapPrometheusScrapingEndpoint("/metrics");
```
---
### I4 — Saga Pattern cho Cross-Service Transactions
Hiện tại order placement phụ thuộc vào nhiều services (inventory, wallet, promotion) nhưng không có orchestration nếu một service fail mid-flow. Tình huống: wallet bị trừ nhưng inventory deduct fail → inconsistent state.
**Recommendation**: Implement Saga orchestration (dùng MassTransit Saga hoặc Wolverine) cho Order Placement flow. Đây là medium-term investment nhưng cần thiết trước khi scale.
---
### I5 — Dapper Queries Không Có Timeout Explicit
Một số Dapper queries (read-heavy) không set `commandTimeout`, dùng default 30s của PostgreSQL. Với queries phức tạp trên dataset lớn, điều này có thể block connection pool.
**Fix**: Set explicit timeout cho Dapper queries:
```csharp
var result = await _connection.QueryAsync<T>(sql, parameters, commandTimeout: 5);
```
---
## Action Items (Prioritized)
| # | Priority | Action | Owner | File/Service |
|---|----------|--------|-------|-------------|
| 1 | **P0 BLOCKER** | Xóa debug endpoints `[AllowAnonymous]` khỏi StaffController | Backend Dev | `merchant-service-net/StaffController.cs:249390` |
| 2 | **P0 BLOCKER** | Fix SQL injection — dùng parameterized queries | Backend Dev | `merchant-service-net/StaffController.cs:307, 367` |
| 3 | **P0 HIGH** | Fix JWT signature validation — không skip trong development | Backend Dev | `merchant-service-net/Program.cs:91` |
| 4 | **P1 HIGH** | Restrict CORS origins trong production environment | Backend Dev | Tất cả services `Program.cs` |
| 5 | **P1 HIGH** | Implement idempotency cho wallet-service, booking-service | Backend Dev | `wallet-service-net`, `booking-service-net` |
| 6 | **P1 HIGH** | Đồng nhất API error response format (`{ success, error: { code, message } }`) | Backend Dev | Tất cả controllers |
| 7 | **P2 MEDIUM** | Cập nhật `_template_dot_net` với đầy đủ ProblemDetails exception mapping | Backend Dev | `_template_dot_net/Program.cs` |
| 8 | **P2 MEDIUM** | Refactor HttpContextAccessor ra khỏi handlers — inject qua Command params | Backend Dev | `merchant-service-net` handlers |
| 9 | **P2 MEDIUM** | Thêm `/metrics` Prometheus endpoint vào tất cả services | Backend Dev | Tất cả `Program.cs` |
| 10 | **P2 MEDIUM** | Generate và commit `openapi.yaml` trong CI pipeline | DevOps/Backend | CI `.github/workflows` |
| 11 | **P3 LOW** | Set explicit Dapper query timeout | Backend Dev | Tất cả services dùng Dapper |
| 12 | **P3 LOW** | Implement Outbox pattern cho domain events | Backend Dev | Tất cả services có cross-service events |
| 13 | **P3 LOW** | Implement Saga orchestration cho Order Placement flow | Backend Dev | `order-service-net`, `wallet-service-net`, `inventory-service-net` |
---
## Strengths
Những patterns đang được áp dụng tốt — nên duy trì:
- **Bilingual codebase**: Tất cả XML docs, comments, log messages có EN + VI — nhất quán 100%
- **Domain encapsulation**: Private fields + read-only properties, zero public setters — đúng chuẩn DDD
- **MediatR pipeline**: 3 behaviors (Logging → Validation → Transaction) áp dụng nhất quán 26/26 services
- **Entity patterns**: Aggregate roots, domain events, behavior methods — clean và testable
- **EF Core config**: snake_case columns, private field mapping, owned collections — đúng theo quy chuẩn
- **FluentValidation**: Bilingual messages, child validation, enum validation — đầy đủ
- **Container security**: Multi-stage Dockerfile, non-root user (dotnetuser:1001), healthcheck — production-ready
- **Test structure**: 50 test projects (25 unit + 25 functional), WebApplicationFactory pattern đúng
- **C# 14 / .NET 10**: Nullable enabled, TreatWarningsAsErrors=true, implicit usings — type-safe
- **Health checks**: `/health`, `/health/live`, `/health/ready` với PostgreSQL probe — đúng chuẩn Kubernetes
- **API versioning**: URL segment + Header reader, consistent v1.0 default — RESTful
- **Repository pattern**: Interface trong Domain, implementation trong Infrastructure — dependency inversion đúng
---
## Services Examined
| Service | Detail Level | Key Findings |
|---------|-------------|--------------|
| `order-service-net` | Deep | CQRS, entity, EF patterns — reference quality |
| `iam-service-net` | Deep | Authorization, exception handling — best ProblemDetails mapping |
| `merchant-service-net` | Deep | **C1 debug endpoints, C2 SQL injection, C3 JWT validation** |
| `fnb-engine-net` | Medium | Validation patterns — good use of RuleForEach |
| `inventory-service-net` | Medium | Idempotency pattern — only one of 3 services with IRequestManager |
| `wallet-service-net` | Medium | Repository implementation, Dapper usage |
| `_template_dot_net` | Deep | Reference architecture — needs update (W4, I2, I3) |
| Other 19 services | Structural | Verified pattern consistency, no deep code review |
**Coverage**: 7 services deep-reviewed (27%), 19 services structural verification (73%)

151
docs/audit/ceo.md Normal file
View File

@@ -0,0 +1,151 @@
# CEO Audit Report — GoodGo POS System
**Date:** 2026-03-20
**Auditor:** CEO Agent
**Scope:** Strategic overview, business alignment, team utilization, project maturity
---
## Executive Summary
GoodGo Platform is in **BETA/PRE-PRODUCTION** phase with 65% service completion (15/23 production-ready) and 93% POS E2E pass rate (38/41 tests). Architecture quality is excellent — Clean Architecture + CQRS strictly enforced across all services. The platform is ready for **staging deployment immediately** and **production pilot within 4-6 weeks**.
---
## Critical Issues
### C1. Low Test Coverage (~15%)
- **Risk:** Production incidents without adequate test safety net
- **Current:** 371 test files but coverage ratio is ~15%
- **Target:** 70% for production readiness
- **Impact:** Blocks production SLA commitments
- **Action:** Dedicate 2 QA engineers for 3-4 weeks
### C2. 5 Services Still In-Progress (22% incomplete)
- `promotion-service-net` — Missing 15-20 Command handlers
- `mission-service-net` — Domain model done, CQRS handlers incomplete
- `inventory-service-net` — Core done, Stock command handlers missing
- `ads-analytics-service-net` — Minimal commands, incomplete aggregation
- `ads-billing-service-net` — Limited command handlers (3 of 8+)
- **Impact:** Cannot deliver full feature coverage for all 5 verticals
- **Action:** 3 backend engineers, 2-3 weeks
### C3. ads-serving-service-net is READ-ONLY
- Zero command handlers — auction logic query-only
- Cannot create/manage ad placements programmatically
- **Impact:** Ads platform non-functional for write operations
---
## Warnings
### W1. Docker Compose Port Conflicts
- 4 marketing services (Facebook, WhatsApp, Zalo, X) all bind to port 5000
- Only ONE can run at a time in local development
- **Fix:** Allocate unique ports (5021-5024) — 2 hour effort
### W2. Missing Kubernetes Manifests
- Staging: 14/23 services (9 missing)
- Production: 12/23 services (11 missing)
- Missing: ads-*, mkt-*, promotion, mission services
- **Fix:** 1 week DevOps effort
### W3. Mobile Apps Incomplete
- **MAUI (Android):** Template phase only — not started
- **SwiftUI (iOS):** Auth flows done, feature screens pending
- **Impact:** No mobile channel for launch
### W4. 21 Separate Databases
- Each service has its own PostgreSQL database (data isolation pattern)
- Trade-off: Excellent isolation vs. complex distributed queries
- No cross-service query or reporting database yet
- **Recommendation:** Add read-replica or analytics database for reporting
---
## Improvements
### I1. Architecture Quality — Excellent
- 100% Clean Architecture compliance across all 23 services
- Domain layers have ZERO external dependencies
- Consistent CQRS pattern: Commands (write) + Queries (read) fully separated
- MediatR pipeline: Logging -> Validation -> Transaction -> Handler
- Bilingual documentation (EN + VI) throughout codebase
### I2. POS System — Production-Ready
- 38/41 E2E tests passing (93%)
- All admin features verified (13 categories)
- All staff features verified (6 categories)
- Full order flow: creation -> payment -> receipt
- Multi-role session isolation working
- 5 vertical support: Karaoke, Restaurant, Cafe, Spa, Retail
### I3. Infrastructure Stack — Comprehensive
- Full Docker Compose (1,411 lines, 25+ services)
- Kubernetes manifests for core services
- CI/CD via GitHub Actions (7 workflows)
- Observability: Prometheus + Grafana + Loki + Promtail
- API Gateway: Traefik v3 with rate limiting, CORS, security headers
### I4. MCP Server — Innovation Differentiator
- AI-assisted F&B operations (12 tools)
- Production-ready v1.0.0 with full documentation
- Unique market differentiator vs. competitors (KiotViet, Sapo, iPOS)
### I5. Development Velocity — Strong
- 20 commits in last 7 days
- Daily iteration cadence
- Active bug fixing and documentation
---
## Action Items (Prioritized)
### P0 — Critical (This Week)
1. **Fix Docker Compose port conflicts** — 2 hours, DevOps
2. **Start completing in-progress services** — 3 backend engineers assigned
### P0 — Critical (This Month)
3. **Complete 5 in-progress services** — Target: 23/23 production-ready by March 31
4. **Increase test coverage to 50%** — 2 QA engineers, 3 weeks
5. **Deploy 15 production-ready services to staging** — DevOps, 1-2 weeks
### P1 — High (April 2026)
6. **Complete remaining K8s manifests** — 9 staging + 11 production
7. **iOS app feature development** — SwiftUI, core POS screens
8. **Test coverage to 70%** — Production readiness threshold
9. **Performance baseline measurement** — Load testing on staging
### P2 — Medium (May 2026)
10. **Production pilot** — 1-2 enterprise merchants (Cafe/Restaurant verticals)
11. **MAUI Android app** — Start feature development
12. **Analytics/reporting database** — Cross-service read replica
13. **API rate limiting per tenant** — Currently per-route only
---
## Project Health Scorecard
| Category | Score | Notes |
|----------|:-----:|-------|
| Architecture | 9/10 | Clean Architecture + CQRS exemplary |
| Code Quality | 8/10 | Strong patterns, bilingual docs |
| Test Coverage | 4/10 | 15% is critically low |
| Infrastructure | 7/10 | Good stack, K8s gaps |
| Documentation | 8/10 | CLAUDE.md, reports, roadmap |
| Development Velocity | 9/10 | High commit rate, active iteration |
| Production Readiness | 5/10 | POS ready, platform incomplete |
| Mobile Readiness | 2/10 | iOS partial, Android template |
| **Overall** | **6.5/10** | **Ready for staging, not production** |
---
## Recommendation
**PROCEED with parallel execution:**
1. Deploy existing 15 services to staging NOW
2. Complete remaining 8 services in parallel
3. Ramp test coverage aggressively
4. Target production pilot for May 2026 with 1-2 merchants
**Success probability: HIGH (80%+)** given current momentum and architecture quality.

274
docs/audit/cto.md Normal file
View File

@@ -0,0 +1,274 @@
# CTO Audit Report — GoodGo POS System
> **Date**: 2026-03-20
> **Auditor**: CTO (Agent cbfd1a9f)
> **Scope**: Architecture decisions, service boundaries, tech debt, cross-service dependencies
> **Branch**: master (d0211e5)
---
## Executive Summary
Platform đang trong giai đoạn late-MVP với 15/24 services production-ready và POS flow đã hoạt động end-to-end (xác nhận qua deployment report 2026-03-13). Wave 1-3 fixes (202 files) đã giải quyết phần lớn P0 security và runtime bugs. **Tuy nhiên, 5 vấn đề critical mới được phát hiện** cần fix trước khi deploy staging: debug endpoints không có auth, SQL injection, BFF CORS wildcard, 4 mkt-* services chưa vào Docker Compose, và cross-service messaging chưa thực sự triển khai.
---
## Critical Issues
### C1 — Debug Endpoints `[AllowAnonymous]` trong Production Code
**File**: `services/merchant-service-net/src/MerchantService.API/Controllers/StaffController.cs:249-390`
**Severity**: BLOCKER — bất kỳ ai cũng có thể gọi vào production
5 endpoints không có authentication:
- `POST /api/v1/staff/debug/seed` — tạo staff data tùy ý (line 249)
- `POST /api/v1/staff/debug/update-userid` — ghi đè userId của bất kỳ staff nào qua reflection (line 340)
- `POST /api/v1/staff/debug/update-merchant` — ghi đè merchantId của bất kỳ merchant nào (line 360)
Đây là privilege escalation / data tampering vulnerability cực kỳ nghiêm trọng. Các endpoint này phải được xóa hoàn toàn khỏi production code hoặc chỉ chạy trong `Development` environment.
**Fix**: Xóa ngay các method này hoặc wrap với `if (env.IsDevelopment())` và thêm `[Authorize(Roles = "SuperAdmin")]`.
---
### C2 — SQL Injection trong StaffController
**File**: `services/merchant-service-net/src/MerchantService.API/Controllers/StaffController.cs:307,367`
**Severity**: HIGH — tham số từ query string được nối trực tiếp vào SQL
```csharp
// Line 307 — merchant.Id là Guid (safe), nhưng pattern nguy hiểm
cmd.CommandText = $"SELECT id FROM shops WHERE merchant_id = '{merchant.Id}' LIMIT 1";
// Line 367 — userId và merchantId đến từ [FromQuery] — SQL INJECTION RISK
cmd.CommandText = $"UPDATE merchants SET user_id = '{userId}' WHERE id = '{merchantId}'";
```
`Guid` tự sanitize, pattern string interpolation trong SQL là vi phạm security baseline. Bất kỳ refactor nào đổi type sang string sẽ tức thì trở thành lỗ hổng.
**Fix**: Dùng parameterized queries (`cmd.Parameters.AddWithValue()`).
---
### C3 — BFF CORS Wildcard `AllowAnyOrigin()`
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Server/Program.cs:74-78`
**Severity**: HIGH — cho phép bất kỳ domain nào gọi BFF API
```csharp
policy.AllowAnyOrigin()
.AllowAnyMethod()
.AllowAnyHeader();
```
Đây là điểm vào duy nhất của frontend. Nếu BFF có endpoint nhạy cảm, bất kỳ website nào cũng có thể khai thác CSRF.
**Fix**: Thay bằng whitelist cụ thể theo environment (`localhost:3001`, `goodgo.vn`, `admin.goodgo.vn`).
---
### C4 — 4 mkt-* Services Không Có Trong Docker Compose (Port 5000 Conflict)
**File**: `deployments/local/docker-compose.yml`
**Severity**: HIGH — 4 services không thể chạy local hoặc staging
Services `mkt-facebook`, `mkt-whatsapp`, `mkt-x`, `mkt-zalo` đều thiếu trong `docker-compose.yml`. Lý do đã được ghi nhận trong ROADMAP nhưng chưa được giải quyết: tất cả 4 services được cấu hình với port 5000 (xung đột). Thực tế các services này cũng thiếu Traefik routes.
**Fix**: Gán ports 5021-5024, thêm vào docker-compose và Traefik. DevOps issue đang ở `TODO` trong ROADMAP Week 9-10.
---
### C5 — Cross-Service Messaging Chưa Được Triển Khai
**Severity**: HIGH — tất cả `IntegrationEvents/EventHandlers/` đều rỗng
Audit thư mục `IntegrationEvents/EventHandlers` trên 10 services: **không có file `.cs` nào**. Tất cả là thư mục rỗng. Đây có nghĩa là RabbitMQ được khai báo trong docker-compose nhưng không có service nào thực sự publish/consume messages qua event bus.
Điều này ảnh hưởng trực tiếp đến:
- Order → Kitchen deduction (đang dùng HTTP + Polly theo ROADMAP, nhưng đây là tight coupling)
- Promotion voucher redemption events (no consumer)
- Membership level changes (no consumer)
Hiện tại hệ thống hoạt động vì business flows quan trọng dùng HTTP trực tiếp. Nhưng đây là tech debt sẽ gây vấn đề khi scale.
**Fix**: Implement IEventBus abstraction với RabbitMQ backend, bắt đầu từ OrderCreatedIntegrationEvent → inventory deduction.
---
## Warnings
### W1 — TenantMiddleware Dùng String Interpolation trong SQL SET
**Files**: `TenantMiddleware.cs` trong 5 services (order, inventory, wallet, catalog, fnb-engine)
```csharp
cmd.CommandText = $"SET LOCAL app.current_shop_id = '{shopId.ToString("D")}'";
```
Guid.ToString("D") an toàn trong thực tế vì Guid không chứa ký tự nguy hiểm. Nhưng cần chuẩn hóa sang `NpgsqlCommand.AddParameter()` để tuân thủ secure coding standards.
---
### W2 — Test Coverage Cực Thấp (~15%)
Chỉ 1 trong 24 services (IAM) có CI pipeline chạy tests. Thống kê unit test files:
- IAM: 32 test files ✓
- FnB Engine: 16 test files ✓ (96 tests theo ROADMAP)
- Order: 5 test files
- Wallet: 8 test files
- Merchant: 3 test files
- **22 services còn lại**: không có CI test pipeline
Target roadmap là 50% coverage trước Phase 3. Với tốc độ hiện tại sẽ không đạt được.
---
### W3 — booking-service Thiếu K8s Staging Manifest
**File**: `deployments/staging/kubernetes/` — không có `booking-service.yaml`
Booking service có trong docker-compose, có Traefik route (partial — thiếu `/api/v1/schedules`), có K8s production manifest, nhưng **thiếu staging**. Mọi staging deployment sẽ không include booking/spa flow.
---
### W4 — 13 Traefik Routes Bị Thiếu
Services có code nhưng không có Traefik route trong `infra/traefik/dynamic/routes.yml`:
- `chat-service` (SignalR hub cũng thiếu)
- `social-service`
- `mining-service`
- `mission-service`
- `promotion-service`
- `ads-manager-service`
- `ads-serving-service`
- `ads-billing-service`
- `ads-tracking-service`
- `ads-analytics-service`
- `mkt-facebook-service`
- `mkt-whatsapp-service`
- `mkt-zalo-service`
Hiện tại các service này chỉ accessible trực tiếp qua port (bypassing Traefik security middleware).
---
### W5 — Fire-and-Forget Task trong mkt-whatsapp WebhooksController
**File**: `services/mkt-whatsapp-service-net/src/WhatsAppService.API/Controllers/WebhooksController.cs:88`
```csharp
_ = Task.Run(async () => await ProcessWebhookAsync(payload));
```
Đây là unobserved task — lỗi trong `ProcessWebhookAsync` sẽ bị nuốt im lặng. Dùng hosted service hoặc background job queue thay thế.
---
### W6 — Docker Images Dùng `:latest` Tag
**File**: `deployments/local/docker-compose.yml`
Tất cả service images đang dùng `goodgo/*:latest`. Vi phạm 12-factor app (reproducibility). Production K8s manifest nên dùng immutable commit SHA tags.
---
### W7 — Promotion & Mission Services Chưa Đủ Handlers
Mặc dù Wave 2 đã add handlers theo ROADMAP:
- `promotion-service`: vẫn thiếu `ExchangeVoucherCommand`, `PurchaseVoucherCommand` handlers theo CTO_FIX_TRACKER
- `inventory-service`: chỉ có `DeductInventoryCommand` — thiếu `AddStock`, `AdjustInventory`, `GetInventoryReport`
- `mission-service`: `GetUserMissionProgressQuery` theo tracker chưa rõ trạng thái
ROADMAP Phase 3 Week 9-10 đặt priority P0 cho các items này — cần xác nhận lại trạng thái thực tế.
---
### W8 — Test Credentials Hardcoded trong ROADMAP.md
**File**: `ROADMAP.md` Section IX
```
hongochai10@icloud.com | Velik@2026 | Admin/Owner
tranvanb@goodgo.vn | Staff@2026 | Staff/Cashier
```
ROADMAP.md là file check vào git. Credentials không nên ở đây ngay cả cho dev/staging. Dùng `.env.example` hoặc secret manager.
---
## Improvements
### I1 — Implement IEventBus / RabbitMQ Integration
Hiện tại RabbitMQ được provisioned trong docker-compose nhưng không có service nào dùng. Đây là cơ hội lớn để decouple cross-service communication:
- Tạo shared `IEventBus` interface trong `packages/` (hoặc NuGet package)
- RabbitMQ implementation với idempotency (outbox pattern)
- Start với `OrderItemServedDomainEvent` → inventory deduction thay vì HTTP call
---
### I2 — API Gateway Security Hardening
Traefik `middlewares.yml` đang ổn nhưng thiếu:
- **SSL redirect** bị disable (`sslRedirect: false`) — phải enable trên staging/production
- **Missing routes** cho 13 services bypass security middleware hoàn toàn
- `api-ratelimit` (100/min) chưa được apply cho các services chưa có route
---
### I3 — CI Pipeline Expansion
Chỉ có `ci-iam-service.yml`. Cần thêm ít nhất:
1. `ci-merchant-service.yml` (critical — có seed endpoints nhạy cảm)
2. `ci-order-service.yml` (critical — order flow)
3. `ci-fnb-engine.yml` (có 96 tests, nên chạy trong CI)
4. `ci-wallet-service.yml` (payment critical path)
---
### I4 — Seed Endpoint Removal / Environment Guard
Ngoài security fix cho C1, recommend tách seed logic ra:
- Script riêng trong `scripts/db/seed.sh`
- Database seeder class chỉ chạy khi `env.IsDevelopment() && args.Contains("--seed")`
- Không bao giờ expose seed logic qua HTTP endpoint
---
### I5 — Reflection-Based Field Setter Anti-Pattern
**File**: `StaffController.cs:352`
```csharp
var field = typeof(MerchantStaff).GetField("_userId", BindingFlags.NonPublic | BindingFlags.Instance);
field?.SetValue(staff, userId);
```
Đây là violation của DDD encapsulation. Cần thêm domain method `staff.AssignUserId(Guid userId)` trong aggregate root.
---
## Action Items (Prioritized)
| Priority | Item | Owner | Effort |
|----------|------|-------|--------|
| **P0** | [C1] Xóa / guard debug endpoints (`StaffController.cs:249-390`) | Backend Dev | 30 min |
| **P0** | [C2] Fix SQL injection — parameterized queries (`StaffController.cs:307,367`) | Backend Dev | 30 min |
| **P0** | [C3] Fix BFF CORS — thay `AllowAnyOrigin()` bằng whitelist | Backend Dev | 15 min |
| **P0** | [C4] Fix port conflict + add mkt-* services to Docker Compose (ports 5021-5024) | DevOps | 2h |
| **P1** | [W3] Tạo booking-service K8s staging manifest | DevOps | 1h |
| **P1** | [W4] Add 13 Traefik routes (chat, social, mining, ads-*, mkt-*, promotion, mission) | DevOps | 2h |
| **P1** | [W7] Fix promotion/mission/inventory service handlers | Backend Dev | 1-2 days |
| **P1** | [C5] Implement IEventBus skeleton + first integration event | Backend Dev | 3-5 days |
| **P2** | [W1] Fix TenantMiddleware SQL string interpolation (5 services) | Backend Dev | 1h |
| **P2** | [W5] Fix fire-and-forget in mkt-whatsapp | Backend Dev | 30 min |
| **P2** | [W6] Thay `:latest` bằng commit SHA trong CI/CD | DevOps | 1h |
| **P2** | [W8] Xóa credentials khỏi ROADMAP.md | CTO | 5 min |
| **P2** | [I3] Thêm CI pipelines cho merchant, order, fnb, wallet | DevOps | 2h |
| **P3** | [I4] Tách seed logic ra scripts | Backend Dev | 2h |
| **P3** | [I5] Add `AssignUserId()` domain method thay reflection | Backend Dev | 30 min |
| **P3** | Enable SSL redirect trong Traefik (staging/prod) | DevOps | 15 min |
---
## Architectural Assessment
### What's Working Well
- **Clean Architecture + CQRS** được tuân thủ nhất quán trong 15 production-ready services
- **MediatR pipeline** (Logging → Validation → Transaction → Handler) hoạt động đúng
- **Multi-tenancy** via EF Core global filters + PostgreSQL RLS — defense-in-depth tốt
- **IAM service** với Duende IdentityServer là foundation vững chắc
- **BFF pattern** với YARP là đúng hướng — tránh expose microservices trực tiếp
- **FnB Engine** là service chất lượng cao nhất (96 tests, full DDD, SignalR hub)
- **Observability stack** (Prometheus + Grafana + Loki) đã sẵn sàng
### What Needs Attention
- **Event-driven messaging** chưa triển khai — RabbitMQ hiện là dead weight
- **Auth boundary** giữa BFF và microservices cần review: BFF dùng `AuthForwardingHandler` để forward JWT, nhưng BFF CORS wildcard tạo ra điểm yếu
- **Service decomposition** cho ads services (5 services) có vẻ over-engineered cho MVP — ads-serving, ads-billing, ads-analytics, ads-tracking có minimal business logic và không đủ handlers
---
*Report generated by CTO Agent | Run ID: 3d2b7974 | Commit: d0211e5*

View File

@@ -0,0 +1,329 @@
# Database Architecture Audit — GoodGo POS System
**Auditor:** Database Architect (TechBi Company)
**Date:** 2026-03-20
**Scope:** Schema design, migrations, indexing, query performance, data modeling
**Coverage:** 23 microservices, 24 PostgreSQL databases, Redis 7 caching layer
---
## Executive Summary
The platform follows a sound **database-per-service** architecture with EF Core 10 + Fluent API configuration, domain-driven entity design, and consistent snake_case column naming across most services. The foundation is solid for a POS microservices platform. However, three critical issues require immediate attention: a broken `InventoryItem.CreatedAt` field mapping that silently returns `DateTime.MinValue` from the database, missing optimistic concurrency on wallet balance updates (race condition risk under concurrent POS load), and UUID v4 used universally instead of the UUID v7 standard specified in the architecture guidelines. Additionally, IAM's Phase 4A migration introduced PascalCase table names that break the snake_case convention across the rest of the system.
---
## Critical Issues
### C-1 · `InventoryItem.CreatedAt` Backing Field Not Mapped — Silent Data Loss
**File:** `services/inventory-service-net/src/InventoryService.Infrastructure/EntityConfigurations/InventoryItemEntityTypeConfiguration.cs:34`
```csharp
// WRONG — Ignores public property but NEVER maps the private backing field _createdAt
builder.Ignore(i => i.CreatedAt);
```
The domain entity has `private DateTime _createdAt` and `public DateTime CreatedAt => _createdAt;`. The `created_at` column exists in the migration and database. However, because `builder.Ignore(i => i.CreatedAt)` discards the property *and* no `builder.Property<DateTime>("_createdAt").HasColumnName("created_at")` mapping exists, EF Core never reads `created_at` from the DB into the backing field. Every entity loaded from the database will have `CreatedAt == DateTime.MinValue`.
Compare with the correct pattern used in `InventoryItemEntityTypeConfiguration` for other fields:
```csharp
builder.Property(i => i.ShopId).HasField("_shopId").HasColumnName("shop_id").IsRequired(); // ✓ correct
```
**Fix:**
```csharp
// Replace builder.Ignore(i => i.CreatedAt) with:
builder.Property<DateTime>("_createdAt")
.HasColumnName("created_at")
.IsRequired();
builder.Ignore(i => i.CreatedAt); // keep ignoring the computed public property
```
No migration needed — the column already exists.
---
### C-2 · No Optimistic Concurrency on Wallet Balances — Race Condition Risk
**File:** `services/wallet-service-net/src/WalletService.Infrastructure/EntityConfigurations/WalletItemEntityTypeConfiguration.cs`
`wallet_items.balance` (the per-currency balance) has no concurrency token. In a busy POS terminal, two concurrent `Debit` or `Credit` operations on the same wallet can interleave:
```
Tx1: READ balance=100 → compute 100-30=70
Tx2: READ balance=100 → compute 100-50=50
Tx1: WRITE balance=70
Tx2: WRITE balance=50 ← overwrites Tx1's debit, 30 VND disappears
```
**Fix — add PostgreSQL `xmin` system column as concurrency token:**
```csharp
// WalletItemEntityTypeConfiguration.cs
builder.UseXminAsConcurrencyToken(); // Npgsql built-in
```
No migration needed — `xmin` is a PostgreSQL system column. EF Core will throw `DbUpdateConcurrencyException` on conflict, which the application must retry.
Also applies to `wallet_transactions` if idempotent writes are not already enforced by `ClientRequest` idempotency table.
---
### C-3 · UUID v4 (`Guid.NewGuid()`) Used Everywhere Instead of UUID v7
**Files:** All domain aggregate roots across all 23 services
Architecture spec mandates **UUID v7** (time-sortable, B-tree friendly). Every service uses `Guid.NewGuid()` (UUID v4 — random). At scale, random UUIDs cause **index fragmentation** in PostgreSQL B-tree indexes: each new row's primary key is a random value, forcing page splits and bloating indexes.
**Examples:**
- `services/order-service-net/src/OrderService.Domain/AggregatesModel/OrderAggregate/Order.cs:163``Id = Guid.NewGuid()`
- `services/merchant-service-net/src/MerchantService.Domain/AggregatesModel/ShopAggregate/Shop.cs:191``Id = Guid.NewGuid()`
**Fix (.NET 9+):**
```csharp
// Create a shared utility or use directly:
Id = Guid.CreateVersion7(); // .NET 9+ built-in, time-sortable
```
This is a cross-cutting change. Recommend introducing a `NewId()` factory in the `SeedWork/Entity.cs` base class:
```csharp
protected static Guid NewId() => Guid.CreateVersion7();
```
---
### C-4 · IAM Phase4A Migration — PascalCase Table Names Break Convention
**File:** `services/iam-service-net/src/IamService.Infrastructure/Migrations/20260114091114_Phase4A_AuditAndCompliance.cs`
Audit and compliance tables were created with PascalCase names, breaking the snake_case convention used by every other service:
| Table Created | Expected |
|---|---|
| `AuditLogs` | `audit_logs` |
| `ComplianceReports` | `compliance_reports` |
| `ComplianceViolations` | `compliance_violations` |
| `IX_AuditLogs_ActorId` | `ix_audit_logs_actor_id` |
| `IX_AuditLogs_Timestamp` | `ix_audit_logs_timestamp` |
PostgreSQL is case-insensitive for unquoted identifiers but stores them lowercase. Quoted identifiers in EF queries will fail if mixed. The `AuditLogRepository` likely works by accident (EF generates quoted DDL), but it creates a maintainability hazard.
**Fix — additive migration:**
```sql
ALTER TABLE "AuditLogs" RENAME TO audit_logs;
ALTER TABLE "ComplianceReports" RENAME TO compliance_reports;
ALTER TABLE "ComplianceViolations" RENAME TO compliance_violations;
ALTER INDEX "IX_AuditLogs_ActorId" RENAME TO ix_audit_logs_actor_id;
-- ... etc
```
Then update `EntityTypeConfiguration` and `AuditLogEntityTypeConfiguration` to use `builder.ToTable("audit_logs")`.
Also note: **OpenIddict tables** (`OpenIddictApplications`, `OpenIddictTokens`, etc.) in the same IAM database use PascalCase — this is OpenIddict's default and acceptable since it manages its own schema, but is worth documenting.
---
## Warnings
### W-1 · Missing Global Query Filter for Soft-Deleted Records
**File:** `services/merchant-service-net/src/MerchantService.Infrastructure/MerchantServiceContext.cs`
Both `Merchant` and `Shop` entities have `is_deleted` columns (correctly designed). However, `MerchantServiceContext.OnModelCreating` does not register a global query filter:
```csharp
// Missing in MerchantServiceContext.OnModelCreating:
modelBuilder.Entity<Merchant>()
.HasQueryFilter(m => EF.Property<bool>(m, "_isDeleted") == false);
modelBuilder.Entity<Shop>()
.HasQueryFilter(s => EF.Property<bool>(s, "_isDeleted") == false);
```
Without this, every repository query must manually add `.Where(x => !x.IsDeleted)`. Any query that omits this filter will return deleted records silently. This is a data leak risk.
**Fix:** Add `HasQueryFilter` in `MerchantServiceContext.OnModelCreating`. Ensure `IgnoreQueryFilters()` is available for admin/hard-delete operations.
---
### W-2 · Missing Composite Indexes on High-Traffic Query Patterns
The most common POS reporting queries are not supported by composite indexes:
| Service | Missing Index | Common Query |
|---|---|---|
| `order-service-net` | `(shop_id, created_at DESC)` | "Orders today for shop X" |
| `order-service-net` | `(shop_id, status_id)` | "Pending orders for shop X" |
| `catalog-service-net` | `(shop_id, is_active)` | "Active products for shop X" |
| `booking-service-net` | `(shop_id, start_time)` | "Appointments today for shop X" |
| `inventory-service-net` | `(shop_id, quantity)` | "Low stock items for shop X" |
Current separate single-column indexes (`ix_orders_shop_id` + `ix_orders_created_at`) require a bitmap index scan merge, which is slower than a single composite index for the `WHERE shop_id = ? AND created_at BETWEEN ? AND ?` pattern.
**Fix example for orders:**
```csharp
// OrderEntityTypeConfiguration.cs
builder.HasIndex("_shopId", "_createdAt")
.HasDatabaseName("ix_orders_shop_id_created_at");
builder.HasIndex("_shopId", "StatusId")
.HasDatabaseName("ix_orders_shop_id_status_id");
```
---
### W-3 · Index Naming Inconsistency — `idx_` vs `ix_` Prefix
**Files:** `services/ads-analytics-service-net/src/AdsAnalyticsService.Infrastructure/EntityConfigurations/`
All services use `ix_` prefix for index names. The `AdsAnalyticsService` (and its `ReportEntityTypeConfiguration`) uses `idx_` prefix:
- `idx_campaign_metrics_campaign_id` ← should be `ix_campaign_metrics_campaign_id`
- `idx_reports_advertiser_id` ← should be `ix_reports_advertiser_id`
Minor but affects tooling, migrations consistency, and DBA runbooks.
---
### W-4 · `Debug.WriteLine` Left in Production Code
**File:** `services/order-service-net/src/OrderService.Infrastructure/OrderContext.cs:59,71`
```csharp
System.Diagnostics.Debug.WriteLine("OrderContext::ctor - " + GetHashCode());
System.Diagnostics.Debug.WriteLine("OrderContext::ctor (tenant) - " + GetHashCode());
```
Debug output left in a production DbContext constructor. Under load, this generates noise. Should be removed or replaced with structured `ILogger` at `Debug` level.
---
### W-5 · Inconsistent DbContext File Naming
**File:** `services/mkt-facebook-service-net/src/FacebookService.Infrastructure/MyServiceContext.cs`
The file is named `MyServiceContext.cs` — the template placeholder name was not updated when the service was created. The class inside is correctly named `FacebookServiceContext`, but the file name reveals copy-paste from `_template_dot_net` without cleanup. Affects code navigation and grep tooling.
---
### W-6 · `ads-analytics-service-net` and `ads-billing-service-net` Have No Repositories
These two services access `DbContext` directly from command/query handlers, bypassing the Repository pattern used everywhere else. This:
- Makes unit testing harder (must mock DbContext instead of a simple IRepository)
- Breaks the Clean Architecture boundary (Infrastructure DbContext leaks into Application layer)
- Creates inconsistency for future maintainers
---
### W-7 · Missing `expires_at` Index on Vouchers
**File:** `services/promotion-service-net/src/PromotionService.Infrastructure/EntityConfigurations/VoucherEntityTypeConfiguration.cs`
Voucher expiry queries (`WHERE expires_at < NOW() AND status_id = 'active'`) have no index on `expires_at`. For a promotion system that may hold millions of vouchers, periodic expiry sweeps will cause sequential scans.
```csharp
// Add:
builder.HasIndex(v => v.ExpiresAt)
.HasDatabaseName("ix_vouchers_expires_at")
.HasFilter("expires_at IS NOT NULL");
```
---
## Improvements
### I-1 · Enable `pg_stat_statements` in `init.sql`
**File:** `infra/databases/postgres/init.sql`
The `init.sql` is nearly empty. Add the essential PostgreSQL extension for query monitoring:
```sql
CREATE EXTENSION IF NOT EXISTS "pg_stat_statements";
CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; -- useful for uuid_generate_v7() if Guid.CreateVersion7 is not used
```
---
### I-2 · Add PgBouncer for Local Development
**File:** `deployments/local/docker-compose.yml`
PgBouncer is referenced in staging/production (via Neon's built-in pooler with `?pgbouncer=true`). However, local development has no connection pooler. With 25+ microservices each holding a connection pool, the local PostgreSQL can hit `max_connections=200` quickly during integration testing.
Add a lightweight PgBouncer container to `docker-compose.yml`:
```yaml
pgbouncer:
image: bitnami/pgbouncer:1.22
environment:
POSTGRESQL_HOST: postgres
POSTGRESQL_PORT: 5432
PGBOUNCER_POOL_MODE: transaction
PGBOUNCER_MAX_CLIENT_CONN: 500
PGBOUNCER_DEFAULT_POOL_SIZE: 10
```
Services should connect to `pgbouncer:5432` locally.
---
### I-3 · Consider Table Partitioning for High-Volume Append-Only Tables
As the platform grows, these tables will accumulate millions of rows and benefit from range partitioning by month:
| Table | Service | Strategy |
|---|---|---|
| `wallet_transactions` | wallet-service | RANGE on `created_at` |
| `inventory_transactions` | inventory-service | RANGE on `created_at` |
| `order_items` | order-service | inherited from `orders` RANGE on `created_at` |
| `audit_logs` (IAM) | iam-service | RANGE on `timestamp` |
EF Core 10 + Npgsql support declarative partitioning. This is not urgent for current scale but should be planned for tables expected to exceed 10M rows.
---
### I-4 · Add Partial Index for Active Products
**File:** `catalog-service-net`
For POS menu loading, ~90% of queries filter on `is_active = true`. A partial index reduces index size by ~10x:
```csharp
builder.HasIndex(p => new { p.ShopId, p.CategoryId })
.HasDatabaseName("ix_products_active_shop_category")
.HasFilter("is_active = true");
```
---
### I-5 · Materialized View for Daily Sales Reporting
**Service:** `order-service-net`
The most common merchant dashboard query — "sales by day for the last 30 days" — requires aggregating `orders` and `order_items` over large date ranges. A nightly-refreshed materialized view would eliminate expensive aggregation queries at runtime:
```sql
CREATE MATERIALIZED VIEW mv_daily_sales AS
SELECT
shop_id,
DATE(created_at) AS sale_date,
COUNT(*) AS order_count,
SUM(total_amount) AS total_revenue,
SUM(discount_amount) AS total_discount
FROM orders
WHERE status_id != 4 -- exclude cancelled
GROUP BY shop_id, DATE(created_at);
CREATE UNIQUE INDEX ON mv_daily_sales(shop_id, sale_date);
```
Refresh with `REFRESH MATERIALIZED VIEW CONCURRENTLY mv_daily_sales;` via a scheduled job.
---
## Action Items (Prioritized)
| Priority | Item | Service(s) | Effort |
|---|---|---|---|
| **P0** | Fix `InventoryItem._createdAt` backing field mapping | `inventory-service-net` | 30 min |
| **P0** | Add optimistic concurrency token to `wallet_items` | `wallet-service-net` | 1 hour |
| **P1** | Migrate IAM AuditLogs/ComplianceReports to snake_case | `iam-service-net` | 2 hours |
| **P1** | Add `HasQueryFilter` for soft-deleted Merchant + Shop | `merchant-service-net` | 1 hour |
| **P1** | Replace `Guid.NewGuid()` with `Guid.CreateVersion7()` in all domain entities | All 23 services | 4 hours |
| **P2** | Add composite indexes: `(shop_id, created_at)` for orders, `(shop_id, is_active)` for products | order, catalog | 1 hour |
| **P2** | Remove `Debug.WriteLine` from `OrderContext` constructors | `order-service-net` | 10 min |
| **P2** | Add `expires_at` index on `vouchers` | `promotion-service-net` | 15 min |
| **P2** | Standardize index prefix to `ix_` in ads-analytics | `ads-analytics-service-net` | 30 min |
| **P3** | Rename `MyServiceContext.cs` to `FacebookServiceContext.cs` | `mkt-facebook-service-net` | 5 min |
| **P3** | Add PgBouncer to local docker-compose | `deployments/local` | 2 hours |
| **P3** | Enable `pg_stat_statements` in `init.sql` | `infra/databases` | 15 min |
| **P3** | Add Repository pattern to ads-analytics + ads-billing | `ads-*-service-net` | 4 hours |
---
## Database Inventory Summary
| Category | Count |
|---|---|
| Microservices with own database | 23 |
| Total migrations | 46 |
| DbContext files | 24 |
| EntityTypeConfiguration files | 95 |
| Repository implementations | ~91 |
| PostgreSQL databases (local) | 24 |
| Redis usage | SignalR backplane (chat), caching |
**Overall Assessment:** Architecture is well-structured and production-viable. The critical issues (C-1 to C-4) should be resolved before the next production deployment. Warnings are tech debt that should be addressed in the next sprint.

266
docs/audit/devops.md Normal file
View File

@@ -0,0 +1,266 @@
# DevOps Audit Report — GoodGo POS Platform
**Auditor**: DevOps Engineer (TechBi Company)
**Date**: 2026-03-20
**Scope**: Docker configs, CI/CD pipelines, Kubernetes manifests, health checks, observability
**Branch**: master
---
## Executive Summary
Infrastructure cơ bản đã được triển khai tốt: multi-stage Docker builds, non-root container users, health probes trên tất cả K8s deployments, và CI/CD pipeline có environment approval cho production. Tuy nhiên có **2 vấn đề critical** cần fix ngay trước khi go-live: production K8s manifests hardcode `:latest` image tag, và Alertmanager chưa được cấu hình dẫn đến toàn bộ alert rules không có đích nhận.
---
## Critical Issues
### CRIT-1: Production K8s manifests hardcode `:latest` image tag
**File**: `deployments/production/kubernetes/iam-service.yaml:45`
```yaml
image: goodgo/iam-service-net:latest # WRONG
```
**Impact**: Nếu ai chạy `kubectl apply -f` trực tiếp (không qua pipeline), pod sẽ pull `:latest` — không reproducible, không rollback được. Mâu thuẫn hoàn toàn với nguyên tắc "NEVER use :latest tag in production" trong CLAUDE.md và logic pipeline `deploy-production.yml:327`.
**Affected files**: Cần kiểm tra tất cả files trong `deployments/production/kubernetes/` — khả năng cao tất cả service manifests đều bị.
**Fix**: Dùng placeholder `IMAGE_TAG` và sed-replace trong pipeline, hoặc dùng Kustomize/Helm để inject SHA tag.
---
### CRIT-2: Alertmanager không được cấu hình — toàn bộ alert rules im lặng
**File**: `infra/observability/prometheus/prometheus.yml:29`
```yaml
alerting:
alertmanagers:
- static_configs:
- targets: [] # EMPTY — no alerts will fire
```
**Impact**: Toàn bộ `alert-rules.yml``rules/service-alerts.yml` định nghĩa đúng (ServiceDown, HighErrorRate, HighLatencyP95...) nhưng **không có receiver nào**. Production incident xảy ra sẽ không có ai được notify.
**Fix**: Deploy Alertmanager và uncomment `targets: ['alertmanager:9093']`. Configure receivers (Slack/PagerDuty/email) cho ít nhất kênh critical.
---
### CRIT-3: `docker-build.yml` push `:latest` tag lên Docker Hub từ branch `main`
**File**: `.github/workflows/docker-build.yml:99-103`
```bash
if [ "$BRANCH" = "main" ]; then
echo "tags=${IMAGE}:latest,${IMAGE}:${SHA}" >> $GITHUB_OUTPUT
```
**Impact**: `:latest` tag bị overwrite mỗi lần push lên `main`. Mâu thuẫn với `deploy-production.yml` (dùng SHA). Tạo race condition nếu pipeline chạy song song. Tag `:latest` trên Docker Hub không ổn định.
**Fix**: Bỏ `:latest` khỏi production tags. Chỉ dùng `${SHA}` + `production` mutable tag nếu cần.
---
## Warnings
### WARN-1: `redis-exporter` được Prometheus scrape nhưng không tồn tại trong docker-compose
**File**: `infra/observability/prometheus/prometheus.yml:132`
```yaml
- job_name: 'redis'
static_configs:
- targets: ['redis-exporter:9121'] # service không tồn tại
```
`deployments/local/docker-compose.yml` không có service `redis-exporter`. Prometheus sẽ báo `target missing` liên tục. Cần thêm `oliver006/redis_exporter` vào compose hoặc xóa scrape job này.
---
### WARN-2: Nhiều microservices trong docker-compose nhưng không có trong CI/CD pipeline
**Missing từ `deploy-staging.yml` và `deploy-production.yml`**:
- `chat-service-net`, `social-service-net`, `mining-service-net`, `mission-service-net`
- `promotion-service-net`, `booking-service-net` (staging), `membership-service-net`
- Toàn bộ 5 ads services (`ads-manager`, `ads-serving`, `ads-billing`, `ads-tracking`, `ads-analytics`)
- `mkt-facebook`, `mkt-whatsapp`, `mkt-x`, `mkt-zalo`
15+ services không có automated deployment. Phải deploy thủ công.
---
### WARN-3: `pr-checks.yml` chỉ kiểm tra Node.js — không có .NET build/test
**File**: `.github/workflows/pr-checks.yml`
PR checks chỉ chạy `pnpm lint`, `pnpm typecheck`, `pnpm build`. Không có step nào build hoặc test .NET services. Broken .NET code có thể merge vào branch mà không bị phát hiện.
**Fix**: Thêm matrix build cho .NET services, hoặc ít nhất `dotnet build` cho các service thay đổi trong PR.
---
### WARN-4: Redis deploy trong K8s là single instance — SPOF
**File**: `deployments/staging/kubernetes/redis.yaml:14`
```yaml
replicas: 1
```
Redis là SignalR backplane cho `chat-service`, `fnb-engine`, `mining-service`. Redis down = toàn bộ real-time features down. Cần Redis Sentinel (HA) hoặc migrate sang Redis Cluster.
---
### WARN-5: Không có Kubernetes NetworkPolicy
Không tìm thấy `NetworkPolicy` manifest nào trong `deployments/staging/kubernetes/` hay `deployments/production/kubernetes/`. Theo principle least-privilege, mặc định tất cả pods có thể communicate với nhau (và ra internet nếu không có egress restriction).
**Fix**: Thêm default-deny NetworkPolicy + whitelist explicit inter-service communication.
---
### WARN-6: MinIO dùng credentials mặc định trong local dev
**File**: `deployments/local/docker-compose.yml:78-79`
```yaml
MINIO_ROOT_USER=minioadmin
MINIO_ROOT_PASSWORD=minioadmin123
```
Credentials mặc định well-known. Dù là local dev, nên dùng `.env` file để tránh habit bad practice. Thêm vào `.gitignore` nếu chưa có.
---
### WARN-7: Distributed tracing (Jaeger) bị comment out
**File**: `deployments/local/docker-compose.yml:1230-1241`
Jaeger config được comment out. Không có distributed tracing cho request correlation cross-service. Với 26+ microservices, đây là gap observability nghiêm trọng khi debug production issues.
**Fix**: Enable Jaeger hoặc dùng Grafana Tempo (đã có Loki/Grafana stack).
---
### WARN-8: Traefik Dashboard exposed không có authentication
**File**: `deployments/local/docker-compose.yml:121`
```yaml
- "--api.insecure=true"
```
Traefik dashboard (`http://localhost:8080`) accessible không cần auth. OK cho local dev nhưng cần đảm bảo `api.insecure=false` ở staging/production (K8s ingress không expose dashboard port nên staging OK, nhưng cần verify production Traefik config).
---
### WARN-9: `storage-service` naming inconsistency
Trong `docker-compose.yml` service được gọi là `storage-service` (line 152) nhưng build context là `../../services/storage-service-net`. Trong Traefik routes và K8s manifests cũng không nhất quán. Cần chuẩn hóa.
---
### WARN-10: `Jwt__RequireHttpsMetadata=false` trong staging configmap
**File**: `deployments/staging/kubernetes/configmap.yaml:21`
```yaml
Jwt__RequireHttpsMetadata: "false"
```
Staging đã có TLS (cert-manager + letsencrypt), nhưng JWT validation không require HTTPS. Nếu token bị intercept qua HTTP (misconfigured client), không bị phát hiện.
---
## Improvements
### IMP-1: Áp dụng GitOps với ArgoCD
Hiện tại CI/CD dùng `kubectl apply` trực tiếp từ GitHub Actions. Recommend deploy ArgoCD và dùng GitOps: Git là source of truth, ArgoCD tự sync K8s state. Lợi ích: drift detection, automatic rollback, audit trail, không cần expose KUBECONFIG trong GitHub Secrets.
---
### IMP-2: Thêm PodDisruptionBudget cho production services
Hiện chưa có PDB. Khi node drain/upgrade, tất cả replicas của một service có thể bị terminate cùng lúc. Với `maxUnavailable: 0` trong RollingUpdate — đây là bảo vệ khi rolling update, nhưng không bảo vệ khi voluntary disruption.
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: iam-service-pdb
spec:
minAvailable: 1
selector:
matchLabels:
app: iam-service
```
---
### IMP-3: Pin tất cả Docker image versions trong docker-compose
Hiện tại `minio/minio:latest` (line 74), `grafana/grafana:10.3.1` (OK), `prom/prometheus:v2.51.0` (OK).
Cần pin MinIO: `minio/minio:RELEASE.2024-03-15T01-07-19Z` hoặc version cụ thể.
---
### IMP-4: Thêm health check endpoint `/health/ready` cho web-client-tpos-net
**File**: `deployments/local/docker-compose.yml:1375`
```yaml
test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider http://localhost:8080/health || exit 1"]
```
Blazor WASM app chỉ có `/health` (basic), không có readiness probe riêng. K8s sẽ gửi traffic trước khi app thực sự ready.
---
### IMP-5: Thêm image vulnerability scanning vào CI pipeline
Không có bước nào trong `.github/workflows/` scan Docker images cho CVE. Recommend thêm `aquasecurity/trivy-action` sau build step.
---
### IMP-6: Migrate Grafana credentials sang K8s Secret
**File**: `deployments/local/docker-compose.yml:1278`
```yaml
GF_SECURITY_ADMIN_PASSWORD=admin
```
Password `admin` là default. Cần rotate và inject qua Secret, không hardcode trong compose file.
---
### IMP-7: Bổ sung scrape targets còn thiếu trong Prometheus
`prometheus.yml` chỉ scrape 8 services trong khi docker-compose có 20+. Missing: `promotion-service`, `booking-service`, `membership-service`, `social-service`, `mining-service`, `mission-service`, `fnb-engine`, `wallet-service`, `ads-*`.
---
### IMP-8: Thêm Kubernetes Secrets External Operator cho production
`secrets.yaml.example` note "Option 3: Use sealed-secrets or external-secrets operator (recommended for production)" nhưng chưa implement. Nên dùng [External Secrets Operator](https://external-secrets.io) với HashiCorp Vault hoặc AWS Secrets Manager.
---
## Action Items (Prioritized)
| Priority | Issue | File | Effort |
|----------|-------|------|--------|
| P0 | Fix production K8s `:latest` image tags | `deployments/production/kubernetes/*.yaml` | 2h |
| P0 | Configure Alertmanager + receivers | `infra/observability/prometheus/prometheus.yml` | 4h |
| P0 | Remove `:latest` push từ docker-build.yml | `.github/workflows/docker-build.yml:100` | 30m |
| P1 | Thêm redis-exporter vào docker-compose | `deployments/local/docker-compose.yml` | 1h |
| P1 | Thêm .NET build/test vào pr-checks.yml | `.github/workflows/pr-checks.yml` | 2h |
| P1 | Thêm NetworkPolicy manifests | `deployments/*/kubernetes/` | 4h |
| P1 | Mở rộng CI/CD pipelines cho missing services | `.github/workflows/deploy-*.yml` | 8h |
| P2 | Deploy Alertmanager + PagerDuty/Slack | `infra/observability/` | 4h |
| P2 | Redis HA (Sentinel) cho staging/production | `deployments/*/kubernetes/redis.yaml` | 4h |
| P2 | Enable distributed tracing (Jaeger/Tempo) | `deployments/local/docker-compose.yml` | 2h |
| P2 | Thêm PodDisruptionBudget cho production | `deployments/production/kubernetes/` | 2h |
| P2 | Image vulnerability scanning trong CI | `.github/workflows/docker-build.yml` | 1h |
| P3 | Pin MinIO image version | `deployments/local/docker-compose.yml:74` | 15m |
| P3 | Rotate Grafana default password | `deployments/local/docker-compose.yml:1278` | 30m |
| P3 | ArgoCD GitOps adoption | New infra | 2-3 days |
| P3 | External Secrets Operator | `deployments/production/kubernetes/` | 1 day |
---
*Report generated by DevOps Engineer agent (TEC-223) — 2026-03-20*

View File

@@ -0,0 +1,289 @@
# Audit Report: POS System — Founding Engineer Perspective
**Date**: 2026-03-20
**Auditor**: Founding Engineer
**Scope**: Code quality, shared patterns, template compliance, DRY violations
**Working Directory**: `/Users/velikho/Desktop/WORKING/pos-system`
---
## Executive Summary
The GoodGo Platform is a well-structured enterprise microservices monorepo with 23 .NET services, 5 frontend applications, and 6 shared Node.js packages. Clean Architecture and CQRS patterns are consistently applied across all services. However, three critical issues must be resolved before production: insecure CORS configuration (all 23 services), broken health check authentication (18 services), and ~5,100 lines of duplicated MediatR behaviors.
---
## Critical Issues
### C1 — Insecure CORS Configuration (All 23 Services)
**Severity**: HIGH — Security vulnerability
**Files**: `services/*/src/*/Program.cs` lines ~113120
Every service uses `AllowAnyOrigin()`:
```csharp
// Current (INSECURE)
policy.AllowAnyOrigin()
.AllowAnyMethod()
.AllowAnyHeader();
```
This allows any website to issue cross-origin requests to the API, enabling CSRF attacks. There is no origin validation. `AllowAnyOrigin()` is also incompatible with `AllowCredentials()`, silently breaking cookie/credential flows.
**Fix**:
```csharp
var allowedOrigins = builder.Configuration
.GetSection("Cors:AllowedOrigins")
.Get<string[]>() ?? Array.Empty<string>();
policy.WithOrigins(allowedOrigins)
.AllowAnyMethod()
.AllowAnyHeader()
.AllowCredentials();
```
Add `Cors__AllowedOrigins__0=https://goodgo.vn` per service in `docker-compose.yml` and Kubernetes manifests.
---
### C2 — Health Check Endpoints Require Authentication (18 of 23 Services)
**Severity**: HIGH — Breaks Kubernetes liveness/readiness probes
**Example**: `services/merchant-service-net/src/MerchantService.API/Program.cs` lines 132137
```csharp
// Current (missing .AllowAnonymous())
app.MapHealthChecks("/health");
app.MapHealthChecks("/health/live", new() { Predicate = _ => false });
app.MapHealthChecks("/health/ready");
```
Kubernetes kubelet cannot authenticate; probes return `401 Unauthorized` → pod marked unhealthy → restart loop. Docker Compose healthcheck (`curl -f http://localhost:8080/health/live`) also fails.
**Services without `.AllowAnonymous()`**: merchant, iam, order, wallet, storage, membership, booking, catalog, inventory, chat, social, mining, mission, promotion, ads-manager, ads-analytics, mkt-facebook, mkt-whatsapp (and others).
**Services correctly configured**: ads-serving, ads-tracking, ads-billing, mkt-x, mkt-zalo.
**Fix**:
```csharp
app.MapHealthChecks("/health").AllowAnonymous();
app.MapHealthChecks("/health/live", new() { Predicate = _ => false }).AllowAnonymous();
app.MapHealthChecks("/health/ready").AllowAnonymous();
```
---
### C3 — MediatR Behaviors Duplicated Across All 23 Services
**Severity**: HIGH — Maintenance risk; single bug requires 23 fixes
**Files**: `services/*/src/*/Application/Behaviors/`
Three behavior files are copied verbatim in every service:
| File | Lines per copy | Total copies | Total duplicated lines |
|------|---------------|--------------|----------------------|
| `LoggingBehavior.cs` | ~59 | 23 | ~1,357 |
| `ValidatorBehavior.cs` | ~41 | 23 | ~943 |
| `TransactionBehavior.cs` | ~50 | 23 | ~1,150 |
| **Total** | | | **~3,450** |
Example of identical files:
- `services/merchant-service-net/src/MerchantService.API/Application/Behaviors/LoggingBehavior.cs`
- `services/order-service-net/src/OrderService.API/Application/Behaviors/LoggingBehavior.cs`
Only the namespace differs. Any fix in one behavior must be manually propagated to 22 other copies.
**Fix**: Extract to a shared NuGet package `GoodGo.MediatR.Behaviors`:
```
packages/
GoodGo.MediatR.Behaviors/
LoggingBehavior.cs
ValidatorBehavior.cs
TransactionBehavior.cs
```
Reference from each service:
```xml
<PackageReference Include="GoodGo.MediatR.Behaviors" Version="1.0.0" />
```
---
### C4 — JWT Signature Validation Disabled in Development
**Severity**: MEDIUM — Allows any token to be accepted in dev environments
**Files**: `services/*/src/*/Program.cs` lines ~91108
```csharp
ValidateIssuerSigningKey = builder.Environment.IsDevelopment() ? false : true,
SignatureValidator = builder.Environment.IsDevelopment()
? (token, _) => new JsonWebToken(token)
: null,
```
In development, any crafted JWT with valid `exp`/`nbf` claims is accepted. Tokens are not signature-verified. Risk is mitigated by `IsDevelopment()` check but raises the question of why this is needed at all.
**Root Cause**: Likely the IAM service (Duende IdentityServer) uses a rotating signing key not pre-shared with other services in Docker Compose local dev.
**Fix**: Seed a deterministic development signing key via environment variable and share it across services in `docker-compose.yml`. This enables full signature validation in all environments.
---
## Warnings
### W1 — SeedWork Classes Duplicated Across Domain Layers
**Impact**: Medium — Changes to base patterns require multi-service updates
**Files**: `services/*/src/*/Domain/SeedWork/`
`Entity.cs`, `IAggregateRoot.cs`, `IRepository.cs`, `ValueObject.cs`, `IUnitOfWork.cs`, `Enumeration.cs` are copied into every service's Domain layer. This is partially justified by DDD isolation (Domain layer must have no external NuGet dependencies), but it means fixing a bug in `Entity.cs` requires updates in 23 places.
**Recommendation**: Document this as an intentional architectural trade-off in CLAUDE.md (currently undocumented). If the team wants single-source-of-truth, create `GoodGo.Domain.SeedWork` as a source-only NuGet (no DI dependencies).
---
### W2 — API Versioning Configured But Not Fully Utilized
**Impact**: Low — Unused NuGet package capability
**Files**: `services/*/src/*/Program.cs` lines 4256
```csharp
builder.Services.AddApiVersioning(options => {
options.DefaultApiVersion = new ApiVersion(1, 0);
options.AssumeDefaultVersionWhenUnspecified = true; // Bypasses versioning
});
```
Controllers use hardcoded routes (`[Route("api/v1/merchants")]`) rather than the `{version:apiVersion}` parameter. `Asp.Versioning` NuGet is installed but the API version attribute system is not used. Routes work correctly but versioning capability is nullified.
**Options**:
1. Accept current pattern (hardcoded v1 routes) and document it
2. Migrate to `[ApiVersion("1.0")]` on controllers and use `{version:apiVersion}` route parameter
---
### W3 — Missing README.md in 14 Services
**Impact**: Low — Developer onboarding friction
**Services with README**: iam-service-net, merchant-service-net, membership-service-net, mkt-facebook-service-net, mkt-whatsapp-service-net, mkt-x-service-net, mkt-zalo-service-net (7 of 23)
**Services without README**: order, wallet, storage, booking, catalog, inventory, chat, social, mining, mission, promotion, ads-manager, ads-serving, ads-tracking, ads-billing, ads-analytics (16 of 23)
---
### W4 — `init-databases.sh` Must Be Manually Maintained
**File**: `deployments/local/init-databases.sh`
When a new service is created, the database must be manually added to this script. There is no automated enforcement. If a developer creates a new service and forgets this step, the service's database won't be initialized in local dev.
**Recommendation**: Add a pre-start validation script or generate the init script from docker-compose.yml service names.
---
### W5 — One Incomplete Test TODO
**File**: `services/iam-service-net/tests/IamService.FunctionalTests/Controllers/AuthorizationPolicyTests.cs`
```csharp
/// TODO: Configure test authentication to properly inject role claims.
```
Minor — single comment in test code, not production code. But indicates an incomplete test scenario for authorization policies.
---
## Improvements
### I1 — Extract Common Infrastructure to Shared NuGet Packages
Priority: High
Create a `GoodGo.*` suite of shared NuGet packages:
- `GoodGo.MediatR.Behaviors` — LoggingBehavior, ValidatorBehavior, TransactionBehavior
- `GoodGo.AspNetCore.Extensions` — Common Program.cs middleware setup
- `GoodGo.Domain.SeedWork` *(optional, source-only)* — Entity, ValueObject, IAggregateRoot
This eliminates the bulk of cross-service duplication.
---
### I2 — Standardize CORS via Configuration
Priority: High
Move CORS allowed origins out of code and into configuration. Each service should read from `Cors:AllowedOrigins` environment variable. Docker Compose and Kubernetes manifests set per-environment values. This is a one-time templating effort with high security payoff.
---
### I3 — Add Health Check Tests to All Functional Test Suites
Priority: Medium
Verify that `/health/live` and `/health/ready` return 200 without authentication. Currently only some functional test suites include these tests. Adding them to all 23 services would catch the C2 issue automatically in CI.
```csharp
[Fact]
public async Task HealthLive_ReturnsOk_WithoutAuthentication()
{
var response = await _client.GetAsync("/health/live");
response.StatusCode.Should().Be(HttpStatusCode.OK);
}
```
---
### I4 — Centralize Docker Compose Environment Defaults
Priority: Medium
Several environment variables are repeated across 23+ service definitions in `docker-compose.yml`. Consider using YAML anchors or a `.env` fragment approach to reduce drift:
```yaml
x-common-env: &common-env
ASPNETCORE_ENVIRONMENT: Development
Serilog__MinimumLevel: Information
ConnectionStrings__Redis: redis:6379
```
---
### I5 — Add Consistent CI Health Check Validation
Priority: Low
Add a CI step that calls `/health/live` on each newly-built Docker image before pushing to Docker Hub. This catches health check regressions before staging deployment.
---
## Action Items (Prioritized)
| # | Action | Priority | Effort | Owner |
|---|--------|----------|--------|-------|
| 1 | Fix CORS — replace `AllowAnyOrigin()` with `WithOrigins()` in all 23 services | P0 | 2h | Founding Engineer |
| 2 | Fix health check authentication — add `.AllowAnonymous()` to 18 services | P0 | 1h | Founding Engineer |
| 3 | Extract MediatR behaviors to `GoodGo.MediatR.Behaviors` NuGet | P1 | 4h | Founding Engineer |
| 4 | Fix JWT development signing key — use deterministic key via env var | P1 | 2h | Founding Engineer |
| 5 | Add health check tests to all functional test suites | P1 | 3h | QA Engineer |
| 6 | Create README.md for 16 missing services | P2 | 8h | Tech Lead |
| 7 | Document SeedWork duplication trade-off in CLAUDE.md | P2 | 0.5h | Tech Lead |
| 8 — | Resolve IAM authorization test TODO | P2 | 1h | Backend Dev |
| 9 | Clarify API versioning strategy and document or clean up | P3 | 1h | CTO |
---
## Architecture Strengths
The following areas are well-implemented and should serve as reference patterns:
- **Clean Architecture**: All 23 services strictly enforce the Domain → no external dependencies rule
- **CQRS**: Commands, Queries, and Handlers follow naming conventions consistently
- **Entity Pattern**: Private fields + behavior methods + domain events is correctly applied everywhere
- **Test Structure**: Every service has both unit tests (xUnit + Moq + FluentAssertions) and functional tests (WebApplicationFactory + InMemory)
- **Dockerfile**: Multi-stage build, non-root user (dotnetuser:1001), health check executable installed, port 8080
- **Traefik**: Path-based routing, rate limiting, secure headers, CORS middleware all configured
- **Bilingual**: Comments, logs, validation messages in EN + VI across the entire codebase
- **Observability**: Prometheus + Grafana + Loki stack configured in `infra/observability/`
- **Monorepo**: pnpm workspaces + Turborepo correctly set up with shared `packages/`

327
docs/audit/frontend.md Normal file
View File

@@ -0,0 +1,327 @@
# Frontend Audit Report — POS System
**Auditor**: Senior Frontend Microservice Engineer
**Date**: 2026-03-20
**Scope**: Blazor WASM apps (`web-client-tpos-net`, `web-client-base-net`), MudBlazor usage, component architecture, state management
**Codebase**: `/apps/web-client-tpos-net` (195 Razor files, 248 C# files)
---
## Executive Summary
The POS System frontend (`web-client-tpos-net`) has a solid architectural foundation with well-structured multi-layout design, multi-vertical POS support, and clean component separation. However, **3 critical security vulnerabilities** were identified (JWT in localStorage, hardcoded OAuth2 client secret, password grant flow), combined with near-zero test coverage (~1 test file, 36 lines) and ~20% of POS pages having incomplete backend integration. These issues must be resolved before production deployment.
---
## Critical Issues
### CRIT-01 — JWT Stored in localStorage (XSS Risk)
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Services/AuthService.cs` lines 147148
**Severity**: CRITICAL
```csharp
await _js.InvokeVoidAsync("localStorage.setItem", TokenKey(role), token.AccessToken);
```
JWT tokens are stored in `localStorage`, making them accessible to JavaScript. Any XSS exploit will steal all active tokens and allow full account impersonation.
**Fix**: Use `httpOnly` cookies via the BFF (WebClientTpos.Server). Add a `/api/auth/token` endpoint on the BFF that sets the token as a cookie rather than returning it in the response body.
---
### CRIT-02 — Client Secret Hardcoded in Client-Side Code
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Services/AuthService.cs` lines 3940
**Severity**: CRITICAL
```csharp
private const string ClientId = "password-client";
private const string ClientSecret = "password-client-secret";
```
Blazor WASM compiles to WebAssembly served to the browser. Any constant is extractable via browser developer tools or disassembly. If `password-client-secret` is a real secret used in Duende IdentityServer, it is already compromised.
**Fix**: Move all OAuth2 token exchange calls to the BFF. The BFF holds the client secret server-side and proxies to IAM on behalf of the browser.
---
### CRIT-03 — Deprecated Password Grant (OAuth2)
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Services/AuthService.cs` line 136
**Severity**: CRITICAL
```csharp
new KeyValuePair<string, string>("grant_type", "password"),
```
The Resource Owner Password Credentials grant is deprecated in OAuth 2.1. It bypasses consent screens and exposes credentials to the client application. It cannot support MFA flows properly.
**Fix**: Migrate to Authorization Code Flow with PKCE. Use Duende IdentityServer's built-in OIDC endpoints.
---
### CRIT-04 — No CDN Subresource Integrity (SRI)
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/wwwroot/index.html` line 19
**Severity**: HIGH
Lucide icons loaded from `unpkg.com` CDN without an `integrity` attribute. A CDN compromise would allow arbitrary JavaScript execution in all POS sessions.
**Fix**:
```html
<script src="https://unpkg.com/lucide@latest/dist/umd/lucide.min.js"
integrity="sha384-[HASH]"
crossorigin="anonymous"></script>
```
Use a pinned version and generate the SRI hash with `openssl dgst -sha384`.
---
## Warnings
### WARN-01 — Token Refresh Not Implemented
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Services/AuthStateService.cs`
**Severity**: HIGH
`AuthStateService` stores the access token but never checks expiry or calls the refresh token endpoint. When tokens expire, API calls silently fail with 401, and users are not redirected to login.
**Fix**: Add a `TokenExpiry` property and a background timer that refreshes the token 60 seconds before expiry using the `refresh_token` from `OAuthTokenResponse`.
---
### WARN-02 — Global HttpClient Header Mutation (Race Condition)
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Services/PosDataService.cs` lines 4047
**Severity**: HIGH
```csharp
_http.DefaultRequestHeaders.Authorization =
new AuthenticationHeaderValue("Bearer", _authState.Token);
```
Mutating `DefaultRequestHeaders` on a shared `HttpClient` instance is not thread-safe. Under concurrent requests, one request may attach or overwrite the header mid-flight.
**Fix**: Use `HttpRequestMessage` with per-request headers, or register a `DelegatingHandler` that injects the Bearer token automatically without touching global headers.
---
### WARN-03 — Lucide Re-Init on Every Render
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Layout/AdminLayout.razor` line 196
**Severity**: MEDIUM
```csharp
protected override async Task OnAfterRenderAsync(bool firstRender)
{
try { await JS.InvokeVoidAsync("lucide.createIcons"); } catch { }
```
`createIcons()` traverses the entire DOM on every render cycle. `index.html` already sets up a `MutationObserver` for this purpose (lines 2546). The explicit call is redundant and adds unnecessary overhead.
**Fix**: Remove the `lucide.createIcons()` call from `OnAfterRenderAsync` entirely. The `MutationObserver` handles dynamic icon insertion.
---
### WARN-04 — MudBlazor ThemeProvider Declared in Multiple Layouts
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Layout/AdminLayout.razor` lines 1720
**Severity**: MEDIUM
`<MudThemeProvider>`, `<MudPopoverProvider>`, `<MudDialogProvider>`, and `<MudSnackbarProvider>` are declared in each layout independently. This causes multiple instances in the component tree and can create CSS specificity conflicts and duplicate dialog/snackbar stacks.
**Fix**: Move all four providers to `App.razor` once, outside the `<Router>`. Remove from individual layouts.
---
### WARN-05 — localStorage Logic Duplicated Across 5 Files
**Severity**: MEDIUM
Token read/write logic is repeated in:
1. `AuthService.cs` lines 147148
2. `AuthService.cs` lines 230236
3. `AdminLayout.razor` lines 215220
4. `StaffLayout.razor` (similar pattern)
5. `LanguageSwitcher.razor`
**Fix**: Extract a `LocalStorageService` with typed `GetAsync<T>` / `SetAsync<T>` / `RemoveAsync` methods. Register as singleton.
---
### WARN-06 — No Route Guards for Authenticated Pages
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Layout/AdminLayout.razor`
**Severity**: HIGH
Session restore happens in `OnAfterRenderAsync`, meaning the admin layout renders briefly before auth is checked. No `<AuthorizeView>` or route guard redirects unauthenticated users before render.
**Fix**: Add a `RedirectToLogin` component or use Blazor's `[Authorize]` attribute with a proper `AuthenticationStateProvider` backed by `AuthStateService`.
---
### WARN-07 — shopId Not Validated Against User Permissions
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Layout/AdminLayout.razor` lines 246286
**Severity**: HIGH
Shop context is detected by parsing the URL path (e.g., `/admin/shop/{shopId}/cafe/`). There is no check that the authenticated user has access to the given `shopId`. A user could navigate directly to another merchant's shop URL.
**Fix**: After detecting `shopId` from URL, call the backend to verify the current user owns/manages that shop. Redirect to `/admin` with an error if unauthorized.
---
### WARN-08 — Fragile Multi-Format API Response Deserialization
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Services/PosDataService.cs` lines 122185
**Severity**: MEDIUM
The deserializer tries 5 different response shapes in order. At line 152155, it falls back to selecting the **first array property** found in the JSON object. If a response contains multiple array properties (e.g., `{ "items": [...], "errors": [...] }`), it selects whichever comes first in the serialized JSON, which is non-deterministic.
**Fix**: Standardize all API responses to a single envelope: `{ "success": bool, "data": T, "error": { "code": "", "message": "" } }`. Remove the multi-format fallback logic.
---
### WARN-09 — Hardcoded String in AuthInput Component
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Components/Auth/AuthInput.razor` line 56
**Severity**: LOW
```csharp
public string ForgotPasswordText { get; set; } = "Quên mật khẩu?";
```
Default value hardcoded in Vietnamese. English-culture users will see Vietnamese text.
**Fix**: Use `@inject IStringLocalizer<AuthInput> L` and default to `L["ForgotPassword"]`.
---
### WARN-10 — ~20% of POS Pages Have Incomplete Backend Integration
**Severity**: HIGH
21 `TODO` comments indicate unimplemented API integration:
| File | Missing Integration |
|------|---------------------|
| `MemberCard.razor` | Member visit history API |
| `PartialPayment.razor` | Order Service API |
| `TipEntry.razor` | Order Service API |
| `GiftCardPayment.razor` | Order Service API |
| `PaymentPending.razor` | Order Service API |
| `CashDrawer.razor` | Merchant Service staff/shift |
| `QuickSale.razor` | Merchant Service staff/shift |
| `PendingOrders.razor` | Merchant Service staff/shift |
| `ClockInOut.razor` | Merchant Service staff/shift |
| `ShiftManagement.razor` | Merchant Service staff/shift |
| `OrderCancel.razor` | DDD Value Object mapping fix |
| `PriceCheck.razor` | DDD Value Object mapping fix |
| `OrderEdit.razor` | DDD Value Object mapping fix |
| `VoidRefund.razor` | DDD Value Object mapping fix |
| `StockIn.razor` | DDD Value Object mapping fix |
| `StockTransfer.razor` | DDD Value Object mapping fix |
| `StockOut.razor` | DDD Value Object mapping fix |
| `TreatmentPlan.razor` | Backend CRUD API |
| `ConsentForm.razor` | Backend persistence API |
---
### WARN-11 — Incomplete vi-VN Translations
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/wwwroot/locales/vi-VN.json`
**Severity**: LOW
Some keys present in `en-US.json` are missing in `vi-VN.json`. When a key is missing, `JsonStringLocalizer` returns the key name as the display text.
**Fix**: Run a diff between the two JSON files in CI. Fail the build on missing translation keys.
---
### WARN-12 — No IFormatProvider in JsonStringLocalizer
**File**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Localization/JsonStringLocalizer.cs`
**Severity**: LOW
`string.Format(format, arguments)` is called without a culture-aware `IFormatProvider`. Numbers and dates will be formatted using the thread's current culture, not the user's selected locale.
**Fix**:
```csharp
var value = string.Format(CultureInfo.CurrentUICulture, format ?? name, arguments);
```
---
### WARN-13 — No IAsyncDisposable on Layout Components
**Severity**: LOW
`AdminLayout.razor` subscribes to `AuthStateService.OnChange` but does not implement `IAsyncDisposable` to unsubscribe. In long-lived SPA sessions with layout reloads (e.g., via forceLoad navigation), this will leak event handlers.
**Fix**:
```csharp
@implements IDisposable
// ...
public void Dispose() => _authState.OnChange -= StateHasChanged;
```
---
## Improvements
### IMP-01 — Implement Authorization Code Flow + PKCE
Replace the password grant with a proper OIDC flow. Duende IdentityServer supports this natively. This enables MFA, social login, and OAuth2 compliance.
### IMP-02 — Add Content Security Policy
Add CSP header in BFF or via Traefik middleware:
```
Content-Security-Policy: default-src 'self'; script-src 'self' unpkg.com; style-src 'self' 'unsafe-inline'
```
### IMP-03 — Add @key to List Renders
All `@foreach` that render lists of components should use `@key` to prevent unnecessary DOM diffing:
```razor
@foreach (var item in _items)
{
<ItemComponent @key="item.Id" Item="item" />
}
```
### IMP-04 — Lazy Load POS Vertical Pages
Each POS vertical (Karaoke, Restaurant, Spa, Cafe, Retail) should be lazy-loaded. Blazor WASM supports lazy assembly loading via `LazyAssemblyLoader`. This will reduce initial download size.
### IMP-05 — Extract AdminLayout Shop Context to Service
`AdminLayout.razor` is 328 lines with complex shop context detection inline. Extract shop context detection to a `ShopContextService` (scoped) that parses the URL and validates permissions.
### IMP-06 — Add `@rendermode InteractiveWebAssembly` Explicitly
With .NET 8+, pages should declare render mode explicitly to avoid confusion with static rendering.
### IMP-07 — Standardize DTO Naming Convention
DTOs use mixed `Dto` vs no-suffix naming:
- `MerchantRegisterDto` (has `Dto` suffix) vs `AuthTokenResponse` (no suffix)
Standardize to always use `Dto` suffix for data transfer objects.
### IMP-08 — Add BFF Health Check Routes
`WebClientTpos.Server/Program.cs` has no health check endpoints. The BFF should expose `/health/live` and `/health/ready` for K8s probes.
### IMP-09 — Pin Lucide Version
Currently using `lucide@latest`. Pin to a specific version (e.g., `lucide@0.363.0`) to prevent breaking changes from CDN updates.
---
## Action Items
Prioritized by severity:
| Priority | Item | File(s) | Effort |
|----------|------|---------|--------|
| P0 | CRIT-02: Verify ClientSecret not exposed in prod | `AuthService.cs:40` | 1h |
| P0 | CRIT-01: Migrate JWT to httpOnly cookie via BFF | `AuthService.cs`, BFF | 2d |
| P0 | CRIT-03: Replace password grant with PKCE flow | `AuthService.cs` | 3d |
| P0 | WARN-06: Add route guards / AuthorizeView | All layouts | 1d |
| P0 | WARN-07: Validate shopId against user permissions | `AdminLayout.razor:246` | 4h |
| P1 | CRIT-04: Add SRI hash to Lucide CDN | `index.html:19` | 30m |
| P1 | WARN-01: Implement token refresh | `AuthStateService.cs` | 1d |
| P1 | WARN-02: Fix global HttpClient header mutation | `PosDataService.cs:40` | 4h |
| P1 | WARN-03: Remove redundant lucide.createIcons call | `AdminLayout.razor:196` | 30m |
| P1 | WARN-04: Move MudBlazor providers to App.razor | All layouts | 1h |
| P1 | WARN-10: Implement missing API integrations (21 TODOs) | Multiple POS pages | 5d |
| P2 | WARN-05: Extract LocalStorageService | 5 files | 4h |
| P2 | WARN-08: Standardize API response envelope | `PosDataService.cs:122` | 2d |
| P2 | WARN-13: Add IDisposable to layouts | Layout files | 1h |
| P2 | IMP-01: Write 50+ unit tests for services | `/tests/` | 3d |
| P2 | IMP-04: Lazy load POS vertical assemblies | Program.cs | 1d |
| P3 | WARN-09: Localize ForgotPasswordText | `AuthInput.razor:56` | 30m |
| P3 | WARN-11: CI check for missing translation keys | CI pipeline | 2h |
| P3 | WARN-12: Add IFormatProvider to localizer | `JsonStringLocalizer.cs` | 1h |
| P3 | IMP-09: Pin Lucide version | `index.html:19` | 15m |
---
*Generated by Senior Frontend Microservice Engineer audit — TEC-229*

View File

@@ -0,0 +1,285 @@
# Product Manager Audit — GoodGo POS System
> **Auditor**: Product Manager Agent
> **Date**: 2026-03-20
> **Scope**: Feature completeness, user flows, product gaps, roadmap alignment
> **Working Directory**: `/Users/velikho/Desktop/WORKING/pos-system`
---
## Executive Summary
GoodGo POS là nền tảng multi-vertical mạnh, với 5 vertical đều có core POS hoạt động (~8595% feature-complete cho Restaurant, Karaoke, Cafe, Spa, Retail). Tuy nhiên, toàn bộ marketing/CRM ecosystem — được quảng bá là key differentiator so với KiotViet, Sapo, iPOS — là **UI stub hoàn toàn, không có backend integration**. Đây là rủi ro sản phẩm cao nhất cần giải quyết trước demo/launch.
---
## Critical Issues
### CI-1: Marketing Suite là Demo 100% — Zero Backend Integration
**File paths**:
- `apps/web-client-tpos-net/src/WebClientTpos.Client/Pages/Marketing/AiChatbot.razor` (210 lines)
- `apps/web-client-tpos-net/src/WebClientTpos.Client/Pages/Marketing/CustomerCrm.razor` (173 lines)
- `apps/web-client-tpos-net/src/WebClientTpos.Client/Pages/Marketing/LivechatConsole.razor` (192 lines)
- `apps/web-client-tpos-net/src/WebClientTpos.Client/Pages/Marketing/AiContentStudio.razor` (176 lines)
- `apps/web-client-tpos-net/src/WebClientTpos.Client/Pages/Marketing/ChatbotAutomation.razor` (145 lines)
- `apps/web-client-tpos-net/src/WebClientTpos.Client/Pages/Marketing/SocialHub.razor` (162 lines)
**Pattern** (identical trong tất cả 6 trang):
```csharp
// EN: Demo customer data / VI: Dữ liệu khách hàng demo
private record CustomerInfo(string Name, string Phone, ...);
private readonly CustomerInfo[] _customers = new[] { new CustomerInfo(...), ... };
```
**Impact**: Merchant đăng ký plan Growth/Pro vì tin vào tính năng CRM, AI Chatbot, và Social Hub — khi vào dùng sẽ thấy dữ liệu giả. Đây là **churn risk****trust damage** lớn. Đây là điểm mà GoodGo tuyên bố vượt KiotViet và Sapo nhưng thực tế là không có gì.
**Recommended action**: Ẩn Marketing section khỏi Starter/Growth plan. Chỉ show cho Pro+ với label "Coming Soon" hoặc implement thực tế — không nên để merchant dùng thử demo data như product thật.
---
### CI-2: Voucher/Promotion Redemption bị gián đoạn hoàn toàn
**Backend** (hoạt động): `services/promotion-service-net/src/PromotionService.API/Application/Commands/` — có `ClaimVoucher`, `RedeemVoucher`, `CreateCampaign`.
**Frontend gap**: Không có UI nào trong payment flow để nhập mã voucher. Kiểm tra:
- `apps/web-client-tpos-net/src/WebClientTpos.Client/Pages/Pos/Shared/Payment/MethodSelect.razor` — không có "Apply Voucher" field.
- `apps/web-client-tpos-net/src/WebClientTpos.Client/Pages/Admin/Shop/ShopPromotions.razor` — có form tạo voucher nhưng khép lại tại UI.
**Impact**: Merchant tạo được voucher nhưng customer không thể redeem tại quầy. Merchant nghĩ mình đã chạy campaign — thực tế campaign không có tác dụng gì. Đây là **broken promise** trong mỗi giao dịch.
---
### CI-3: Payment Gateway — Trạng thái Live/Sandbox không rõ ràng
`services/wallet-service-net/src/WalletService.API/Application/Commands/Payments/``CreatePayment`, `ProcessPaymentCallback` — architecture tốt với `IPaymentGateway` abstraction. Tuy nhiên:
- Không rõ gateway nào đang live (VNPay? MoMo? ZaloPay?)
- Webhook callback security (HMAC signature verification) không được documented
- TipEntry UI tồn tại nhưng tip logic tới wallet service chưa được verify
**Impact**: Nếu team demo payment cho khách hàng mà không biết gateway nào đang hoạt động, sẽ fail live demo. Đây là **P0 trước bất kỳ investor/merchant demo**.
---
### CI-4: Analytics & Reporting — 0% Wired to Real Data
**Frontend** (đầy đủ UI): Revenue Dashboard, Staff Performance, EOD Report tồn tại.
**Backend**: `order-service-net``CloseDayCommand` nhưng không có aggregation queries cho trend analytics.
**Gap**: All KPI cards (revenue, transactions, avg order value, growth %) là hardcoded demo values trong Razor `@code` blocks.
**Impact**: Merchant đưa ra quyết định kinh doanh dựa trên số liệu giả. Với SMB Việt Nam, quyết định nhập hàng/thuê staff dựa trên báo cáo — nếu báo cáo sai họ sẽ mất tiền thật.
---
## Warnings
### W-1: Promotion-to-POS Gap Ảnh Hưởng Toàn Bộ Marketing Funnel
Backend promotion service exists và functional. Frontend marketing UI exists nhưng disconnected. Kết quả: Merchant create campaign → customer không thể redeem → merchant không thấy ROI → churn.
Theo user flow chuẩn: **Merchant tạo voucher** (Admin Shop → Promotions) → **Customer nhận voucher** (qua gì? SMS/QR? Chưa có) → **Customer redeem tại quầy** (POS Payment screen — không có input field) → **Order được discount** (backend ready). Bước 2 và 3 bị vỡ.
### W-2: Spa Vertical — Appointment Logic chưa End-to-End
- UI: Appointment calendar, therapist assignment, treatment timer có (`Pages/Pos/Spa/`)
- Backend: `booking-service` có CreateAppointment, therapist CRUD
- Gap: Khi khách walk-in không book trước, flow "new appointment from POS" không rõ ràng. "Book from POS" button và "Customer lookup" có trong UI nhưng integration path chưa documented.
### W-3: Onboarding Wizard — Device Pairing bước 5 chưa rõ
`Pages/Admin/Onboarding/OnboardingDevice.razor` — Step 5 có UI pairing device nhưng mechanism (QR code? token? physical device ID?) không visible trong code. Nếu merchant dùng iPad/tablet mới và step này fail, họ sẽ stuck ngay bước cuối onboarding.
### W-4: QR Menu Customer Ordering — Post-Cart Flow Bị Hở
`Pages/Customer/TableMenu.razor`: Cart management hoạt động tốt. Tuy nhiên sau khi customer bấm "Checkout":
- Navigates to payment method selection
- Nhưng customer context (tableId, sessionId) cần được passed qua payment flow
- Không rõ order được submit vào order-service như thế nào, staff có nhận notification không
**Risk**: Customer submit order nhưng staff không thấy → order bị miss → bad experience → review kém.
### W-5: Neon PostgreSQL Shared Credentials (Security)
`deployments/` — Staging/production sử dụng shared DB credentials cho tất cả 23+ services. Nếu 1 service bị compromise, attacker có access tới toàn bộ DB tier.
**PM impact**: Đây là compliance risk (PDPA Việt Nam, payment data security) sẽ block enterprise deals.
### W-6: Marketing mkt-* Services thiếu trong Docker (4 services)
`mkt-facebook-service-net`, `mkt-whatsapp-service-net`, `mkt-x-service-net`, `mkt-zalo-service-net` không có trong `deployments/local/docker-compose.yml`. Team không thể test marketing integration locally.
---
## Improvements
### IMP-1: Triển khai "Voucher Redemption" tại điểm bán — ROI cao nhất trong 1 sprint
**User Story**:
*As a cashier, I want to enter a voucher code during checkout so that the customer gets the promised discount.*
**Effort**: 3 ngày backend (new Query: `ValidateVoucher`), 2 ngày frontend (thêm input vào `MethodSelect.razor`).
**RICE Score**: Reach=500 merchants × Impact=3 (high) × Confidence=90% / Effort=1 week = **1,350** (highest priority).
### IMP-2: Wire Analytics Dashboard vào Real Data — Quick Win
`services/order-service-net` đã có order data. Thêm:
- `GetDailyRevenueQuery` → trả về revenue by day/payment method
- `GetTopProductsQuery` → trả về best-sellers
- Wire `Pages/Admin/Dashboard/RevenueDashboard.razor` thay demo arrays
**Effort**: 5 ngày backend (2 queries + aggregation), 2 ngày frontend (replace mock data).
**Impact**: Merchant có real business intelligence → quyết định tốt hơn → giảm churn.
### IMP-3: Vertical-Specific Reporting — "Quick Wins" per persona
| Vertical | Missing Report | Effort | Value |
|----------|---------------|--------|-------|
| Karaoke | Per-room revenue, peak hour heatmap | 5 ngày | High |
| Cafe | Daily queue throughput, popular items | 3 ngày | High |
| Restaurant | Table turnover rate, avg dwell time | 4 ngày | High |
| Spa | Therapist utilization, rebooking rate | 4 ngày | Medium |
| Retail | Inventory aging, return rate by SKU | 5 ngày | Medium |
### IMP-4: Customer Feedback Loop — Differentiate từ Competitors
Sau khi payment success, hiện QR code cho customer rate experience (1-5 stars + comment). Feed data vào merchant dashboard. Không competitor nào (KiotViet, Sapo, iPOS) có feature này tích hợp trong POS flow.
**Effort**: 2 tuần (new `feedback-service` đơn giản hoặc add vào order-service).
**Competitive impact**: Cao — là unique differentiator có thể quảng cáo.
### IMP-5: Marketing Section — Quyết Định Rõ Ràng về Roadmap
Có 3 lựa chọn:
**Option A — Hide & Honest (Recommended ngắn hạn)**:
- Ẩn Marketing section với "Coming in Q2 2026" label
- Giữ mockup làm prototype cho user research
- Không gây trust damage với merchants
**Option B — Stub with Real Data**:
- Giữ UI, thay demo arrays bằng real API calls (customer list from IAM/order service, basic stats)
- 2 tuần effort, delivers "real feel"
- Không có full chatbot/social media nhưng CRM list sẽ có data thật
**Option C — Full Implementation**:
- Build MCP-powered chatbot integration (Zalo, Facebook, WhatsApp)
- 8-12 tuần, cần dedicated team
- Align với Q2 2026 roadmap target
### IMP-6: Onboarding Post-Completion — First Order Tutorial
Sau bước 6 "Complete" của onboarding wizard, hiện guided walkthrough:
- "Bước tiếp theo: Tạo đơn hàng đầu tiên" với checklist
- Link tới tutorial video
**Effort**: 2 ngày, impact rất cao với new merchant activation rate.
### IMP-7: Mobile App Completeness cho Staff
`app-client-base-swift` iOS app: 9 pages, auth flows và basic navigation. Staff cần mobile app để:
- Clock in/out khi không ở quầy
- Nhận order notifications
- Check lịch làm việc
Priority: Medium, nhưng cần để tránh staff dùng browser mobile (poor UX).
---
## Action Items (Prioritized)
### Sprint 1 (Week 1-2) — Revenue-Critical
| # | Action | Owner | Effort | Priority | RICE |
|---|--------|-------|--------|----------|------|
| A1 | Xác nhận payment gateways nào đang live, document & test | CTO + Backend | 3 ngày | P0 | — |
| A2 | Implement Voucher Redemption field trong payment flow | Backend + Frontend | 1 tuần | P0 | 1,350 |
| A3 | Ẩn/label "Coming Soon" cho Marketing section | Frontend | 1 ngày | P0 | — |
| A4 | Wire Revenue Dashboard tới real order data | Backend + Frontend | 1 tuần | P1 | 900 |
### Sprint 2 (Week 3-4) — Merchant Retention
| # | Action | Owner | Effort | Priority |
|---|--------|-------|--------|----------|
| A5 | Vertical-specific reporting (Karaoke + Cafe ưu tiên) | Backend + Frontend | 2 tuần | P1 |
| A6 | Fix QR Customer Menu → Order submission flow | Backend + Frontend | 3 ngày | P1 |
| A7 | Clarify & fix Onboarding Device Pairing (bước 5) | Frontend + Backend | 2 ngày | P1 |
| A8 | Add post-onboarding "First Order" tutorial | Frontend | 2 ngày | P2 |
### Sprint 3 (Week 5-8) — Differentiation
| # | Action | Owner | Effort | Priority |
|---|--------|-------|--------|----------|
| A9 | Customer Feedback post-payment (QR rating) | Backend + Frontend | 2 tuần | P2 |
| A10 | Marketing Section — Option B (stub with real data) | Backend + Frontend | 2 tuần | P2 |
| A11 | iOS Mobile App — Staff clock-in/order notifications | Mobile | 3 tuần | P2 |
| A12 | Audit Neon PostgreSQL credentials (per-service isolation) | DevOps | 1 tuần | P1 |
---
## Success Metrics
| Metric | Baseline | Target (3 months) | How to Measure |
|--------|----------|-------------------|----------------|
| Merchant activation rate (first order after onboarding) | Unknown | >70% | Analytics: orders/merchant in first 7 days |
| Voucher redemption rate | 0% (feature broken) | >15% orders with voucher | Order service: count orders with promotion_id |
| Marketing section churn trigger | Unknown | <5% abandon after seeing stub | Frontend: page bounce rate on Marketing |
| Dashboard daily active usage | Unknown | >60% merchants view weekly | Analytics: dashboard page views/week |
| Payment failure rate | Unknown | <2% | Wallet service: failed payments / total |
| NPS (Merchant) | Baseline needed | >50 | Survey via in-app prompt |
---
## Competitive Positioning Assessment
| Feature | GoodGo Status | KiotViet | Sapo POS | iPOS |
|---------|:-------------:|:--------:|:--------:|:----:|
| Multi-vertical POS | ✅ 5 verticals | ❌ Retail only | ❌ | ✅ 2 |
| KDS Kitchen Display | ✅ | ❌ | ✅ | ✅ |
| AI-powered Operations | ✅ MCP (Cafe) | ❌ | ❌ | ❌ |
| Loyalty stamps/levels | ✅ Working | Basic | Basic | ✅ |
| Marketing CRM | ⚠️ Stub only | Basic | ✅ | ❌ |
| Real-time analytics | ⚠️ Demo data | ✅ | ✅ | ✅ |
| Omnichannel (Web+Mobile) | ✅ Architecture | ✅ | ✅ | ❌ |
| Booking/Scheduling | ✅ | ❌ | ❌ | ✅ |
| Multi-tenant (shops) | ✅ | ✅ | ✅ | ✅ |
**Assessment**: GoodGo dẫn đầu về multi-vertical và AI integration (MCP server là unique). Nhưng Real-time Analytics và Marketing CRM — nơi Sapo và KiotViet đang mạnh — đang là liability. Cần fix analytics trước mọi marketing effort.
---
## Product Gaps Summary (MoSCoW)
### Must Have (cho MVP launch)
- ✅ Multi-vertical POS order flow
- ✅ Payment methods UI
-**Real payment gateway integration verified**
-**Voucher redemption at POS**
-**Real analytics data (not demo)**
### Should Have (launch + 30 days)
- ❌ Marketing section với real data (Option B)
- ❌ Customer feedback loop
- ❌ Per-vertical reporting
- ✅ Loyalty stamps working
- ✅ KDS Kitchen Display
### Could Have (Q2 2026)
- ❌ Full AI Chatbot (Zalo/Facebook/WhatsApp)
- ❌ AI Content Studio
- ❌ Social Hub management
- ❌ iOS mobile app complete
- ❌ Customer segmentation CRM
### Won't Have (không ưu tiên)
- Enterprise multi-location analytics
- 3rd-party ERP integration
- Marketplace integration (Shopee, Lazada)
---
*Audit completed: 2026-03-20*
*Next review: After Sprint 1 completion (2026-04-03)*
*Related: [ROADMAP.md](../../ROADMAP.md) | [CTO_REPORT_SHOP_DELETE.md](../../CTO_REPORT_SHOP_DELETE.md)*

376
docs/audit/qa.md Normal file
View File

@@ -0,0 +1,376 @@
# QA Engineer — Audit Report: GoodGo POS System
**Date**: 2026-03-20
**Auditor**: QA Engineer (TechBi / GoodGo Platform)
**Scope**: Test coverage, test quality, CI/CD test pipelines, missing test scenarios
**Working Directory**: `/Users/velikho/Desktop/WORKING/pos-system`
---
## Executive Summary
The GoodGo POS System has a well-structured testing foundation for backend .NET microservices — all 26 services follow Clean Architecture test patterns with xUnit + Moq + FluentAssertions, and the IAM service demonstrates a gold-standard test suite. However, **only 1 of 26 services has a CI test pipeline**, there is **zero coverage for the MCP server** (a critical AI operations component with 12 tools), **no contract testing** between microservices, and **no performance testing** anywhere in the stack. Coverlet is installed in all test projects but never generates reports or enforces thresholds. Frontend testing is surface-level smoke tests only beyond 8 Playwright E2E specs.
---
## Critical Issues
### C1 — Only 1 of 26 Services Has CI Test Pipeline
**Severity**: Critical (P0)
**File**: `.github/workflows/ci-iam-service.yml`
Only `iam-service-net` has a dedicated CI workflow that runs unit tests and functional tests on pull requests. The remaining 25 services (merchant, order, fnb-engine, booking, catalog, inventory, wallet, promotion, membership, chat, social, storage, mining, mission, ads-manager, ads-serving, ads-billing, ads-tracking, ads-analytics, mkt-facebook, mkt-whatsapp, mkt-x, mkt-zalo, and both template services) have **no automated test execution on PRs or merges**.
This means regressions in 25 services are not caught until staging deployment or manual testing.
**Impact**: A broken handler, validator, or domain entity in any of the 25 uncovered services can be deployed to staging undetected.
**Evidence**:
- `.github/workflows/ci-iam-service.yml` — only service CI that runs tests
- No `ci-merchant-service.yml`, `ci-order-service.yml`, etc.
- `deploy-staging.yml` runs EF Core migrations but NOT test suites before deployment
---
### C2 — MCP Server Has Zero Tests (12 Production Tools)
**Severity**: Critical (P0)
**File**: `services/goodgo-mcp-server/package.json`, `services/goodgo-mcp-server/src/`
The `goodgo-mcp-server` is a TypeScript MCP server providing 12 tools for AI-assisted F&B operations:
- **Catalog tools** (`src/tools/catalog-tools.ts`): list/create/update/delete products, menu items
- **Inventory tools** (`src/tools/inventory-tools.ts`): check stock, record intake (nhập kho), record usage (xuất kho), low-stock alerts
- **Recipe tools** (`src/tools/recipe-tools.ts`): list/create recipes with ingredients
- **Analytics tools** (`src/tools/analytics-tools.ts`): popular items, cost analysis
The `package.json` has no `test` script, no test framework (Jest, Vitest, Mocha) installed, and no test files exist in the project directory.
**Impact**: Any bug in the MCP tool definitions, input validation via `zod`, API client error handling (`src/services/error-handler.ts`), or tool logic will silently surface as incorrect AI-generated operations on merchant data.
---
### C3 — No Coverage Thresholds or Reports Enforced
**Severity**: Critical (P0)
**File**: All `*.UnitTests.csproj` files (e.g., `services/order-service-net/tests/OrderService.UnitTests/OrderService.UnitTests.csproj`)
All test projects include `coverlet.collector` v6.0.2, but:
- No `.runsettings` or `coverlet.runsettings` file defines coverage thresholds
- No CI step collects or uploads coverage reports
- No quality gate exists to fail a build below a coverage percentage
- Minimum 80% coverage standard (per project requirements) is stated but not enforced
**Evidence**:
```xml
<!-- In OrderService.UnitTests.csproj — coverlet installed but unconfigured -->
<PackageReference Include="coverlet.collector" Version="6.0.2">
<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
<PrivateAssets>all</PrivateAssets>
</PackageReference>
```
No `--collect:"XPlat Code Coverage"` flag in any CI `dotnet test` command outside `ci-iam-service.yml`, and even there, no coverage threshold or report upload step exists.
---
### C4 — No Contract Testing Between Microservices
**Severity**: Critical (P0)
The system has 26 microservices communicating via REST API and RabbitMQ events. There is **no contract testing** (Pact.io or similar) anywhere in the codebase. Service contracts are verified only by:
- Shared TypeScript types in `packages/types/` (but packages have no tests either)
- Integration test patterns that use InMemoryDatabase (no real cross-service calls)
In a microservice architecture, breaking changes to API contracts — even minor ones like renaming a field from `merchantId` to `merchant_id` — can silently break consumers. Currently, such breaks would only surface after both services are deployed.
**Evidence**: Search across entire codebase found no Pact files, no `pactum` package, no consumer/provider test annotations.
---
## Warnings
### W1 — Shared Packages Have No Tests
**Severity**: High (P1)
**Files**: `packages/` directory (6 packages)
All 6 shared Node.js packages have zero test coverage:
- `packages/types/``@goodgo/types` (shared TypeScript types used across frontend + MCP server)
- `packages/http-client/``@goodgo/http-client` (axios wrapper used by all frontends)
- `packages/auth-sdk/``@goodgo/auth-sdk` (JWT utilities for auth flows)
- `packages/logger/``@goodgo/logger`
- `packages/tracing/``@goodgo/tracing`
- `packages/config/``@goodgo/config`
A breaking change in `@goodgo/http-client` or `@goodgo/auth-sdk` can cascade through all frontend applications (3 apps), the MCP server, and any consumer service.
---
### W2 — E2E Tests Require Live Backend (No Mocking)
**Severity**: High (P1)
**File**: `apps/web-client-tpos-net/tests/WebClientTpos.E2ETests/playwright.config.ts`
The Playwright config targets `http://localhost:5092` with no mock server or fixture setup. The E2E tests depend on a fully running backend stack (IAM service, Merchant service, etc.) to function. In CI (`ci-web.yml`), these tests run against hardcoded test credentials (`admin@goodgo.vn / Admin@123`) with no evidence of a test database being seeded.
This means:
- E2E tests are brittle — they fail if the backend is not pre-seeded
- Tests cannot run in isolation for frontend-only changes
- Test data hardcoded in `helpers/test-data.ts` uses magic UUIDs (`TEST_SHOP_ID = 00000000-0000-0000-0000-000000000001`) that must exist in a live DB
**Evidence**:
```typescript
// helpers/test-data.ts
export const TEST_SHOP_ID = '00000000-0000-0000-0000-000000000001';
export const ADMIN_USER = { email: 'admin@goodgo.vn', password: 'Admin@123' };
```
---
### W3 — Playwright Tests Do Not Cover Mutation Scenarios
**Severity**: High (P1)
**Files**: `apps/web-client-tpos-net/tests/WebClientTpos.E2ETests/tests/*.spec.ts`
All 8 Playwright E2E specs are **read-only / navigation-only** tests. They verify that pages load and elements are visible, but do not test:
- Creating an order (POST to order-service)
- Adding items to a POS order session
- Processing a payment (wallet-service integration)
- Creating a booking (booking-service)
- Staff login with RBAC restrictions
- Merchant onboarding wizard submission
- CRM campaign creation
- Promotions applied to orders
- Inventory decrement after order
The E2E layer only validates that pages render, not that they function end-to-end.
---
### W4 — No Performance / Load Testing
**Severity**: Medium (P2)
No k6, Artillery, Gatling, or any load testing framework exists in the repository. Critical paths with known performance sensitivity have no benchmarks:
- `POST /api/v1/orders` — Order creation under concurrent load
- `GET /api/v1/catalog/products` — Catalog listing (expected high read volume)
- `POST /api/v1/ads-tracking/events` — Ad event ingestion (bursty, high-throughput)
- SignalR connections in `chat-service-net` — WebSocket scalability
- `goodgo-mcp-server` — LLM tool calls during peak restaurant hours
Without baselines, there is no way to detect performance regressions introduced by refactors.
---
### W5 — Frontend Has No Component-Level Tests
**Severity**: Medium (P2)
**Files**: `apps/web-client-tpos-net/`, `apps/web-client-base-net/`
The Blazor WASM apps have only:
- 2 unit tests in `WebClientTpos.SmokeTests/ApiResponseTests.cs` (tests `ApiResponse<T>` DTO, not Razor components)
- 8 Playwright E2E specs for the POS app
There are no component-level tests for:
- `AuthButton` (5 variants: orange, blue, green, outline, ghost)
- `OtpInput` — complex state management, timer countdown
- `PosDataService` — 4-format deserialization logic (critical business logic in frontend service)
- `AuthStateService` — singleton JWT token state management
- MudBlazor form validation in onboarding wizard
- Shop sidebar vertical-specific menus (`ShopSidebarConfig`)
**Note**: `PosDataService` with 4-format deserialization (`plain array`, `paged {items}`, `wrapped {data:{items}}`, `direct {data:[]}`) is exactly the kind of complex branching logic that should have unit tests. A regression here silently breaks API data display for all merchants.
---
### W6 — MAUI and Swift Apps Have Minimal Testing
**Severity**: Medium (P2)
**Files**: `apps/app-client-base-net/tests/`, `apps/app-client-base-swift/`
- **MAUI** (`app-client-base-net`): 1 unit test file (`AppClientBase.UnitTests`) — content not analyzed but project is described as "Template phase"
- **Swift iOS** (`app-client-base-swift`): 1 smoke test referenced in `ci-mobile.yml` — no Xcode test plans found
Mobile apps in "template phase" with near-zero tests will carry this debt forward into feature development.
---
### W7 — Test Parallelism Disabled in Playwright
**Severity**: Low (P3)
**File**: `apps/web-client-tpos-net/tests/WebClientTpos.E2ETests/playwright.config.ts`
```typescript
fullyParallel: false,
workers: process.env.CI ? 1 : undefined,
```
With 8 spec files and no parallelism, E2E runs are unnecessarily slow. Parallel execution is safe when tests are properly isolated (each test navigates independently). Disabling parallelism suggests tests may be sharing state or there are concerns about race conditions on the backend — both are red flags.
---
## Improvements
### I1 — Add CI Pipelines for All 25 Missing Services
Use `ci-iam-service.yml` as the template. Each service pipeline should:
1. Trigger on changes to `services/{service-name}/**`
2. Spin up PostgreSQL 16-alpine service container
3. Run `dotnet test` with `--collect:"XPlat Code Coverage"` flag
4. Upload coverage to Codecov or similar
5. Fail the build if coverage drops below 80%
**Reference**: `.github/workflows/ci-iam-service.yml`
**Effort**: Low (copy-paste + variable substitution for 25 services)
---
### I2 — Add Tests for goodgo-mcp-server
Add Vitest (preferred for TypeScript ESM) with the following coverage targets:
```
services/goodgo-mcp-server/
src/
tools/
catalog-tools.test.ts ← mock axios, test zod validation, tool schema
inventory-tools.test.ts ← test stock calculation, low-stock alerts
recipe-tools.test.ts ← test recipe ingredient validation
analytics-tools.test.ts ← test aggregation logic
services/
error-handler.test.ts ← test all error code paths
api-client.test.ts ← test interceptors, auth headers
```
**Effort**: Medium (3-5 days)
---
### I3 — Enforce Coverage Thresholds in CI
Add a `.runsettings` file at repo root:
```xml
<?xml version="1.0" encoding="utf-8"?>
<RunSettings>
<DataCollectionRunSettings>
<DataCollectors>
<DataCollector friendlyName="XPlat Code Coverage">
<Configuration>
<Format>cobertura</Format>
<Threshold>80</Threshold>
<ThresholdType>Line</ThresholdType>
<ThresholdStat>Minimum</ThresholdStat>
</Configuration>
</DataCollector>
</DataCollectors>
</DataCollectionRunSettings>
</RunSettings>
```
And update CI test commands:
```bash
dotnet test --settings ../../.runsettings --collect:"XPlat Code Coverage"
```
---
### I4 — Add Contract Tests for Critical Service Boundaries
Priority contracts to establish (consumer → provider):
1. `order-service``catalog-service` (product price, availability)
2. `order-service``wallet-service` (payment processing)
3. `order-service``inventory-service` (stock decrement)
4. `merchant-service``iam-service` (user/RBAC validation)
5. `promotion-service``order-service` (discount application)
Use Pact.io with .NET consumer SDK (`PactNet`) and add contract verification to each service's CI pipeline.
---
### I5 — Add Mutation E2E Tests for Critical Workflows
Extend Playwright specs to cover actual business transactions:
- Complete order creation: table selection → add items → submit → verify order in KDS
- Payment flow: order total → wallet/VNPay mock → confirm receipt
- Booking creation: select date/time → confirm → verify in calendar
- Inventory: place order → verify stock decremented
**File to extend**: `apps/web-client-tpos-net/tests/WebClientTpos.E2ETests/tests/pos-restaurant.spec.ts`
---
### I6 — Unit Test PosDataService (Frontend)
`PosDataService` deserializes 4 different API response formats. Add unit tests in `WebClientTpos.SmokeTests/` or a new `WebClientTpos.UnitTests/` project:
```csharp
// PosDataServiceTests.cs
[Fact]
public async Task Deserialize_PlainArray_ShouldReturnItems();
[Fact]
public async Task Deserialize_PagedResponse_ShouldReturnItemsWithCount();
[Fact]
public async Task Deserialize_WrappedDataResponse_ShouldUnwrapCorrectly();
[Fact]
public async Task Deserialize_DirectDataArray_ShouldReturnItems();
```
---
### I7 — Add k6 Performance Tests for Critical Paths
Create `tests/performance/` at repo root:
```
tests/performance/
k6/
order-creation.js ← VUs: 100, ramp to 500
catalog-listing.js ← VUs: 200, sustained load
ads-event-ingestion.js ← VUs: 1000 burst
auth-token-refresh.js ← VUs: 50
```
Add a weekly `performance-test.yml` GitHub Actions workflow that runs k6 against staging.
---
## Action Items (Prioritized)
| # | Priority | Action | Effort | Owner |
|---|----------|--------|--------|-------|
| A1 | P0 — Critical | Generate CI pipelines for 25 missing services (copy `ci-iam-service.yml` template) | 1 day | DevOps |
| A2 | P0 — Critical | Add Vitest + test suite to `goodgo-mcp-server` (12 tools, error handler, api-client) | 3-5 days | Backend Dev |
| A3 | P0 — Critical | Add `.runsettings` coverage threshold (80% minimum) and upload in all CI pipelines | 0.5 days | DevOps |
| A4 | P0 — Critical | Add Pact.io contract tests for top 5 service boundaries | 5-7 days | Backend Dev |
| A5 | P1 — High | Add unit tests for all 6 shared packages in `packages/` | 3-4 days | Frontend Dev |
| A6 | P1 — High | Fix E2E backend dependency: add DB seeding/reset step in `ci-web.yml` | 2 days | DevOps |
| A7 | P1 — High | Extend Playwright specs with mutation tests (order creation, payment, booking) | 3-4 days | QA |
| A8 | P2 — Medium | Add `PosDataService` unit tests (4-format deserialization) | 1 day | Frontend Dev |
| A9 | P2 — Medium | Enable Playwright `fullyParallel: true` with proper test isolation | 0.5 days | QA |
| A10 | P2 — Medium | Add k6 performance baseline tests for 4 critical paths | 3 days | QA |
| A11 | P3 — Low | Expand MAUI unit test coverage beyond template placeholder | 2-3 days | Mobile Dev |
| A12 | P3 — Low | Add Xcode test plan for Swift iOS app | 2-3 days | Mobile Dev |
---
## Appendix: Current Test Inventory
### Backend (.NET)
| Layer | Count | Framework |
|-------|-------|-----------|
| Unit Tests (all services) | ~100+ files | xUnit + Moq + FluentAssertions |
| Functional Tests (all services) | ~50+ files | xUnit + WebApplicationFactory + InMemory DB |
| Smoke Tests (Blazor) | 4 projects | xUnit |
| MAUI Unit Tests | 1 project | xUnit |
### Frontend (TypeScript)
| Layer | Count | Framework |
|-------|-------|-----------|
| E2E Specs (Playwright) | 8 files | Playwright/test (Chromium only) |
| Smoke DTO Tests | 2 tests | xUnit via .NET project |
| Shared package tests | **0** | — |
| MCP server tests | **0** | — |
### CI Coverage
| Service | Has CI Tests? |
|---------|--------------|
| iam-service-net | ✅ Yes (`ci-iam-service.yml`) |
| 25 other services | ❌ No |
| web-client-tpos-net | ⚠️ Partial (E2E in `ci-web.yml`) |
| goodgo-mcp-server | ❌ No |
| packages/* (6) | ❌ No |
### Key Test Files Referenced
- IAM unit test: `services/iam-service-net/tests/IamService.UnitTests/Application/Commands/RegisterUserCommandHandlerTests.cs`
- IAM functional test: `services/iam-service-net/tests/IamService.FunctionalTests/Controllers/AuthControllerTests.cs`
- Template unit test: `services/_template_dot_net/tests/MyService.UnitTests/Application/CreateSampleCommandHandlerTests.cs`
- Playwright config: `apps/web-client-tpos-net/tests/WebClientTpos.E2ETests/playwright.config.ts`
- Playwright helpers: `apps/web-client-tpos-net/tests/WebClientTpos.E2ETests/helpers/test-data.ts`
- Smoke test: `apps/web-client-tpos-net/tests/WebClientTpos.SmokeTests/ApiResponseTests.cs`

View File

@@ -0,0 +1,229 @@
# Audit Report: POS System — Research Analyst Perspective
**Date**: 2026-03-20
**Auditor**: Research Analyst (TechBi Company)
**Scope**: Tech stack relevance, outdated dependencies, market alignment
**Task Reference**: TEC-231 (parent: TEC-217)
---
## Executive Summary
GoodGo Platform là một monorepo enterprise-grade với 26 microservices (.NET 10), 5 frontend apps, và AI integration (MCP server). Tech stack hiện đại, phù hợp thị trường, và có kiến trúc Clean Architecture + CQRS được enforce nhất quán. **Rủi ro chính là 3 services critical bị incomplete** (inventory, promotion, mission) — trực tiếp chặn order fulfillment và marketing campaigns tại production.
---
## Critical Issues
### CRIT-01: 3 Services Thiếu Handler Implementation — Chặn Core Features
**Severity**: 🔴 Blocker / P0
**Affected Services**:
- `services/inventory-service-net/` — 12 controller endpoints nhưng chỉ có **1 command handler**. Queries: 0. Không thể truy vấn tồn kho.
- `services/promotion-service-net/` — 0 commands, 0 queries. Controllers reference logic không tồn tại.
- `services/mission-service-net/` — 0 commands, 0 queries. Gamification engine hoàn toàn non-functional.
**Business Impact**:
- Order fulfillment bị block (không validate stock)
- Discount/voucher campaigns không hoạt động
- Loyalty missions/gamification không chạy được
**Evidence**: `ROADMAP.md` đã track vấn đề này ở line 71. Services scaffold nhưng chưa được implement.
**Recommendation**: Ưu tiên P0 sprint cho 3 services này. Reference `services/_template_dot_net/` cho patterns.
---
### CRIT-02: Port Conflicts trong Docker Compose Local — Cản Local Development
**Severity**: 🔴 Critical / P0
**Files**: `deployments/local/docker-compose.yml`
4 marketing services (mkt-facebook, mkt-whatsapp, mkt-zalo, mkt-x) đều dùng port `5000`, gây conflict khi start local stack đồng thời.
**Business Impact**: Developer không thể test marketing integrations locally, làm chậm development cycle.
**Recommendation**: Assign unique ports (ví dụ 5021-5024) cho 4 mkt-* services trong docker-compose.yml. Cập nhật Traefik service definitions tương ứng.
---
## Warnings
### WARN-01: Test Coverage Thấp (~15%) — Rủi Ro Production
**Severity**: 🟡 High / P1
| Layer | Coverage Estimate |
|-------|:-----------------:|
| iam-service-net | ~40% |
| order-service-net | ~25% |
| fnb-engine-net | ~35% |
| Tất cả services còn lại | ~510% |
| **Trung bình toàn hệ thống** | **~15%** |
371 test files tồn tại (`*.cs` trong thư mục tests/) nhưng phân bổ không đều. Nhiều services production-ready chưa có functional tests.
**Recommendation**: Target 50%+ coverage (unit + functional) trước khi scale thêm merchants. Ưu tiên: wallet, catalog, booking services.
---
### WARN-02: ads-serving Service Chỉ Read-Only
**Severity**: 🟡 Medium / P2
`services/ads-serving-service-net/` — tất cả endpoints là read-only, 0 command handlers. Ad auction logic và budget management không thể được điều khiển qua API.
**Business Impact**: Ads revenue pipeline bị giới hạn, không thể quản lý ads dynamically.
---
### WARN-03: Mobile Apps Chưa Complete
**Severity**: 🟡 Medium / P2
- `apps/app-client-base-net/` (MAUI) — scaffold, 0 pages implemented
- `apps/app-client-base-swift/` (SwiftUI iOS) — in-progress, 34 files nhưng thiếu POS workflow
**Business Impact**: GoodGo không có native mobile POS app. Cạnh tranh với KiotViet, Sapo (đều có mobile app hoàn chỉnh).
---
### WARN-04: K8s Secrets Dùng Placeholder Values
**Severity**: 🟡 Medium / P1
`deployments/staging/kubernetes/secrets.yaml` — chứa placeholder values thay vì sealed secrets thực tế.
**Recommendation**: Implement [sealed-secrets](https://github.com/bitnami-labs/sealed-secrets) operator (đã referenced trong codebase). Rotate tất cả staging credentials.
---
### WARN-05: PostgreSQL Exporter Chưa Configure
**Severity**: 🟡 Low / P2
`infra/observability/prometheus.yml` đề cập pg_exporter nhưng chưa deploy. DB metrics (connections, query latency, replication lag) không được collect.
**Recommendation**: Deploy `postgres-exporter` sidecar hoặc standalone. Thêm vào Prometheus targets và Grafana dashboard.
---
## Improvements
### IMP-01: Payment Integration — Mở Rộng Sang Momo, ZaloPay
**Priority**: High (Competitive)
Hiện tại: VNPay sandbox integrated. Chưa có: MoMo, ZaloPay — 2 ví điện tử phổ biến nhất tại Việt Nam (chiếm ~60% digital payment market share).
**Market Context**: Theo NAPAS 2025, thanh toán không tiền mặt tăng 40% YoY tại SMB sector. KiotViet và iPOS đều đã tích hợp đầy đủ.
**Recommendation**: Tạo payment adapter pattern trong `wallet-service-net/`, implement MoMo QR (IPN webhook) và ZaloPay API. Roadmap: Q2 2026.
---
### IMP-02: Regulatory Compliance — Nghị Định 123/2020 (E-Invoice)
**Priority**: High (Legal Requirement)
Từ 01/07/2022, tất cả businesses tại VN bắt buộc dùng hóa đơn điện tử (e-invoice) theo Nghị định 123/2020/NĐ-CP. GoodGo order service cung cấp invoice data nhưng chưa integrate với nhà cung cấp hóa đơn điện tử (VNPT, Viettel, FastBill).
**Recommendation**: Tạo `einvoice-service-net` hoặc integration layer trong order-service. Ưu tiên đối tác: VNPT Invoice hoặc Viettel-CA.
---
### IMP-03: AI/MCP Server — Mở Rộng Từ F&B Sang Các Vertical Khác
**Priority**: Medium (Differentiation)
`services/goodgo-mcp-server/` có 12 tools AI cho F&B. Đây là **điểm khác biệt lớn** so với KiotViet, Sapo, iPOS (không có AI). Tuy nhiên chưa có tools cho: Karaoke (room optimization), Spa (therapist scheduling AI), Retail (demand forecasting).
**Recommendation**: Roadmap thêm 8-10 MCP tools cho Karaoke và Retail verticals trong Q3 2026. ROI cao: AI features → premium tier justification.
---
### IMP-04: Observability — Thêm Business Metrics
**Priority**: Medium
Hiện tại Prometheus + Grafana tốt cho infrastructure metrics. Chưa có **business metrics** (orders/giờ, conversion rate, churn per merchant, revenue per vertical).
**Recommendation**: Implement custom Prometheus gauges trong order-service, wallet-service. Tạo Grafana dashboard "Business KPIs" cho Product team.
---
### IMP-05: Caching Strategy — Redis Chưa Được Tận Dụng Fully
**Priority**: Low / Medium
Redis đang được dùng làm SignalR backplane. Chưa thấy application-level caching cho:
- Catalog (menu items) — thường xuyên đọc, hiếm thay đổi
- Membership levels — static data
- Shop configuration — read-heavy
**Recommendation**: Implement `IMemoryCache` + Redis distributed cache cho catalog và shop config. Giảm DB load, tăng response time.
---
## Action Items
| # | Item | Priority | Owner | Timeline |
|---|------|:--------:|-------|----------|
| 1 | Implement inventory-service commands + queries (stock check, adjust, reserve) | P0 🔴 | Backend Dev | Sprint 1 |
| 2 | Implement promotion-service (create/apply discount, voucher validation) | P0 🔴 | Backend Dev | Sprint 1 |
| 3 | Implement mission-service (create mission, track progress, complete mission) | P0 🔴 | Backend Dev | Sprint 2 |
| 4 | Fix Docker Compose port conflicts (mkt-* services: assign 5021-5024) | P0 🔴 | DevOps | Sprint 1 |
| 5 | Rotate staging secrets, implement sealed-secrets operator | P1 🟡 | DevOps | Sprint 1 |
| 6 | Increase test coverage: wallet, catalog, booking services (target 50%) | P1 🟡 | QA + Devs | Sprint 2-3 |
| 7 | Integrate MoMo QR + ZaloPay API trong wallet-service | P1 🟡 | Backend Dev | Q2 2026 |
| 8 | E-invoice integration (Nghị Định 123/2020) — VNPT hoặc Viettel-CA | P1 🟡 | Backend Dev | Q2 2026 |
| 9 | iOS app (SwiftUI) — complete POS workflow cho Restaurant vertical | P2 🟡 | Mobile Dev | Q3 2026 |
| 10 | Expand MCP AI tools sang Karaoke + Retail verticals | P2 🟡 | Backend Dev | Q3 2026 |
| 11 | Deploy postgres-exporter, add Business KPI Grafana dashboard | P2 🟡 | DevOps | Sprint 3 |
| 12 | Implement Redis application-level caching (catalog, shop config) | P3 🟢 | Backend Dev | Q3 2026 |
---
## Market Alignment Summary
| Dimension | GoodGo | KiotViet | Sapo POS | iPOS | Assessment |
|-----------|:------:|:--------:|:--------:|:----:|-----------|
| Multi-vertical (5+) | ✅ | ❌ | 🟡 | 🟡 | **Competitive advantage** |
| AI-powered features | ✅ | ❌ | ❌ | ❌ | **Unique differentiator** |
| Real-time POS (KDS) | ✅ | ❌ | ❌ | ✅ | **Competitive parity** |
| Microservices scalability | ✅ | ❌ | ❌ | ❌ | **Enterprise advantage** |
| Mobile app (production) | 🔴 | ✅ | ✅ | ✅ | **Critical gap** |
| Payment (Momo/ZaloPay) | 🔴 | ✅ | ✅ | ✅ | **Critical gap** |
| E-invoice compliance | 🟡 | ✅ | ✅ | ✅ | **Legal requirement** |
| Loyalty + Marketing CRM | ✅ | ✅ | ✅ | ✅ | **Competitive parity** |
**Overall Assessment**: GoodGo có kiến trúc và AI capabilities vượt trội so với đối thủ VN, nhưng bị chặn bởi incomplete features và thiếu payment integrations phổ biến tại VN. Sau khi fix P0 items và thêm MoMo/ZaloPay, positioning sẽ rất mạnh cho enterprise SMB segment.
---
## Tech Stack Assessment
### Đánh giá tổng thể: ✅ Modern, Production-Grade, No EOL Risks
| Component | Version | Industry Standard | Assessment |
|-----------|---------|-----------------|-----------|
| .NET / C# | 10.0 / 14 | 8.0+ (LTS) | ✅ Ahead of curve |
| MediatR | 12.4.1 | 12.x | ✅ Current |
| EF Core + Npgsql | 10.0 / 10.0 | 8.0+ | ✅ Current |
| Duende IdentityServer | 7.0.8 | 7.x | ✅ Current |
| Blazor WASM | .NET 10 | .NET 8+ | ✅ Current |
| MudBlazor | 8.15.0 | 8.x | ✅ Current |
| Kubernetes | RKE2 | 1.28+ | ✅ Production-grade |
| Traefik | v3.3 | v3.x | ✅ Current |
| PostgreSQL | 16 | 15/16 | ✅ Current |
| TypeScript | 5.9.3 | 5.x | ✅ Current |
| Node.js | 25+ | 22 LTS | ⚠️ Using non-LTS; consider 22 LTS |
**Note**: Node.js 25 là Odd release (non-LTS). Recommend tracking Node.js 22 LTS cho stability, nếu không có specific reason để dùng 25.
---
*Audit completed: 2026-03-20 | Research Analyst — TechBi Company*
*Task: [TEC-231](/TEC/issues/TEC-231) | Parent: [TEC-217](/TEC/issues/TEC-217)*

466
docs/audit/security.md Normal file
View File

@@ -0,0 +1,466 @@
# Security Audit Report — GoodGo POS System
**Auditor**: Security Engineer
**Date**: 2026-03-20
**Scope**: Auth implementation, OWASP risks, secrets management, CORS, dependencies
**Classification**: INTERNAL — CONFIDENTIAL
---
## Executive Summary
The GoodGo POS System demonstrates a solid security architecture in its design patterns (CQRS, Clean Architecture, MediatR pipeline with validation, account lockout, TOTP-based 2FA, comprehensive audit logging). However, there is one systemic critical failure that overrides all other positives: **real production credentials are hardcoded and tracked in git across 19+ microservices**. Until this is remediated, the entire platform is at risk of full compromise. Additionally, JWT tokens are stored in `localStorage` (XSS-extractable), CSP headers are absent, and the MCP server has a live bearer token committed to version control.
---
## Critical Issues
### CRIT-01 — Production Database Credentials Committed to Git (19 Services)
**Severity**: Critical
**CVSS**: 9.8 (AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H)
**Affected Components**: All 19 `.NET` microservices `appsettings.json`
**Description**: The Neon PostgreSQL production password `npg_Ssfy6HKO0cXI` and the Redis production password `Velik@2026` (with public IP `167.114.174.113`) are hardcoded in `appsettings.json` files that are tracked by git.
**Affected Files** (sample — all 19 services affected):
```
services/iam-service-net/src/IamService.API/appsettings.json:33
services/order-service-net/src/OrderService.API/appsettings.json:33
services/wallet-service-net/src/WalletService.API/appsettings.json:33
services/merchant-service-net/src/MerchantService.API/appsettings.json:33
services/chat-service-net/src/ChatService.API/appsettings.json:33
... (14 more services)
```
**Leaked Credentials**:
```
PostgreSQL: Host=ep-holy-glitter-a4hongg7-pooler.us-east-1.aws.neon.tech
Username=neondb_owner | Password=npg_Ssfy6HKO0cXI
Redis: Host=167.114.174.113:6379 (public IP)
Password=Velik@2026
SMTP: SmtpPassword=a469e9333580ef5dbb141f01e33864ef-51afd2db-6c014754
(appsettings.json:54 in iam-service-net)
JWT Secret: goodgo-iam-service-secret-key-32chars!
(appsettings.json:44 in iam-service-net)
```
**Impact**: Anyone with read access to this repository can authenticate directly to the production database and Redis. All customer data, merchant data, orders, wallets, and PII are at risk of exfiltration or destruction.
**Proof of Concept**:
```bash
psql "Host=ep-holy-glitter-a4hongg7-pooler.us-east-1.aws.neon.tech;Username=neondb_owner;Password=npg_Ssfy6HKO0cXI;Database=iam_service;SSL Mode=Require"
```
**Remediation**:
1. **Rotate all credentials immediately** — DB password, Redis password, SMTP key, JWT secret
2. Replace all `appsettings.json` values with environment variable placeholders:
```json
"DefaultConnection": ""
```
3. Inject secrets at runtime via Kubernetes Secrets or HashiCorp Vault
4. Add `appsettings.*.json` pattern to `.gitignore` for non-Development configs, or use User Secrets locally
5. Run `git filter-repo` or BFG Repo Cleaner to purge secrets from git history
**Timeline**: **IMMEDIATE — within 24 hours**
---
### CRIT-02 — Active JWT Bearer Token Committed in MCP Server .env
**Severity**: Critical
**Affected Component**: `services/goodgo-mcp-server/.env:3`
**Description**: A live, signed JWT bearer token is committed to git history in the MCP server's `.env` file, which is tracked by git (`git ls-files` confirms).
```
API_TOKEN=eyJhbGciOiJSUzI1NiIsImtpZCI6IjQ3NjE1OTQ3...
```
Decoded payload includes: `sub`, `email`, `auth_time`, `jti` — a real user/service account token.
**Impact**: Any party with repository access can replay this token to authenticate as the associated service account until it expires or is revoked. The MCP server has 12 operational tools that can read/write F&B data.
**Remediation**:
1. Revoke the token immediately via IAM service
2. Remove `.env` from git tracking: add `services/goodgo-mcp-server/.env` to `.gitignore`
3. Purge from git history
4. Rotate the service account credentials
**Timeline**: **IMMEDIATE — within 24 hours**
---
### CRIT-03 — IdentityServer Using `AddDeveloperSigningCredential()` in All Environments
**Severity**: Critical
**Affected Component**: `services/iam-service-net/src/IamService.Infrastructure/DependencyInjection.cs:142`
**Description**: The code unconditionally calls `AddDeveloperSigningCredential()`, which generates a temporary RSA key on startup that is not persisted between restarts. This means:
- JWT tokens are invalidated on every service restart
- There is no certificate-based signing in production
- The signing key changes silently
```csharp
.AddDeveloperSigningCredential(); // Line 142 — applied in ALL environments
```
**Impact**: In production, this causes all active sessions to be invalidated on restart. More critically, a production RSA private key should be stored in a secret vault (Vault/K8s Secret), not re-generated on each boot.
**Remediation**:
```csharp
if (env.IsDevelopment())
identityServer.AddDeveloperSigningCredential();
else
identityServer.AddSigningCredential(LoadCertificateFromVault()); // HSM or K8s TLS secret
```
**Timeline**: Before next production deployment
---
## Warnings
### WARN-01 — JWT Tokens Stored in `localStorage` (XSS Risk)
**Severity**: High
**Affected Component**: `apps/web-client-tpos-net/src/WebClientTpos.Client/Services/AuthService.cs:147`
**Description**: Access tokens are stored in `localStorage` using role-specific keys (`aPOS_token_owner`, `aPOS_token_staff`):
```csharp
await _js.InvokeVoidAsync("localStorage.setItem", TokenKey(role), token.AccessToken);
```
`localStorage` is accessible by any JavaScript running on the same origin. If a single XSS vulnerability is introduced anywhere in the Blazor WASM app, an attacker can exfiltrate all tokens.
**Impact**: XSS → token theft → full account takeover.
**Remediation**:
- Use `HttpOnly` cookies for token storage (tokens never accessible to JS)
- If `localStorage` is retained for Blazor WASM constraints, implement strict CSP (see WARN-02) to minimize XSS risk
- Add token binding or short expiry (already 15 min — good) with refresh rotation
---
### WARN-02 — No Content-Security-Policy (CSP) Header
**Severity**: High
**Affected Component**: `infra/traefik/dynamic/middlewares.yml`
**Description**: The `secure-headers` middleware is missing `Content-Security-Policy`:
```yaml
secure-headers:
headers:
contentTypeNosniff: true # ✓
browserXssFilter: true # ✓ (deprecated but present)
frameDeny: true # ✓
stsSeconds: 31536000 # ✓
# CSP: MISSING ✗
# Referrer-Policy: MISSING ✗
# Permissions-Policy: MISSING ✗
```
Without CSP, the browser has no instructions to block inline scripts, restrict `eval()`, or limit allowed resource origins — the primary defense against XSS exploitation.
**Remediation** — Add to `secure-headers` middleware:
```yaml
contentSecurityPolicy: >
default-src 'self';
script-src 'self' 'wasm-unsafe-eval';
style-src 'self' 'unsafe-inline';
img-src 'self' data: https:;
connect-src 'self' https://api.goodgo.vn wss://;
font-src 'self';
object-src 'none';
frame-ancestors 'none';
referrerPolicy: "strict-origin-when-cross-origin"
permissionsPolicy: "camera=(), microphone=(), geolocation=()"
```
Note: Blazor WASM requires `wasm-unsafe-eval` for .NET runtime compilation.
---
### WARN-03 — CORS `allowCredentials: true` with Multiple Origins
**Severity**: High
**Affected Component**: `infra/traefik/dynamic/middlewares.yml:27`
**Description**:
```yaml
cors:
headers:
accessControlAllowOriginList:
- "http://localhost:3000" # Development origins
- "http://localhost:3001"
- "https://goodgo.vn" # Production origin
accessControlAllowCredentials: true # ← HIGH RISK
```
Combining `allowCredentials: true` with a list of origins that includes localhost development URLs is dangerous. If a developer's machine is compromised, the credentials flag allows the attacker to make authenticated cross-origin requests from any of these origins.
**Remediation**:
- Per-environment CORS config: production allows only `https://goodgo.vn`
- Development config in a separate file that is NOT applied to production
- `admin.goodgo.vn` should be added to production allowlist if used
---
### WARN-04 — `sslRedirect: false` in Shared Middleware Config
**Severity**: High
**Affected Component**: `infra/traefik/dynamic/middlewares.yml:5`
**Description**:
```yaml
sslRedirect: false # EN: Disabled for local development
```
This file is used across all environments. If this same `middlewares.yml` is deployed to staging/production (common in this monorepo pattern), HTTPS redirection will be disabled.
**Remediation**:
- Separate `middlewares.yml` per environment, or use Traefik environment variable substitution
- Explicitly set `sslRedirect: true` in staging/production overlay
---
### WARN-05 — `Jwt__RequireHttpsMetadata=false` Across All Services
**Severity**: High
**Affected Component**: `deployments/local/docker-compose.yml` (lines ~219, 265, 372, 420, etc.)
**Description**: Every service in docker-compose overrides JWT metadata HTTPS requirement:
```yaml
- Jwt__RequireHttpsMetadata=false
```
This is acceptable for local Docker networking. However, if these environment variables are inadvertently propagated to staging/production K8s ConfigMaps, JWT validation will succeed over plain HTTP.
**Remediation**:
- Verify K8s ConfigMaps do NOT contain this override: `grep -r RequireHttpsMetadata deployments/staging/ deployments/production/`
- Add to deployment runbook: `RequireHttpsMetadata` must be `true` in staging and production
---
### WARN-06 — Traefik Dashboard Exposed Without Authentication
**Severity**: Medium
**Affected Component**: `deployments/local/docker-compose.yml:121`
**Description**:
```yaml
command:
- "--api.insecure=true"
```
Traefik API/dashboard is accessible at `http://localhost:8080` with no authentication in local development. While local-only, this exposes full routing configuration, middleware definitions, and service topology.
**Remediation**:
- Use `--api.insecure=false` and enable `api.dashboard: true` with `BasicAuth` middleware
- Verify staging/production Traefik deployment does NOT have `--api.insecure=true`
---
### WARN-07 — Unauthenticated Ad Tracking Endpoints Without Rate Limiting
**Severity**: Medium
**Affected Component**: Routes for `/api/v1/pixels`, `/api/v1/conversions`, `/api/v1/ads/events`
**Description**: These endpoints are intentionally public (pixel tracking requires no auth) but have no documented rate limiting in the Traefik route labels. The only defined rate limits are `auth-ratelimit`, `payment-ratelimit`, `api-ratelimit`, and `hub-ratelimit`.
**Impact**: An attacker can spam fake conversion/pixel events to corrupt analytics, inflate ad billing, or DoS the tracking services.
**Remediation**:
- Apply a dedicated `tracking-ratelimit` (e.g., 200 req/min per IP) to these routes
- Implement HMAC signature validation on pixel requests with a short-lived nonce
---
### WARN-08 — AllowedHosts Wildcard in IAM Service
**Severity**: Medium
**Affected Component**: `services/iam-service-net/src/IamService.API/appsettings.json:79`
**Description**:
```json
"AllowedHosts": "*"
```
A wildcard `AllowedHosts` allows the IAM service (which issues JWT tokens) to respond to requests regardless of the `Host` header value. This can be exploited in DNS rebinding attacks or Host header injection to redirect token issuance.
**Remediation**:
```json
"AllowedHosts": "iam-service;localhost;127.0.0.1"
```
Production should explicitly list `iam.goodgo.vn` or the internal K8s service name.
---
### WARN-09 — K8s Staging Secrets File Contains Placeholders
**Severity**: Medium
**Affected Component**: `deployments/staging/kubernetes/secrets.yaml`
**Description**: The secrets manifest contains literal placeholder strings:
```yaml
Jwt__Secret: "PLACEHOLDER-staging-jwt-secret-min-32-chars"
```
If a CI/CD pipeline or developer applies this file directly without substitution, the platform will launch with predictable secrets.
**Remediation**:
- Use `secrets.yaml.example` pattern (like production) — remove the live staging `secrets.yaml` from git
- Adopt External Secrets Operator or Sealed Secrets for K8s secret management
- Add a CI pre-flight check that rejects `PLACEHOLDER` values in K8s secret manifests
---
### WARN-10 — TOTP Verification Window Allows 90-Second Replay
**Severity**: Low
**Affected Component**: `services/iam-service-net/src/IamService.Infrastructure/TwoFactor/TotpTwoFactorService.cs:86`
**Description**:
```csharp
new VerificationWindow(previous: 1, future: 1)
```
A tolerance of ±1 time step (30 seconds each) means a valid TOTP code has a 90-second effective validity window. Combined with no used-code tracking, a captured OTP could be replayed within 90 seconds.
**Remediation**:
- Track used TOTP codes in Redis with a 90-second TTL to prevent replay
- Consider reducing window to `previous: 0, future: 0` if clock skew is controlled
---
## Improvements
### IMP-01 — Implement Secrets Management Pipeline
Replace all hardcoded credentials with a proper secrets lifecycle:
```
Local Dev: dotnet user-secrets / .env (gitignored)
CI/CD: GitHub Actions Secrets → injected at build time
Staging/Prod: Kubernetes External Secrets Operator → HashiCorp Vault
or AWS Secrets Manager / Azure Key Vault
```
Reference implementation pattern for each service:
```csharp
// Program.cs — replace appsettings.json secret loading
builder.Configuration.AddEnvironmentVariables();
// In K8s: environment variables injected from K8s Secrets
```
---
### IMP-02 — Add Serilog Sensitive Data Destructuring
Prevent accidental PII/credential logging:
```csharp
Log.Logger = new LoggerConfiguration()
.Destructure.ByTransforming<ApplicationUser>(u => new { u.Id, u.Email })
.Filter.ByExcluding(e => e.Properties.ContainsKey("password"))
.WriteTo.Console()
.CreateLogger();
```
---
### IMP-03 — Add Security Scanning to CI/CD Pipeline
```yaml
# .github/workflows/security.yml
- name: Secret scanning
uses: gitleaks/gitleaks-action@v2
- name: Dependency audit (.NET)
run: dotnet list package --vulnerable --include-transitive
- name: SAST
uses: github/codeql-action/analyze@v3
with:
languages: csharp, javascript
```
---
### IMP-04 — Implement Refresh Token Rotation
Current refresh tokens have 7-day sliding window with no rotation. Add one-time-use refresh tokens:
```csharp
// In ResourceOwnerPasswordValidator or token endpoint
// Revoke old refresh token on each use
await _tokenRevocationService.RevokeAsync(oldRefreshToken);
var newRefreshToken = await _tokenService.IssueRefreshTokenAsync(user);
```
---
### IMP-05 — Add `Referrer-Policy` and `Permissions-Policy` Headers
Extend `secure-headers` middleware:
```yaml
customResponseHeaders:
Referrer-Policy: "strict-origin-when-cross-origin"
Permissions-Policy: "camera=(), microphone=(), geolocation=(self), payment=(self)"
Cross-Origin-Opener-Policy: "same-origin"
Cross-Origin-Resource-Policy: "same-site"
```
---
## Action Items
| Priority | ID | Action | Owner | Deadline |
|----------|----|--------|-------|----------|
| P0 | CRIT-01 | Rotate all production credentials, remove from git, purge history | DevOps + CTO | 24h |
| P0 | CRIT-02 | Revoke MCP server JWT token, remove `.env` from git tracking | DevOps | 24h |
| P0 | CRIT-03 | Replace `AddDeveloperSigningCredential()` with cert-based signing for production | Backend Lead | Before next deploy |
| P1 | WARN-01 | Evaluate `HttpOnly` cookie storage vs. localStorage for JWT tokens | Frontend Lead | 1 week |
| P1 | WARN-02 | Add CSP header to Traefik `secure-headers` middleware | DevOps | 1 week |
| P1 | WARN-03 | Separate CORS config by environment; remove localhost from production | DevOps | 1 week |
| P1 | WARN-04 | Create per-environment `middlewares.yml`; enforce `sslRedirect: true` in prod | DevOps | 1 week |
| P1 | WARN-05 | Audit K8s manifests for `RequireHttpsMetadata=false` — must not appear in staging/prod | DevOps | 1 week |
| P2 | WARN-06 | Disable Traefik insecure API, add BasicAuth to dashboard | DevOps | 2 weeks |
| P2 | WARN-07 | Apply rate limiting to tracking/pixel endpoints | Backend | 2 weeks |
| P2 | WARN-08 | Set explicit `AllowedHosts` in all service `appsettings.json` | Backend Lead | 2 weeks |
| P2 | WARN-09 | Remove staging `secrets.yaml` from git, adopt External Secrets Operator | DevOps | 2 weeks |
| P2 | WARN-10 | Add Redis-backed TOTP replay prevention | Backend | 2 weeks |
| P3 | IMP-01 | Implement full secrets management pipeline (Vault / External Secrets) | DevOps | 1 month |
| P3 | IMP-02 | Add Serilog sensitive data destructuring to all services | Backend | 1 month |
| P3 | IMP-03 | Add GitLeaks + CodeQL + dotnet vulnerability scan to CI/CD | DevOps | 1 month |
| P3 | IMP-04 | Implement refresh token rotation | Backend | 1 month |
| P3 | IMP-05 | Add Referrer-Policy and Permissions-Policy headers | DevOps | 1 month |
---
## Positive Findings
The following security controls are correctly implemented and should be maintained:
- ✅ **Account lockout**: 5 failed attempts → 15-minute lockout (`DependencyInjection.cs:93-96`)
- ✅ **TOTP 2FA**: OtpNet library, 160-bit secret, QR code provisioning, 10 recovery codes
- ✅ **Password hashing**: ASP.NET Core Identity PBKDF2/SHA256, enforced complexity rules
- ✅ **Audit logging**: 52 event types tracked with IP, UserAgent, actor, resource (`AuditEventType.cs`)
- ✅ **OTP hashing**: `SHA256.HashData()` used for phone/email OTPs — not stored in plaintext
- ✅ **Generic auth errors**: Login failures return generic messages to prevent username enumeration
- ✅ **Auth rate limiting**: 10 req/min on login/register/2FA endpoints
- ✅ **RBAC policies**: `RequireSuperAdmin`, `RequireAdmin`, `OwnerOrAdmin` custom handler
- ✅ **Health checks**: `/health/live` and `/health/ready` endpoints on all services
- ✅ **Short-lived access tokens**: 15-minute expiry
- ✅ **Security headers**: `X-Content-Type-Options`, `X-XSS-Protection`, `X-Frame-Options`, `HSTS`
-**Input validation**: FluentValidation in MediatR pipeline on all Commands
---
*Report generated by Security Engineer — TechBi / GoodGo Platform*
*Next audit scheduled: 2026-06-20*

View File

@@ -0,0 +1,191 @@
# Technical Writer Audit Report — GoodGo POS System
**Auditor**: Technical Writer Agent
**Date**: 2026-03-20
**Scope**: Documentation quality, API docs, README files, code comments
**Working Directory**: `/Users/velikho/Desktop/WORKING/pos-system`
---
## Executive Summary
The GoodGo Platform has a strong documentation foundation — 102+ markdown files in `/docs`, bilingual EN/VI coverage, and `SERVICE_DOCS.md` for all 25 microservices. The primary gaps are **service-level README files** (52% missing) and **OpenAPI specification files** (96% missing across services), both of which are addressable without major effort.
---
## Critical Issues
### C1 — OpenAPI Specs Missing for 24/25 Services
**Severity**: Critical
**Location**: `/docs/en/api/openapi/` (only `iam-service.yaml` present)
Only `iam-service.yaml` exists as a committed OpenAPI specification. The remaining 24 services (`merchant-service-net`, `order-service-net`, `fnb-engine-net`, `booking-service-net`, `catalog-service-net`, `inventory-service-net`, `wallet-service-net`, `promotion-service-net`, `membership-service-net`, `chat-service-net`, `social-service-net`, `storage-service-net`, `mining-service-net`, `mission-service-net`, all 5 ads services, all 4 mkt services) have no committed spec file.
**Impact**: No API discoverability for 96% of services. Clients cannot generate SDKs, cannot import into Postman/Insomnia, and there is no machine-readable contract for integration tests.
**Recommendation**: Export Swagger specs from each `.NET` service at build time and commit to `/docs/en/api/openapi/{service-name}.yaml`. Create a script under `scripts/` to automate collection.
---
### C2 — Postman Collections Directory is Empty
**Severity**: Critical
**Location**: `/docs/en/api/postman/` (0 files)
The directory exists but contains no collection files. Without Postman collections, developers must manually reconstruct requests from `SERVICE_DOCS.md`.
**Impact**: Developer onboarding friction — no ready-to-run API test collection.
**Recommendation**: Export Postman collections from Swagger specs or create manually. Commit to `/docs/en/api/postman/{service-name}.json`.
---
## Warnings
### W1 — Service README Files Missing for 13/25 Services (52% Gap)
**Severity**: High
**Affected Services**:
- `ads-analytics-service-net/` — no `README.md`
- `ads-billing-service-net/` — no `README.md`
- `ads-manager-service-net/` — no `README.md`
- `ads-serving-service-net/` — no `README.md`
- `ads-tracking-service-net/` — no `README.md`
- `booking-service-net/` — no `README.md`
- `catalog-service-net/` — no `README.md`
- `fnb-engine-net/` — no `README.md`
- `inventory-service-net/` — no `README.md`
- `order-service-net/` — no `README.md`
- `promotion-service-net/` — no `README.md`
- `social-service-net/` — no `README.md`
- `wallet-service-net/` — no `README.md`
**Mitigation**: All missing-README services have comprehensive `SERVICE_DOCS.md` files (auto-generated, covering API endpoints, database schema, domain models, configuration). This partially compensates, but `README.md` is typically the first file a new developer opens and contains prerequisites and quick-start instructions not found in `SERVICE_DOCS.md`.
**Recommendation**: Add `README.md` to all 13 services using `iam-service-net/README.md` as the template (294 lines, gold standard).
---
### W2 — Missing `storage-service-net/SERVICE_DOCS.md`
**Severity**: Medium
**Location**: `services/storage-service-net/`
`storage-service-net` has a `README.md` but the audit found no corresponding `SERVICE_DOCS.md`. All other services have both. This is an inconsistency in the documentation generation process.
**Recommendation**: Run the documentation generation audit on `storage-service-net` and create `SERVICE_DOCS.md`.
---
### W3 — Architecture Doc `storage-proposal.md` Exists Only in Vietnamese
**Severity**: Medium
**Location**: `/docs/vi/architecture/storage-proposal.md` (EN version missing)
The Vietnamese architecture folder contains `storage-proposal.md` with no corresponding English version at `/docs/en/architecture/storage-proposal.md`. This breaks the bilingual parity maintained across all other documentation.
**Recommendation**: Create `/docs/en/architecture/storage-proposal.md` as an English translation of the Vietnamese file.
---
### W4 — No CHANGELOG / Release Notes
**Severity**: Medium
**Location**: Root level (no `CHANGELOG.md`)
No formal release history exists. While conventional commits (`feat:`, `fix:`, `docs:`) provide an informal changelog via `git log`, there is no generated or maintained `CHANGELOG.md`.
**Mitigation**: Commit messages follow conventional commits format consistently — this is recoverable.
**Recommendation**: Add a root `CHANGELOG.md` and automate generation using `standard-changelog` or `conventional-changelog` as part of the release CI pipeline.
---
### W5 — `packages/config/README.md` Missing
**Severity**: Low
**Location**: `packages/config/` (no `README.md`)
5/6 shared Node.js packages have README files. `@goodgo/config` is undocumented.
**Recommendation**: Create `packages/config/README.md` documenting configuration variables, usage, and examples. Use `packages/types/README.md` as a template.
---
## Improvements
### I1 — No Visual System Architecture Diagram
Architecture docs are text-heavy but there is no Mermaid or D2 diagram showing all 25 services, their communication patterns, and data flows in a single view. `/docs/en/architecture/system-design.md` covers this conceptually but lacks a visual overview.
**Recommendation**: Add a Mermaid `graph LR` or `C4Context` diagram to `system-design.md` illustrating service topology.
---
### I2 — VitePress Documentation Site Not Publishing Architecture Docs
`/apps/web-docs/` is configured (VitePress v1.6.3, Mermaid plugin) but the deployment/CI pipeline does not appear to publish the documentation site. The `web-docs` package has `dev`, `build`, `preview` scripts but no CI workflow (`ci-web-docs.yml`) to deploy it.
**Recommendation**: Add a GitHub Actions workflow to build and deploy the VitePress site to GitHub Pages or a subdomain (e.g., `docs.goodgo.vn`).
---
### I3 — `.cursor/skills/` and `/docs/skills/` Duplication
Skills documentation exists in two places: `.cursor/skills/` (15 skill files) and `/docs/en/skills/` (13+ docs). They are partially overlapping. This creates maintenance burden — updates may be applied to one location and not the other.
**Recommendation**: Establish a single source of truth. Either have `.cursor/skills/` reference `/docs/en/skills/`, or consolidate into one location with symlinks.
---
### I4 — No API Documentation Index
There is no `/docs/en/api/README.md` or central index listing all services, their Swagger UI URLs (e.g., `http://localhost:{port}/swagger`), and their OpenAPI spec paths. Developers must know which service runs on which port.
**Recommendation**: Create `/docs/en/api/README.md` with a table:
| Service | Port | Swagger UI | OpenAPI Spec |
|---------|------|------------|--------------|
| iam-service-net | 5001 | http://localhost:5001/swagger | openapi/iam-service.yaml |
| merchant-service-net | 5002 | http://localhost:5002/swagger | — |
| ... | ... | ... | ... |
---
### I5 — MCP Server Documentation Could Be Expanded
`services/goodgo-mcp-server/SERVICE_DOCS.md` covers the 12 MCP tools but lacks a dedicated integration guide for AI clients (Claude, Cursor, etc.) showing how to configure the server, authenticate, and invoke each tool with examples.
**Recommendation**: Create `services/goodgo-mcp-server/docs/integration-guide.md` with per-tool usage examples and configuration instructions for AI clients.
---
## Action Items (Prioritized)
| # | Action | Priority | Effort | Owner |
|---|--------|----------|--------|-------|
| A1 | Export and commit OpenAPI YAML specs for all 24 missing services | P0 | High | Backend Dev + DevOps |
| A2 | Create Postman collections from Swagger specs | P0 | Medium | Backend Dev |
| A3 | Add `README.md` to 13 missing services (copy from `iam-service-net` template) | P1 | Medium | Tech Lead |
| A4 | Create `/docs/en/architecture/storage-proposal.md` (translate from VI) | P1 | Low | Technical Writer |
| A5 | Create `services/storage-service-net/SERVICE_DOCS.md` | P1 | Low | Backend Dev |
| A6 | Add CHANGELOG.md + automate with `standard-changelog` in CI | P2 | Medium | DevOps |
| A7 | Create `packages/config/README.md` | P2 | Low | Technical Writer |
| A8 | Add GitHub Actions workflow to deploy VitePress docs site | P2 | Low | DevOps |
| A9 | Create `/docs/en/api/README.md` — API documentation index with service port table | P2 | Low | Technical Writer |
| A10 | Add system architecture Mermaid diagram to `system-design.md` | P3 | Medium | Technical Writer |
| A11 | Consolidate `.cursor/skills/` and `/docs/en/skills/` duplication | P3 | Low | Technical Writer |
| A12 | Create MCP Server integration guide for AI clients | P3 | Medium | Technical Writer |
---
## Appendix — Documentation Coverage Matrix
| Area | Coverage | Quality | Notes |
|------|----------|---------|-------|
| Root README | ✅ 100% | Excellent | Bilingual, comprehensive |
| /docs architecture | ✅ 100% | Excellent | 10 docs × 2 languages |
| /docs guides | ✅ 100% | Excellent | 12 docs × 2 languages |
| /docs runbooks | ✅ 100% | Good | 3 runbooks × 2 languages |
| /docs skills | ✅ 100% | Good | 13 docs × 2 languages |
| SERVICE_DOCS.md | ✅ 96% | Excellent | 24/25 (storage missing) |
| Service README | ⚠️ 48% | Good | 12/25 services |
| OpenAPI specs | ❌ 4% | N/A | 1/25 services |
| Postman collections | ❌ 0% | N/A | Directory empty |
| App READMEs | ✅ 100% | Good | All 4 apps covered |
| Package READMEs | ⚠️ 83% | Minimal | 5/6 packages |
| C# XML comments | ✅ 100% | Excellent | Bilingual, consistent |
| CHANGELOG | ❌ 0% | N/A | No formal changelog |
---
*Report generated by Technical Writer Agent (7e5b8915-eef4-42c4-b102-1734a966d5fd) on 2026-03-20*

View File

@@ -0,0 +1,310 @@
# UX/UI Designer Audit — GoodGo POS System
**Auditor:** UX/UI Designer Agent
**Date:** 2026-03-20
**Scope:** `apps/web-client-tpos-net` — Blazor WASM + MudBlazor frontend
**Focus:** UI consistency, component reuse, theme & styling, accessibility, UX flows
---
## Executive Summary
The GoodGo POS frontend has a **solid design foundation** — centralized theming via `AppTheme.cs`, a reusable auth component library, and responsive multi-layout architecture. However, the codebase suffers from **critical accessibility failures** (no keyboard navigation, missing ARIA attributes, no focus-visible styles), **2,316+ inline style attributes** that undermine design system consistency, and **pervasive hardcoded Vietnamese strings** that break English localization throughout the POS UI. Overall UI/UX maturity: **6/10**.
---
## Critical Issues
Issues that block production readiness or violate WCAG 2.1 AA standards.
### 1. Missing `:focus-visible` Styles — WCAG 2.1 §2.4.7
**Files:** `wwwroot/css/admin.css`, `wwwroot/css/pos.css`, `wwwroot/css/auth.css`, `wwwroot/css/marketing.css`
**Impact:** Keyboard users cannot see which element has focus — entire app is effectively unusable without a mouse.
No CSS file defines `:focus-visible` styles. Browsers' default focus ring is often suppressed by MudBlazor reset styles.
**Fix:**
```css
/* Add to app.css global scope */
:focus-visible {
outline: 2px solid #FF5C00;
outline-offset: 2px;
border-radius: 4px;
}
```
---
### 2. Clickable `<div>` Elements Instead of `<button>` — WCAG 2.1 §4.1.2
**Files:** `Pages/Pos/Cafe/CafeDesktop.razor`, `Pages/Pos/Restaurant/RestaurantDesktop.razor`, and all other POS vertical desktop pages
**Impact:** Product cards, order items, and action elements are `<div>` with `@onclick` handlers. Screen readers cannot identify them as interactive. Keyboard users cannot Tab to them or activate with Enter/Space.
**Example (CafeDesktop.razor):**
```razor
<!-- BAD: div with onclick -->
<div class="pos-product-card" @onclick="() => AddToCart(product)">
...
</div>
<!-- GOOD: semantic button -->
<button class="pos-product-card" @onclick="() => AddToCart(product)"
aria-label="Thêm @product.Name vào giỏ hàng">
...
</button>
```
---
### 3. No ARIA Labels on Interactive Icons — WCAG 2.1 §4.1.2
**Files:** `Components/Auth/AuthButton.razor`, `Components/Auth/AuthInput.razor` (line 37), all layout files
**Impact:** Icon-only buttons (close, toggle, back) have no accessible names — screen readers announce nothing or raw Unicode.
**Example (AuthInput.razor ~line 37):**
```razor
<!-- BAD: no aria-label -->
<button @onclick="TogglePassword">
<i data-lucide="eye"></i>
</button>
<!-- GOOD -->
<button @onclick="TogglePassword"
aria-label="@(showPassword ? "Ẩn mật khẩu" : "Hiện mật khẩu")"
aria-pressed="@showPassword">
<i data-lucide="@(showPassword ? "eye-off" : "eye")"></i>
</button>
```
---
### 4. Error/Success Messages Missing `role="alert"` — WCAG 2.1 §4.1.3
**Files:** `Pages/Auth/LoginAdmin.razor` (lines 2840), `Pages/Pos/Cafe/CafeDesktop.razor`
**Impact:** Dynamic status changes (login error, payment success, cart update) are not announced to screen readers.
**Example (LoginAdmin.razor ~line 29):**
```razor
<!-- BAD: static div, not announced -->
<div style="background:rgba(239,68,68,0.12);color:#EF4444;">
@_errorMessage
</div>
<!-- GOOD -->
<div role="alert" aria-live="assertive" class="auth-error-message">
@_errorMessage
</div>
```
---
### 5. Hardcoded Vietnamese Strings in POS UI — Localization Failure
**File:** `Pages/Pos/Cafe/CafeDesktop.razor` and all POS vertical pages
**Impact:** English-language users see Vietnamese text. App cannot support international merchants.
**Examples found (CafeDesktop.razor):**
| Line | Hardcoded String | Should Be |
|------|-----------------|-----------|
| 24 | `"Đang tải..."` | `@L["Common_Loading"]` |
| 25 | `"Không thể tải dữ liệu"` | `@L["Common_LoadError"]` |
| 66 | `"Đơn hàng"` | `@L["Pos_OrderPanel_Title"]` |
| 67 | `"món"` | `@L["Pos_CartItem_Unit"]` |
| 89 | `"Nhập mã voucher..."` | `@L["Pos_Voucher_Placeholder"]` |
| 97 | `"Giảm giá"` | `@L["Pos_Discount_Label"]` |
| 107 | `"Tổng cộng"` | `@L["Pos_Total_Label"]` |
| 117 | `"Đang tạo đơn..."` | `@L["Pos_CreatingOrder"]` |
| 122 | `"Thanh toán"` | `@L["Pos_Checkout_Button"]` |
Also found in layouts:
- `AdminLayout.razor` (line ~160): `"Có lỗi xảy ra"`, `"Vui lòng thử lại"`, `"Tải lại"`
- `StaffLayout.razor` (line 148): Same error boundary text
- `PosLayout.razor` (lines 29, 65, 115): `"GoodGo POS"`, `"Menu"`, `"Đơn hàng"`
---
### 6. Hardcoded Color Values Instead of CSS Variables
**Files:** `Pages/Pos/Cafe/CafeDesktop.razor`, `Pages/Auth/LoginAdmin.razor`, multiple admin pages
**Impact:** Theme overrides are impossible. Colors are not consistent with the design token system.
**Examples:**
```razor
<!-- BAD: hardcoded rgba -->
style="background:rgba(139,92,246,0.1);color:#8B5CF6;"
style="background:rgba(239,68,68,0.12);color:#EF4444;"
style="background-color:rgba(139,92,246,0.125);"
<!-- GOOD: CSS variables -->
class="status-purple"
class="status-danger"
```
**Hardcoded colors to replace:**
| Value | Should Be |
|-------|-----------|
| `#16A34A` | `var(--color-success)` |
| `#EF4444` | `var(--color-danger)` |
| `#8B5CF6` | `var(--color-purple)` |
| `rgba(139,92,246,0.1)` | `var(--color-purple-subtle)` |
| `rgba(239,68,68,0.12)` | `var(--color-danger-subtle)` |
---
## Warnings
Issues that create significant technical debt or UX inconsistency.
### 7. 2,316 Inline Style Attributes
Inline styles are scattered across all pages — especially POS desktop views, admin shop pages, and dashboard. This makes CSS maintenance impossible and prevents design system enforcement.
**Top offending patterns:**
```razor
style="display:flex;flex-direction:column;gap:24px;"
style="padding:6px 12px;border-radius:8px;border:none;"
style="position:absolute;left:@{X}px;top:@{Y}px;"
```
**Strategy:** Create CSS utility classes and component classes. Move inline styles to scoped `.razor.css` files.
---
### 8. Inconsistent Spacing Scale
The design uses 10+ different spacing values without a defined scale:
- Gaps: 4, 6, 8, 10, 12, 14, 16, 20, 24, 28, 32, 48px
- Padding: 6, 8, 10, 12, 16, 20, 24px
- Border radius: 6, 8, 10, 12, 14, 20, 24px
**Recommended scale:**
```css
/* Spacing: 4-point scale */
--space-1: 4px; --space-2: 8px; --space-3: 12px;
--space-4: 16px; --space-5: 20px; --space-6: 24px;
--space-8: 32px; --space-12: 48px;
/* Border radius: 3 values */
--radius-sm: 8px; --radius-md: 12px; --radius-lg: 16px;
```
---
### 9. Missing Contrast Ratio for Secondary Text — WCAG 2.1 §1.4.3
**File:** `wwwroot/css/admin.css`
```css
/* Current — MAY fail WCAG AAA */
--admin-text-secondary: #ADADB0; /* ratio ~5.08:1 on #1A1A1D — FAILS AAA (7:1) */
--admin-text-tertiary: #8B8B90; /* ratio ~4.13:1 on #1A1A1D — FAILS AA (4.5:1) */
```
`--admin-text-tertiary` likely **fails WCAG AA** for normal text. Needs contrast validation.
---
### 10. Cart Items Use `<div>` Instead of Semantic List
**File:** `Pages/Pos/Cafe/CafeDesktop.razor`
Cart items rendered as generic `<div>` elements. Should be `<ul>/<li>` structure for screen readers to announce list count and items.
---
### 11. No `.razor.css` Scoped Stylesheets Per Component
All 8 reusable components (`AuthButton`, `AuthInput`, `OtpInput`, etc.) have zero scoped CSS files. Styles are defined globally in `auth.css`. This creates unintended style leakage and makes component refactoring dangerous.
---
### 12. OTP Input Missing ARIA Group Label
**File:** `Components/Auth/OtpInput.razor` (lines 1219)
The 6-digit OTP input is a group of individual `<input>` elements but lacks `role="group"` and `aria-label="One-time password"`. Screen readers will announce 6 separate unlabeled inputs.
---
### 13. No Focus Trap in Modal Overlays
**Files:** `Layout/PosLayout.razor`, admin dialog pages
When mobile overlays or modals open, focus is not trapped within them. Users pressing Tab will cycle through background content, violating WCAG 2.1 §2.1.2.
---
## Improvements
Recommendations that would improve UX quality and developer experience.
### A. Create Component-Scoped CSS Files
For each component in `Components/`, create a matching `.razor.css` file:
```
Components/Auth/AuthButton.razor.css
Components/Auth/AuthInput.razor.css
Components/Pos/ResponsiveOrderPanel.razor.css
```
### B. Add Password Strength Indicator
**File:** `Pages/Auth/Register.razor`
Registration form has no visual password strength feedback. Add a 4-step strength bar (weak → fair → good → strong) using the existing `PasswordStrengthCalculator` pattern from Swift app.
### C. Standardize Icon Sizes to 3 Tiers
Currently uses: 12, 14, 16, 18, 20, 24, 28, 32px. Reduce to:
```css
--icon-sm: 16px; /* Inline, label */
--icon-md: 20px; /* Button, nav */
--icon-lg: 24px; /* Header, feature */
```
### D. Add `aria-expanded` to Expandable Sections
**File:** `Pages/Admin/Shop/ShopRecipes.razor`
Accordion-style expand/collapse sections lack `aria-expanded` and `aria-controls`. Screen readers cannot determine collapsed state.
### E. Add `aria-busy` to Async Loading States
Async data-fetch loading spinners don't set `aria-busy="true"` on the container, so screen readers don't know content is loading.
### F. Introduce Design Token Documentation
Create a living style guide page at `/admin/design-system` (dev only) that shows all color tokens, spacing, typography, and component variants. Reference: MudBlazor Theme Manager pattern.
### G. Validate Table Semantics
Admin tables (staff management, inventory, orders) should verify `<th scope="col">` on all headers. MudBlazor's `MudTable` typically handles this, but custom HTML tables may not.
---
## Action Items
Prioritized next steps.
| # | Priority | Effort | Item | Files |
|---|----------|--------|------|-------|
| 1 | 🔴 Critical | Small | Add `:focus-visible` styles globally | `wwwroot/css/app.css` |
| 2 | 🔴 Critical | Large | Replace clickable `<div>` with `<button>` in POS pages | All `*Desktop.razor` |
| 3 | 🔴 Critical | Medium | Add `aria-label` to all icon-only buttons | Auth components, layouts |
| 4 | 🔴 Critical | Medium | Add `role="alert"` to all error/success messages | Login pages, POS |
| 5 | 🔴 Critical | Large | Move hardcoded Vietnamese strings to L["key"] in POS | All POS vertical pages |
| 6 | 🔴 Critical | Medium | Replace hardcoded colors with CSS variables | CafeDesktop, Dashboard |
| 7 | 🟠 High | XL | Extract 2,316 inline styles to CSS classes | All pages |
| 8 | 🟠 High | Medium | Fix secondary text contrast (`--admin-text-tertiary`) | `admin.css` |
| 9 | 🟠 High | Small | Add `role="group"` + `aria-label` to OTP input | `OtpInput.razor` |
| 10 | 🟠 High | Medium | Implement focus trap in modal overlays | Layouts, dialogs |
| 11 | 🟠 High | Small | Localize error boundary strings in layouts | `AdminLayout`, `StaffLayout` |
| 12 | 🟡 Medium | Medium | Standardize spacing scale (4-point system) | All CSS files |
| 13 | 🟡 Medium | Small | Standardize border-radius (3 values) | All CSS files |
| 14 | 🟡 Medium | Medium | Create `.razor.css` per component | `Components/Auth/` |
| 15 | 🟡 Medium | Small | Add `aria-expanded` to expandable sections | `ShopRecipes.razor` |
| 16 | 🟡 Medium | Small | Add password strength indicator | `Register.razor` |
| 17 | 🟡 Medium | Small | Standardize icon size scale (3 tiers) | Icon usage site-wide |
| 18 | 🟡 Medium | Medium | Add cart item `<ul>/<li>` semantics | POS order panels |
| 19 | 🟢 Low | Large | Create design system documentation page | New page |
| 20 | 🟢 Low | Medium | Screen reader testing (NVDA/JAWS) | Full app regression |
---
*Audit performed by UX/UI Designer Agent — GoodGo Platform — 2026-03-20*