Files
pos-system/microservices/docs/audit/database-architect.md
Ho Ngoc Hai 76d75c753b Migrate
2026-05-23 18:37:02 +07:00

330 lines
15 KiB
Markdown

# Database Architecture Audit — GoodGo POS System
**Auditor:** Database Architect (TechBi Company)
**Date:** 2026-03-20
**Scope:** Schema design, migrations, indexing, query performance, data modeling
**Coverage:** 23 microservices, 24 PostgreSQL databases, Redis 7 caching layer
---
## Executive Summary
The platform follows a sound **database-per-service** architecture with EF Core 10 + Fluent API configuration, domain-driven entity design, and consistent snake_case column naming across most services. The foundation is solid for a POS microservices platform. However, three critical issues require immediate attention: a broken `InventoryItem.CreatedAt` field mapping that silently returns `DateTime.MinValue` from the database, missing optimistic concurrency on wallet balance updates (race condition risk under concurrent POS load), and UUID v4 used universally instead of the UUID v7 standard specified in the architecture guidelines. Additionally, IAM's Phase 4A migration introduced PascalCase table names that break the snake_case convention across the rest of the system.
---
## Critical Issues
### C-1 · `InventoryItem.CreatedAt` Backing Field Not Mapped — Silent Data Loss
**File:** `services/inventory-service-net/src/InventoryService.Infrastructure/EntityConfigurations/InventoryItemEntityTypeConfiguration.cs:34`
```csharp
// WRONG — Ignores public property but NEVER maps the private backing field _createdAt
builder.Ignore(i => i.CreatedAt);
```
The domain entity has `private DateTime _createdAt` and `public DateTime CreatedAt => _createdAt;`. The `created_at` column exists in the migration and database. However, because `builder.Ignore(i => i.CreatedAt)` discards the property *and* no `builder.Property<DateTime>("_createdAt").HasColumnName("created_at")` mapping exists, EF Core never reads `created_at` from the DB into the backing field. Every entity loaded from the database will have `CreatedAt == DateTime.MinValue`.
Compare with the correct pattern used in `InventoryItemEntityTypeConfiguration` for other fields:
```csharp
builder.Property(i => i.ShopId).HasField("_shopId").HasColumnName("shop_id").IsRequired(); // ✓ correct
```
**Fix:**
```csharp
// Replace builder.Ignore(i => i.CreatedAt) with:
builder.Property<DateTime>("_createdAt")
.HasColumnName("created_at")
.IsRequired();
builder.Ignore(i => i.CreatedAt); // keep ignoring the computed public property
```
No migration needed — the column already exists.
---
### C-2 · No Optimistic Concurrency on Wallet Balances — Race Condition Risk
**File:** `services/wallet-service-net/src/WalletService.Infrastructure/EntityConfigurations/WalletItemEntityTypeConfiguration.cs`
`wallet_items.balance` (the per-currency balance) has no concurrency token. In a busy POS terminal, two concurrent `Debit` or `Credit` operations on the same wallet can interleave:
```
Tx1: READ balance=100 → compute 100-30=70
Tx2: READ balance=100 → compute 100-50=50
Tx1: WRITE balance=70
Tx2: WRITE balance=50 ← overwrites Tx1's debit, 30 VND disappears
```
**Fix — add PostgreSQL `xmin` system column as concurrency token:**
```csharp
// WalletItemEntityTypeConfiguration.cs
builder.UseXminAsConcurrencyToken(); // Npgsql built-in
```
No migration needed — `xmin` is a PostgreSQL system column. EF Core will throw `DbUpdateConcurrencyException` on conflict, which the application must retry.
Also applies to `wallet_transactions` if idempotent writes are not already enforced by `ClientRequest` idempotency table.
---
### C-3 · UUID v4 (`Guid.NewGuid()`) Used Everywhere Instead of UUID v7
**Files:** All domain aggregate roots across all 23 services
Architecture spec mandates **UUID v7** (time-sortable, B-tree friendly). Every service uses `Guid.NewGuid()` (UUID v4 — random). At scale, random UUIDs cause **index fragmentation** in PostgreSQL B-tree indexes: each new row's primary key is a random value, forcing page splits and bloating indexes.
**Examples:**
- `services/order-service-net/src/OrderService.Domain/AggregatesModel/OrderAggregate/Order.cs:163``Id = Guid.NewGuid()`
- `services/merchant-service-net/src/MerchantService.Domain/AggregatesModel/ShopAggregate/Shop.cs:191``Id = Guid.NewGuid()`
**Fix (.NET 9+):**
```csharp
// Create a shared utility or use directly:
Id = Guid.CreateVersion7(); // .NET 9+ built-in, time-sortable
```
This is a cross-cutting change. Recommend introducing a `NewId()` factory in the `SeedWork/Entity.cs` base class:
```csharp
protected static Guid NewId() => Guid.CreateVersion7();
```
---
### C-4 · IAM Phase4A Migration — PascalCase Table Names Break Convention
**File:** `services/iam-service-net/src/IamService.Infrastructure/Migrations/20260114091114_Phase4A_AuditAndCompliance.cs`
Audit and compliance tables were created with PascalCase names, breaking the snake_case convention used by every other service:
| Table Created | Expected |
|---|---|
| `AuditLogs` | `audit_logs` |
| `ComplianceReports` | `compliance_reports` |
| `ComplianceViolations` | `compliance_violations` |
| `IX_AuditLogs_ActorId` | `ix_audit_logs_actor_id` |
| `IX_AuditLogs_Timestamp` | `ix_audit_logs_timestamp` |
PostgreSQL is case-insensitive for unquoted identifiers but stores them lowercase. Quoted identifiers in EF queries will fail if mixed. The `AuditLogRepository` likely works by accident (EF generates quoted DDL), but it creates a maintainability hazard.
**Fix — additive migration:**
```sql
ALTER TABLE "AuditLogs" RENAME TO audit_logs;
ALTER TABLE "ComplianceReports" RENAME TO compliance_reports;
ALTER TABLE "ComplianceViolations" RENAME TO compliance_violations;
ALTER INDEX "IX_AuditLogs_ActorId" RENAME TO ix_audit_logs_actor_id;
-- ... etc
```
Then update `EntityTypeConfiguration` and `AuditLogEntityTypeConfiguration` to use `builder.ToTable("audit_logs")`.
Also note: **OpenIddict tables** (`OpenIddictApplications`, `OpenIddictTokens`, etc.) in the same IAM database use PascalCase — this is OpenIddict's default and acceptable since it manages its own schema, but is worth documenting.
---
## Warnings
### W-1 · Missing Global Query Filter for Soft-Deleted Records
**File:** `services/merchant-service-net/src/MerchantService.Infrastructure/MerchantServiceContext.cs`
Both `Merchant` and `Shop` entities have `is_deleted` columns (correctly designed). However, `MerchantServiceContext.OnModelCreating` does not register a global query filter:
```csharp
// Missing in MerchantServiceContext.OnModelCreating:
modelBuilder.Entity<Merchant>()
.HasQueryFilter(m => EF.Property<bool>(m, "_isDeleted") == false);
modelBuilder.Entity<Shop>()
.HasQueryFilter(s => EF.Property<bool>(s, "_isDeleted") == false);
```
Without this, every repository query must manually add `.Where(x => !x.IsDeleted)`. Any query that omits this filter will return deleted records silently. This is a data leak risk.
**Fix:** Add `HasQueryFilter` in `MerchantServiceContext.OnModelCreating`. Ensure `IgnoreQueryFilters()` is available for admin/hard-delete operations.
---
### W-2 · Missing Composite Indexes on High-Traffic Query Patterns
The most common POS reporting queries are not supported by composite indexes:
| Service | Missing Index | Common Query |
|---|---|---|
| `order-service-net` | `(shop_id, created_at DESC)` | "Orders today for shop X" |
| `order-service-net` | `(shop_id, status_id)` | "Pending orders for shop X" |
| `catalog-service-net` | `(shop_id, is_active)` | "Active products for shop X" |
| `booking-service-net` | `(shop_id, start_time)` | "Appointments today for shop X" |
| `inventory-service-net` | `(shop_id, quantity)` | "Low stock items for shop X" |
Current separate single-column indexes (`ix_orders_shop_id` + `ix_orders_created_at`) require a bitmap index scan merge, which is slower than a single composite index for the `WHERE shop_id = ? AND created_at BETWEEN ? AND ?` pattern.
**Fix example for orders:**
```csharp
// OrderEntityTypeConfiguration.cs
builder.HasIndex("_shopId", "_createdAt")
.HasDatabaseName("ix_orders_shop_id_created_at");
builder.HasIndex("_shopId", "StatusId")
.HasDatabaseName("ix_orders_shop_id_status_id");
```
---
### W-3 · Index Naming Inconsistency — `idx_` vs `ix_` Prefix
**Files:** `services/ads-analytics-service-net/src/AdsAnalyticsService.Infrastructure/EntityConfigurations/`
All services use `ix_` prefix for index names. The `AdsAnalyticsService` (and its `ReportEntityTypeConfiguration`) uses `idx_` prefix:
- `idx_campaign_metrics_campaign_id` ← should be `ix_campaign_metrics_campaign_id`
- `idx_reports_advertiser_id` ← should be `ix_reports_advertiser_id`
Minor but affects tooling, migrations consistency, and DBA runbooks.
---
### W-4 · `Debug.WriteLine` Left in Production Code
**File:** `services/order-service-net/src/OrderService.Infrastructure/OrderContext.cs:59,71`
```csharp
System.Diagnostics.Debug.WriteLine("OrderContext::ctor - " + GetHashCode());
System.Diagnostics.Debug.WriteLine("OrderContext::ctor (tenant) - " + GetHashCode());
```
Debug output left in a production DbContext constructor. Under load, this generates noise. Should be removed or replaced with structured `ILogger` at `Debug` level.
---
### W-5 · Inconsistent DbContext File Naming
**File:** `services/mkt-facebook-service-net/src/FacebookService.Infrastructure/MyServiceContext.cs`
The file is named `MyServiceContext.cs` — the template placeholder name was not updated when the service was created. The class inside is correctly named `FacebookServiceContext`, but the file name reveals copy-paste from `_template_dot_net` without cleanup. Affects code navigation and grep tooling.
---
### W-6 · `ads-analytics-service-net` and `ads-billing-service-net` Have No Repositories
These two services access `DbContext` directly from command/query handlers, bypassing the Repository pattern used everywhere else. This:
- Makes unit testing harder (must mock DbContext instead of a simple IRepository)
- Breaks the Clean Architecture boundary (Infrastructure DbContext leaks into Application layer)
- Creates inconsistency for future maintainers
---
### W-7 · Missing `expires_at` Index on Vouchers
**File:** `services/promotion-service-net/src/PromotionService.Infrastructure/EntityConfigurations/VoucherEntityTypeConfiguration.cs`
Voucher expiry queries (`WHERE expires_at < NOW() AND status_id = 'active'`) have no index on `expires_at`. For a promotion system that may hold millions of vouchers, periodic expiry sweeps will cause sequential scans.
```csharp
// Add:
builder.HasIndex(v => v.ExpiresAt)
.HasDatabaseName("ix_vouchers_expires_at")
.HasFilter("expires_at IS NOT NULL");
```
---
## Improvements
### I-1 · Enable `pg_stat_statements` in `init.sql`
**File:** `infra/databases/postgres/init.sql`
The `init.sql` is nearly empty. Add the essential PostgreSQL extension for query monitoring:
```sql
CREATE EXTENSION IF NOT EXISTS "pg_stat_statements";
CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; -- useful for uuid_generate_v7() if Guid.CreateVersion7 is not used
```
---
### I-2 · Add PgBouncer for Local Development
**File:** `deployments/local/docker-compose.yml`
PgBouncer is referenced in staging/production (via Neon's built-in pooler with `?pgbouncer=true`). However, local development has no connection pooler. With 25+ microservices each holding a connection pool, the local PostgreSQL can hit `max_connections=200` quickly during integration testing.
Add a lightweight PgBouncer container to `docker-compose.yml`:
```yaml
pgbouncer:
image: bitnami/pgbouncer:1.22
environment:
POSTGRESQL_HOST: postgres
POSTGRESQL_PORT: 5432
PGBOUNCER_POOL_MODE: transaction
PGBOUNCER_MAX_CLIENT_CONN: 500
PGBOUNCER_DEFAULT_POOL_SIZE: 10
```
Services should connect to `pgbouncer:5432` locally.
---
### I-3 · Consider Table Partitioning for High-Volume Append-Only Tables
As the platform grows, these tables will accumulate millions of rows and benefit from range partitioning by month:
| Table | Service | Strategy |
|---|---|---|
| `wallet_transactions` | wallet-service | RANGE on `created_at` |
| `inventory_transactions` | inventory-service | RANGE on `created_at` |
| `order_items` | order-service | inherited from `orders` RANGE on `created_at` |
| `audit_logs` (IAM) | iam-service | RANGE on `timestamp` |
EF Core 10 + Npgsql support declarative partitioning. This is not urgent for current scale but should be planned for tables expected to exceed 10M rows.
---
### I-4 · Add Partial Index for Active Products
**File:** `catalog-service-net`
For POS menu loading, ~90% of queries filter on `is_active = true`. A partial index reduces index size by ~10x:
```csharp
builder.HasIndex(p => new { p.ShopId, p.CategoryId })
.HasDatabaseName("ix_products_active_shop_category")
.HasFilter("is_active = true");
```
---
### I-5 · Materialized View for Daily Sales Reporting
**Service:** `order-service-net`
The most common merchant dashboard query — "sales by day for the last 30 days" — requires aggregating `orders` and `order_items` over large date ranges. A nightly-refreshed materialized view would eliminate expensive aggregation queries at runtime:
```sql
CREATE MATERIALIZED VIEW mv_daily_sales AS
SELECT
shop_id,
DATE(created_at) AS sale_date,
COUNT(*) AS order_count,
SUM(total_amount) AS total_revenue,
SUM(discount_amount) AS total_discount
FROM orders
WHERE status_id != 4 -- exclude cancelled
GROUP BY shop_id, DATE(created_at);
CREATE UNIQUE INDEX ON mv_daily_sales(shop_id, sale_date);
```
Refresh with `REFRESH MATERIALIZED VIEW CONCURRENTLY mv_daily_sales;` via a scheduled job.
---
## Action Items (Prioritized)
| Priority | Item | Service(s) | Effort |
|---|---|---|---|
| **P0** | Fix `InventoryItem._createdAt` backing field mapping | `inventory-service-net` | 30 min |
| **P0** | Add optimistic concurrency token to `wallet_items` | `wallet-service-net` | 1 hour |
| **P1** | Migrate IAM AuditLogs/ComplianceReports to snake_case | `iam-service-net` | 2 hours |
| **P1** | Add `HasQueryFilter` for soft-deleted Merchant + Shop | `merchant-service-net` | 1 hour |
| **P1** | Replace `Guid.NewGuid()` with `Guid.CreateVersion7()` in all domain entities | All 23 services | 4 hours |
| **P2** | Add composite indexes: `(shop_id, created_at)` for orders, `(shop_id, is_active)` for products | order, catalog | 1 hour |
| **P2** | Remove `Debug.WriteLine` from `OrderContext` constructors | `order-service-net` | 10 min |
| **P2** | Add `expires_at` index on `vouchers` | `promotion-service-net` | 15 min |
| **P2** | Standardize index prefix to `ix_` in ads-analytics | `ads-analytics-service-net` | 30 min |
| **P3** | Rename `MyServiceContext.cs` to `FacebookServiceContext.cs` | `mkt-facebook-service-net` | 5 min |
| **P3** | Add PgBouncer to local docker-compose | `deployments/local` | 2 hours |
| **P3** | Enable `pg_stat_statements` in `init.sql` | `infra/databases` | 15 min |
| **P3** | Add Repository pattern to ads-analytics + ads-billing | `ads-*-service-net` | 4 hours |
---
## Database Inventory Summary
| Category | Count |
|---|---|
| Microservices with own database | 23 |
| Total migrations | 46 |
| DbContext files | 24 |
| EntityTypeConfiguration files | 95 |
| Repository implementations | ~91 |
| PostgreSQL databases (local) | 24 |
| Redis usage | SignalR backplane (chat), caching |
**Overall Assessment:** Architecture is well-structured and production-viable. The critical issues (C-1 to C-4) should be resolved before the next production deployment. Warnings are tech debt that should be addressed in the next sprint.