Files
pos-system/docs/audit/founding-engineer.md

12 KiB
Raw Blame History

Audit Report: POS System — Founding Engineer Perspective

Date: 2026-03-20 Auditor: Founding Engineer Scope: Code quality, shared patterns, template compliance, DRY violations Working Directory: /Users/velikho/Desktop/WORKING/pos-system


Executive Summary

The GoodGo Platform is a well-structured enterprise microservices monorepo with 23 .NET services, 5 frontend applications, and 6 shared Node.js packages. Clean Architecture and CQRS patterns are consistently applied across all services. However, three critical issues must be resolved before production: insecure CORS configuration (all 23 services), broken health check authentication (18 services), and ~5,100 lines of duplicated MediatR behaviors.


Critical Issues

C1 — Insecure CORS Configuration (All 23 Services)

Severity: HIGH — Security vulnerability Files: services/*/src/*/Program.cs lines ~113120

Every service uses AllowAnyOrigin():

// Current (INSECURE)
policy.AllowAnyOrigin()
      .AllowAnyMethod()
      .AllowAnyHeader();

This allows any website to issue cross-origin requests to the API, enabling CSRF attacks. There is no origin validation. AllowAnyOrigin() is also incompatible with AllowCredentials(), silently breaking cookie/credential flows.

Fix:

var allowedOrigins = builder.Configuration
    .GetSection("Cors:AllowedOrigins")
    .Get<string[]>() ?? Array.Empty<string>();

policy.WithOrigins(allowedOrigins)
      .AllowAnyMethod()
      .AllowAnyHeader()
      .AllowCredentials();

Add Cors__AllowedOrigins__0=https://goodgo.vn per service in docker-compose.yml and Kubernetes manifests.


C2 — Health Check Endpoints Require Authentication (18 of 23 Services)

Severity: HIGH — Breaks Kubernetes liveness/readiness probes Example: services/merchant-service-net/src/MerchantService.API/Program.cs lines 132137

// Current (missing .AllowAnonymous())
app.MapHealthChecks("/health");
app.MapHealthChecks("/health/live", new() { Predicate = _ => false });
app.MapHealthChecks("/health/ready");

Kubernetes kubelet cannot authenticate; probes return 401 Unauthorized → pod marked unhealthy → restart loop. Docker Compose healthcheck (curl -f http://localhost:8080/health/live) also fails.

Services without .AllowAnonymous(): merchant, iam, order, wallet, storage, membership, booking, catalog, inventory, chat, social, mining, mission, promotion, ads-manager, ads-analytics, mkt-facebook, mkt-whatsapp (and others).

Services correctly configured: ads-serving, ads-tracking, ads-billing, mkt-x, mkt-zalo.

Fix:

app.MapHealthChecks("/health").AllowAnonymous();
app.MapHealthChecks("/health/live", new() { Predicate = _ => false }).AllowAnonymous();
app.MapHealthChecks("/health/ready").AllowAnonymous();

C3 — MediatR Behaviors Duplicated Across All 23 Services

Severity: HIGH — Maintenance risk; single bug requires 23 fixes Files: services/*/src/*/Application/Behaviors/

Three behavior files are copied verbatim in every service:

File Lines per copy Total copies Total duplicated lines
LoggingBehavior.cs ~59 23 ~1,357
ValidatorBehavior.cs ~41 23 ~943
TransactionBehavior.cs ~50 23 ~1,150
Total ~3,450

Example of identical files:

  • services/merchant-service-net/src/MerchantService.API/Application/Behaviors/LoggingBehavior.cs
  • services/order-service-net/src/OrderService.API/Application/Behaviors/LoggingBehavior.cs

Only the namespace differs. Any fix in one behavior must be manually propagated to 22 other copies.

Fix: Extract to a shared NuGet package GoodGo.MediatR.Behaviors:

packages/
  GoodGo.MediatR.Behaviors/
    LoggingBehavior.cs
    ValidatorBehavior.cs
    TransactionBehavior.cs

Reference from each service:

<PackageReference Include="GoodGo.MediatR.Behaviors" Version="1.0.0" />

C4 — JWT Signature Validation Disabled in Development

Severity: MEDIUM — Allows any token to be accepted in dev environments Files: services/*/src/*/Program.cs lines ~91108

ValidateIssuerSigningKey = builder.Environment.IsDevelopment() ? false : true,
SignatureValidator = builder.Environment.IsDevelopment()
    ? (token, _) => new JsonWebToken(token)
    : null,

In development, any crafted JWT with valid exp/nbf claims is accepted. Tokens are not signature-verified. Risk is mitigated by IsDevelopment() check but raises the question of why this is needed at all.

Root Cause: Likely the IAM service (Duende IdentityServer) uses a rotating signing key not pre-shared with other services in Docker Compose local dev.

Fix: Seed a deterministic development signing key via environment variable and share it across services in docker-compose.yml. This enables full signature validation in all environments.


Warnings

W1 — SeedWork Classes Duplicated Across Domain Layers

Impact: Medium — Changes to base patterns require multi-service updates Files: services/*/src/*/Domain/SeedWork/

Entity.cs, IAggregateRoot.cs, IRepository.cs, ValueObject.cs, IUnitOfWork.cs, Enumeration.cs are copied into every service's Domain layer. This is partially justified by DDD isolation (Domain layer must have no external NuGet dependencies), but it means fixing a bug in Entity.cs requires updates in 23 places.

Recommendation: Document this as an intentional architectural trade-off in CLAUDE.md (currently undocumented). If the team wants single-source-of-truth, create GoodGo.Domain.SeedWork as a source-only NuGet (no DI dependencies).


W2 — API Versioning Configured But Not Fully Utilized

Impact: Low — Unused NuGet package capability Files: services/*/src/*/Program.cs lines 4256

builder.Services.AddApiVersioning(options => {
    options.DefaultApiVersion = new ApiVersion(1, 0);
    options.AssumeDefaultVersionWhenUnspecified = true;  // Bypasses versioning
});

Controllers use hardcoded routes ([Route("api/v1/merchants")]) rather than the {version:apiVersion} parameter. Asp.Versioning NuGet is installed but the API version attribute system is not used. Routes work correctly but versioning capability is nullified.

Options:

  1. Accept current pattern (hardcoded v1 routes) and document it
  2. Migrate to [ApiVersion("1.0")] on controllers and use {version:apiVersion} route parameter

W3 — Missing README.md in 14 Services

Impact: Low — Developer onboarding friction

Services with README: iam-service-net, merchant-service-net, membership-service-net, mkt-facebook-service-net, mkt-whatsapp-service-net, mkt-x-service-net, mkt-zalo-service-net (7 of 23)

Services without README: order, wallet, storage, booking, catalog, inventory, chat, social, mining, mission, promotion, ads-manager, ads-serving, ads-tracking, ads-billing, ads-analytics (16 of 23)


W4 — init-databases.sh Must Be Manually Maintained

File: deployments/local/init-databases.sh

When a new service is created, the database must be manually added to this script. There is no automated enforcement. If a developer creates a new service and forgets this step, the service's database won't be initialized in local dev.

Recommendation: Add a pre-start validation script or generate the init script from docker-compose.yml service names.


W5 — One Incomplete Test TODO

File: services/iam-service-net/tests/IamService.FunctionalTests/Controllers/AuthorizationPolicyTests.cs

/// TODO: Configure test authentication to properly inject role claims.

Minor — single comment in test code, not production code. But indicates an incomplete test scenario for authorization policies.


Improvements

I1 — Extract Common Infrastructure to Shared NuGet Packages

Priority: High

Create a GoodGo.* suite of shared NuGet packages:

  • GoodGo.MediatR.Behaviors — LoggingBehavior, ValidatorBehavior, TransactionBehavior
  • GoodGo.AspNetCore.Extensions — Common Program.cs middleware setup
  • GoodGo.Domain.SeedWork (optional, source-only) — Entity, ValueObject, IAggregateRoot

This eliminates the bulk of cross-service duplication.


I2 — Standardize CORS via Configuration

Priority: High

Move CORS allowed origins out of code and into configuration. Each service should read from Cors:AllowedOrigins environment variable. Docker Compose and Kubernetes manifests set per-environment values. This is a one-time templating effort with high security payoff.


I3 — Add Health Check Tests to All Functional Test Suites

Priority: Medium

Verify that /health/live and /health/ready return 200 without authentication. Currently only some functional test suites include these tests. Adding them to all 23 services would catch the C2 issue automatically in CI.

[Fact]
public async Task HealthLive_ReturnsOk_WithoutAuthentication()
{
    var response = await _client.GetAsync("/health/live");
    response.StatusCode.Should().Be(HttpStatusCode.OK);
}

I4 — Centralize Docker Compose Environment Defaults

Priority: Medium

Several environment variables are repeated across 23+ service definitions in docker-compose.yml. Consider using YAML anchors or a .env fragment approach to reduce drift:

x-common-env: &common-env
  ASPNETCORE_ENVIRONMENT: Development
  Serilog__MinimumLevel: Information
  ConnectionStrings__Redis: redis:6379

I5 — Add Consistent CI Health Check Validation

Priority: Low

Add a CI step that calls /health/live on each newly-built Docker image before pushing to Docker Hub. This catches health check regressions before staging deployment.


Action Items (Prioritized)

# Action Priority Effort Owner
1 Fix CORS — replace AllowAnyOrigin() with WithOrigins() in all 23 services P0 2h Founding Engineer
2 Fix health check authentication — add .AllowAnonymous() to 18 services P0 1h Founding Engineer
3 Extract MediatR behaviors to GoodGo.MediatR.Behaviors NuGet P1 4h Founding Engineer
4 Fix JWT development signing key — use deterministic key via env var P1 2h Founding Engineer
5 Add health check tests to all functional test suites P1 3h QA Engineer
6 Create README.md for 16 missing services P2 8h Tech Lead
7 Document SeedWork duplication trade-off in CLAUDE.md P2 0.5h Tech Lead
8 — Resolve IAM authorization test TODO P2 1h Backend Dev
9 Clarify API versioning strategy and document or clean up P3 1h CTO

Architecture Strengths

The following areas are well-implemented and should serve as reference patterns:

  • Clean Architecture: All 23 services strictly enforce the Domain → no external dependencies rule
  • CQRS: Commands, Queries, and Handlers follow naming conventions consistently
  • Entity Pattern: Private fields + behavior methods + domain events is correctly applied everywhere
  • Test Structure: Every service has both unit tests (xUnit + Moq + FluentAssertions) and functional tests (WebApplicationFactory + InMemory)
  • Dockerfile: Multi-stage build, non-root user (dotnetuser:1001), health check executable installed, port 8080
  • Traefik: Path-based routing, rate limiting, secure headers, CORS middleware all configured
  • Bilingual: Comments, logs, validation messages in EN + VI across the entire codebase
  • Observability: Prometheus + Grafana + Loki stack configured in infra/observability/
  • Monorepo: pnpm workspaces + Turborepo correctly set up with shared packages/