goodgo-platform/docs/osm-data-model.md
Ho Ngoc Hai 1e9ef567a9
docs(osm): note 2025 VN admin reform — vn_districts now holds ward/commune layer
Vietnam dropped the district administrative level in the 2025 reform
(Nghị quyết về sắp xếp đơn vị hành chính). Only two levels remain:
province (level=4) and ward/commune (level=6).

OSM has updated tagging accordingly: every former xã/phường/thị trấn
that survived the merge is now `admin_level=6`, no `admin_level=8`
features for VN. Our sync confirmed this — 3,189 level=6 units inserted
across 33 provinces, level=8 returns zero.

The schema column "vn_districts" stays as-is to avoid a cascade-rename
across IndustrialPark / ProjectDevelopment / Property FKs. Documented
the semantic shift in osm-data-model.md so future ops don't think
something is broken when wards are empty.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:13:26 +07:00

OSM Data Model — GoodGo Platform

This document is the canonical reference for every OpenStreetMap-sourced table in the GoodGo database, the sync pipelines that populate them, and the query patterns that use them.

Tables at a glance

| Table | Source | Geometry | Sync cadence | Used by |
| --- | --- | --- | --- | --- |
| vn_provinces | OSM boundary=administrative + admin_level=4 | MultiPolygon | Weekly (Mon 02:30 ICT) | GeoLookupService, KCN sync, address auto-fill |
| vn_districts | OSM admin_level=6 | MultiPolygon | Weekly (Wed 02:30 ICT) | Same as above. After the 2025 reform this table effectively holds the new ward / commune layer (~3,200 units), since Vietnam dropped the district level. The schema name is kept for backwards-compat with goodgo's existing FK references. |
| vn_wards | OSM admin_level=8 | MultiPolygon | Weekly (Sat 02:30 ICT) | Same as above. Note: after the 2025 admin reform Vietnam only uses level=4 (province) + level=6 (ward/commune). OSM doesn't currently tag any VN feature with admin_level=8, so this table will stay empty until/unless the policy changes. Kept for forward-compat. |
| Poi | OSM nodes/ways/relations matching 20 category selectors | Point | Daily, 1-category rotation (02:00 ICT) | /poi/nearby, /poi/by-bbox, listing sidebar, search filter |
| TransportLine | OSM `route=subway train highway` relations | MultiLineString | | |
| IndustrialPark | OSM landuse=industrial ways/relations | Point + MultiPolygon boundary | Monthly (1st 03:00 ICT, 4 chunks) | /industrial/parks/*, KCN catalog |
| OsmSyncRun | Generated by orchestrator | | Append-only audit | /admin/osm dashboard |

All sync writes are gated by OSM_SYNC_ENABLED=true so dev / staging environments don't hit Overpass accidentally.
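
A minimal sketch of how such a gate could look. Only the variable name OSM_SYNC_ENABLED comes from this document; the function names and the strict comparison against the string "true" are assumptions made for illustration, not the scripts' actual code.

```typescript
// Hypothetical guard. The environment is passed in as a plain object so the
// check can be exercised without touching the real process environment.
type Env = Record<string, string | undefined>;

function isOsmSyncEnabled(env: Env): boolean {
  // Assumption: only the exact string "true" enables writes.
  // Unset, "1", or "TRUE" all leave the sync disabled.
  return env.OSM_SYNC_ENABLED === "true";
}

function assertOsmSyncEnabled(env: Env): void {
  if (!isOsmSyncEnabled(env)) {
    throw new Error(
      "OSM sync is disabled: set OSM_SYNC_ENABLED=true to allow Overpass requests"
    );
  }
}
```

A sync script would call something like assertOsmSyncEnabled(process.env) before its first Overpass request, so a misconfigured dev or staging run fails fast instead of writing rows.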

GeoLookupService — the foundation

Every other layer depends on vn_provinces.geometry for PostGIS ST_Contains lookups. The service exposes:

const r = await geo.lookup(lng, lat);
// → { province: { code, name }, district: { code, name }, ward: { code, name } }

const inside = await geo.isInVietnam(lng, lat);
// → boolean

const cov = await geo.coverage();
// → { provinces: { total, withGeometry, lastSyncedAt }, districts: ..., wards: ... }

It replaces the old nearestProvince() heuristic that walked a hardcoded centroid table.
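
For orientation, the point-in-polygon lookup behind geo.lookup can be sketched as a parameterized PostGIS query. This is an illustration under assumptions: the geometry column is named in this document, but the code and name columns, SRID 4326, and the helper buildContainsQuery are guesses and not the service's real implementation.

```typescript
// Hypothetical query builder for a ST_Contains lookup against one boundary table.
interface ParameterizedQuery {
  text: string;
  values: number[];
}

type BoundaryTable = "vn_provinces" | "vn_districts" | "vn_wards";

function buildContainsQuery(
  table: BoundaryTable,
  lng: number,
  lat: number
): ParameterizedQuery {
  // Reject NaN / Infinity and out-of-range coordinates before they reach SQL.
  if (!isFinite(lng) || !isFinite(lat) || Math.abs(lng) > 180 || Math.abs(lat) > 90) {
    throw new RangeError("lng/lat out of range: " + lng + ", " + lat);
  }
  // ST_MakePoint takes (x, y), i.e. (longitude, latitude).
  // The table name comes from a closed union type, never from user input.
  const text = [
    "SELECT code, name",
    "FROM " + table,
    "WHERE ST_Contains(geometry, ST_SetSRID(ST_MakePoint($1, $2), 4326))",
    "LIMIT 1",
  ].join(" ");
  return { text, values: [lng, lat] };
}
```

The returned text and values can be handed to any client that accepts positional parameters. Running the same query once per table yields the province, district and ward parts of the lookup result.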

Quality gates baked into sync scripts

  1. Geographic gate: isPointInVietnam(lng, lat) from scripts/data/vn-country-polygon.ts rejects rows whose centroid falls outside the VN mainland polygon. This catches China / Laos / Cambodia features that bleed in across the Overpass bbox chunks.
  2. Name gate: rows whose name contains zero Latin/Vietnamese letters (/[A-Za-zÀ-ỹ]/) are dropped, with the intent of filtering CJK / Khmer / Thai names. Note that the range À-ỹ runs from U+00C0 to U+1EF9, which also covers the Thai (U+0E00 to U+0E7F) and Khmer (U+1780 to U+17FF) blocks, so as written the pattern reliably rejects only CJK names.
  3. Lock gate: when an admin sets osmLocked=true on a row or adds a column to lockedFields, the next sync skips that row entirely (or just that column), so manual edits survive.
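
The three gates can be sketched as small pure functions. Everything here is illustrative: pointInRing is a generic ray-casting test standing in for isPointInVietnam, the name pattern is deliberately narrowed to the Latin blocks (an assumption, not the script's pattern, because the range À-ỹ also spans the Thai and Khmer code points), and applyLockGate and its field shapes are hypothetical.

```typescript
type Ring = Array<[number, number]>; // polygon vertices as [lng, lat]

// Geographic gate: ray casting, toggling "inside" at each edge crossed by a
// horizontal ray from the point towards +lng.
function pointInRing(lng: number, lat: number, ring: Ring): boolean {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const xi = ring[i][0], yi = ring[i][1];
    const xj = ring[j][0], yj = ring[j][1];
    const crosses =
      (yi > lat) !== (yj > lat) &&
      lng < ((xj - xi) * (lat - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Name gate: ASCII letters, Latin-1 Supplement through Latin Extended-B,
// and Latin Extended Additional (where most Vietnamese letters live).
const LATIN_OR_VIETNAMESE = /[A-Za-z\u00C0-\u024F\u1E00-\u1EFF]/;

function passesNameGate(placeName: string): boolean {
  return LATIN_OR_VIETNAMESE.test(placeName);
}

// Lock gate: returns null when the whole row is locked, otherwise the
// incoming values minus any locked columns.
interface LockState {
  osmLocked: boolean;
  lockedFields: string[];
}

function applyLockGate(
  existing: LockState,
  incoming: Record<string, unknown>
): Record<string, unknown> | null {
  if (existing.osmLocked) return null;
  const patch: Record<string, unknown> = {};
  for (const key of Object.keys(incoming)) {
    if (existing.lockedFields.indexOf(key) === -1) patch[key] = incoming[key];
  }
  return patch;
}
```

A sync loop would apply them in order: drop the row if its centroid is outside the polygon or its name fails the pattern, then write only what applyLockGate returns.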

Adding a new POI category

  1. Add the enum value to PoiCategory in prisma/schema.prisma and create a Prisma migration that runs ALTER TYPE "PoiCategory" ADD VALUE.
  2. Add the Overpass selector to CATEGORY_QUERIES in scripts/sync-osm-poi.ts.
  3. Append the same enum value to the POI_CATEGORIES rotation list in OsmSyncCronService so the cron picks it up.
  4. Add labels + icons + colour to apps/web/lib/poi-api.ts so the UI chips render.

That's it — OsmSyncService.findLayer('poi', 'YOUR_CAT') will return a def automatically because SYNC_LAYERS is generated from the enum keys.
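
To make the "generated from the enum keys" remark concrete, here is a rough sketch of the pattern. The enum members other than HOSPITAL, the SyncLayerDef shape, and the standalone findLayer function are hypothetical stand-ins; the real PoiCategory comes from the Prisma client and the real lookup lives on OsmSyncService.

```typescript
// Stand-in for the Prisma-generated enum (assumed values, for illustration only).
const PoiCategory = {
  HOSPITAL: "HOSPITAL",
  SCHOOL: "SCHOOL",
  MARKET: "MARKET",
} as const;
type PoiCategory = (typeof PoiCategory)[keyof typeof PoiCategory];

interface SyncLayerDef {
  layer: "poi";
  category: PoiCategory;
  script: string;
}

// One definition per enum key, so a new enum value needs no extra registration.
const SYNC_LAYERS: SyncLayerDef[] = (Object.keys(PoiCategory) as PoiCategory[]).map(
  (category): SyncLayerDef => ({
    layer: "poi",
    category,
    script: "scripts/sync-osm-poi.ts",
  })
);

function findLayer(layer: string, category: string): SyncLayerDef | undefined {
  for (const def of SYNC_LAYERS) {
    if (def.layer === layer && def.category === category) return def;
  }
  return undefined;
}
```

With this shape, adding a member to the enum is enough for findLayer('poi', 'YOUR_CAT') to resolve. The Overpass selector and the cron rotation list from steps 2 and 3 still have to be added by hand.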

Operational runbook

  • Sync hangs / 504 from Overpass: kubectl describe pod on the Kaniko-style sync runner shows the chunk in flight. The script has a 5× retry on the clone step, since HTTP 504 from Gitea is transient. For Overpass itself, raise N in the per-script [out:json][timeout:N] header by editing the script; the defaults are 180s for POI and 300s for boundaries.
  • Runs stuck in RUNNING state: OsmSyncOrchestrator writes the row before spawning the script. If the script process dies without emitting an exit event, the row stays RUNNING. Mitigation: a cron job that flips rows that have been RUNNING for more than 6h to FAILED with errorMessage='timeout'.
  • Conflict logs: when a sync would update a column the admin has locked, it skips that column silently. There is no separate conflict table (yet). To audit, search Loki for [osm-sync] skipping locked field.
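
The mitigation for stuck runs can be sketched as a pure function that a cron job would apply to the rows it loads. The RUNNING and FAILED states, the 6h threshold and errorMessage='timeout' come from the runbook; the row shape, the startedAt field, the SUCCESS state and the name sweepStaleRuns are assumptions for illustration.

```typescript
type RunStatus = "RUNNING" | "SUCCESS" | "FAILED";

// Hypothetical subset of an OsmSyncRun row.
interface SyncRunRow {
  id: number;
  status: RunStatus;
  startedAt: Date;
  errorMessage: string | null;
}

const STALE_AFTER_MS = 6 * 60 * 60 * 1000; // 6 hours

// Returns a new array in which every run that has been RUNNING for more
// than 6h is marked FAILED with errorMessage "timeout"; other rows are
// returned unchanged.
function sweepStaleRuns(runs: SyncRunRow[], now: Date): SyncRunRow[] {
  return runs.map((run): SyncRunRow => {
    const ageMs = now.getTime() - run.startedAt.getTime();
    if (run.status === "RUNNING" && ageMs > STALE_AFTER_MS) {
      return { ...run, status: "FAILED", errorMessage: "timeout" };
    }
    return run;
  });
}
```

Keeping the decision separate from the database write makes the threshold easy to test. The cron job itself would then persist only the rows whose status changed.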

Phase status

| Phase | Status | Notes |
| --- | --- | --- |
| 0: Admin boundaries + GeoLookupService | | Schema, sync, service done. Provinces synced (33), districts in progress |
| 1: POI catalog + sync | | Schema + sync script + NestJS module + sidebar component done. Hospital category synced (~500 rows) |
| 2: Transport (metro/railway/airport) | 🟡 | Stations synced via POI; lines layer pending |
| 3: Buildings / landuse | | Deferred; admin says low priority |
| 4: Sync orchestrator + admin dashboard | | Service + cron + Prometheus-friendly stats + admin UI done |
| 5: User-facing UX | 🟡 | Listing + KCN sidebar wired; search filter widget built; map overlays pending |
| 6: Performance hardening | | Materialized views + Redis cache pending |