# OSM Data Model: GoodGo Platform

This document is the canonical reference for every OpenStreetMap-sourced table in the GoodGo database, the sync pipelines that populate them, and the query patterns that use them.

## Tables at a glance

| Table | Source | Geometry | Sync cadence | Used by |
|-------|--------|----------|--------------|---------|
| `vn_provinces` | OSM `boundary=administrative` + `admin_level=4` | MultiPolygon | Weekly (Mon 02:30 ICT) | `GeoLookupService`, KCN sync, address auto-fill |
| `vn_districts` | OSM `admin_level=6` | MultiPolygon | Weekly (Wed 02:30 ICT) | Same as above. **After the 2025 reform** this table effectively holds the new ward / commune layer (~3,200 units), since Vietnam dropped the district level. The schema name is kept for backwards compatibility with GoodGo's existing FK references. |
| `vn_wards` | OSM `admin_level=8` | MultiPolygon | Weekly (Sat 02:30 ICT) | Same as above. **Note**: after the 2025 admin reform Vietnam only uses level=4 (province) + level=6 (ward/commune). OSM doesn't currently tag any VN feature with `admin_level=8`, so this table will stay empty unless the policy changes. Kept for forward compatibility. |
| `Poi` | OSM nodes/ways/relations matching 20 category selectors | Point | Daily, one category per run in rotation (02:00 ICT) | `/poi/nearby`, `/poi/by-bbox`, listing sidebar, search filter |
| `TransportLine` | OSM `route=subway\|train\|highway` relations | MultiLineString | Monthly | Distance scoring, planned for Phase 2 UX |
| `IndustrialPark` | OSM `landuse=industrial` ways/relations | Point + MultiPolygon boundary | Monthly (1st 03:00 ICT, 4 chunks) | `/industrial/parks/*`, KCN catalog |
| `OsmSyncRun` | Generated by orchestrator | n/a | Append-only audit | `/admin/osm` dashboard |

All sync writes are gated by `OSM_SYNC_ENABLED=true` so dev / staging environments don't hit Overpass accidentally.

## GeoLookupService: the foundation

Every other layer depends on `vn_provinces.geometry` for PostGIS `ST_Contains` lookups.
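
To illustrate the kind of `ST_Contains` lookup this implies, here is a minimal sketch that builds a parameterized point-in-polygon query. The helper name `buildContainsQuery`, the `code` and `name` columns, and SRID 4326 are assumptions made for illustration; the actual service may be implemented differently.

```typescript
// Hypothetical helper: builds a parameterized point-in-polygon query for one admin table.
// Assumptions (not confirmed by this document): geometries are stored in SRID 4326,
// and each table has `code` and `name` columns next to `geometry`.
type AdminTable = "vn_provinces" | "vn_districts" | "vn_wards";

interface SqlQuery {
  text: string;
  values: [number, number];
}

function buildContainsQuery(table: AdminTable, lng: number, lat: number): SqlQuery {
  if (!Number.isFinite(lng) || !Number.isFinite(lat)) {
    throw new RangeError("lng and lat must be finite numbers");
  }
  return {
    // ST_MakePoint takes (x, y), i.e. (longitude, latitude).
    // The table name comes from a closed union type, never from user input.
    text:
      `SELECT code, name FROM ${table} ` +
      `WHERE ST_Contains(geometry, ST_SetSRID(ST_MakePoint($1, $2), 4326)) ` +
      `LIMIT 1`,
    values: [lng, lat],
  };
}
```

The returned `text` and `values` can be passed to any client that supports `$1`-style placeholders. Passing the coordinates as parameters rather than interpolating them keeps the query safe and lets the planner reuse it.
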
`GeoLookupService` exposes:

```ts
const r = await geo.lookup(lng, lat);
// → { province: { code, name }, district: { code, name }, ward: { code, name } }

const inside = await geo.isInVietnam(lng, lat);
// → boolean

const cov = await geo.coverage();
// → { provinces: { total, withGeometry, lastSyncedAt }, districts: ..., wards: ... }
```

It replaces the old `nearestProvince()` heuristic that walked a hardcoded centroid table.

## Quality gates baked into sync scripts

1. **Geographic gate**: `isPointInVietnam(lng, lat)` from `scripts/data/vn-country-polygon.ts` rejects rows whose centroid falls outside the VN mainland polygon. This catches China / Laos / Cambodia bleed across the Overpass bbox chunks.
2. **Name gate**: rows whose `name` contains zero Latin/Vietnamese letters (`/[A-Za-zÀ-ỹ]/`) are dropped. This filters CJK / Khmer / Thai names.
3. **Lock gate**: when an admin sets `osmLocked=true`, the next sync skips that row entirely; when an admin adds a column to `lockedFields`, the next sync skips only that column. Either way, manual edits survive.

## Adding a new POI category

1. Add the enum value to `PoiCategory` in `prisma/schema.prisma` and create a Prisma migration that runs an `ALTER TYPE "PoiCategory" ADD VALUE` statement.
2. Add the Overpass selector to `CATEGORY_QUERIES` in `scripts/sync-osm-poi.ts`.
3. Append the same enum value to the `POI_CATEGORIES` rotation list in `OsmSyncCronService` so the cron picks it up.
4. Add labels, icons, and colour to `apps/web/lib/poi-api.ts` so the UI chips render.

That's it: `OsmSyncService.findLayer('poi', 'YOUR_CAT')` will return a layer definition automatically because `SYNC_LAYERS` is generated from the enum keys.

## Operational runbook

* **Sync hangs / 504 from Overpass**: `kubectl describe pod` on the Kaniko-style sync runner shows the chunk in flight. The script retries the clone step up to 5× (HTTP 504 from Gitea is transient). For Overpass itself, raise the per-script `[out:json][timeout:N]` value by editing the script. The defaults are 180 s for POI and 300 s for boundaries.
* **Runs stuck in `RUNNING` state**: `OsmSyncOrchestrator` writes the row before spawning the script. If the script process dies without emitting an `exit` event, the row stays `RUNNING`. Mitigation: a cron job that flips rows that have been `RUNNING` for more than 6 h to `FAILED` with `errorMessage='timeout'`.
* **Conflict logs**: when a sync would update a column that an admin has locked, it skips the column silently. There is no separate conflict table (yet). To audit, search Loki for `[osm-sync] skipping locked field`.

## Phase status

| Phase | Status | Notes |
|-------|--------|-------|
| 0: Admin boundaries + GeoLookupService | ✅ | Schema, sync, and service done. Provinces synced (33); districts in progress. |
| 1: POI catalog + sync | ✅ | Schema, sync script, NestJS module, and sidebar component done. Hospital category synced (~500 rows). |
| 2: Transport (metro/railway/airport) | 🟡 | Stations synced via POI; lines layer pending. |
| 3: Buildings / landuse | ⏳ | Deferred; admin says low priority. |
| 4: Sync orchestrator + admin dashboard | ✅ | Service, cron, Prometheus-friendly stats, and admin UI done. |
| 5: User-facing UX | 🟡 | Listing and KCN sidebar wired; search filter widget built; map overlays pending. |
| 6: Performance hardening | ⏳ | Materialized views and Redis cache pending. |
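
For the stuck-`RUNNING` mitigation in the operational runbook, the selection logic can be sketched as a pure function. This is only an illustration: the `SyncRun` shape, the `startedAt` field, the `SUCCESS` status, and the name `reapStaleRuns` are assumptions, and the real `OsmSyncRun` columns may differ. Only the 6 h threshold, the `FAILED` status, and `errorMessage='timeout'` come from the runbook.

```typescript
// Hypothetical row shape; the real OsmSyncRun columns may differ.
type RunStatus = "RUNNING" | "SUCCESS" | "FAILED";

interface SyncRun {
  id: string;
  status: RunStatus;
  startedAt: Date;
  errorMessage?: string;
}

// 6 hours in milliseconds, the threshold named in the runbook.
const STALE_AFTER_MS = 6 * 60 * 60 * 1000;

// Returns a new array in which RUNNING rows older than the threshold
// are marked FAILED with errorMessage 'timeout'. Other rows pass through unchanged.
function reapStaleRuns(
  runs: SyncRun[],
  now: Date,
  staleAfterMs: number = STALE_AFTER_MS,
): SyncRun[] {
  return runs.map((run): SyncRun => {
    const ageMs = now.getTime() - run.startedAt.getTime();
    if (run.status === "RUNNING" && ageMs > staleAfterMs) {
      return { ...run, status: "FAILED", errorMessage: "timeout" };
    }
    return run;
  });
}
```

Keeping the decision in a pure function makes the threshold easy to unit-test; the cron job would then only need to load `RUNNING` rows, call the function, and persist the rows whose status changed.
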