goodgo-platform/scripts/backfill-osm-provinces.ts
Ho Ngoc Hai c15bdcc6bf
fix(industrial): improve OSM review UX + public map visibility
Four UX issues surfaced when reviewing the new OSM-sync pipeline against
the actual 2,193 imports — fixed in this commit:

1. Admin queue surfaced noise first.
   `ListOsmPendingHandler` now sorts by `totalAreaHa DESC` (real KCN
   first, single-factory `landuse=industrial` polygons last) and accepts
   `minAreaHa` (default 50 ha) plus a `region` filter. The admin page
   exposes both as dropdowns: "Tất cả" (All) / ≥ 5 / ≥ 50 / ≥ 200 /
   ≥ 500 ha.
   Top-of-queue is now Bàu Bàng (2,597 ha) and Nhơn Trạch (2,535 ha).

2. Promote dialog said "KCN KCN Đại An" — duplicate prefix.
   Reworded to "Sắp promote: <name>" ("About to promote: <name>") so
   the row name stands on its own.

3. Province was "Chưa xác định" ("Undetermined") on 2,107 of 2,193
   OSM rows.
   The OSM tags lacked any addr:* hint, so the importer never had
   anything to write. Added `scripts/data/vn-province-centroids.ts` (63
   provinces with capital-city coords) and a `nearestProvince(lat, lng)`
   fallback in `parseFeature()`. Shipped a one-shot backfill script
   `scripts/backfill-osm-provinces.ts` and ran it — every existing OSM
   row now has a province (Hồ Chí Minh: 408, Lạng Sơn: 232,
   Quảng Ninh: 220, Hà Nội: 172, Hải Phòng: 105, …). Admin can correct
   on promote if the nearest-centroid heuristic picked the wrong
   neighbour for a long-thin province.

4. Public map looked empty — only 20 curated parks visible.
   Added an opt-in toggle "Hiển thị KCN OSM" ("Show OSM industrial
   parks") with a small legend above the map. When on, the bbox
   endpoint returns OSM raw rows too; markers
   render in amber (vs. green for curated) at slightly smaller radius
   and lower opacity, so the visual hierarchy stays clear. Refetch is
   wired through a ref so the toggle takes effect without remounting
   the map.
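
The queue ordering described in item 1 can be sketched in memory as below. This is a minimal illustration, not the actual `ListOsmPendingHandler`: the `PendingPark` shape, the `listPending` name, the region values, and the small sample row are all assumptions, and the real handler presumably filters and sorts in the database query.

```typescript
// Hypothetical row shape; only `totalAreaHa` and `region` are named in the commit.
interface PendingPark {
  name: string;
  totalAreaHa: number;
  region: string;
}

interface ListOptions {
  minAreaHa?: number; // default of 50 ha mirrors the commit message
  region?: string;    // when omitted, all regions are returned
}

// Filter out small polygons, optionally narrow by region, then sort largest first.
function listPending(parks: readonly PendingPark[], opts: ListOptions = {}): PendingPark[] {
  const { minAreaHa = 50, region } = opts;
  return parks
    .filter((p) => p.totalAreaHa >= minAreaHa)
    .filter((p) => region === undefined || p.region === region)
    .sort((a, b) => b.totalAreaHa - a.totalAreaHa); // filter() copied, so input is untouched
}

// Illustrative data: the two large areas come from the commit message,
// the 3 ha row and the region labels are made up.
const sample: PendingPark[] = [
  { name: 'Single-factory polygon', totalAreaHa: 3, region: 'NORTH' },
  { name: 'Nhơn Trạch', totalAreaHa: 2535, region: 'SOUTH' },
  { name: 'Bàu Bàng', totalAreaHa: 2597, region: 'SOUTH' },
];

console.log(listPending(sample).map((p) => p.name));
console.log(listPending(sample, { minAreaHa: 0 }).map((p) => p.name));
```

With the default threshold the 3 ha polygon drops out and Bàu Bàng sorts ahead of Nhơn Trạch; passing `minAreaHa: 0` brings the small polygon back at the end of the list.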

Verified in browser preview: with the toggle on, zooming out shows
clusters of 320 / 71 / etc. across the country; with it off, only
three small clusters (the 20 curated parks) remain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 00:09:24 +07:00

95 lines
2.9 KiB
TypeScript

/**
 * One-shot backfill for OSM-imported `IndustrialPark` rows whose
 * `province` is "Chưa xác định" (the placeholder we wrote when the OSM
 * tags lacked any addr:* hints).
 *
 * Usage:
 *   NODE_OPTIONS="-r dotenv/config" DOTENV_CONFIG_PATH=.env \
 *     pnpm tsx scripts/backfill-osm-provinces.ts [--dry-run]
 *
 * What it does:
 *   1. Selects every row where dataSource = 'OSM' AND province =
 *      'Chưa xác định'.
 *   2. Reads the centroid via ST_X / ST_Y from the `location` Point.
 *   3. Looks up the nearest province from VN_PROVINCE_CENTROIDS.
 *   4. Updates the row in batches.
 *
 * Safe to re-run: skips rows where province is already filled in.
 */
import 'dotenv/config';

import { PrismaPg } from '@prisma/adapter-pg';
import { PrismaClient } from '@prisma/client';
import pg from 'pg';

import { nearestProvince } from './data/vn-province-centroids';

const pool = new pg.Pool({ connectionString: process.env['DATABASE_URL'] });
const adapter = new PrismaPg(pool);
const prisma = new PrismaClient({ adapter });

const dryRun = process.argv.includes('--dry-run');

interface Row {
  id: string;
  lat: number;
  lng: number;
}

async function main(): Promise<void> {
  console.log('🔍 Finding OSM rows with province="Chưa xác định"…');

  const rows = await prisma.$queryRawUnsafe<Row[]>(
    `SELECT id, ST_Y(location::geometry) AS lat, ST_X(location::geometry) AS lng
     FROM "IndustrialPark"
     WHERE "dataSource"::text = 'OSM' AND province = 'Chưa xác định'`,
  );
  console.log(`${rows.length} rows need a province.`);
  if (!rows.length) {
    console.log('✓ Nothing to do.');
    return;
  }

  // Group row ids by the province inferred from each row's centroid.
  const updates = new Map<string, string[]>();
  for (const row of rows) {
    const province = nearestProvince(row.lat, row.lng);
    if (!updates.has(province)) updates.set(province, []);
    updates.get(province)!.push(row.id);
  }

  // Sort by impact for the dry-run preview.
  const summary = Array.from(updates.entries()).sort((a, b) => b[1].length - a[1].length);
  console.log(' → Distribution by inferred province:');
  for (const [province, ids] of summary) {
    console.log(` ${province.padEnd(24)} ${ids.length}`);
  }

  if (dryRun) {
    console.log('💡 --dry-run: no writes performed.');
    return;
  }

  let totalUpdated = 0;
  for (const [province, ids] of updates) {
    // UPDATE in batches of 500 ids to avoid huge IN-lists.
    for (let i = 0; i < ids.length; i += 500) {
      const batch = ids.slice(i, i + 500);
      const result = await prisma.industrialPark.updateMany({
        where: { id: { in: batch } },
        data: { province },
      });
      totalUpdated += result.count;
    }
  }
  console.log(`✓ Updated ${totalUpdated} rows.`);
}

main()
  .catch((err) => {
    console.error(err);
    process.exitCode = 1;
  })
  .finally(async () => {
    await prisma.$disconnect();
    await pool.end();
  });
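
The script relies on `nearestProvince(lat, lng)` from `./data/vn-province-centroids`, which is not shown on this page. The sketch below illustrates what a nearest-centroid lookup of that kind might look like. It is an assumption-laden illustration, not the module's actual code: the three-entry `CENTROIDS` table (with approximate city coordinates), the `ProvinceCentroid` shape, and the choice of haversine distance are all hypothetical, and the real file is described as covering 63 provinces.

```typescript
// Hypothetical entry shape; the real module's data layout is not shown.
interface ProvinceCentroid {
  name: string;
  lat: number;
  lng: number;
}

// Illustrative subset with approximate coordinates, NOT the real 63-entry table.
const CENTROIDS: readonly ProvinceCentroid[] = [
  { name: 'Hà Nội', lat: 21.03, lng: 105.85 },
  { name: 'Đà Nẵng', lat: 16.05, lng: 108.2 },
  { name: 'Hồ Chí Minh', lat: 10.82, lng: 106.63 },
];

const EARTH_RADIUS_KM = 6371;
const toRad = (deg: number): number => (deg * Math.PI) / 180;

// Great-circle distance between two lat/lng points (haversine formula).
function haversineKm(lat1: number, lng1: number, lat2: number, lng2: number): number {
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Linear scan: return the name of the centroid closest to the given point.
function nearestProvince(lat: number, lng: number): string {
  let bestName = '';
  let bestKm = Infinity;
  for (const c of CENTROIDS) {
    const km = haversineKm(lat, lng, c.lat, c.lng);
    if (km < bestKm) {
      bestKm = km;
      bestName = c.name;
    }
  }
  return bestName;
}

console.log(nearestProvince(10.9, 106.7));
```

A linear scan is cheap for a table of a few dozen entries. As the commit message notes, a nearest-centroid heuristic can pick the wrong neighbour for a long, thin province, which is why the admin can still correct the province on promote.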