The OSM bbox sync was picking up `landuse=industrial` polygons that sit just across the borders in Laos, Thailand, Cambodia and southern China. After the bulk promote we ended up with 220 of those in the public catalog: Vientiane SEZ, Phnom Penh SEZ, Sihanoukville SEZ, several Thai industrial estates, etc.

Three-part fix:

1. `scripts/data/vn-country-polygon.ts`: a hand-traced ~30-vertex GeoJSON polygon that follows VN's land + sea border. The eastern edge is generous (110°E) so every coastal industrial zone (Vũng Áng / Formosa, Dung Quất, Nhơn Hội, Vũng Tàu / Long Sơn) sits comfortably inside; the western/northern edges trace the actual neighbour borders. Includes a pure-JS `isPointInVietnam(lng, lat)` ray-cast helper for the sync script (no extra dep).

2. `scripts/prune-non-vietnam-osm.ts`: one-shot cleaner. Uses PostGIS `ST_Within(location, polygon)` to delete every OSM row whose centroid falls outside the polygon. Verified the polygon doesn't reject genuine VN parks (Formosa Hà Tĩnh, Dung Quất, Nhơn Hội, KCN Đất Đỏ etc. all pass).

3. `sync-osm-industrial-parks.ts`: `parseFeature()` now calls `isPointInVietnam` after computing the centroid and bails early on a miss, so the next monthly cron run won't re-import them.

Run on dev: removed 220 rows. Final catalog 1,483 KCN, all inside the Vietnam mainland polygon.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
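For readers unfamiliar with the ray-cast test mentioned above, here is a minimal sketch of an even-odd point-in-polygon check. It is not the code from `scripts/data/vn-country-polygon.ts`: the names `isPointInRing` and `DEMO_RING` are hypothetical, and the ring is a plain rectangle for illustration, not the hand-traced ~30-vertex polygon.

```typescript
// Hypothetical names throughout; the real helper is `isPointInVietnam(lng, lat)`.
type LngLat = readonly [number, number]; // [lng, lat], GeoJSON axis order

// Illustrative rectangle only (assumption), NOT the real country polygon.
// Closed ring: the last vertex repeats the first, as GeoJSON requires.
const DEMO_RING: readonly LngLat[] = [
  [102.0, 8.0],
  [110.0, 8.0],
  [110.0, 23.5],
  [102.0, 23.5],
  [102.0, 8.0],
];

/**
 * Even-odd ray cast: shoot a ray from the point towards +lng and count
 * how many ring edges it crosses. An odd count means the point is inside.
 */
function isPointInRing(lng: number, lat: number, ring: readonly LngLat[]): boolean {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // The edge can only be crossed if its endpoints lie on opposite sides
    // of the horizontal line through the point. This also skips horizontal
    // and zero-length edges, so the division below is never by zero.
    const straddles = (yi > lat) !== (yj > lat);
    if (straddles) {
      // Longitude at which the edge meets the point's latitude.
      const crossingLng = ((xj - xi) * (lat - yi)) / (yj - yi) + xi;
      if (lng < crossingLng) inside = !inside;
    }
  }
  return inside;
}

console.log(isPointInRing(105.0, 15.0, DEMO_RING)); // inside the demo rectangle
console.log(isPointInRing(100.5, 13.75, DEMO_RING)); // west of the demo rectangle
```

A sync script could call such a helper right after computing a feature's centroid and skip the feature when it returns `false`, which is the behaviour item 3 describes for `parseFeature()`.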
91 lines
3.0 KiB
TypeScript
/**
 * Prune `IndustrialPark` rows whose centroid is outside the Vietnam
 * mainland polygon. Catches the cross-border bleed (Laos, Thailand,
 * Cambodia) that the Overpass bbox sync inevitably picks up.
 *
 * Usage:
 *   NODE_OPTIONS="-r dotenv/config" DOTENV_CONFIG_PATH=.env \
 *     pnpm tsx scripts/prune-non-vietnam-osm.ts [--dry-run]
 *
 * Strategy:
 *   1. Build a PostGIS polygon from `VN_COUNTRY_POLYGON_GEOJSON`.
 *   2. SELECT rows where `NOT ST_Within(location, polygon)`, scoped to
 *      OSM-sourced rows (we never want to delete a manually-curated
 *      row even if its centroid is wonky).
 *   3. DELETE in one statement (cascade removes any IndustrialListing
 *      rows attached to those parks).
 *
 * Safe to re-run: idempotent.
 */
import 'dotenv/config';
import { PrismaPg } from '@prisma/adapter-pg';
import { PrismaClient } from '@prisma/client';
import pg from 'pg';
import { VN_COUNTRY_POLYGON_GEOJSON } from './data/vn-country-polygon';

const pool = new pg.Pool({ connectionString: process.env['DATABASE_URL'] });
const adapter = new PrismaPg(pool);
const prisma = new PrismaClient({ adapter });

const dryRun = process.argv.includes('--dry-run');

async function main(): Promise<void> {
  const polygonSql = `ST_SetSRID(ST_GeomFromGeoJSON('${VN_COUNTRY_POLYGON_GEOJSON.replace(
    /'/g,
    "''",
  )}'), 4326)`;

  const outsideRows = await prisma.$queryRawUnsafe<
    { id: string; name: string; province: string; lat: number; lng: number; ha: number }[]
  >(
    `SELECT id, name, province,
            ROUND(ST_Y(location::geometry)::numeric, 3)::float AS lat,
            ROUND(ST_X(location::geometry)::numeric, 3)::float AS lng,
            COALESCE("totalAreaHa", 0) AS ha
       FROM "IndustrialPark"
      WHERE "dataSource" IN ('OSM', 'OSM_PROMOTED')
        AND NOT ST_Within(location::geometry, ${polygonSql})
      ORDER BY ha DESC NULLS LAST`,
  );

  console.log(`📍 Found ${outsideRows.length} OSM rows OUTSIDE the VN polygon.`);

  if (outsideRows.length === 0) {
    console.log('✓ Catalog is clean.');
    return;
  }

  // Show the top 15 by area so the operator can sanity-check before deleting.
  console.log(' Top 15 by area (will be deleted):');
  for (const row of outsideRows.slice(0, 15)) {
    console.log(
      ` ${row.name.slice(0, 50).padEnd(50)} ${row.province.slice(0, 16).padEnd(16)} ${
        row.ha
      } ha (${row.lat}, ${row.lng})`,
    );
  }

  if (dryRun) {
    console.log('💡 --dry-run: no writes performed.');
    return;
  }

  console.log(`\n🗑 Deleting ${outsideRows.length} rows…`);
  const result = await prisma.$executeRawUnsafe(
    `DELETE FROM "IndustrialPark"
      WHERE "dataSource" IN ('OSM', 'OSM_PROMOTED')
        AND NOT ST_Within(location::geometry, ${polygonSql})`,
  );
  console.log(`✓ Removed ${result} rows.`);
}

main()
  .catch((err) => {
    console.error(err);
    process.exitCode = 1;
  })
  .finally(async () => {
    await prisma.$disconnect();
    await pool.end();
  });
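The script above splices the GeoJSON into the SQL text and escapes single quotes by hand. A possible alternative, sketched below and not part of this commit, is to pass the GeoJSON as a bound parameter so no manual escaping is needed. The helper name `buildOutsidePolygonDelete` and the `PruneQuery` shape are hypothetical, and the type check assumes `VN_COUNTRY_POLYGON_GEOJSON` is a bare `Polygon` or `MultiPolygon` geometry string, which the source does not confirm.

```typescript
// Hypothetical helper, not part of the commit: builds SQL text plus bound
// values instead of interpolating the GeoJSON into the statement.
interface PruneQuery {
  text: string;
  values: string[];
}

function buildOutsidePolygonDelete(polygonGeoJson: string): PruneQuery {
  // Fail fast on malformed input before it reaches the database.
  // Assumption: the constant holds a bare geometry, not a Feature.
  const parsed = JSON.parse(polygonGeoJson) as { type?: string };
  if (parsed.type !== 'Polygon' && parsed.type !== 'MultiPolygon') {
    throw new Error(`Expected Polygon or MultiPolygon GeoJSON, got ${String(parsed.type)}`);
  }

  return {
    // `$1::text` pins the parameter type so PostgreSQL picks the text
    // overload of ST_GeomFromGeoJSON.
    text: `DELETE FROM "IndustrialPark"
            WHERE "dataSource" IN ('OSM', 'OSM_PROMOTED')
              AND NOT ST_Within(
                    location::geometry,
                    ST_SetSRID(ST_GeomFromGeoJSON($1::text), 4326))`,
    values: [polygonGeoJson],
  };
}

// Demo with a toy triangle (illustrative data only).
const demo = buildOutsidePolygonDelete(
  '{"type":"Polygon","coordinates":[[[0,0],[1,0],[1,1],[0,0]]]}',
);
console.log(demo.values.length); // one bound parameter
```

With node-postgres, the result could be executed as `await pool.query(demo.text, demo.values)`. The same `$1` placeholder style should also work with Prisma's `$executeRawUnsafe(text, ...values)`, though that path has not been tested against this schema.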