- Add pg-backup container with daily automated pg_dump (02:00 UTC) and 7-day retention
- Add backup/restore scripts with documented recovery procedure
- Add Loki + Promtail for centralized log aggregation from all Docker containers
- Add Loki as Grafana datasource with correlation ID derived fields
- Add Grafana logs dashboard with volume, error rate, HTTP request, and log viewer panels
- Configure Promtail to parse Pino structured JSON logs with level/context labels
- Enhance LoggerService with string-level formatter and service base field
- Configure 15-day log retention in Loki

Co-Authored-By: Paperclip <noreply@paperclip.ing>

# Database Backup & Restore Procedures

## Overview

Automated daily PostgreSQL backups run inside the `pg-backup` Docker container, which uses `pg_dump` in custom format with compression. Backups are stored in the `pg_backups` Docker volume.

## Backup Configuration

| Setting | Default | Environment Variable / Source |
|---------|---------|-------------------------------|
| Schedule | Daily at 02:00 UTC | Cron in `pg-backup` service |
| Retention | 7 days | `BACKUP_RETENTION_DAYS` |
| Format | Custom (`pg_dump --format=custom`) | n/a |
| Compression | Level 6 | n/a |
| Storage | `pg_backups` Docker volume | n/a |

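The actual `/scripts/pg-backup.sh` is not reproduced in this document. As a rough sketch only, the naming and retention settings above could translate into something like the following. The function names, the `BACKUP_DIR` variable, and the `goodgo_*.sql.gz` pattern are assumptions for illustration (the pattern is inferred from the file names used in the restore commands); only `BACKUP_RETENTION_DAYS` comes from the table.

```bash
# Hypothetical sketch; not the actual /scripts/pg-backup.sh.
BACKUP_DIR="${BACKUP_DIR:-/backups}"                  # assumed variable name
BACKUP_RETENTION_DAYS="${BACKUP_RETENTION_DAYS:-7}"   # default from the table

# Build a name such as goodgo_20240101_020000.sql.gz from the current UTC time.
backup_filename() {
  printf 'goodgo_%s.sql.gz\n' "$(date -u +%Y%m%d_%H%M%S)"
}

# Delete backup files last modified more than N whole days ago.
# $1 = directory (defaults to BACKUP_DIR), $2 = days (defaults to retention).
prune_old_backups() {
  find "${1:-$BACKUP_DIR}" -type f -name 'goodgo_*.sql.gz' \
    -mtime +"${2:-$BACKUP_RETENTION_DAYS}" -exec rm -f {} +
}
```

A crontab entry of `0 2 * * *` would match the 02:00 schedule, provided the container clock runs in UTC.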
## Listing Backups

```bash
docker exec goodgo-pg-backup ls -lh /backups/
```

## Manual Backup

```bash
docker exec goodgo-pg-backup /scripts/pg-backup.sh
```

## Restore Procedure

### 1. Identify the backup to restore

```bash
docker exec goodgo-pg-backup ls -lht /backups/
```

### 2. Stop application services

```bash
docker compose stop ai-services
# Also stop any running NestJS API processes before restoring
```

### 3. Run restore

```bash
docker exec -it goodgo-pg-backup /scripts/pg-restore.sh /backups/goodgo_YYYYMMDD_HHMMSS.sql.gz
```

The restore script will:

- Terminate active database connections
- Drop and recreate the database
- Restore from the selected backup

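The contents of `/scripts/pg-restore.sh` are not shown here. The three steps listed above could be sketched roughly as follows, using only standard PostgreSQL client tools. The function names and the `DB_NAME` variable are hypothetical, the sketch assumes connection settings such as `PGHOST`, `PGUSER`, and `PGPASSWORD` are already exported, and it assumes the backup is a custom-format archive as stated in the configuration table.

```bash
# Hypothetical sketch of the three steps; not the actual /scripts/pg-restore.sh.
# Assumes PGHOST, PGUSER and PGPASSWORD are already exported in the container.
DB_NAME="${DB_NAME:-goodgo}"   # assumed variable; "goodgo" is the name used with psql below

# Print the SQL that disconnects every other session from the target database.
terminate_connections_sql() {
  printf "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = '%s' AND pid <> pg_backend_pid();\n" "$1"
}

restore_backup() {
  backup_file="$1"
  # 1. Terminate active connections (run from the "postgres" maintenance database).
  psql -d postgres -c "$(terminate_connections_sql "$DB_NAME")" || return 1
  # 2. Drop and recreate the database.
  dropdb --if-exists "$DB_NAME" || return 1
  createdb "$DB_NAME" || return 1
  # 3. Restore from the selected custom-format archive, stopping on the first error.
  pg_restore --dbname="$DB_NAME" --exit-on-error "$backup_file"
}
```

Because the database is dropped before the restore starts, it is worth confirming that the chosen backup file is readable before running anything like this.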
### 4. Verify restore

```bash
docker exec goodgo-postgres psql -U goodgo -d goodgo -c '\dt'
docker exec goodgo-postgres psql -U goodgo -d goodgo -c 'SELECT count(*) FROM "User";'
```

### 5. Run Prisma migrations (if needed)

```bash
pnpm prisma migrate deploy
```

### 6. Restart services

```bash
docker compose up -d
```

## Backup Verification

Check the backup log:

```bash
docker exec goodgo-pg-backup cat /var/log/pg-backup.log
```

List the archive's table of contents to confirm that a backup is readable, without restoring it:

```bash
docker exec goodgo-pg-backup pg_restore --list /backups/goodgo_YYYYMMDD_HHMMSS.sql.gz
```

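To run the same readability check over every backup at once, a small helper along these lines could be used inside the `pg-backup` container. This is only a sketch: the function names, the `PG_RESTORE` override, and the `goodgo_*.sql.gz` pattern are assumptions, not part of the shipped scripts.

```bash
# Hypothetical helper; PG_RESTORE can be overridden, e.g. for a dry run.
PG_RESTORE="${PG_RESTORE:-pg_restore}"

# Succeeds only if the file is non-empty and its table of contents is readable.
verify_backup() {
  [ -s "$1" ] || { echo "EMPTY OR MISSING: $1" >&2; return 1; }
  "$PG_RESTORE" --list "$1" >/dev/null 2>&1 || { echo "UNREADABLE: $1" >&2; return 1; }
  echo "OK: $1"
}

# Check every backup in a directory; return non-zero if any check fails.
verify_all_backups() {
  status=0
  for f in "${1:-/backups}"/goodgo_*.sql.gz; do
    [ -e "$f" ] || continue   # the pattern matched nothing
    verify_backup "$f" || status=1
  done
  return "$status"
}
```

A readable table of contents does not prove that every row can be restored; a full restore into a scratch database remains the stronger test.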
## Disaster Recovery

For complete data loss (volume destroyed):

1. Retrieve the backup from external storage (if configured)
2. Recreate the `pg_backups` volume and copy the backup file into it
3. Follow the restore procedure above

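Step 2 can be done with a throwaway container that mounts both the volume and the host directory holding the retrieved backup. The sketch below is one possible approach, not a documented procedure: the function name, the `DOCKER` override, and the use of the `alpine` image are assumptions. Docker Compose usually prefixes volume names with the project name, so check `docker volume ls` for the exact name first.

```bash
# Hypothetical sketch. DOCKER can be set to "echo" to preview the commands.
DOCKER="${DOCKER:-docker}"

# Copy one backup file from a host directory into a (re)created named volume.
# $1 = volume name, $2 = absolute host directory, $3 = backup file name
seed_backup_volume() {
  volume="$1"; src_dir="$2"; file="$3"
  "$DOCKER" volume create "$volume" || return 1
  "$DOCKER" run --rm \
    -v "$volume":/backups \
    -v "$src_dir":/src:ro \
    alpine cp "/src/$file" /backups/
}
```

For example, `DOCKER=echo seed_backup_volume pg_backups /tmp/restore goodgo_YYYYMMDD_HHMMSS.sql.gz` prints the two Docker commands without running them.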
## Log Aggregation

Logs are aggregated via Loki + Promtail and viewable in Grafana:

- **Grafana**: http://localhost:3002 (dashboard: "GoodGo - Logs")
- **Loki**: http://localhost:3100
- **Log retention**: 15 days (configured in `monitoring/loki/loki-config.yml`)
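Loki can also be queried directly over its HTTP API, which is handy for checking that logs are arriving before opening Grafana. The helpers below are a minimal sketch: the function names and the `CURL` and `LOKI_URL` variables are hypothetical, and the label used in the example query is an assumption about how Promtail labels the streams.

```bash
# Hypothetical helpers. CURL can be set to "echo" to preview the requests.
CURL="${CURL:-curl}"
LOKI_URL="${LOKI_URL:-http://localhost:3100}"

# Readiness probe: Loki answers "ready" once it can serve requests.
loki_ready() {
  "$CURL" --silent "$LOKI_URL/ready"
}

# Fetch recent log lines matching a LogQL selector (Loki's default time range applies).
# $1 = LogQL query, $2 = maximum number of lines (defaults to 20)
loki_query() {
  "$CURL" --silent --get "$LOKI_URL/loki/api/v1/query_range" \
    --data-urlencode "query=$1" \
    --data-urlencode "limit=${2:-20}"
}
```

For example, `loki_query '{level="error"}'` would return recent error-level lines, assuming a `level` label with that value exists.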