# Troubleshooting Guide

> **Note**: This guide focuses on debugging the GoodGo Microservices Platform in a local development environment (Docker Compose).

## Table of Contents

1. [General Diagnosis](#general-diagnosis)
2. [Infrastructure Issues](#infrastructure-issues)
   - [Database (Neon/PostgreSQL)](#database-neonpostgresql)
   - [Redis](#redis)
   - [Traefik Gateway](#traefik-gateway)
3. [Service Issues](#service-issues)
   - [Service Fails to Start](#service-fails-to-start)
   - [Prisma/Database Errors](#prismadatabase-errors)
   - [Authentication Errors](#authentication-errors)
4. [Debugging Tools](#debugging-tools)
5. [FAQ](#faq)
6. [Quick Tips](#quick-tips)

---

## General Diagnosis

When something goes wrong, follow this checklist:

1. **Check Service Status**:
   ```bash
   cd deployments/local
   docker-compose ps
   ```
   *All services should be `Up` or `Running`.*

2. **Check Logs**:
   ```bash
   # View logs for a specific service
   docker-compose logs -f <service-name>

   # View the last 100 lines for each service
   docker-compose logs --tail=100
   ```

3. **Check Connectivity**:
   * Can you reach the Gateway? `curl http://localhost/health`
   * Can you reach the Dashboard? http://localhost:8080

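
The first checklist step can be partly automated. The sketch below is a hypothetical helper (`services_not_up` is not part of the project) that reads `docker-compose ps` output on stdin and prints the first column of every row whose status mentions neither `Up` nor `running`. It assumes the service or container name is the first column, which holds for the common Compose layouts but is still an assumption.

```bash
# Hypothetical helper: list services that are not Up/running.
# Reads `docker-compose ps` output on stdin.
services_not_up() {
  awk '
    NR == 1 && toupper($1) == "NAME" { next }   # skip the header row
    /^-+$/ { next }                              # skip a dashed separator row
    NF == 0 { next }                             # skip blank lines
    !/Up|running/ { print $1 }                   # anything else is suspect
  '
}
```

Typical use would be `docker-compose ps | services_not_up`. A container reported as `Up (unhealthy)` still counts as up here, so check the Traefik dashboard as well.
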
### Troubleshooting Flowchart

```mermaid
flowchart TD
    Start([Issue Detected]) --> CheckStatus{Check Service<br/>Status}

    CheckStatus -->|All Running| CheckLogs[Check Logs]
    CheckStatus -->|Some Down| IdentifyService[Identify Failed<br/>Service]

    IdentifyService --> ServiceType{Service Type?}

    ServiceType -->|Infrastructure| InfraCheck[Infrastructure<br/>Check]
    ServiceType -->|Application| AppCheck[Application<br/>Check]

    InfraCheck --> DBCheck{Database?}
    InfraCheck --> RedisCheck{Redis?}
    InfraCheck --> TraefikCheck{Traefik?}

    DBCheck -->|Yes| DBSolution[Check DATABASE_URL<br/>Verify Neon connection<br/>Check IP whitelist]
    RedisCheck -->|Yes| RedisSolution[Restart Redis<br/>Check port mapping<br/>Verify connection string]
    TraefikCheck -->|Yes| TraefikSolution[Check labels<br/>Verify PathPrefix<br/>Check health status]

    AppCheck --> ErrorType{Error Type?}

    ErrorType -->|Config| ConfigFix[Check .env variables<br/>Run init-project.sh]
    ErrorType -->|Prisma| PrismaFix[Check migrations<br/>Regenerate client<br/>Reset database]
    ErrorType -->|Auth| AuthFix[Check token expiry<br/>Verify keys<br/>Sync Docker time]

    CheckLogs --> LogAnalysis{Log Shows<br/>Error?}
    LogAnalysis -->|Yes| ErrorType
    LogAnalysis -->|No| ConnCheck[Check Connectivity]

    ConnCheck --> GatewayTest{Gateway<br/>Reachable?}
    GatewayTest -->|No| TraefikCheck
    GatewayTest -->|Yes| ServiceTest{Service<br/>Reachable?}

    ServiceTest -->|No| AppCheck
    ServiceTest -->|Yes| Resolved([Issue Resolved])

    DBSolution --> Restart[Restart Services]
    RedisSolution --> Restart
    TraefikSolution --> Restart
    ConfigFix --> Restart
    PrismaFix --> Restart
    AuthFix --> Restart

    Restart --> Verify{Issue<br/>Fixed?}
    Verify -->|Yes| Resolved
    Verify -->|No| DeepDebug[Deep Debugging<br/>Required]

    DeepDebug --> ContainerShell[Access Container Shell]
    DeepDebug --> PrismaStudio[Use Prisma Studio]
    DeepDebug --> RedisInspect[Inspect Redis]
    DeepDebug --> APITest[Direct API Testing]

    style Start fill:#1a1a2e,color:#fff
    style Resolved fill:#0f3460,color:#fff
    style CheckStatus fill:#16213e,color:#fff
    style ServiceType fill:#16213e,color:#fff
    style ErrorType fill:#16213e,color:#fff
    style DBCheck fill:#16213e,color:#fff
    style RedisCheck fill:#16213e,color:#fff
    style TraefikCheck fill:#16213e,color:#fff
    style GatewayTest fill:#16213e,color:#fff
    style ServiceTest fill:#16213e,color:#fff
    style Verify fill:#16213e,color:#fff
    style LogAnalysis fill:#16213e,color:#fff
    style InfraCheck fill:#1a1a40,color:#fff
    style AppCheck fill:#1a1a40,color:#fff
    style DBSolution fill:#0f4c75,color:#fff
    style RedisSolution fill:#0f4c75,color:#fff
    style TraefikSolution fill:#0f4c75,color:#fff
    style ConfigFix fill:#0f4c75,color:#fff
    style PrismaFix fill:#0f4c75,color:#fff
    style AuthFix fill:#0f4c75,color:#fff
    style Restart fill:#3282b8,color:#fff
    style DeepDebug fill:#1b262c,color:#fff
    style IdentifyService fill:#1a1a40,color:#fff
    style CheckLogs fill:#1a1a40,color:#fff
    style ConnCheck fill:#1a1a40,color:#fff
    style ContainerShell fill:#0f3460,color:#fff
    style PrismaStudio fill:#0f3460,color:#fff
    style RedisInspect fill:#0f3460,color:#fff
    style APITest fill:#0f3460,color:#fff
```

---

## Infrastructure Issues

### Database (Neon/PostgreSQL)

**Problem**: `P1001: Can't reach database server` or `Connection timed out`

* **Cause 1**: Internet connectivity issues (Neon is cloud-based).
* **Cause 2**: Incorrect `DATABASE_URL` in `.env`.
* **Cause 3**: IP address blocked by Neon.

**Solution**:

1. Verify your internet connection: `ping neon.tech`.
2. Check `deployments/local/.env.local`. The URL should look like:
   `postgres://user:pass@ep-xyz.aws.neon.tech/neondb`
3. In the Neon Dashboard, go to Settings and either enable "Allow all IPs" or add your current IP.

**Problem**: `P1003: Database does not exist`

* **Reason**: The connection string points to the wrong database name.
* **Fix**: Check the end of your connection string (usually `/neondb`). If you are using a custom database name, ensure that database exists in Neon.

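
When checking `DATABASE_URL`, it helps to see the host and database name on their own without echoing the password. The following is a minimal sketch using only bash parameter expansion; the function names are hypothetical, and it assumes a URL of the form `scheme://user:pass@host[:port]/dbname[?params]` with any special characters in the credentials percent-encoded.

```bash
# Hypothetical helpers: extract parts of a PostgreSQL connection URL.
db_url_host() {
  local rest=${1#*://}          # drop the "postgres://" scheme
  rest=${rest#*@}               # drop "user:pass@"
  rest=${rest%%/*}              # drop "/dbname..."
  printf '%s\n' "${rest%%:*}"   # drop ":port" if present
}

db_url_name() {
  local rest=${1#*://}          # drop the scheme
  rest=${rest#*/}               # keep what follows the first "/" after the host
  printf '%s\n' "${rest%%[?]*}" # drop "?sslmode=..." style parameters
}
```

For the example URL above, `db_url_host` prints `ep-xyz.aws.neon.tech` and `db_url_name` prints `neondb`. The first is the host to test for reachability (`P1001`); the second is the name to compare against Neon (`P1003`).
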
### Redis

**Problem**: `Redis connection refused` or `ECONNREFUSED`

* **Cause**: The Redis container is not running or the port mapping is wrong.

**Solution**:

1. Check Redis status: `docker-compose ps redis`.
2. Restart Redis: `docker-compose restart redis`.
3. Check logs: `docker-compose logs redis`.
4. Use the right connection string for where the client runs:
   * **Inside Docker**: `redis:6379`
   * **From Host**: `localhost:6379`

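
To tell a wrong port mapping apart from a stopped container, a plain TCP probe from the host can help. This is a sketch, not a project script: `port_open` is a hypothetical name, it relies on bash's `/dev/tcp` pseudo-device and the coreutils `timeout` command, and it only checks that the port accepts connections (it sends nothing Redis-specific).

```bash
# Hypothetical helper: succeed if host:port accepts a TCP connection
# within 2 seconds. Requires bash (not plain sh) and `timeout`.
port_open() {
  local host=$1 port=$2
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, `port_open localhost 6379 && echo "Redis port reachable from host"`. If the container is `Up` but the probe fails, the port mapping in `docker-compose.yml` is the likely culprit.
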
### Traefik Gateway

**Problem**: `404 Not Found` when accessing APIs (e.g., `http://localhost/api/v1/auth`)

* **Cause**: The service is down or its labels are misconfigured.

**Solution**:

1. Check the Traefik Dashboard at http://localhost:8080.
   * Look for "HTTP Routers" and "Services".
   * If your service is missing, check the labels in `docker-compose.yml`.
2. Verify that the `PathPrefix` in the labels matches your request.
   ```yaml
   - "traefik.http.routers.iam.rule=PathPrefix(`/api/v1/auth`)"
   ```
3. Check whether the service passed its health checks (Health status in the dashboard).

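
As a rough illustration of step 2, the sketch below treats `PathPrefix` as a plain string-prefix comparison on the request path. That is a simplification (rules that combine `Host` or other matchers are not modeled), and `path_has_prefix` is a hypothetical helper, not a Traefik command.

```bash
# Hypothetical helper: succeed when the request path starts with the prefix.
path_has_prefix() {
  local prefix=$1 path=$2
  [[ $path == "$prefix"* ]]
}

# A request under the configured prefix matches the router rule...
path_has_prefix /api/v1/auth /api/v1/auth/login && echo "rule matches"
# ...while a request outside it would fall through to a 404.
path_has_prefix /api/v1/auth /api/v1/users || echo "rule does not match"
```
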
**Problem**: `Bad Gateway` or `Gateway Timeout`

* **Cause**: The service is crashing or taking too long to respond.
* **Fix**: Check the logs of the affected service (e.g., `docker-compose logs iam-service`).

---

## Service Issues

### Service Fails to Start

**Symptom**: Container status is `Exited (1)` or `Restarting`.

**Debugging**:

1. Check logs immediately:
   ```bash
   docker-compose logs iam-service
   ```
2. **Common Error**: `Config validation error`
   * **Fix**: Check environment variables. Running `./scripts/setup/init-project.sh` ensures that `.env` exists.
3. **Common Error**: `PrismaClientInitializationError`
   * **Fix**: This usually indicates a database connectivity issue (see [Database (Neon/PostgreSQL)](#database-neonpostgresql)).

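
For a `Config validation error`, a quick check of which variables are missing or empty in the env file can narrow things down. The helper below is a sketch under assumptions: `missing_env_vars` is a hypothetical name, the file is assumed to use simple `NAME=value` lines, and apart from `DATABASE_URL` the variable names in the usage note are placeholders, not the project's actual configuration keys.

```bash
# Hypothetical helper: print each requested variable that is absent
# or has an empty value in the given env file.
missing_env_vars() {
  local file=$1 name
  shift
  for name in "$@"; do
    # Accept an optional leading "export"; require at least one character after "=".
    grep -Eq "^[[:space:]]*(export[[:space:]]+)?${name}=.+" "$file" \
      || printf '%s\n' "$name"
  done
}
```

For example, `missing_env_vars deployments/local/.env.local DATABASE_URL REDIS_URL` prints the names that still need a value and prints nothing when all are set.
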
### Prisma/Database Errors

**Error**: `P2025: Record to update not found`

* **Fix**: This is an application logic error: the record targeted by the operation does not exist. Ensure the ID exists before updating.

**Error**: `P2002: Unique constraint failed`

* **Fix**: You are trying to insert duplicate data (e.g., the same email). Check for an existing record first or handle the conflict.

**Error**: `Migration failed`

* **Fix**:
  1. As a last resort, delete the `prisma/migrations` folder (only in dev!).
  2. Reset the database: `pnpm prisma migrate reset`. This drops all data.
  3. Regenerate the client: `pnpm prisma generate`.

### Authentication Errors

**Problem**: `401 Unauthorized` despite a seemingly valid token

* **Cause 1**: The token has expired.
* **Cause 2**: Public key mismatch (the service can't verify a token signed by IAM).
* **Cause 3**: Clock skew (Docker time vs. host time).

**Solution**:

1. Check the server logs for JWT verification errors.
2. Restart services to refresh keys.
3. Sync Docker time: restart Docker Desktop.

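
To rule out Cause 1 quickly, the token's `exp` claim can be read without any extra tooling. The sketch below assumes the token is a standard three-segment JWT whose payload carries a numeric `exp` claim; the function names are hypothetical, and nothing here verifies the signature, so it cannot detect a key mismatch.

```bash
# Hypothetical helpers: inspect a JWT's expiry. No signature verification.

# Print the decoded JSON payload (second dot-separated segment).
jwt_payload() {
  local seg=${1#*.}                          # drop the header segment
  seg=${seg%%.*}                             # drop the signature segment
  seg=$(printf '%s' "$seg" | tr '_-' '/+')   # base64url -> base64
  case $(( ${#seg} % 4 )) in                 # restore stripped padding
    2) seg+='==' ;;
    3) seg+='=' ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Print the numeric "exp" claim (seconds since the Unix epoch), if present.
jwt_exp() {
  jwt_payload "$1" | sed -n 's/.*"exp"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Succeed if exp is at or before "now" (second argument, default: current time).
# A token without an exp claim is reported as not expired.
jwt_is_expired() {
  local exp now=${2:-$(date +%s)}
  exp=$(jwt_exp "$1")
  [ -n "$exp" ] && [ "$now" -ge "$exp" ]
}

# Usage: jwt_is_expired "$TOKEN" && echo "token expired, log in again"
```
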
---
## Debugging Tools

### 1. Accessing Container Shell

To inspect files or run commands inside a running container:

```bash
docker-compose exec iam-service sh
# If the image ships bash, use /bin/bash instead of sh
```

### 2. Inspecting Database (via Prisma Studio)

Use Prisma Studio to view and edit data visually:

```bash
pnpm --filter @goodgo/iam-service prisma studio
# Opens http://localhost:5555
```

### 3. Inspecting Redis

```bash
docker-compose exec redis redis-cli
> PING
PONG
> KEYS *
1) "user:123:session"
```

### 4. Direct API Testing

Use `curl` or Postman.

```bash
# Health Check
curl -v http://localhost/api/v1/auth/health/live

# Login (example)
curl -X POST http://localhost/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email":"admin@example.com", "password":"password"}'
```

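
Right after a restart, the health endpoint may fail for a few seconds while the service boots. A small retry wrapper avoids mistaking that for a real outage. This is a generic sketch; `retry` is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary.

```bash
# Hypothetical helper: run a command up to N times, sleeping between attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  local attempts=$1 delay=$2 i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    "$@" && return 0                        # stop at the first success
    (( i < attempts )) && sleep "$delay"    # no sleep after the final attempt
  done
  return 1
}

# Example: poll the liveness endpoint for up to about 20 seconds.
#   retry 10 2 curl -fsS http://localhost/api/v1/auth/health/live
```
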
---
## FAQ

**Q: Why is my change not reflecting?**

A: If you changed `.env` or `docker-compose.yml`, you must recreate the containers:

```bash
docker-compose down && docker-compose up -d
```

If you changed code, hot-reloading (nodemon) should pick it up. If it does not, restart the container.

**Q: How do I reset everything?**

A: Be careful, this deletes all data!

```bash
docker-compose down -v
# -v removes volumes (Redis data, etc.)
```

**Q: My computer is slow when running everything.**

A: Docker consumes a significant amount of RAM.

1. Stop unused services (e.g., `future-service`).
2. Increase Docker resource limits in Docker Desktop settings.

---

## Quick Tips

### Debugging Shortcuts

```bash
# List services that are not Up (header lines are printed too)
docker-compose ps | grep -v "Up"

# Tail logs for all services
docker-compose logs -f --tail=50

# Restart a specific service without rebuilding
docker-compose restart iam-service

# Rebuild and restart a service
docker-compose up -d --build iam-service

# Check resource usage
docker stats --no-stream

# Clean up unused resources (also deletes unused images and volumes!)
docker system prune -a --volumes
```

### Common Error Patterns

| Error Pattern | Likely Cause | Quick Fix |
|--------------|--------------|-----------|
| `ECONNREFUSED` | Service not running | `docker-compose restart <service>` |
| `P1001` | Database unreachable | Check `DATABASE_URL` and internet |
| `P2002` | Duplicate entry | Check unique constraints |
| `401 Unauthorized` | Token expired/invalid | Refresh token or re-login |
| `404 Not Found` | Wrong route/service down | Check Traefik dashboard |
| `502 Bad Gateway` | Service crashed | Check service logs |
| `Config validation error` | Missing env vars | Run `init-project.sh` |

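
The table can also be expressed as a lookup for use in scripts. The function below is only a sketch: `suggest_fix` is a hypothetical name, and the matching is deliberately naive substring matching, so a bare `401` or `404` inside an unrelated number would trigger a false hit.

```bash
# Hypothetical helper: map a log line to the quick fix from the table above.
# Returns non-zero when no known pattern is found.
suggest_fix() {
  case $1 in
    *ECONNREFUSED*)               echo 'Restart the service: docker-compose restart <service>' ;;
    *P1001*)                      echo 'Check DATABASE_URL and internet connectivity' ;;
    *P2002*)                      echo 'Check unique constraints' ;;
    *401*|*Unauthorized*)         echo 'Refresh the token or log in again' ;;
    *404*)                        echo 'Check the Traefik dashboard' ;;
    *502*|*'Bad Gateway'*)        echo 'Check the service logs' ;;
    *'Config validation error'*)  echo 'Run init-project.sh' ;;
    *)                            return 1 ;;
  esac
}
```

It could be fed individual lines, for instance `suggest_fix "$(docker-compose logs --tail=1 iam-service)"`.
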
### Log Analysis Tips

**What to look for in logs:**

- `Server listening on port XXXX` = Service started successfully
- `Warning:` = Non-critical issues
- `Error:` = Critical issues requiring attention
- `Trace:` = Detailed execution flow

**Useful grep patterns:**

```bash
# Find all errors
docker-compose logs | grep -i error

# Find errors for a specific service
docker-compose logs iam-service | grep -iE "error|failed"

# Find database connection issues
docker-compose logs | grep -iE "prisma|database|p1001|p1003"

# Find auth issues
docker-compose logs | grep -iE "unauthorized|401|jwt|token"
```

### Resource Management

**Recommended Docker Resources:**

- **RAM**: Minimum 4 GB, recommended 8 GB
- **CPU**: Minimum 2 cores, recommended 4 cores
- **Disk**: Minimum 10 GB free space

**Check resource usage:**

```bash
# Overall system
docker system df

# Per container
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
```

**Cleanup commands:**

```bash
# Remove stopped containers
docker container prune

# Remove unused images
docker image prune -a

# Remove unused volumes (deletes data!)
docker volume prune

# Nuclear option (removes everything unused, including volumes!)
docker system prune -a --volumes
```

### Best Practices

1. **Always check logs first** before making changes
2. **Use the Traefik Dashboard** (http://localhost:8080) to verify routing
3. **Keep `.env.local` updated** with correct credentials
4. **Don't delete volumes** unless you want to lose data
5. **Restart Docker Desktop** if you experience unexplained networking issues
6. **Run `docker-compose down && docker-compose up -d`** after `.env` changes
7. **Keep the services you're actively developing running**
8. **Stop services** you're not using to save resources

### Visual Indicators

When reading logs, look for these patterns:

- `[INFO]` = Normal operation
- `[WARN]` = Something to watch
- `[ERROR]` = Needs immediate attention
- `[DEBUG]` = Detailed information
- `[TRACE]` = Very detailed execution flow

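
A per-level count gives a quick sense of how noisy a service is. The sketch below assumes log lines carry the bracketed level tags listed above; `summarize_levels` is a hypothetical helper that reads log text on stdin and counts lines, not individual occurrences.

```bash
# Hypothetical helper: count log lines per bracketed level tag.
summarize_levels() {
  local logs level
  logs=$(cat)                                  # buffer stdin so it can be scanned repeatedly
  for level in ERROR WARN INFO DEBUG TRACE; do
    # -F treats "[ERROR]" as a literal string; -c prints the matching line count
    printf '%s=%s\n' "$level" "$(printf '%s\n' "$logs" | grep -c -F "[$level]")"
  done
}
```

For example, `docker-compose logs --tail=500 iam-service | summarize_levels` prints one `LEVEL=count` line per level.
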
### Related Resources

- [Local Deployment Guide](./local-deployment.md) - Setup instructions
- [Development Guide](./development.md) - Development workflow
- [Kubernetes Local Guide](./kubernetes-local.md) - K8s troubleshooting
- [Neon Database Guide](./neon-database.md) - Database management