docs(architecture): Add Multipart Upload architecture and endpoints for large files

- Introduced a new section detailing the Multipart Upload architecture for files larger than 100MB, including a comparison of upload methods.
- Documented the Multipart Upload flow with a sequence diagram illustrating the process from initiation to completion.
- Listed the relevant API endpoints for Multipart Upload, including initiation, part uploads, completion, and progress checking.
- Added a database schema section for tracking multipart uploads and their parts, enhancing clarity on data management.
Ho Ngoc Hai
2026-01-13 23:41:47 +07:00
parent 3756fe6e35
commit 517bea32e7
2 changed files with 194 additions and 0 deletions

@@ -400,6 +400,86 @@ WHERE id = '{file-id}';
| **ConfirmUploadCommandHandler** | Handle confirm-upload with idempotency |
| **SignedUrlController** | `/sign-upload` and `/confirm-upload` endpoints |
## Multipart Upload Architecture (Large Files)
For files larger than 100MB, use Multipart Upload to upload in chunks.
### Upload Methods Comparison
| Aspect | Direct Upload | Multipart Upload |
|--------|---------------|------------------|
| **File size** | ≤ 100MB | > 100MB (up to 5GB+) |
| **Mechanism** | Single PUT request | Multiple part uploads |
| **Resume support** | No | Yes (per part) |
| **Progress tracking** | No | Yes (per part) |
| **Use case** | Small/medium files | Large files, video, archives |
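The size rule in the comparison can be sketched as a small helper. This is only an illustration: the function name and constant are hypothetical, "100MB" is assumed to mean 100 MiB, and a file of exactly 100MB is assumed to go through Direct Upload, since the text reserves Multipart Upload for files larger than 100MB.

```python
# Assumption: "100MB" is read as 100 MiB; the real service may use 100 * 10**6.
MULTIPART_THRESHOLD_BYTES = 100 * 1024 * 1024


def choose_upload_method(size_bytes: int) -> str:
    """Return "direct" or "multipart" for a file of the given size (hypothetical helper)."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    # Only files strictly larger than the threshold use Multipart Upload.
    return "multipart" if size_bytes > MULTIPART_THRESHOLD_BYTES else "direct"
```

A client could call this before deciding between `/sign-upload` and `/api/v1/files/multipart/initiate`.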
### Multipart Upload Flow
```mermaid
sequenceDiagram
participant Client
participant API as Storage Service
participant DB as PostgreSQL
participant MinIO
rect rgb(200, 230, 200)
Note over Client,API: 1. Initiate Upload
Client->>API: POST /api/v1/files/multipart/initiate
API->>DB: Create MultipartUpload record
API->>MinIO: InitiateMultipartUpload
MinIO-->>API: Provider UploadId
API-->>Client: {uploadId, objectKey, totalChunks}
end
rect rgb(200, 200, 230)
Note over Client,MinIO: 2. Upload Parts (repeat for each chunk)
loop For each part 1..N
Client->>API: POST /api/v1/files/multipart/upload-part
API->>MinIO: UploadPart(partNumber, data)
MinIO-->>API: ETag
API->>DB: Save part info (partNumber, etag)
API-->>Client: {success, etag}
end
end
rect rgb(230, 200, 200)
Note over Client,API: 3. Optional: Check Progress
Client->>API: GET /api/v1/files/multipart/{uploadId}
API->>DB: Get upload + parts
API-->>Client: {progress: 75%, uploadedChunks: 3/4}
end
rect rgb(230, 230, 200)
Note over Client,API: 4. Complete Upload
Client->>API: POST /api/v1/files/multipart/complete
API->>MinIO: CompleteMultipartUpload(parts[])
MinIO-->>API: OK
API->>DB: Create StorageFile, Update quota
API-->>Client: {fileId, objectKey}
end
```
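Step 2 of the flow loops over parts 1..N. A minimal sketch of how a client might derive those parts from the file size and chunk size is given here; `PartPlan` and `plan_parts` are hypothetical names, and the chunk size is whatever the service and client agree on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartPlan:
    part_number: int  # 1-based, matching "part 1..N" in the flow
    offset: int       # byte offset of the part within the file
    length: int       # number of bytes in the part


def plan_parts(total_size_bytes: int, chunk_size_bytes: int) -> list[PartPlan]:
    """Split a file of total_size_bytes into consecutive parts (hypothetical helper)."""
    if total_size_bytes <= 0 or chunk_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division avoids float rounding on very large files.
    total_chunks = (total_size_bytes + chunk_size_bytes - 1) // chunk_size_bytes
    parts = []
    for index in range(total_chunks):
        offset = index * chunk_size_bytes
        # The last part may be shorter than chunk_size_bytes.
        length = min(chunk_size_bytes, total_size_bytes - offset)
        parts.append(PartPlan(index + 1, offset, length))
    return parts
```

Each `PartPlan` would correspond to one call to `/api/v1/files/multipart/upload-part`, and the length of the returned list corresponds to `totalChunks` in the initiate response.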
### Multipart Upload Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/api/v1/files/multipart/initiate` | Start upload session |
| `POST` | `/api/v1/files/multipart/upload-part` | Upload one part (chunk) |
| `POST` | `/api/v1/files/multipart/complete` | Complete and merge parts |
| `DELETE` | `/api/v1/files/multipart/abort` | Cancel upload and clean up parts |
| `GET` | `/api/v1/files/multipart/{uploadId}` | Check progress |
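The progress endpoint reports values such as `progress: 75%` and `uploadedChunks: 3/4`. A minimal sketch of that calculation follows, assuming progress is measured in uploaded parts rather than bytes; the function name and the exact response keys are illustrative, not confirmed by the service.

```python
def upload_progress(total_chunks: int, uploaded_part_numbers: list[int]) -> dict:
    """Summarize progress from the part numbers recorded so far (hypothetical helper)."""
    if total_chunks <= 0:
        raise ValueError("total_chunks must be positive")
    # A set drops duplicate part numbers (e.g. a retried part); the range
    # filter ignores part numbers outside 1..total_chunks.
    uploaded = {n for n in uploaded_part_numbers if 1 <= n <= total_chunks}
    return {
        "progress": round(100 * len(uploaded) / total_chunks, 1),
        "uploadedChunks": len(uploaded),
        "totalChunks": total_chunks,
    }
```

Counting distinct part numbers keeps the percentage from exceeding 100 when a client re-uploads a part after a retry.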
### Multipart Upload Components
| Component | Purpose |
|-----------|---------|
| **InitiateMultipartUploadCommand** | Create upload session, generate object key |
| **UploadPartCommand** | Upload one part to the storage provider |
| **CompleteMultipartUploadCommand** | Merge parts, create StorageFile record |
| **AbortMultipartUploadCommand** | Clean up parts, mark as aborted |
| **GetMultipartUploadProgressQuery** | Get upload progress |
| **MultipartUploadController** | API endpoints for multipart |
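Before the completion step merges parts, the service needs every part from 1 to N along with its ETag, and S3-compatible providers expect the parts list in ascending part-number order. The check could look roughly like this; `build_completion_parts` is a hypothetical helper, not the actual handler code.

```python
def build_completion_parts(total_chunks: int, etags_by_part: dict[int, str]) -> list[tuple[int, str]]:
    """Return (part_number, etag) pairs in ascending order, or raise if any part is missing."""
    missing = [n for n in range(1, total_chunks + 1) if n not in etags_by_part]
    if missing:
        # Completing with gaps would produce a corrupt or rejected object.
        raise ValueError(f"missing parts: {missing}")
    # Iterating over 1..N yields ascending order regardless of upload order.
    return [(n, etags_by_part[n]) for n in range(1, total_chunks + 1)]
```

The resulting list is the kind of `parts[]` argument that `CompleteMultipartUpload` receives in the flow diagram.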
## Storage Provider Architecture

@@ -400,6 +400,120 @@ WHERE id = '{file-id}';
| **ConfirmUploadCommandHandler** | Handle confirm-upload with idempotency |
| **SignedUrlController** | `/sign-upload` and `/confirm-upload` endpoints |
## Multipart Upload Architecture (Large Files)
For files larger than 100MB, use Multipart Upload to upload in chunks.
### Upload Methods Comparison
| Aspect | Direct Upload | Multipart Upload |
|--------|---------------|------------------|
| **File size** | ≤ 100MB | > 100MB (up to 5GB+) |
| **Mechanism** | Single PUT request | Multiple part uploads |
| **Resume support** | No | Yes (per part) |
| **Progress tracking** | No | Yes (per part) |
| **Use case** | Small/medium files | Large files, video, archives |
### Multipart Upload Flow
```mermaid
sequenceDiagram
participant Client
participant API as Storage Service
participant DB as PostgreSQL
participant MinIO
rect rgb(200, 230, 200)
Note over Client,API: 1. Initiate Upload
Client->>API: POST /api/v1/files/multipart/initiate
API->>DB: Create MultipartUpload record
API->>MinIO: InitiateMultipartUpload
MinIO-->>API: Provider UploadId
API-->>Client: {uploadId, objectKey, totalChunks}
end
rect rgb(200, 200, 230)
Note over Client,MinIO: 2. Upload Parts (repeat for each chunk)
loop For each part 1..N
Client->>API: POST /api/v1/files/multipart/upload-part
API->>MinIO: UploadPart(partNumber, data)
MinIO-->>API: ETag
API->>DB: Save part info (partNumber, etag)
API-->>Client: {success, etag}
end
end
rect rgb(230, 200, 200)
Note over Client,API: 3. Optional: Check Progress
Client->>API: GET /api/v1/files/multipart/{uploadId}
API->>DB: Get upload + parts
API-->>Client: {progress: 75%, uploadedChunks: 3/4}
end
rect rgb(230, 230, 200)
Note over Client,API: 4. Complete Upload
Client->>API: POST /api/v1/files/multipart/complete
API->>MinIO: CompleteMultipartUpload(parts[])
MinIO-->>API: OK
API->>DB: Create StorageFile, Update quota
API-->>Client: {fileId, objectKey}
end
```
### Database Schema (Multipart)
```sql
-- Tracking multipart upload sessions
CREATE TABLE multipart_uploads (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id VARCHAR(255) NOT NULL,
    file_name VARCHAR(255) NOT NULL,
    content_type VARCHAR(100),
    total_size_bytes BIGINT NOT NULL,
    chunk_size_bytes INT NOT NULL,
    total_chunks INT NOT NULL,
    status VARCHAR(50) DEFAULT 'InProgress', -- InProgress, Completed, Aborted, Failed
    provider_upload_id VARCHAR(255),
    bucket_name VARCHAR(255) NOT NULL,
    object_key VARCHAR(500) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    completed_at TIMESTAMP,
    expires_at TIMESTAMP
);
-- PostgreSQL does not accept an inline INDEX clause in CREATE TABLE,
-- so the index is created with a separate statement.
CREATE INDEX idx_user_status ON multipart_uploads (user_id, status);
-- Tracking individual parts
CREATE TABLE multipart_upload_parts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    multipart_upload_id UUID REFERENCES multipart_uploads(id) ON DELETE CASCADE,
    part_number INT NOT NULL,
    etag VARCHAR(255) NOT NULL,
    size_bytes BIGINT NOT NULL,
    uploaded_at TIMESTAMP DEFAULT NOW(),
    UNIQUE (multipart_upload_id, part_number)
);
```
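The `status` column lists four values: InProgress, Completed, Aborted, Failed. One way to guard updates to that column is a small transition table. The sketch below assumes that InProgress is the only non-terminal state, which the schema does not state explicitly; `UploadStatus` and `can_transition` are hypothetical names.

```python
from enum import Enum


class UploadStatus(str, Enum):
    IN_PROGRESS = "InProgress"
    COMPLETED = "Completed"
    ABORTED = "Aborted"
    FAILED = "Failed"


# Assumption: only an in-progress upload may change state; the other
# three states are treated as terminal.
_ALLOWED = {
    UploadStatus.IN_PROGRESS: {
        UploadStatus.COMPLETED,
        UploadStatus.ABORTED,
        UploadStatus.FAILED,
    },
}


def can_transition(current: UploadStatus, target: UploadStatus) -> bool:
    """Return True if a status change from current to target is allowed."""
    return target in _ALLOWED.get(current, set())
```

Under this assumption, a second `complete` or `abort` call on a finished session would be rejected rather than modifying the stored record.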
### Multipart Upload Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/api/v1/files/multipart/initiate` | Start upload session |
| `POST` | `/api/v1/files/multipart/upload-part` | Upload one part (chunk) |
| `POST` | `/api/v1/files/multipart/complete` | Complete and merge parts |
| `DELETE` | `/api/v1/files/multipart/abort` | Cancel upload and clean up parts |
| `GET` | `/api/v1/files/multipart/{uploadId}` | Check progress |
### Multipart Upload Components
| Component | Purpose |
|-----------|---------|
| **InitiateMultipartUploadCommand** | Create upload session, generate object key |
| **UploadPartCommand** | Upload one part to the storage provider |
| **CompleteMultipartUploadCommand** | Merge parts, create StorageFile record |
| **AbortMultipartUploadCommand** | Clean up parts, mark as aborted |
| **GetMultipartUploadProgressQuery** | Get upload progress |
| **MultipartUploadController** | API endpoints for multipart |
## Storage Provider Architecture