A production-style remote file backup system built from scratch using C++17, demonstrating network programming, systems design, and cloud-native practices.
This project originated from a university network programming assignment. Rather than submitting a minimal solution, I treated it as an opportunity to learn and practice real-world backend engineering skills:
| Original Assignment | What I Built |
|---|---|
| Basic UDP file transfer | Dual-channel architecture: UDP for commands + TCP for reliable data transfer |
| Console-only interaction | RESTful HTTP API + Web management dashboard |
| Manual testing | Automated CI/CD with GitHub Actions + unit/integration tests |
| Local execution only | Docker containerization with one-command deployment |
| No observability | Structured logging + real-time metrics (request counts, latency percentiles, throughput) |
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                                  Client                                     │
│  ┌─────────────┐    ┌─────────────┐    ┌───────────────────────────────┐    │
│  │     CLI     │    │  Commands   │    │       File Data (TCP)         │    │
│  │  Interface  │───▶│    (UDP)    │───▶│  - Explicit handshake         │    │
│  └─────────────┘    └─────────────┘    │  - Timeout & retry            │    │
│                                        │  - Progress tracking          │    │
│                                        └───────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                                  Server                                     │
│  ┌─────────────┐    ┌─────────────┐    ┌───────────────────────────────┐    │
│  │   UDP Cmd   │    │   Command   │    │  HTTP Server (cpp-httplib)    │    │
│  │  Listener   │───▶│   Handler   │───▶│  - REST API (/api/files, etc.)│    │
│  │   :35887    │    └──────┬──────┘    │  - Web UI (static/index.html) │    │
│  └─────────────┘           │           │  - Health check (/healthz)    │    │
│                            ▼           │  - Metrics (/api/metrics)     │    │
│                     ┌─────────────┐    └───────────────────────────────┘    │
│                     │   Metrics   │◀── Request counts, latency (p50/p95),   │
│                     │  Collector  │    bytes transferred, error rates       │
│                     └──────┬──────┘                                         │
│                            │                                                │
│                            ▼                                                │
│                     ┌─────────────┐                                         │
│                     │   Backup    │  Validated filenames, atomic writes     │
│                     │   Storage   │                                         │
│                     └─────────────┘                                         │
└─────────────────────────────────────────────────────────────────────────────┘
```
| Decision | Rationale |
|---|---|
| UDP for commands, TCP for data | Commands are small and latency-sensitive; file transfers need reliability and flow control |
| Explicit handshake before transfer | Prevents race conditions in rename/remove operations (a bug I fixed from the original lab) |
| Singleton Metrics collector | Thread-safe, lock-free counters for high-frequency operations; mutex-protected for percentile sampling |
| Filename validation layer | Defense against path traversal (../), reserved names, and injection attacks |
| Separation of HTTP and UDP interfaces | HTTP for human/dashboard access; UDP for programmatic client access |
- Dual-protocol design: UDP command channel + dynamic TCP data channel
- Non-trivial protocol: Custom binary message format (`CmdMsg`, `DataMsg`) with explicit field packing
- Error handling: Timeout, retry, and graceful degradation
```cpp
// Example: explicit handshake to prevent race conditions.
// The server announces readiness, then blocks until the client
// confirms before proceeding with the rename.
sendto(sock, &ackMsg, sizeof(ackMsg), 0,
       reinterpret_cast<const sockaddr*>(&clientAddr), addrLen);
recvfrom(sock, &response, sizeof(response), 0, nullptr, nullptr); // wait for client confirmation
```

- Structured logging with spdlog (file + console, configurable levels)
- Real-time metrics collection:
  - Request counts by command type
  - Error counts by error code
  - Latency tracking with percentile calculation (p50, p95, max)
  - Throughput (bytes received/sent)
```cpp
// RAII-based latency measurement
auto timer = Metrics::instance().startTimer("http_upload");
// ... operation executes ...
// Timer automatically records latency on scope exit
```

- Docker multi-stage build: Minimized image size, reproducible builds
- Docker Compose: One-command deployment with health checks
- GitHub Actions CI: Automated build + test on Linux and Windows
```yaml
# Health check ensures the container is truly ready
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8080/healthz"]
  interval: 10s
  timeout: 5s
  retries: 3
```

| Threat | Mitigation |
|---|---|
| Path traversal (`../../../etc/passwd`) | Strict filename validation; reject any path separators |
| Reserved filename injection (Windows) | Block CON, PRN, NUL, etc. |
| Oversized uploads | Configurable `maxUploadSize` limit |
| Malformed requests | Input validation at protocol and HTTP layers |
```bash
cd reliable-remote-backup-system
docker compose up -d

# Open Web UI
#   http://localhost:8080

# Test the API
curl http://localhost:8080/healthz
curl http://localhost:8080/api/files
```

To build and run locally without Docker:

```bash
cd reliable-remote-backup-system
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build . --parallel

# Run tests
ctest --output-on-failure

# Start server
./backup-server -port 35887 -http 8080
```

| Endpoint | Method | Description |
|---|---|---|
| `/healthz` | GET | Health check (for container orchestration) |
| `/api/files` | GET | List all backed-up files |
| `/api/upload` | POST | Upload a file (multipart/form-data) |
| `/api/files/{name}` | DELETE | Delete a file |
| `/api/rename` | POST | Rename a file (`{"oldName": "a", "newName": "b"}`) |
| `/api/metrics` | GET | Real-time performance metrics |
Example: Metrics Response
```json
{
  "uptime_seconds": 3600,
  "requests": {
    "ls": 42,
    "send": 15,
    "http_upload": 8
  },
  "errors": {
    "invalid_filename": 2,
    "file_not_found": 1
  },
  "transfer": {
    "bytes_received": 10485760,
    "send_sessions": 15,
    "send_failures": 0
  },
  "latency": {
    "http_upload": {
      "avg_ms": 125.3,
      "max_ms": 450.0,
      "p95_ms": 280.0
    }
  }
}
```

| Layer | Tool | Coverage |
|---|---|---|
| Unit tests | GoogleTest | Filename validation, error codes, metrics logic |
| Integration tests | Bash + curl | HTTP API end-to-end flows |
| CI | GitHub Actions | Linux (Ubuntu 22.04) + Windows builds |
```bash
# Run integration tests
docker compose up -d
bash scripts/integration_test.sh
```

- C++17 + CMake build system
- Structured logging (spdlog)
- Unit testing (GoogleTest)
- GitHub Actions CI/CD
- Security: filename validation
- HTTP REST API
- Web file management UI
- Metrics collection (requests, errors, latency, throughput)
- Docker Compose deployment
- Integration test suite
- Prometheus metrics export (`/metrics`)
- Request ID tracing for log correlation
- Connection pooling / thread pool for concurrent transfers
- Chaos testing with `tc netem` (network delay/loss simulation)
```
reliable-remote-backup-system/
├── include/              # Header files
│   ├── client/           # Client interfaces
│   ├── common/           # Shared utilities (metrics, logging, file ops)
│   ├── protocol/         # Wire protocol definitions
│   └── server/           # Server components (HTTP, command handler)
├── src/                  # Implementation
├── static/               # Web UI (index.html)
├── scripts/              # Build and test scripts
├── tests/                # Unit tests
├── .github/workflows/    # CI configuration
├── Dockerfile            # Server container
├── docker-compose.yml    # Orchestration config
└── docs/                 # Additional documentation
    ├── DOCKER_QUICKSTART.md
    └── LOCAL_RUN_GUIDE.md
```
Through this project, I gained hands-on experience with:
- Low-level network programming: Socket APIs, protocol design, handling partial reads/writes
- Systems thinking: Trade-offs between UDP/TCP, designing for failure modes
- Observability: Why metrics matter, and how to instrument code with minimal overhead
- DevOps practices: Docker, CI/CD, infrastructure-as-code mindset
- Security awareness: Input validation, defense in depth
MIT License β feel free to use this as a reference or starting point.
Author: Frankie Liu
Repository: Computer_Network_lab