A production-grade model serving platform focused on high throughput, low latency, and efficient resource utilization.
Client
  ↓
API Gateway / Router (Go) - Week 2
  ↓
Scheduler / Batcher (Go) - Week 3
  ↓
Model Worker Pool (Python) - Week 4
  ↓
GPU / CPU (mock or real)
- router/: Go-based HTTP/gRPC entry point. Handles admission control.
- scheduler/: Go-based core logic for dynamic batching and priority queuing.
- worker/: Python-based inference worker. Interfaces with ML models (PyTorch/mock).
- proto/: Protobuf definitions for internal service communication.
- deploy/: Docker and orchestration configurations.
- Go Setup:
  cd serving-platform
  go mod tidy
- Python Setup:
  cd serving-platform/worker
  pip install -r requirements.txt
- Generate Proto (Optional):
  protoc --go_out=. --go-grpc_out=. proto/serving.proto
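The contents of proto/serving.proto are not shown here; as a hypothetical sketch only, an inference service definition for this pipeline might look like the following (all message and field names are assumptions, and the actual file may differ):

```protobuf
// Hypothetical sketch -- the real proto/serving.proto may differ.
syntax = "proto3";

package serving;

// Inference carries a request from router through scheduler to worker.
service Inference {
  rpc Predict(PredictRequest) returns (PredictResponse);
}

message PredictRequest {
  string model = 1;    // model name to route to
  bytes input = 2;     // serialized input payload
  int32 priority = 3;  // consumed by the scheduler's priority queue
}

message PredictResponse {
  bytes output = 1;      // serialized model output
  int64 latency_ms = 2;  // server-side processing time
}
```

Note that the protoc command above also requires the protoc-gen-go and protoc-gen-go-grpc plugins to be installed and on PATH.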