xiangyi-li-git/ML-serving-platform


Model Serving Platform (Serving Infra)

A production-grade model serving platform focused on high throughput, low latency, and efficient resource utilization.

πŸ— Project Architecture

Client
  ↓
API Gateway / Router (Go) - Week 2
  ↓
Scheduler / Batcher (Go) - Week 3
  ↓
Model Worker Pool (Python) - Week 4
  ↓
GPU / CPU (mock or real)

πŸ“‚ Project Structure

  • router/: Go-based HTTP/gRPC entry point. Handles admission control.
  • scheduler/: Go-based core logic for dynamic batching and priority queuing.
  • worker/: Python-based inference worker. Interfaces with ML models (PyTorch/mock).
  • proto/: Protobuf definitions for internal service communication.
  • deploy/: Docker and orchestration configurations.

πŸš€ Getting Started (Week 1)

  1. Go Setup:

    cd serving-platform
    go mod tidy
  2. Python Setup:

    cd serving-platform/worker
    pip install -r requirements.txt
  3. Generate Proto (Optional):

    protoc --go_out=. --go-grpc_out=. proto/serving.proto

About

Model Serving Platform using Go & Python
