curl -X POST http://NETWORK_NODE_HOST:9200/admin/v1/nodes \
-H "Content-Type: application/json" \
-d '{
"id": "node1",
"host": "NETWORK_NODE_HOST",
"inference_port": 5000,
"poc_port": 8080,
"max_concurrent": 500,
"models": {
"Qwen/Qwen3-235B-A22B-Instruct-2507-FP8": {
"args": [
"--max-model-len",
"240000",
"--tensor-parallel-size",
"4"
]
}
}
}'For a detailed deployment guide, see README_RU.md (in Russian).