Conversation

@samherring99
Collaborator

@samherring99 samherring99 commented Jan 14, 2026

This PR provides the following things:

  • The implementation of the Gateway node type in architectures/inference-only/inference-node/src/bin/gateway-node.rs, including the handle_inference method, which forwards inference requests over the P2P network using iroh's bidirectional streams. The Gateway node discovers available inference nodes through iroh gossip, and the gateway-node binary is declared in architectures/inference-only/inference-node/Cargo.toml. (See the sketch after this list for the forwarding flow.)
  • The Gateway node also writes its endpoint ID to a temporary file for bootstrapping. This will eventually be handled differently, but is fine for local testing for now.
  • A P2P protocol handler for inference requests in shared/inference/src/protocol_handler.rs, which implements iroh's ProtocolHandler trait to accept incoming inference requests over a direct P2P connection.
  • Updates to shared/inference/src/protocol.rs to allow for OpenAI API-style /v1/chat/completions messages, plus some tests.
  • Refactoring of the Rust bridge at python/python/psyche/vllm/rust_bridge.py to use OpenAI API-style /v1/chat/completions messages. These changes are reflected in shared/inference/src/node.rs, shared/inference/src/vllm.rs, and shared/inference/src/protocol.rs, with some testing.
  • The inference node main code at architectures/inference-only/inference-node/src/main.rs can now read bootstrap peers from a given file and rebroadcasts availability over gossip every 30 seconds.
  • (MIGHT BE IMPORTANT ❗ ) Updates to shared/network/src/lib.rs and shared/network/src/router.rs to use an internal init_internal method, plus an init_with_custom_protocol method for supplying a custom protocol on initialization.
  • Adding axum and tower to the dependencies in Cargo.toml and architectures/inference-only/inference-node/Cargo.toml.
  • A test script in scripts/test-inference-e2e.sh to exercise the end-to-end inference flow.
  • Five new just commands added to the justfile: inference-node, gateway-node, inference-stack, test-inference, and test-inference-e2e.
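A rough sketch of the forwarding flow, for reviewers who want the shape without opening the diff. The request/response structs, field names, and the postcard encoding below are illustrative assumptions, not the exact types in shared/inference/src/protocol.rs; the iroh calls mirror what the gateway does (connect with the inference ALPN, open a bidirectional stream, write the request, read the response).

```rust
use anyhow::Result;
use iroh::{Endpoint, EndpointAddr};
use serde::{Deserialize, Serialize};

// ALPN used by the inference protocol handler (matches the logs below).
const INFERENCE_ALPN: &[u8] = b"/psyche/inference/1";

// Illustrative shapes only; the real types live in shared/inference/src/protocol.rs.
#[derive(Serialize, Deserialize)]
struct InferenceRequest {
    request_id: String,
    messages: Vec<(String, String)>, // (role, content) pairs, for illustration
    max_tokens: u32,
}

#[derive(Serialize, Deserialize)]
struct InferenceResponse {
    request_id: String,
    content: String,
    finish_reason: String,
}

// handle_inference-style forwarding: connect to the chosen inference node,
// open a bidi stream, send the serialized request, and read the full response.
async fn forward_inference(
    endpoint: &Endpoint,
    peer: EndpointAddr,
    req: &InferenceRequest,
) -> Result<InferenceResponse> {
    let conn = endpoint.connect(peer, INFERENCE_ALPN).await?;
    let (mut send, mut recv) = conn.open_bi().await?;

    // postcard is an assumption; use whatever encoding protocol.rs actually defines
    send.write_all(&postcard::to_stdvec(req)?).await?;
    send.finish()?;

    // bound the response size so a misbehaving peer can't make us allocate forever
    let bytes = recv.read_to_end(1024 * 1024).await?;
    Ok(postcard::from_bytes(&bytes)?)
}
```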

Testing (requires a venv with vLLM installed as of now):

source .venv/bin/activate

cargo build --bin psyche-inference-node && cargo build --bin gateway-node --features gateway

just inference-stack NousResearch/Hermes-4-14B

Output

*Inference Node*
INFO psyche_inference_node: Starting Psyche Inference Node                                                                                 
INFO psyche_inference_node: Model: NousResearch/Hermes-4-14B                                                                               
INFO psyche_inference_node: Tensor Parallel Size: 1                                                                                        
INFO psyche_inference_node: GPU Memory Utilization: 0.9                                                                                    
INFO psyche_inference_node: Discovery mode: N0                                                                                             
INFO psyche_inference_node: Relay kind: N0                                                                                                 
INFO psyche_inference_node: Capabilities: []                                                                                               
INFO psyche_inference_node: Reading bootstrap peers from PSYCHE_GATEWAY_BOOTSTRAP_FILE: "/tmp/psyche-gateway-peer.json"                    
INFO psyche_inference_node: Loaded 1 gateway endpoint(s) from file                                                                         
INFO psyche_inference_node: Initializing Python interpreter...                                                                             
INFO psyche_inference_node: Python interpreter initialized                                                                                 
INFO psyche_inference_node: Initializing vLLM engine...                                                                                    
INFO psyche_inference::node: Initializing inference node with model: NousResearch/Hermes-4-14B 
*vLLM startup*
INFO psyche_inference::node: vLLM engine initialized successfully: inference_node_<node_ID>                                                                  
INFO psyche_inference_node: Initializing P2P network...                                                                                    
INFO psyche_inference_node: Registering inference protocol handler...                                                                      
DEBUG psyche_network: Using relay servers: Default iroh relay (production) servers
INFO relay-actor: iroh::magicsock::transports::relay::actor: home is now relay https://use1-1.relay.n0.iroh-canary.iroh.link./, was None
INFO psyche_network: Our endpoint ID: <endpoint_ID>
INFO psyche_network: Connected!
INFO psyche_inference_node: P2P network initialized
INFO psyche_inference_node:   Endpoint ID: <gateway_endpoint_ID>
INFO psyche_inference_node: Protocol handler registered
INFO gossip{me=031737a1f6}:connect{me=031737a1f6 alpn="/iroh-gossip/1" remote=b927ac690a}:discovery{me=031737a1f6 endpoint=b927ac690a}:add_endpoint_addr:add_endpoint_addr{endpoint=<endpoint>}: iroh::magicsock::endpoint_map: inserting new endpoint in EndpointMap endpoint=<endpoint> relay_url=Some(RelayUrl("https://use1-1.relay.n0.iroh-canary.iroh.link./")) source=dns
INFO gossip{me=031737a1f6}:connect{me=031737a1f6 alpn="/iroh-gossip/1" remote=<endpoint>}:prepare_send:get_send_addrs{endpoint=<endpoint>}: iroh::magicsock::endpoint_map::endpoint_state: new connection type typ=relay(https://use1-1.relay.n0.iroh-canary.iroh.link./)
DEBUG psyche_network: broadcasted gossip message with hash <hash>: NodeAvailable { model_name: "NousResearch/Hermes-4-14B", checkpoint_id: None, capabilities: [] } message_hash=<hash>
INFO psyche_inference_node: Broadcasted availability to network
INFO psyche_inference_node: Inference node ready! Listening for requests...

INFO router.accept{me=031737a1f6 alpn="/psyche/inference/1" remote=<peer_ID_gateway>}: psyche_inference::protocol_handler: Received inference request <request_ID> from <peer_ID_gateway>                                                                                                                
INFO router.accept{me=<peer_ID> alpn="/psyche/inference/1" remote=<peer_ID_gateway>}: psyche_inference::protocol_handler: Processing inference request: <request_ID>
INFO rust_bridge.py:95: Adding request with sampling_params: {'temperature': 1.0, 'top_p': 1.0, 'max_tokens': 250, 'stop_token_ids': [151645], 'stop': ['<|im_end|>']}
INFO rust_bridge.py:106: Final output has 1 completions
INFO rust_bridge.py:108: Final generated text: "Hello! How can I assist you today? I'm Hermes, a large language model created by Nous Research. I'm happy to converse with you and try to help across a broad range of topics, to the best of my abilities. Please provide more context if you have any specific questions or requests for me."
INFO rust_bridge.py:109: Final finish reason: stop


*Gateway Node*
INFO gateway_node: Starting gateway node
INFO gateway_node:   HTTP API: http://127.0.0.1:8000
INFO gateway_node: No bootstrap peers configured (gateway will be a bootstrap node)
INFO gateway_node: Initializing P2P network...
DEBUG psyche_network: Using relay servers: Default iroh relay (production) servers
INFO relay-actor: iroh::magicsock::transports::relay::actor: home is now relay https://use1-1.relay.n0.iroh-canary.iroh.link./, was None
INFO psyche_network: Our endpoint ID: <endpoint_ID>
INFO psyche_network: Connected!
INFO gateway_node: P2P network initialized
INFO gateway_node:   Endpoint ID: <endpoint_ID>
INFO gateway_node: Found PSYCHE_GATEWAY_ENDPOINT_FILE env var: /tmp/psyche-gateway-peer.json
INFO gateway_node: Wrote gateway endpoint to "/tmp/psyche-gateway-peer.json"
INFO gateway_node: Other nodes can bootstrap using this file
INFO gateway_node: Waiting for gossip mesh to stabilize...
INFO gateway_node: Gossip mesh should be ready
INFO gateway_node: Gateway ready! Listening on http://127.0.0.1:8000
INFO gateway_node: Discovering inference nodes...
INFO gateway_node: HTTP server listening on 127.0.0.1:8000

INFO gateway_node: Discovered inference node!
INFO gateway_node:   Peer ID: <peer_ID>
INFO gateway_node:   Model: NousResearch/Hermes-4-14B
INFO gateway_node:   Checkpoint: None
INFO gateway_node:   Capabilities: []
INFO gateway_node: Routing request to node: 031737a1f6 (model: NousResearch/Hermes-4-14B)
INFO gateway_node: Sent inference request <request_id> to network
INFO gateway_node: Sending inference request <request_id> via direct P2P
INFO gateway_node: Connecting to peer <peer_ID> with ALPN Ok("/psyche/inference/1")
INFO gateway_node: Connected, opening bidirectional stream
INFO gateway_node: Sending 77 bytes
INFO gateway_node: Finishing send stream
INFO gateway_node: Reading response...
INFO gateway_node: Received 754 bytes, deserializing
INFO gateway_node: Successfully received inference response
INFO gateway_node: Received inference response for <request_id>

*Test window*
curl -X POST http://127.0.0.1:8000/v1/chat/completions -H "Content-Type: application/json" -d '{"messages": [{"role": "user", "content": "Hello, world!"}], "max_tokens": 250}'

{"id":"chatcmpl-<id>","object":"chat.completion","created":<timestamp>,"model":"NousResearch/Hermes-4-14B","choices":[{"index":0,"message":{"role":"assistant","content":"Hello! How can I assist you today? I'm Hermes, a large language model created by Nous Research. I'm happy to converse with you and try to help across a broad range of topics, to the best of my abilities. Please provide more context if you have any specific questions or requests for me."},"finish_reason":"stop"}]}

The above commands start one gateway node and one inference node: the gateway node writes its endpoint ID to a temp file, the inference node reads that file and bootstraps from it, and the gateway then serves localhost:8000/v1/chat/completions, forwarding incoming requests to the inference node.

As always, any questions, comments, or concerns with how this is set up are welcome 😄. Streaming, checkpoint updating, and load balancing are all on the roadmap for this effort, as is a discussion of how to correctly bootstrap from our gateway nodes.

@samherring99 samherring99 force-pushed the inference_networking_gateway branch 5 times, most recently from 09b4132 to ad343a2 Compare January 14, 2026 01:36
@samherring99 samherring99 force-pushed the inference_networking_gateway branch 5 times, most recently from d1f1169 to 2dbbe93 Compare January 16, 2026 17:53
Contributor
@pefontana pefontana left a comment

Nice @samherring99!
I noticed two things that may be easy to change:

  1. The just command doesn't work, because the tmux session doesn't inherit the nix develop .#dev-python shell or the Python venv.
    To run it I have to run the following in two different terminals:
nix develop .#dev-python
source .venv/bin/activate
 PSYCHE_GATEWAY_BOOTSTRAP_FILE=psyche-gateway-peer.json LIBTORCH_USE_PYTORCH=1 RUST_LOG=info cargo run --bin   psyche-inference-node -- --model-name NousResearch/Hermes-4-14B --discovery-mode n0 --relay-kind n0


nix develop .#dev-python
source .venv/bin/activate
PSYCHE_GATEWAY_ENDPOINT_FILE=psyche-gateway-peer.json RUST_LOG=info cargo run --bin gateway-node --features gateway -- --discovery-mode n0 --relay-kind n0
  2. With the command
PSYCHE_GATEWAY_BOOTSTRAP_FILE=psyche-gateway-peer.json LIBTORCH_USE_PYTORCH=1 RUST_LOG=info cargo run --bin   psyche-inference-node -- --model-name NousResearch/Hermes-4-14B --discovery-mode n0 --relay-kind n0

I am getting a NumPy error:
ImportError: Numba needs NumPy 2.2 or less. Got NumPy 2.3
I tried to install NumPy 2.2, but the PATH still points to the Nix-provided version. Maybe we can update Numba to fix this and make it easier to run?

@samherring99
Collaborator Author

> Nice @samherring99 ! I noticed two thing that may be easy to change:
>
>   1. the just command doesnt work, because the tmux session dont inherit the nix develop .#dev-python neither the python venv
>     To run it I have to run in two diferent terminal:
> nix develop .#dev-python
> source .venv/bin/activate
>  PSYCHE_GATEWAY_BOOTSTRAP_FILE=psyche-gateway-peer.json LIBTORCH_USE_PYTORCH=1 RUST_LOG=info cargo run --bin   psyche-inference-node -- --model-name NousResearch/Hermes-4-14B --discovery-mode n0 --relay-kind n0
>
>
> nix develop .#dev-python
> source .venv/bin/activate
> PSYCHE_GATEWAY_ENDPOINT_FILE=psyche-gateway-peer.json RUST_LOG=info cargo run --bin gateway-node --features gateway -- --discovery-mode n0 --relay-kind n0
>   1. With the command
> PSYCHE_GATEWAY_BOOTSTRAP_FILE=psyche-gateway-peer.json LIBTORCH_USE_PYTORCH=1 RUST_LOG=info cargo run --bin   psyche-inference-node -- --model-name NousResearch/Hermes-4-14B --discovery-mode n0 --relay-kind n0
>
> I am getting a NumPy error ImportError: Numba needs NumPy 2.2 or less. Got NumPy 2.3 I tried to install Numpy 2.2 but the Path is still set to the nix version one Maybe we can update Numba to fix this and make it easier to run?

Could you share the tmux errors you're seeing? It might come down to versioning issues between our setups, but the just command starts up tmux with nix develop .#dev-python and runs the cargo commands on my setup.

As for the NumPy errors, I think we will resolve this with vLLM included in the Nix packaging ;) 🤞 - but I will look into it regardless.

@samherring99 samherring99 force-pushed the inference_networking_gateway branch from 2dbbe93 to 937d280 Compare January 20, 2026 17:47
@pefontana
Contributor

> Could you share the tmux errors you're seeing? It might come down to versioning issues between our setups but the just command starts up tmux with nix develop .#dev-python and runs the cargo commands for my set up.
>
> As for the NumPy errors I think we will resolve this with vllm included in the nix packaging ;) 🤞 - but I will look into it regardless

Sure.
Here I run

nix develop .#dev-python

source .venv/bin/activate

pip install vllm

cargo build --bin psyche-inference-node && cargo build --bin gateway-node --features gateway

just inference-stack NousResearch/Hermes-4-14B

And the tmux outputs:
0. The gateway seems to work all right.

  1. I am having this error in the inference session, and I think it is because the tmux session doesn't inherit the nix develop .#dev-python shell:
PSYCHE_GATEWAY_BOOTSTRAP_FILE=/tmp/psyche-gateway-peer.json LIBTORCH_BYPASS_VERSION_CHECK=1 RUST_LOG=info,psyche_network=debug cargo run --bin psyche-inference-node -- --model-name NousResearch/Hermes-4-14B --discovery-mode n0 --relay-kind n0
   Compiling torch-sys v0.22.0 (https://github.com/jquesnelle/tch-rs.git?rev=11d1ca2ef6dbd3f1e5b0986fab0a90fbb6734496#11d1ca2e)
error: failed to run custom build command for `torch-sys v0.22.0 (https://github.com/jquesnelle/tch-rs.git?rev=11d1ca2ef6dbd3f1e5b0986fab0a90fbb6734496#11d1ca2e)`

Caused by:
  process didn't exit successfully: `/tmp/psyche/target/debug/build/torch-sys-5c7b6a5bfffdd2f5/build-script-build` (exit status: 1)
  --- stdout
  cargo:rerun-if-env-changed=LIBTORCH_USE_PYTORCH

  --- stderr
  Error: no cxx11 abi returned by python Output { status: ExitStatus(unix_wait_status(256)), stdout: "", stderr: "Traceback (most recent call last):\n  File \"<string>\", line 3, in <module>\n  File \"/nix/store/m6p1pwa6vk3gmh9q6b48mvgfm0jwiqb4-python3.12-torch-2.9.0/lib/python3.12/site-packages/torch/__init__.py\", line 427, in <module>\n    from torch._C import *  # noqa: F403\n    ^^^^^^^^^^^^^^^^^^^^^^\nImportError: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /nix/store/1a1z3dyygjj8rs690x71y50kc3l8xnx4-cuda12.8-nccl-2.28.7-1/lib/libnccl.so.2)\n" }

@pefontana
Contributor

@samherring99
Aside from that, I've been looking at the code and I think it's fine. If we can fix it in the future with the changes in nix, we can merge it.

@samherring99 samherring99 force-pushed the inference_networking_gateway branch from 937d280 to 0af54f3 Compare January 21, 2026 17:03
@samherring99
Collaborator Author

> > Could you share the tmux errors you're seeing? It might come down to versioning issues between our setups but the just command starts up tmux with nix develop .#dev-python and runs the cargo commands for my set up.
> > As for the NumPy errors I think we will resolve this with vllm included in the nix packaging ;) 🤞 - but I will look into it regardless
>
> Sure. Here I run
>
> nix develop .#dev-python
>
> source .venv/bin/activate
>
> pip install vllm
>
> cargo build --bin psyche-inference-node && cargo build --bin gateway-node --features gateway
>
> just inference-stack NousResearch/Hermes-4-14B
>
> And the tmux ouputs: 0. the gateways seems to work all right
>
>   1. I am having this error in the inference session and I think it is because the tmux session doesnt inherit the nix develop .#dev-python
> PSYCHE_GATEWAY_BOOTSTRAP_FILE=/tmp/psyche-gateway-peer.json LIBTORCH_BYPASS_VERSION_CHECK=1 RUST_LOG=info,psyche_network=debug cargo run --bin psyche-inference-node -- --model-name NousResearch/Hermes-4-14B --discovery-mode n0 --relay-kind n0
>    Compiling torch-sys v0.22.0 (https://github.com/jquesnelle/tch-rs.git?rev=11d1ca2ef6dbd3f1e5b0986fab0a90fbb6734496#11d1ca2e)
> error: failed to run custom build command for `torch-sys v0.22.0 (https://github.com/jquesnelle/tch-rs.git?rev=11d1ca2ef6dbd3f1e5b0986fab0a90fbb6734496#11d1ca2e)`
>
> Caused by:
>   process didn't exit successfully: `/tmp/psyche/target/debug/build/torch-sys-5c7b6a5bfffdd2f5/build-script-build` (exit status: 1)
>   --- stdout
>   cargo:rerun-if-env-changed=LIBTORCH_USE_PYTORCH
>
>   --- stderr
>   Error: no cxx11 abi returned by python Output { status: ExitStatus(unix_wait_status(256)), stdout: "", stderr: "Traceback (most recent call last):\n  File \"<string>\", line 3, in <module>\n  File \"/nix/store/m6p1pwa6vk3gmh9q6b48mvgfm0jwiqb4-python3.12-torch-2.9.0/lib/python3.12/site-packages/torch/__init__.py\", line 427, in <module>\n    from torch._C import *  # noqa: F403\n    ^^^^^^^^^^^^^^^^^^^^^^\nImportError: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /nix/store/1a1z3dyygjj8rs690x71y50kc3l8xnx4-cuda12.8-nccl-2.28.7-1/lib/libnccl.so.2)\n" }

FWIW this looks like a CUDA / NCCL error; I'm guessing this is also related to venv / torch / vLLM issues. Will tag @arilotter for confirmation / final review 🙂

@samherring99 samherring99 force-pushed the inference_networking_gateway branch from 0af54f3 to 958793a Compare January 21, 2026 19:21
Comment on lines 146 to 151
let nodes = state.available_nodes.read().await;
if nodes.is_empty() {
    return Err(AppError::NoNodesAvailable);
}

let node = nodes.values().next().unwrap();
Contributor

Instead of checking that the vec is not empty and then calling unwrap, I think we can do:

let nodes = state.available_nodes.read().await;
let node = nodes.values().next().ok_or(AppError::NoNodesAvailable)?;

It’s not really important since we’re unlikely to panic, but I think this is more idiomatic.

Collaborator Author

Sweet, I'll test this out and update.

let _ = tx.send(response).await;
}
}
Err(e) => {
Contributor

If something fails in the send_inference_request call, we’re not cleaning up the request_id from pending_requests. Is that handled somewhere else? Not sure whether it’s correct to remove it if something fails there.

Collaborator Author

Good callout, I'll ensure this is handled correctly and will update here.
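For concreteness, a minimal sketch of the cleanup being discussed, assuming pending_requests is a tokio-RwLock'd HashMap keyed by request_id (names here are illustrative, not the PR's exact types):

```rust
use std::{collections::HashMap, sync::Arc};
use tokio::sync::{oneshot, RwLock};

type Pending = Arc<RwLock<HashMap<String, oneshot::Sender<String>>>>;

// Register the request, and make sure the entry is removed again if forwarding
// fails, so the map cannot grow on the error path.
async fn route_request(
    pending: Pending,
    request_id: String,
    tx: oneshot::Sender<String>,
) -> anyhow::Result<()> {
    pending.write().await.insert(request_id.clone(), tx);

    if let Err(e) = send_inference_request(&request_id).await {
        // clean up the pending entry so a failed send doesn't leak it
        pending.write().await.remove(&request_id);
        return Err(e);
    }
    Ok(())
}

// Stand-in for the PR's actual send path.
async fn send_inference_request(_request_id: &str) -> anyhow::Result<()> {
    Ok(())
}
```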

peer_id.fmt_short()
);

tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
Contributor

Is this sleep necessary?

Collaborator Author
@samherring99 samherring99 Jan 21, 2026

This was a debug measure I forgot to remove 🙃 thanks for catching it

I lied, this is necessary: we need to give the bytes time to flush through the network before the connection is dropped, and we need to wait for the receiver to actually read all the data. I'll add a comment.

Contributor

Oh okay, maybe you can do connection.closed().await;? Not really sure though, I didn’t try it. I’m just trying to avoid future problems where the receiver takes more than 100 ms to read things and we end up having the same error, but it’s not as high priority 😅

Collaborator Author

I did think about this, will likely address it in a later PR if it becomes an issue.

info!("Capabilities: {:?}", capabilities);

// read bootstrap peers from multiple sources in priority order
let bootstrap_peers: Vec<EndpointAddr> =
Contributor

I think we have almost the same logic in the main.rs file of the crate. Can we extract it to an aux function and handle the differences between the implementations there?

Collaborator Author

Good point, I was being lazy about this; I'll move it to a new lib.rs file for the shared implementation.
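For what it's worth, a sketch of the shape such a helper could take, assuming the bootstrap file is a JSON list of EndpointAddr as the temp-file handoff implies (the exact format and error handling in the PR may differ):

```rust
use std::path::Path;

use anyhow::{Context, Result};
use iroh::EndpointAddr;

/// Read bootstrap peers from a JSON file, returning an empty list if the file
/// does not exist. Both main.rs and gateway-node.rs could call this.
pub fn read_bootstrap_peers(path: &Path) -> Result<Vec<EndpointAddr>> {
    if !path.exists() {
        return Ok(Vec::new());
    }
    let contents = std::fs::read_to_string(path)
        .with_context(|| format!("reading bootstrap file {}", path.display()))?;
    let peers: Vec<EndpointAddr> =
        serde_json::from_str(&contents).context("parsing bootstrap peers JSON")?;
    Ok(peers)
}
```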

pub request_id: String,
pub prompt: String,
pub messages: Vec<ChatMessage>,
#[serde(default = "default_max_tokens")]
Contributor

Not strictly related to this PR, but I think both protocol and gateway-node use the same default functions. Could they end up needing to differ at some point? Also, you could use the default value directly without going through the default functions.

Collaborator Author

I see, to reduce scope I'll probably tackle this in a later PR, if that's okay.

Contributor

Yeah, no worries
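For later reference, one way to keep the two in sync when that refactor happens; the constant name and value here are made up, and serde does need a function path (rather than a literal) for this kind of default:

```rust
use serde::Deserialize;

// One shared constant plus helper, so protocol and gateway-node can't drift apart.
pub const DEFAULT_MAX_TOKENS: u32 = 250;

fn default_max_tokens() -> u32 {
    DEFAULT_MAX_TOKENS
}

#[derive(Deserialize)]
pub struct ChatCompletionRequest {
    pub messages: Vec<serde_json::Value>,
    // serde's `default = "..."` attribute takes a function path, not a literal
    #[serde(default = "default_max_tokens")]
    pub max_tokens: u32,
}
```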

Comment on lines +152 to +130
let model_name = req.model.clone().unwrap_or_else(|| node.model_name.clone());
info!(
    "Routing request to node: {} (model: {})",
    node.peer_id.fmt_short(),
    node.model_name
);
Contributor

This is more of a question, but here we get an inference node from the list and only use its model_name. Then, in the run_gateway function, we select another node from the list as target_node, which is the one we actually route the request to. I might be misunderstanding something, but wouldn’t it be better to select a single node and route the request to that one?

Collaborator Author

Yeah, I did this because I wasn't passing the peer ID through the channel as part of an InferenceMessage type, but I'll make that change to include it so we select one node only and route the request there.
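For illustration, the shape might look something like this, with the chosen peer carried alongside the request through the channel (type and field names are made up):

```rust
use iroh::EndpointId;
use tokio::sync::mpsc;

// Illustrative message carrying the chosen target with the request,
// so the HTTP handler and run_gateway agree on one node.
struct InferenceMessage {
    target: EndpointId,
    request_id: String,
    model_name: String,
}

async fn route_once(
    nodes: &std::collections::HashMap<EndpointId, String>, // peer -> model_name
    request_id: String,
    tx: &mpsc::Sender<InferenceMessage>,
) -> anyhow::Result<()> {
    // pick exactly one node and reuse it for both logging and routing
    let (peer, model_name) = nodes
        .iter()
        .next()
        .ok_or_else(|| anyhow::anyhow!("no inference nodes available"))?;

    tx.send(InferenceMessage {
        target: *peer,
        request_id,
        model_name: model_name.clone(),
    })
    .await
    .map_err(|_| anyhow::anyhow!("gateway routing channel closed"))?;
    Ok(())
}
```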

@samherring99 samherring99 force-pushed the inference_networking_gateway branch 2 times, most recently from 36eea32 to 91248ac Compare January 21, 2026 21:17
send.finish()?;

// wait for a moment to let the connection flush all the bytes to the receiver
tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
Collaborator

    conn.close(0u32.into(), b"bye!");
    endpoint.close().await;

this should flush the connection buffer before returning.
see https://github.com/n0-computer/iroh/blob/6ad5ac4238a3cc101791922167aab952d4c99c1e/iroh/examples/echo.rs#L65

Collaborator Author

Thank you, will test this out!

Collaborator Author
@samherring99 samherring99 Jan 23, 2026

So, AFAICT there's no async way to wait for QUIC to flush without closing the entire endpoint... I'm fine with a time delay, but would an adaptive delay based on payload size be more reasonable / future-proof? I think we want the endpoint to stay open to accept future requests, and according to the comments in what you linked that seems like a requirement 🙁
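For reference, a sketch of the connection.closed() idea from earlier in the thread, untested against this handler: the responder finishes its stream and then waits for the requester to close the connection after it has read the full response, so the wait is event-driven and the endpoint itself stays open.

```rust
use anyhow::Result;
use iroh::endpoint::Connection;

// Responder side of one request: write the response, finish the stream, then
// wait for the peer to close the connection once it has read everything.
async fn respond_and_drain(conn: Connection, response_bytes: &[u8]) -> Result<()> {
    let (mut send, mut recv) = conn.accept_bi().await?;

    // read the request (bounded); handling it happens elsewhere
    let _request = recv.read_to_end(1024 * 1024).await?;

    send.write_all(response_bytes).await?;
    send.finish()?;

    // resolves when the requester closes the connection after reading the
    // full response on its side, instead of sleeping a fixed 100 ms
    conn.closed().await;
    Ok(())
}
```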

http::StatusCode,
response::{IntoResponse, Response},
routing::post,
};
Collaborator

Can we group these all into one big use that's feature-flagged? Or just... rip out the gateway feature, IMO.

Collaborator Author

Yes should be no issue 😎
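Something like this should do it if the feature flag stays, grouping the gateway-only imports behind one cfg (paths as in the snippet above):

```rust
// gateway-only imports grouped under a single cfg attribute
#[cfg(feature = "gateway")]
use axum::{
    http::StatusCode,
    response::{IntoResponse, Response},
    routing::post,
};
```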

// Spawn task to handle P2P connection
let endpoint = network.router().endpoint().clone();
let state_clone = state.clone();
tokio::spawn(async move {
Collaborator

A task here without any tracking is a little scary - should we keep these in some task pool, add timeouts, monitor, etc.? Once we get a request, we simply throw it into the tokio task pool and can't tell whether it worked or not.

Collaborator Author

This worked with tokio::task::JoinSet::new() 🙂
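A minimal sketch of that JoinSet pattern with a per-request timeout, in case it's useful for the follow-up (durations and helper names are illustrative):

```rust
use std::time::Duration;
use tokio::{task::JoinSet, time::timeout};

// Track spawned P2P request tasks instead of detaching them, so failures and
// hangs are observable from the gateway loop.
async fn run_requests() {
    let mut tasks: JoinSet<anyhow::Result<()>> = JoinSet::new();

    for i in 0..3 {
        tasks.spawn(async move {
            // stand-in for the per-request P2P work, bounded by a timeout
            timeout(Duration::from_secs(30), handle_one_request(i)).await??;
            Ok(())
        });
    }

    // drain results as tasks finish; log anything that failed or panicked
    while let Some(res) = tasks.join_next().await {
        match res {
            Ok(Ok(())) => {}
            Ok(Err(e)) => eprintln!("request task failed: {e:#}"),
            Err(e) => eprintln!("request task panicked or was cancelled: {e}"),
        }
    }
}

async fn handle_one_request(_i: u32) -> anyhow::Result<()> {
    Ok(())
}
```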

…pes, adding initial skeleton of inference-node main loop, wiring inference-node up to iroh gossip updates, updating Cargo toml
… param type to be optional, single protocol, and generic, adding justfile commands and test script to test inference
…rotocolHandler method and custom protocol code path
…g to single node selection for request routing
@samherring99 samherring99 force-pushed the inference_networking_gateway branch from 91248ac to ef35e92 Compare January 22, 2026 23:52
@samherring99 samherring99 force-pushed the inference_networking_gateway branch from ef35e92 to e4cb343 Compare January 23, 2026 00:30
@samherring99 samherring99 self-assigned this Jan 23, 2026