Skip to content

Conversation

@nathanielford
Copy link
Collaborator

Motivation

Currently the Channel object is calling into the PersistentChannel internals to do things like provision an active channel and check state. The division of where work and error handling is managed becomes muddied in this situation, which complicates handling errors in ways that aren't unwrap.

This was also motivated by work to implement the always-failing PersistentChannel PR, but it seemed like it could be broken off into it's own PR.

Solution

Moving the actual mechanics into the PersistentChannel object, which owns channel config, and allows Channel to be a lightweight handles for users to interact with. Specifically state probing and channel creation are now in PersistentChannel.

PersistentChannel calls are wrapped in Result objects that are translated at the Channel level to the appropriate public-facing API objects. get_or_create_active_channel() just becomes get_active_channel() and passes the 'create' flag, which controls whether a new channel is created if it does not already exist.

The external semantics should stay the same as they have been.

This does not yet address the error that can surface in call(), because it's not yet clear what the error reporting story is.

@nathanielford nathanielford force-pushed the refactor/PersistentChannel branch from 51ee8a7 to ab8eb2d Compare December 15, 2025 22:46
@nathanielford nathanielford marked this pull request as ready for review December 16, 2025 00:17
@nathanielford nathanielford force-pushed the refactor/PersistentChannel branch 2 times, most recently from ab8eb2d to aa24f63 Compare December 16, 2025 19:55
@nathanielford nathanielford self-assigned this Dec 16, 2025
if let Some(s) = ac.connectivity_state.cur() {
return s;
let state = self.inner.state(connect);
match state {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we want the two state() methods to have the same return value so this could simply delegate?

let mut s = self
.active_channel
.lock()
.map_err(|_| "Could not get channel lock.".to_string())?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can stay as unwrap(). IIUC locks can only fail in pretty spectacular ways (someone holding the lock panicked). In that case we might as well panic too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth on this, but ultimately decided to leave it as unwrap() on the theory that we really do want lock poisoning to cause an all-stop, and fix whatever causes the poisoning rather than allow apps to recover from it. I'm only vaguely uncomfortable with leaving the unwrap in there.

Comment on lines 247 to 249
s.as_ref()
.cloned()
.ok_or_else(|| "Could not clone channel".into())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By here you know s is Some, so you can do s.unwrap().clone() I think.

Or if you do a if let Some(ac) = s else {} instead of if s.is_none() then it will already be unwrapped.

&self,
connect: bool,
) -> Result<Arc<ActiveChannel>, Box<dyn std::error::Error + Sync + Send>> {
let mut s = self
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe name this ac or active_channel? I'm not sure what s represents.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

self.runtime.clone(),
));
} else {
return Err("No active channel.".into());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function can probably return just an Option?

Also maybe we shouldn't combine these two functionalities into one function. They're a little disjoint.

  • Anyone asking for the active channel itself will always want connect=true.
  • Only when requesting the current state, exiting idle mode is optional.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without needing to be clear that the error is a lock poisoning, the option makes the most sense, so I changed it to this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also removed the bool flag after going back and forth with various options. After being satisfied that I could arrange the deck chairs any which way settled on the current version. LMK what you think!

@nathanielford nathanielford force-pushed the refactor/PersistentChannel branch 2 times, most recently from a1cca14 to 861d86f Compare January 8, 2026 23:04
Copy link
Collaborator

@dfawley dfawley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely +100 on the motivation behind this change -- the ActiveChannel should be the thing fiddling with its own state, not the PersistentChannel. A few minor nits, but otherwise LGTM, thanks.

let active_channel = if connect {
self.get_active_channel()
} else {
match self.locked_active_channel().as_ref().cloned() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as_ref().cloned() -> clone() ? (And below)

// Internal use only to get the locked active channel. If this panics it means the lock is
// poisoned and we should also panic.
fn locked_active_channel(&self) -> std::sync::MutexGuard<'_, Option<Arc<ActiveChannel>>> {
self.active_channel.lock().unwrap()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure having a method for this one liner is helpful, and in fact makes the call sites more opaque which is probably a bad thing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason I thought it was worth it was to try and put the unsafe unwrap in one place. But I'll revert it, since I think it's probably not that big a deal.

@nathanielford nathanielford force-pushed the refactor/PersistentChannel branch from 1130c0a to ed26b0e Compare January 12, 2026 18:04
@nathanielford nathanielford enabled auto-merge (squash) January 12, 2026 22:11
Comment on lines +157 to +158
/// Returns the current state of the channel. If there is no underlying active channel,
/// returns Idle. If `connect` is true, will create a new active channel.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This leaks implementation details into the documentation. Users shouldn't know about our active channel at all. How about something like:

Suggested change
/// Returns the current state of the channel. If there is no underlying active channel,
/// returns Idle. If `connect` is true, will create a new active channel.
/// Returns the current state of the channel. If `connect` is true, will begin connecting.

@nathanielford nathanielford merged commit 7e98e1f into hyperium:master Jan 12, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants