1 change: 1 addition & 0 deletions Cargo.lock

8 changes: 4 additions & 4 deletions architectures/decentralized/justfile
@@ -25,16 +25,16 @@ setup-solana-localnet-light-test-run-treasurer run_id="test" *args='':
RUN_ID={{ run_id }} CONFIG_FILE=./config/solana-test/light-config.toml PERMISSIONLESS=true ./scripts/setup-and-deploy-solana-test.sh --treasurer {{ args }}

setup-solana-localnet-permissioned-test-run run_id="test" *args='':
RUN_ID={{ run_id }} ./scripts/deploy-solana-test.sh {{ args }}
RUN_ID={{ run_id }} ./scripts/setup-and-deploy-solana-test.sh {{ args }}

setup-solana-localnet-permissioned-light-test-run run_id="test" *args='':
RUN_ID={{ run_id }} CONFIG_FILE=./config/solana-test/light-config.toml ./scripts/deploy-solana-test.sh {{ args }}
RUN_ID={{ run_id }} CONFIG_FILE=./config/solana-test/light-config.toml ./scripts/setup-and-deploy-solana-test.sh {{ args }}

setup-solana-localnet-permissioned-test-run-treasurer run_id="test" *args='':
RUN_ID={{ run_id }} ./scripts/deploy-solana-test.sh --treasurer {{ args }}
RUN_ID={{ run_id }} ./scripts/setup-and-deploy-solana-test.sh --treasurer {{ args }}

setup-solana-localnet-permissioned-light-test-run-treasurer run_id="test" *args='':
RUN_ID={{ run_id }} CONFIG_FILE=./config/solana-test/light-config.toml ./scripts/deploy-solana-test.sh --treasurer {{ args }}
RUN_ID={{ run_id }} CONFIG_FILE=./config/solana-test/light-config.toml ./scripts/setup-and-deploy-solana-test.sh --treasurer {{ args }}

start-training-localnet-client run_id="test" *args='':
AUTHORIZER={{ AUTHORIZER }} RUN_ID={{ run_id }} ./scripts/train-solana-test.sh {{ args }}
2 changes: 1 addition & 1 deletion architectures/decentralized/solana-authorizer/Cargo.lock

8 changes: 4 additions & 4 deletions architectures/decentralized/solana-coordinator/Cargo.lock

2 changes: 1 addition & 1 deletion architectures/decentralized/solana-mining-pool/Cargo.lock

10 changes: 5 additions & 5 deletions architectures/decentralized/solana-treasurer/Cargo.lock

28 changes: 26 additions & 2 deletions psyche-book/src/enduser/join-run.md
@@ -64,7 +64,8 @@ WALLET_PATH=/path/to/your/keypair.json
RPC=https://your-primary-rpc-provider.com
WS_RPC=wss://your-primary-rpc-provider.com

# Required: Which run id to join
# Optional: Which run id to join
# If not set, the client will automatically discover and join an available run
RUN_ID=your_run_id_here

# Recommended: Fallback RPC Endpoints (for reliability)
@@ -78,6 +79,19 @@ Then, you can start training through the run manager running:
./run-manager --env-file /path/to/your/.env
```

### Automatic Run Selection

If `RUN_ID` is not set in your `.env` file, the run-manager queries the Solana coordinator for available runs and picks one to join, so you don't need to know a run ID in advance.
The logs show which run was selected:

```
INFO RUN_ID not set, discovering available runs...
INFO Found 2 available run(s):
INFO - run_abc123 (state: Waiting for members)
INFO - run_def456 (state: Training)
INFO Selected run: run_abc123 (state: Waiting for members)
```
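
The selection step in the sample log can be pictured with a small Rust sketch. This is not the run-manager's actual code: `RunState`, `RunInfo`, `join_priority`, and `select_run` below are simplified, hypothetical stand-ins, and the preference for runs that are still waiting for members is an assumption inferred from the sample log, not a documented policy.

```rust
// Hypothetical, simplified stand-ins for the real coordinator types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunState {
    WaitingForMembers,
    Training,
    Finished,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RunInfo {
    pub run_id: String,
    pub run_state: RunState,
}

/// Lower value = more attractive to join; `None` = not joinable.
/// This ordering is an assumption made for illustration.
fn join_priority(state: RunState) -> Option<u8> {
    match state {
        RunState::WaitingForMembers => Some(0),
        RunState::Training => Some(1),
        RunState::Finished => None,
    }
}

/// Return the joinable run with the best (lowest) priority.
/// Ties keep discovery order; an empty or unjoinable list yields `None`.
pub fn select_run(runs: &[RunInfo]) -> Option<&RunInfo> {
    runs.iter()
        .filter_map(|run| join_priority(run.run_state).map(|p| (p, run)))
        .min_by_key(|(p, _)| *p)
        .map(|(_, run)| run)
}
```

With two discovered runs, one in `Training` and one in `WaitingForMembers`, `select_run` returns the waiting run regardless of the order in which they were discovered.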

After the initial setup, you'll see the Psyche client logs streaming in real-time. These logs show training progress, network status, and other important information.

To stop the client, press `Ctrl+C` in the terminal.
@@ -86,9 +100,19 @@ To stop the client, press `Ctrl+C` in the terminal.

We recommend using a dedicated RPC service such as [Helius](https://www.helius.dev/), [QuickNode](https://www.quicknode.com/), [Triton](https://triton.one/), or self-hosting your own Solana RPC node.

## Filtering by Authorizer

To join only runs authorized by a specific entity, pass the `--authorizer` flag:

```bash
./run-manager --env-file /path/to/your/.env --authorizer <AUTHORIZER_PUBKEY>
```

This is useful when you want to ensure you only join runs from a trusted coordinator.

## Additional config variables

In general it's not neccesary to change these variables to join a run since we provide sensible defaults,
In general it's not necessary to change these variables to join a run since we provide sensible defaults,
though you might need to.

**`NVIDIA_DRIVER_CAPABILITIES`** - An environment variable that the NVIDIA Container Toolkit uses to determine which compute capabilities should be provided to your container. It is recommended to set it to 'all', e.g. `NVIDIA_DRIVER_CAPABILITIES=all`.
16 changes: 11 additions & 5 deletions scripts/setup-and-deploy-solana-test.sh
@@ -40,13 +40,19 @@ sleep 3

solana airdrop 10 --url ${RPC} --keypair ${WALLET_FILE}

# Pass treasurer flag to deploy script if set
if [[ "$DEPLOY_TREASURER" == "true" && "$PERMISSIONLESS" == "true" ]]; then

if [[ "$DEPLOY_TREASURER" == "true" ]]; then
WALLET_FILE=${WALLET_FILE} ./scripts/deploy-solana-test.sh --treasurer "${EXTRA_ARGS[@]}"
CONFIG_FILE=${CONFIG_FILE} WALLET_FILE=${WALLET_FILE} ./scripts/create-permissionless-run.sh --treasurer "${EXTRA_ARGS[@]}"
elif [[ "$PERMISSIONLESS" == "true" ]]; then
else
WALLET_FILE=${WALLET_FILE} ./scripts/deploy-solana-test.sh "${EXTRA_ARGS[@]}"
CONFIG_FILE=${CONFIG_FILE} WALLET_FILE=${WALLET_FILE} ./scripts/create-permissionless-run.sh "${EXTRA_ARGS[@]}"
fi

if [[ "$PERMISSIONLESS" == "true" ]]; then
if [[ "$DEPLOY_TREASURER" == "true" ]]; then
CONFIG_FILE=${CONFIG_FILE} WALLET_FILE=${WALLET_FILE} ./scripts/create-permissionless-run.sh --treasurer "${EXTRA_ARGS[@]}"
else
CONFIG_FILE=${CONFIG_FILE} WALLET_FILE=${WALLET_FILE} ./scripts/create-permissionless-run.sh "${EXTRA_ARGS[@]}"
fi
fi
echo -e "\n[+] Testing Solana setup ready, starting Solana logs...\n"

1 change: 1 addition & 0 deletions tools/rust-tools/run-manager/Cargo.toml
@@ -35,6 +35,7 @@ rand.workspace = true
rand_chacha.workspace = true
time.workspace = true
solana-client = "=2.1.4"
solana-account-decoder-client-types = "=2.1.4"

[dev-dependencies]
serial_test = "3.0"
153 changes: 143 additions & 10 deletions tools/rust-tools/run-manager/src/docker/coordinator_client.rs
@@ -1,16 +1,30 @@
use anchor_client::solana_sdk::{commitment_config::CommitmentConfig, pubkey::Pubkey};
use anchor_client::solana_sdk::{
commitment_config::CommitmentConfig, pubkey::Pubkey, system_program,
};
use anchor_lang::AccountDeserialize;
use anyhow::{Context, Result};
use psyche_coordinator::RunState;
use psyche_solana_authorizer::state::Authorization;
use psyche_solana_coordinator::{
CoordinatorInstance, coordinator_account_from_bytes, find_coordinator_instance,
logic::JOIN_RUN_AUTHORIZATION_SCOPE,
};
use solana_account_decoder_client_types::UiAccountEncoding;
use solana_client::rpc_client::RpcClient;
use tracing::info;
use solana_client::rpc_config::{RpcAccountInfoConfig, RpcProgramAccountsConfig};
use tracing::{debug, info, warn};

#[derive(Debug, Clone)]
pub struct RunInfo {
pub run_id: String,
pub instance_pubkey: Pubkey,
pub coordinator_account: Pubkey,
pub run_state: RunState,
}

/// Coordinator client for querying Solana
pub struct CoordinatorClient {
rpc_client: RpcClient,
#[allow(dead_code)]
program_id: Pubkey,
}

@@ -40,19 +54,39 @@ impl CoordinatorClient {
Ok(instance)
}

fn fetch_run_state(&self, coordinator_account: &Pubkey) -> Result<RunState> {
// Fetch the raw Solana account data from the blockchain
let solana_account = self
.rpc_client
.get_account(coordinator_account)
.with_context(|| {
format!(
"Failed to fetch coordinator account {}",
coordinator_account
)
})?;

// Deserialize the account data into a CoordinatorAccount struct
let coordinator =
coordinator_account_from_bytes(&solana_account.data).with_context(|| {
format!(
"Failed to deserialize coordinator account {}",
coordinator_account
)
})?;

Ok(coordinator.state.coordinator.run_state)
}

pub fn get_docker_tag_for_run(&self, run_id: &str, local_docker: bool) -> Result<String> {
info!("Querying coordinator for Run ID: {}", run_id);

let instance = self.fetch_coordinator_data(run_id)?;

// Fetch the coordinator account to get the client version
let coordinator_account_data = self
.rpc_client
.get_account(&instance.coordinator_account)
.context("RPC error: failed to get coordinator account")?;

let coordinator_account = coordinator_account_from_bytes(&coordinator_account_data.data)
.context("Failed to deserialize CoordinatorAccount")?;
let coordinator_account_data =
self.rpc_client.get_account(&instance.coordinator_account)?;
let coordinator_account = coordinator_account_from_bytes(&coordinator_account_data.data)?;

let client_version = String::from(&coordinator_account.state.client_version);

@@ -82,4 +116,103 @@ impl CoordinatorClient {

Ok(image_name)
}

pub fn get_all_runs(&self) -> Result<Vec<RunInfo>> {
// Fetch all CoordinatorInstance accounts that are owned by the program
let accounts = self
.rpc_client
.get_program_accounts_with_config(
&self.program_id,
RpcProgramAccountsConfig {
account_config: RpcAccountInfoConfig {
encoding: Some(UiAccountEncoding::Base64),
commitment: Some(CommitmentConfig::confirmed()),
..Default::default()
},
..Default::default()
},
)
.map_err(|e| {
anyhow::anyhow!(
"Failed to fetch program accounts from coordinator program {}: {}",
self.program_id,
e
)
})?;

let mut runs = Vec::new();
for (pubkey, account) in accounts {
match CoordinatorInstance::try_deserialize(&mut account.data.as_slice()) {
Ok(instance) => {
if let Ok(run_state) = self.fetch_run_state(&instance.coordinator_account) {
runs.push(RunInfo {
run_id: instance.run_id.clone(),
instance_pubkey: pubkey,
coordinator_account: instance.coordinator_account,
run_state,
});
} else {
debug!(
"Skipping run {} (instance: {}) - could not fetch coordinator state",
instance.run_id, pubkey
);
}
}
Err(e) => {
debug!(
"Failed to deserialize CoordinatorInstance at {}: {}",
pubkey, e
);
}
}
}

Ok(runs)
}

/// Check if a user is authorized to join a specific run.
///
/// This checks both permissionless authorization (grantee = system_program::ID)
/// and user-specific authorization (grantee = user_pubkey).
pub fn can_user_join_run(&self, run: &RunInfo, user_pubkey: &Pubkey) -> Result<bool> {
// Fetch the CoordinatorInstance to get join_authority
let instance = self.fetch_coordinator_data(&run.run_id)?;
let join_authority = instance.join_authority;

// Try permissionless authorization first (grantee = system_program::ID)
if self.check_authorization_for_grantee(&join_authority, &system_program::ID, user_pubkey) {
return Ok(true);
}

// Try user-specific authorization (grantee = user_pubkey)
Ok(self.check_authorization_for_grantee(&join_authority, user_pubkey, user_pubkey))
}

/// Check if an authorization exists and is valid for a specific grantee.
fn check_authorization_for_grantee(
&self,
join_authority: &Pubkey,
grantee: &Pubkey,
user_pubkey: &Pubkey,
) -> bool {
let auth_pda = psyche_solana_authorizer::find_authorization(
join_authority,
grantee,
JOIN_RUN_AUTHORIZATION_SCOPE,
);

let Ok(account) = self.rpc_client.get_account(&auth_pda) else {
return false;
};

let Ok(authorization) = Authorization::try_deserialize(&mut account.data.as_slice()) else {
warn!(
"Failed to deserialize authorization at {}: invalid data",
auth_pda
);
return false;
};

authorization.is_valid_for(join_authority, user_pubkey, JOIN_RUN_AUTHORIZATION_SCOPE)
}
}
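
The lookup order in `can_user_join_run` (a permissionless grant first, then a user-specific grant) can be illustrated with a self-contained toy model. Everything here is hypothetical: `Key` replaces `Pubkey`, a `HashMap` replaces the on-chain authorization accounts, and a single `active` flag replaces whatever `Authorization::is_valid_for` actually checks. It is a sketch of the control flow only, not of the authorizer program's real validation rules.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a Solana `Pubkey`.
pub type Key = &'static str;

/// Stand-in for `system_program::ID`, the grantee that marks a permissionless grant.
pub const PERMISSIONLESS_GRANTEE: Key = "system_program";

/// Toy model of authorization accounts:
/// (join_authority, grantee) -> whether the grant is active.
#[derive(Default)]
pub struct Authorizations {
    grants: HashMap<(Key, Key), bool>,
}

impl Authorizations {
    /// Record a grant from `join_authority` to `grantee`.
    pub fn grant(&mut self, join_authority: Key, grantee: Key, active: bool) {
        self.grants.insert((join_authority, grantee), active);
    }

    /// A missing or inactive grant counts as "not authorized",
    /// mirroring the `return false` paths when a lookup fails.
    fn is_authorized_for_grantee(&self, join_authority: Key, grantee: Key) -> bool {
        self.grants
            .get(&(join_authority, grantee))
            .copied()
            .unwrap_or(false)
    }

    /// Try the permissionless grant first, then fall back to a user-specific one.
    pub fn can_user_join(&self, join_authority: Key, user: Key) -> bool {
        self.is_authorized_for_grantee(join_authority, PERMISSIONLESS_GRANTEE)
            || self.is_authorized_for_grantee(join_authority, user)
    }
}
```

Checking the permissionless grant first means that, for an open run, the user-specific lookup is skipped entirely; in the real client that would save one RPC round trip per run.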