
Parallel simulation refactoring#1738

Open
bmalezieux wants to merge 6 commits into sbi-dev:main from bmalezieux:parallel-simulation-refactoring

Conversation


@bmalezieux bmalezieux commented Jan 22, 2026

This PR refactors the simulation utilities to improve parallel execution efficiency and modularity.

  • New parallelize_simulator decorator/function: introduces a tool that wraps any simulator to enable parallel execution using joblib.

  • New simulate_from_thetas utility: added a lightweight wrapper around parallelize_simulator that directly executes simulations for a given set of parameters (thetas).

  • Refactored simulate_for_sbi (backward compatibility): updated the internal implementation of simulate_for_sbi to leverage the new simulate_from_thetas backend.
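As a sketch of how a decorator like this could look (the names `parallelize_simulator` and the joblib usage follow the PR description; the body below is an assumption, not the PR's actual code):

```python
# Hypothetical sketch of the parallelize_simulator decorator described
# above; the actual implementation in the PR may differ.
import torch
from joblib import Parallel, delayed


def parallelize_simulator(simulator):
    """Wrap a simulator so each parameter set runs as a joblib task."""

    def wrapped(thetas, num_workers=1):
        # prefer="threads" avoids pickling the wrapped simulator; the PR
        # may use a process-based backend instead.
        results = Parallel(n_jobs=num_workers, prefer="threads")(
            delayed(simulator)(theta) for theta in thetas
        )
        # Stack per-theta outputs into one (num_simulations, ...) tensor.
        return torch.stack(results)

    return wrapped


@parallelize_simulator
def linear_gaussian(theta):
    return theta + 0.1 * torch.randn_like(theta)


x = linear_gaussian(torch.zeros(8, 2), num_workers=2)  # shape (8, 2)
```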

Contributor

@janfb janfb left a comment


Awesome, looks very good already.

I was wondering whether we would also need to refactor process_simulator accordingly? E.g., we could likely drop the internal torch wrapping and accept numpy simulators (because we wrap to numpy again anyway). I would have to look into it in detail, but I think we can simplify the function a bit.

@bmalezieux bmalezieux requested a review from janfb January 26, 2026 16:21
@bmalezieux bmalezieux marked this pull request as ready for review January 26, 2026 16:21
Contributor

@tomMoral tomMoral left a comment


A few comments.

A question: is it possible for the return type to be str? It is unclear to me where this would fit in the current sbi API.

seed: Optional[int] = None,
show_progress_bar: bool = True,
) -> Tuple[Tensor, Tensor]:
) -> Tuple[Tensor, Tensor | List[str] | str]:

Shouldn't it be:

Suggested change
) -> Tuple[Tensor, Tensor | List[str] | str]:
) -> Tuple[Theta, X]:

UserWarning,
stacklevel=2,
)
batches = [theta for theta in thetas]

Why? In this context, we could use the batch parameter of joblib (useful if initialization of the simulator is slow)
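The joblib batching the comment refers to is the `batch_size` argument of `Parallel`, which groups several tasks per worker dispatch to amortize per-dispatch overhead. A small illustration (names here are illustrative, not the PR's code):

```python
# joblib can group tasks into batches itself via Parallel(batch_size=...),
# which amortizes per-dispatch overhead when each task is cheap but
# dispatching (or setting up) work is comparatively slow.
from joblib import Parallel, delayed


def simulator(theta):
    return theta * 2


thetas = list(range(100))
# Each worker receives tasks in groups of 10 instead of one at a time.
results = Parallel(n_jobs=2, batch_size=10)(
    delayed(simulator)(t) for t in thetas
)
```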

)
batches = [theta for theta in thetas]
else:
batches = [theta for theta in thetas]

Unclear why not keep batches = thetas? The current solution duplicates theta in memory, which can be a large overhead for a large number of simulations (probably for a later version; right now everything needs to fit in memory).

num_batches = (
num_simulations + simulation_batch_size - 1
) // simulation_batch_size
batches = [

I would use an iterator if possible, once again to avoid filling memory with duplicated data. In particular, if this is a tensor, use split?
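For reference, `torch.split` (as suggested) yields views that share storage with the original tensor, so batching this way does not copy the parameters:

```python
# torch.split returns a tuple of views into the original tensor, so the
# batches share memory with `thetas` instead of duplicating it.
import torch

thetas = torch.randn(10, 3)
simulation_batch_size = 4

batches = torch.split(thetas, simulation_batch_size)
# Batch sizes along dim 0: 4, 4, 2; the first batch shares storage.
assert batches[0].data_ptr() == thetas.data_ptr()
```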


# Run in parallel
# Generate seeds
batch_seeds = np.random.randint(low=0, high=1_000_000, size=(len(batches),))

We need to be able to seed this part, no?
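One seedable variant (a sketch, not the PR's code): derive per-batch seeds from a single user-supplied seed with NumPy's `SeedSequence`, so passing the same seed reproduces the same simulations:

```python
# Hypothetical helper for reproducible per-batch seeds; the snippet above
# draws them from the global NumPy state, which the caller cannot fix.
# SeedSequence.spawn gives independent, reproducible child streams.
import numpy as np


def make_batch_seeds(num_batches, seed=None):
    ss = np.random.SeedSequence(seed)
    return [int(child.generate_state(1)[0]) for child in ss.spawn(num_batches)]


seeds_a = make_batch_seeds(5, seed=42)
seeds_b = make_batch_seeds(5, seed=42)  # identical to seeds_a
```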

Comment on lines +79 to +81
context = parallel_config(n_jobs=num_workers)

with context:

Suggested change
context = parallel_config(n_jobs=num_workers)
with context:
with parallel_config(n_jobs=num_workers):


# 1. No arguments
@parallelize_simulator
def decorated_sim(theta):

Why not use simple_simulator_torch, which is the same?
