@aevyrie (Member) commented Dec 30, 2025

Objective

  • After a series of optimizations making render and PostUpdate more parallel, write_batched_instance_buffers was regularly one of the largest spans with very low thread use, sitting at 4ms in a 14ms frame. This makes it an ideal target to improve throughput. Note that this screenshot doesn't include some visibility system optimizations:
image

Solution

  • Spawn tasks for writing buffers to the GPU. This is especially helpful for current_input_buffer and previous_input_buffer, which take about the same time and are the longest buffer writes - moving these to tasks effectively halves the time spent in the system.
image
  • In the 250k bevymark_3d stress test, this saves 1.7ms in the system and 2.8ms in frame time.

frametime

image

system

image
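
The approach described under Solution (moving the two longest, similarly sized buffer writes onto separate tasks so they overlap instead of running back to back) can be sketched roughly as follows. This is a minimal illustration only, not the code in this PR: `StagingBuffer`, `write_buffer`, and `write_all_parallel` are hypothetical names, the "GPU write" is simulated by a byte copy into a `Vec<u8>`, and `std::thread::scope` stands in for whatever task pool the engine actually uses.

```rust
use std::thread;

/// Hypothetical stand-in for CPU-side data that must be uploaded to the GPU.
struct StagingBuffer {
    label: &'static str,
    data: Vec<u32>,
}

/// Hypothetical stand-in for a GPU buffer write: serializes the source
/// values into the destination byte vector.
fn write_buffer(src: &StagingBuffer, dst: &mut Vec<u8>) {
    dst.clear();
    for value in &src.data {
        dst.extend_from_slice(&value.to_le_bytes());
    }
}

/// Writes both buffers concurrently. One write runs on a spawned scoped
/// thread, the other on the calling thread; the scope joins the spawned
/// thread before returning, so both destinations are complete afterwards.
fn write_all_parallel(
    current: &StagingBuffer,
    previous: &StagingBuffer,
    current_dst: &mut Vec<u8>,
    previous_dst: &mut Vec<u8>,
) {
    thread::scope(|s| {
        s.spawn(|| write_buffer(current, current_dst));
        write_buffer(previous, previous_dst);
    });
}

fn main() {
    let current = StagingBuffer {
        label: "current_input_buffer",
        data: (0..1_000).collect(),
    };
    let previous = StagingBuffer {
        label: "previous_input_buffer",
        data: (0..1_000).rev().collect(),
    };
    let mut current_dst = Vec::new();
    let mut previous_dst = Vec::new();

    write_all_parallel(&current, &previous, &mut current_dst, &mut previous_dst);

    println!("{}: {} bytes written", current.label, current_dst.len());
    println!("{}: {} bytes written", previous.label, previous_dst.len());
}
```

Because the two writes take about the same time, overlapping them brings the cost of the pair close to the cost of one write, which is where the "effectively halves" estimate comes from. Any remaining, shorter writes limit the total saving.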

Testing

  • cargo rer bevymark_3d --features=debug,trace_tracy -- --benchmark --waves 250 --per-wave 1000

@alice-i-cecile added labels on Dec 30, 2025: A-Rendering (Drawing game state to the screen); C-Performance (A change motivated by improving speed, memory usage or compile times); S-Needs-Review (Needs reviewer attention (from anyone!) to move forward)
@james7132 james7132 self-requested a review December 30, 2025 19:37
@kfc35 (Contributor) left a comment

LGTM for correctness

@aevyrie (Member, Author) commented Jan 2, 2026

Revisited benchmarks on latest main, and the improvements are still reproducible.

image

The bottleneck is mesh collection, which is improved in #22297.

@james7132 added the S-Ready-For-Final-Review label (This PR has been approved by the community. It's ready for a maintainer to consider merging it) and removed the S-Needs-Review label (Needs reviewer attention (from anyone!) to move forward) on Jan 5, 2026
@alice-i-cecile alice-i-cecile added this to the 0.18 milestone Jan 5, 2026
@alice-i-cecile alice-i-cecile added this pull request to the merge queue Jan 5, 2026
Merged via the queue into bevyengine:main with commit 5066b03 Jan 5, 2026
40 checks passed
@github-project-automation github-project-automation bot moved this to Done in Rendering Jan 5, 2026
cart pushed a commit that referenced this pull request Jan 8, 2026
# Objective

- After a series of optimizations making render and postupdate more
parallel, `write_batched_instance_buffers` was regularly one of the
largest spans with very low thread use, sitting at 4ms in a 14ms frame.
This makes it an ideal target to improve throughput. Note this
screenshot doesn't include some visibility system optimizations:

<img width="650" height="718" alt="image"
src="https://github.com/user-attachments/assets/bbd6762b-5145-48f8-a427-5da3cb11a04a"
/>


## Solution

- Spawn tasks for writing buffers to the GPU. This is especially helpful
for `current_input_buffer` and `previous_input_buffer`, which take about
the same time and are the longest buffer writes - moving these to tasks
effectively halves the time spent in the system.

<img width="588" height="251" alt="image"
src="https://github.com/user-attachments/assets/0a086e7a-1d3c-4c17-9d66-eff94196943d"
/>

- In the 250k bevymark_3d stress test, this saves 1.7ms in the system,
and 2.8ms in frame time

frametime

<img width="620" height="376" alt="image"
src="https://github.com/user-attachments/assets/a4c106ac-7668-4f8a-970f-71cbb8be851c"
/>

system

<img width="1384" height="744" alt="image"
src="https://github.com/user-attachments/assets/5c42227d-8ee5-4b84-bc1a-c04768356255"
/>



## Testing

- `cargo rer bevymark_3d --features=debug,trace_tracy -- --benchmark
--waves 250 --per-wave 1000`

---------

Co-authored-by: Kevin Chen <chen.kevin.f@gmail.com>