
Add docs for blending #1124

Merged
Korijn merged 13 commits into main from blend-docs
Aug 15, 2025

Conversation

@almarklein
Member

@almarklein almarklein commented Jun 23, 2025

Ref discussion in #1120.

I refactored the current guide, moving the content to multiple files, to make room to go into some topics deeper, like we do here for transparency.

The text does not describe how Pygfx currently works, but how I think it should work. I'm open to suggestions. The idea is to get (more or less) agreement on this, then implement it accordingly.

@almarklein almarklein mentioned this pull request Jun 23, 2025
@Korijn
Collaborator

Korijn commented Jun 23, 2025

Focusing on the API design: it looks fine to me. I'm curious to hear from @panxinmiao. Are there any overlooked use cases?

@panxinmiao
Contributor

panxinmiao commented Jul 17, 2025

I still believe this design conflates two distinct concepts: the definition of transparency (a high-level API concept at the engine level) and the blending configuration (a low-level API concept at the GPU level).

What we commonly refer to as “alpha blending” is just one specific configuration of the blending settings in the rendering pipeline — a particular blending mode.

It corresponds to the classic alpha blending formula:
FinalColor = SrcColor * SrcAlpha + DstColor * (1 - SrcAlpha),
also known as the classic “over” operator (with straight, i.e. non-premultiplied, alpha). This approach is order-dependent and is one of the standard methods to achieve transparency.
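This "over" formula can be sketched as a plain function (illustrative code, not part of the Pygfx API):

```python
def alpha_blend_over(src_rgb, src_alpha, dst_rgb):
    """Classic 'over' blending with straight (non-premultiplied) alpha.

    FinalColor = SrcColor * SrcAlpha + DstColor * (1 - SrcAlpha)
    """
    return tuple(s * src_alpha + d * (1.0 - src_alpha) for s, d in zip(src_rgb, dst_rgb))

# A 50%-alpha red fragment over a blue background yields purple:
print(alpha_blend_over((1.0, 0.0, 0.0), 0.5, (0.0, 0.0, 1.0)))  # (0.5, 0.0, 0.5)
```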

As I understand it, alpha_mode = "blend" — given that it's grouped alongside options like "opaque" — is intended to represent an "alpha blended object", which is, to some extent, a high-level API concept for transparent objects. It does not represent the low-level blending configuration of the rendering pipeline.

Moreover, blending in the rendering pipeline is not exclusive to transparent objects. It's quite common for opaque objects to use blending as well (e.g., for glowing or light-accumulation effects). The engine should not restrict this capability.

Additionally, if we follow this API design, how should we categorize objects during rendering — i.e., assign them to different render passes? Furthermore, the extra parameters under alpha_mode = "blend" such as "over", "add", "subtract", and "multiply" do not fully cover the range of blending options available in the rendering pipeline.

@panxinmiao
Contributor

panxinmiao commented Jul 17, 2025

I'm not entirely sure if I expressed my point clearly. If my wording caused any confusion, I sincerely apologize.

I would like to explain my concerns again:

1. alpha_mode conflates object classification semantics and pipeline configuration semantics, leading to conceptual confusion.

Values like alpha_mode="blend", alongside opaque, weighted, and dither, essentially serve as a classification system indicating how an object should be rendered. This is a high-level, engine-side logic.
On the other hand, values like "add" or "multiply" under blend_mode (or sometimes directly under alpha_mode) represent specific GPU-level blending configurations. These define how colors are computed in the rendering pipeline and belong to the low-level graphics API domain.

Mixing these two kinds of semantics into a single enum or parameter tree leads to confusion. Moreover, since blending settings in the rendering pipeline are not exclusive to transparent objects, using “whether blending is enabled” as the criterion for determining whether an object is transparent is problematic and misleading.

For example:

m.alpha_mode = "add"

This raises the question:

  • Does this mean the object is a transparent additive blend, requiring back-to-front sorting?
  • Or is it an opaque object using additive blending, rendered in a forward (unsorted) opaque pass?
2. The current presets under `alpha_mode="blend"`—such as "over", "add", "subtract", and "multiply"—are too limited. They do not adequately cover the full range of blending configurations supported by modern rendering pipelines. For instance:
  • Min/Max blending
  • Custom src/dst blend factors
  • Blend constants

Edit:
I just realized that in this API design, there is a setting for blend_mode when alpha_mode=blend, which allows for custom blending methods, so this statement does not hold.

This brings us back to the core issue: using alpha_mode=blend as a classification mechanism for objects is conceptually different from configuring blending operations in the GPU rendering pipeline, and conflating the two can be misleading.
In other words, blend_mode is a low-level parameter that controls how the GPU performs blending, and it should not be exclusive to alpha_mode=blend. Even opaque objects may require specific blend_mode configurations in certain scenarios. Therefore, alpha_mode (which classifies objects for rendering strategy) and blend_mode (which configures pipeline-level blending behavior) should be treated as separate concepts.
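The separation being argued for here could be sketched as two independent material properties (a hypothetical sketch; the class and its defaults are illustrative, not the actual Pygfx API):

```python
from dataclasses import dataclass

@dataclass
class MaterialSketch:
    # High-level classification: how the engine schedules the object
    # ("opaque", "blend", "weighted", "dither", ...).
    alpha_mode: str = "opaque"
    # Low-level GPU pipeline configuration: how colors are combined
    # ("over", "add", "subtract", "multiply", ...), independent of alpha_mode.
    blend_mode: str = "over"

# An opaque object that still uses additive blending (e.g. light accumulation):
glow = MaterialSketch(alpha_mode="opaque", blend_mode="add")
print(glow.alpha_mode, glow.blend_mode)  # opaque add
```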

@almarklein
Member Author

almarklein commented Jul 17, 2025

Moreover, blending in the rendering pipeline is not exclusive to transparent objects. It's quite common for opaque objects to use blending as well

In the proposed API, what you refer to is an object with alpha_mode='blend' and depth_write=True. Is that transparent or opaque? I don't know, and it is also irrelevant, because .transparent is no longer a property.

Additionally, if we follow this API design, how should we categorize objects during rendering — i.e., assign them to different render passes?

I believe objects can be unambiguously categorized (from the pov of the renderer) based on just their alpha_mode and depth_write props.

Values like alpha_mode="blend", alongside opaque, weighted, and dither, essentially serve as a classification system indicating how an object should be rendered. This is a high-level, engine-side logic. On the other hand, values like "add" or "multiply" under blend_mode (or sometimes directly under alpha_mode) represent specific GPU-level blending configurations. [...] Mixing these two kinds of semantics into a single enum or parameter tree leads to confusion.

I don't agree with this distinction of high-level vs GPU-level, to be honest. There is certainly a categorization, where we have four main categories, and each category has a number of options, but all it does is define how alpha values are used in rendering the object.

Or are you objecting to the fact that the presets for e.g. blend_mode ('over', 'add', 'subtract'), can also be passed to the alpha_mode prop? (which is a reasonable objection IMO)

@panxinmiao
Contributor

panxinmiao commented Jul 18, 2025

In the proposed API, what you refer to is an object with alpha_mode='blend' and depth_write=True. Is that transparent or opaque? I don't know and it also irrelevant because .transparent is not a property anymore.

No, it's important. This is exactly where the problem lies: we need clear and unambiguous categorization. Without it, it's unclear which render pass should handle the object, and which stage of the pipeline it belongs to.

Regarding object classification, we indeed didn't pay much attention to it before (prior to the simplification and refactoring of Blender), because at that time, the renderer did not yet distinguish between different rendering passes or stages — so the classification of an object didn’t matter much.
However, as we started introducing more advanced rendering features, this classification and render pass scheduling have become crucial. That’s also why I opened #974 in advance—it brings significant changes to the renderer’s behavior and needs to be discussed and reviewed.

There is certainly a categorization, where we have four main categories, where each category has a number of options, but all it does is define how alpha values are used in rendering the object.

Not just that, it also determines which render pass is used and at what stage in the pipeline the object is rendered. This is not a minor technicality, it's a fundamental architectural concern in the subsequent renderer design and render pass scheduling.

I believe objects can be unambiguously categorized (from the pov of the renderer) based on just their alpha_mode and depth_write props.

I agree that alpha_mode can be used as the primary way to classify objects—but we must clarify that alpha_mode='blend' represents a category of objects that should be rendered using a specific render pass at a specific pipeline stage. It does not mean that GPU blending is being used per se. In other words, we must decouple alpha_mode (for classification) from blend_mode (which configures the GPU pipeline).

As for using depth_write to infer categorization—I don't think that's appropriate. Properties like depth_write and depth_test serve many flexible purposes and are frequently controlled directly by the user. They are not strongly tied to render pass decisions and should not drive classification.

Or are you objecting to the fact that the presets for e.g. blend_mode ('over', 'add', 'subtract'), can also be passed to the alpha_mode prop? (which is a reasonable objection IMO)

Yes, that's exactly my point. blend_mode should be treated as an independent concept.

@almarklein
Member Author

No, it's important. This is exactly where the problem lies, we need clear and unambiguous categorization.

Please help me understand, by providing an example where an object has transparent=False, which cannot be derived from depth_write. Or maybe another way to phrase it, you seem to suggest that an object can be transparent=False, alpha_mode='blend', depth_write=False. How do you expect the renderer to handle that?

@panxinmiao
Contributor

No, it's important. This is exactly where the problem lies, we need clear and unambiguous categorization.

Please help me understand, by providing an example where an object has transparent=False, which cannot be derived from depth_write. Or maybe another way to phrase it, you seem to suggest that an object can be transparent=False, alpha_mode='blend', depth_write=False. How do you expect the renderer to handle that?

For example, scene backgrounds (or objects meant to serve as scene backgrounds), decals, or helper objects and markers for debugging or visualization purposes. I once had a use case where an object had a hidden proxy geometry—only the proxy geometry wrote to the depth buffer, while the visible object itself only wrote to the color buffer, not to depth.

@almarklein
Member Author

Yeah, but how do you think the renderer should use the value of .transparent? From what I understand, objects with transparent=True are rendered first, but for the case of backgrounds this is not enough, because you want to render them before all the other opaque objects. I'm not sure how the decals etc. fit in. Can you elaborate?

@panxinmiao
Contributor

For example, some skybox or skydome models are modeled as a box or a sphere. See this model on Sketchfab: https://skfb.ly/oIpX9 and https://skfb.ly/oIHQL

When used, these typically have the following settings:

transparent = False
render_order = -99   # to ensure they render first
depth_test = False
depth_write = False  # so they don’t participate in any depth-related operations

And they are unit-sized and have their center position follow the camera every frame.

As for decals, they refer to "localized texture layers attached on top of existing geometry" to enhance visual detail—such as cracks, stains, bullet holes, or signs—without modifying the object's base geometry or main texture. In essence, a decal might be a small piece of geometry (like a plane) that is projected onto the target surface using geometric projection techniques.

While these effects are usually achieved using textures with transparency (i.e., alpha blending), making them visually semi-transparent, they should not be classified as transparent objects from the rendering pipeline's perspective. Decals do not participate in the transparent-object depth sorting done in the main geometry pass, nor in any other complex rendering behavior for transparent objects (such as double-sided two-pass rendering, or possible future techniques like depth peeling and dual depth peeling). They behave more like a screen-space overlay. Decals are typically rendered in a separate pass, or as a separate render group in the opaque pass following the opaque objects (to control the rendering order), and are directly overlaid on the result of the opaque pass.

A typical decal setup might look like this:

transparent = False
depth_test = True
depth_write = False
blend_mode = "normal"

Of course, the above parameter settings can be adjusted based on the actual needs of the scene. However, the key point is that low-level GPU pipeline features such as depth_write and blending method are atomic capabilities and should not be used as the basis for object classification. These features are flexible and can be freely combined by users to meet various creative needs, and we should not impose restrictions on the engine's ability to support such flexibility.

We can say that a certain category of objects (e.g., "opaque" or "transparent") is typically implemented using a specific set of parameters, but we should not assume that a particular parameter configuration necessarily implies the object belongs to a certain category.
I think the design of an engine's API should be guided by semantics rather than being constrained by low-level implementation details. In other words, APIs should focus on expressing the intrinsic properties or intended behaviors of objects, rather than exposing low-level pipeline settings such as "depth_write" or blend method. These parameters should serve as implementation mechanisms—not as the basis for object categorization or interface design.

@almarklein
Member Author

Thanks for the explanation; I understand your point better now. It has everything to do with controlling the order that objects are rendered in. I'll give this some more thought ...

@almarklein
Member Author

I'm moving rather slowly since I am on holiday and am trying to get some Pygfx work done in the late evenings / early mornings.

I've updated this PR and #1144. I've not included a transparency property, because from the use-cases put forward it seems mostly used as a trick to force things to be rendered earlier or later than usual. Instead I added object.render_group, which allows controlling multiple "layers", as if using multiple calls to render(), except defined in the scene graph. This covers use-cases such as backgrounds and overlays.

I went a bit back and forth with naming things but I settled on:

  • alpha_mode: a string preset for the majority of cases (and to which we can add if we want).
  • alpha_config: a dict that fully describes the alpha handling in one place (instead of separate dicts for composite, dither, etc. which felt awkward).
  • alpha_method: represents the 'group': either "opaque", "composite", "stochastic" or "weighted".
  • Note that the names of alpha-method are more technical, while the names in alpha-mode are a bit more user-friendly, e.g. "solid" and "blend" (for the over-operator).

Further:

  • The renderer employs an opaque pass and a transparency pass, like most render engines do. Which pass an object is assigned to is based on its alpha-method, but users can override this in the alpha_config. Such cases should be really rare.
  • Objects with alpha-method 'weighted' are grouped together because of the way the z-sorting happens. Though they can still be "grouped" using ob.render_order.
  • I added alpha-mode 'auto' as a somewhat plug-n-play mode, especially for 2D-ish scenes. I'm not entirely happy about it, but I think it more or less represents the earlier behavior.

@almarklein
Member Author

mmm, things still don't feel right to me.

I've been reading a bit and I quite like Unity's renderQueue. Basically instead of ThreeJS's approach of having 2 queues (opaque and transparent), it has 5 builtin queues (including background and overlay), but users can also assign objects a renderQueue int value in between two builtin queues.

I feel like this is a quite elegant and flexible solution. Better than ThreeJS (setting .transparent and .renderOrder to try to control an object's render order), and also better than the ob.render_group I implemented, which tries to do a similar thing.

We can derive a suitable value from alpha_method and allow the user to set it to override it. The ob.render_order still has value for controlling the order within a queue, and be per-object instead of per-material.
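A Unity-style queue model as described above could be sketched like this (queue values mirror Unity's conventions; the dict-based objects and names are purely illustrative):

```python
# Built-in queue values, mirroring Unity's conventions (illustrative).
BACKGROUND, OPAQUE, ALPHA_TEST, TRANSPARENT, OVERLAY = 1000, 2000, 2450, 3000, 4000

def sort_key(ob):
    # Sort primarily by queue; render_order breaks ties within a queue.
    return (ob["render_queue"], ob["render_order"])

objects = [
    {"name": "ui", "render_queue": OVERLAY, "render_order": 0},
    {"name": "sky", "render_queue": BACKGROUND, "render_order": 0},
    {"name": "mesh", "render_queue": OPAQUE, "render_order": 0},
    {"name": "decal", "render_queue": OPAQUE + 1, "render_order": 0},  # just after opaques
]
print([ob["name"] for ob in sorted(objects, key=sort_key)])
# ['sky', 'mesh', 'decal', 'ui']
```

The in-between integer values (like `OPAQUE + 1` here) are what give users the flexibility to slot objects between the built-in queues.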

I'll have a stab at this soon, but figured I'd post the idea here to see what others think.

@almarklein
Member Author

I updated this pr and #1144. IMO the render_queue idea solves the whole render-order issue in an elegant way that can also be super-flexible for special use-cases. I hope we can start to wrap this up ...

@Korijn
Collaborator

Korijn commented Aug 10, 2025

I updated this pr and #1144. IMO the render_queue idea solves the whole render-order issue in an elegant way that can also be super-flexible for special use-cases. I hope we can start to wrap this up ...

To me this looks fantastic but I also think you should just move on and accept the outcome. In the interest of other priorities. :)

@panxinmiao
Contributor

I believe there is one more point that needs to be clarified and documented.

Specifically, which rendering passes/stages our renderer internally uses (currently, we have three passes/stages in #974: Opaque, Transparent, and Weighted),

and according to our API design, what rules govern the assignment of objects to these specific rendering passes/stages.

Additionally, is render_queue intended to be user-configurable? Since the different rendering stages are inherent internal passes of the renderer and cannot be reordered arbitrarily, could this conflict with render_queue?

In other words, if an object is assigned to the Transparent stage based on the above rules, but the user sets its render_queue to 2000, what would happen?

@Korijn
Collaborator

Korijn commented Aug 15, 2025

I guess it was a little frustrating process, but I am really impressed by the outcome. ❤️

@Korijn Korijn enabled auto-merge (squash) August 15, 2025 20:41
@Korijn Korijn merged commit afcf1a8 into main Aug 15, 2025
13 checks passed
@Korijn Korijn deleted the blend-docs branch August 15, 2025 20:48
@hmaarrfk
Contributor

uge

@almarklein
Member Author

Next up is the actual implementation, which is ready in #1144.

@panxinmiao
Contributor

Sorry to bring up this topic again, 😓

After several days of hands-on practice and investigation, I can confirm that the renderQueue here is not the object classification mechanism we need.

Essentially, it is a higher-level sorting mechanism. On the CPU side, it buckets and sorts all objects based on their renderQueue to determine the order and grouping strategy of DrawCalls.
Although it ultimately determines which rendering phase/Pass an object belongs to in the renderer, it functions more like an internal identifier (even though it can be directly specified by users).
It still doesn’t solve the fundamental question: “How are objects classified (i.e., assigned to a specific phase) by default?”
Even in Unity, users are required to set the Queue or RenderType tag in the material shader to determine the final RenderQueue.

  • "Opaque" → 2000
  • "TransparentCutout" → 2450
  • "Transparent" → 3000

The only difference is that Unity allows users to directly set the RenderQueue.

However, Queue or RenderType itself is explicit, unambiguous, and requires user input—the engine does not infer it (and in my opinion, cannot infer it, as it is essentially a meta-attribute that reflects user intent).
If not set, the engine simply assigns a default value (e.g., Queue=2000) without any additional inference.

The Queue or RenderType values—Opaque, TransparentCutout, and Transparent—are exactly the internal phases and object classifications we need in the renderer.
Just like the transparent property in three.js or the TransparencyMode property in Babylon.js, these are meta-attributes that determine user intent and directly assign objects to the appropriate rendering phase. I believe the engine should not attempt to infer them (and in fact, cannot reliably do so).

Additionally, while reviewing Babylon.js documentation, I noticed that TransparencyMode can be left empty, and Babylon.js uses a very complex (though still explicit) logic to automatically infer it—unlike other engines that simply assign a default value. I was initially curious why Babylon.js adopted this approach, but after some research, I believe it’s merely a historical burden.
As mentioned in BabylonJS/Babylon.js#16144, this might be removed in the future.


Our current issue is that although alpha_method seems usable as a classification basis, it is an internal value that cannot be set directly or independently, and it’s opaque to users.
Even if it can be configured via the alpha_config dictionary, alpha_method is tied to the GPU pipeline’s blending settings and cannot serve as an independent intent-based identifier for grouping or phase assignment. This brings us back to square one. I’ve expressed this viewpoint multiple times that internal renderer phase division and GPU blending configuration are orthogonal concepts and should not be forcibly coupled.

I’d like to reiterate: object classification and renderer phase division are based on the engine’s own scene rendering logic and design philosophy—they exist to enable specific engine capabilities, e.g., supporting Weighted Blend OIT objects by introducing a dedicated Weighted phase (the rationale for treating it as a separate phase can be discussed elsewhere).
Most engines classify objects into at least opaque and transparent categories (with corresponding internal render phases) because, in most scenarios, opaque and transparent objects require different processing logic. Moreover, between rendering opaque and transparent objects, the renderer may need to perform certain tasks (e.g., generating SSR-based transmissive light sampling maps for physically based transparency).

GPU blending, on the other hand, is a rendering pipeline capability with various applications. It is not synonymous with transparency or alpha blending, nor is it exclusive to a specific object category or render phase.

A simple example:
Even for opaque objects (from the renderer’s perspective, meaning they are rendered in the opaque phase—not necessarily visually opaque), users can leverage blending. A common use case is using additive blending for color accumulation.

Furthermore, if desired, users can achieve correct alpha-blended transparency effects even in this phase.
(Assuming source colors are pre-multiplied by α for simplicity)

When blending back-to-front iteratively (where dst is the accumulated result and src is the current fragment), the standard blend equation is:
dst ← src + (1 - src_α) * dst,
configured with:
SrcFactor = ONE, DstFactor = ONE_MINUS_SRC_ALPHA.

When blending front-to-back (typical for the opaque phase), use:
dst ← dst + (1 - dst_α) * src,
configured with:
SrcFactor = ONE_MINUS_DST_ALPHA, DstFactor = ONE.

The two are mathematically equivalent.
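The claimed equivalence can be checked numerically with a small sketch (single color channel, premultiplied colors; illustrative code only):

```python
def back_to_front(frags):
    """Blend dst <- src + (1 - src_a) * dst, iterating from farthest to nearest.
    frags: list of (premultiplied_color, alpha), ordered front to back."""
    dst = 0.0
    for c, a in reversed(frags):
        dst = c + (1.0 - a) * dst
    return dst

def front_to_back(frags):
    """Blend dst <- dst + (1 - dst_a) * src, iterating from nearest to farthest,
    while accumulating coverage in dst_a."""
    dst, dst_a = 0.0, 0.0
    for c, a in frags:
        dst = dst + (1.0 - dst_a) * c
        dst_a = dst_a + (1.0 - dst_a) * a
    return dst

frags = [(0.5, 0.5), (0.3, 1.0)]  # front: 50% coverage; back: fully opaque
print(back_to_front(frags), front_to_back(frags))  # both agree (≈ 0.65)
```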

However, if users do this, the renderer still treats it as an opaque object and processes it in the opaque phase. This means all engine logic after the opaque phase (e.g., generating SSR buffers for physically based transparency, special handling for transparent-phase objects like dual-side dual-pass rendering, etc.) occurs after this object is rendered.


At last, I feel the current design of concepts like alpha_mode, alpha_config, and alpha_method is overly complex—even for someone relatively familiar with the API like myself, it’s easy to get confused.

Additionally, while well-intentioned to minimize user cognitive load (by reducing the number of values users need to know and set), having auto behaviors for many enums may actually cause more confusion. The engine needs to know the user’s explicit intent, and users working directly with the engine should understand its characteristics. Higher-level applications built on top of the engine should be responsible for providing abstractions based on specific use cases.

This is similar to how developers using pygfx must understand what Geometry represents, what Material represents, and what vertex positions in Geometry mean—while end-users of applications built on pygfx don’t need to know these details.

For example, the current code defaults to alpha_mode='auto' but uses "blend" as the fallback behavior. I understand the reasoning: even for opaque objects, using blending won’t cause issues (since alpha=1), ensuring broad compatibility. However, this causes all default objects to be classified as transparent (renderQueue 3000) and assigned to the transparent phase (using the Transparent Pass), leading to unexpected results. It took me about an hour to figure out why my rendering result was incorrect.

I think, trying to “do everything for the user” in a rendering engine often leads to bigger problems.

@almarklein
Member Author

I have a feeling the confusion comes from a need to want to classify an object as opaque or transparent.

The short answer is this:

material.render_queue = 2000  # == material.transparent = False
material.render_queue = 3000  # == material.transparent = True

Yes, the value of render_queue is automatically derived from alpha_method, depth_write, and whether alpha test is used. But as far as "intent" goes, users should probably just set render_queue directly.
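The derivation described here might be sketched as follows (hypothetical rules and values, using the queue numbers mentioned elsewhere in this thread; not the actual Pygfx implementation):

```python
def derive_render_queue(alpha_method, depth_write, uses_alpha_test):
    """Guess a default render_queue from how alpha is handled (hypothetical rules)."""
    if alpha_method == "opaque":
        return 2450 if uses_alpha_test else 2000  # alpha-tested objects go a bit later
    if depth_write:
        return 2600  # blended but depth-writing: between opaque (2000) and transparent (3000)
    return 3000      # regular transparent

print(derive_render_queue("opaque", True, False))      # 2000
print(derive_render_queue("composite", True, False))   # 2600
print(derive_render_queue("composite", False, False))  # 3000
```

Users who want to express explicit intent would then override the derived value by setting material.render_queue directly.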

Even if [alpha_method] can be configured via the alpha_config dictionary, alpha_method is tied to the GPU pipeline’s blending settings and cannot serve as an independent intent-based identifier for grouping or phase assignment. This brings us back to square one.

You are correct that alpha-method is a poor identifier for intent by itself. You should probably set material.render_queue instead.

A simple example:
Even for opaque objects (from the renderer’s perspective, meaning they are rendered in the opaque phase—not necessarily visually opaque), users can leverage blending.
[...]
However, if users do this, the renderer still treats it as an opaque object and processes it in the opaque phase.

Same answer, set render_queue.

At last, I feel the current design of concepts like alpha_mode, alpha_config, and alpha_method is overly complex—even for someone relatively familiar with the API like myself, it’s easy to get confused.

I agree it's a rather complex system. I have struggled with this a lot, and it still frustrates me. I would very much like to present users with a system that Just Works. I actually think we could get close if we adopt techniques such as adaptive transparency, MLAB, etc., which insert output fragments into a multi-layer render target. But such techniques suffer a significant performance and memory hit. So I've somewhat accepted that we cannot really solve this problem for the user. Instead we offer more solutions than most engines do.

So the complexity comes partly from the fact that we have included support for stochastic and weighted methods. This offers more ways to deal with transparency, at the cost of a more complex API 🤷

having auto behaviors for many enums may actually cause more confusion.

Also true. I can then recommend explicitly setting most such enums. I think it makes sense for the docs to recommend explicit values to users.

BTW: the 'auto' alpha-mode also comes from the fact that we have multiple objects (lines, points, text) that have semi-transparent fragments for aa, even if the object is opaque. Such objects can still be rendered fine, especially in 2D scenarios.

@panxinmiao
Contributor

I believe that render_queue is not suitable as a user-facing API for expressing intent. It should be treated as an internal mechanism, since its meaning is neither clear nor intuitive to most users. In fact, the value of render_queue is usually derived from other properties, and users rarely need to set it directly.

As I understand it, automatic alpha blending was introduced mainly to support anti-aliasing (AA) for points, lines, text, and similar objects. However, this actually highlights that object classification/phasing and GPU blending capabilities are orthogonal concepts, and should not be coupled together.

For example, when the opacity=1, these objects should be classified as opaque, with no reason to place them in the transparent rendering stage. This applies even more strongly to Meshes: by default, they should not be treated as transparent objects (when opacity=1.0), since doing so violates user expectations, introduces potential issues, and conflicts with common usage patterns.

That said, it is reasonable for these 2D objects to require GPU blending when AA is enabled. In my view, such objects should still belong to the opaque category, but enabling AA should automatically configure the GPU blending state of their rendering pipeline (or adjust the corresponding material's GPU blending parameters). Additionally, because these 2D objects essentially reside on a flat z=0 plane, they neither need depth testing nor complex sorting. A simple sequential rendering order—based on layer or order—is sufficient.

@almarklein
Member Author

There is no "opaque rendering stage" or "transparent rendering stage". And there is no way to classify an object as either opaque or transparent.

There's two main things:

  • How is the alpha value used, e.g. blended, dithered, ignored. This is defined by alpha_config. Most users will set it via the alpha_mode presets.
  • In what order the objects are rendered. This is defined by render_queue. Its default value is derived by guessing intent from alpha mode and depth_write, covering common cases. Setting the value manually gives users a lot of flexibility in controlling object order.

There is also ob.render_order to control an object's order within a render queue. In ThreeJS this is used in certain cases to push objects to the end/beginning of the current pass. With the concept of the render_queue, users will need it less often.

For example, when the opacity=1, these objects should be classified as opaque, with no reason to place them in the transparent rendering stage.

To address this point specifically as an example: objects that are blended but also write depth, by default end up in render_queue 2600, which sits somewhere in between the default queues for opaque (2000) and transparent (3000) objects, and objects are sorted back-to-front (so that any aa edges blend correctly).

@almarklein
Member Author

There's two main things:

The alpha_config and render_queue are indeed orthogonal.

@panxinmiao
Contributor

There is no "opaque rendering stage" or "transparent rendering stage". And there is no way to classify an object as either opaque or transparent.

Dividing the rendering process into stages is a necessary prerequisite for implementing some modern real-time rendering techniques. I am certain that almost all modern rendering engines at least separate the opaque and transparent stages, since many techniques require the renderer to perform specific operations between these two stages—for example, generating the SSR buffer. This was also the original motivation behind introducing object classification and stage division in #974.

I will explain in detail in #974 when I have time.

@panxinmiao
Contributor

BTW: the 'auto' alpha-mode also comes from the fact that we have multiple objects (lines, points, text) that have semi-transparent fragments for aa, even if the object is opaque. Such objects can still be rendered fine, especially in 2D scenarios.

I’d also like to add a note here: as I understand it, the current "aa" for certain opaque objects (such as points, lines, and text) relies on transparent fragments, which makes the API design and implementation somewhat awkward.
In my view, the reason for adopting this fragment-based alpha blending approach to “aa” is largely due to the lack of a proper implementation of PMA (premultipliedAlpha). If I’m not mistaken, this “aa” is mainly intended to mitigate edge artifacts and black borders, but the same can be achieved correctly with PMA. This would make the notion of “opaque” more consistent across API semantics, visual results, and internal implementation. Combined with MSAA and other techniques, it can further improve the overall anti-aliasing quality.
My plan is to systematically investigate and refine the logic of premultipliedAlpha after resolving the ColorSpace-related issues, in order to establish a complete and correct workflow for textures and color management. Until then, including in changes like #974, I will aim to maintain compatibility and preserve the current “aa” behavior.
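For reference, the difference between straight-alpha and premultiplied-alpha (PMA) compositing can be illustrated numerically. This is a minimal sketch of the two "over" formulas, not tied to Pygfx's texture or color-management pipeline:

```python
# Minimal sketch: compositing a source fragment over a destination color,
# with straight alpha vs premultiplied alpha (PMA).
def over_straight(src_rgb, src_a, dst_rgb):
    # Classic alpha blending: src.rgb * src.a + dst.rgb * (1 - src.a)
    return tuple(s * src_a + d * (1 - src_a) for s, d in zip(src_rgb, dst_rgb))

def over_premultiplied(src_rgb_premul, src_a, dst_rgb):
    # PMA: the source rgb is already multiplied by its alpha,
    # so the blend is src.rgb + dst.rgb * (1 - src.a)
    return tuple(s + d * (1 - src_a) for s, d in zip(src_rgb_premul, dst_rgb))

# A half-covered red fragment over a white background gives the same
# result either way, as long as the inputs are in the matching form:
straight = over_straight((1.0, 0.0, 0.0), 0.5, (1.0, 1.0, 1.0))
premul = over_premultiplied((0.5, 0.0, 0.0), 0.5, (1.0, 1.0, 1.0))
```

The black-border artifacts mentioned above typically appear when straight-alpha data is filtered or blended as if it were premultiplied (or vice versa), which is why a consistent PMA workflow matters.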

@almarklein
Member Author

In my view, the reason for adopting this fragment-based alpha blending approach to “aa” is largely due to the lack of a proper implementation of PMA (premultipliedAlpha). If I’m not mistaken, this “aa” is mainly intended to mitigate edge artifacts and black borders, but the same can be achieved correctly with PMA. This would make the notion of “opaque” more consistent across API semantics,

The purpose of aa is to blend the edges of objects with the things rendered behind them, to avoid jaggies. It's a very efficient way to prevent aliasing, because in the shader you can compute the exact pixel coverage. The downside is that it needs blending, so it relies on sorting to prevent artifacts.
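The coverage computation can be illustrated with a tiny sketch. The actual Pygfx shaders are WGSL; this Python version only shows the idea of mapping the signed distance of a pixel center to the shape's edge (in pixel units) onto an alpha value:

```python
def edge_coverage(signed_distance_px: float) -> float:
    """Approximate pixel coverage from the signed distance (in pixels)
    of the pixel center to the shape's edge. Positive = inside the shape.
    A pixel whose center sits exactly on the edge is half covered."""
    return min(1.0, max(0.0, signed_distance_px + 0.5))
```

A pixel well inside the shape gets coverage 1.0, a pixel centered on the edge gets 0.5, and a pixel just outside gets 0.0, producing a one-pixel-wide smooth edge.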

I think premultiplied alpha is a different thing.

@almarklein
Member Author

copied from #974

render stages vs render_queue

I feel like they're two different ways to approach the same problem, where render_queue is actually the more powerful model. But now they're both used, which defeats the purpose of the render_queue. But I'm pretty sure this can be made to work without dividing objects into 4 render stages (opaque, transmissive, transparent, weighted)...

First the problems that I see with the currently proposed approach (in #974):

The three render stages representing the generally-transparent objects, are forced in the order: transmissive, transparent, weighted. Ignoring the render_queue set by the user.

You added a comment for the weighted blending, saying that it should absolutely not be mixed with other transparent objects. I think there are cases where it could, as long as both groups of objects are spatially separate. And the same argument can be made for transmissive vs transparent objects!

In the renderer logic in main, the iter_render_pipelines_per_pass_type basically dealt with this problem. I believe we can keep using that and include support for transmissive objects, by adding a pass_type "transmissive".

In that way, the renderer would first sort all objects in the normal way. Then iter_render_pipelines_per_pass_type iterates over them, grouping by pass_type. If the scene is "clean", you'll get at most one group of weighted and/or transmissive objects. But the user can also use render_queue to force multiple groups, and can control the order of transmissive/transparent/weighted objects. Or put all objects in the same render_queue, and let depth_sorting create the grouping, resulting in better quality for scenes with multiple "clusters" of transmissive objects.
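The grouping scheme described here could look roughly like the following. This is a sketch: the real `iter_render_pipelines_per_pass_type` in Pygfx differs in its details, and the `Ob` type here is made up for illustration:

```python
from collections import namedtuple
from itertools import groupby

Ob = namedtuple("Ob", ["name", "pass_type"])

def iter_groups_per_pass_type(sorted_objects):
    """Yield (pass_type, group-list) tuples for consecutive objects
    (already sorted by render_queue and depth) sharing a pass_type."""
    for pass_type, group in groupby(sorted_objects, key=lambda ob: ob.pass_type):
        yield pass_type, list(group)

# A "clean" scene yields one group per pass type:
scene = [Ob("floor", "opaque"), Ob("wall", "opaque"),
         Ob("glass", "transmissive"), Ob("smoke", "weighted")]
groups = [(t, [ob.name for ob in g]) for t, g in iter_groups_per_pass_type(scene)]
```

Because `groupby` only groups consecutive elements, the sorting step fully determines how many groups (and thus how many render-pass switches) occur.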

Some things that would be possible with this approach:

  • transparent/translucent/weighted objects can be used in the background.
  • In case of multiple translucent objects, e.g. multiple wine-glasses, the translucent objects could be put in a different render_queue than the normally-transparent objects, so they are rendered neatly in order (i.e. only one copy of the transmission texture)
  • Or they could be in the same render_queue, resulting in one copy of the transmission texture for each wine glass.
  • Maybe we can allow users to force a transmission-copy for each translucent object to get more accurate results (at the cost of performance), by putting each object in a different render_queue.

@panxinmiao
Contributor

But I'm pretty sure this can be made to work without dividing objects into 4 render stages (opaque, transmissive, transparent, weighted)...

The rendering process is currently divided into three stages: opaque, transparent, and weighted blend. Transmissive and transparent objects are handled within the same rendering stage/render pass. However, I deliberately sort transmissive objects before regular transparent ones using renderQueue. This ordering brings two benefits, as transmissive objects typically do not rely on GPU blending and usually have depth_write=True (from the perspective of the GPU rendering pipeline, they behave like "opaque" objects):

  • When depth_write = True, it allows better utilization of early-Z testing.

  • When users intentionally set its depth_write = False, it can, to some extent (though not perfectly), simulate the visibility of regular transparent objects behind them.

The three render stages representing the generally-transparent objects, are forced in the order: transmissive, transparent, weighted. Ignoring the render_queue set by the user.

This is exactly the issue I mentioned earlier: renderQueue is essentially a higher-level sorting mechanism, while the division of rendering stages is an inherent part of the renderer’s logic. No matter how renderQueue is adjusted, the sorting only takes effect within a given rendering stage, and cannot cross stage boundaries.
That said, in the current implementation, the distinction between opaque and transparent stages is determined by whether renderQueue < 2500. As a result, apart from weighted blend objects, the rendering order can be regarded as fully following the user-defined renderQueue; weighted blend objects are the only exception.


Before considering how to handle Weighted Blended objects, I attempted to research the implementations of current mainstream rendering engines. However, I found that the vast majority of engines do not have built-in OIT (Order-Independent Transparency) pipelines. Even when they do, OIT is only treated as a global transparent rendering solution, separate from traditional sorting-based transparent methods. Therefore, there is a lack of readily available reference cases or best practices for using Weighted Blended objects alongside ordinary alpha blend objects.

We aim to support Weighted Blended OIT and alpha blend objects simultaneously to some extent. However, the unique aspect of Weighted Blended OIT is its inherently multi-pass nature (at least requiring Pass A: rendering OIT objects to an accumulation/revealage buffer; Pass B: compositing to the main framebuffer via a full-screen pass). This differs from the standard forward rendering process. When mixing Weighted Blended OIT objects with ordinary alpha blend objects, frequent GPU rendering state switches are difficult to avoid:

  • Rendering OIT objects requires switching to an OIT-specific render pass (binding the accumulation/revealage buffers).
  • Rendering ordinary alpha objects requires switching back to the main framebuffer's render pass.

If these two types of objects appear alternately, it will trigger expensive render pass switches.
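For context, the composite in Pass B is order-independent by construction. A minimal numeric sketch of the weighted-blended resolve (McGuire & Bavoil's formulation, simplified to a single color channel and unit weights; not the actual shader code):

```python
def weighted_blend_composite(fragments, background):
    """Order-independent composite of (color, alpha) fragments.
    Pass A accumulates sum(color * alpha) and product(1 - alpha);
    Pass B resolves: accum / total_alpha, blended over the background.
    Per-fragment weights are taken as 1 here for simplicity."""
    accum_color = sum(c * a for c, a in fragments)
    accum_alpha = sum(a for _, a in fragments)
    revealage = 1.0
    for _, a in fragments:
        revealage *= 1.0 - a  # fraction of background still visible
    avg = accum_color / max(accum_alpha, 1e-8)
    return avg * (1.0 - revealage) + background * revealage

# The result is independent of the fragment order:
a = weighted_blend_composite([(1.0, 0.5), (0.0, 0.5)], background=0.25)
b = weighted_blend_composite([(0.0, 0.5), (1.0, 0.5)], background=0.25)
```

Since sums and products commute, reordering the fragments cannot change the result, which is exactly why these objects need no depth sorting among themselves.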

A reasonable idea is: since Weighted Blended OIT objects do not participate in depth sorting themselves, there is no need to handle interleaving scenarios between them and ordinary alpha blend objects. Therefore, all OIT objects can be processed uniformly at a certain stage of the rendering pipeline, thereby avoiding the complexity introduced by their multi-pass nature—that is, treating OIT as a completely independent rendering stage, decoupled from the rendering process of other objects.

Based on this, categorizing Weighted Blended OIT objects as a separate type and assigning them an independent rendering stage is a natural implementation. Although some users might desire the flexibility to manually specify the rendering order to achieve interleaved rendering of OIT and ordinary transparent objects, the value of this flexibility is likely outweighed by the performance cost and implementation complexity it introduces. Frequent render pass switches are not only prone to unintentional misuse by users, leading to performance degradation, but mixing Weighted Blended objects with ordinary alpha blend objects might also cause unexpected rendering issues. Furthermore, this would impose constraints on the engine's feature expansion and maintenance. For example, adding support for two-pass rendering of double-sided transparent objects would necessitate additional handling for compatibility with Weighted Blended OIT objects, introducing unnecessary complexity.

I believe this design choice is essentially an engineering trade-off. The core of real-time rendering technology lies in finding a balance between visual realism, performance cost, and implementation complexity. This is also the fundamental distinction between modern real-time rendering and offline rendering. While the basic laws of physics are inherently simple—light propagation, energy conservation, laws of reflection and refraction, all expressible precisely in mathematics—offline rendering (e.g., path tracing, Monte Carlo integration) tends to follow these physical laws directly, approximating the real world through extensive sampling and computation. In contrast, real-time rendering heavily employs approximations and simplifications (e.g., BRDF approximations, precomputed IBL) to achieve acceptable visual quality within the constraints of limited performance budgets and engineering complexity. Handling Weighted Blended OIT as a separate stage adheres to this philosophy.

A similar trade-off is evident in the handling of transmissive light in #974. For transmissive objects, screen-space refraction (SSR) is needed to gather information about the scene behind them.
I initially considered generating a separate scene framebuffer for each transmissive object, but this would be too expensive. Each transmissive object would require re-rendering the entire scene, which is impractical for real-time rendering.
Subsequently, I thought about utilizing transparency sorting and layer-by-layer accumulation (hence the most significant change initially made to the Pygfx renderer in this PR was the requirement for transparency sorting). The result after drawing each object could serve as the background for the next object, thus avoiding the costly overhead of rendering the scene multiple times. However, since the GPU does not allow simultaneous reading and writing to the same texture, double buffering (ping-pong buffers) or an intermediate copy mechanism would be required. This would still cause render pass interruptions. When there are multiple transmissive objects in the scene, although it wouldn't require extra re-rendering of the scene, frequent render pass interruptions and switches would still lead to performance degradation (This is the same as the problem with weighted blended objects.). Moreover, this approach still cannot handle cases where transmissive objects intersect or self-occlude.

Therefore, I tried looking at implementations in mature engines, and the final answer was straightforward. The methods adopted by the reference implementation of the glTF 2.0 Transmission extension, as well as engines like Three.js, Babylon.js, and Unity HDRP, are all very "simple and direct": after all opaque objects are rendered, a copy of the background color buffer is generated, and all transmissive objects share this buffer for sampling. Although this scheme cannot handle the superposition of multiple layers of transmissive objects, it is simple, efficient, and practical.

I believe this is a simple trade-off, and one that makes sense. Firstly, in the vast majority of scenarios, there won't be overlapping and superposition of multiple transmissive objects. Furthermore, adopting this fixed strategy also inversely requires scene designers and technical artists to optimize scene design accordingly, rather than abusing the feature. Content creators inherently need to optimize asset structure and scene layout within the engine's capabilities to achieve the best visual results.

All these engines divide the rendering process into multiple stages—at least an opaque stage and a transparent stage—and perform specific tasks between these stages based on logical requirements. Generating the SSR framebuffer is one such task. Some global illumination, AO, volumetric fog, and even certain post-processing effects also need to be inserted between the opaque and transparent stages. Unity even exposes injection points between these stages to let users plug in custom logic or passes.

@panxinmiao
Contributor

Sorry for posting such a "long essay" without properly dividing it into paragraphs and section titles 😅. I'll add some examples and supporting facts to certain arguments and viewpoints in the text when I have more time.

@almarklein
Member Author

No matter how renderQueue is adjusted, the sorting only takes effect within a given rendering stage, and cannot cross stage boundaries.

I think you're confusing material.renderQueue with ob.renderOrder here. The renderOrder sorts objects within a pass/queue, and the renderQueue is the pass/stage.

From what I understand of how Unity works: it sorts all objects based on the renderQueue, then on depth. Then per renderQueue it renders all objects in that queue, which can be a mix of objects requiring different render passes. The SSR texture is copied right when the first (transparent) object that needs it is to be rendered.

Therefore, I tried looking at implementations in mature engines, and the final answer was straightforward. The methods adopted by the reference implementation of the glTF 2.0 Transmission extension, as well as engines like Three.js, Babylon.js, and Unity HDRP, are all very "simple and direct": after all opaque objects are rendered, a copy of the background color buffer is generated, and all transmissive objects share this buffer for sampling.

Awesome! This can be done perfectly fine with the way we iterate over objects in current main. What we'd have to do, is copy the SSR texture as soon as it encounters an object that needs it.

Imagine the case where a user does not set render_queue, so all transparent (normal and translucent) are in 3000. So right before the transparent objects are rendered, we copy the SSR texture. This is quite literally the behavior that you call the "simple and direct" approach in your comment.

Although this scheme cannot handle the superposition of multiple layers of transmissive objects [...]

But with render_queue we can! If we copy the SSR texture for every renderQueue that contains objects that need it, the user can create scenes as complex as he wants. Also, the user can easily control whether e.g. weighted objects are drawn before or after, by setting renderQueue.

As for mixing the different kinds of transparent objects in a single render_queue. I don't really see a problem. The normal and translucent objects can be mixed, as long as we just copy the SSR texture beforehand. The weighted objects are also fine, because they get grouped separately from the other objects automatically because of the depth-sorting (their key is zero).

So there is no need to fear these "render pass interruptions"; by default there would be just one SSR copy, and one group of weighted objects (if any). Only when the user uses render_queue can there be multiple such passes, enabling better rendering of specific scenes.
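The "copy right before the first object that needs it" behavior described here could be sketched as follows (hypothetical helper names and object attributes, not the actual renderer code):

```python
def render_sorted(objects, copy_ssr_texture, draw):
    """Render objects in their sorted order, copying the color buffer
    into the SSR texture once, right before the first object that
    samples it. Returns the number of copies made."""
    ssr_copied = False
    copies = 0
    for ob in objects:
        if getattr(ob, "needs_ssr", False) and not ssr_copied:
            copy_ssr_texture()  # snapshot of everything drawn so far
            copies += 1
            ssr_copied = True
        draw(ob)
    return copies
```

In the default case (all transmissive objects grouped together by the sort) this yields exactly one copy per frame, matching the "simple and direct" approach of the mature engines cited below.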

@panxinmiao
Contributor

panxinmiao commented Sep 8, 2025

In fact, the engine is deliberately designed to avoid creating multiple SSR textures for different transmissive objects.

In the rendering pipeline, a full-resolution screen-sized copy is already a very heavy operation, and generating mipmaps for the SSR color buffer adds even more overhead. When multiple transmissive objects are present, this quickly becomes a performance bottleneck.
Moreover, in the transparent/transmissive stage, the engine sorts all objects (back-to-front) and renders them in batches. If a “copy framebuffer” barrier were inserted every time a transmissive object is encountered, it would break the rendering pipeline, disrupt batching, and cause a GPU pipeline flush — which is extremely costly.

I can confirm that Unity HDRP only maintains a single SSR buffer, generated at a fixed stage in the renderer. Unity also provides a BeforePreRefraction injection point, which is specifically placed before SSR generation.

This is the documentation for Unity HDRP:

https://docs.unity3d.com/Packages/com.unity.render-pipelines.high-definition@17.3/manual/how-hdrp-calculates-color-for-reflection-and-refraction.html

The color buffer HDRP uses is the first color pyramid that contains only opaque objects, so refractive objects won't be visible through other refractive objects.

https://docs.unity3d.com/Packages/com.unity.render-pipelines.high-definition@17.3/manual/Custom-Pass-buffers-pyramids.html#depth-pyramid-and-color-pyramid-generation-in-hdrp

HDRP generates color pyramids at these points in the rendering pipeline:

  1. After the BeforePreRefraction injection point (in Color Pyramid PreRefraction). This contains opaque objects and objects rendered in BeforePreRefraction and is used for Screen space reflections next frame and the refraction effect in the current frame.

Blender seems to be the same:
https://community.khronos.org/t/objects-with-transmission-hiding-other-objects-with-transmission/107669

I believe the same principle applies to weighted blended objects: they should be handled in a single batch. Otherwise, the overhead of frequently interrupting the rendering pipeline and switching rendering states would far outweigh any potential benefits.

But with render_queue we can! If we copy the SSR texture for every renderQueue that contains objects that need it, the user can create scenes as complex as he wants...

As I mentioned earlier, I believe these measures are, in fact, intentional restrictions on users creating arbitrarily complex scenes. Such complexity is usually unnecessary and may be abused, and the engine may in fact not be able to handle it well, which could lead to potential issues.

@panxinmiao
Contributor

panxinmiao commented Sep 9, 2025

I think you're confusing material.renderQueue with ob.renderOrder here. The renderOrder sorts objects within a pass/queue, and the renderQueue is the pass/stage.

https://docs.unity3d.com/Packages/com.unity.render-pipelines.high-definition%4017.0/manual/rendering-execution-order.html

This documentation clearly outlines Unity’s rendering pipeline stages, including the fundamental opaque and transparent stages, and further explains the exact timing of tasks executed within or between these stages. In the transparent stage, Unity also processes refraction first and then transparents, which is consistent with our logic.

As I mentioned earlier: RenderQueue is essentially a higher-level sorting mechanism, while the division of rendering stages is an inherent part of the renderer’s logic.

I haven’t found detailed documentation on how Unity assigns renderable objects to different rendering stages, but I believe the division is based on whether the object’s renderQueue is greater than 2500 (opaque stage or transparent stage). Therefore, it can also be said that RenderQueue determines which rendering stage an object is assigned to.

https://docs.unity3d.com/ScriptReference/Rendering.RenderQueue.html
The documentation emphasizes that users should “Use the render queue section that matches the object type”, and it also specifies that 2500 is the last render queue treated as opaque.
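Unity's convention can be expressed as a one-liner. This is a sketch of the convention described in the documentation, not Unity code:

```python
# Unity convention: queues up to and including 2500 are treated as
# opaque; everything above renders in the transparent stage.
def unity_stage(render_queue: int) -> str:
    return "opaque" if render_queue <= 2500 else "transparent"
```

Under this convention both 2600 and 3000 land in the transparent stage, with lower values rendered earlier within it.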

@almarklein
Member Author

I believe these measures are, in fact, intentional restrictions on users creating arbitrarily complex scenes. Such complexity is usually unnecessary

Sure, any added flexibility is also an opportunity for users to do something stupid 😉. I think the design would naturally allow users to create advanced use-cases. But I'm fine with restricting the SSR texture to being created only the first time it's needed.

Therefore, it can also be said that RenderQueue determines which rendering stage an object is assigned to.

Yes, you could say that. I'd even say there is not really an explicit "rendering stage". Objects are sorted by render_queue, and some things (like whether objects sort front-to-back) differ based on certain ranges of the render-queue. In the common case where objects only use render-queues 2000 and 3000 (and maybe 1000 for the background), you have exactly the same mechanics as a "two-part render stage".

@panxinmiao
Contributor

Yes, you could say that. I'd even say there is not really an explicit "rendering stage".

No, for a renderer, rendering stages are explicitly defined—this is a fundamental aspect of pipeline design. Unity follows the same principle. Tasks such as SSR generation are also executed at specific points in time. This has already been explained in the Unity document that I mentioned earlier:
https://docs.unity3d.com/Packages/com.unity.render-pipelines.high-definition%4017.0/manual/rendering-execution-order.html

https://docs.unity3d.com/ScriptReference/Rendering.RenderQueue.html

According to Unity’s documentation, whether an object belongs to the transparent stage is determined by whether its RenderQueue value is greater than 2500. In other words, both 2600 and 3000 are rendered in the transparent stage, with lower values rendered earlier. RenderQueue itself is not the render stage; rather, it is a higher-level sorting mechanism within a stage, intended to provide flexibility for batching optimizations. For most users, Unity recommends following the predefined RenderQueue conventions to avoid potential rendering issues.
