This file will host the series of functions that implement OMM rasterization, given alpha texture data as the source. For now the functions are not exposed in meshoptimizer.h as the interface will be highly unstable. In this change, a simple recursive subdivision is used to generate 2-state OMM data. In the recursive invocations, we sample alpha data for the corners of micro-triangles to pass them to the following level; this reuses sampled positions to some extent, although the total number of samples is still a little more than optimal. For now, the resulting alpha samples are simply averaged together with the center sample and thresholded against 0.5 to determine the triangle state. This can be improved in the future.
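The recursion described above can be sketched as follows. This is a minimal illustration rather than the code in this change: `Corner`, `AlphaFn`, `midpoint` and `rasterize2` are hypothetical names, the alpha source is an arbitrary callback instead of a texture, states are appended in plain recursion order (not the micro-triangle order that OMM formats require), and treating an average of exactly 0.5 as opaque is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical corner record: UV position plus the alpha sampled there.
struct Corner { float u, v, a; };

// Hypothetical alpha source; a real rasterizer would sample a texture.
using AlphaFn = std::function<float(float, float)>;

static Corner midpoint(const Corner& p, const Corner& q, const AlphaFn& alpha)
{
    float u = (p.u + q.u) * 0.5f, v = (p.v + q.v) * 0.5f;
    return {u, v, alpha(u, v)};
}

// Appends one 0/1 state per micro-triangle, 4^level states in total.
static void rasterize2(const Corner& c0, const Corner& c1, const Corner& c2,
                       int level, const AlphaFn& alpha, std::vector<unsigned char>& out)
{
    assert(level >= 0);

    if (level == 0)
    {
        // Leaf: average the three corner samples with one center sample.
        float ac = alpha((c0.u + c1.u + c2.u) / 3.f, (c0.v + c1.v + c2.v) / 3.f);
        float avg = (c0.a + c1.a + c2.a + ac) * 0.25f;
        out.push_back(avg >= 0.5f ? 1 : 0); // assumption: 0.5 counts as opaque
        return;
    }

    // Edge midpoints are sampled once here and reused by the children,
    // so corner samples are shared between sibling micro-triangles.
    Corner m01 = midpoint(c0, c1, alpha);
    Corner m12 = midpoint(c1, c2, alpha);
    Corner m20 = midpoint(c2, c0, alpha);

    rasterize2(c0, m01, m20, level - 1, alpha, out);
    rasterize2(m01, c1, m12, level - 1, alpha, out);
    rasterize2(m01, m12, m20, level - 1, alpha, out);
    rasterize2(m20, m12, c2, level - 1, alpha, out);
}
```

Midpoints on edges shared between two parent triangles are still sampled once per parent in this sketch, which is one reason the total sample count stays above the optimum.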
The rasterization for 4-state OMM can encode the same opaque/transparent data with an extra "unknown" bit. In the shader, we can then either use the 4-state data as is (getting anyhit invocations for unknown state), or force it to be 2-state, in which case the two unknown states are converted to a known opaque or transparent. For determining known state, for now we use thresholds closer to the 0/1 cutoff to get a more conservative estimate. It's still the case that the triangle corners and centers could miss fine texture detail and the result will not be conservative. For unknown state, we use the same formula as the 2-state raster for improved consistency.
For now we assert on D3D11-like texture limits and reasonable limits for stride/pitch to prevent accidental mistakes.
Since the parameters in recursive calls are mostly shared between successive calls and there are too many of them to fit into registers, it's more efficient to pass an array of 3 floats (U/V/alpha) for each corner. This also makes it somewhat easier to follow the logic in the function, although in some places having just one index vs cI[J] was cleaner.
Given the input UV coordinates, we can compute the optimal subdivision level for each triangle based on the target size of each triangle in texels. For now we round the resulting log2 ratio to nearest; this tends to keep the average size a little closer to target compared to using floor/ceil. The subdiv level is additionally constrained to a given maximum; when target_edge is 0, all triangles are subdivided uniformly.
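As a rough sketch of the level selection, assuming the triangle size is summarized by a single edge length in texels (the exact size metric is not stated here): `subdivisionLevel` and its parameters are hypothetical names, and mapping non-finite inputs to level 0 is an assumption of this sketch.

```cpp
#include <cassert>
#include <cmath>

// Picks a subdivision level so that micro-triangle edges land near
// target_edge texels. Each level halves the edge length, so the level is
// log2(edge / target), rounded to nearest and clamped to [0, max_level].
static int subdivisionLevel(float edge_texels, float target_edge, int max_level)
{
    assert(max_level >= 0);

    // target_edge == 0 requests uniform subdivision at the maximum level.
    if (target_edge <= 0.f)
        return max_level;

    float ratio = edge_texels / target_edge;

    // Assumption: NaN/inf inputs and triangles already at or below the
    // target size all map to level 0.
    if (!std::isfinite(ratio) || ratio <= 1.f)
        return 0;

    // ratio > 1 means log2 is positive, so adding 0.5 and truncating
    // rounds to nearest.
    int level = int(std::log2(ratio) + 0.5f);
    return level < max_level ? level : max_level;
}
```

For example, a 16-texel edge with a 2-texel target gives a ratio of 8 and therefore level 3, while an 11-texel edge gives log2(5.5) ≈ 2.46, which rounds down to level 2.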
When rasterizing micro-triangles, ideally we'd like each microtriangle to be 2x2-3x3 texels. Because rasterization doesn't examine the entire contents of the microtriangle, using mips that are too detailed could lead to missing features in the opacity map. This function can be used before calling opacityMapRasterize to pass mip level data and width/height instead of the original texture for best results.
Instead of a 1-1 mapping, opacityMapMeasure now generates a unique index map based on UV equality. This significantly reduces the number of rasterization requests on real-world meshes; while post-rasterization deduplication is still important and will be added later, we need both to be able to reach reasonable rasterization times. For now we use the same hash scaffolding as some other files in the library do; this might need to be tweaked a little bit in the future to either change the table to store indices, or to simplify the operator== interaction.
In case the input UVs have small jitter, we quantize them to a subpixel grid before hashing. While this runs a small risk of collapsing different triangles together, in practice our rasterization algorithm is not sensitive enough to pick up the differences and these collisions are not common. We currently use a 4x4 subpixel grid (loosely equivalent to 16x MSAA).
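One way to express the 4x4 subpixel quantization, as a sketch only: `quantizeCoord` is a hypothetical helper, the rounding mode is an assumption, and inputs are assumed to be finite and small enough to fit in a 32-bit integer after scaling.

```cpp
#include <cassert>
#include <cstdint>

// Snaps a UV coordinate to a grid with 4 steps per texel along one axis,
// so two coordinates that differ only by small jitter hash identically.
static int32_t quantizeCoord(float uv, unsigned int size)
{
    // D3D11-style dimension limit, assumed here for illustration.
    assert(size > 0 && size <= 16384);

    float scaled = uv * float(size) * 4.f;

    // Round to nearest, away from zero on ties, using only a cast.
    return int32_t(scaled + (scaled >= 0.f ? 0.5f : -0.5f));
}
```

With a 256-texel texture, both 0.5 and 0.50001 quantize to 512, so triangles whose UVs differ only by that jitter produce the same hash key.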
Instead of using the hash table to store triangle data directly, we store it in an array on the side and just store indices in the hash table. This results in a reduced memory consumption (as we only need to round the index store up to a power of two), much better cache locality when the triangle reduction rate is significant, and works cleanly with our canonical hashLookup templated interface without operator== hacks.
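The index-based layout can be illustrated with a small open-addressing table. This is a sketch under assumptions, not the library's `hashLookup`: `TriKey`, `hashKey`, `equalKey` and `buildUniqueMap` are hypothetical names, and the hash is a simple FNV-style mix chosen for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical key: three quantized UV pairs of one triangle.
struct TriKey { int32_t uv[6]; };

static bool equalKey(const TriKey& a, const TriKey& b)
{
    for (int i = 0; i < 6; ++i)
        if (a.uv[i] != b.uv[i])
            return false;
    return true;
}

static uint32_t hashKey(const TriKey& k)
{
    uint32_t h = 2166136261u; // FNV-style mix, for illustration only
    for (int i = 0; i < 6; ++i)
    {
        h ^= uint32_t(k.uv[i]);
        h *= 16777619u;
    }
    return h;
}

// Returns, for each input triangle, the index of its unique representative.
// Unique keys live in a side array; the table stores only 32-bit indices.
static std::vector<unsigned int> buildUniqueMap(const std::vector<TriKey>& tris, std::vector<TriKey>& unique)
{
    // Only the index table is rounded up to a power of two.
    size_t buckets = 1;
    while (buckets < tris.size() + tris.size() / 4 + 1)
        buckets *= 2;

    const unsigned int EMPTY = ~0u;
    std::vector<unsigned int> table(buckets, EMPTY);
    std::vector<unsigned int> remap(tris.size());

    for (size_t i = 0; i < tris.size(); ++i)
    {
        size_t slot = hashKey(tris[i]) & (buckets - 1);

        for (size_t probe = 0;; ++probe)
        {
            unsigned int e = table[slot];

            if (e == EMPTY)
            {
                // First occurrence: append to the side array.
                table[slot] = remap[i] = unsigned(unique.size());
                unique.push_back(tris[i]);
                break;
            }

            if (equalKey(unique[e], tris[i]))
            {
                remap[i] = e;
                break;
            }

            // Triangular probing visits every slot of a power-of-two table.
            slot = (slot + probe + 1) & (buckets - 1);
        }
    }

    return remap;
}
```

When most triangles collapse to a few unique keys, the side array stays small and dense, which is where the cache-locality benefit comes from.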
If input UVs have inf/nans then depending on fp modes we might end up with a negative level; previously we were clamping that to max_level but on further reflection it seems better to explicitly clamp to 0.
After rasterization, it's not uncommon to see identical rasterized results for triangles that were distinct in source data. This is redundant and can be fixed by deduplicating the bits and adjusting OMM indices to point to the new data. The new function, meshopt_opacityMapCompact, does just that. It expects compactly stored source data and re-compacts it, adjusting offsets, levels and external indices to match. The return value is the number of resulting OMM entries, which is a little awkward because the caller has to compute the resulting data size manually from the last offset (similarly to buildMeshlets).
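The idea behind compaction can be sketched with standard containers. This is not the interface or implementation of meshopt_opacityMapCompact: `MapEntry` and `compactEntries` are hypothetical, each entry carries an explicit byte size for simplicity, and negative indices are assumed to be special indices that must be preserved.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical entry record: where the bits live and at which level.
struct MapEntry { unsigned int offset; unsigned int size; int level; };

// Deduplicates entries with identical level and bits, rewrites data and
// entries compactly, and remaps per-triangle indices. Returns entry count.
static size_t compactEntries(std::vector<unsigned char>& data, std::vector<MapEntry>& entries, std::vector<int>& indices)
{
    std::map<std::pair<int, std::string>, int> seen;
    std::vector<int> remap(entries.size());
    std::vector<unsigned char> out_data;
    std::vector<MapEntry> out_entries;

    for (size_t i = 0; i < entries.size(); ++i)
    {
        const MapEntry& e = entries[i];
        assert(size_t(e.offset) + e.size <= data.size());

        std::string bytes(data.begin() + e.offset, data.begin() + e.offset + e.size);
        std::pair<int, std::string> key(e.level, bytes);

        std::map<std::pair<int, std::string>, int>::iterator it = seen.find(key);
        if (it == seen.end())
        {
            // New bit pattern: append it at the next compact offset.
            int id = int(out_entries.size());
            seen.insert(std::make_pair(key, id));
            out_entries.push_back(MapEntry{unsigned(out_data.size()), e.size, e.level});
            out_data.insert(out_data.end(), bytes.begin(), bytes.end());
            remap[i] = id;
        }
        else
        {
            remap[i] = it->second;
        }
    }

    // Negative values are assumed to be special indices and are kept as is.
    for (size_t i = 0; i < indices.size(); ++i)
        if (indices[i] >= 0)
            indices[i] = remap[size_t(indices[i])];

    data.swap(out_data);
    entries.swap(out_entries);
    return entries.size();
}
```

In this sketch the compacted data size is simply `data.size()` after the call, which sidesteps the last-offset computation mentioned above.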
If input triangles have the same state for each microtriangle, they can use a special index that takes no bytes of actual data storage. We now implement this as part of compaction. All level 0 triangles can be represented with a special index; level 1 and above need to be checked. We can mostly do this checking byte-wise, except state=2 level=1 triangles that only use 4 bits out of a zero byte and require special treatment. Also clamp mips returned from RasterizeMip to 0.
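A sketch of the uniformity check, under stated assumptions: `uniformSpecialIndex` is a hypothetical name, states are assumed to be packed starting from the low bits of each byte, and the return values follow the Vulkan special index constants (-1 fully transparent, -2 fully opaque, -3 fully unknown transparent, -4 fully unknown opaque), with 0 meaning that no special index applies.

```cpp
#include <cassert>
#include <cstddef>

// Returns -1..-4 when every micro-triangle has the same state, else 0.
// states is 2 (1 bit per micro-triangle) or 4 (2 bits per micro-triangle).
static int uniformSpecialIndex(const unsigned char* data, int level, int states)
{
    assert(states == 2 || states == 4);
    assert(level >= 0 && level <= 15); // keeps the shift below in range

    unsigned int bits_per_state = states == 2 ? 1u : 2u;
    size_t bits = (size_t(1) << (2 * level)) * bits_per_state;
    unsigned int first = data[0] & ((1u << bits_per_state) - 1);

    if (bits < 8)
    {
        // Sub-byte maps (level 0, and 2-state level 1 with 4 bits):
        // compare only the used bits against the replicated first state.
        unsigned int used = data[0] & ((1u << bits) - 1);
        unsigned int expect = 0;
        for (size_t i = 0; i < bits; i += bits_per_state)
            expect |= first << i;
        return used == expect ? -1 - int(first) : 0;
    }

    // Whole-byte maps: every byte must equal the replicated state pattern
    // (0x00/0xff for 2-state, 0x00/0x55/0xaa/0xff for 4-state).
    unsigned char pattern = states == 2 ? (unsigned char)(first ? 0xff : 0x00) : (unsigned char)(first * 0x55);
    for (size_t i = 0; i < bits / 8; ++i)
        if (data[i] != pattern)
            return 0;

    return -1 - int(first);
}
```

Level 0 maps hold a single state, so the sub-byte branch always reports them as uniform.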
We need bilinear filtering to be able to more accurately classify triangle data. In addition, this change switches from border addressing to a mixed wrap/clamp addressing: the source texture coordinate is wrapped, but we never filter across the edge so all 4 samples are local. This is helpful for performance and also means the code works better for use cases where the source texture is *not* wrapped in real rendering: as long as source coordinates are in 0..1 range, the samples will work as if addressing was clamp-to-edge. The implementation is careful to avoid using floor (which is a function call depending on compiler flags and target instruction set).
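The addressing scheme can be illustrated with a single-channel sampler. This is a sketch, not the code in this change: `wrap01` and `sampleAlpha` are hypothetical names, the texture is assumed to be tightly packed 8-bit alpha, coordinates are assumed to be finite and of moderate magnitude, and leaving values inside [0, 1] unwrapped (so that 1.0 reaches the last texel) is an assumption.

```cpp
#include <cassert>

// Wraps a coordinate into [0, 1] without calling floor().
static float wrap01(float t)
{
    if (t >= 0.f && t <= 1.f)
        return t;
    float f = t - float(int(t)); // the cast truncates toward zero
    return f < 0.f ? f + 1.f : f;
}

// Bilinear sample of an 8-bit alpha texture. The coordinate is wrapped,
// but the 2x2 footprint is clamped so it never crosses a texture edge.
static float sampleAlpha(const unsigned char* texels, int width, int height, float u, float v)
{
    assert(width > 0 && height > 0);

    float x = wrap01(u) * float(width) - 0.5f;
    float y = wrap01(v) * float(height) - 0.5f;

    // Clamp to texel centers; afterwards x and y are non-negative,
    // so integer truncation is equivalent to floor.
    if (x < 0.f) x = 0.f;
    if (y < 0.f) y = 0.f;
    if (x > float(width - 1)) x = float(width - 1);
    if (y > float(height - 1)) y = float(height - 1);

    int x0 = int(x), y0 = int(y);
    int x1 = x0 + 1 < width ? x0 + 1 : x0;
    int y1 = y0 + 1 < height ? y0 + 1 : y0;
    float fx = x - float(x0), fy = y - float(y0);

    float a00 = texels[y0 * width + x0], a10 = texels[y0 * width + x1];
    float a01 = texels[y1 * width + x0], a11 = texels[y1 * width + x1];

    float top = a00 + (a10 - a00) * fx;
    float bottom = a01 + (a11 - a01) * fx;
    return (top + (bottom - top) * fy) / 255.f;
}
```

For coordinates inside 0..1 the sampler behaves like clamp-to-edge, while a coordinate such as 1.25 reads the same texels as 0.25.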
Instead of a straight average, we weigh the center sample more strongly to improve the estimation. These weights were derived by training a degenerate "network" (4=>1 reduction) to estimate coverage from a0/a1/a2/ac, and truncating the weights to two decimal digits. In the future it might make sense to switch to a slightly more complex estimator; e.g. 4=>3=>1 with intermediate ReLU provides a little higher quality results; however, for now the extra gains aren't obviously worth the extra evaluation cost, and we might need more input samples to achieve significantly better results. Also adjust the mip selection to round level further downwards; with bilinear filtering, we would actually prefer something like 3x3 footprint for microtriangles as 4 samples cover that pretty well.
This change adds "sources" to meshopt_opacityMapMeasure, which indicates the source triangle for each OMM. Without it, you'd need to scan omm_indices to find the first triangle that maps to each OMM, which is a little cumbersome. Also rename meshopt_opacityRasterizeMip to meshopt_opacityMapPreferredMip as the name signals intent better, and add meshopt_opacityMapTriangleSize, which previously had to be implemented by the caller to do the memory layout. This API surface is now complete enough that it can be added to the header; however, it doesn't have an ideal shape so it's likely that it will change again.
- Fix UBSAN alignment violation when hashing OMM data
- Fix int/unsigned mismatch when filling source triangle ids
- Fix typo in a comment
This tests all 5 functions (measure/rasterize/compact + preferred mip & triangle size) on a basic example where a quad is mapped to a circle texture. While this test exercises compaction/special index conversion paths, the OMMs rasterized here don't end up being compacted.
Instead of testing a quad, we test a 6-triangle tessellation with 4 corners (that lie outside of the circle) and 2 center triangles. We additionally flip one of the center triangles so that it's a perfect mirror of the other one and produces the same OMM data, so that compaction can deduplicate these. Some of the test assertions are only valid for the current algorithm; these may need to be tweaked in the future.
Instead of duplicating MurmurHash code, we keep the code in a separate function - this is identical to the function in indexgenerator but needs to support unaligned inputs, hence "u" suffix. For the OMM data, instead of using custom tail processing code (which can't be fully covered anyhow because the input is a power of two), we fold first and last byte into the seed value. The finalizer should mix these up sufficiently well. UV data is also well structured and long enough not to require the extra finalizer; this matches what we do in indexgenerator too so should be robust.
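One way to make a MurmurHash2-style update loop safe for unaligned inputs is to assemble each 32-bit word from individual bytes. The sketch below illustrates that approach and is not necessarily identical to the function in this change: `hashUpdateUnaligned` is a hypothetical name, the constants are the standard MurmurHash2 ones, and the input length is assumed to be a multiple of 4 (there is no tail handling).

```cpp
#include <cassert>
#include <cstddef>

// MurmurHash2-style block update that never performs an aligned 32-bit load.
static unsigned int hashUpdateUnaligned(unsigned int h, const unsigned char* key, size_t len)
{
    assert(len % 4 == 0);

    const unsigned int m = 0x5bd1e995;
    const int r = 24;

    while (len >= 4)
    {
        // Byte-wise little-endian assembly works for any pointer alignment.
        unsigned int k = (unsigned int)key[0] | ((unsigned int)key[1] << 8) |
                         ((unsigned int)key[2] << 16) | ((unsigned int)key[3] << 24);

        k *= m;
        k ^= k >> r;
        k *= m;

        h *= m;
        h ^= k;

        key += 4;
        len -= 4;
    }

    return h;
}
```

Because the word is built from bytes, hashing the same content at an odd address gives the same result as hashing an aligned copy, and the result does not depend on host endianness.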
We need this in many different places in the code; in getSpecialIndex we used to use a simpler computation because the data would always take at least a byte, but it can be replaced with a general version too. Also remove <float.h> which we are not using and don't plan to.
This matches "Map" naming nicely (map entry size is exactly what the function returns!) and removes the ambiguity about what a "triangle" means here.
This change introduces an initial version of OMM support. Given a set of triangles, UV coordinates, and a texture, these functions can generate OMM data for use with
VK_EXT_opacity_micromap, equivalent NV Vulkan extensions, or DXR 1.2 OMMs.

Note: currently, the 4-state OMMs generated are very non-conservative. The rasterization code will be adapted in the future to be more conservative (no plans to have a strong guarantee for the conservative output, but the issues should be minimal in the future when using mip 0), but this is not part of this change. The interface is not necessarily final, so there's no documentation for this functionality yet - this will happen when things are closer to being ready but this change on its own is useful enough to merge.
This contribution is sponsored by Valve.