diff --git a/INSTRUCTION.md b/INSTRUCTION.md new file mode 100644 index 0000000..a744a2e --- /dev/null +++ b/INSTRUCTION.md @@ -0,0 +1,297 @@ +Instructions - Vulkan Grass Rendering +======================== + +This is due **Sunday 11/5, evening at midnight**. + +**Summary:** +In this project, you will use Vulkan to implement a grass simulator and renderer. You will +use compute shaders to perform physics calculations on Bezier curves that represent individual +grass blades in your application. Since rendering every grass blade on every frame is fairly +inefficient, you will also use compute shaders to cull grass blades that don't contribute to a given frame. +The remaining blades will be passed to a graphics pipeline, in which you will write several shaders. +You will write a vertex shader to transform Bezier control points, tessellation shaders to dynamically create +the grass geometry from the Bezier curves, and a fragment shader to shade the grass blades. + +The base code provided includes all of the basic Vulkan setup, including a compute pipeline that will run your compute +shaders and two graphics pipelines, one for rendering the geometry that grass will be placed on and the other for +rendering the grass itself. Your job will be to write the shaders for the grass graphics pipeline and the compute pipeline, +as well as binding any resources (descriptors) you may need to accomplish the tasks described in this assignment. + +![](img/grass.gif) + +You are not required to use this base code if you don't want +to. You may also change any part of the base code as you please. +**This is YOUR project.** The above .gif is just a simple example that you +can use as a reference to compare to. + +**Important:** +- If you are not in CGGT/DMD, you may replace this project with a GPU compute +project. You MUST get this pre-approved by Austin Eng before continuing! + +### Contents + +* `src/` C++/Vulkan source files. 
+ * `shaders/` glsl shader source files + * `images/` images used as textures within graphics pipelines +* `external/` Includes and static libraries for 3rd party libraries. +* `img/` Screenshots and images to use in your READMEs + +### Installing Vulkan + +In order to run a Vulkan project, you first need to download and install the [Vulkan SDK](https://vulkan.lunarg.com/). +Make sure to run the downloaded installer as administrator so that the installer can set the appropriate environment +variables for you. + +Once you have done this, you need to make sure your GPU driver supports Vulkan. Download and install a +[Vulkan driver](https://developer.nvidia.com/vulkan-driver) from NVIDIA's website. + +Finally, to check that Vulkan is ready for use, go to your Vulkan SDK directory (`C:/VulkanSDK/` unless otherwise specified) +and run the `cube.exe` example within the `Bin` directory. If you see a rotating gray cube with the LunarG logo, then you +are all set! + +### Running the code + +While developing your grass renderer, you will want to keep validation layers enabled so that error checking is turned on. +The project is set up such that when you are in `debug` mode, validation layers are enabled, and when you are in `release` mode, +validation layers are disabled. After building the code, you should be able to run the project without any errors. You will see +a plane with a grass texture on it to begin with. + +![](img/cube_demo.png) + +## Requirements + +**Ask on the mailing list for any clarifications.** + +In this project, you are given the following code: + +* The basic setup for a Vulkan project, including the swapchain, physical device, logical device, and the pipelines described above. +* Structs for some of the uniform buffers you will be using. +* Some buffer creation utility functions. +* A simple interactive camera using the mouse. 
+ +You need to implement the following features/pipeline stages: + +* Compute shader (`shaders/compute.comp`) +* Grass pipeline stages + * Vertex shader (`shaders/grass.vert`) + * Tessellation control shader (`shaders/grass.tesc`) + * Tessellation evaluation shader (`shaders/grass.tese`) + * Fragment shader (`shaders/grass.frag`) +* Binding of any extra descriptors you may need + +See below for more guidance. + +## Base Code Tour + +Areas that you need to complete are +marked with a `TODO` comment. Functions that are useful +for reference are marked with the comment `CHECKITOUT`. + +* `src/main.cpp` is the entry point of our application. +* `src/Instance.cpp` sets up the application state, initializes the Vulkan library, and contains functions that will create our +physical and logical device handles. +* `src/Device.cpp` manages the logical device and sets up the queues that our command buffers will be submitted to. +* `src/Renderer.cpp` contains most of the rendering implementation, including Vulkan setup and resource creation. You will +likely have to make changes to this file in order to support changes to your pipelines. +* `src/Camera.cpp` manages the camera state. +* `src/Model.cpp` manages the state of the model that grass will be created on. Currently a plane is hardcoded, but feel free to +update this with arbitrary model loading! +* `src/Blades.cpp` creates the control points corresponding to the grass blades. There are many parameters that you can play with +here that will change the behavior of your rendered grass blades. +* `src/Scene.cpp` manages the scene state, including the model, blades, and simulation time. +* `src/BufferUtils.cpp` provides helper functions for creating buffers to be used as descriptors. + +We left out descriptions for a couple files that you likely won't have to modify. Feel free to investigate them to understand their +importance within the scope of the project. 
+ +## Grass Rendering + +This project is an implementation of the paper, [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf). +Please make sure to use this paper as a primary resource while implementing your grass renderers. It does a great job of explaining +the key algorithms and math you will be using. Below is a brief description of the different components in chronological order of how your renderer will +execute, but feel free to develop the components in whatever order you prefer. + +### Representing Grass as Bezier Curves + +In this project, grass blades will be represented as Bezier curves while performing physics calculations and culling operations. +Each Bezier curve has three control points. +* `v0`: the position of the grass blade on the geometry +* `v1`: a Bezier curve guide that is always "above" `v0` with respect to the grass blade's up vector (explained soon) +* `v2`: a physical guide on which we simulate forces + +We also need to store per-blade characteristics that will help us simulate and tessellate our grass blades correctly. +* `up`: the blade's up vector, which corresponds to the normal of the geometry that the grass blade resides on at `v0` +* Orientation: the orientation of the grass blade's face +* Height: the height of the grass blade +* Width: the width of the grass blade's face +* Stiffness coefficient: the stiffness of our grass blade, which will affect the force computations on our blade + +We can pack all this data into four `vec4`s, such that `v0.w` holds orientation, `v1.w` holds height, `v2.w` holds width, and +`up.w` holds the stiffness coefficient. + +![](img/blade_model.jpg) + +### Simulating Forces + +In this project, you will be simulating forces on grass blades while they are still Bezier curves. This will be done in a compute +shader using the compute pipeline that has been created for you. 
Remember that `v2` is our physical guide, so we will be +applying transformations to `v2` initially, then correcting for potential errors. We will finally update `v1` to maintain the appropriate +length of our grass blade. + +#### Binding Resources + +In order to update the state of your grass blades on every frame, you will need to create a storage buffer to maintain the grass data. +You will also need to pass information about how much time has passed in the simulation and the time since the last frame. To do this, +you can extend or create descriptor sets that will be bound to the compute pipeline. + +#### Gravity + +Given a gravity direction, `D.xyz`, and the magnitude of acceleration, `D.w`, we can compute the environmental gravity in +our scene as `gE = normalize(D.xyz) * D.w`. + +We then determine the contribution of the gravity with respect to the front facing direction of the blade, `f`, +as a term called the "front gravity". Front gravity is computed as `gF = (1/4) * ||gE|| * f`. + +We can then determine the total gravity on the grass blade as `g = gE + gF`. + +#### Recovery + +Recovery corresponds to the counter-force that brings our grass blade back into equilibrium. This is derived in the paper using Hooke's law. +In order to determine the recovery force, we need to compare the current position of `v2` to its original position before +simulation started, `iv2`. At the beginning of our simulation, `v1` and `v2` are initialized to be a distance of the blade height along the `up` vector. + +Once we have `iv2`, we can compute the recovery forces as `r = (iv2 - v2) * stiffness`. + +#### Wind + +In order to simulate wind, you are at liberty to create any wind function you want! In order to have something interesting, +you can make the function depend on the position of `v0` and a function that changes with time. Consider using some combination +of sine or cosine functions. 
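The gravity, recovery, and wind terms described above can be sketched on the CPU as follows. This is only a minimal sketch: `Vec3`, `totalGravity`, `recovery`, and `sampleWind` are hypothetical names that do not exist in the base code, the wind function is just one arbitrary sine/cosine combination, and in the project itself this math would live in `shaders/compute.comp` using GLSL's built-in `vec3`.

```cpp
#include <cmath>

// Minimal 3-component vector; an illustrative stand-in for GLSL's vec3.
struct Vec3 { float x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
float length(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }
Vec3 normalize(Vec3 a) { return a * (1.0f / length(a)); }

// Total gravity g = gE + gF, where
//   gE = normalize(D.xyz) * D.w   (environmental gravity)
//   gF = (1/4) * ||gE|| * f       (front gravity along the blade's front direction f)
Vec3 totalGravity(Vec3 gravityDir, float gravityMagnitude, Vec3 front) {
    Vec3 gE = normalize(gravityDir) * gravityMagnitude;
    Vec3 gF = front * (0.25f * length(gE));
    return gE + gF;
}

// Recovery r = (iv2 - v2) * stiffness. The initial tip position iv2 is
// reconstructed as v0 + up * height, since v2 starts out a blade height
// along the up vector.
Vec3 recovery(Vec3 v0, Vec3 up, float height, Vec3 v2, float stiffness) {
    Vec3 iv2 = v0 + up * height;
    return (iv2 - v2) * stiffness;
}

// Hypothetical wind function: a unit-length direction that varies with the
// blade's root position and the total simulation time.
Vec3 sampleWind(Vec3 v0, float totalTime) {
    float phase = 0.5f * v0.x + 0.3f * v0.z + totalTime;
    return {std::sin(phase), 0.0f, std::cos(phase)};
}
```

Keeping each force in its own function makes it easy to switch terms on and off while debugging the simulation.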
+ +Your wind function will determine a wind direction that is affecting the blade, but it is also worth noting that wind has a larger impact on +grass blades whose forward directions are parallel to the wind direction. The paper describes this as a "wind alignment" term. We won't go +over the exact math here, but use the paper as a reference when implementing this. It does a great job of explaining this! + +Once you have a wind direction and a wind alignment term, your total wind force (`w`) will be `windDirection * windAlignment`. + +#### Total force + +We can then determine a translation for `v2` based on the forces as `tv2 = (gravity + recovery + wind) * deltaTime`. However, we can't simply +apply this translation and expect the simulation to be robust. Our forces might push `v2` under the ground! Similarly, moving `v2` but leaving +`v1` in the same position will cause our grass blade to change length, which doesn't make sense. + +Read section 5.2 of the paper in order to learn how to determine the corrected final positions for `v1` and `v2`. + +### Culling tests + +Although we need to simulate forces on every grass blade at every frame, there are many blades that we won't need to render +due to a variety of reasons. Here are some heuristics we can use to cull blades that won't contribute positively to a given frame. + +#### Orientation culling + +Consider the scenario in which the front face direction of the grass blade is perpendicular to the view vector. Since our grass blades +won't have width, we will end up trying to render parts of the grass that are actually smaller than the size of a pixel. This could +lead to aliasing artifacts. + +In order to remedy this, we can cull these blades! Simply do a dot product test to see if the view vector and front face direction of +the blade are perpendicular. The paper uses a threshold value of `0.9` to cull, but feel free to use what you think looks best. 
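The orientation test above can be sketched as a single dot product. This sketch rests on an assumption you should check against the paper: it compares the `0.9` threshold with the absolute dot product of the view vector and the direction along the blade's *width*, which is close to 1 when the blade is seen edge-on (that is, when the view vector is roughly perpendicular to the front face direction). `Vec3`, `dot`, and `orientationCull` are hypothetical names, and both input vectors are assumed to be normalized.

```cpp
#include <cmath>

// Minimal 3-component vector; an illustrative stand-in for GLSL's vec3.
struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns true when the blade should be culled.
// viewDir:       normalized direction from the camera towards the blade
// bladeWidthDir: normalized direction along the blade's width
// threshold:     assumed to apply to |dot(viewDir, bladeWidthDir)|
bool orientationCull(Vec3 viewDir, Vec3 bladeWidthDir, float threshold = 0.9f) {
    // Looking almost exactly along the blade's width means we only see its
    // thin edge, so the blade contributes next to nothing to the frame.
    return std::fabs(dot(viewDir, bladeWidthDir)) > threshold;
}
```

Taking the absolute value makes the test symmetric, so a blade is treated the same whether it is viewed from its left or its right edge.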
+ +#### View-frustum culling + +We also want to cull blades that are outside of the view-frustum, considering they won't show up in the frame anyway. To determine if +a grass blade is in the view-frustum, we want to compare the visibility of three points: `v0`, `v2`, and `m`, where `m = (1/4)v0 + (1/2)v1 + (1/4)v2`. +Notice that we aren't using `v1` for the visibility test. This is because `v1` is a Bezier guide that doesn't represent a position on the grass blade. +We instead use `m` to approximate the midpoint of our Bezier curve. + +If all three points are outside of the view-frustum, we will cull the grass blade. The paper uses a tolerance value for this test so that we are culling +blades a little more conservatively. This can help with cases in which the Bezier curve is technically not visible, but we might be able to see the blade +if we consider its width. + +#### Distance culling + +Similarly to orientation culling, we can end up with grass blades that at large distances are smaller than the size of a pixel. This could lead to additional +artifacts in our renders. In this case, we can cull grass blades as a function of their distance from the camera. + +You are free to define two parameters here. +* A max distance after which all grass blades will be culled. +* A number of buckets to place grass blades between the camera and max distance into. + +Define a function such that the grass blades in the bucket closest to the camera are kept while an increasing number of grass blades +are culled with each farther bucket. + +#### Occlusion culling (extra credit) + +This type of culling only makes sense if our scene has additional objects aside from the plane and the grass blades. We want to cull grass blades that +are occluded by other geometry. Think about how you can use a depth map to accomplish this! 
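Two pieces of the culling tests above lend themselves to a short sketch: the midpoint `m` used by the view-frustum test, which is the quadratic Bezier curve evaluated at `t = 0.5`, and a bucketed distance test. The names `Vec3`, `bladeMidpoint`, and `distanceCull` are hypothetical, and the bucketing rule shown (cull `k` out of every `numBuckets` blades in bucket `k`) is only one possible function that satisfies the description, not necessarily the one used in the paper.

```cpp
#include <cmath>

// Minimal 3-component vector; an illustrative stand-in for GLSL's vec3.
struct Vec3 { float x, y, z; };

// Midpoint of the quadratic Bezier curve (its value at t = 0.5):
//   m = (1/4) v0 + (1/2) v1 + (1/4) v2
Vec3 bladeMidpoint(Vec3 v0, Vec3 v1, Vec3 v2) {
    return {0.25f * v0.x + 0.5f * v1.x + 0.25f * v2.x,
            0.25f * v0.y + 0.5f * v1.y + 0.25f * v2.y,
            0.25f * v0.z + 0.5f * v1.z + 0.25f * v2.z};
}

// Returns true when the blade should be culled.
// bladeIndex:  index of the blade in the blade buffer
// distance:    non-negative distance from the camera to the blade
// maxDistance: distance at and beyond which every blade is culled
// numBuckets:  number of distance buckets between the camera and maxDistance
bool distanceCull(unsigned bladeIndex, float distance, float maxDistance,
                  unsigned numBuckets) {
    if (distance >= maxDistance) {
        return true;
    }
    // Bucket 0 is closest to the camera and keeps every blade.
    unsigned bucket = static_cast<unsigned>(distance / maxDistance * numBuckets);
    // In bucket k, cull the blades whose index modulo numBuckets is below k,
    // so the culled fraction grows with each farther bucket.
    return (bladeIndex % numBuckets) < bucket;
}
```

Using the blade index to decide which blades to drop keeps the choice stable from frame to frame, which avoids flickering as long as the camera does not move.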
+ +### Tessellating Bezier curves into grass blades + +In this project, you should pass in each Bezier curve as a single patch to be processed by your grass graphics pipeline. You will tessellate this patch into +a quad with a shape of your choosing (as long as it looks sufficiently like grass of course). The paper has some examples of grass shapes you can use as inspiration. + +In the tessellation control shader, specify the amount of tessellation you want to occur. Remember that you need to provide enough detail to create the curvature of a grass blade. + +The generated vertices will be passed to the tessellation evaluation shader, where you will place the vertices in world space, respecting the width, height, and orientation information +of each blade. Once you have determined the world space position of each vertex, make sure to set the output `gl_Position` in clip space! + +**Extra Credit**: Tessellate to varying levels of detail as a function of how far the grass blade is from the camera. For example, if the blade is very far, only generate four vertices in the tessellation control shader. + +To build more intuition on how tessellation works, I highly recommend playing with the [helloTessellation sample](https://github.com/CIS565-Fall-2017/Vulkan-Samples/tree/master/samples/5_helloTessellation) +and reading this [tutorial on tessellation](http://in2gpu.com/2014/07/12/tessellation-tutorial-opengl-4-3/). + +## Resources + +### Links + +The following resources may be useful for this project. 
+ +* [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf) +* [CIS565 Vulkan samples](https://github.com/CIS565-Fall-2017/Vulkan-Samples) +* [Official Vulkan documentation](https://www.khronos.org/registry/vulkan/) +* [Vulkan tutorial](https://vulkan-tutorial.com/) +* [RenderDoc blog on Vulkan](https://renderdoc.org/vulkan-in-30-minutes.html) +* [Tessellation tutorial](http://in2gpu.com/2014/07/12/tessellation-tutorial-opengl-4-3/) + + +## Third-Party Code Policy + +* Use of any third-party code must be approved by asking on our Google Group. +* If it is approved, all students are welcome to use it. Generally, we approve + use of third-party code that is not a core part of the project. For example, + for the path tracer, we would approve using a third-party library for loading + models, but would not approve copying and pasting a CUDA function for doing + refraction. +* Third-party code **MUST** be credited in README.md. +* Using third-party code without approval, including using another + student's code, is an academic integrity violation, and will, at minimum, + result in you receiving an F for the semester. + + +## README + +* A brief description of the project and the specific features you implemented. +* At least one screenshot of your project running. +* A performance analysis (described below). + +### Performance Analysis + +The performance analysis is where you will investigate how... +* Your renderer handles varying numbers of grass blades +* The improvement you get by culling using each of the three culling tests + +## Submit + +If you have modified any of the `CMakeLists.txt` files at all (aside from the +list of `SOURCE_FILES`), mention it explicitly. +Beware of any build issues discussed on the Google Group. + +Open a GitHub pull request so that we can see that you have finished. +The title should be "Project 6: YOUR NAME". 
+The template of the comment section of your pull request is attached below, you can do some copy and paste: + +* [Repo Link](https://link-to-your-repo) +* (Briefly) Mentions features that you've completed. Especially those bells and whistles you want to highlight + * Feature 0 + * Feature 1 + * ... +* Feedback on the project itself, if any. diff --git a/README.md b/README.md index a744a2e..ca04d4f 100644 --- a/README.md +++ b/README.md @@ -1,297 +1,255 @@ -Instructions - Vulkan Grass Rendering -======================== +Grass Rendering with Vulkan +=============== -This is due **Sunday 11/5, evening at midnight**. +![](img/wind-circle-loop-short.gif) -**Summary:** -In this project, you will use Vulkan to implement a grass simulator and renderer. You will -use compute shaders to perform physics calculations on Bezier curves that represent individual -grass blades in your application. Since rendering every grass blade on every frame will is fairly -inefficient, you will also use compute shaders to cull grass blades that don't contribute to a given frame. -The remaining blades will be passed to a graphics pipeline, in which you will write several shaders. -You will write a vertex shader to transform Bezier control points, tessellation shaders to dynamically create -the grass geometry from the Bezier curves, and a fragment shader to shade the grass blades. +**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 6** -The base code provided includes all of the basic Vulkan setup, including a compute pipeline that will run your compute -shaders and two graphics pipelines, one for rendering the geometry that grass will be placed on and the other for -rendering the grass itself. Your job will be to write the shaders for the grass graphics pipeline and the compute pipeline, -as well as binding any resources (descriptors) you may need to accomplish the tasks described in this assignment. 
+* Mauricio Mutai +* Tested on: Windows 10, i7-7700HQ @ 2.2280GHz 16GB, GTX 1050Ti 4GB (Personal Computer) -![](img/grass.gif) +## Overview -You are not required to use this base code if you don't want -to. You may also change any part of the base code as you please. -**This is YOUR project.** The above .gif is just a simple example that you -can use as a reference to compare to. +### Introduction -**Important:** -- If you are not in CGGT/DMD, you may replace this project with a GPU compute -project. You MUST get this pre-approved by Austin Eng before continuing! +One of the aims of this project was to implement a simple program that renders a large number (thousands) of natural-looking grass blades. These blades should react to external forces, such as gravity and wind, as well as internal forces to maintain their structure in a sensible way. -### Contents +This grass renderer is heavily based on the work presented in [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf) by Jahrmann and Wimmer. -* `src/` C++/Vulkan source files. - * `shaders/` glsl shader source files - * `images/` images used as textures within graphics pipelines -* `external/` Includes and static libraries for 3rd party libraries. -* `img/` Screenshots and images to use in your READMEs +In summary, each grass blade is represented by three points, `v0`, `v1`, and `v2`, which themselves define a quadratic Bezier curve. One of these points, `v0`, is fixed and represents the position of the blade's root. `v2` represents the position of the blade's tip. `v1` is an auxiliary point used for defining the Bezier curve. In order to render one frame, we apply certain forces (wind, gravity, and recovery) to `v2` to determine its new position. We validate `v2` to make sure it does not go under ground, and then adjust `v1` to make sure the grass blade has approximately constant length. 
-### Installing Vulkan +Having `v0`, `v1`, and `v2`, we can then use a tessellation shader to draw a 2D shape that follows the Bezier curve defined by those points. This 2D shape is our final grass blade. -In order to run a Vulkan project, you first need to download and install the [Vulkan SDK](https://vulkan.lunarg.com/). -Make sure to run the downloaded installed as administrator so that the installer can set the appropriate environment -variables for you. +In addition, we perform some culling in order to avoid drawing blades that will not contribute significantly to the final image. Three culling methods were implemented -- see more below. -Once you have done this, you need to make sure your GPU driver supports Vulkan. Download and install a -[Vulkan driver](https://developer.nvidia.com/vulkan-driver) from NVIDIA's website. +The second (and perhaps more important) aim of this project was to get acquainted with the Vulkan API. As I completed the project, I made use of many pages of the Khronos documentation for Vulkan (for example, this page about [`vkCmdDrawIndirect()`](https://www.khronos.org/registry/vulkan/specs/1.0/man/html/vkCmdDrawIndirect.html)). -Finally, to check that Vulkan is ready for use, go to your Vulkan SDK directory (`C:/VulkanSDK/` unless otherwise specified) -and run the `cube.exe` example within the `Bin` directory. IF you see a rotating gray cube with the LunarG logo, then you -are all set! +### Features -### Running the code +Below are the renderer's main features: -While developing your grass renderer, you will want to keep validation layers enabled so that error checking is turned on. -The project is set up such that when you are in `debug` mode, validation layers are enabled, and when you are in `release` mode, -validation layers are disabled. After building the code, you should be able to run the project without any errors. You will see -a plane with a grass texture on it to begin with. 
+* Compute shader (`shaders/compute.comp`) + * Updates blades by applying forces (wind, gravity, recovery) + * Multiple wind forces available, selected via `#define` + * Orientation culling + * View-frustum culling + * Distance culling + * Wind direction can be used to determine blade's final color +* Grass pipeline stages + * Vertex shader (`shaders/grass.vert`) + * Computes positions modified by model matrix + * Computes bitangent vector (direction along blade's width) + * Tessellation control shader (`shaders/grass.tesc`) + * Dynamically tessellates blades to varying levels of detail depending on distance from camera + * Tessellation evaluation shader (`shaders/grass.tese`) + * Evaluates Bezier curve to place blade's vertices in correct positions + * Fragment shader (`shaders/grass.frag`) + * Two coloring modes, depending on which one was chosen in compute shader + * Wind as color (no shading) + * This maps the absolute coordinates of the wind force's direction to a color. Note this uses the final wind force (scaled by the wind alignment factor), hence why a mostly vertical wind will show up as grey. 
+ * Lambert shading, with blades having constant green albedo color -![](img/cube_demo.png) +Below are some of the main changes made to the base code (mostly related to Vulkan): -## Requirements +* `Renderer.cpp` + * `Renderer::CreateComputeDescriptorSetLayout()` + * Define descriptor set layout for compute shader + * One buffer for storing all blades + * One buffer for storing only blades to be rendered + * One buffer for keeping track of how many blades should be rendered + * `Renderer::CreateComputeDescriptorSets()` + * Update descriptor sets for compute shader using layout above and buffers created in `Blades` objects + * `Renderer::CreateComputePipeline()` + * Define one push constant (for storing total number of blades) to pass to compute shader + * `Renderer::RecordComputeCommandBuffer()` + * Update push constant for compute shader + * `Renderer::Frame()` + * Optionally print number of blades rendered in frame by copying a `VkBuffer` back into CPU memory. Enable with `#define PRINT_NUM_BLADES 1` in `Renderer.cpp` +* `Blades.h`/`Blades.cpp` + * Define additional field `color` + * `color.w` determines whether to use `color.xyz` or default green color to render grass blade -**Ask on the mailing list for any clarifications.** +### Wind Functions -In this project, you are given the following code: +The wind functions are named after the macro that enables them in `shaders/compute.comp`. Some of these are shown in "Example GIFs" below. -* The basic setup for a Vulkan project, including the swapchain, physical device, logical device, and the pipelines described above. -* Structs for some of the uniform buffers you will be using. -* Some buffer creation utility functions. -* A simple interactive camera using the mouse. +* `WIND_X`: Periodic wind in the X direction. +* `WIND_Y`: Periodic wind mostly in the Y direction. A wind exactly in the Y direction will not move the blades due to the way `v2` is computed. +* `WIND_Z`: Periodic wind in the Z direction. 
+* `WIND_RADIAL`: Periodic wind that emanates outwards from the origin, creating circular waves. +* `WIND_CIRCLE`: Wind that moves around in a circular trajectory. +* `WIND_XZ`: Periodic wind in the X and Z directions. More complex than just a combination of `WIND_X` and `WIND_Z`! +* `WIND_CONST`: Constant wind in the (1, 1, -1) direction. +* `WIND_TEXT`: Periodic wind that draws mysterious text. -You need to implement the following features/pipeline stages: +## Example GIFs -* Compute shader (`shaders/compute.comp`) -* Grass pipeline stages - * Vertex shader (`shaders/grass.vert') - * Tessellation control shader (`shaders/grass.tesc`) - * Tessellation evaluation shader (`shaders/grass.tese`) - * Fragment shader (`shaders/grass.frag`) -* Binding of any extra descriptors you may need +Below are some GIFs showcasing the wind functions implemented here, as well as the two coloring modes. These were rendered with `2^15` blades and using the camera enabled by `WIND_GIF_CAMERA` in `Camera.cpp`. + +| `WIND_X`, Lambert mode | `WIND_X`, "wind as color" mode | +|:---------------------------:|:-------------------------------:| +| ![](img/wind_x_lambert.gif) | ![](img/wind_x_wind.gif) | + +| `WIND_RADIAL`, Lambert mode | `WIND_RADIAL`, "wind as color" mode | +|:--------------------------------:|:------------------------------------:| +| ![](img/wind_radial_lambert.gif) | ![](img/wind_radial_wind.gif) | + +| `WIND_CIRCLE`, Lambert mode | `WIND_CIRCLE`, "wind as color" mode | +|:--------------------------------:|:------------------------------------:| +| ![](img/wind_circle_lambert.gif) | ![](img/wind_circle_wind.gif) | -See below for more guidance. 
+| `WIND_XZ`, Lambert mode | `WIND_XZ`, "wind as color" mode | +|:----------------------------:|:--------------------------------:| +| ![](img/wind_xz_lambert.gif) | ![](img/wind_xz_wind.gif) | -## Base Code Tour +| `WIND_TEXT`, Lambert mode | `WIND_TEXT`, "wind as color" mode | +|:------------------------------:|:----------------------------------:| +| ![](img/wind_text_lambert.gif) | ![](img/wind_text_wind.gif) | -Areas that you need to complete are -marked with a `TODO` comment. Functions that are useful -for reference are marked with the comment `CHECKITOUT`. +## Analysis -* `src/main.cpp` is the entry point of our application. -* `src/Instance.cpp` sets up the application state, initializes the Vulkan library, and contains functions that will create our -physical and logical device handles. -* `src/Device.cpp` manages the logical device and sets up the queues that our command buffers will be submitted to. -* `src/Renderer.cpp` contains most of the rendering implementation, including Vulkan setup and resource creation. You will -likely have to make changes to this file in order to support changes to your pipelines. -* `src/Camera.cpp` manages the camera state. -* `src/Model.cpp` manages the state of the model that grass will be created on. Currently a plane is hardcoded, but feel free to -update this with arbitrary model loading! -* `src/Blades.cpp` creates the control points corresponding to the grass blades. There are many parameters that you can play with -here that will change the behavior of your rendered grass blades. -* `src/Scene.cpp` manages the scene state, including the model, blades, and simualtion time. -* `src/BufferUtils.cpp` provides helper functions for creating buffers to be used as descriptors. +### Methodology -We left out descriptions for a couple files that you likely won't have to modify. Feel free to investigate them to understand their -importance within the scope of the project. 
+In order to measure the performance of this renderer, I re-purposed the `Scene::UpdateTime()` function to compute the average time spent to render one frame over 2000 frames. By default, measurements are not taken nor printed out, but can be enabled with `#define PRINT_AVG_DELTA 1` in `Scene.cpp`. -## Grass Rendering +Most of the measurements were taken by rendering the scene with the default camera position and orientation. For some view-frustum related tests, a special camera closer to the origin was used (see `FRUSTUM_CULL_TEST` in `Camera.cpp`). Unless this zoomed-in camera is mentioned, the test was performed using the default camera. -This project is an implementation of the paper, [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf). -Please make sure to use this paper as a primary resource while implementing your grass renderers. It does a great job of explaining -the key algorithms and math you will be using. Below is a brief description of the different components in chronological order of how your renderer will -execute, but feel free to develop the components in whatever order you prefer. +The analyses below generally compare the average time to render a frame as the number of blades increases and as certain optimizations are enabled or disabled. -### Representing Grass as Bezier Curves +### Orientation Culling -In this project, grass blades will be represented as Bezier curves while performing physics calculations and culling operations. -Each Bezier curve has three control points. -* `v0`: the position of the grass blade on the geomtry -* `v1`: a Bezier curve guide that is always "above" `v0` with respect to the grass blade's up vector (explained soon) -* `v2`: a physical guide for which we simulate forces on +#### Overview -We also need to store per-blade characteristics that will help us simulate and tessellate our grass blades correctly. 
-* `up`: the blade's up vector, which corresponds to the normal of the geometry that the grass blade resides on at `v0` -* Orientation: the orientation of the grass blade's face -* Height: the height of the grass blade -* Width: the width of the grass blade's face -* Stiffness coefficient: the stiffness of our grass blade, which will affect the force computations on our blade +When orientation culling is enabled, the compute shader uses the blade's front vector (the direction in which the blade is "facing") and the camera's look vector (the direction in which the camera is looking) in order to determine whether the blade is roughly perpendicular with respect to the camera. Since the blade is two-dimensional, a perpendicular blade will be barely visible, so we may as well cull it and prevent further stages in the pipeline from processing it. -We can pack all this data into four `vec4`s, such that `v0.w` holds orientation, `v1.w` holds height, `v2.w` holds width, and -`up.w` holds the stiffness coefficient. +#### Performance Impact -![](img/blade_model.jpg) +Below is a graph comparing the average time to render a frame with and without orientation culling enabled. -### Simulating Forces +Note the X-axis is logarithmic (there is a 4x increase in the number of blades for each step in X) for this and all subsequent graphs. -In this project, you will be simulating forces on grass blades while they are still Bezier curves. This will be done in a compute -shader using the compute pipeline that has been created for you. Remember that `v2` is our physical guide, so we will be -applying transformations to `v2` initially, then correcting for potential errors. We will finally update `v1` to maintain the appropriate -length of our grass blade. +![](img/graph-orientation.png) -#### Binding Resources +As we can see, orientation culling has a very minimal effect at first, but the improvement in performance becomes noticeable as we increase the number of blades. 
-In order to update the state of your grass blades on every frame, you will need to create a storage buffer to maintain the grass data. -You will also need to pass information about how much time has passed in the simulation and the time since the last frame. To do this, -you can extend or create descriptor sets that will be bound to the compute pipeline. +This is probably because relatively few of the blades are actually perpendicular enough to the camera to be culled, so not many of them are culled. However, all blades need to be checked as part of the test, which slows the compute shader ever so slightly. -#### Gravity +As the number of blades increases, it's likely that the benefit of culling away a small portion of the blades becomes more evident, since our GPU is initially more saturated due to the higher number of blades. -Given a gravity direction, `D.xyz`, and the magnitude of acceleration, `D.w`, we can compute the environmental gravity in -our scene as `gE = normalize(D.xyz) * D.w`. +### Distance Culling -We then determine the contribution of the gravity with respect to the front facing direction of the blade, `f`, -as a term called the "front gravity". Front gravity is computed as `gF = (1/4) * ||gE|| * f`. +#### Overview -We can then determine the total gravity on the grass blade as `g = gE + gF`. +When distance culling is enabled, the compute shader determines the distance from the blade's `v0` point to the camera's eye. Depending on this distance, the blade is put into one of 8 buckets. In the 1st bucket, no blades are culled. For the 2nd bucket, 1 out of 8 blades is culled. For the 3rd, 2 out of 8 are culled, and so on. -#### Recovery +#### Performance Impact -Recovery corresponds to the counter-force that brings our grass blade back into equilibrium. This is derived in the paper using Hooke's law. -In order to determine the recovery force, we need to compare the current position of `v2` to its original position before -simulation started, `iv2`.
At the beginning of our simulation, `v1` and `v2` are initialized to be a distance of the blade height along the `up` vector. +Below is a graph comparing the average time to render a frame with and without distance culling enabled. -Once we have `iv2`, we can compute the recovery forces as `r = (iv2 - v2) * stiffness`. +![](img/graph-distance.png) -#### Wind +Like orientation culling, distance-based culling has a minimal effect at first, but makes for a more visible improvement in performance as the number of blades increases. If we compare this graph to the orientation culling one, we can see distance culling is slightly more effective. -In order to simulate wind, you are at liberty to create any wind function you want! In order to have something interesting, -you can make the function depend on the position of `v0` and a function that changes with time. Consider using some combination -of sine or cosine functions. +Just like orientation culling, with the default camera, not many blades get culled because of their distance from the camera. However, when our GPU is saturated with hundreds of thousands of blades, culling the portion of blades that are far away enough provides noticeable improvements. -Your wind function will determine a wind direction that is affecting the blade, but it is also worth noting that wind has a larger impact on -grass blades whose forward directions are parallel to the wind direction. The paper describes this as a "wind alignment" term. We won't go -over the exact math here, but use the paper as a reference when implementing this. It does a great job of explaining this! +### View-frustum Culling -Once you have a wind direction and a wind alignment term, your total wind force (`w`) will be `windDirection * windAlignment`. 
+#### Overview -#### Total force +When view-frustum culling is enabled, the compute shader uses the camera's view-projection matrix to project three positions on the blade to determine if none of these points are visible to the camera -- if this is the case, the blade may be culled. Note that there is some tolerance added to this check, because a blade has width, and so may be visible even if those points are not in the frustum. -We can then determine a translation for `v2` based on the forces as `tv2 = (gravity + recovery + wind) * deltaTime`. However, we can't simply -apply this translation and expect the simulation to be robust. Our forces might push `v2` under the ground! Similarly, moving `v2` but leaving -`v1` in the same position will cause our grass blade to change length, which doesn't make sense. +The three points are `v0`, `m`, and `v2`, where `m = 0.25 * v0 + 0.5 * v1 + 0.25 * v2` is used because it actually lies on the blade's Bezier curve, unlike `v1`. -Read section 5.2 of the paper in order to learn how to determine the corrected final positions for `v1` and `v2`. +#### Performance Impact (default camera) -### Culling tests +Below is a graph comparing the average time to render a frame with and without view-frustum culling enabled, with the default camera. -Although we need to simulate forces on every grass blade at every frame, there are many blades that we won't need to render -due to a variety of reasons. Here are some heuristics we can use to cull blades that won't contribute positively to a given frame. +![](img/graph-view-default.png) -#### Orientation culling +We can see view-frustum culling has a very minimal impact. This is because, with the default camera, very few (if any) blades are actually outside the view-frustum. -Consider the scenario in which the front face direction of the grass blade is perpendicular to the view vector. 
Since our grass blades -won't have width, we will end up trying to render parts of the grass that are actually smaller than the size of a pixel. This could -lead to aliasing artifacts. +It might be more interesting to investigate what happens if we move the camera such that most blades are outside the view-frustum. This is what we do in the next section. -In order to remedy this, we can cull these blades! Simply do a dot product test to see if the view vector and front face direction of -the blade are perpendicular. The paper uses a threshold value of `0.9` to cull, but feel free to use what you think looks best. +#### Performance Impact (zoomed-in camera) -#### View-frustum culling +Below is a graph comparing the average time to render a frame with and without view-frustum culling enabled, with the zoomed-in camera enabled by `FRUSTUM_CULL_TEST` in `Camera.cpp`. -We also want to cull blades that are outside of the view-frustum, considering they won't show up in the frame anyway. To determine if -a grass blade is in the view-frustum, we want to compare the visibility of three points: `v0, v2, and m`, where `m = (1/4)v0 * (1/2)v1 * (1/4)v2`. -Notice that we aren't using `v1` for the visibility test. This is because the `v1` is a Bezier guide that doesn't represent a position on the grass blade. -We instead use `m` to approximate the midpoint of our Bezier curve. +![](img/graph-view-zoom.png) -If all three points are outside of the view-frustum, we will cull the grass blade. The paper uses a tolerance value for this test so that we are culling -blades a little more conservatively. This can help with cases in which the Bezier curve is technically not visible, but we might be able to see the blade -if we consider its width. +In this exaggerated case, view-frustum culling provides a huge improvement to the render time -- performance is almost doubled for `2^15` and `2^17` blades. 
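The view-frustum test from the overview above can be sketched on the CPU as follows. This is a hedged illustration under several assumptions: `Vec3`, `Vec4`, `Mat4` (row-major here, unlike glm), `makeIdentity`, and the function names are hypothetical, and the symmetric bound on the clip-space `z` coordinate is a simplification that is more conservative than a zero-to-one depth range.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical minimal math types for this sketch only.
using Vec3 = std::array<float, 3>;
using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;  // row-major: m[row][col]

Mat4 makeIdentity() {
    Mat4 m{};  // all elements start at zero
    for (std::size_t i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

// Transforms the point (p, 1) by m and returns the clip-space position.
Vec4 transformPoint(const Mat4& m, const Vec3& p) {
    Vec4 out{};
    for (std::size_t row = 0; row < 4; ++row) {
        out[row] = m[row][0] * p[0] + m[row][1] * p[1] + m[row][2] * p[2] + m[row][3];
    }
    return out;
}

// A point counts as visible if its clip-space x, y and z lie within [-h, h],
// where h = w + tolerance widens the frustum slightly.
bool inFrustum(const Mat4& viewProj, const Vec3& p, float tolerance) {
    const Vec4 clip = transformPoint(viewProj, p);
    const float h = clip[3] + tolerance;
    return clip[0] >= -h && clip[0] <= h &&
           clip[1] >= -h && clip[1] <= h &&
           clip[2] >= -h && clip[2] <= h;
}

// m = 0.25 * v0 + 0.5 * v1 + 0.25 * v2, the point at t = 0.5 on the curve.
Vec3 curveMidpoint(const Vec3& v0, const Vec3& v1, const Vec3& v2) {
    Vec3 m{};
    for (std::size_t i = 0; i < 3; ++i) {
        m[i] = 0.25f * v0[i] + 0.5f * v1[i] + 0.25f * v2[i];
    }
    return m;
}

// The blade is culled only if v0, m and v2 are all outside the widened frustum.
bool shouldCullByFrustum(const Mat4& viewProj, const Vec3& v0, const Vec3& v1,
                         const Vec3& v2, float tolerance) {
    assert(tolerance >= 0.0f);
    const Vec3 m = curveMidpoint(v0, v1, v2);
    return !inFrustum(viewProj, v0, tolerance) &&
           !inFrustum(viewProj, m, tolerance) &&
           !inFrustum(viewProj, v2, tolerance);
}
```

With an identity view-projection matrix, a blade sitting just outside the unit cube is culled when the tolerance is zero but kept when the tolerance is large enough to cover it, which mirrors the purpose of the tolerance described above.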
-#### Distance culling +Although most blades are culled away, we still need to run most of the compute shader for all blades, and we still need to rasterize and draw the fragments generated by the non-culled blades, which take up most of the screen. This explains why the render time drops, but not exactly in proportion to how many blades were culled away. -Similarly to orientation culling, we can end up with grass blades that at large distances are smaller than the size of a pixel. This could lead to additional -artifacts in our renders. In this case, we can cull grass blades as a function of their distance from the camera. +### Dynamic Tessellation (Level of Detail) -You are free to define two parameters here. -* A max distance afterwhich all grass blades will be culled. -* A number of buckets to place grass blades between the camera and max distance into. +#### Overview -Define a function such that the grass blades in the bucket closest to the camera are kept while an increasing number of grass blades -are culled with each farther bucket. +In addition to general distance-based culling in the compute shader, dynamic tessellation was implemented to adjust the level of detail in the tessellation of the blades depending on their distance from the camera. -#### Occlusion culling (extra credit) +The basic idea is that a distant blade will look small enough that the viewer will not be able to distinguish between a detailed blade (say, tessellated with 4 vertical segments) and a simple quad. Thus, we could save a bit of time by tessellating distant blades with less detail and reduce the load on the tessellation evaluation shader and subsequent pipeline stages. -This type of culling only makes sense if our scene has additional objects aside from the plane and the grass blades. We want to cull grass blades that -are occluded by other geometry. Think about how you can use a depth map to accomplish this! 
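The distance-based level of detail described in the overview above can be sketched as a small helper that maps camera distance to a number of vertical segments. This is only an assumed linear falloff for illustration: `verticalSegmentsForDistance`, `maxDistance`, and the default range of 1 to 4 segments are hypothetical names and values, and in the renderer the equivalent decision would be made on the GPU when choosing tessellation levels.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Maps the blade's distance from the camera to a segment count.
// Blades at distance 0 get maxSegments; blades at or beyond maxDistance get
// minSegments (a simple quad when minSegments is 1). The linear falloff is
// an assumption made for this sketch.
int verticalSegmentsForDistance(float distance, float maxDistance,
                                int minSegments = 1, int maxSegments = 4) {
    assert(maxDistance > 0.0f);
    assert(minSegments >= 1 && maxSegments >= minSegments);

    // Normalized distance clamped to [0, 1].
    const float t = std::min(std::max(distance / maxDistance, 0.0f), 1.0f);

    const float segments = static_cast<float>(maxSegments) -
                           t * static_cast<float>(maxSegments - minSegments);
    return static_cast<int>(std::lround(segments));
}
```

For example, with `maxDistance` set to 40, a blade next to the camera would use 4 segments while a blade 40 or more units away would fall back to a single segment.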
+#### Performance Impact -### Tessellating Bezier curves into grass blades +Below is a graph comparing the average time to render a frame with and without dynamic tessellation enabled. Distance culling was also enabled for these tests. This was done to better estimate the improvements gained from dynamic tessellation -- the blades that are tessellated with less detail are also likely to be culled away by the compute shader, so tests done without distance-based culling could overestimate the improvements gained from this optimization. -In this project, you should pass in each Bezier curve as a single patch to be processed by your grass graphics pipeline. You will tessellate this patch into -a quad with a shape of your choosing (as long as it looks sufficiently like grass of course). The paper has some examples of grass shapes you can use as inspiration. +![](img/graph-lod.png) -In the tessellation control shader, specify the amount of tessellation you want to occur. Remember that you need to provide enough detail to create the curvature of a grass blade. +As we can see, dynamic tessellation provides a small improvement to performance. Even at a high number of blades, the decrease in runtime is quite small. This suggests the bulk of the work is not done in the tessellation evaluation shader and subsequent stages. -The generated vertices will be passed to the tessellation evaluation shader, where you will place the vertices in world space, respecting the width, height, and orientation information -of each blade. Once you have determined the world space position of each vector, make sure to set the output `gl_Position` in clip space! +### All Culling Combined -** Extra Credit**: Tessellate to varying levels of detail as a function of how far the grass blade is from the camera. For example, if the blade is very far, only generate four vertices in the tessellation control shader. 
+#### Overview -To build more intuition on how tessellation works, I highly recommend playing with the [helloTessellation sample](https://github.com/CIS565-Fall-2017/Vulkan-Samples/tree/master/samples/5_helloTessellation) -and reading this [tutorial on tessellation](http://in2gpu.com/2014/07/12/tessellation-tutorial-opengl-4-3/). +Below, we investigate the effects of enabling all our culling methods. -## Resources +#### Performance Impact (default camera) -### Links +Below is a graph comparing the average time to render a frame with all culling methods enabled and disabled, with the default camera. -The following resources may be useful for this project. +![](img/graph-all-default.png) -* [Responsive Real-Time Grass Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf) -* [CIS565 Vulkan samples](https://github.com/CIS565-Fall-2017/Vulkan-Samples) -* [Official Vulkan documentation](https://www.khronos.org/registry/vulkan/) -* [Vulkan tutorial](https://vulkan-tutorial.com/) -* [RenderDoc blog on Vulkan](https://renderdoc.org/vulkan-in-30-minutes.html) -* [Tessellation tutorial](http://in2gpu.com/2014/07/12/tessellation-tutorial-opengl-4-3/) +The improvements from the combined culling are, as before, more noticeable when there are more grass blades. Combining all culling methods leads to better performance than when any individual culling method is enabled by itself. +We know view-frustum culling is highly dependent on the camera's position and orientation, so let us investigate the results using the zoomed-in camera. -## Third-Party Code Policy +#### Performance Impact (zoomed-in camera) -* Use of any third-party code must be approved by asking on our Google Group. -* If it is approved, all students are welcome to use it. Generally, we approve - use of third-party code that is not a core part of the project. 
For example, - for the path tracer, we would approve using a third-party library for loading - models, but would not approve copying and pasting a CUDA function for doing - refraction. -* Third-party code **MUST** be credited in README.md. -* Using third-party code without its approval, including using another - student's code, is an academic integrity violation, and will, at minimum, - result in you receiving an F for the semester. +Below is a graph comparing the average time to render a frame with all culling methods enabled and disabled, as well as with only distance culling enabled, with the zoomed-in camera. +![](img/graph-all-zoom.png) -## README +Although the distance-based culling is responsible for most of the improvement in runtime, the other culling methods also help to further reduce the render time. -* A brief description of the project and the specific features you implemented. -* At least one screenshot of your project running. -* A performance analysis (described below). +## Other Notes -### Performance Analysis +* The GIF at the beginning of the README was rendered with `2^14` blades, "wind as color" mode enabled, and with radial wind enabled. -The performance analysis is where you will investigate how... -* Your renderer handles varying numbers of grass blades -* The improvement you get by culling using each of the three culling tests +* There are several `#define`s in `shaders/compute.comp` to toggle certain features: +``` +#define WIND_X 0 +#define WIND_Y 1 +#define WIND_Z 2 +#define WIND_RADIAL 3 +#define WIND_CIRCLE 4 +#define WIND_XZ 5 +#define WIND_CONST 6 +#define WIND_TEXT 7 -## Submit +// WIND_TYPE defines which wind function will be used. +// It should be one of the values defined immediately above. +#define WIND_TYPE WIND_XZ -If you have modified any of the `CMakeLists.txt` files at all (aside from the -list of `SOURCE_FILES`), mentions it explicity. -Beware of any build issues discussed on the Google Group.
+// Defines the radius of the circle moving around in the circular trajectory +// in WIND_CIRCLE. +#define WIND_CIRCLE_RADIUS 5.0 -Open a GitHub pull request so that we can see that you have finished. -The title should be "Project 6: YOUR NAME". -The template of the comment section of your pull request is attached below, you can do some copy and paste: +// If 0, uses Lambert shading and default green albedo. +// Otherwise, uses wind as color. +#define USE_CUSTOM_COLOR 1 -* [Repo Link](https://link-to-your-repo) -* (Briefly) Mentions features that you've completed. Especially those bells and whistles you want to highlight - * Feature 0 - * Feature 1 - * ... -* Feedback on the project itself, if any. +// Enable each culling method. +#define ORIENTATION_CULL 1 +#define FRUSTUM_CULL 1 +#define DISTANCE_CULL 1 +``` + diff --git a/img/graph-all-default.png b/img/graph-all-default.png new file mode 100644 index 0000000..4232b56 Binary files /dev/null and b/img/graph-all-default.png differ diff --git a/img/graph-all-zoom.png b/img/graph-all-zoom.png new file mode 100644 index 0000000..0689c37 Binary files /dev/null and b/img/graph-all-zoom.png differ diff --git a/img/graph-distance.png b/img/graph-distance.png new file mode 100644 index 0000000..41e1c57 Binary files /dev/null and b/img/graph-distance.png differ diff --git a/img/graph-lod.png b/img/graph-lod.png new file mode 100644 index 0000000..21d3f6a Binary files /dev/null and b/img/graph-lod.png differ diff --git a/img/graph-orientation.png b/img/graph-orientation.png new file mode 100644 index 0000000..8f8d2e1 Binary files /dev/null and b/img/graph-orientation.png differ diff --git a/img/graph-view-default.png b/img/graph-view-default.png new file mode 100644 index 0000000..d4115c0 Binary files /dev/null and b/img/graph-view-default.png differ diff --git a/img/graph-view-zoom.png b/img/graph-view-zoom.png new file mode 100644 index 0000000..313fee1 Binary files /dev/null and b/img/graph-view-zoom.png differ diff 
--git a/img/wind-circle-loop-short.gif b/img/wind-circle-loop-short.gif new file mode 100644 index 0000000..d364d9c Binary files /dev/null and b/img/wind-circle-loop-short.gif differ diff --git a/img/wind_circle_lambert.gif b/img/wind_circle_lambert.gif new file mode 100644 index 0000000..d1ec312 Binary files /dev/null and b/img/wind_circle_lambert.gif differ diff --git a/img/wind_circle_wind.gif b/img/wind_circle_wind.gif new file mode 100644 index 0000000..e5e185d Binary files /dev/null and b/img/wind_circle_wind.gif differ diff --git a/img/wind_radial_lambert.gif b/img/wind_radial_lambert.gif new file mode 100644 index 0000000..82ded49 Binary files /dev/null and b/img/wind_radial_lambert.gif differ diff --git a/img/wind_radial_wind.gif b/img/wind_radial_wind.gif new file mode 100644 index 0000000..dd9d513 Binary files /dev/null and b/img/wind_radial_wind.gif differ diff --git a/img/wind_text_lambert.gif b/img/wind_text_lambert.gif new file mode 100644 index 0000000..da013f8 Binary files /dev/null and b/img/wind_text_lambert.gif differ diff --git a/img/wind_text_wind.gif b/img/wind_text_wind.gif new file mode 100644 index 0000000..052ad22 Binary files /dev/null and b/img/wind_text_wind.gif differ diff --git a/img/wind_x_lambert.gif b/img/wind_x_lambert.gif new file mode 100644 index 0000000..f3ea356 Binary files /dev/null and b/img/wind_x_lambert.gif differ diff --git a/img/wind_x_wind.gif b/img/wind_x_wind.gif new file mode 100644 index 0000000..747b750 Binary files /dev/null and b/img/wind_x_wind.gif differ diff --git a/img/wind_xz_lambert.gif b/img/wind_xz_lambert.gif new file mode 100644 index 0000000..865da33 Binary files /dev/null and b/img/wind_xz_lambert.gif differ diff --git a/img/wind_xz_wind.gif b/img/wind_xz_wind.gif new file mode 100644 index 0000000..5d00355 Binary files /dev/null and b/img/wind_xz_wind.gif differ diff --git a/src/Blades.cpp b/src/Blades.cpp index 80e3d76..00de765 100644 --- a/src/Blades.cpp +++ b/src/Blades.cpp @@ -9,7 +9,7 @@ 
float generateRandomFloat() { Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Model(device, commandPool, {}, {}) { std::vector<Blade> blades; blades.reserve(NUM_BLADES); - + //srand(123); for (int i = 0; i < NUM_BLADES; i++) { Blade currentBlade = Blade(); @@ -35,6 +35,9 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode float stiffness = MIN_BEND + (generateRandomFloat() * (MAX_BEND - MIN_BEND)); currentBlade.up = glm::vec4(bladeUp, stiffness); + // custom color + currentBlade.color = glm::vec4(1.0f); + blades.push_back(currentBlade); } diff --git a/src/Blades.h b/src/Blades.h index 9bd1eed..3412b03 100644 --- a/src/Blades.h +++ b/src/Blades.h @@ -4,7 +4,7 @@ #include #include "Model.h" -constexpr static unsigned int NUM_BLADES = 1 << 13; +constexpr static unsigned int NUM_BLADES = 1 << 15; constexpr static float MIN_HEIGHT = 1.3f; constexpr static float MAX_HEIGHT = 2.5f; constexpr static float MIN_WIDTH = 0.1f; @@ -21,6 +21,8 @@ struct Blade { glm::vec4 v2; // Up vector and stiffness coefficient glm::vec4 up; + // special vec4 for custom colors: .xyz is RGB, .w is whether to use this color + glm::vec4 color; static VkVertexInputBindingDescription getBindingDescription() { VkVertexInputBindingDescription bindingDescription = {}; @@ -31,8 +33,8 @@ struct Blade { return bindingDescription; } - static std::array<VkVertexInputAttributeDescription, 4> getAttributeDescriptions() { - std::array<VkVertexInputAttributeDescription, 4> attributeDescriptions = {}; + static std::array<VkVertexInputAttributeDescription, 5> getAttributeDescriptions() { + std::array<VkVertexInputAttributeDescription, 5> attributeDescriptions = {}; // v0 attributeDescriptions[0].binding = 0; @@ -58,6 +60,12 @@ struct Blade { attributeDescriptions[3].format = VK_FORMAT_R32G32B32A32_SFLOAT; attributeDescriptions[3].offset = offsetof(Blade, up); + // color + attributeDescriptions[4].binding = 0; + attributeDescriptions[4].location = 4; + attributeDescriptions[4].format = VK_FORMAT_R32G32B32A32_SFLOAT; + attributeDescriptions[4].offset = offsetof(Blade, color); + return attributeDescriptions; } 
}; diff --git a/src/BufferUtils.cpp b/src/BufferUtils.cpp index acf617e..5c4f7f4 100644 --- a/src/BufferUtils.cpp +++ b/src/BufferUtils.cpp @@ -79,7 +79,8 @@ void BufferUtils::CreateBufferFromData(Device* device, VkCommandPool commandPool vkUnmapMemory(device->GetVkDevice(), stagingBufferMemory); // Create the buffer - VkBufferUsageFlags usage = VK_BUFFER_USAGE_TRANSFER_DST_BIT | bufferUsage; + // for reading stuff, has to be SRC as well + VkBufferUsageFlags usage = VK_BUFFER_USAGE_TRANSFER_DST_BIT | VK_BUFFER_USAGE_TRANSFER_SRC_BIT | bufferUsage; VkMemoryPropertyFlags flags = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT; BufferUtils::CreateBuffer(device, bufferSize, usage, flags, buffer, bufferMemory); diff --git a/src/Camera.cpp b/src/Camera.cpp index 3afb5b8..1083203 100644 --- a/src/Camera.cpp +++ b/src/Camera.cpp @@ -5,9 +5,39 @@ #define GLM_FORCE_DEPTH_ZERO_TO_ONE #include +#define FRUSTUM_CULL_TEST 0 +#define WIND_GIF_CAMERA 0 + #include "Camera.h" #include "BufferUtils.h" +#if FRUSTUM_CULL_TEST +Camera::Camera(Device* device, float aspectRatio) : device(device) { + r = 2.0f; + theta = 0.0f; + phi = 0.0f; + cameraBufferObject.viewMatrix = glm::lookAt(glm::vec3(0.0f, 2.0f, 2.0f), glm::vec3(0.0f, 0.0f, 0.0f), glm::vec3(0.0f, 1.0f, 0.0f)); + cameraBufferObject.projectionMatrix = glm::perspective(glm::radians(45.0f), aspectRatio, 0.1f, 100.0f); + cameraBufferObject.projectionMatrix[1][1] *= -1; // y-coordinate is flipped + + BufferUtils::CreateBuffer(device, sizeof(CameraBufferObject), VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, buffer, bufferMemory); + vkMapMemory(device->GetVkDevice(), bufferMemory, 0, sizeof(CameraBufferObject), 0, &mappedData); + memcpy(mappedData, &cameraBufferObject, sizeof(CameraBufferObject)); +} +#elif WIND_GIF_CAMERA +Camera::Camera(Device* device, float aspectRatio) : device(device) { + r = 10.0f; + theta = 0.0f; + phi = 0.0f; + cameraBufferObject.viewMatrix = 
glm::lookAt(glm::vec3(0.0f, 13.0f, 19.0f), glm::vec3(0.0f, 0.0f, 0.0f), glm::vec3(0.0f, 1.0f, 0.0f)); + cameraBufferObject.projectionMatrix = glm::perspective(glm::radians(45.0f), aspectRatio, 0.1f, 100.0f); + cameraBufferObject.projectionMatrix[1][1] *= -1; // y-coordinate is flipped + + BufferUtils::CreateBuffer(device, sizeof(CameraBufferObject), VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, buffer, bufferMemory); + vkMapMemory(device->GetVkDevice(), bufferMemory, 0, sizeof(CameraBufferObject), 0, &mappedData); + memcpy(mappedData, &cameraBufferObject, sizeof(CameraBufferObject)); +} +#else Camera::Camera(Device* device, float aspectRatio) : device(device) { r = 10.0f; theta = 0.0f; @@ -20,6 +50,8 @@ Camera::Camera(Device* device, float aspectRatio) : device(device) { vkMapMemory(device->GetVkDevice(), bufferMemory, 0, sizeof(CameraBufferObject), 0, &mappedData); memcpy(mappedData, &cameraBufferObject, sizeof(CameraBufferObject)); } +#endif // FRUSTUM_CULL_TEST + VkBuffer Camera::GetBuffer() const { return buffer; diff --git a/src/Renderer.cpp b/src/Renderer.cpp index b445d04..790d272 100644 --- a/src/Renderer.cpp +++ b/src/Renderer.cpp @@ -5,6 +5,9 @@ #include "Blades.h" #include "Camera.h" #include "Image.h" +#include "BufferUtils.h" + +#define PRINT_NUM_BLADES 0 static constexpr unsigned int WORKGROUP_SIZE = 32; @@ -195,9 +198,43 @@ void Renderer::CreateTimeDescriptorSetLayout() { } void Renderer::CreateComputeDescriptorSetLayout() { - // TODO: Create the descriptor set layout for the compute pipeline + // TODOX: Create the descriptor set layout for the compute pipeline // Remember this is like a class definition stating why types of information // will be stored at each binding + // Describe the binding of the descriptor set layout + // TODOX: just copied this from time descriptor + VkDescriptorSetLayoutBinding inputBladesLayoutBinding = {}; + inputBladesLayoutBinding.binding = 0; + 
inputBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + inputBladesLayoutBinding.descriptorCount = 1; // TODO: number of input blades??? + inputBladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + inputBladesLayoutBinding.pImmutableSamplers = nullptr; + + VkDescriptorSetLayoutBinding culledBladesLayoutBinding = {}; + culledBladesLayoutBinding.binding = 1; + culledBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + culledBladesLayoutBinding.descriptorCount = 1; // TODO: number of input blades??? + culledBladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + culledBladesLayoutBinding.pImmutableSamplers = nullptr; + + VkDescriptorSetLayoutBinding numBladesLayoutBinding = {}; + numBladesLayoutBinding.binding = 2; + numBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + numBladesLayoutBinding.descriptorCount = 1; + numBladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + numBladesLayoutBinding.pImmutableSamplers = nullptr; + + std::vector<VkDescriptorSetLayoutBinding> bindings = { inputBladesLayoutBinding, culledBladesLayoutBinding, numBladesLayoutBinding }; + + // Create the descriptor set layout + VkDescriptorSetLayoutCreateInfo layoutInfo = {}; + layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; + layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size()); + layoutInfo.pBindings = bindings.data(); + + if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &grassComputeDescriptorSetLayout) != VK_SUCCESS) { + throw std::runtime_error("Failed to create descriptor set layout"); + } } void Renderer::CreateDescriptorPool() { @@ -216,6 +253,14 @@ void Renderer::CreateDescriptorPool() { { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 }, // TODO: Add any additional types and counts of descriptors you will need to allocate + // Input blades (compute) + { VK_DESCRIPTOR_TYPE_STORAGE_BUFFER , static_cast<uint32_t>(scene->GetBlades().size()) }, + + // Culled blades (compute) + {
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER , static_cast<uint32_t>(scene->GetBlades().size()) }, + + // Number of remaining blades (compute) + { VK_DESCRIPTOR_TYPE_STORAGE_BUFFER , static_cast<uint32_t>(scene->GetBlades().size()) }, }; VkDescriptorPoolCreateInfo poolInfo = {}; @@ -320,6 +365,42 @@ void Renderer::CreateModelDescriptorSets() { void Renderer::CreateGrassDescriptorSets() { // TODO: Create Descriptor sets for the grass. // This should involve creating descriptor sets which point to the model matrix of each group of grass blades + grassDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { modelDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(1 * grassDescriptorSets.size()); + + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo grassBufferInfo = {}; + grassBufferInfo.buffer = scene->GetBlades()[i]->GetModelBuffer(); + grassBufferInfo.offset = 0; + grassBufferInfo.range = sizeof(ModelBufferObject); + + descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[i].dstSet = grassDescriptorSets[i]; + descriptorWrites[i].dstBinding = 0; + descriptorWrites[i].dstArrayElement = 0; + descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; + descriptorWrites[i].descriptorCount = 1; + descriptorWrites[i].pBufferInfo = &grassBufferInfo; + descriptorWrites[i].pImageInfo = nullptr; + descriptorWrites[i].pTexelBufferView = nullptr; + } + + // Update descriptor sets +
vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); } void Renderer::CreateTimeDescriptorSet() { @@ -360,6 +441,75 @@ void Renderer::CreateTimeDescriptorSet() { void Renderer::CreateComputeDescriptorSets() { // TODO: Create Descriptor sets for the compute pipeline // The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades + grassComputeDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { grassComputeDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(grassComputeDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassComputeDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(3 * grassComputeDescriptorSets.size()); + + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo inputBladesBufferInfo = {}; + inputBladesBufferInfo.buffer = scene->GetBlades()[i]->GetBladesBuffer(); + inputBladesBufferInfo.offset = 0; + inputBladesBufferInfo.range = sizeof(Blade) * NUM_BLADES; + + VkDescriptorBufferInfo culledBladesBufferInfo = {}; + culledBladesBufferInfo.buffer = scene->GetBlades()[i]->GetCulledBladesBuffer(); + culledBladesBufferInfo.offset = 0; + culledBladesBufferInfo.range = sizeof(Blade) * NUM_BLADES; + + VkDescriptorBufferInfo numBladesBufferInfo = {}; + numBladesBufferInfo.buffer = scene->GetBlades()[i]->GetNumBladesBuffer(); + numBladesBufferInfo.offset = 0; + numBladesBufferInfo.range = sizeof(BladeDrawIndirect); + + descriptorWrites[3 * i + 0].sType =
VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
+        descriptorWrites[3 * i + 0].dstSet = grassComputeDescriptorSets[i];
+        descriptorWrites[3 * i + 0].dstBinding = 0;
+        descriptorWrites[3 * i + 0].dstArrayElement = 0;
+        descriptorWrites[3 * i + 0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
+        descriptorWrites[3 * i + 0].descriptorCount = 1; // descriptorCount is 1 here and in the two writes below:
+                                                         // it is the number of elements in pBufferInfo, and each write
+                                                         // passes a single VkDescriptorBufferInfo (here inputBladesBufferInfo)
+                                                         // https://www.khronos.org/registry/vulkan/specs/1.0/man/html/VkWriteDescriptorSet.html
+        descriptorWrites[3 * i + 0].pBufferInfo = &inputBladesBufferInfo;
+        descriptorWrites[3 * i + 0].pImageInfo = nullptr;
+        descriptorWrites[3 * i + 0].pTexelBufferView = nullptr;
+
+        descriptorWrites[3 * i + 1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
+        descriptorWrites[3 * i + 1].dstSet = grassComputeDescriptorSets[i];
+        descriptorWrites[3 * i + 1].dstBinding = 1;
+        descriptorWrites[3 * i + 1].dstArrayElement = 0;
+        descriptorWrites[3 * i + 1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
+        descriptorWrites[3 * i + 1].descriptorCount = 1;
+        descriptorWrites[3 * i + 1].pBufferInfo = &culledBladesBufferInfo;
+        descriptorWrites[3 * i + 1].pImageInfo = nullptr;
+        descriptorWrites[3 * i + 1].pTexelBufferView = nullptr;
+
+        descriptorWrites[3 * i + 2].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
+        descriptorWrites[3 * i + 2].dstSet = grassComputeDescriptorSets[i];
+        descriptorWrites[3 * i + 2].dstBinding = 2;
+        descriptorWrites[3 * i + 2].dstArrayElement = 0;
+        descriptorWrites[3 * i + 2].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
+        descriptorWrites[3 * i + 2].descriptorCount = 1;
+        descriptorWrites[3 * i + 2].pBufferInfo = &numBladesBufferInfo;
+        descriptorWrites[3 * i + 2].pImageInfo = nullptr;
+        descriptorWrites[3 * i + 2].pTexelBufferView = nullptr;
+
+        // Update this set before the iteration ends: pBufferInfo points at the loop-local structs above
+        vkUpdateDescriptorSets(logicalDevice, 3, &descriptorWrites[3 * i], 0, nullptr);
+    }
 }

 void Renderer::CreateGraphicsPipeline() {
@@ -717,15 +867,22 @@ void Renderer::CreateComputePipeline() {
     computeShaderStageInfo.pName = "main";

     // TODO: Add the compute descriptor set layout you create to this list
-    std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout };
+    std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, grassComputeDescriptorSetLayout };
+
+    // Define a push constant range to hold NUM_BLADES
+    VkPushConstantRange pushConstantRange = {};
+    pushConstantRange.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
+    pushConstantRange.offset = 0;
+    pushConstantRange.size = sizeof(int);
+    VkPushConstantRange pushConstantRangeArray[] = { pushConstantRange };

     // Create pipeline layout
     VkPipelineLayoutCreateInfo pipelineLayoutInfo = {};
     pipelineLayoutInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
     pipelineLayoutInfo.setLayoutCount = static_cast<uint32_t>(descriptorSetLayouts.size());
     pipelineLayoutInfo.pSetLayouts = descriptorSetLayouts.data();
-    pipelineLayoutInfo.pushConstantRangeCount = 0;
-    pipelineLayoutInfo.pPushConstantRanges = 0;
+    pipelineLayoutInfo.pushConstantRangeCount = 1;
+    pipelineLayoutInfo.pPushConstantRanges = pushConstantRangeArray;

     if (vkCreatePipelineLayout(logicalDevice, &pipelineLayoutInfo, nullptr, &computePipelineLayout) != VK_SUCCESS) {
         throw std::runtime_error("Failed to create pipeline layout");
@@ -883,7 +1040,15 @@ void Renderer::RecordComputeCommandBuffer() {
     // Bind descriptor set for time uniforms
     vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 1, 1, &timeDescriptorSet, 0, nullptr);

+    // Update push constants
+    int pushValues[] = { NUM_BLADES };
+    vkCmdPushConstants(computeCommandBuffer, computePipelineLayout, VK_SHADER_STAGE_COMPUTE_BIT, 0, sizeof(int), pushValues);
+
     // TODO: 
For each group of blades bind its descriptor set and dispatch + for (uint32_t j = 0; j < scene->GetBlades().size(); ++j) { + vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &grassComputeDescriptorSets[j], 0, nullptr); + vkCmdDispatch(computeCommandBuffer, (int)ceil((float)NUM_BLADES / WORKGROUP_SIZE), 1, 1); + } // ~ End recording ~ if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) { @@ -973,16 +1138,20 @@ void Renderer::RecordCommandBuffers() { vkCmdBindPipeline(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipeline); for (uint32_t j = 0; j < scene->GetBlades().size(); ++j) { - VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() }; + VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() };//{ scene->GetBlades()[j]->GetCulledBladesBuffer() }; VkDeviceSize offsets[] = { 0 }; // TODO: Uncomment this when the buffers are populated - // vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); + vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); // TODO: Bind the descriptor set for each grass blades model + vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr); // Draw // TODO: Uncomment this when the buffers are populated - // vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); + // CHECKITOUT: it's getNumBladesBuffer that specifies how many threads are spawned + // see: https://www.khronos.org/registry/vulkan/specs/1.0/man/html/vkCmdDrawIndirect.html + // see: https://www.khronos.org/registry/vulkan/specs/1.0/man/html/VkDrawIndirectCommand.html + vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); } // End render pass @@ -1033,6 +1202,35 @@ void Renderer::Frame() { throw std::runtime_error("Failed 
to submit draw command buffer");
     }

+#if PRINT_NUM_BLADES
+    // try to read numBladesBuffer ============================================
+    // Create the staging buffer
+    VkBuffer stagingBuffer;
+    VkDeviceMemory stagingBufferMemory;
+    VkDeviceSize bufferSize = sizeof(BladeDrawIndirect);
+
+    VkBufferUsageFlags stagingUsage = VK_BUFFER_USAGE_TRANSFER_DST_BIT;
+    VkMemoryPropertyFlags stagingProperties = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT;
+    BufferUtils::CreateBuffer(device, bufferSize, stagingUsage, stagingProperties, stagingBuffer, stagingBufferMemory);
+
+    // Fill the staging buffer with the GPU-side indirect draw parameters
+    BufferUtils::CopyBuffer(device, computeCommandPool, scene->GetBlades()[0]->GetNumBladesBuffer(), stagingBuffer, bufferSize);
+
+    // Map the staging memory so the CPU can read the copied data
+    void *data;
+    vkMapMemory(device->GetVkDevice(), stagingBufferMemory, 0, bufferSize, 0, &data);
+
+    // Read "data" while it is still mapped: the pointer is invalid after vkUnmapMemory
+    BladeDrawIndirect* indirectDraw = (BladeDrawIndirect*)data;
+    printf("num blades: %u\n", indirectDraw->vertexCount);
+    vkUnmapMemory(device->GetVkDevice(), stagingBufferMemory);
+
+    // No need for the staging buffer anymore
+    vkDestroyBuffer(device->GetVkDevice(), stagingBuffer, nullptr);
+    vkFreeMemory(device->GetVkDevice(), stagingBufferMemory, nullptr);
+    // try to read numBladesBuffer ============================================
+#endif // PRINT_NUM_BLADES
+
     if (!swapChain->Present()) {
         RecreateFrameResources();
     }
@@ -1057,6 +1255,7 @@ Renderer::~Renderer() {
     vkDestroyDescriptorSetLayout(logicalDevice, cameraDescriptorSetLayout, nullptr);
     vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr);
     vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr);
+    vkDestroyDescriptorSetLayout(logicalDevice, grassComputeDescriptorSetLayout, nullptr);

     vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr);

diff --git a/src/Renderer.h b/src/Renderer.h
index 95e025f..78ab81d 100644
--- a/src/Renderer.h
+++ b/src/Renderer.h
@@ -56,12 +56,15 @@ class Renderer {
     VkDescriptorSetLayout cameraDescriptorSetLayout;
     VkDescriptorSetLayout modelDescriptorSetLayout;
     VkDescriptorSetLayout timeDescriptorSetLayout;
+    VkDescriptorSetLayout grassComputeDescriptorSetLayout;

     VkDescriptorPool descriptorPool;

     VkDescriptorSet cameraDescriptorSet;
     std::vector<VkDescriptorSet> modelDescriptorSets;
+    std::vector<VkDescriptorSet> grassDescriptorSets;
     VkDescriptorSet timeDescriptorSet;
+    std::vector<VkDescriptorSet> grassComputeDescriptorSets;

     VkPipelineLayout graphicsPipelineLayout;
     VkPipelineLayout grassPipelineLayout;
diff --git a/src/Scene.cpp b/src/Scene.cpp
index 86894f2..768b790 100644
--- a/src/Scene.cpp
+++ b/src/Scene.cpp
@@ -1,7 +1,9 @@
 #include "Scene.h"
 #include "BufferUtils.h"

-Scene::Scene(Device* device) : device(device) {
+#define PRINT_AVG_DELTA 0
+
+Scene::Scene(Device* device) : device(device), deltaAcc(0.0f), deltaCount(0) {
     BufferUtils::CreateBuffer(device, sizeof(Time), VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, timeBuffer, timeBufferMemory);
     vkMapMemory(device->GetVkDevice(), timeBufferMemory, 0, sizeof(Time), 0, &mappedData);
     memcpy(mappedData, &time, sizeof(Time));
@@ -32,6 +34,16 @@ void Scene::UpdateTime() {
     time.totalTime += time.deltaTime;

     memcpy(mappedData, &time, sizeof(Time));
+#if PRINT_AVG_DELTA
+    deltaAcc += time.deltaTime;
+    deltaCount++;
+
+    if (deltaCount >= MAX_DELTA_COUNT) {
+        printf("avg delta: %.3f ms\n", 1000.0f * deltaAcc / (float)deltaCount);
+        deltaAcc = 0.0f;
+        deltaCount = 0;
+    }
+#endif // PRINT_AVG_DELTA
 }

 VkBuffer Scene::GetTimeBuffer() const {
diff --git a/src/Scene.h b/src/Scene.h
index 7699d78..4f9d9bd 100644
--- a/src/Scene.h
+++ b/src/Scene.h
@@ -6,6 +6,8 @@
 #include "Model.h"
 #include "Blades.h"

+#define MAX_DELTA_COUNT 2000
+
 using namespace std::chrono;

 struct Time {
@@ -26,6 +28,9 @@ class Scene {
     std::vector<Model*> models;
     std::vector<Blades*> blades;

+    float deltaAcc; // accumulates deltaTime
+    int 
deltaCount; // counts how many times deltaTime has been accumulated + high_resolution_clock::time_point startTime = high_resolution_clock::now(); public: diff --git a/src/shaders/compute.comp b/src/shaders/compute.comp index 0fd0224..98b9a42 100644 --- a/src/shaders/compute.comp +++ b/src/shaders/compute.comp @@ -1,9 +1,37 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable +#define DISTANCE_BUCKETS 8 +#define MAX_DISTANCE 65.0 + +#define WIND_TO_COLOR_FACTOR 0.03 +#define WIND_X 0 +#define WIND_Y 1 +#define WIND_Z 2 +#define WIND_RADIAL 3 +#define WIND_CIRCLE 4 +#define WIND_XZ 5 +#define WIND_CONST 6 +#define WIND_TEXT 7 + +#define WIND_TYPE WIND_XZ + +#define WIND_CIRCLE_RADIUS 5.0 + +#define USE_CUSTOM_COLOR 1 + +#define ORIENTATION_CULL 1 +#define FRUSTUM_CULL 1 +#define DISTANCE_CULL 1 + #define WORKGROUP_SIZE 32 layout(local_size_x = WORKGROUP_SIZE, local_size_y = 1, local_size_z = 1) in; +// based on: https://stackoverflow.com/questions/37056159/using-different-push-constants-in-different-shader-stages +layout(push_constant) uniform s_pushConstants { + int numBlades; +} pushConstants; + layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 view; mat4 proj; @@ -19,6 +47,7 @@ struct Blade { vec4 v1; vec4 v2; vec4 up; + vec4 color; }; // TODO: Add bindings to: @@ -36,21 +65,266 @@ struct Blade { // uint firstInstance; // = 0 // } numBlades; +layout(set = 2, binding = 0) buffer InputBlades { + Blade inputBlades[]; +}; + +//output +layout(set = 2, binding = 1) buffer CulledBlades { + Blade culledBlades[]; +}; + +layout(set = 2, binding = 2) buffer NumBlades { + uint vertexCount; // Write the number of blades remaining here + uint instanceCount; // = 1 + uint firstVertex; // = 0 + uint firstInstance; // = 0 +} numBlades; + bool inBounds(float value, float bounds) { return (value >= -bounds) && (value <= bounds); } +#if WIND_TYPE == WIND_TEXT +#define WIND_TEXT_MAX_DISTANCE 0.7 +// I derived this while waiting for food at Wawa. 
Hopefully it is good enough +// segA : one end of line segment +// segB : other end +vec3 getWindFromLineSegment(vec3 p, vec3 segA, vec3 segB) { + vec3 pA = segA - p; + vec3 pB = segB - p; + vec3 AB = normalize(segB - segA); + float dotABpA = dot(AB, pA); + float dotABpB = dot(AB, pB); + // check if p's "projection" is on the line segment + if (sign(dotABpA) == sign(dotABpB)) { + // in this case, it is not, so return no wind + return vec3(0.0); + } + + // compute h, height of triangle defined by p, segA, segB, relative to p + // to do this, compute l, leg of right triangle defined by h, pA + //AB = normalize(AB); + // equivalent to: ||AB|| ||pA|| cos(x) * AB + // 1 ||pA|| cos(x) * AB + // ||l|| * AB + vec3 l = dotABpA * AB; + vec3 h = pA - l; + float dist = length(h); + if (dist > WIND_TEXT_MAX_DISTANCE) { + return vec3(0.0); + } + vec3 dir = -normalize(h); + return dir * (1.0 - dist / WIND_TEXT_MAX_DISTANCE); +} +#endif + void main() { // Reset the number of blades to 0 if (gl_GlobalInvocationID.x == 0) { - // numBlades.vertexCount = 0; + numBlades.vertexCount = 0; } barrier(); // Wait till all threads reach this point + // TODO: push constant??? 
+ if (gl_GlobalInvocationID.x >= pushConstants.numBlades) { + return; + } + // TODO: Apply forces on every blade and update the vertices in the buffer + Blade blade = inputBlades[gl_GlobalInvocationID.x]; + + // extract things stored in Ws + float orientation = blade.v0.w; + float height = blade.v1.w; + float width = blade.v2.w; + float stiffness = blade.up.w; + + // recovery force ========================================================= + // compute "initial position" + // go up from base (v0) + vec3 initialV2 = blade.v0.xyz + blade.up.xyz * height; + vec3 recovery = (initialV2 - blade.v2.xyz) * stiffness; + + // gravity ================================================================ + // environmental gravity + // hardcoded + vec3 gEnv = vec3(0.0, -1.0, 0.0); + + // front direction for front gravity + vec3 front = vec3(cos(orientation), 0.0, sin(orientation)); + vec3 gFront = 0.25 * length(gEnv) * front; + vec3 gravity = gEnv + gFront; + + // wind =================================================================== + // hardcode for now + // "raw" because it is not the final wind "force" + // btw, these aren't really forces, are they. they are velocities. they aren't accelerating anything and ignore mass. +#if WIND_TYPE == WIND_X + vec3 windRaw = vec3(1.0, 0.0, 0.0) * (sin(totalTime) + 1.0) * 200.0; +#elif WIND_TYPE == WIND_Y + // add some X and Z components to wind because we need to nudge v2 + // to get non-zero wind alignment factor + vec3 windRaw = vec3(0.0, 0.9, 0.0) * (sin(totalTime)) * 400.0 + vec3(0.1, 0.0, 0.1) * 200.0; +#elif WIND_TYPE == WIND_Z + vec3 windRaw = vec3(0.0, 0.0, 1.0) * (sin(totalTime) + 1.0) * 200.0; +#elif WIND_TYPE == WIND_XZ + // 90% is determined by pow() term. we use pow to exaggerate the shape of sin() so it spends + // less time on higher values. 
this makes the wind spend less time being strong + // add constant 10% to prevent wind from completely stopping unnaturally + vec3 windRaw = vec3(1.0, 0.0, 0.0) * (pow((sin(totalTime * 0.75) * 0.5 + 0.5001), 4.0) * 180.0 + 20.0); + // make it depend on blade's Z as well so it's not one uniform wind force + windRaw *= (cos(totalTime * 2.5 + blade.v0.z) * 0.5 + 0.5001) * 0.8 + 0.2; + // very light dependency on X as well + windRaw *= (cos(totalTime * -8.0 + blade.v0.x) * 0.5 + 0.5001) * 0.5 + 0.5; +#elif WIND_TYPE == WIND_RADIAL + vec3 windRaw = normalize(blade.v0.xyz) * (sin(-totalTime * 5.0f + length(blade.v0.xyz)) + 1.0) * 200.0; +#elif WIND_TYPE == WIND_CIRCLE + vec3 circlePoint = WIND_CIRCLE_RADIUS * vec3(cos(totalTime), 0.0, sin(totalTime)); + // circleTangent will be wind direction + vec3 circleTangent = vec3(-circlePoint.z, 0.0, circlePoint.x); // -x/y is perpendicular slope to x/y; + // negating Y instead of X gives right direction + float circlePointDist = distance(circlePoint, blade.v0.xyz); + float windStrength = (circlePointDist < WIND_CIRCLE_RADIUS) ? 
(1.0 - circlePointDist / WIND_CIRCLE_RADIUS) * 200.0 + : 0.0; + vec3 windRaw = circleTangent * windStrength; +#elif WIND_TYPE == WIND_CONST + vec3 windRaw = vec3(0.577350269, 0.577350269, -0.577350269) * 200.0; +#elif WIND_TYPE == WIND_TEXT + // each call to getWindFromLineSegment() will draw one line segment + // draw left 5 + vec3 windRaw = getWindFromLineSegment(blade.v0.xyz, vec3(-7.0, 0.0, -4.0), vec3(-3.0, 0.0, -4.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3(-7.0, 0.0, -4.0), vec3(-7.0, 0.0, 1.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3(-3.0, 0.0, 1.0), vec3(-7.0, 0.0, 1.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3(-3.0, 0.0, 1.0), vec3(-3.0, 0.0, 6.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3(-7.0, 0.0, 6.0), vec3(-3.0, 0.0, 6.0)); + // draw 6 + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3(-2.0, 0.0, -4.0), vec3( 2.0, 0.0, -4.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3(-2.0, 0.0, -4.0), vec3(-2.0, 0.0, 1.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3( 2.0, 0.0, 1.0), vec3(-2.0, 0.0, 1.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3( 2.0, 0.0, 1.0), vec3( 2.0, 0.0, 6.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3(-2.0, 0.0, 6.0), vec3( 2.0, 0.0, 6.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3(-2.0, 0.0, 6.0), vec3(-2.0, 0.0, 1.0)); + // draw right 5 + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3( 3.0, 0.0, -4.0), vec3( 7.0, 0.0, -4.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3( 3.0, 0.0, -4.0), vec3( 3.0, 0.0, 1.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3( 7.0, 0.0, 1.0), vec3( 3.0, 0.0, 1.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3( 7.0, 0.0, 1.0), vec3( 7.0, 0.0, 6.0)); + windRaw += getWindFromLineSegment(blade.v0.xyz, vec3( 3.0, 0.0, 6.0), vec3( 7.0, 0.0, 6.0)); + windRaw *= 150.0 * pow((sin(totalTime) * 0.5 + 0.5001), 4.0); +#endif + float windDirectionalAlignment = 
1.0 - abs(dot(normalize(windRaw), normalize(blade.v2.xyz - blade.v0.xyz))); + float windHeightRatio = dot(blade.v2.xyz - blade.v0.xyz, blade.up.xyz) / height; + float windAlignment = windDirectionalAlignment * windHeightRatio; + + // check if windRaw == vec3(0.0). if it is, windalignment is probably nan + vec3 wind = windRaw * (windRaw == vec3(0.0) ? 1.0 : windAlignment); + + // reaction =============================================================== + vec3 reaction = (recovery + gravity + wind) * deltaTime; + + // "candidate" v2 -- validate before storing in blade + vec3 candidateV2 = blade.v2.xyz + reaction; + + // validate v2 ============================================================ + candidateV2 = candidateV2 - blade.up.xyz * min(dot(blade.up.xyz, candidateV2 - blade.v0.xyz), 0.0); + + // compute V1 ============================================================= + float projectedLength = length(candidateV2 - blade.v0.xyz - blade.up.xyz * dot(candidateV2 - blade.v0.xyz, blade.up.xyz)); + vec3 candidateV1 = blade.v0.xyz + height * blade.up.xyz * max(1.0 - projectedLength / height, + 0.05 * max(projectedLength / height, 1.0)); + + // validate V1 ============================================================ + // formula 12 for n = 2 + float bezierLength = (2.0 * distance(candidateV2, blade.v0.xyz) + distance(candidateV1, blade.v0.xyz) + distance(candidateV2, candidateV1)) / 3.0; + float heightLengthRatio = height / bezierLength; + + // write corrected values to blade + blade.v1.xyz = blade.v0.xyz + heightLengthRatio * (candidateV1 - blade.v0.xyz); + blade.v2.xyz = blade.v1.xyz + heightLengthRatio * (candidateV2 - candidateV1); + //blade.v2.xyz += vec3(sin(totalTime), 0.0, cos(totalTime)); + //blade.v2.xyz = candidateV2; + + // set custom color ======================================================= +#if USE_CUSTOM_COLOR + wind = abs(wind); + blade.color.xyz = wind * WIND_TO_COLOR_FACTOR * 0.8 + vec3(0.2); + blade.color.w = 1.0; +#else + blade.color.w = 0.0; +#endif 
+
+    // store updated blade ====================================================
+    inputBlades[gl_GlobalInvocationID.x] = blade;
+
     // TODO: Cull blades that are too far away or not in the camera frustum and write them
     // to the culled blades buffer
     // Note: to do this, you will need to use an atomic operation to read and update numBlades.vertexCount
     // You want to write the visible blades to the buffer without write conflicts between threads
+    // use atomicAdd
+
+    mat4 inverseView = inverse(camera.view);
+#if ORIENTATION_CULL
+    // orientation culling ====================================================
+
+    vec3 viewDir = normalize(vec3(inverseView * vec4(0.0, 0.0, 1.0, 0.0)));
+    // if viewDir and front are roughly perpendicular, cull
+    if (abs(dot(viewDir, front)) <= 0.05) {
+        return;
+    }
+#endif // ORIENTATION_CULL
+
+#if FRUSTUM_CULL
+    // view frustum culling ===================================================
+    vec3 midpoint = 0.25 * blade.v0.xyz + 0.5 * blade.v1.xyz + 0.25 * blade.v2.xyz;
+
+    mat4 viewProj = camera.proj * camera.view;
+
+    // test V0: a point is outside the frustum if its x OR its y is out of bounds
+    const float frustumTolerance = 0.05;
+    vec4 projPoint = viewProj * vec4(blade.v0.xyz, 1.0);
+    float frustumLimit = projPoint.w + frustumTolerance;
+
+    if (!inBounds(projPoint.x, frustumLimit) || !inBounds(projPoint.y, frustumLimit)) {
+        // test midpoint
+        projPoint = viewProj * vec4(midpoint, 1.0);
+        frustumLimit = projPoint.w + frustumTolerance;
+
+        if (!inBounds(projPoint.x, frustumLimit) || !inBounds(projPoint.y, frustumLimit)) {
+            // test V2
+            projPoint = viewProj * vec4(blade.v2.xyz, 1.0);
+            frustumLimit = projPoint.w + frustumTolerance;
+
+            // V0, midpoint, V2 are all outside frustum: cull
+            if (!inBounds(projPoint.x, frustumLimit) || !inBounds(projPoint.y, frustumLimit)) {
+                return;
+            }
+        }
+    }
+#endif // FRUSTUM_CULL
+
+#if DISTANCE_CULL
+    // distance culling =======================================================
+    vec3 cameraEye = vec3(inverseView[3]);
+    vec3 eyeToBlade = blade.v0.xyz - cameraEye;
+    float 
projDistance = length(eyeToBlade - blade.up.xyz * dot(eyeToBlade, blade.up.xyz));
+    // cull if too far
+    if (projDistance > MAX_DISTANCE) {
+        return;
+    }
+    int indexMod = int(gl_GlobalInvocationID.x % DISTANCE_BUCKETS);
+    int cullability = int(floor(float(DISTANCE_BUCKETS) * (1.0 - projDistance / MAX_DISTANCE)));
+    if (indexMod > cullability) {
+        return; // cull
+    }
+#endif // DISTANCE_CULL
+
+    //blade.v0.x += 1.0 * sin(totalTime);// + 3.14159265 * 0.5);
+    //blade.v0.z += 1.0 * cos(totalTime);
+
+    uint idx = atomicAdd(numBlades.vertexCount, 1);
+    culledBlades[idx] = blade; //inputBlades[gl_GlobalInvocationID.x];
+
+
 }
diff --git a/src/shaders/grass.frag b/src/shaders/grass.frag
index c7df157..a6054a8 100644
--- a/src/shaders/grass.frag
+++ b/src/shaders/grass.frag
@@ -7,11 +7,24 @@ layout(set = 0, binding = 0) uniform CameraBufferObject {
 } camera;

 // TODO: Declare fragment shader inputs
+layout(location = 0) in vec2 fs_uv;
+layout(location = 1) in vec3 fs_normal;
+layout(location = 2) in vec4 fs_color;

 layout(location = 0) out vec4 outColor;

 void main() {
-    // TODO: Compute fragment color
-
-    outColor = vec4(1.0);
+    if (fs_color.w > 0.0) {
+        // use custom color in fs_color
+        outColor = vec4(fs_color.xyz, 1.0);
+    }
+    else {
+        // use green + lambert shading
+        const vec3 lightDir = vec3(-0.577350269, 0.577350269, 0.577350269);
+        float lambert = max(dot(fs_normal, lightDir), dot(-fs_normal, lightDir));
+        lambert = clamp(lambert, 0.25, 1.0) * 0.5 + 0.5;
+        vec3 color = vec3(0.1, 0.9, 0.2) * lambert;
+        outColor = vec4(color, 1.0);
+    }
+    //outColor = vec4(abs(fs_normal), 1.0);
 }
diff --git a/src/shaders/grass.tesc b/src/shaders/grass.tesc
index f9ffd07..b90e048 100644
--- a/src/shaders/grass.tesc
+++ b/src/shaders/grass.tesc
@@ -1,6 +1,13 @@
 #version 450
 #extension GL_ARB_separate_shader_objects : enable

+#define DYNAMIC_TESSELLATION 1
+
+#if DYNAMIC_TESSELLATION
+#define MAX_TESSELLATION 4.0
+#define MAX_DISTANCE 36.0
+#endif
+
 layout(vertices = 1) out;

 layout(set = 0, binding = 0) uniform CameraBufferObject {
@@ -9,18 +16,45 @@ layout(set = 0, binding = 0) uniform CameraBufferObject {
 } camera;

 // TODO: Declare tessellation control shader inputs and outputs
+// https://stackoverflow.com/questions/20726441/passing-data-through-tessellation-shaders-to-the-fragment-shader
+layout(location = 0) in vec4 tesc_v1[];
+layout(location = 1) in vec4 tesc_v2[];
+layout(location = 2) in vec4 tesc_up[];
+layout(location = 3) in vec4 tesc_bitangent[];
+layout(location = 4) in vec4 tesc_color[];
+
+layout(location = 0) patch out vec4 tese_v1;
+layout(location = 1) patch out vec4 tese_v2;
+layout(location = 2) patch out vec4 tese_up;
+layout(location = 3) patch out vec4 tese_bitangent;
+layout(location = 4) patch out vec4 tese_color;

 void main() {
     // Don't move the origin location of the patch
     gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

     // TODO: Write any shader outputs
+    tese_v1 = tesc_v1[gl_InvocationID];
+    tese_v2 = tesc_v2[gl_InvocationID];
+    tese_up = tesc_up[gl_InvocationID];
+    tese_color = tesc_color[gl_InvocationID];
+    //tese_orientationAndWidth = tesc_orientationAndWidth[gl_InvocationID];
+    tese_bitangent = tesc_bitangent[gl_InvocationID];

+    // compute distance from camera to blade
+#if DYNAMIC_TESSELLATION
+    vec3 cameraEye = vec3(inverse(camera.view)[3]);
+    float dist = distance(cameraEye, gl_in[gl_InvocationID].gl_Position.xyz);
+    float tessellationLevel = dist < MAX_DISTANCE ? ceil(MAX_TESSELLATION * (1.0 - dist / MAX_DISTANCE))
+                                                  : 1.0;
+#else
+    float tessellationLevel = 4.0;
+#endif
     // TODO: Set level of tessellation
-    // gl_TessLevelInner[0] = ???
-    // gl_TessLevelInner[1] = ???
-    // gl_TessLevelOuter[0] = ???
-    // gl_TessLevelOuter[1] = ???
-    // gl_TessLevelOuter[2] = ???
-    // gl_TessLevelOuter[3] = ???
+    gl_TessLevelInner[0] = 1.0; // 1 horizontal slice
+    gl_TessLevelInner[1] = tessellationLevel; // tessellationLevel vertical slices
+    gl_TessLevelOuter[0] = tessellationLevel; // left edge (u = 0): tessellationLevel slices
+    gl_TessLevelOuter[1] = 1.0; // bottom edge (v = 0): 1 slice
+    gl_TessLevelOuter[2] = tessellationLevel; // right edge (u = 1): tessellationLevel slices
+    gl_TessLevelOuter[3] = 1.0; // top edge (v = 1): 1 slice
 }
diff --git a/src/shaders/grass.tese b/src/shaders/grass.tese
index 751fff6..4104ac7 100644
--- a/src/shaders/grass.tese
+++ b/src/shaders/grass.tese
@@ -9,10 +9,48 @@ layout(set = 0, binding = 0) uniform CameraBufferObject {
 } camera;

 // TODO: Declare tessellation evaluation shader inputs and outputs
+layout(location = 0) out vec2 fs_uv;
+layout(location = 1) out vec3 fs_normal;
+layout(location = 2) out vec4 fs_color;
+
+layout(location = 0) patch in vec4 tese_v1;
+layout(location = 1) patch in vec4 tese_v2;
+layout(location = 2) patch in vec4 tese_up;
+layout(location = 3) patch in vec4 tese_bitangent;
+layout(location = 4) patch in vec4 tese_color;

 void main() {
     float u = gl_TessCoord.x;
     float v = gl_TessCoord.y;

     // TODO: Use u and v to parameterize along the grass blade and output positions for each vertex of the grass blade
+    mat4 viewProj = camera.proj * camera.view;
+    //vec4 worldPos = gl_in[0].gl_Position;
+    // hard-coded width for now
+    //worldPos.x += 0.12 * (u > 0.5 ? 1.0 : -1.0);
+    //worldPos.x = v >= 0.9 ? gl_in[0].gl_Position.x : worldPos.x;
+
+    // bezier w/ De Casteljau =================================================
+    vec4 bezierA = gl_in[0].gl_Position + v * (tese_v1 - gl_in[0].gl_Position);
+    vec4 bezierB = tese_v1 + v * (tese_v2 - tese_v1);
+    vec4 bezierC = bezierA + v * (bezierB - bezierA);
+    vec4 worldPos = bezierC;
+
+    // move along bitangent, unless at top (being at top == v is 1.0)
+    worldPos.xyz += v >= 0.99 ? vec3(0.0) :
+                    tese_bitangent.xyz * tese_bitangent.w * (u > 0.5 ? 
1.0 : -1.0) * (1.0 - v); + + // compute normal + fs_normal = normalize(cross(normalize(vec3(bezierB - bezierA)), tese_bitangent.xyz)); + + // middle displacement ==================================================== + // TODO + // d = w n (0.5 - |u - 0.5|(1 - v)) + //vec3 middleDisplacement = tese_bitangent.w * fs_normal * (0.5 - abs(u - 0.5) * (1.0 - v)); + //worldPos.xyz += middleDisplacement; + + gl_Position = viewProj * worldPos; + fs_color = tese_color; + fs_uv.x = (0.49 <= u && u <= 0.5) ? 1.0 : 0.0; + fs_uv.y = (0.24 <= v && v <= 0.26) ? 1.0 : 0.0; } diff --git a/src/shaders/grass.vert b/src/shaders/grass.vert index db9dfe9..a2c243c 100644 --- a/src/shaders/grass.vert +++ b/src/shaders/grass.vert @@ -1,6 +1,7 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable +#define PI_OVER_TWO 1.57079632679 layout(set = 1, binding = 0) uniform ModelBufferObject { mat4 model; @@ -8,10 +9,46 @@ layout(set = 1, binding = 0) uniform ModelBufferObject { // TODO: Declare vertex shader inputs and outputs +layout(location = 0) in vec4 vs_v0; +layout(location = 1) in vec4 vs_v1; +layout(location = 2) in vec4 vs_v2; +layout(location = 3) in vec4 vs_up; +layout(location = 4) in vec4 vs_color; + +layout(location = 0) out vec4 tesc_v1; +layout(location = 1) out vec4 tesc_v2; +layout(location = 2) out vec4 tesc_up; +layout(location = 3) out vec4 tesc_bitangent; +layout(location = 4) out vec4 tesc_color; + out gl_PerVertex { vec4 gl_Position; }; void main() { // TODO: Write gl_Position and any other shader outputs + // compute v0 in world space + vec4 worldV0 = model * vec4(vs_v0.xyz, 1.0); + gl_Position = worldV0; + + // compute v1 in world space + vec4 worldV1 = model * vec4(vs_v1.xyz, 1.0); + tesc_v1 = worldV1; + + // compute v2 in world space + vec4 worldV2 = model * vec4(vs_v2.xyz, 1.0); + tesc_v2 = worldV2; + + // compute up in world space + vec4 worldUp = normalize(model * vec4(vs_up.xyz, 0.0)); + tesc_up = worldUp; + + // pass color along + tesc_color = vs_color; 
+
+    // subtract pi / 2 from the blade's orientation to get the direction along its width
+    float orientation = vs_v0.w - PI_OVER_TWO;
+    vec4 worldBitangent = normalize(model * vec4(cos(orientation), 0.0, sin(orientation), 0.0));
+    //tesc_orientationAndWidth = vec4(worldOrientation.xyz, vs_v2.w);
+    tesc_bitangent = vec4(worldBitangent.xyz, vs_v2.w); // store width in W
 }
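
Review note: two small pieces of arithmetic in this change are easy to sanity-check on the CPU. The sketch below mirrors the De Casteljau steps from `grass.tese` (`bezierA`, `bezierB`, `bezierC`) and shows an integer alternative to the `(int)ceil((float)NUM_BLADES / WORKGROUP_SIZE)` dispatch count. It is a minimal illustration, not project code: `Vec3`, `mix`, `evalBladeCurve`, and `workgroupCount` are hypothetical names introduced here.

```cpp
#include <cassert>

// Hypothetical stand-in for a GLSL vec3; not a type from this project.
struct Vec3 {
    float x, y, z;
};

// Linear interpolation, same meaning as GLSL mix(a, b, t).
static Vec3 mix(const Vec3& a, const Vec3& b, float t) {
    return { a.x + t * (b.x - a.x),
             a.y + t * (b.y - a.y),
             a.z + t * (b.z - a.z) };
}

// De Casteljau evaluation of the quadratic Bezier (v0, v1, v2) at parameter v:
// interpolate along both control-polygon edges, then between the two results.
static Vec3 evalBladeCurve(const Vec3& v0, const Vec3& v1, const Vec3& v2, float v) {
    const Vec3 a = mix(v0, v1, v);
    const Vec3 b = mix(v1, v2, v);
    return mix(a, b, v);
}

// Integer ceiling division: the number of workgroups needed to cover
// numBlades invocations, without a round trip through float.
static unsigned workgroupCount(unsigned numBlades, unsigned workgroupSize) {
    assert(workgroupSize > 0);
    return (numBlades + workgroupSize - 1) / workgroupSize;
}
```

Evaluating `evalBladeCurve` at `v = 0` and `v = 1` should return the base and tip control points, which is a quick way to confirm the shader's parameterization runs from root to tip.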