Add cudf table packing utilities with memory type support#843
Add cudf table packing utilities with memory type support#843nirandaperera wants to merge 10 commits intorapidsai:mainfrom
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
Signed-off-by: niranda perera <niranda.perera@gmail.com>
7dc13db to
bdf915a
Compare
Signed-off-by: niranda perera <niranda.perera@gmail.com>
wence-
left a comment
There was a problem hiding this comment.
I am a bit concerned that the logic here is more complicated than it needs to be.
Do we really need special case code for tables less than 1MiB in size, for example?
| * @note If the @p data is empty, the resulting Buffer will be DEVICE memory type. | ||
| * |
There was a problem hiding this comment.
question: Is this an important property of the implementation?
There was a problem hiding this comment.
This was coming from cuda::is_host_accessible check. It was returning true for empty rmm::device_buffers.
I have replaced this with a new move for Host buffers. So, this will not be relevant anymore.
| * @brief Move rmm::device_buffer (resides in device or pinned host memory) into a | ||
| * Buffer. | ||
| * | ||
| * This operation is cheap; no copy is performed. The resulting Buffer resides in | ||
| * device memory. | ||
| * device memory or pinned host memory. |
There was a problem hiding this comment.
question: Why do we need to document the possible locations of the device_buffer? This is also slightly a lie, right? If the buffer was allocated with a managed memory resource then it might physically be located in host memory.
There was a problem hiding this comment.
I moved this to a separate method. So, this is not relevant now.
| /** | ||
| * @brief Get the memory types that are device accessible. | ||
| * | ||
| * @return A span of memory types that are device accessible. | ||
| */ | ||
| constexpr std::span<MemoryType const> device_accessible_memory_types() noexcept { | ||
| return std::span{MEMORY_TYPES}.first<2>(); | ||
| } | ||
|
|
||
| static_assert(std::ranges::equal( | ||
| device_accessible_memory_types(), | ||
| std::array{MemoryType::DEVICE, MemoryType::PINNED_HOST} |
There was a problem hiding this comment.
question: Why not just return a std::array of the two memory types?
There was a problem hiding this comment.
I wanted to make sure that if we meddle with MEMORY_TYPES array in the future, we fail statically if somethings not right.
cpp/src/integrations/cudf/pack.cpp
Outdated
| cudf::chunked_pack cpack( | ||
| table, cudf_chunked_pack_min_buffer_size, stream, br->device_mr() | ||
| ); | ||
| RAPIDSMPF_EXPECTS( | ||
| cpack.get_total_contiguous_size() == 0, | ||
| "empty table should have 0 contiguous size", | ||
| std::runtime_error | ||
| ); |
cpp/src/integrations/cudf/pack.cpp
Outdated
| auto const est_size = std::max(raw_est_size, cudf_chunked_pack_min_buffer_size); | ||
|
|
||
| // max available memory for bounce buffer | ||
| auto max_availble = [](size_t res_sz, size_t ob) -> size_t { |
cpp/src/integrations/cudf/pack.cpp
Outdated
| * 7. If all attempts fail, throw an error. | ||
| */ | ||
| template <> | ||
| std::unique_ptr<PackedData> pack<MemoryType::HOST>( |
There was a problem hiding this comment.
There's no code sharing between the different templated pack functions. So I don't think it is worth it adding templates. Let's just call these pack_device, pack_pinned, pack_host.
Or, find ways of sharing code (e.g. above for pinned and device).
cpp/src/integrations/cudf/pack.cpp
Outdated
|
|
||
| if (br->is_pinned_memory_available()) { // try pinned memory as fallback |
There was a problem hiding this comment.
| if (br->is_pinned_memory_available()) { // try pinned memory as fallback | |
| if (br->is_pinned_memory_available()) { |
cpp/src/integrations/cudf/pack.cpp
Outdated
| dev_res.clear(); | ||
| return chunked_pack(table, stream, bounce_buffer, br->device_mr(), reservation); | ||
| } | ||
| dev_res.clear(); // Release unused device reservation. |
There was a problem hiding this comment.
You could use a scoped block and not have to do this manually.
cpp/src/integrations/cudf/pack.cpp
Outdated
| auto dev_avail = max_availble(dev_res.size(), dev_ob); | ||
| if (dev_avail >= cudf_chunked_pack_min_buffer_size) { | ||
| rmm::device_buffer bounce_buffer(dev_avail, stream, br->device_mr()); | ||
| dev_res.clear(); |
There was a problem hiding this comment.
This will happen at the end of the block on the next line, or you could use br->allocate with your reservation and then get the device buffer out I suppose.
There was a problem hiding this comment.
br->allocate gives a Buffer. So, I'd have to move that again. Felt like that was too verbose.
…_pack_usage2 Signed-off-by: niranda perera <niranda.perera@gmail.com>
Introduces
pack.hppandpack.cppwith utilities for packing cudf tables into contiguous buffers with explicit memory type control. This enables packing directly to device, pinned host, or host memory using the rapidsmpf memory reservation system.Depends on #840
Changes
New files:
cpp/include/rapidsmpf/integrations/cudf/pack.hppcpp/src/integrations/cudf/pack.cppcpp/tests/test_cudf_pack.cppAPI:
detail::pack<MemoryType>()pack()chunked_pack()cudf::chunked_packwith a bounce buffer for memory-efficient packing to hostRefactored:
TableChunk::copy()now usesdetail::pack<>for host memory destinations