
Add cudf table packing utilities with memory type support#843

Open

nirandaperera wants to merge 10 commits into rapidsai:main from nirandaperera:better_cudf_pack_usage2

Conversation

Contributor

@nirandaperera nirandaperera commented Feb 3, 2026

Introduces pack.hpp and pack.cpp with utilities for packing cudf tables into contiguous buffers with explicit memory type control. This enables packing directly to device, pinned host, or host memory using the rapidsmpf memory reservation system.

Depends on #840

Changes

New files:

  • cpp/include/rapidsmpf/integrations/cudf/pack.hpp
  • cpp/src/integrations/cudf/pack.cpp
  • cpp/tests/test_cudf_pack.cpp

API:

  • detail::pack<MemoryType>(): templated pack with a compile-time memory type
  • pack(): runtime dispatch based on the reservation's memory type
  • chunked_pack(): uses cudf::chunked_pack with a bounce buffer for memory-efficient packing to host

Refactored:

  • TableChunk::copy() now uses detail::pack<> for host memory destinations
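The split between a compile-time `detail::pack<MemoryType>()` and a runtime `pack()` described above can be sketched as follows. This is a minimal, self-contained illustration of the dispatch pattern only; the enum values and function names are simplified stand-ins for the real rapidsmpf API, which also takes a cudf table, a stream, and a memory reservation.

```cpp
#include <stdexcept>
#include <string>

// Stand-in for rapidsmpf's memory type enum (assumed shape, not the real header).
enum class MemoryType { DEVICE, PINNED_HOST, HOST };

namespace detail {
// Compile-time memory type, mirroring detail::pack<MemoryType>() in the PR.
// Returns a tag string here instead of a PackedData, purely for illustration.
template <MemoryType M>
std::string pack() {
    if constexpr (M == MemoryType::DEVICE) {
        return "device";
    } else if constexpr (M == MemoryType::PINNED_HOST) {
        return "pinned_host";
    } else {
        return "host";
    }
}
}  // namespace detail

// Runtime dispatch based on the reservation's memory type, mirroring pack():
// each branch forwards to the statically specialized implementation.
std::string pack(MemoryType mem_type) {
    switch (mem_type) {
    case MemoryType::DEVICE:
        return detail::pack<MemoryType::DEVICE>();
    case MemoryType::PINNED_HOST:
        return detail::pack<MemoryType::PINNED_HOST>();
    case MemoryType::HOST:
        return detail::pack<MemoryType::HOST>();
    }
    throw std::invalid_argument{"unknown memory type"};
}
```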

@nirandaperera nirandaperera requested review from a team as code owners February 3, 2026 23:41
@nirandaperera nirandaperera marked this pull request as draft February 3, 2026 23:41

copy-pr-bot bot commented Feb 3, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

Signed-off-by: niranda perera <niranda.perera@gmail.com>
Signed-off-by: niranda perera <niranda.perera@gmail.com>
@nirandaperera nirandaperera force-pushed the better_cudf_pack_usage2 branch from 7dc13db to bdf915a Compare February 5, 2026 18:52
@nirandaperera nirandaperera added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Feb 5, 2026
Signed-off-by: niranda perera <niranda.perera@gmail.com>
Signed-off-by: niranda perera <niranda.perera@gmail.com>
@nirandaperera nirandaperera marked this pull request as ready for review February 6, 2026 16:00
Contributor

@wence- wence- left a comment


I am a bit concerned that the logic here is more complicated than it needs to be.

Do we really need special case code for tables less than 1MiB in size, for example?

Comment on lines +322 to +323
* @note If the @p data is empty, the resulting Buffer will be DEVICE memory type.
*
Contributor

question: Is this an important property of the implementation?

Contributor Author

This came from the cuda::is_host_accessible check, which was returning true for empty rmm::device_buffers.
I have replaced it with a new move for Host buffers, so this will no longer be relevant.

Comment on lines +312 to +316
* @brief Move rmm::device_buffer (resides in device or pinned host memory) into a
* Buffer.
*
* This operation is cheap; no copy is performed. The resulting Buffer resides in
* device memory.
* device memory or pinned host memory.
Contributor

question: Why do we need to document the possible locations of the device_buffer? This is also slightly a lie, right? If the buffer was allocated with a managed memory resource then it might physically be located in host memory.

Contributor Author

I moved this to a separate method. So, this is not relevant now.

Comment on lines +71 to +82
/**
* @brief Get the memory types that are device accessible.
*
* @return A span of memory types that are device accessible.
*/
constexpr std::span<MemoryType const> device_accessible_memory_types() noexcept {
return std::span{MEMORY_TYPES}.first<2>();
}

static_assert(std::ranges::equal(
device_accessible_memory_types(),
std::array{MemoryType::DEVICE, MemoryType::PINNED_HOST}
Contributor

question: Why not just return a std::array of the two memory types?

Contributor Author

I wanted to make sure that if we meddle with the MEMORY_TYPES array in the future, we fail statically if something's not right.

Comment on lines +116 to +123
cudf::chunked_pack cpack(
table, cudf_chunked_pack_min_buffer_size, stream, br->device_mr()
);
RAPIDSMPF_EXPECTS(
cpack.get_total_contiguous_size() == 0,
"empty table should have 0 contiguous size",
std::runtime_error
);
Contributor

Use cudf::pack_metadata

auto const est_size = std::max(raw_est_size, cudf_chunked_pack_min_buffer_size);

// max available memory for bounce buffer
auto max_availble = [](size_t res_sz, size_t ob) -> size_t {
Contributor

Typo: max_availble should be max_available.

* 7. If all attempts fail, throw an error.
*/
template <>
std::unique_ptr<PackedData> pack<MemoryType::HOST>(
Contributor

There's no code sharing between the different templated pack functions, so I don't think it is worth adding templates. Let's just call these pack_device, pack_pinned, pack_host.

Or, find ways of sharing code (e.g. above for pinned and device).

Comment on lines +176 to +177

if (br->is_pinned_memory_available()) { // try pinned memory as fallback
Contributor

Suggested change
if (br->is_pinned_memory_available()) { // try pinned memory as fallback
if (br->is_pinned_memory_available()) {

dev_res.clear();
return chunked_pack(table, stream, bounce_buffer, br->device_mr(), reservation);
}
dev_res.clear(); // Release unused device reservation.
Contributor

You could use a scoped block and not have to do this manually.
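A minimal sketch of the scoped-block suggestion, under the assumption that the reservation releases its hold on destruction (the stand-in `Reservation` below is hypothetical, not the real rapidsmpf MemoryReservation): letting the object fall out of a nested scope replaces the manual `dev_res.clear()` call.

```cpp
// Hypothetical stand-in for a memory reservation that releases itself
// when destroyed; the flag lets us observe the release in a test.
struct Reservation {
    bool* released;
    ~Reservation() { *released = true; }
};

bool pack_with_scoped_reservation() {
    bool released = false;
    {
        Reservation dev_res{&released};
        // ... allocate the bounce buffer from dev_res here ...
    }  // dev_res destroyed here: no manual dev_res.clear() needed
    // Continue with the chunked pack; the device reservation is already gone.
    return released;
}
```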

auto dev_avail = max_availble(dev_res.size(), dev_ob);
if (dev_avail >= cudf_chunked_pack_min_buffer_size) {
rmm::device_buffer bounce_buffer(dev_avail, stream, br->device_mr());
dev_res.clear();
Contributor

This will happen at the end of the block on the next line, or you could use br->allocate with your reservation and then get the device buffer out I suppose.

Contributor Author

br->allocate gives a Buffer, so I'd have to move that again. That felt too verbose.
