@CheyuWu CheyuWu commented Dec 27, 2025

Purpose of PR

The linked issue asks for the ability to detect and handle PyTorch tensors, with a focus on CPU tensors. This PR:

  1. Adds PyTorch tensor detection utilities
  2. Adds tensor validation
  3. Adds a CPU tensor path in encode_tensor()
  4. Adds error handling
  5. Adds basic tests
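
As a rough illustration of items 1, 2, and 4, detection and validation could look something like the following on the Python side. This is a sketch under assumptions, not the code in this PR: the names `is_pytorch_tensor` and `validate_cpu_tensor` are hypothetical, and the duck-typed check is used only so the example runs without importing torch. It relies on `Tensor.device.type` and `Tensor.numel()`, which are standard PyTorch tensor attributes.

```python
def is_pytorch_tensor(obj) -> bool:
    """Return True if obj looks like a torch.Tensor (or a subclass of it).

    Hypothetical helper: it inspects the class hierarchy by name instead of
    importing torch, so it works even when torch is not installed.
    """
    return any(
        base.__module__ == "torch" and base.__name__ == "Tensor"
        for base in type(obj).__mro__
    )


def validate_cpu_tensor(tensor) -> None:
    """Raise a descriptive error unless tensor is a non-empty CPU tensor.

    Hypothetical helper: the exact checks and messages in the PR may differ.
    """
    if not is_pytorch_tensor(tensor):
        raise TypeError(f"expected a PyTorch tensor, got {type(tensor).__name__}")
    if tensor.device.type != "cpu":
        raise ValueError(
            f"only CPU tensors are supported on this path, got device '{tensor.device.type}'"
        )
    if tensor.numel() == 0:
        raise ValueError("tensor must not be empty")


# A plain list is not a tensor, so detection returns False.
print(is_pytorch_tensor([1.0, 2.0]))  # False
```

Checking the device before touching the data keeps the CPU path from silently accepting a CUDA tensor, which would need a different transfer strategy.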

Related Issues or PRs

Closes #725

Changes Made

  • Bug fix
  • New feature
  • Refactoring
  • Documentation
  • Test
  • CI/CD pipeline
  • Other

Breaking Changes

  • Yes
  • No

Checklist

  • Added or updated unit tests for all changes
  • Added or updated documentation for all changes
  • Successfully built and ran all unit tests or manual tests locally
  • PR title follows "MAHOUT-XXX: Brief Description" format (if related to an issue)
  • Code follows ASF guidelines

@CheyuWu CheyuWu changed the title [QDP] PyTorch Tensor Detection and CPU Path MAHOUT-725 [QDP] PyTorch Tensor Detection and CPU Path Dec 27, 2025
@CheyuWu CheyuWu changed the title MAHOUT-725 [QDP] PyTorch Tensor Detection and CPU Path MAHOUT-725: [QDP] PyTorch Tensor Detection and CPU Path Dec 27, 2025
@CheyuWu CheyuWu changed the base branch from main to dev-qdp December 27, 2025 14:07
Author

CheyuWu commented Dec 27, 2025

Manual Testing

(mahout) user@DESKTOP-UDTSV2K:~/code/mahout$ cd ./qdp/qdp-python/
(mahout) user@DESKTOP-UDTSV2K:~/code/mahout/qdp/qdp-python$ cp ../target/debug/libmahout_qdp.so mahout_qdp.so && PYTHONPATH=. pytest .
========================================================================== test session starts ==========================================================================
platform linux -- Python 3.11.13, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/user/code/mahout/qdp/qdp-python
configfile: pyproject.toml
collected 21 items                                                                                                                                                      

tests/test_bindings.py ...s.....                                                                                                                                  [ 42%]
tests/test_high_fidelity.py ............                                                                                                                          [100%]

===================================================================== 20 passed, 1 skipped in 3.30s =====================================================================

@rich7420 rich7420 added the qdp label Dec 29, 2025
@github-project-automation github-project-automation bot moved this to Backlog in QDP Dec 29, 2025

400Ping commented Dec 29, 2025

Please fix the CI error.

Author

CheyuWu commented Dec 29, 2025

Hi @400Ping, I have fixed the problem. PTAL.


400Ping commented Dec 29, 2025

Overall this looks good, but we are about to merge #753 into dev-qdp. I think it is best to wait for that merge and then fix the conflicts in this PR.


400Ping commented Dec 30, 2025

It's merged. Please take into account how the other inputs are written and refactor this PR.

Author

CheyuWu commented Dec 30, 2025

> It's merged. Please take into account how the other inputs are written and refactor this PR.

OK, I will address this tomorrow.

@rich7420
Contributor

Please fix the pre-commit errors.

Author

CheyuWu commented Dec 31, 2025

Hi @rich7420 @400Ping, I've addressed the issue.
BTW, the numpy package was missing from pyproject.toml, so I added it.

Manual Testing

(qdp-python) user@DESKTOP-UDTSV2K:~/code/mahout/qdp/qdp-python$ PYTHONPATH=. python -m pytest .
========================================================================== test session starts ==========================================================================
platform linux -- Python 3.11.13, pytest-9.0.1, pluggy-1.6.0
rootdir: /home/user/code/mahout/qdp/qdp-python
configfile: pyproject.toml
collected 21 items                                                                                                                                                      

tests/test_bindings.py ...s.....                                                                                                                                  [ 42%]
tests/test_high_fidelity.py ............                                                                                                                          [100%]

===================================================================== 20 passed, 1 skipped in 2.63s =====================================================================


400Ping commented Dec 31, 2025

@CheyuWu, please see this comment on #752.

@ryankert01
Contributor

@CheyuWu, feel free to help review the related PR #777. Adding an input data format is fairly complex.

Signed-off-by: Cheyu Wu <cheyu1220@gmail.com>

400Ping commented Jan 3, 2026

cc @guan404ming @rich7420 @ryankert01 to review.
I think we can merge this first, and I will open a PR for handling the PyTorch input format.

Contributor

rich7420 commented Jan 3, 2026

@CheyuWu, please fix this pre-commit error.

Signed-off-by: Cheyu Wu <cheyu1220@gmail.com>
Author

CheyuWu commented Jan 3, 2026

Let me check what is going on.

Comment on lines +267 to +270
let data: Vec<f64> = tensor
    .call_method0("flatten")?
    .call_method0("tolist")?
    .extract()?;
Contributor

This appears to copy the data twice:

  1. PyTorch tensor → Python list
  2. Python list → Rust Vec<f64>

We should improve this in a follow-up PR, for example by going through numpy() or a PinnedHostBuffer, to reduce the time spent on memory copies.
Please add a comment here noting this.
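
To make the cost of the two copies concrete, here is a small standalone sketch. It is not the PR's code and does not use PyTorch: `array.array` stands in for the tensor's contiguous storage, and the function names `via_list` and `via_buffer` are hypothetical. The list round trip mirrors the `tolist()` path, while the `memoryview` mirrors a buffer-sharing path (for a CPU tensor, `Tensor.numpy()` returns an array that shares memory with the tensor).

```python
from array import array


def via_list(buf: array) -> array:
    """Mimic the tolist() path: two full copies of the data."""
    as_list = buf.tolist()        # copy 1: one Python float object per element
    return array("d", as_list)    # copy 2: repack into a new contiguous buffer


def via_buffer(buf: array) -> memoryview:
    """Mimic a buffer-sharing path: a view over the same memory, no copy."""
    return memoryview(buf)


src = array("d", [0.5, 0.5, 0.5, 0.5])
copied = via_list(src)
view = via_buffer(src)

# Mutating the source shows which result still shares its memory.
src[0] = 1.0
print(copied[0])  # 0.5 (independent copy)
print(view[0])    # 1.0 (same underlying buffer)
```

The view exposes the raw f64 bytes directly, so a consumer can read them with a single copy (or none), instead of allocating a Python object for every element first.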

Author

@CheyuWu CheyuWu Jan 3, 2026

OK, no problem.
I have added the comment. PTAL.

Contributor

rich7420 commented Jan 3, 2026

Overall, LGTM.
It will be merged once all checks are green.

Signed-off-by: Cheyu Wu <cheyu1220@gmail.com>
Signed-off-by: Cheyu Wu <cheyu1220@gmail.com>
Author

@CheyuWu CheyuWu Jan 3, 2026

This was modified by ruff format.

Contributor

I think this is an unrelated formatting change. It should be reverted.
cc @ryankert01

Author

I have reverted the code. PTAL.

Signed-off-by: Cheyu Wu <cheyu1220@gmail.com>
Contributor

rich7420 commented Jan 3, 2026

@400Ping do you want to take another look?

