PYTHON-5355 Addition of API to move to and from NumPy ndarrays and BSON BinaryVectors#2590
Conversation
bson/binary.py
Outdated
There was a problem hiding this comment.
IIRC importing numpy is very very slow. In that case we should not even attempt to import it by default.
There was a problem hiding this comment.
I've changed to lazy imports, and used importlib.util.find_spec("numpy") as skip condition in test_bson.py.
bson/binary.py
Outdated
There was a problem hiding this comment.
Should this be a new method or a new argument to the existing as_vector method? Like binary.as_vector(numpy=True)
There was a problem hiding this comment.
I'm open to this as alternative to additional function as_numpy_vector. What do you think, @blink1073 ?
There was a problem hiding this comment.
I like Shane's suggestion, for symmetry with from_vector.
justfile
Outdated
There was a problem hiding this comment.
Instead of adding a new extra we could use --with numpy.
There was a problem hiding this comment.
This was a great suggestion. I had to change invocations from uv run mypy . (or pytest) to be done ad modules uv run python -m mypy . but it works. I removed the extra from pyproject and requirements/.
bson/binary.py
Outdated
There was a problem hiding this comment.
Are we following the rules of the spec for validating PACKED_BIT here (eg the validation in as_vector)?
There was a problem hiding this comment.
I just added the same validations applied in as_vector.
justfile
Outdated
There was a problem hiding this comment.
I think this is confusing, there might be other place's we'd use numpy optionally, perhaps test-numpy?
There was a problem hiding this comment.
"test-numpy" works for me. @ShaneHarvey ?
There was a problem hiding this comment.
@blink1073 Happy to make this change now. All tests are passing.
…y-Arrays-In-BinaryVectors
Jibola
left a comment
There was a problem hiding this comment.
Few questions for clarification, but otherwise looks good.
| @unittest.skipIf(not _NUMPY_AVAILABLE, "numpy optional-dependency not installed.") | ||
| def test_vector_from_numpy(self): |
There was a problem hiding this comment.
NIT because we don't have a general policy around this, but there's three separate test themes in here.
Happy path
Invalid input
Value coercion
I'd push we split those three into three separate unit tests for better semantic representation, but I'm not hard-pressed on this.
…ue) isinstance of np.ndarray
…ing one task/test
…mpy-Arrays-In-BinaryVectors
|
This looks good to go now @Jibola. Failing tests are unrelated. I ran the link check locally and it succeeded. The run was rate-limited. The evergreen time out was spurious I think. |
Issue Key
Summary
The existing API for
bson.binary.BinaryVectorrequires that the vector of numbers be a list of either floats or integers. This extends that to allow one to use NumPy Arrays, which are the most common form of vectors used in Python.Changes in this PR
datamember of thebson.binary.BinaryVectorto include np.ndarray:Binary.from_vectornow accepts this input, and uses NumPy'stobytes()instead ofstruct.packas is required for lists.A new instance method,
as_numpy_vectorwas added. This was chosen so that we maintain backwards-compatibility. There was also concern about needlessly introducing Numpy as a dependency.Testing Plan
New tests are found in
test/test_bson.py. These are run viajust test-numpylocally and with new Evergreen variants that test on different architectures.Checklist
Checklist for Author
Checklist for Reviewer @ShaneHarvey @blink1073 @Jibola
Focus Areas for Reviewer (optional)