
Conversation


@klassen9 klassen9 commented Nov 4, 2025

Here is a faster implementation of the data unpacking (src/finn/util/data_packing.py::packed_bytearray_to_finnpy). This implementation differs from the one seen in #1291.

While very efficient, mdanilow's variant has some weaknesses, such as not supporting SIMD > 1 and not supporting certain datatypes such as fixed and floating point.

This PR addresses these problems, as well as the performance problems of the current implementation, by adding a dedicated unpacking routine for each datatype category (a rough sketch of the idea follows below).

Furthermore, I removed the inference of the output_shape, as it is ambiguous: e.g. a single byte can store anywhere from one to four UINT2 numbers, so the output shape now has to be passed explicitly.
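
To illustrate the per-category idea and the shape ambiguity, here is a minimal NumPy sketch of bit-level unpacking for UINT2 (an illustrative example only, not the code in this PR; the helper name `unpack_uint2` and the MSB-first bit ordering within each byte are assumptions):

```python
import numpy as np

def unpack_uint2(packed, output_shape):
    # packed: uint8 array, each byte holds up to four 2-bit values
    # (assumed MSB-first within the byte).
    packed = np.asarray(packed, dtype=np.uint8)
    shifts = np.array([6, 4, 2, 0], dtype=np.uint8)
    # Broadcast the shifts against every byte, then mask out two bits.
    values = (packed[..., np.newaxis] >> shifts) & 0x3
    values = values.reshape(packed.shape[:-1] + (-1,))
    # The caller must supply output_shape explicitly: one byte can carry
    # one to four UINT2 numbers, so the shape cannot be inferred.
    n = int(np.prod(output_shape))
    return values.reshape(-1)[:n].reshape(output_shape)

# Example: 0b01_10_11_00 -> [1, 2, 3, 0]
print(unpack_uint2(np.array([0b01101100], dtype=np.uint8), (4,)))
```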

I ran tests comparing the current implementation to my variant:

For the first test I assumed an input of shape (10, 32, 32, 8, 1) for a range of datatypes, packed into a byte array, and then unpacked it with both variants. Here are the speedups:

[Figure: speedup of the new implementation over the current one for the (10, 32, 32, 8, 1) input, per datatype]

For the second test I assumed an input of shape (10, 8, 1). One can see that the speedup decreases, but is still substantial:

[Figure: speedup for the smaller (10, 8, 1) input, per datatype]
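
For context, a comparison like this could be set up roughly as follows (a hedged sketch of the assumed benchmark, not the actual test script; exact import paths and the argument order of the packing helpers may differ):

```python
import time

import numpy as np
from qonnx.core.datatype import DataType

from finn.util.data_packing import (
    finnpy_to_packed_bytearray,
    packed_bytearray_to_finnpy,
)

# Shape and an example datatype from the first test case described above.
shape = (10, 32, 32, 8, 1)
dtype = DataType["UINT4"]
data = np.random.randint(0, 16, size=shape).astype(np.float32)

# Pack the tensor once, then time the unpacking (repeat per variant).
packed = finnpy_to_packed_bytearray(data, dtype)
start = time.perf_counter()
unpacked = packed_bytearray_to_finnpy(packed, dtype, shape)
print(f"unpacking took {time.perf_counter() - start:.4f} s")
assert np.array_equal(unpacked, data)
```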

@auphelia
Collaborator

As part of this PR, I've also moved the get_driver_shapes function into finn.util.basic: data_packing.py is used as part of the deployment folder that gets copied to the device, where get_driver_shapes is not required but would pull in additional dependencies (like ModelWrapper and onnx). Thanks @mdanilow for making me aware of this!
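
If I read this correctly, downstream code would now import it like this (hypothetical usage sketch based on the new location mentioned above):

```python
# get_driver_shapes now lives in finn.util.basic rather than
# finn.util.data_packing.
from finn.util.basic import get_driver_shapes
```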

@auphelia
Collaborator

Thank you, @klassen9!

@auphelia auphelia merged commit 3b26147 into Xilinx:dev Jan 26, 2026
2 checks passed
@auphelia auphelia mentioned this pull request Jan 26, 2026