Fix a mis-parse of NPY v2.0 and 3.0 headers #77

nmcclatchey · 2021-07-12T17:16:24Z

CNPY originally only supported NPY 1.0 headers, with a 2-byte length field. This PR causes parse_npy_header() to respond to NPY 2.0 and 3.0 by reading the header length as 4 bytes.
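As a rough illustration of the version-dependent length field described above, here is a minimal sketch. The helper `read_npy_header_len` is hypothetical and is not part of CNPY. It assumes the caller has already read the first bytes of the file into a buffer, and it assembles the little-endian length byte by byte instead of using `reinterpret_cast`, which also sidesteps alignment and host byte-order concerns.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper (not CNPY's API): returns the header length stored
// after the 6-byte magic string and the 2-byte version number.
inline std::uint32_t read_npy_header_len(const unsigned char* buffer, std::size_t size) {
    if (size < 10) throw std::runtime_error("NPY preamble too short");
    const unsigned major = buffer[6];  // byte 6 holds the major version
    if (major == 1) {
        // NPY 1.0: 2-byte little-endian length at offsets 8..9
        return static_cast<std::uint32_t>(buffer[8]) |
               (static_cast<std::uint32_t>(buffer[9]) << 8);
    }
    if (major == 2 || major == 3) {
        // NPY 2.0 and 3.0: 4-byte little-endian length at offsets 8..11
        if (size < 12) throw std::runtime_error("NPY preamble too short");
        return static_cast<std::uint32_t>(buffer[8]) |
               (static_cast<std::uint32_t>(buffer[9]) << 8) |
               (static_cast<std::uint32_t>(buffer[10]) << 16) |
               (static_cast<std::uint32_t>(buffer[11]) << 24);
    }
    throw std::runtime_error("unsupported NPY major version");
}
```

Throwing on an unknown major version is a design choice of this sketch; the PR itself only distinguishes the 2-byte and 4-byte cases.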


ZJUGuoShuai · 2022-09-14T01:55:26Z

I only saw version 2.0 in NEP 1. Can you tell me where version 3.0 is defined? Thanks.

nmcclatchey · 2022-09-14T03:33:52Z

You can find all 3 versions defined on numpy.org. Specifically, on the numpy.lib.format page.

Version 3.0 merely changes the header string's encoding from ASCII to UTF-8; the layout is otherwise the same as version 2.0.

s-trinh · 2025-09-02T05:22:17Z

cnpy.cpp

+      header_len = *reinterpret_cast<uint32_t*>(buffer+8);
+    else
+      header_len = *reinterpret_cast<uint16_t*>(buffer+8);
+    std::string header(reinterpret_cast<char*>(buffer+(extended_header ? 11 : 9)),header_len);


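Since the length field at offset 8 is a uint16_t (2 bytes), or a uint32_t (4 bytes) for the extended header, I think the header text should start at offset 10 or 12 respectively?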

Suggested change

-    std::string header(reinterpret_cast<char*>(buffer+(extended_header ? 11 : 9)),header_len);
+    std::string header(reinterpret_cast<char*>(buffer+(extended_header ? 12 : 10)),header_len);
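The offsets 10 and 12 follow from the NPY preamble layout: a 6-byte magic string, a 2-byte version, then the length field. A small sketch of that arithmetic, using hypothetical names (`kMagicLen`, `kVersionLen`, `header_text_offset`) that do not appear in CNPY:

```cpp
#include <cstddef>

// Hypothetical constants describing the NPY preamble layout.
constexpr std::size_t kMagicLen = 6;    // "\x93NUMPY"
constexpr std::size_t kVersionLen = 2;  // major and minor version bytes

// Offset of the first byte of the header text, given the length-field width.
constexpr std::size_t header_text_offset(bool extended_header) {
    return kMagicLen + kVersionLen + (extended_header ? 4 : 2);
}

static_assert(header_text_offset(false) == 10, "NPY 1.0 header text starts at 10");
static_assert(header_text_offset(true) == 12, "NPY 2.0/3.0 header text starts at 12");
```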

nmcclatchey · 2025-09-03

I think you're right. It has been working because the only uses of header rely on relative locations discovered with find and similar calls, but we could improve performance (ever so slightly) by using the correct offset.

nmcclatchey · 2025-09-03

Fixed. Thanks for pointing this out.

The `header` string started one byte too early, so it included the final byte of the length field and omitted the final byte of the header. This commit corrects the start offset. Note: no ill effects occurred earlier because each parsed field is located using `find`, and the omitted final byte (the header's terminating newline, or at worst the closing `}`) is irrelevant to the parsed fields.
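To see why the off-by-one went unnoticed, the sketch below builds a small NPY 1.0 style byte image and extracts the header text at a caller-chosen offset. Both helpers (`make_npy_v1_image`, `header_at`) are hypothetical and exist only for this illustration; the sketch assumes the dictionary is shorter than 65536 bytes so its length fits the 2-byte field.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: magic + version 1.0 + 2-byte little-endian length + dict.
inline std::string make_npy_v1_image(const std::string& dict) {
    std::string image("\x93NUMPY", 6);
    image += '\x01';  // major version
    image += '\x00';  // minor version
    image += static_cast<char>(dict.size() & 0xFF);         // length, low byte
    image += static_cast<char>((dict.size() >> 8) & 0xFF);  // length, high byte
    image += dict;
    return image;
}

// Hypothetical helper: header text as seen when starting at `offset`.
inline std::string header_at(const std::string& image, std::size_t offset,
                             std::size_t header_len) {
    return image.substr(offset, header_len);
}
```

With a dictionary such as `{'descr': '<f8', 'fortran_order': False, 'shape': (3,), }` followed by a newline, `header_at(image, 9, n)` and `header_at(image, 10, n)` both contain every key, so `find` succeeds on either. The early string simply reports each position one higher and drops the trailing newline.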

Fix a mis-parse of NPY v2.0 and 3.0 headers

3392473

