-
Notifications
You must be signed in to change notification settings - Fork 923
Add PyString::from_fmt & py_format! using new PyUnicodeWriter api
#5199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
f0df624 to
eb46ca2
Compare
0897456 to
203dec2
Compare
aea618c to
c28a9e4
Compare
c28a9e4 to
08d68b8
Compare
davidhewitt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for taking so long to come to this. This API looks pretty tempting as a way to squeeze some performance out of e.g. error pathways where we need to format Python strings from error values.
It would be interesting to see benchmarks compared to the existing pattern of formatting to string and then copying into a new Python string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The choice to write compat logic using private APIs of older Python versions makes me a little uneasy. Is the performance advance so significant that we can justify it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the compat layer because it is faster to do it in rust. See #5199 (comment)
src/fmt.rs
Outdated
| #[cfg(not(any(Py_LIMITED_API, PyPy)))] | ||
| impl fmt::Write for PyUnicodeWriter { | ||
| fn write_str(&mut self, s: &str) -> fmt::Result { | ||
| let result = unsafe { | ||
| PyUnicodeWriter_WriteUTF8(self.as_ptr(), s.as_ptr().cast(), s.len() as isize) | ||
| }; | ||
| if result < 0 { | ||
| self.set_error(); | ||
| Err(fmt::Error) | ||
| } else { | ||
| Ok(()) | ||
| } | ||
| } | ||
|
|
||
| fn write_char(&mut self, c: char) -> fmt::Result { | ||
| let result = unsafe { PyUnicodeWriter_WriteChar(self.as_ptr(), c as u32) }; | ||
| if result < 0 { | ||
| self.set_error(); | ||
| Err(fmt::Error) | ||
| } else { | ||
| Ok(()) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consider any buffering on the Rust side to avoid making hotly repeated FFI calls? Possibly a follow-up.
src/fmt.rs
Outdated
| /// This is like the `format!` macro, but it returns a `PyString` instead of a `String`. | ||
| #[macro_export] | ||
| macro_rules! py_format { | ||
| ($py: expr, $($arg:tt)*) => { | ||
| $crate::types::PyString::from_fmt($py, format_args!($($arg)*)) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to decide if we need this API if we already have PyString::from_fmt. I like the macro conciseness, though in the past we've had pushback against adding too many macros.
Interesting thing to note here is that format_args! has an as_str() const fn, so I wonder if there's a crazy way that we can make this automatically intern! the value if the format string is a constant 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having troubles merging the py_format! macro and the intern! macro. Currently the intern! macro returns a &'static Bound<'py, PyString>, but py_format! has to return a dynamic lifetime reference. I am not able to do this without leaking (static) variables into the users scope.
Do you have any smart ideas the solve this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like I need something like a Python Cow 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could always just return Bound<'py, PyString> from py_format!? At least in the interned case it would be returning the existing string, just increasing the reference count.
src/fmt.rs
Outdated
| #[cfg(not(any(Py_LIMITED_API, PyPy)))] | ||
| /// The `PyUnicodeWriter` is a utility for efficiently constructing Python strings | ||
| pub struct PyUnicodeWriter { | ||
| writer: NonNull<ffi::PyUnicodeWriter>, | ||
| last_error: Option<PyErr>, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an advantage to having this as a public type rather than an implementation detail to PyString::from_fmt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The advantage is that the user can use the fmt::Write trait to write into a PyString, using functions that accept a generic fmt::Write impl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An other option is to hide the writer implementation using something like the following,
impl PyString {
pub fn with_writer<F, W>(
py: Python,
f: F,
) -> Bound<PyString>
where
F: FnOnce(&impl fmt::Write);
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made PyUnicodeWriter private for now. So only PyString::from_fmt and py_format! are public.
0dc648f to
acbc13e
Compare
Merging this PR will not alter performance
Performance Changes
Comparing |
2cc5814 to
e5c3008
Compare
|
I ran some benchmarks with interesting results. simple case that can be optimised to a static stringBefore: 3.7 µs (3700 ns) let format_args = format_args!("Hello {}!", "world");
PyString::new(py, &format!("{format_args}"))After: 219.2 ns py_format!(py, "Hello {}!", "world")For static string this is a lot faster. complex case with variable formatting argumentsBefore: 7 µs let value = (black_box(42), black_box("foo"), [0; 0]);
let format_args = format_args!("This is some complex value: {value:?}");
PyString::new(py, &format!("{format_args}"))After: 9 µs let value = (black_box(42), black_box("foo"), [0; 0]);
py_format!(py, "This is some complex value: {value:?}")This case of |
I now see what happens, it is using the compat version of #[cfg(not(any(Py_LIMITED_API, PyPy)))]
compat_function!(
originally_defined_for(all(Py_3_14, not(Py_LIMITED_API)));
pub unsafe fn PyUnicodeWriter_WriteUTF8(writer: *mut crate::PyUnicodeWriter,str: *const std::os::raw::c_char, size: crate::Py_ssize_t) -> std::os::raw::c_int {
let size = if size < 0 {
libc::strlen(str) as isize
} else {
size
};
let py_str = crate::PyUnicode_FromStringAndSize(str, size);
if py_str.is_null() {
return -1;
}
let result = crate::_PyUnicodeWriter_WriteStr(writer.cast(), py_str);
crate::Py_DECREF(py_str);
result
}
);I don't think the compat version is that useful over buffering into a rust String. What do you think @davidhewitt? |
I completely agree, I think using the private CPython APIs to no advantage is unfortunate. So this means that on Python 3.13 and below we should just make the non-static macro expand to What happens if you run the benchmark on Python 3.14? In my opinion that's the interesting question, if it's faster on 3.14 this feature seems justified, if it's always slower then it may not be worth it? 🤔 |
1175c3a to
bae1dca
Compare
I think this is marginally faster. Around 1-2% faster. There may be bigger wins when converting larger strings or when we buffer in rust first. |
Fair enough. I guess overall we're not changing the algorithmic complexity, just cutting the constant factor. I can also imagine that performance may be heavily dependent on whether the particular thing being formatted makes use of Another appealing property of the proposal here is that it makes large string creation not peak at ~2x the actual memory (full Rust and Python strings allocated). Overall I feel sympathetic to this proposal, it seems like it should be a good thing on Python 3.14+. On older Pythons it seems like we should just make the macro "do the dumb thing" as per #5199 (comment) (maybe with the |
091c602 to
a1e0a55
Compare
Intern static strings in `py_format!` Add `'py` lifetime to `PyUnicodeWriter` Add benchmark Bench using new macro Bench on python 3.14 Don't measure pystring dealloc Remove PyUnicodeWriter compat shim Add `TryInto` impl
a1e0a55 to
acc8896
Compare
|
I did a final pass over this & rebased. So should be ready for review. |
PyString::from_fmt using new PyUnicodeWriterPyString::from_fmt & py_format! using new PyUnicodeWriter api
davidhewitt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is looking good to me.
Is it worth having a pass over the .md files and also the code itself to find places where we can apply this? e.g. I spotted at least one use in coroutine.rs, I expect there must be more.
src/fmt.rs
Outdated
| //! This module provides the `PyUnicodeWriter` struct, which is a utility for efficiently | ||
| //! constructing Python strings using Rust's `fmt::Write` trait. | ||
| //! It allows for incremental string construction, without the need for repeated allocations, and | ||
| //! is particularly useful for building strings in a performance-sensitive context. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want to make this module pub(crate) for now or remove this doc?
src/fmt.rs
Outdated
| std::ptr::NonNull, | ||
| }; | ||
|
|
||
| /// This is like the `format!` macro, but it returns a `PyString` instead of a `String`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given macros can be a bit hard to figure out the API on, it would probably be nice to expand the docs to clarify that the arguments are exactly like format!, but with py as first argument.
We might want to comment about how if the string is optimized to a static string at compile time then this macro offers the advantage of string interning over the PyString::from_fmt API.
Probably want to add an example usage here too, e.g.
let py_string: Bound<'_, PyString> = py_format!(py, "{} {}", "hello", "world");| static INTERNED: $crate::sync::PyOnceLock<$crate::Py<$crate::types::PyString>> = $crate::sync::PyOnceLock::new(); | ||
| Ok( | ||
| INTERNED | ||
| .get_or_init($py, || $crate::types::PyString::intern($py, static_string).unbind()) | ||
| .bind($py) | ||
| .to_owned() | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this work?
| static INTERNED: $crate::sync::PyOnceLock<$crate::Py<$crate::types::PyString>> = $crate::sync::PyOnceLock::new(); | |
| Ok( | |
| INTERNED | |
| .get_or_init($py, || $crate::types::PyString::intern($py, static_string).unbind()) | |
| .bind($py) | |
| .to_owned() | |
| ) | |
| $crate::intern!($py, static_string).to_owned() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fails with attempt to use a non-constant value in a constant. Because static_string is not defined as a constant, and cannot be until our MSRV is 1.84.
No description provided.