fix: enable UTF-8 console output for jbang.{cmd,ps1,sh} on Windows #2349
|
Although this definitely works, the issue I have with this is that the encoding will persist after JBang exits. So we have basically changed the console's encoding, affecting any commands we run afterwards. Now in itself that might not be bad; in many (most?) cases this might actually improve things for the user. But it would be somewhat weird to see (some) apps behaving differently before JBang was run vs after JBang was run. Of course we could try resetting the code page back again before exiting. Or point users to documentation on how to change code pages on the system level. (Although I definitely like being able to show correct output regardless of the user's system settings) |
|
Alternative CMD implementation. Something similar could be done for PowerShell. But I think this might be overkill. CMD script code to only set the code page to 65001 if required: |
Best practice for modern, cross-platform compatibility.
You might be right, I'll let @maxandersen decide |
Pull request overview
This PR fixes Unicode character display issues in JBang's Windows launcher scripts by enabling UTF-8 encoding. The fix ensures that help text and other output containing Unicode characters (like special symbols) display correctly on Windows console environments instead of showing garbled characters.
Key Changes:
- Enabled UTF-8 encoding in PowerShell launcher by setting console encoding properties
- Enabled UTF-8 code page (65001) in CMD launcher script
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/main/scripts/jbang.ps1 | Added UTF-8 encoding configuration for PowerShell console input/output at script initialization |
| src/main/scripts/jbang.cmd | Added UTF-8 code page activation (chcp 65001) at script initialization |
|
I do think we should clean up - it's not good behaviour to modify the user's environment. We do the env cleanup for other variables, especially on Windows. My main concerns are:
|
Adds 7ms overhead on my 8-core dev machine |
It should be on, we want to have nice things on Windows too! :-) --- 🚀 JVM Runtime Options (VM Arguments) --- --- 📝 Application Arguments --- --- ✅ Execution Complete --- |
Not needed on Linux or MacOS |
|
Fixed jbang.ps1 - it now restores the code page to its original value. |
|
jbang.cmd recursively calling itself ...... oh no. I'm going to rename
|
|
Only one test case fails on Windows: |
|
@maxandersen, the original code page is now restored when using CMD, PowerShell or Bash (on Cygwin or Git-Bash). |
I get what you are after. And I applaud it. But unfortunately JEP 400 does NOT ensure a seamless UTF-8 experience when running Java - it ensures UTF-8 will mostly work IF you are running in a UTF-8 environment... a subtle but important difference. It's still up to the user to set things up properly. |
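To make that distinction concrete, here is a small sketch (not part of this PR) that prints the charsets a JVM actually ends up with. The class and method names are hypothetical. The `stdout.encoding` and `native.encoding` system properties are assumed to exist only on newer JDKs (roughly 19+ and 17+ respectively), so the code treats them as optional.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingReport {

    // Collects the encoding-related settings of the running JVM.
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        // With JEP 400 (JDK 18+) this is UTF-8 unless file.encoding overrides it.
        report.put("defaultCharset", Charset.defaultCharset().name());
        // Charset used by System.out; it follows the console or native encoding,
        // not the JEP 400 default.
        report.put("stdout.encoding", System.getProperty("stdout.encoding", "(not set)"));
        // Charset derived from the platform locale / code page.
        report.put("native.encoding", System.getProperty("native.encoding", "(not set)"));
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

If `defaultCharset` reports UTF-8 while `stdout.encoding` reports a legacy code page, the JVM is in exactly the situation described above: UTF-8 inside the program, but not on the way to the console.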
| if [ $err -eq 255 ]; then | ||
| eval "exec $output" | ||
| if [[ "$os" == "windows" ]] && [[ "$JBANG_WIN_UTF8" != "false" ]]; then | ||
| bash -c "$output" |
why bash not exec here?
We need the command to return to the current shell, to be able to restore the code page setting at the end of the script.
eval "exec ..." does not return.
|
The alternative is to document what users should do to enable UTF-8 for each shell. CMD - add the following registry key:
Bash (Git-Bash or Cygwin) - add the following line to
PowerShell - add the following line to file $PROFILE
This has the added benefit that programs run using
|
|
ok - first, I'm really appreciative of your efforts here @wfouche, it's been very educational! But starting to do different execution flows and jumping through hoops that potentially break user flows is just not worth the hassle. Here is my suggestion:
That at least makes #1 have an option that is just set, but I'm also fine just leaving it at #1, as it is probably best for users to just set those values or the registry key anyway. |
|
Another option would be to use something like JNA and set the codepage from within JBang itself: https://github.com/java-native-access/jna/blob/master/contrib/platform/src/com/sun/jna/platform/win32/Wincon.java#L106 |
|
New PR created to document how to enable console UTF-8 support on Windows. |
Currently evaluating this option. |
|
This program uses JNA to save the current Windows code page, change it to 65001, and restore the original code page again before exiting. System.out has to be reinitialized for Unicode output to work.
///usr/bin/env jbang "$0" "$@" ; exit $?
//JAVA 25+
//DEPS net.java.dev.jna:jna:5.18.1
//DEPS net.java.dev.jna:jna-platform:5.18.1
import com.sun.jna.platform.win32.Kernel32;
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
void main(String... args) throws Exception {
    int CODEPAGE_UTF8 = 65001;
    if (!System.getProperty("os.name").toLowerCase().contains("windows")) {
        return;
    }
    // check if env var JBANG_WIN_UTF8 is true, otherwise return
    // Not yet implemented.
    // save current input and output code pages
    int currentInputCP = Kernel32.INSTANCE.GetConsoleCP();
    int currentOutputCP = Kernel32.INSTANCE.GetConsoleOutputCP();
    // set output code page to 65001
    boolean setOutputOk = Kernel32.INSTANCE.SetConsoleOutputCP(CODEPAGE_UTF8);
    // remap stdout
    if (setOutputOk) {
        FileOutputStream fos = new FileOutputStream(FileDescriptor.out);
        PrintStream utf8Out = new PrintStream(
            fos,
            true,
            StandardCharsets.UTF_8.name()
        );
        // Replace the default System.out stream with the new UTF-8 stream
        System.setOut(utf8Out);
    }
    System.out.println("\nTesting UTF-8 output: \u2764");
    // Restore original code pages
    Kernel32.INSTANCE.SetConsoleCP(currentInputCP);
    Kernel32.INSTANCE.SetConsoleOutputCP(currentOutputCP);
} |
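One detail the program above leaves open is what happens if an exception is thrown between changing and restoring the code page. A minimal sketch of a try-with-resources guard follows. `ConsoleCodePage`, `CodePageScope` and `FakeConsole` are hypothetical names, not JBang or JNA APIs. On Windows the interface is assumed to be backed by `Kernel32.INSTANCE.GetConsoleOutputCP()` and `SetConsoleOutputCP(...)`; an in-memory fake stands in here so the sketch runs anywhere.

```java
public class CodePageScopeDemo {

    // Hypothetical abstraction over the console code page calls.
    interface ConsoleCodePage {
        int get();
        boolean set(int codePage);
    }

    // Switches the code page on creation and restores the saved one on close().
    static final class CodePageScope implements AutoCloseable {
        private final ConsoleCodePage console;
        private final int original;
        private final boolean changed;

        CodePageScope(ConsoleCodePage console, int wanted) {
            this.console = console;
            this.original = console.get();
            // Only touch the console when it is not already on the wanted page.
            this.changed = original != wanted && console.set(wanted);
        }

        @Override
        public void close() {
            if (changed) {
                console.set(original);
            }
        }
    }

    // In-memory stand-in for a real console, used for illustration only.
    static final class FakeConsole implements ConsoleCodePage {
        int current;
        FakeConsole(int initial) { this.current = initial; }
        public int get() { return current; }
        public boolean set(int codePage) { current = codePage; return true; }
    }

    public static void main(String[] args) {
        FakeConsole console = new FakeConsole(437);
        try (CodePageScope scope = new CodePageScope(console, 65001)) {
            System.out.println("inside: " + console.get());
        }
        System.out.println("after: " + console.get());
    }
}
```

Because the scope records whether it actually changed anything, a console that is already on 65001 is left untouched, which also covers the "only set the code page if required" idea raised earlier in this thread.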
|
In the future if there ever is a need to move forward with this proposal, then the approach suggested by @quintesse to use JNA is the best way to implement the code page switching functionality. |
|
so I'm trying UTF-8 on a Windows VM now and I'm for some reason NOT seeing any effect of changing the code page. Still getting ?? in the output. |
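For context on where a `?` typically comes from: when Java encodes a character that the target charset cannot represent, the encoder substitutes a replacement byte, which for common legacy charsets is `?`. The sketch below illustrates the mechanism only. ISO-8859-1 is used purely as a stand-in for a legacy console charset (an assumption; the charset in effect on the VM is not stated in this thread), and the class name is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReplacementDemo {

    // Encodes text with the given charset and renders the bytes as hex.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String heart = "\u2764";
        // UTF-8 can represent the character: three bytes.
        System.out.println("UTF-8:      " + hex(heart, StandardCharsets.UTF_8));
        // ISO-8859-1 cannot: the encoder emits '?' (0x3F) instead.
        System.out.println("ISO-8859-1: " + hex(heart, StandardCharsets.ISO_8859_1));
    }
}
```

If the bytes are already `3F` when they leave the JVM, changing the console code page afterwards cannot bring the original character back.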
|
...and I can for the life of me not find that Beta setting anymore... I'm on Windows 11. |
the internet lied to me. It's not under Settings but under Control Panel... stupid. |
|
anyhow - just calling chcp did NOT work for me. |
|
ok im just dumb - or rather windows is stupid :) chcp 65001 works everywhere but powershell, on powershell you need |
|
that |
jbang --help on Windows does not display Unicode characters. But now it does.
Fixes #2350