CORS-4275: Add Windows support#10041
Conversation
|
@bear-redhat: This pull request references CORS-4275 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@bear-redhat: This pull request references CORS-4275 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
patrickdillon
left a comment
There was a problem hiding this comment.
I'm generally fine with adding unofficial windows support in this manner. Adding official support would require a significant amount of effort.
You should divide this up into some commits. Particularly the controller runtime stuff needs its own commit, and based on your commit message, it seems like windows is still leaking the processes? Does that always happen or only during interrupts?
| // add ZTP manifests to manifestPath | ||
| for _, file := range agentManifests.FileList { | ||
| manifestFile := ignition.FileFromBytes(filepath.Join(manifestPath, filepath.Base(file.Filename)), | ||
| manifestFile := ignition.FileFromBytes(path.Join(manifestPath, filepath.Base(file.Filename)), |
There was a problem hiding this comment.
it would be helpful to add context, for instance in the commit message, about why this change is needed. it's not obvious to me how this relates to windows support
There was a problem hiding this comment.
I added in the commit messages and PR decription
| @@ -0,0 +1,4 @@ | |||
| data/data/bootstrap/systemd/**/*.service text eol=lf | |||
There was a problem hiding this comment.
similar to the other comment, it would be helpful to add context, in a commit message and/or commit on why we need this file and how it relates to windows support (I assume it's specifically to handle windows new lines?)
There was a problem hiding this comment.
I added comments in .gitattribute. This is to ensure the file checked out always have Unix line ending, since these files are used in CoreOS.
| @@ -0,0 +1,23 @@ | |||
| //go:build !windows | |||
| // +build !windows | |||
There was a problem hiding this comment.
Can we handle this at runtime? It seems like we could.
There was a problem hiding this comment.
We could, but it is quite a common practice to have platform-dependent code in separate files and use conditional compiling.
Ref: https://go.dev/wiki/TargetSpecific
There was a problem hiding this comment.
Right. I am familiar with build tags, we have them for SCOS and in previous of the installer have used them for ARO and altinfra (non-Terraform versions of the installer, which has now become the default).
In general, I think we should use them only when necessary, and it seems in this case it is not necessary. That echoes the first point in the link you sent: "Minimize code in tagged files. As much code as possible should build for every target."
But more of my concern is that it would not be obvious to users that they should include the windows build tag. It would be better if it just worked for users, and that seems possible.
Also, I'm curious, does the non-windows build still run in windows? That is, is it possible for windows users to be running the binary that is built without the windows tag? It seems like it would only affect killing the processes.
There was a problem hiding this comment.
Discussed with Bear on slack, and determined that this cannot be handled at runtime. Windows builds fail with:
# github.com/openshift/installer/pkg/clusterapi/internal/process
pkg\clusterapi\internal\process\process.go:137:3: unknown field Setpgid in struct literal of type syscall.SysProcAttr
83aca62 to
af31103
Compare
|
@bear-redhat: This pull request references CORS-4275 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
This PR improves cross-platform compatibility. It solves two main issues: 1. inconsistent line endings 2. inconsistent path separators Path separators, in installer, needs to target two different environments: 1. the OS where the installer runs 2. the OS where the injected files been used This PR unified path separators used in 2 to be UNIX path separators, while in 1 to be platform-dependant. Ref: https://forum.golangbridge.org/t/filepath-join-or-path-join/13479 Known issues: The spawn processes, including etcd.exe, kube-apiserver.exe, and openshift-installer.exe, will not exit once installation aborted or completed. Users need to manually terminate those processes in task manager.
af31103 to
5166c3e
Compare
|
@bear-redhat: This pull request references CORS-4275 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
It's an overall refinement on correcting the usage of
Almost always. There are some rough edges on the localcontrolplane for terminating the sub-processes. I think they can be addressed in a separate PR. |
|
@bear-redhat: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
|
||
| const ( | ||
| cvoOverridesFilename = "manifests/cvo-overrides.yaml" | ||
| cvoOverridesFilename = path.Join("manifests", "cvo-overrides.yaml") |
There was a problem hiding this comment.
Is this change is on purpose, changing const to var ?
There was a problem hiding this comment.
path.Join("manifests", "cvo-overrides.yaml") (value of type string) is not constantcompiler[InvalidConstInit](https://pkg.go.dev/golang.org/x/tools/internal/typesinternal#InvalidConstInit)
on mac & linux I'm not aware of any issues with leaked processes. I implemented the signal handling if you want to discuss further or have any questions. Leaked processes on windows is not a blocking concern for me for this PR, although it is pretty pernicious. |
It's fine as it is now, but what I was trying to suggest is that you divide this PR into multiple commits (not open multiple PRs). One commit would handle updating But it's NBD, this is manageable as it is now. |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: patrickdillon The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/lgtm |
|
@patrickdillon: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
| for _, f := range r.File { | ||
| name := f.Name | ||
| if !p.Sources.Has(name) { | ||
| nameWithoutExt := strings.TrimSuffix(name, ".exe") |
There was a problem hiding this comment.
| nameWithoutExt := strings.TrimSuffix(name, ".exe") | |
| nameWithoutExt := strings.TrimSuffix(name, filepath.Ext(name)) |
| func flockFile(f *os.File, lock bool) error { | ||
| if lock { | ||
| var overlapped windows.Overlapped | ||
| return windows.LockFileEx(windows.Handle(f.Fd()), windows.LOCKFILE_EXCLUSIVE_LOCK, 0, 1, 0, &overlapped) | ||
| } | ||
| var overlapped windows.Overlapped | ||
| return windows.UnlockFileEx(windows.Handle(f.Fd()), 0, 1, 0, &overlapped) | ||
| } |
There was a problem hiding this comment.
This seems like locking the first byte in the file? I guess it works, but should we lock the entire file?
const (
reserved = 0
allBytes = ^uint32(0)
)
func flockFile(f *os.File, lock bool) error {
if lock {
var overlapped windows.Overlapped
return windows.LockFileEx(windows.Handle(f.Fd()), windows.LOCKFILE_EXCLUSIVE_LOCK, reserved, allBytes, allBytes, &overlapped)
}
var overlapped windows.Overlapped
return windows.UnlockFileEx(windows.Handle(f.Fd()), reserved, allBytes, allBytes, &overlapped)
}For example: https://tip.golang.org/src/cmd/go/internal/lockedfile/internal/filelock/filelock_windows.go
| func setSysProcAttr(cmd *exec.Cmd) { | ||
| } |
There was a problem hiding this comment.
func setSysProcAttr(cmd *exec.Cmd) {
cmd.SysProcAttr = &syscall.SysProcAttr{
CreationFlags: syscall.CREATE_NEW_PROCESS_GROUP,
}
}We should probably isolate the local controlplane into a separate process group?
This way, if the install process is interrupted/terminated, the envtest environment clean-up can be handled by the installer (instead of the child processes terminating on itself).
| func stopProcess(ps *State) error { | ||
| return ps.Cmd.Process.Kill() | ||
| } |
There was a problem hiding this comment.
Is there a way for grateful shutdown on windows 👀?
Maybe, like:
func stopProcess(ps *State) error {
err := windows.GenerateConsoleCtrlEvent(windows.CTRL_BREAK_EVENT, uint32(ps.Cmd.Process.Pid))
if err != nil {
return err
}
return nil
}There was a problem hiding this comment.
I am reading https://pkg.go.dev/os/signal#hdr-Windows so that hack comes to mind, but I don't exactly know how windows work 😞
8fe3603
into
openshift:main
|
Please feel free to open a new PR if the suggestions works 😁 |
This PR improves cross-platform compatibility.
It solves two main issues:
Path separators, in installer, needs to target two different
environments:
This PR unified path separators used in 2 to be UNIX path separators,
while in 1 to be platform-dependant.
Ref: https://forum.golangbridge.org/t/filepath-join-or-path-join/13479
Known issues:
The spawn processes, including etcd.exe, kube-apiserver.exe,
and openshift-installer.exe, will not exit once installation
aborted or completed. Users need to manually terminate those
processes in task manager.
Need kubernetes-sigs/controller-tools#1297