fix unshare infinite loop without CGO_ENABLED#590
fix unshare infinite loop without CGO_ENABLED#590lyp256 wants to merge 1 commit intocontainers:mainfrom
Conversation
|
✅ A new PR has been created in buildah to vendor these changes: containers/buildah#6637 |
There was a problem hiding this comment.
Thanks!
I can’t see how this can possibly work. AFAICT the code with _Containers-pid-pipe and _Containers-continue-pipe exists in C became we need it to run (and give all the process environment changes in unshare.Cmd.Start an opportunity to happen) before the Go runtime starts creating any OS-level threads.
We can mimic the child process’ behavior in Go but I don’t see how we can reliably achieve the same outcome.
Cc: @giuseppe
giuseppe
left a comment
There was a problem hiding this comment.
how is this supposed to work?
Is this AI-generated?
storage/pkg/unshare/unshare_nocgo.go
Outdated
| attr.AmbientCaps = []uintptr{ | ||
| unix.CAP_CHOWN, | ||
| unix.CAP_DAC_OVERRIDE, | ||
| unix.CAP_DAC_READ_SEARCH, | ||
| unix.CAP_FOWNER, | ||
| unix.CAP_FSETID, | ||
| unix.CAP_SYS_ADMIN, | ||
| } | ||
| } |
There was a problem hiding this comment.
I've not tested this PR but this won't work, we need all caps because we launch containers from this environment (and they don't need to be in the ambient set)
There was a problem hiding this comment.
I've compared processes with C code and those without, and you're right—the C process does include all capabilities. However, I'm not entirely sure if adding all privileges is the correct decision. The reason I added these capabilities in my code was that when using storage, it directly attempted to delete a folder with permissions r-xr-xr-x, which resulted in a Permission deniederror. Adding the capabilities was my workaround to resolve this permission issue. Based on your suggestion, I will proceed with adding all the capabilities.
There was a problem hiding this comment.
I haven't found any alternative methods to add capabilities other than using the ambient set. Do you have any suggestions?
There was a problem hiding this comment.
When you create a user namespace you automatically get them. Is Go dropping them?
There was a problem hiding this comment.
Based on my testing, I found that child processes will not have capabilities unless the ambient method is used. I collected relevant child process information from the /proc/<pid>/status file under three scenarios: using CGO, not using CGO without setting ambient capabilities, and not using CGO with ambient capabilities set.
Using CGO:
CapInh: 0000000000000000
CapPrm: 000001ffffffffff
CapEff: 000001ffffffffff
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000
Not using CGO and not setting ambient capabilities:
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000
Not using CGO but setting ambient capabilities:
CapInh: 000001ffffffffff
CapPrm: 000001ffffffffff
CapEff: 000001ffffffffff
CapBnd: 000001ffffffffff
CapAmb: 000001ffffffffff
There was a problem hiding this comment.
This affects also processes executed by Podman, could you investigate why this is happening? An alternative would be to drop the ambient caps once we are in the namespace
I needed to use statically compiled Skopeo and Buildah for container image migration tasks in an offline environment. I compiled the necessary executables using commands similar to:
However, I encountered an issue where the unsharepackage did not function properly when CGO_ENABLED=0. While Skopeo worked correctly in most scenarios, operations involving containers-storage(which call unshare.MaybeReexecUsingUserNamespace, e.g. To resolve this, I modified the code to directly pass |
None of this addresses the concern that doing any such changes after the Go runtime starts is too late. What am I missing? (Giuseppe?) The conversation in #160 was AFAICT saying that to resolve this
I realize that does not help your use case but that doesn’t mean that we can ignore the constraints (if there indeed are any). |
|
Under Giuseppe guidance, I re-investigated the implementation of unshare. After reading the source code related to golang's The golang The current approach uses the The main process uses |
Signed-off-by: lyp256 <lyp256@qq.com>
|
LGTM |
|
re-ping @mtrmac |
mtrmac
left a comment
There was a problem hiding this comment.
I really know ~nothing about the problem space; even if this is correct, reviewing it would require much more familiarity with the user namespace mechanics than I now have.
| if c.Cmd.SysProcAttr == nil { | ||
| c.Cmd.SysProcAttr = &syscall.SysProcAttr{} | ||
| } | ||
| attr := c.Cmd.SysProcAttr |
There was a problem hiding this comment.
All of this could happen only inside the if below.
| } | ||
| attr := c.Cmd.SysProcAttr | ||
| if c.UnshareFlags&syscall.CLONE_NEWUSER != 0 { | ||
| attr.Cloneflags = uintptr(c.UnshareFlags) |
There was a problem hiding this comment.
Nothing documents why we are not using “unshare” flags for unshare.
AFAICS the two functions don’t accept exactly the same sets of flags, so even if this were correct for the callers we are currently anticipating, I’d expect the code here to report a failure if any unexpected flag value is set.
(And, well, if Cloneflags works here fine nowadays, wouldn’t we want to use it in the CGo code path as well, and save us the unshare trouble? Is there more subtlety to this? There probably is, given the two separate unshare calls.)
This sort of applies generally — if we can do something natively using SysProcAttr nowadays, I’d expect that to always be done, rather than having two different code paths.)
| attr := c.Cmd.SysProcAttr | ||
| if c.UnshareFlags&syscall.CLONE_NEWUSER != 0 { | ||
| attr.Cloneflags = uintptr(c.UnshareFlags) | ||
| attr.GidMappingsEnableSetgroups = c.GidMappingsEnableSetgroups |
There was a problem hiding this comment.
This does nothing at all unless …
| if c.Ctty != nil { | ||
| index := len(c.Cmd.ExtraFiles) | ||
| c.Cmd.ExtraFiles = append(c.Cmd.ExtraFiles, c.Ctty) | ||
| attr.Ctty = index |
| _ = os.Unsetenv(name) | ||
| v, err := strconv.Atoi(env) | ||
| if err != nil { | ||
| return -1 |
There was a problem hiding this comment.
The C code differentiates between “missing” and “invalid format”
| bailOnError(err, "Read containers continue pipe") | ||
| } | ||
| if n > 0 { | ||
| bailOnError(fmt.Errorf(string(buf)), "Unexpected containers continue pipe read") |
There was a problem hiding this comment.
Format string vulnerability.
Nothing is all that unexpected about this error reporting path, the data exists only to be reported this way.
| bailOnError(err, "Error during setsid") | ||
| } | ||
| } | ||
|
|
| bailOnError(err, "Setresgid failed") | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
The C version does a two-stage unshare. Why is it necessary there and not necessary here?
| } | ||
|
|
||
| // Re-invoke the execve system call to obtain capabilities. | ||
| err := syscall.Exec("/proc/self/exe", os.Args, os.Environ()) |
There was a problem hiding this comment.
containers_reexec does a lot of things. I have no idea why it does all of them (and the code doesn’t document it :| ); either way, I must ask: why is all of that not necessary here?
| if err := syscall.Setresuid(0, 0, 0); err != nil { | ||
| bailOnError(err, "Setresuid failed") | ||
| } | ||
| if err := syscall.Setresgid(0, 0, 0); err != nil { | ||
| bailOnError(err, "Setresgid failed") | ||
| } |
There was a problem hiding this comment.
Why does the UID/GID order differ from the C version?
|
@lyp256 are you still working on this PR? |
fix unshare.MaybeReexecUsingUserNamespace() infinite loop without CGO_ENABLED.
fixes: #160