Skip to content

Conversation

@janosch-strato
Copy link

This pull request adds several new features, described in their respective commits.

I based it on the code from #79 but can easily rebase as required.

I am happy to make adaptions.

doomguy-stg and others added 20 commits April 7, 2025 14:37
Since the go-update in commit 858914a, chorus does not compile anymore
because it uses functions from otelgrpc and otelhttp which is now
deprecated.

This commit includes a least-effort fix which makes "make" usable again, by
downgrading otelgrpc and otelhttd to v.0.44.0.
It is now possible to modify the replication name used in "chorctl dash"
and "chorctl repl" using a printf-style format string.
With the introduction of custom destination buckets, it is possible that
two bucket replication policies replicate data to the same destination
bucket. This is now checked and reported as a conflict when the policy is
created.
If a replication has no Tobucket configured, the name of the source bucket
is used as destination.

Since replications without ToBucket just don't make sense, the mgmt api now
uses the source bucket name as default ToBucket in replication responses.

This can make clients a little bit simpler and more robust, because it is
not necessary to check for nil pointers and to anymore. The client does not
have to know about the meaning of "nil", and act according to the default
anymore.

On the other side, it is not possible to distinguish between "ToBucket not
set" and "ToBucket equals source bucket" anymore.
The check command now determines a replications destination bucket from
the configured replication policy instead of using the source bucket name
as default. This makes the check command usable for replications with
custom destination buckets.

Because this requires the replication policy to be present and
unambiuous, specifying a destination bucket name is now mandatory under
certain conditions. The chorus tests have been adapted accordingly.

"chorctl check" now allows the user to specify the destination bucket
name to use for the check. As a side effect, this makes it possible to
use "check" to compare arbitrary buckets, independently from any configured
replications.
By default, the standalone proxy prints all configured credentials on
startup, including the secret access keys.
This can be considered a security issue, depending on the environment
where the service is running.

With this commit, the standalone proxy avoids printing the secret keys
unless printing of secrets is explicitly requested in the config.
Chorctl can now be configured to enforce TLS (with and without
server certificate validation) and to send a client certificate
for mTLS.
The grpc management api can now be configured to enforce TLS connections.
The servers certificate and key have to be provided in two separate files.
Other ways to configure server certificates are currently not implemented.

The server can require the client to send a client certificate for mTLS.
A root CA for validating the client certificate has to be provided in a
dedicated ca file. The client will accept all valid certificates signed
by the configured client CA.

The Client side TLS config required for the http gateway is currently not
implemented. If TLS is enabled, the http api gateway will just be
disabled.
For addresses in the yaml config files, it is allowed to add "http" or
"https" but not required, as the protocol actually used is determined from
the separate IsSecure setting.

To support that, the code had various places that called TrimPrefix twice
for "http://" and "https://" at runtime. To avoid such unnecessary overhead
the ConfAddr type makes sure the string is stored without the protocol. It
offers various getters to also return the address with its protocol where
needed.

The custom YAML marshal und unmarshal functions are not supported by the
used viper config file framework, which needs some extra code to decode the
new type appropriately. They are still needed, though, because the
standalone's Start() function utilizes a yaml.Marshal / yaml.Unmarshal
combination to implement a deep-copy of the config strucure, among other
places.

Signed-off-by: Jan Schlien <schlien@strato.de>
Signed-off-by: Jan Schlien <schlien@strato.de>
Signed-off-by: Jan Schlien <schlien@strato.de>
Applicable domain names are configured per storage.

Signed-off-by: Jan Schlien <schlien@strato.de>
If the received request is already for the proxy's target domain, we can
forward it unaltered.

That saves computing a new signature for the request. It also protects from
errors from re-signing where the new signature is not computed correctly
(which is currently the case for v2 authentication when requests miss the
optional x-amz-content-sha256 header).

It is currently the only option to support "chunked uploads" requests
(https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html),
where the signature of each chunk is comprised of the previous chunk's
signature. That way, the first data chunk's signature depends on the
request's signature. If that is recomputed (which is required because it
depends on the host header), this means that all the other chunk signatures
will need to be recomputed as well, which is not implemented in chorus.

Signed-off-by: Jan Schlien <schlien@strato.de>
Get rid of code duplication and move them to the new central place in
pkg/s3/auth.go.

This change does not introduce any new dependencies, all users included
pkg/s3 beforehand already.

Signed-off-by: Jan Schlien <schlien@strato.de>
The new function will be usable in v2 request signing to eventually replace
minio-go/pkg/signer/SignV2().

Recommended git option to view this commit: --color-moved.

Signed-off-by: Jan Schlien <schlien@strato.de>
Have the code in a central place to make it accessible to additional
callers.

Recommended git option to view this commit: --color-moved.

Signed-off-by: Jan Schlien <schlien@strato.de>
The new function will be usable in v4 request signing to eventually replace
minio-go/pkg/signer/SignV4().

Remove hashedPayload parameter from doesSignatureV4Match(), it can be
computed within the function just like the other components of
stringToSign.

Signed-off-by: Jan Schlien <schlien@strato.de>
Have the code in a central place to make it accessible to additional
callers.

Also get the code for getContentSha256Cksum() from
service/proxy/auth/middleware.go as it is only called by
ComputeSignatureV4() and nowhere else.

Recommended git option to view this commit: --color-moved.

Signed-off-by: Jan Schlien <schlien@strato.de>
Most of the code was already present to support signature validation. Get
rid of the external dependency on github.com/minio/minio-go/v7/pkg/signer
with the rest of the code added by this commit.

Add "Authorization", "User-Agent" and "Accept-Encoding" to the list of
ignored headers as these are also ignored by the minio-go signer.

Signed-off-by: Jan Schlien <schlien@strato.de>
For some requests (chunked uploads) there is signature chaining in
progress, which makes it impossible to only modify the request's header and
send the body unmodified when switching v2 auth to v4 auth. So instead, we
now pass requests with v2 authentication as v2 authenticated requests
instead.

Signed-off-by: Jan Schlien <schlien@strato.de>
@arttor
Copy link
Collaborator

arttor commented May 22, 2025

@janosch-strato thank you for your contribution and sorry for the delayed response.

your PR is quite large and based on commit messages, it contains multiple features. so in current state it will be hard to review. can you please do following:

  • move out virtual domain support to a separate PR
  • move out s3 signature bugfixes to a separate PR. please also add bug description in PR description or create an issue
  • move out config refactoring and other refactoring to a separate PR. I can say that we might reject config refactoring change because there is ongoing work to support openstack swift api and it contains conflicting config changes

@janosch-strato
Copy link
Author

@arttor thank you for taking a look at the pull request.

I can reorder and split my changes of course. As they are partly based on the pull request of @doomguy-stg I'll put this one on hold until that pull request is resolved / closed so that we have common ground what the code should be based on. I'll monitor the progress and come back to this one soon after.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants