Skip to content

Comments

Rework DPoPSigner, fixing nonce tracking and other errors#50

Open
ThisIsMissEm wants to merge 18 commits intoChimeHQ:mainfrom
germ-network:fix/dpop-signer-nonce-tracking
Open

Rework DPoPSigner, fixing nonce tracking and other errors#50
ThisIsMissEm wants to merge 18 commits intoChimeHQ:mainfrom
germ-network:fix/dpop-signer-nonce-tracking

Conversation

@ThisIsMissEm
Copy link
Contributor

@ThisIsMissEm ThisIsMissEm commented Feb 21, 2026

This completely reworks the DPoP Nonce handling, such that you have one DPoPSigner still, however, it is request-origin aware, and tracks DPoP Nonce's per origin.

The DPoPSigner internally uses a method to convert URLResponseProvider's URLResponse to a HTTPUrlResponse in order to simplify the logic in the DPoPSigner code.

The DPoPSigner.JWTParameters no longer contains a issuer as this isn't in the DPoP JWT Parameters, and would leak the user's authorization server to a third-party.

We now correctly handle errors and retries: If the Authorization Server (issuer) returns a OAuth Error Response we will only retry if the error is use_dpop_nonce per RFC 9449. This means that all other errors from the Authorization Server do not trigger a retry.

If we are requesting against a Resource Server (not a Authorization Server), and we get a WWW-Authenticate header with a value like DPoP error="use_dpop_nonce", then we will retry the request with the server-supplied DPoP-Nonce. All other errors from the Resource Server will not be retried.

No retries will happen if the server does not return a DPoP-Nonce header, or if the new DPoP-Nonce header is the same as the previous DPoP Nonce that we had for that origin, since the retry would just fail again. (That is it's only possible to retry on DPoP Nonce failure if the response contains a new DPoP-Nonce value.

The DPoP htu is now correctly the request URL without query parameters or hash fragments, per https://datatracker.ietf.org/doc/html/rfc9449#section-4.2-4.6 (This is the requestEndpoint in DPoPSigner.JWTParameters). This could have been causing issues previously.

Test Coverage

I've significantly increased the test coverage of the DPoPSigner, going from one test to an entire suite containing 16 test cases (5 are in a parameterized test case).

This test coverage includes verifying that different origins don't clobber each other's DPoP Nonce values, and asserting that the retry logic works correctly.

Before Merging

  • Test in an actual application to verify no regressions
  • Clean up commit history


public func response(
isolation: isolated (any Actor),
for request: URLRequest,
Copy link
Contributor Author

@ThisIsMissEm ThisIsMissEm Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't currently have handling for this, due to the FIXME below, however, the request can include a Authorization: DPoP access-token header, this is the case when making a request with an OAuth Session. We do not encounter this in the tests currently.

If we see that we have an Authorization: DPoP access-token header, then we need to take the access-token value and get the sha256 of the first five characters of that value, and use that as the new Authorization header, so:

token = "e8db0489-6851-4f1d-bafa-b3914b37eb9c"
tokenHash = "jxjZyne7ZIPxhQeKKN-_qgN40uiaMa8izmRqIRqCGow"

request = { Authorization: "DPoP e8db0489-6851-4f1d-bafa-b3914b37eb9c" }

// `buildProof` method receives:
//     - not: `e8db0489-6851-4f1d-bafa-b3914b37eb9c`
//     - but: `jxjZyne7ZIPxhQeKKN-_qgN40uiaMa8izmRqIRqCGow`
// 
// Which is: base64Url(SHA256("e8db0489-6851-4f1d-bafa-b3914b37eb9c"))

request = { 
  Authorization: "DPoP JWT('jxjZyne7ZIPxhQeKKN-_qgN40uiaMa8izmRqIRqCGow')"
}

This prevents leaking the Access Token to an untrusted party through the DPoP ath parameter for a request with an Authorization header. However, as we're not using swift-crypto here, we actually need to use the tokenHash supplied from a external source, which is what the FIXME is about.

The current implementation would just completely discard the existing Authorization header.

We could provide a SHA256 method to DPoP Signer which would allow the DPoP Signer to calculate the tokenHash. It would just be pkce.hashFunction however in the current code, and that's probably just as messy, and we should prefer using swift crypto instead.

Source: https://github.com/bluesky-social/atproto/blob/4e96e2c7b7cc0231607d3065c95704069c4ca2a2/packages/oauth/oauth-client/src/fetch-dpop.ts#L54

@ThisIsMissEm ThisIsMissEm force-pushed the fix/dpop-signer-nonce-tracking branch from 6d8733b to e17e65b Compare February 22, 2026 00:29
}

@discardableResult
public func setNonce(from response: URLResponse) -> Bool {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure where this method was ever called. I couldn't find any references to it in the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's public API, so question is if this is valid public API; why would a client call it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, that's the thing, a client shouldn't ever be mutating the nonce cache. It's an internal detail to the OAuth Session / OAuth Client.

@mattmassicotte why does this API exist? I can't find any references to it anywhere.


let (data, _) = try await params.responseProvider(request)

let tokenResponse = try JSONDecoder().decode(TokenResponse.self, from: data)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd use OAuthErrorResponse here too, to actually decode Authorization Server errors, and throw those as a well formed Error. This would give the client details as to why the token exchange request failed.


let tokenResponse = try JSONDecoder().decode(TokenResponse.self, from: data)

guard tokenResponse.token_type == "DPoP" else {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This guard is a little funny / weird to see: Technically there are "Bearer" tokens in the reference server implementation, as com.atproto.server.createSession cannot be DPoP bound, and returns Bearer instead of DPoP tokens. However, when using AT Protocol OAuth the client is required to use DPoP.

}
}

extension DPoPSigner {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, the DPoPSigner should be negotiating with the authorization server as to supported key algorithms, as to not use a incorrect key with the authorization server: https://github.com/bluesky-social/atproto/blob/4e96e2c7b7cc0231607d3065c95704069c4ca2a2/packages/oauth/oauth-client/src/fetch-dpop.ts#L226

However, in AT Protocol, we have this lovely part of the spec:

The ES256 (NIST "P-256") cryptographic algorithm must be supported by all clients and servers for DPoP JWT signing. The set of algorithms recommended for use is expected to evolve over time. Clients and Servers may implement additional algorithms and declare them in metadata documents to facilitate cryptographic evolution and negotiation.

From: https://atproto.com/specs/oauth#demonstrating-proof-of-possession-d-po-p

So because we're using ES256 keys in our DPoPKey implementation, this "mismatch" between supported algorithms and the key's algorithm, we never hit this issue. Yay, shortcuts?

@mattmassicotte
Copy link
Contributor

Wow! I'm delighted to see you looking at this. I don't really understand this area very deeply, so I'm sure you'll find lots of stuff to improve.

While I have no looked closely at all, one thing does give me pause. Changing the type of URLResponseProvider is going to make it hard for even me to adapt to this, and my own usage is very simple.

I will admit that I do not actually even know if there are other subclasses of URLResponse besides HTTPURLResponse. However, that is how Foundation is arranged and we cannot change it.

I wonder if this could be achieved in a less-invasive way by building a small wrapper function with this type and shape?

public typealias HTTPURLResponseProvider = @Sendable (URLRequest) async throws -> (Data, HTTPURLResponse)

@ThisIsMissEm ThisIsMissEm force-pushed the fix/dpop-signer-nonce-tracking branch from 8cacddf to 15783c7 Compare February 23, 2026 21:24
@ThisIsMissEm ThisIsMissEm force-pushed the fix/dpop-signer-nonce-tracking branch from f2d4240 to d3409b6 Compare February 23, 2026 21:33
}

public func buildProof(
_ request: inout URLRequest,
Copy link
Contributor Author

@ThisIsMissEm ThisIsMissEm Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mutation of the request here feels wrong, since multiple calls could result in weird state, potentially?

@ThisIsMissEm ThisIsMissEm force-pushed the fix/dpop-signer-nonce-tracking branch from 0806c7c to 81802a9 Compare February 23, 2026 23:55
Comment on lines 32 to 119
#if !os(Linux)
// I'm unsure why exactly this test is failing on Linux only, but I suspect it is due to
// platform differences in FoundationNetworking.
#expect(headers["DPoP"] == "my_fake_jwt")
#endif
#if !os(Linux)
// I'm unsure why exactly this test is failing on Linux only, but I suspect it is due to
// platform differences in FoundationNetworking.
#expect(headers["DPoP"] == "my_fake_jwt")
#endif
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattmassicotte There was a really simple fix for this:

#expect(request.value(forHTTPHeaderField: "Authorization") == "DPoP token")
#expect(request.value(forHTTPHeaderField: "DPoP") == "my_fake_jwt")

It seems request.allHTTPHeaderFields doesn't return the DPoP header on linux — I'm not sure why.

Comment on lines 189 to 191
if let token {
request.setValue("DPoP \(token)", forHTTPHeaderField: "Authorization")
}
Copy link
Contributor Author

@ThisIsMissEm ThisIsMissEm Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattmassicotte the way we do this in typescript is that when the request is sent into dpop.response(...) the request already has the Authorization header added if there's an active session.

So we'd do something like this in dpopResponse, maybe:

		guard let pkce = config.tokenHandling.pkce else {
			throw AuthenticatorError.pkceRequired
		}

		if let token = login?.accessToken.value {
      request.setValue("Authorization", "DPoP \(token)")
    }

		
    return try await self.dpop.response(
			isolation: self,
			for: request,
			using: generator,
			issuingServer: login?.issuingServer,
			responseProvider: urlLoader
		)

This assumes that DPoPSigner internally calculates the sha256 hash of the Authorization header:

https://github.com/bluesky-social/atproto/blob/4e96e2c7b7cc0231607d3065c95704069c4ca2a2/packages/oauth/oauth-client/src/fetch-dpop.ts#L54-L57

Copy link
Contributor Author

@ThisIsMissEm ThisIsMissEm Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also go further if the Token struct contained the token_type then it'd be:

if let token = login?.accessToken {
  request.setValue("Authorization", "\(token.type) \(token.value)")
}

Which would support Bearer and DPoP through the same method.

throw AuthenticatorError.httpResponseExpected
}

// FIXME: This isn't really to spec: 401 doesn't mean "refresh", it just means unauthorized.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, more specifically, the RFC for Bearer tokens defines what should actually happen here, which is:

For example, in response to a protected resource request without
   authentication:

     HTTP/1.1 401 Unauthorized
     WWW-Authenticate: Bearer realm="example"

And in response to a protected resource request with an
   authentication attempt using an expired access token:

     HTTP/1.1 401 Unauthorized
     WWW-Authenticate: Bearer realm="example",
                       error="invalid_token",
                       error_description="The access token expired"

So the "correct" thing to do here is to actually parse the WWW-Authenticate header if you get a 401, and then based on the error type (invalid_request / invalid_token / insufficient_scope) take an appropriate action. On invalid_token you would attempt a token refresh if you have an refresh token.

But for invalid_request or insufficient_scope, you want to just raise an exception usually. You can go more complicated on insufficient_scope if you know the authorization grant's scope contains the scope value from the WWW-Authenticate header params (if it is given), then you can perform a token refresh to increase the token's scope back to that of the authorization grant's origin scope. This is only used when you want to narrow scopes from an initial broader scope set on an authorization grant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ThisIsMissEm
Copy link
Contributor Author

Wow! I'm delighted to see you looking at this. I don't really understand this area very deeply, so I'm sure you'll find lots of stuff to improve.

While I have no looked closely at all, one thing does give me pause. Changing the type of URLResponseProvider is going to make it hard for even me to adapt to this, and my own usage is very simple.

I will admit that I do not actually even know if there are other subclasses of URLResponse besides HTTPURLResponse. However, that is how Foundation is arranged and we cannot change it.

I wonder if this could be achieved in a less-invasive way by building a small wrapper function with this type and shape?

public typealias HTTPURLResponseProvider = @Sendable (URLRequest) async throws -> (Data, HTTPURLResponse)

@mattmassicotte okay, I reverted the changes here, however, I did modify the DPoPSigner to use a makeRequest method which does cause that part of the code to cast URLResponse to HTTPUrlResponse, since it just makes all the logic that little bit simpler.

@ThisIsMissEm
Copy link
Contributor Author

@mattmassicotte okay, now I know why you responded to this, because it caught me by surprise! I'd actually intended to open against the germ-network repo, but uhh.. github's UI got me. I guess it's fine you saw this before it was ready.

@ThisIsMissEm ThisIsMissEm changed the title WIP: dpop signer nonce tracking Rework DPoPSigner, fixing nonce tracking and other errors Feb 24, 2026
@ThisIsMissEm ThisIsMissEm marked this pull request as ready for review February 24, 2026 01:09
var originComponents = URLComponents()
originComponents.scheme = scheme
originComponents.host = host
originComponents.path = self.relativePath
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -431,7 +431,7 @@ extension Authenticator {
token: token,
tokenHash: tokenHash,
issuingServer: login?.issuingServer,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@germ-mark @anna-germ this is the bug preventing auth. the line isAuthServer needs to take into account issuingServer being nil & pass through nil to the error logoc

@mattmassicotte
Copy link
Contributor

@ThisIsMissEm ahh. Ha! well that's fine. I've just been letting you do your thing, but if you need/want any feedback from me at any point I'd be happy to provide it.

@ThisIsMissEm
Copy link
Contributor Author

@mattmassicotte we're just putting it through it's paces in practice in an application, and the other Germ team members found some issues, and I'm going to be debugging them with them. Interestingly enough these bugs did not show up in test coverage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants