Skip to content

Serve fake RRSIG RRSET with corrected TTL.#214

Merged
ximon18 merged 5 commits intomainfrom
fix-rrsig-ttl
Oct 16, 2025
Merged

Serve fake RRSIG RRSET with corrected TTL.#214
ximon18 merged 5 commits intomainfrom
fix-rrsig-ttl

Conversation

@ximon18
Copy link
Member

@ximon18 ximon18 commented Oct 14, 2025

To work around the zone tree storing RRSIGs as an RRSET with a single TTL which is incorrect for RRSIGs.

Note: Switches from ApexZone to LightWeightZone for signed zones so that the zone walk behaviour (used when serving the zone via XFR) can be overridden, which means we lose the more proper DNS response generation, but gain reduced memory consumption.

This PR is an alternative to NLnetLabs/domain#577 to address the same problem without having to hack on domain.

To work around the zone tree storing RRSIGs as an RRSET with a single
TTL which is incorrect for RRSIGs.

Note: Switches from Zone to LightWeightZone so that the zone walk
behaviour can be overridden, which means we lose the more proper DNS
response generation, but gain reduced memory consumption.
@ximon18 ximon18 requested a review from a team October 14, 2025 20:09
@ximon18 ximon18 added the bug Something isn't working label Oct 14, 2025
@ximon18
Copy link
Member Author

ximon18 commented Oct 14, 2025

Note: If we really want the response generation logic of the domain in-memory Zone type rather than the Cascade LightWeightZone we could wrap the Zone in a thin wrapper that delegates most functions through to the in-memory Zone impl but impls the walking with the same workaround as this PR does.

Copy link
Member

@Philip-NLnetLabs Philip-NLnetLabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if there is any negative effect of switching to the lightweight zone tree, but the change looks good to me.

@ximon18
Copy link
Member Author

ximon18 commented Oct 15, 2025

I don't know if there is any negative effect of switching to the lightweight zone tree, but the change looks good to me.

There is a negative effect, but for Cascade today it is not a regression compared to what we aim to provide, namely that AXFR queries and simple queries like SOA still work as intended, but queries that maybe should result in a nuanced DNS response e.g. should it be NODATA or NXDOMAIN etc may not be answered per the RFCs, as the LightWeightZone has really simplistic logic for responding to queries compared to domain::zonetree::Zone.

Also, domain::zonetree::Zone has built-in support for accumulating a diff for serving via IXFR, but we're not using that yet nor is it clear if we will use it do IXFR or will solve that a different way.

The other effect is reduced memory usage.

I'll update this PR with examples of the differing query responses and impact on memory usage.

@ximon18
Copy link
Member Author

ximon18 commented Oct 15, 2025

I suspect we could make LWZ even more lightweight if we only want to respond to SOA and AXFR queries...

@Philip-NLnetLabs
Copy link
Member

I think we can leave it like this and see how we want to solve it properly.

@ximon18
Copy link
Member Author

ximon18 commented Oct 15, 2025

I'll update this PR with examples of the differing query responses and impact on memory usage.

Hmm, there is an unexpected big regression in signing time for an old version of the nl zone. I ran each version of the code twice and the difference persisted.

(baseline memory usage before the tests was ~50 MiB)

Property LightWeightZone domain::zonetree::Zone
Total signing time 2m 8s 1m 16s
Memory usage 33.76 GiB 40.44 GiB
DNSKEY RRSIG TTL 3600 0
SOA nl query No change No change
A i.dont.exist.nl query NOERROR, no answer NXDOMAIN
A outside.tree query. NXDOMAIN NXDOMAIN
Screenshot image image

@ximon18
Copy link
Member Author

ximon18 commented Oct 16, 2025

I don't know if there is any negative effect of switching to the lightweight zone tree, but the change looks good to me.

A bit more regarding this: we already use LightWeightZone for zones received via XFR, so it is not entirely new to also use it for signed output zones.

@ximon18
Copy link
Member Author

ximon18 commented Oct 16, 2025

Also, regarding the slow down introduced by using LightWeightZone, for now that it is compensated for to some extent (on a machine with multiple cores) by #219.

@ximon18
Copy link
Member Author

ximon18 commented Oct 16, 2025

I removed the use_lightweight_zone flag as setting it to false would break RRSIG TTLs again.

@ximon18 ximon18 merged commit 78b793b into main Oct 16, 2025
27 checks passed
@ximon18 ximon18 deleted the fix-rrsig-ttl branch October 16, 2025 13:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants