Fix replication service denial after MaxObjectSize increase#2911
Draft
cthulhu-rider wants to merge 2 commits into master from
Conversation
Since the storage node serves the `ObjectService.Replicate` RPC, the gRPC server must be able to accept the biggest allowed object. Previously, the node set the max size of received gRPC messages to the sum of the payload (`MaxObjectSize` network setting) and header (16K at the moment) size limits. This was not entirely correct because the replication request message also contains non-object fields (only the signature for now) and protobuf service bytes. Thus, when the size of an object approached the maximum allowed, it was possible to overflow the calculated limit and be refused service. This adds 1 KB to the calculated limit. This is a heuristic value that is larger than the current protocol version's request extra data, while leaving headroom for small extensions in advance. The exact value is not important given how small these volumes are. Signed-off-by: Leonard Lyubich <leonard@morphbits.io>
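The arithmetic described above can be sketched as follows; the constant and function names are illustrative, not taken from the neofs-node source:

```go
package main

import "fmt"

// Illustrative constants per the description above.
const (
	maxHeaderSize   = 16 << 10 // 16 KB object header limit
	messageOverhead = 1 << 10  // 1 KB heuristic for signature + protobuf service bytes
)

// maxRecvMsgSize computes the gRPC received message limit for a given
// MaxObjectSize network setting, per the scheme described above.
func maxRecvMsgSize(maxObjectSize uint64) uint64 {
	return maxObjectSize + maxHeaderSize + messageOverhead
}

func main() {
	// e.g. with a 64 MB MaxObjectSize the server must accept
	// 64 MB + 16 KB + 1 KB = 67126272-byte messages
	fmt.Println(maxRecvMsgSize(64 << 20)) // 67126272
}
```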
Since the storage node serves the `ObjectService.Replicate` RPC, the gRPC server must be able to accept the biggest allowed object. Previously, the node calculated the global message limit for the gRPC server once, on startup. With this behavior, when the network setting `MaxObjectSize` was increased, the node stopped accepting objects for writing that were larger than the previous limit. This manifested itself as a denial of the replication service. From now on, the storage node updates the max received gRPC message size (if needed) on each refresh of the `MaxObjectSize` setting cache and via polling of the Netmap contract done once per minute. Refs #2910. Signed-off-by: Leonard Lyubich <leonard@morphbits.io>
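A minimal sketch of that update path under stated assumptions (the type and method names are hypothetical; stock gRPC cannot consume such a dynamic value without patching, as discussed below in the conversation):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// recvMsgLimit is a hypothetical holder for the server's max received
// message size; a patched gRPC server would read it via Load on each
// message, so a setting refresh can raise it without a restart.
type recvMsgLimit struct {
	v atomic.Int64
}

// Current returns the limit the server should enforce right now.
func (l *recvMsgLimit) Current() int64 { return l.v.Load() }

// onMaxObjectSizeRefresh recomputes the limit whenever the cached
// MaxObjectSize setting is refreshed or the per-minute Netmap poll
// observes a change: payload + 16 KB header + 1 KB overhead.
func (l *recvMsgLimit) onMaxObjectSizeRefresh(maxObjectSize int64) {
	l.v.Store(maxObjectSize + 16<<10 + 1<<10)
}

func main() {
	var l recvMsgLimit
	l.onMaxObjectSizeRefresh(64 << 20)  // initial setting on startup
	l.onMaxObjectSizeRefresh(128 << 20) // MaxObjectSize increased on-chain
	fmt.Println(l.Current())            // 134235136
}
```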
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:

@@            Coverage Diff             @@
##           master    #2911      +/-   ##
==========================================
+ Coverage   23.75%   23.78%   +0.02%
==========================================
  Files         774      774
  Lines       44866    44911      +45
==========================================
+ Hits        10660    10681      +21
- Misses      33356    33380      +24
  Partials      850      850

☔ View full report in Codecov by Sentry.
Member
It's OK to me (a new feature there, solves a problem, not very invasive). But I wouldn't like to use any forks of the upstream library, so a server restart is perfectly fine to me as well as a temporary solution (until your patch is accepted). This is a very rare event. We can consider some handling on the test side as well. Node restart will solve this for sure.
Member
Please upstream the library improvement.
Overall, the solution is very simple: listen to config changes and update the server limit. But...
gRPC
lib does not provide an option to update the max recv msg size on a running server: https://pkg.go.dev/google.golang.org/grpc#MaxRecvMsgSize is static. It can be supported very easily, so I patched my fork. If we decide this proposal is good, I suggest adding an organization fork (and later proposing it to the upstream)
there is an alternative approach which I completely dislike: although `MaxObjectSize` is expected to be changed very rarely, such a method looks very ugly taking into account the native option's simplicity

Notifications
we don't have them now (nspcc-dev/neofs-contract#427). Polling is used as a temporary solution
this brings us closer to fixing the test within which the bug was originally detected. It should be noted that there is no 100% stable solution for it: there will always be a gap between the contract update and the node's reaction. The only option I see for now is to ignore overflow errors for about 1-2 minutes and retry
cc @evgeniiz321