-
Notifications
You must be signed in to change notification settings - Fork 1k
Sync testnet branch with commits in master #5100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
### What Add redis to the perftests image ### Why Redis-cli is a dependency of the paralle catchup V2 supercluster mission. We'd like to run parallel catchup with the perftests build as it's faster than the buildtests build.
### What Add redis to the perftests image ### Why Redis-cli is a dependency of the paralle catchup V2 supercluster mission. We'd like to run parallel catchup with the perftests build as it's faster than the buildtests build.
This only affects entries that have been archived in a corrupted state (due to bug in p23 eviction code) and have never been restored. Since the hot archive state is normally only observable when the entry actually gets restored, we don't emit any specific meta for this. The downstream observers will observe the correct value during the restoration (i.e. exactly the same value as the one that has been archived). The fix currently hardcodes all the 478 affected entries, including these that have since been restored and updated. This should be fine though, as the upgrade logic ensures that only the archived entries that match our corrupted state expectations are actually fixed, i.e. the live entries will be simply ignored during the upgrade.
…ak at least Windows build
This has been developed on a branch without CI, so we've missed quite a few necessary updates to test data.
This allows specifying an external CSV table that contains the expected p23 Hot Archive corruption data. Then during catchup that covers the whole range of p23 and the upgrade to p24 the file is used to verify that: - Only the entries from the table are ever incorrectly archived, and that their correct and archived states match those in the table - Every entry from the table has indeed been incorrectly archived - The entries that have been marked as restored in the table were indeed restored with the expected corrupted state - During the protocol 24 upgrade only the entries from the table that have never been restored have been updated, and that the update has brought them back to the correct state
# Description Merge changes from v24 release branch to `master`. # Checklist - [ ] Reviewed the [contributing](https://github.com/stellar/stellar-core/blob/master/CONTRIBUTING.md#submitting-changes) document - [ ] Rebased on top of master (no merge commits) - [ ] Ran `clang-format` v8.0.0 (via `make format` or the Visual Studio extension) - [ ] Compiles - [ ] Ran all tests - [ ] If change impacts performance, include supporting evidence per the [performance document](https://github.com/stellar/stellar-core/blob/master/performance-eval/performance-eval.md)
…rifier` (stellar#4973) Fixes the following build warning ``` In file included from main/ApplicationImpl.cpp:40: ./ledger/P23HotArchiveBug.h:55:1: warning: 'Protocol23CorruptionDataVerifier' defined as a struct here but previously declared as a class; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Wmismatched-tags] 55 | struct Protocol23CorruptionDataVerifier | ^ main/ApplicationImpl.h:41:1: note: did you mean struct here? 41 | class Protocol23CorruptionDataVerifier; | ^~~~~ | struct ./main/Application.h:51:1: note: did you mean struct here? 51 | class Protocol23CorruptionDataVerifier; | ^~~~~ | struct ```
- Remove p23-related logic for filtering transactions that interact with keys affected by the data corruption issue - Get rid of `isP24UpgradeLedger` flag - Make some checks for p24 upgrade hot archive fixes more strict, now that the upgrade has actually happened - Fix typos in the metric names (that's actually not from p24 branch, but has been noticed in the merge PR)
# Description Cleanup changes merged from P24 branch. - Remove p23-related logic for filtering transactions that interact with keys affected by the data corruption issue - Get rid of `isP24UpgradeLedger` flag - Make some checks for p24 upgrade hot archive fixes more strict, now that the upgrade has actually happened - Fix typos in the metric names (that's actually not from p24 branch, but has been noticed in the merge PR) # Checklist - [ ] Reviewed the [contributing](https://github.com/stellar/stellar-core/blob/master/CONTRIBUTING.md#submitting-changes) document - [ ] Rebased on top of master (no merge commits) - [ ] Ran `clang-format` v8.0.0 (via `make format` or the Visual Studio extension) - [ ] Compiles - [ ] Ran all tests - [ ] If change impacts performance, include supporting evidence per the [performance document](https://github.com/stellar/stellar-core/blob/master/performance-eval/performance-eval.md)
…WithEntryDiffs invariant
# Description Emit p23 mint/burn events for the SAC. I tested this against pubnet, but need to figure out how to write a simple test case for the new code path. <!--- Describe what this pull request does, which issue it's resolving (usually applicable for code changes). ---> # Checklist - [ ] Reviewed the [contributing](https://github.com/stellar/stellar-core/blob/master/CONTRIBUTING.md#submitting-changes) document - [ ] Rebased on top of master (no merge commits) - [ ] Ran `clang-format` v8.0.0 (via `make format` or the Visual Studio extension) - [ ] Compiles - [ ] Ran all tests - [ ] If change impacts performance, include supporting evidence per the [performance document](https://github.com/stellar/stellar-core/blob/master/performance-eval/performance-eval.md)
This is an initial implementation of the idea described in stellar#4971 -- what I'm now calling "ledger entry scopes" At a high level, the idea here is to: - Wrap `LedgerEntries` in helper classes called `ScopedLedgerEntry<S>` where `S` is one of a _static_ set of known scope-types (`GLOBAL_PAR_APPLY_STATE`, `THREAD_PAR_APPLY_STATE`, `TX_PAR_APPLY_STATE` and so forth). - Add a mix-in class `LedgerEntryScope<S>` that is inherited by each class we use to store "random hashtables-full-of-LedgerEntries" like `ThreadParallelApplyLedgerState` or `GlobalParallelApplyLedgerState` or such) - Require that when you unwrap a `ScopedLedgerEntry` to read or modify the contained `LedgerEntry`, you _provide the scope_ you're reading or writing it in, and _statically require_ the `S` parameters match. - Also _dynamically require_ that a couple other parameters stored in the `ScopedLedgerEntry` match the scope: the ledger number and a generic "index" number which can be used to encode either the thread-cluster number or the bucket list number or similar things where there are "a bunch of similar groups of LEs that should nonetheless be kept separate". - Track when each of those scopes is "active" the same way the LTX thinks about active-ness: when active you can do the read/write actions directly, but when _inactive_ all you can do is "adopt" the entries into some other scope for reading and writing. This is to help catch stale reads on entries from inactive scopes (eg. reading from the global scope directly rather than through a single method that adopts from global -> thread and then reads at the thread scope) All told this should provide a similar level of protection from stale reads as we're getting with the LTX, but in a sort of "decentralized" way that doesn't require uniformly using an LTX for everything. Neither system provides perfect protection against stale reads -- you can hold on to an unwrapped LedgerEntry longer than you should or pass it somewhere you shouldn't -- but it at least gives us a bit more structure to lean on to prevent such errors. ## Caveats It's _not_ wired-in to the bucket list or in-memory soroban state yet. I mean to do so but that'll come later. I tried to be a little more _explicit_ than the methods in the LTX: to not use operator* or operator(bool) or such, and not rely on destructors more than necessary (there's one guard object type to help with exception safety). I also named all the scope-mixin-provided methods with `scope_` as a prefix, which might make it a little more verbose but I think clarity should be the main priority with this code.
Increase ledger_max_instructions and tx_max_instructions to 600000000 to support higher computational throughput on testnet. # Checklist - [ ] Reviewed the [contributing](https://github.com/stellar/stellar-core/blob/master/CONTRIBUTING.md#submitting-changes) document - [ ] Rebased on top of master (no merge commits) - [ ] Ran `clang-format` v8.0.0 (via `make format` or the Visual Studio extension) - [ ] Compiles - [ ] Ran all tests - [ ] If change impacts performance, include supporting evidence per the [performance document](https://github.com/stellar/stellar-core/blob/master/performance-eval/performance-eval.md)
Small follow up to stellar#5081 to better handle some edge cases - Move the exception about potentially corrupt checkpoints further down the recovery function to capture all possible recovery failures. - Handle cases where publishing was enabled mid-checkpoint: skip publishing of the incomplete file, start publishing on the _next_ checkpoint.
# Description Fixes a race due to not copying snapshots. <!--- Describe what this pull request does, which issue it's resolving (usually applicable for code changes). ---> # Checklist - [ ] Reviewed the [contributing](https://github.com/stellar/stellar-core/blob/master/CONTRIBUTING.md#submitting-changes) document - [ ] Rebased on top of master (no merge commits) - [ ] Ran `clang-format` v8.0.0 (via `make format` or the Visual Studio extension) - [ ] Compiles - [ ] Ran all tests - [ ] If change impacts performance, include supporting evidence per the [performance document](https://github.com/stellar/stellar-core/blob/master/performance-eval/performance-eval.md)
Reason we didn't run into this previously was because LedgerTxn always clears prepared statements cache on commit/rollback. in `setLastClosedLedger` we were lucky to call LedgerTxn prior to `clearRebuildForOfferTable`, which cleared the cache. With the switch to multiple DB sessions, we now need to explicitly clean up main session prepared statement cache on startup when we setup state.
This change refactors the codebase to consistently use `toStellarValue` over various other one-off methods of working with values.
With recent bugs like stellar#5093 I realized prepared statements at this point cause more harm than good. We've moved to BucketListDB as a backend a while ago, removed SQL from publishing completely, and now the use of SQL in the codebase is minimal. In addition, for the remaining SQL bits like offers, we do batch load/inserts. The lack of consistent clearing discipline as described in stellar#1591 makes this even harder to think about. On pubnet infra, it looks like we're executing 30-50 queries per second, which is fairly low. This PR removes prepared statement caching completely. I'm currently doing a perf evaluation of this change, but we can probably merge it and if needed, bring back caching for subsystems that could benefit from it.
This change refactors the codebase to consistently use `toStellarValue` over various other one-off methods of working with values.
This change removes `master` from the set of branches that trigger `build.yml` workflow runs. The reason is that we currently build twice: once during the merge queue run, and once when master moves. Even though master is going to move to the exact same revision that the merge queue just tested. We do this on the off chance that master moved for some other reason (eg. someone did a force push). We try to _inhibit_ double-execution of CI by maintaining a somewhat fragile "which rev did we just test" file in the cache volume. But for this to work, cache volume hits have to be 100%, and they're presently nowhere near that. Every time there's a miss it will cost us an extra run. This PR proposes to remove `master` entirely and assume that 99.99% of the time nobody will ever force push to master, and if they _do_ they can also just manually run the CI action, it's just another button you can push. The odds that this will really be a problem in practice are very low and if it ever happened the very worst case would be that the next merge would notice it. IMO the savings in pointless re-executions are worth that (extremely small) risk.
|
Warning Review the following alerts detected in dependencies. According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This is a large sync of the testnet branch with master, bringing in numerous protocol changes primarily focused on:
- Protocol 23 (Soroban state archival): Eviction scanning, in-memory state management, Hot Archive BucketList
- Protocol 24-25: Various Soroban improvements and new features
- Performance optimizations: In-memory bucket merges, improved caching, index improvements
- Infrastructure: Test framework improvements, CI updates, build system updates
Changes:
- Added Rust ed25519-dalek signature verification with new
VerifySigResultreturn type - Implemented in-memory bucket merge optimization for level 0 buckets
- Removed HOT_ARCHIVE_DELETED entry type (now only ARCHIVED and LIVE)
- Enhanced bucket indexing with type-specific range lookups
- Updated random number generator usage patterns throughout codebase
- Moved
#pragma onceto after copyright headers consistently - Updated const placement to follow project style (
constafter type)
Reviewed changes
Copilot reviewed 154 out of 840 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/crypto/* | Updated signature verification API with VerifySigResult, added Rust dalek support, moved pragma once, const placement fixes |
| src/catchup/* | Updated catchup workflow to remove bucket retention parameter, handle LedgerCloseMeta v2, formatting fixes |
| src/bucket/* | Major refactoring: in-memory merges, removed deleted entries from hot archive, type range indexing, improved eviction scanning |
test_expected_sizein check-sorobansfeeChargedvalue.getPossibleMuxedDataTransactionEventand emit stagemod protocol_agnosticand test for soroban_curr version.stream.pos()overUpgradeTimeoutLimitlogictoStellarValueDescription
Resolves #X
Checklist
clang-formatv8.0.0 (viamake formator the Visual Studio extension)