-
Notifications
You must be signed in to change notification settings - Fork 629
feat: add Galileo gas parameter derivation tooling #1776
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Add scripts and environment setup for deriving Scroll gas fee parameters using data collected from L1 batches. Includes data collection, analysis, and batch visualization utilities.
📝 WalkthroughWalkthroughThis PR introduces a new script directory Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant Script as derive_galileo_gas_parameter.py
participant Blockchain as Blockchain/Beacon RPC
participant FileSystem as File System
User->>Script: Run with collect mode
Script->>Blockchain: Fetch latest finalized blocks
Blockchain-->>Script: Block data + beacon slots
Script->>Blockchain: Fetch blob data and transaction receipts (parallelized)
Blockchain-->>Script: Raw transaction & blob data
Script->>Script: Parse blobs & compress transactions
Script->>Script: Build batch_df & tx_df DataFrames
Script->>FileSystem: Save data (pickle)
Script->>Script: Calculate penalty_multiplier (P95 tx size)
Script->>Script: Aggregate batch costs (commit/blob/finalize)
Script->>Script: Solve for commit_scalar & blob_scalar
Script->>Script: Analyze results (RMSE, MAE, recovery ratio)
Script-->>User: Output: parameters & metrics
User->>FileSystem: Inspect cached data
FileSystem-->>User: show_batches.py displays summary
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes The main script introduces substantial logic across ~25 public functions handling Web3 integration, blockchain data parsing (blobs, transactions, receipts), algebraic solving, and statistical analysis. The supporting scripts are simpler, but the heterogeneity of operations across transaction encoding, batch aggregation, and parameter solving requires careful review of data flow integrity and mathematical correctness. Poem
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (6)
scripts/derive_galileo_gas_parameter/Pipfile (1)
6-13: Consider pinning dependency versions for reproducibility.Using wildcard
"*"for most dependencies can lead to non-reproducible builds and potential breaking changes when new versions are released. For production tooling, consider pinning to specific versions or at least setting upper bounds.🔎 Suggested improvement
[packages] web3 = "<7,>=6" -pandas = "*" -numpy = "*" -rlp = "*" -zstandard = "*" -requests = "*" -async_timeout = "*" +pandas = ">=2.0,<3" +numpy = ">=1.24,<2" +rlp = ">=3.0,<4" +zstandard = ">=0.21,<1" +requests = ">=2.28,<3" +async_timeout = ">=4.0,<5"Alternatively, run
pipenv lockto generate aPipfile.lockthat captures exact versions for reproducibility.scripts/derive_galileo_gas_parameter/show_batches.py (2)
31-31: Remove extraneous f-prefix from string literal.This f-string has no placeholders.
🔎 Proposed fix
- print(f"\nBatch IDs:") + print("\nBatch IDs:")
72-73: Consider logging the exception type for better debugging.While catching broad
Exceptionis acceptable for a CLI utility, including the exception type would help with debugging.🔎 Proposed fix
except Exception as e: - print(f"Error reading file: {e}") + print(f"Error reading file: {type(e).__name__}: {e}")scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py (3)
373-374: Global mutable state for caching.Using a global dictionary for caching works for single-threaded execution but could cause issues if this module is ever used in a multi-threaded context. Consider using a class-based approach or passing the cache explicitly if thread-safety becomes a concern.
426-426: Unused variableversion.This variable is assigned but never used. Consider either removing it or using it for validation (e.g., asserting expected version).
🔎 Proposed fix
- version = int(batch_data[0]) + version = int(batch_data[0]) # Currently unused, but parsed from batch header + # TODO: Add version validation if neededOr remove entirely if not needed:
- version = int(batch_data[0]) + # batch_data[0] contains version byte (unused)
1107-1112: DataFrame mutation side effect.This function modifies
tx_dfdirectly by adding new columns. Sincetx_dfis passed in and also used later inanalyze_results(which also modifies it), this side effect is likely intentional. However, consider either:
- Documenting this side effect in the docstring
- Using
tx_df = tx_df.copy()if you want to avoid mutation
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
scripts/derive_galileo_gas_parameter/Pipfile.lockis excluded by!**/*.lock
📒 Files selected for processing (4)
scripts/derive_galileo_gas_parameter/Pipfilescripts/derive_galileo_gas_parameter/claude.mdscripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.pyscripts/derive_galileo_gas_parameter/show_batches.py
🧰 Additional context used
🪛 Ruff (0.14.10)
scripts/derive_galileo_gas_parameter/show_batches.py
19-19: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
31-31: f-string without any placeholders
Remove extraneous f prefix
(F541)
72-72: Do not catch blind exception: Exception
(BLE001)
scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py
38-38: Avoid specifying long messages outside the exception class
(TRY003)
40-40: Avoid specifying long messages outside the exception class
(TRY003)
146-146: Probable use of requests call without timeout
(S113)
186-186: Avoid specifying long messages outside the exception class
(TRY003)
331-331: Avoid specifying long messages outside the exception class
(TRY003)
398-398: Do not catch blind exception: Exception
(BLE001)
426-426: Local variable version is assigned to but never used
Remove assignment to unused variable version
(F841)
476-476: f-string without any placeholders
Remove extraneous f prefix
(F541)
518-518: Avoid specifying long messages outside the exception class
(TRY003)
522-522: f-string without any placeholders
Remove extraneous f prefix
(F541)
566-566: f-string without any placeholders
Remove extraneous f prefix
(F541)
607-607: f-string without any placeholders
Remove extraneous f prefix
(F541)
678-678: f-string without any placeholders
Remove extraneous f prefix
(F541)
744-744: f-string without any placeholders
Remove extraneous f prefix
(F541)
795-795: f-string without any placeholders
Remove extraneous f prefix
(F541)
809-809: f-string without any placeholders
Remove extraneous f prefix
(F541)
846-846: f-string without any placeholders
Remove extraneous f prefix
(F541)
884-884: f-string without any placeholders
Remove extraneous f prefix
(F541)
948-948: f-string without any placeholders
Remove extraneous f prefix
(F541)
959-959: f-string without any placeholders
Remove extraneous f prefix
(F541)
968-968: f-string without any placeholders
Remove extraneous f prefix
(F541)
1022-1022: Avoid specifying long messages outside the exception class
(TRY003)
1025-1025: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
1067-1067: f-string without any placeholders
Remove extraneous f prefix
(F541)
1096-1096: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?
(RUF002)
1097-1097: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?
(RUF002)
1127-1127: f-string without any placeholders
Remove extraneous f prefix
(F541)
1168-1168: f-string without any placeholders
Remove extraneous f prefix
(F541)
1170-1170: String contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?
(RUF001)
1173-1173: f-string without any placeholders
Remove extraneous f prefix
(F541)
1175-1175: String contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?
(RUF001)
1228-1228: f-string without any placeholders
Remove extraneous f prefix
(F541)
1240-1240: f-string without any placeholders
Remove extraneous f prefix
(F541)
1297-1297: f-string without any placeholders
Remove extraneous f prefix
(F541)
1302-1302: f-string without any placeholders
Remove extraneous f prefix
(F541)
1307-1307: f-string without any placeholders
Remove extraneous f prefix
(F541)
1330-1330: f-string without any placeholders
Remove extraneous f prefix
(F541)
1335-1335: f-string without any placeholders
Remove extraneous f prefix
(F541)
1340-1340: f-string without any placeholders
Remove extraneous f prefix
(F541)
1409-1409: Avoid specifying long messages outside the exception class
(TRY003)
1440-1440: f-string without any placeholders
Remove extraneous f prefix
(F541)
1444-1444: f-string without any placeholders
Remove extraneous f prefix
(F541)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: tests
🔇 Additional comments (5)
scripts/derive_galileo_gas_parameter/claude.md (1)
1-70: Documentation looks good overall.The documentation clearly explains the CLI options, modes, and monitoring approach. A couple of minor suggestions:
- The hardcoded
/tmp/path won't work on Windows - consider noting this is Linux/macOS specific or using a cross-platform alternative.- Consider renaming to
CLAUDE.md(uppercase) orREADME.mdfor better visibility in the directory listing.scripts/derive_galileo_gas_parameter/show_batches.py (1)
17-19: Pickle deserialization - acceptable for local tooling.The static analysis flags
pickle.load()as a potential security issue (S301). Since this script only loads locally-generated pickle files from the companionderive_galileo_gas_parameter.pyscript, this is acceptable. However, adding a brief comment noting this assumption would be helpful for future maintainers.scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py (3)
740-741: Verify block range calculation - potential off-by-one issue.The
last_blockis calculated asmax(initial_L2_block_number + num_blocks), and thenrange(first_block, last_block)is used. Since Python'srange()excludes the end value, the last block in the range won't be fetched.If
num_blocksrepresents the count of blocks starting frominitial_L2_block_number, the calculation should be:
- Last block index =
initial_L2_block_number + num_blocks - 1- Range should be
range(first_block, last_block + 1)to include itPlease verify the intended behavior:
- Does
num_blocksinclude or exclude the initial block?- Should the last block be fetched/processed?
Also applies to: 775-775, 813-813
198-331: Well-structured transaction serialization with good type coverage.The
get_raw_transaction_from_structuredfunction handles multiple transaction types comprehensively (Legacy, EIP-2930, EIP-1559, L1 message, and EIP-7702). The error handling for unsupported types is appropriate.
1362-1458: Well-structured main function with clear execution flow.The main function provides a clear step-by-step execution flow with good progress reporting. The argument handling is appropriate, and returning a results dictionary enables programmatic use of the script.
| url = f"{mainnet_beacon_url}/eth/v1/beacon/headers/head" | ||
| headers = {'accept': 'application/json'} | ||
| response = requests.get(url, headers=headers) | ||
| return int(response.json()['data']['header']['message']['slot']) - (latest_block_number - l1_head) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add timeout to prevent indefinite hangs.
This requests.get() call lacks a timeout, which could cause the script to hang indefinitely if the beacon API is unresponsive.
🔎 Proposed fix
url = f"{mainnet_beacon_url}/eth/v1/beacon/headers/head"
headers = {'accept': 'application/json'}
- response = requests.get(url, headers=headers)
+ response = requests.get(url, headers=headers, timeout=10)
return int(response.json()['data']['header']['message']['slot']) - (latest_block_number - l1_head)🧰 Tools
🪛 Ruff (0.14.10)
146-146: Probable use of requests call without timeout
(S113)
🤖 Prompt for AI Agents
In scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py around
lines 144 to 147, the requests.get call has no timeout and can hang
indefinitely; modify the call to include a reasonable timeout (e.g., timeout=10)
and handle potential requests.exceptions.Timeout and other RequestException
errors by raising or returning a clear error so the script doesn't block
forever.
| print("=" * 60) | ||
|
|
||
| # Get L1 head | ||
| mainnet_beacon_url = "https://eth-mainnetbeacon.g.alchemy.com/v2/Lzj9QLupql91nuRFYAosy" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Hardcoded API key must be moved to environment variable.
The Alchemy API key is hardcoded in the source code. This is a security risk as the key will be exposed in version control and could be abused.
🔎 Proposed fix
- mainnet_beacon_url = "https://eth-mainnetbeacon.g.alchemy.com/v2/Lzj9QLupql91nuRFYAosy"
+ mainnet_beacon_url = os.getenv('MAINNET_BEACON_URL', '')
+ if not mainnet_beacon_url:
+ raise ValueError("MAINNET_BEACON_URL environment variable is not set. Please set it to your Ethereum beacon chain RPC endpoint.")Also update the documentation in claude.md to mention this required environment variable.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| mainnet_beacon_url = "https://eth-mainnetbeacon.g.alchemy.com/v2/Lzj9QLupql91nuRFYAosy" | |
| mainnet_beacon_url = os.getenv('MAINNET_BEACON_URL', '') | |
| if not mainnet_beacon_url: | |
| raise ValueError("MAINNET_BEACON_URL environment variable is not set. Please set it to your Ethereum beacon chain RPC endpoint.") |
🤖 Prompt for AI Agents
In scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py around
line 468 the Alchemy API key is hardcoded in mainnet_beacon_url; remove the
literal key and build the URL using an environment variable (e.g., read
ALCHEMY_API_KEY from os.environ), validate that the variable exists and raise a
clear error if missing, and ensure any tests/launch scripts use the env var;
also update claude.md to document the new ALCHEMY_API_KEY environment variable
and how to set it.
| while l1_base_fee == 0: | ||
| block = scroll_w3.eth.get_block(cur_block_num, full_transactions=True) | ||
| for tx in reversed(block['transactions']): | ||
| if tx.to == '0x5300000000000000000000000000000000000002' and tx.input.hex()[:10] == '0x39455d3a': | ||
| l1_base_fee = int.from_bytes(bytes.fromhex(tx.input.hex()[2:])[-64:-32], 'big') | ||
| l1_blob_base_fee = int.from_bytes(bytes.fromhex(tx.input.hex()[2:])[-32:], 'big') | ||
| cur_block_num -= 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential infinite loop or crash if L1 fee oracle transaction not found.
This while loop decrements cur_block_num without a bound check. If the fee oracle transaction is never found in the blocks being searched, this will either loop indefinitely or eventually crash when trying to fetch block 0 or negative block numbers.
🔎 Proposed fix
+ max_blocks_to_search = 1000 # Safety limit
+ blocks_searched = 0
while l1_base_fee == 0:
+ if blocks_searched >= max_blocks_to_search:
+ raise RuntimeError(f"Could not find L1 fee oracle transaction in {max_blocks_to_search} blocks starting from {first_block}")
block = scroll_w3.eth.get_block(cur_block_num, full_transactions=True)
for tx in reversed(block['transactions']):
if tx.to == '0x5300000000000000000000000000000000000002' and tx.input.hex()[:10] == '0x39455d3a':
l1_base_fee = int.from_bytes(bytes.fromhex(tx.input.hex()[2:])[-64:-32], 'big')
l1_blob_base_fee = int.from_bytes(bytes.fromhex(tx.input.hex()[2:])[-32:], 'big')
cur_block_num -= 1
+ blocks_searched += 1🤖 Prompt for AI Agents
In scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py around
lines 800 to 806, the while loop that decrements cur_block_num can run forever
or underflow if the target L1 fee oracle transaction is never found; add a
bounded search by introducing a minimum block number or max_lookback counter and
check it each iteration, bail out when reached (raise a clear exception or log
and exit) and wrap the block fetch in try/except to handle out-of-range
requests; optionally add a configurable parameter (e.g., MAX_LOOKBACK or
min_block_num) and use it to stop the loop gracefully instead of looping until
negative block numbers.
Summary
Changes
derive_galileo_gas_parameter.py: Main script for collecting batch data and deriving optimal gas parametersshow_batches.py: Utility to display batch informationSummary by CodeRabbit
New Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.