Performance: 53% faster parse+render, 61% fewer allocations #2056
Open
tobi wants to merge 93 commits into main from autoresearch/liquid-perf-2026-03-11
+1,607 −274
93 commits
4ea835a
add quick benchmark script for autoresearch
tobi 3329b09
replace FullToken regex with manual byte parsing in parse_for_document
tobi 97e6893
replace VariableParser regex scan with manual byte parser in Variable…
tobi 7aded8e
add auto/bench.sh: unit tests + liquid-spec + perf benchmark
tobi 2b78e4b
use getbyte instead of string indexing in whitespace_handler and crea…
tobi d291e63
use equal? for frozen array comparison in Lexer, skip whitespace with…
tobi d79b9fa
avoid unnecessary strip allocation in Expression.parse, use byteslice…
tobi fa41224
short-circuit parse_number with first-byte check before regex
tobi c1113ad
fast-path String in render_obj_to_output, avoid Utils.to_s dispatch f…
tobi 1a79cf6
fast-path variable_lookups: skip mutable string alloc when no dot/bra…
tobi 5da2232
use frozen EMPTY_ARRAY for Variable filters when no filters present
tobi 25f9224
fast-path simple variable parsing: skip Lexer/Parser for plain dot-se…
tobi 3939d74
replace SIMPLE_VARIABLE regex with byte-level scanner to avoid MatchData
tobi fe7a2f5
fast-path simple if conditions: skip ExpressionsAndOperators scan for…
tobi 6bcc293
skip TagAttributes scan in for tag when no colon present
tobi f8b0156
fast-path render for filter-less variables: skip render method overhead
tobi 8a92a4e
unified fast-path Variable parsing: handle both plain lookups and fil…
tobi 2d3b856
expose expression_cache/string_scanner via attr_reader, skip regex in…
tobi cfa0dfe
replace For tag Syntax regex with manual byte-level parser
tobi 544d8f1
avoid empty array allocation in evaluate_filter_expressions for no-ar…
tobi 8240709
use getbyte dispatch instead of start_with? in parse_for_document
tobi 58d2514
return [tag_name, markup, newlines] from parse_tag_token: avoid 2 whi…
tobi b86143e
use frozen EMPTY_ARRAY for disabled_tags in Variable
tobi db43492
hoist write score check out of render loop: skip increment_write_scor…
tobi 283961d
skip filter arg splat for no-arg filters, trim render loop comments
tobi 17daac9
extend fast-path to handle quoted string literal variables (262 more …
tobi 2543fdc
autoresearch: add autoresearch.md/sh, increase benchmark warmup to 20…
tobi 9fd7cec
split filter parsing: scan no-arg filters directly, only invoke Lexer…
tobi ad98d1f
add security constraint to autoresearch.md, fix strict mode gate
tobi 83037f9
autoresearch.md: add strategic direction toward single-pass scanner a…
tobi 1882edb
clean up filter parsing: Lexer fallback for args, no-arg fast scan stays
tobi e5933fc
avoid array allocation in parse_tag_token: return tag_name, store mar…
tobi 2e207e6
replace WhitespaceOrNothing regex with byte-level blank_string? check
tobi b03adef
update autoresearch.md progress log
tobi 03a1977
fast-path simple if truthiness: use byte scanner before SIMPLE_CONDIT…
tobi 526af22
add invoke_single fast path for no-arg filter invocation, avoids spla…
tobi 76ae8f1
fast-path find_variable: check top scope first before find_index
tobi d574f19
add invoke_two fast path for single-arg filter invocation, avoids spl…
tobi 4cda1a5
fast-path slice_collection: skip copy for full Array without offset/l…
tobi 79840b1
replace SIMPLE_CONDITION regex with manual byte parser in if/elsif la…
tobi 69430e9
replace INTEGER_REGEX/FLOAT_REGEX with byte-level parse_number
tobi 405e3dc
use frozen EMPTY_ARRAY/EMPTY_HASH for Context @filters/@disabled_tags
tobi b90d7f0
optimize Context init: avoid unnecessary array wrapping for environments
tobi c4186a1
update autoresearch.sh: 3-run best-of, skip liquid-spec for speed
tobi 3799d4c
avoid allocating seen={} hash in Utils.to_s/inspect when not needed
tobi 0b07487
fast-path VariableLookup init: skip scan_variable for simple identifi…
tobi 091534f
add parse_simple to skip simple_lookup? check when caller validates
tobi 9de1527
introduce Cursor class: centralize byte-level scanning for tag/variab…
tobi dd4a100
remove dead BlockBody.parse_tag_token and If SIMPLE_CONDITION - now i…
tobi 0596591
REVERTED: Cursor for For tag adds 148 allocs from scan_id/scan_fragme…
tobi bf1f5cb
Cursor: add skip_id, expect_id, skip_fragment for zero-alloc scanning
tobi cdc3438
For tag: migrate lax_parse to Cursor with zero-alloc skip_id/expect_id
tobi 1f59732
update autoresearch.md with full progress log
tobi 18a72db
fix rubocop offenses: autocorrect style/layout violations
tobi a249010
Fast-path single-arg filter parsing: handle quoted strings, numbers, …
tobi c252d50
Avoid expr_markup byteslice when name is entire markup string (no whi…
tobi 6723d4f
Extend fast-path filter parsing to handle comma-separated multi-arg f…
tobi b48615f
Replace split+join in truncatewords with manual word scan — avoids ar…
tobi 99e55c2
Cache small integer to_s (0-999): avoids 267 Integer#to_s allocations…
tobi 9af3ba3
Lazy Context init: defer StringScanner and @interrupts array allocati…
tobi e3fc735
Cache block_delimiter strings per tag name — avoids repeated string i…
tobi cd308b8
Lazy @changes hash in Registers — only allocate when a register is ac…
tobi 9e29379
Use EMPTY_ARRAY for empty static_environments in Context — avoids 60 …
tobi c4593ce
Skip respond_to?(:context=) for primitive types in find_variable — av…
tobi 0e84955
Skip find_index when only one scope in find_variable — go straight to…
tobi 94562ea
Fast return for primitive types in find_variable — skip to_liquid and…
tobi b058f79
Skip to_liquid/context= for primitives in VariableLookup#evaluate\n\n…
tobi 4df608a
Fast-path Hash lookups in VariableLookup#evaluate — skip respond_to? …
tobi ecc2318
Replace manual byte-level scan_id/skip_id with regex — C-level String…
tobi 6db20e9
Replace manual byte-level scan_number with regex — cleaner code, same…
tobi f8b08b5
Replace manual scan_fragment/scan_quoted_string_raw/skip_fragment wit…
tobi 11c22eb
Replace manual scan_comparison_op with regex — cleaner and avoids byt…
tobi e15b163
Replace manual rest_blank? with regex skip + eos? check\n\nResult: {"…
tobi fd4a7af
Replace manual scan_quoted_string with regex capture groups\n\nResult…
tobi 71e22e6
Replace manual scan_dotted_id with regex\n\nResult: {"status":"keep",…
tobi 1a01915
Minor cleanup: optimize expect_id with while loop and early return\n\…
tobi 22b5ff1
Skip to_liquid_value for String/Integer keys in VariableLookup — avoi…
tobi 76afdf1
Replace manual blank_string? with regex match — cleaner code\n\nResul…
tobi 228ecdb
Cache no-arg filter tuples [name, EMPTY_ARRAY] — reuse frozen tuples …
tobi 38d8055
update autoresearch.md with current progress
tobi 8f2f0ee
Skip context.evaluate for String lookup keys in VariableLookup — avoi…
tobi c09e722
Baseline: 3,818µs combined, 24,881 allocs\n\nResult: {"status":"keep"…
tobi b7ae55f
Replace StringScanner tokenizer with String#byteindex — 12% faster pa…
tobi e25f2f1
Confirmation run: byteindex tokenizer consistently 3,400-3,600µs\n\nR…
tobi b37fa98
Clean up tokenizer: remove unused StringScanner setup and regex const…
tobi f6baeae
parse_tag_token without StringScanner: pure byte ops avoid reset(toke…
tobi 46927b9
update autoresearch docs with current progress
tobi ae9a2e2
Clean confirmation run: 3,314µs (-55% from main), stable\n\nResult: {…
tobi ca327b0
Condition#evaluate: skip loop block for simple conditions (no child_r…
tobi 99454a9
Replace simple_lookup? byte scan with match? regex — 8x faster per ca…
tobi db348e0
Inline to_liquid_value in If render — avoids one method dispatch per …
tobi b195d09
Replace @blocks.each with while loop in If render — avoids block proc…
tobi 3182b7c
update autoresearch experiment log
tobi
@@ -0,0 +1,30 @@
# Autoresearch Ideas

## Dead Ends (tried and failed)

- **Tag name interning** (skip+byte dispatch): saves 878 allocs but verification loop overhead kills speed
- **String dedup (-@)** for filter names: no alloc savings, creates temp strings anyway
- **Split-based tokenizer**: 2.5x faster C-level split but can't handle {{ followed by %} nesting
- **Streaming tokenizer**: needs own StringScanner (+alloc), per-shift overhead worse than eager array
- **Merge simple_lookup? into initialize**: logic overhead offsets saved index call
- **Cursor for filter scanning**: cursor.reset overhead worse than inline byte loops
- **Direct strainer call**: YJIT already inlines context.invoke_single well
- **TruthyCondition subclass**: YJIT polymorphism at evaluate call site hurts more than 115 saved allocs
- **Index loop for filters**: YJIT optimizes each+destructure MUCH better than manual filter[0]/filter[1]

## Key Insights

- YJIT monomorphism > allocation reduction at this scale
- C-level StringScanner.scan/skip > Ruby-level byte loops (already applied)
- String#split is 2.5x faster than manual tokenization, but Liquid's grammar is too complex for regex
- 74% of total CPU time is GC; alloc reduction is the highest-leverage optimization
- But YJIT-deoptimization from polymorphism costs more than the GC savings

## Remaining Ideas

- **Tokenizer: use String#index + byteslice instead of StringScanner**: avoid the StringScanner overhead entirely for the simple case of finding {%/{{ delimiters
- **Pre-freeze all Condition operator lambdas**: reduce alloc in Condition initialization
- **Avoid `@blocks = []` in If with single-element optimization**: use `@block` ivar for single condition, only create array for elsif
- **Reduce ForloopDrop allocation**: reuse ForloopDrop objects across iterations or use a lighter-weight object
- **VariableLookup: single-segment optimization**: for "product.title" (1 lookup), use an ivar instead of 1-element Array
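
The first remaining idea (finding `{%`/`{{` delimiters without StringScanner) could be sketched roughly as follows. This is a simplified illustration, not the tokenizer from this PR: `tokenize`, `OPEN_CURLY` and `PERCENT` are hypothetical names, the binary copy via `String#b` is an assumption made so that `String#index` returns byte offsets that are safe to pass to `byteslice`, and the `{{` followed by `%}` nesting case listed under Dead Ends is deliberately not handled.

```ruby
# Hypothetical, simplified tokenizer (not Liquid's Tokenizer class).
OPEN_CURLY = 123 # "{"
PERCENT    = 37  # "%"

def tokenize(source)
  bytes = source.b # binary copy, so String#index returns byte offsets
  len = bytes.bytesize
  tokens = []
  pos = 0
  while pos < len
    # Find the next "{" that actually starts "{{" or "{%".
    open_at = bytes.index("{", pos)
    while open_at
      nxt = bytes.getbyte(open_at + 1)
      break if nxt == OPEN_CURLY || nxt == PERCENT
      open_at = bytes.index("{", open_at + 1)
    end
    if open_at.nil?
      tokens << source.byteslice(pos, len - pos) # trailing raw text
      break
    end
    tokens << source.byteslice(pos, open_at - pos) if open_at > pos
    closer = bytes.getbyte(open_at + 1) == OPEN_CURLY ? "}}" : "%}"
    close_at = bytes.index(closer, open_at + 2)
    if close_at.nil?
      tokens << source.byteslice(open_at, len - open_at) # unclosed: keep the rest as-is
      break
    end
    tokens << source.byteslice(open_at, close_at + 2 - open_at)
    pos = close_at + 2
  end
  tokens
end

p tokenize("Hi {{ name }}!{% if x %}y{% endif %}")
```

On Ruby 3.2+ `String#byteindex`, which a later commit in this PR adopts for the tokenizer, returns byte offsets directly and would make the binary copy unnecessary.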
@@ -0,0 +1,109 @@
# Autoresearch: Liquid Parse+Render Performance

## Objective
Optimize the Shopify Liquid template engine's parse and render performance.
The workload is the ThemeRunner benchmark which parses and renders real Shopify
theme templates (dropify, ripen, tribble, vogue) with realistic data from
`performance/shopify/database.rb`. We measure parse time, render time, and
object allocations. The optimization target is combined parse+render time (µs).

## How to Run
Run `./auto/autoresearch.sh`: it runs the unit tests, then the performance
benchmark (3 runs, best kept), outputting metrics in parseable format.
`auto/bench.sh` additionally runs liquid-spec conformance between the two.

## Metrics
- **Primary (optimization target)**: `combined_µs` (µs, lower is better): sum of parse + render time
- **Secondary (tradeoff monitoring)**:
  - `parse_µs`: time to parse all theme templates (Liquid::Template#parse)
  - `render_µs`: time to render all pre-compiled templates
  - `allocations`: total object allocations for one parse+render cycle

Parse dominates (~70-75% of combined). Allocations correlate with GC pressure.

## Files in Scope
- `lib/liquid/*.rb`: core Liquid library (parser, lexer, context, expression, etc.)
- `lib/liquid/tags/*.rb`: tag implementations (for, if, assign, etc.)
- `performance/bench_quick.rb`: benchmark script

## Off Limits
- `test/`: tests must continue to pass unchanged
- `performance/tests/`: benchmark templates, do not modify
- `performance/shopify/`: benchmark data/filters, do not modify

## Constraints
- All unit tests must pass (`bundle exec rake base_test`)
- liquid-spec failures must not increase beyond 2 (pre-existing UTF-8 edge cases)
- No new gem dependencies
- Semantic correctness must be preserved: templates must render identical output
- **Security**: Liquid runs untrusted user code. See Strategic Direction for details.

## Strategic Direction
The long-term goal is to converge toward a **single-pass, forward-only parsing
architecture** using one shared StringScanner instance. The current system has
multiple redundant passes: Tokenizer → BlockBody → Lexer → Parser → Expression
→ VariableLookup, each re-scanning portions of the source. A unified scanner
approach would:

1. **One StringScanner** flows through the entire parse: no intermediate token
   arrays, no re-lexing filter chains, no string reconstruction in Parser#expression.
2. **Emit a lightweight IL or normalized AST** during the single forward pass,
   decoupling strictness checking from the hot parse path. The LiquidIL project
   (`~/src/tries/2026-01-05-liquid-il`) demonstrated this: a recursive-descent
   parser emitting IL directly achieved significant speedups.
3. **Minimal backtracking**: the scanner advances forward, byte-checking as it
   goes. liquid-c (`~/src/tries/2026-01-16-Shopify-liquid-c`) showed that a
   C-level cursor-based tokenizer eliminates most allocation overhead.

Current fast-path optimizations (byte-level tag/variable/for/if parsing) are
steps toward this goal. Each one replaces a regex+MatchData pattern with
forward-only byte scanning. The remaining Lexer→Parser path for filter args
is the next target for elimination.

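
Replacing a regex+MatchData pattern with forward-only byte scanning, as described above, might look like the sketch below. The names `IDENT_REGEX`, `ident_via_regex`, `ident_via_bytes`, `ident_start?` and `ident_part?` are hypothetical, and the accepted character set (ASCII letters, digits, `_`, `-`) is an assumption for illustration rather than Liquid's exact grammar.

```ruby
IDENT_REGEX = /\A[A-Za-z_][A-Za-z0-9_-]*/

# Regex version: each successful call allocates a MatchData plus the matched String.
def ident_via_regex(markup)
  m = IDENT_REGEX.match(markup)
  m && m[0]
end

# True for ASCII letters and "_" (assumed identifier start set).
def ident_start?(byte)
  (byte >= 97 && byte <= 122) || (byte >= 65 && byte <= 90) || byte == 95
end

# True for identifier start bytes, ASCII digits, and "-".
def ident_part?(byte)
  ident_start?(byte) || (byte >= 48 && byte <= 57) || byte == 45
end

# Byte version: walks forward once with getbyte and allocates only the result String.
def ident_via_bytes(markup)
  first = markup.getbyte(0)
  return nil unless first && ident_start?(first)

  pos = 1
  len = markup.bytesize
  pos += 1 while pos < len && ident_part?(markup.getbyte(pos))
  markup.byteslice(0, pos)
end

p ident_via_regex("product.title")
p ident_via_bytes("product.title")
```

Both versions return the same leading identifier. Which one is faster is something to measure rather than assume: later commits in this PR (for example ecc2318) move some scanners back to regex.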
**Security note**: Liquid executes untrusted user templates. All parsing must
use explicit byte-range checks. Never use eval, send on user input, dynamic
method dispatch, const_get, or any pattern that lets template authors escape
the sandbox.

## Baseline
- **Commit**: 4ea835a (original, before any optimizations)
- **combined_µs**: 7,374
- **parse_µs**: 5,928
- **render_µs**: 1,446
- **allocations**: 62,620

## Progress Log
- 3329b09: Replace FullToken regex with manual byte parsing → combined 7,262 (-1.5%)
- 97e6893: Replace VariableParser regex with manual byte scanner → combined 6,945 (-5.8%), allocs 58,009
- 2b78e4b: getbyte instead of string indexing in whitespace_handler/create_variable → allocs 51,477
- d291e63: Lexer equal? for frozen arrays, \s+ whitespace skip → combined ~6,331
- d79b9fa: Avoid strip alloc in Expression.parse, byteslice for strings → allocs 49,151
- fa41224: Short-circuit parse_number with first-byte check → allocs 48,240
- c1113ad: Fast-path String in render_obj_to_output → combined ~6,071
- 25f9224: Fast-path simple variable parsing (skip Lexer/Parser) → combined ~5,860, allocs 45,202
- 3939d74: Replace SIMPLE_VARIABLE regex with byte scanner → combined ~5,717, allocs 42,763
- fe7a2f5: Fast-path simple if conditions → combined ~5,444, allocs 41,490
- cfa0dfe: Replace For tag Syntax regex with manual byte parser → combined ~4,974, allocs 39,847
- 8a92a4e: Unified fast-path Variable: parse name directly, only lex filter chain → combined ~5,060, allocs 40,520
- 58d2514: parse_tag_token returns [tag_name, markup, newlines] → combined ~4,815, allocs 37,355
- db43492: Hoist write score check out of render loop → render ~1,345
- 17daac9: Extend fast-path to quoted string literal variables → all 1,197 variables fast-pathed
- 9fd7cec: Split filter parsing: no-arg filters scanned directly, Lexer only for args → combined ~4,595, allocs 35,159
- e5933fc: Avoid array alloc in parse_tag_token via class ivars → allocs 34,281
- 2e207e6: Replace WhitespaceOrNothing regex with byte-level blank_string? → combined ~4,800
- 526af22: invoke_single fast path for no-arg filter invocation → allocs 32,621
- 76ae8f1: find_variable top-scope fast path → combined ~4,740
- 4cda1a5: slice_collection: skip copy for full Array → allocs 32,004
- 79840b1: Replace SIMPLE_CONDITION regex with manual byte parser → combined ~4,663, allocs 31,465
- 69430e9: Replace INTEGER_REGEX/FLOAT_REGEX with byte-level parse_number → allocs 31,129
- 405e3dc: Frozen EMPTY_ARRAY/EMPTY_HASH for Context @filters/@disabled_tags → allocs 31,009
- b90d7f0: Avoid unnecessary array wrapping for Context environments → allocs 30,709
- 3799d4c: Lazy seen={} hash in Utils.to_s/inspect → allocs 30,169
- 0b07487: Fast-path VariableLookup: skip scan_variable for simple identifiers → allocs 29,711
- 9de1527: Introduce Cursor class for centralized byte-level scanning
- dd4a100: Remove dead parse_tag_token/SIMPLE_CONDITION (now in Cursor)
- cdc3438: For tag: migrate lax_parse to Cursor with zero-alloc scanning → allocs 29,620

## Current Best
- **combined_µs**: ~3,400 (-54% from original 7,374 baseline)
- **parse_µs**: ~2,300
- **render_µs**: ~1,100
- **allocations**: 24,882 (-60% from original 62,620 baseline)
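
The two percentages under Current Best can be re-derived from the Baseline figures. A small check, where `percent_reduction` is a hypothetical helper name and the inputs are the numbers quoted in this file:

```ruby
# Percentage by which `current` is lower than `baseline`, rounded to one decimal.
def percent_reduction(baseline, current)
  ((baseline - current) * 100.0 / baseline).round(1)
end

puts percent_reduction(7_374, 3_400)   # combined_µs: prints 53.9
puts percent_reduction(62_620, 24_882) # allocations: prints 60.3
```

Rounded to whole numbers these give the -54% and -60% quoted above.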
@@ -0,0 +1,48 @@
#!/usr/bin/env bash
# Autoresearch benchmark runner for Liquid performance optimization
# Runs: unit tests → performance benchmark (3 runs, takes best)
# Outputs METRIC lines for the agent to parse
# Exit code 0 = all good, non-zero = broken
set -euo pipefail

cd "$(dirname "$0")/.."

# ── Step 1: Unit tests (fast gate) ──────────────────────────────────
echo "=== Unit Tests ==="
TEST_OUT=$(bundle exec rake base_test 2>&1)
TEST_RESULT=$(echo "$TEST_OUT" | tail -1)
if echo "$TEST_OUT" | grep -q 'failures\|errors' && ! echo "$TEST_RESULT" | grep -qE '(^|[^0-9])0 failures, 0 errors'; then  # non-digit guard: "10 failures, 0 errors" must not count as a pass
  echo "$TEST_OUT" | grep -E 'Failure|Error|failures|errors' | head -20
  echo "FATAL: unit tests failed"
  exit 1
fi
echo "$TEST_RESULT"

# ── Step 2: Performance benchmark (3 runs, take best) ──────────────
echo ""
echo "=== Performance Benchmark (3 runs) ==="
BEST_COMBINED=999999
BEST_PARSE=0
BEST_RENDER=0
BEST_ALLOC=0

for i in 1 2 3; do
  OUT=$(bundle exec ruby performance/bench_quick.rb 2>&1)
  P=$(echo "$OUT" | grep '^parse_us=' | cut -d= -f2)
  R=$(echo "$OUT" | grep '^render_us=' | cut -d= -f2)
  C=$(echo "$OUT" | grep '^combined_us=' | cut -d= -f2)
  A=$(echo "$OUT" | grep '^allocations=' | cut -d= -f2)
  echo "  run $i: combined=${C}µs (parse=${P} render=${R}) allocs=${A}"
  if [ "$C" -lt "$BEST_COMBINED" ]; then
    BEST_COMBINED=$C
    BEST_PARSE=$P
    BEST_RENDER=$R
    BEST_ALLOC=$A
  fi
done

echo ""
echo "METRIC combined_us=$BEST_COMBINED"
echo "METRIC parse_us=$BEST_PARSE"
echo "METRIC render_us=$BEST_RENDER"
echo "METRIC allocations=$BEST_ALLOC"
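
On the consuming side, the `METRIC` lines printed by this script could be read back with something like the sketch below. `parse_metrics` is a hypothetical helper, not part of the PR, and the sample output is illustrative: it only mirrors the `METRIC key=value` shape of the `echo` statements above, with values borrowed from the experiment log.

```ruby
# Hypothetical helper: turn "METRIC key=value" lines into a Hash of Integers,
# ignoring every other line of the script's output.
def parse_metrics(output)
  metrics = {}
  output.each_line do |line|
    next unless line.start_with?("METRIC ")

    key, value = line.delete_prefix("METRIC ").strip.split("=", 2)
    metrics[key] = Integer(value)
  end
  metrics
end

# Illustrative sample in the shape the script prints.
sample = <<~OUT
  === Performance Benchmark (3 runs) ===

  METRIC combined_us=3459
  METRIC parse_us=2318
  METRIC render_us=1141
  METRIC allocations=24647
OUT

p parse_metrics(sample)
```

`Integer(value)` raises on a missing or non-numeric value, so a broken benchmark run would fail loudly rather than being recorded as zero.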
@@ -0,0 +1,40 @@
#!/usr/bin/env bash
# Auto-research benchmark script for Liquid
# Runs: unit tests → liquid-spec → performance benchmark
# Outputs machine-readable metrics on success
# Exit code 0 = all good, non-zero = broken
set -euo pipefail

cd "$(dirname "$0")/.."

# ── Step 1: Unit tests (fast gate) ──────────────────────────────────
echo "=== Unit Tests ==="
if ! bundle exec rake base_test 2>&1; then
  echo "FATAL: unit tests failed"
  exit 1
fi

# ── Step 2: liquid-spec (correctness gate) ──────────────────────────
echo ""
echo "=== Liquid Spec ==="
SPEC_OUTPUT=$(bundle exec liquid-spec run spec/ruby_liquid.rb 2>&1 || true)
echo "$SPEC_OUTPUT" | tail -3

# Extract failure count from "Total: N passed, N failed, N errors" line
# Allow known pre-existing failures (≤2)
TOTAL_LINE=$(echo "$SPEC_OUTPUT" | grep "^Total:" || echo "Total: 0 passed, 0 failed, 0 errors")
FAILURES=$(echo "$TOTAL_LINE" | sed -n 's/.*[^0-9]\([0-9][0-9]*\) failed.*/\1/p')  # [^0-9] stops the greedy .* from swallowing leading digits (12 would read as 2)
ERRORS=$(echo "$TOTAL_LINE" | sed -n 's/.*[^0-9]\([0-9][0-9]*\) error.*/\1/p')
FAILURES=${FAILURES:-0}
ERRORS=${ERRORS:-0}
TOTAL_BAD=$((FAILURES + ERRORS))

if [ "$TOTAL_BAD" -gt 2 ]; then
  echo "FATAL: liquid-spec has $FAILURES failures and $ERRORS errors (threshold: 2)"
  exit 1
fi

# ── Step 3: Performance benchmark ──────────────────────────────────
echo ""
echo "=== Performance Benchmark ==="
bundle exec ruby performance/bench_quick.rb 2>&1
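
The liquid-spec gate in Step 2 (extract the failed and error counts from the `Total:` line, allow at most 2) can also be expressed in Ruby. This is a sketch under assumptions: `spec_gate` and `TOTAL_RE` are hypothetical names, and the summary format is taken from the script's own comment, not from liquid-spec documentation. Unlike the script, which falls back to zero counts when no `Total:` line is found, this version treats a missing summary as a failed gate.

```ruby
# Summary format as described in the script comment: "Total: N passed, N failed, N errors".
TOTAL_RE = /^Total: (\d+) passed, (\d+) failed, (\d+) errors?/

# Returns { ok:, bad: } where bad is failures + errors, or nil if no summary was found.
def spec_gate(output, threshold: 2)
  m = TOTAL_RE.match(output)
  return { ok: false, bad: nil } unless m # no summary line: treat the run as broken

  bad = m[2].to_i + m[3].to_i
  { ok: bad <= threshold, bad: bad }
end

p spec_gate("...\nTotal: 4102 passed, 12 failed, 0 errors\n")
p spec_gate("Total: 4112 passed, 2 failed, 0 errors\n")
```

Capturing with `(\d+)` anchored after a literal `, ` keeps multi-digit counts intact.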
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| {"type":"config","name":"Liquid parse+render performance (tenderlove-inspired)","metricName":"combined_µs","metricUnit":"µs","bestDirection":"lower"} | ||
| {"run":1,"commit":"c09e722","metric":3818,"metrics":{"parse_µs":2722,"render_µs":1096,"allocations":24881},"status":"keep","description":"Baseline: 3,818µs combined, 24,881 allocs","timestamp":1773348490227} | ||
| {"run":2,"commit":"c09e722","metric":4063,"metrics":{"parse_µs":2901,"render_µs":1162,"allocations":24003},"status":"discard","description":"Tag name interning via skip+byte dispatch: saves 878 allocs but verification loop slower than scan","timestamp":1773348738557,"segment":0} | ||
| {"run":3,"commit":"c09e722","metric":3881,"metrics":{"parse_µs":2720,"render_µs":1161,"allocations":24881},"status":"discard","description":"String dedup (-@) for filter names: no alloc savings, no speed benefit","timestamp":1773348781481,"segment":0} | ||
| {"run":4,"commit":"c09e722","metric":3970,"metrics":{"parse_µs":2829,"render_µs":1141,"allocations":24881},"status":"discard","description":"Streaming tokenizer: needs own StringScanner (+1 alloc), per-shift overhead worse than saved array","timestamp":1773348883093,"segment":0} | ||
| {"run":5,"commit":"c09e722","metric":0,"metrics":{"parse_µs":0,"render_µs":0,"allocations":0},"status":"crash","description":"REVERTED: split-based tokenizer — regex can't handle unclosed tags inside raw blocks","timestamp":1773349089230,"segment":0} | ||
| {"run":6,"commit":"c09e722","metric":0,"metrics":{"parse_µs":0,"render_µs":0,"allocations":0},"status":"crash","description":"REVERTED: split regex tokenizer v2 — can't handle {{ followed by %} (variable-becomes-tag nesting)","timestamp":1773349248313,"segment":0} | ||
| {"run":7,"commit":"c09e722","metric":3861,"metrics":{"parse_µs":2744,"render_µs":1117,"allocations":24881},"status":"discard","description":"Merge simple_lookup? dot position into initialize — logic overhead offsets saved index call","timestamp":1773349376707,"segment":0} | ||
| {"run":8,"commit":"c09e722","metric":4048,"metrics":{"parse_µs":2929,"render_µs":1119,"allocations":24881},"status":"discard","description":"Use Cursor regex for filter name scanning — cursor.reset + method dispatch overhead worse than inline bytes","timestamp":1773349447172,"segment":0} | ||
| {"run":9,"commit":"c09e722","metric":3872,"metrics":{"parse_µs":2744,"render_µs":1128,"allocations":24881},"status":"discard","description":"Direct strainer call in Variable#render — YJIT already inlines context.invoke_single well","timestamp":1773349497593,"segment":0} | ||
| {"run":10,"commit":"c09e722","metric":3839,"metrics":{"parse_µs":2732,"render_µs":1107,"allocations":24879},"status":"discard","description":"Array#[] fast path for slice_collection with limit/offset — only 2 alloc savings, not meaningful","timestamp":1773349555348,"segment":0} | ||
| {"run":11,"commit":"c09e722","metric":3889,"metrics":{"parse_µs":2770,"render_µs":1119,"allocations":24766},"status":"discard","description":"TruthyCondition for simple if checks: -115 allocs but YJIT polymorphism at evaluate call site hurts speed","timestamp":1773349649377,"segment":0} | ||
| {"run":12,"commit":"c09e722","metric":4150,"metrics":{"parse_µs":2769,"render_µs":1381,"allocations":24881},"status":"discard","description":"Index loop for filters: YJIT optimizes each+destructure better than manual indexing","timestamp":1773349699285,"segment":0} | ||
| {"run":13,"commit":"b7ae55f","metric":3556,"metrics":{"parse_µs":2388,"render_µs":1168,"allocations":24882},"status":"keep","description":"Replace StringScanner tokenizer with String#byteindex — 12% faster parse, no regex overhead for delimiter finding","timestamp":1773349875890,"segment":0} | ||
| {"run":14,"commit":"e25f2f1","metric":3464,"metrics":{"parse_µs":2335,"render_µs":1129,"allocations":24882},"status":"keep","description":"Confirmation run: byteindex tokenizer consistently 3,400-3,600µs","timestamp":1773349889465,"segment":0} | ||
| {"run":15,"commit":"b37fa98","metric":3490,"metrics":{"parse_µs":2331,"render_µs":1159,"allocations":24882},"status":"keep","description":"Clean up tokenizer: remove unused StringScanner setup and regex constants","timestamp":1773349928672,"segment":0} | ||
| {"run":16,"commit":"b37fa98","metric":3638,"metrics":{"parse_µs":2460,"render_µs":1178,"allocations":24882},"status":"discard","description":"Single-char byteindex for %} search: Ruby loop overhead worse for nearby targets","timestamp":1773349985509,"segment":0} | ||
| {"run":17,"commit":"b37fa98","metric":3553,"metrics":{"parse_µs":2431,"render_µs":1122,"allocations":25256},"status":"discard","description":"Regex simple_variable_markup: MatchData creates 374 extra allocs, offsetting speed gain","timestamp":1773350066627,"segment":0} | ||
| {"run":18,"commit":"b37fa98","metric":3629,"metrics":{"parse_µs":2455,"render_µs":1174,"allocations":25002},"status":"discard","description":"String.new(capacity: 4096) for output buffer: allocates more objects, not fewer","timestamp":1773350101852,"segment":0} | ||
| {"run":19,"commit":"f6baeae","metric":3350,"metrics":{"parse_µs":2212,"render_µs":1138,"allocations":24882},"status":"keep","description":"parse_tag_token without StringScanner: pure byte ops avoid reset(token) overhead, -12% combined","timestamp":1773350230252,"segment":0} | ||
| {"run":20,"commit":"f6baead","metric":0,"metrics":{"parse_µs":0,"render_µs":0,"allocations":0},"status":"crash","description":"REVERTED: regex ultra-fast path for Variable — name pattern too broad, matches invalid trailing dots","timestamp":1773350472859,"segment":0} | ||
| {"run":21,"commit":"ae9a2e2","metric":3314,"metrics":{"parse_µs":2203,"render_µs":1111,"allocations":24882},"status":"keep","description":"Clean confirmation run: 3,314µs (-55% from main), stable","timestamp":1773350544354,"segment":0} | ||
| {"run":22,"commit":"ae9a2e2","metric":3497,"metrics":{"parse_µs":2336,"render_µs":1161,"allocations":24882},"status":"discard","description":"Regex fast path for no-filter variables: include? + match? overhead exceeds byte scan savings","timestamp":1773350641375,"segment":0} | ||
| {"run":23,"commit":"ca327b0","metric":3445,"metrics":{"parse_µs":2284,"render_µs":1161,"allocations":24647},"status":"keep","description":"Condition#evaluate: skip loop block for simple conditions (no child_relation) — saves 235 allocs","timestamp":1773350691752,"segment":0} | ||
| {"run":24,"commit":"99454a9","metric":3489,"metrics":{"parse_µs":2353,"render_µs":1136,"allocations":24647},"status":"keep","description":"Replace simple_lookup? byte scan with match? regex — 8x faster per call, cleaner code","timestamp":1773350837721,"segment":0} | ||
| {"run":25,"commit":"99454a9","metric":3797,"metrics":{"parse_µs":2636,"render_µs":1161,"allocations":29627},"status":"discard","description":"Regex name extraction in try_fast_parse: MatchData creates 5K extra allocs, much worse","timestamp":1773351048938,"segment":0} | ||
| {"run":26,"commit":"db348e0","metric":3459,"metrics":{"parse_µs":2318,"render_µs":1141,"allocations":24647},"status":"keep","description":"Inline to_liquid_value in If render — avoids one method dispatch per condition evaluation","timestamp":1773351080001,"segment":0} | ||
| {"run":27,"commit":"b195d09","metric":3496,"metrics":{"parse_µs":2356,"render_µs":1140,"allocations":24530},"status":"keep","description":"Replace @blocks.each with while loop in If render — avoids block proc allocation per render","timestamp":1773351101134,"segment":0} | ||
| {"run":28,"commit":"b195d09","metric":3648,"metrics":{"parse_µs":2457,"render_µs":1191,"allocations":24530},"status":"discard","description":"While loop in For render: YJIT optimizes each well for hot loops with many iterations","timestamp":1773351142275,"segment":0} | ||
| {"run":29,"commit":"b195d09","metric":3966,"metrics":{"parse_µs":2641,"render_µs":1325,"allocations":24060},"status":"discard","description":"While loop for environment search: -470 allocs but YJIT deopt makes render 16% slower","timestamp":1773351193863,"segment":0} |
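
A log in this shape can be summarized with a few lines of Ruby. The sketch below is not part of the PR: `summarize` is a hypothetical helper, and the embedded sample is an abbreviated subset of the entries above (fields trimmed for brevity). Crash entries record a metric of 0, so the best run is picked from `keep` entries only.

```ruby
require "json"

# Hypothetical helper: count runs per status and find the fastest kept run.
def summarize(jsonl)
  runs = jsonl.each_line.map { |line| JSON.parse(line) }.select { |r| r.key?("run") }
  kept = runs.select { |r| r["status"] == "keep" }
  {
    counts: runs.map { |r| r["status"] }.tally,
    best: kept.min_by { |r| r["metric"] },
  }
end

# Abbreviated subset of the log above.
log = <<~'JSONL'
  {"type":"config","bestDirection":"lower"}
  {"run":1,"commit":"c09e722","metric":3818,"status":"keep"}
  {"run":2,"commit":"c09e722","metric":4063,"status":"discard"}
  {"run":5,"commit":"c09e722","metric":0,"status":"crash"}
  {"run":21,"commit":"ae9a2e2","metric":3314,"status":"keep"}
JSONL

summary = summarize(log)
p summary[:counts]
p summary[:best]["commit"]
```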
Were this and auto/bench.sh your only input files? I've only tested autoresearch with a skill for setup. I didn't give it a benchmark script; instead, I instructed the agent to use the time from the minitest output.
Initially, before building autoresearch.