Conversation
For my test app with 100K tiddlers and 20K tags, the tag indexer fix gives a significant improvement:
| Metric | Before | After Tag Indexer | Improvement |
|--------|--------|-------------------|-------------|
| **Initial render** | 261.95ms | 83.51ms | **3.1x faster** |
| **Total refresh** | 2,496.25ms | 1,338.84ms | **1.9x faster (46%)** |
| **Mean refresh** | 416.04ms | 223.14ms | **1.9x faster** |
| **Max refresh** | 480.76ms | 287.26ms | **1.7x faster** |
The specific filter impact: `[subfilter{$:/core/config/GlobalImportFilter}]` went from **1,172ms** (the #1 bottleneck) to effectively **0ms** — it dropped out of the top 20 entirely. That single filter was consuming 47% of total refresh time.
The root cause was that `TagSubIndexer.prototype.update()` was setting `this.index = null` on every tiddler change, forcing a complete rebuild (iterating all 95k tiddlers) on the next tag lookup. The fix makes it incremental — just removing/adding the changed tiddler's tags — which is O(number of tags on the changed tiddler) instead of O(all tiddlers in the wiki).
✅ Deploy Preview for tiddlywiki-previews ready!
To edit notification comments on pull requests, go to your Netlify project configuration.
Confirmed: Jermolene has already signed the Contributor License Agreement (see contributing.md)
📊 Build Size Comparison:
| Branch | Size |
|---|---|
| Base (master) | 2487.2 KB |
| PR | 2496.6 KB |
Diff: ⬆️ Increase: +9.3 KB
⚠️ Change Note Status
This PR appears to contain code changes but doesn't include a change note.
Please add a change note by creating a .tid file in editions/tw5.com/tiddlers/releasenotes/<version>/
📚 Documentation: Release Notes and Changes
💡 Note: If this is a documentation-only change, you can ignore this message.
That's very interesting. I also let Claude play with performance for the following elements.
Making it measurable was the hardest thing to do. I created a little test runner that could be activated with Before testing, it created 10000 tiddlers with elements that should be found. I will use this plugin to see if it comes up with the same results and/or solutions. -- Cool stuff!!
For getTiddlerBacklinks I got a 6x improvement ... with 10000 tids, 10% of them linking each other, 20% with no links at all, 10% of link targets non-existent, 1-5 random links per tiddler, 2 warmup runs, 5 measurement runs. ... For getOrphans it was between 4x-19x .. may still be a warm-up problem. -- will test with this plugin again
These numbers stem from a wiki with 80K tiddlers and 5K tags, and a total wiki size of 150MB. > The biggest win is the P50 (median) refresh dropping from 124ms to 46ms — a 63% improvement
@Jermolene ... Do you have something like a summary report of what the different changes do? IMO it would be nice to know what actually was going on.
@Jermolene ... I let Claude find out what the different optimisations do. But there is still one problem left: this PR does not contain any info on how to reproduce the test runs. So IMO there needs to be some info on how to create your test wiki, plus the replay recipe that you used for testing. IMO the info should be somewhere in the ./editions/test edition.
@Jermolene ... Do you think we can combine some concepts from my draft PR FindDraft-performcance-improvement #9729 with this plugin? E.g. some documentation in the test edition on how to reproduce the results on a different machine. The main problem I have with my approach is that the benchmark is only valid for one TW version. The Jasmine test contains the following code snippet, which is suboptimal
Chrome MCP has https://github.com/ChromeDevTools/chrome-devtools-mcp/blob/main/docs/tool-reference.md#performance Could you use that in the debug workflow instead of the performance plugin? One benefit of the performance plugin is that it produces less data, so fewer tokens are occupied, whereas a Chrome performance flamechart uses tons of tokens while giving the LLM more insight.
hmmm, @linonetwo ... IMO the advantage is that this plugin runs without a browser. On my system I have no Chrome installed. There is Edge, because it is there; I only use it for free CoPilot chat. I am 100% Firefox ;)
@Jermolene .. I updated the test runner benchmark code at PR GetOrphanTitles-performance-improvement #9730. Now it runs with Jasmine, from the command line on Windows, and benchmark-core.js can now be copy/pasted into any wiki browser console. So manual tests would be easy too.
@Jermolene ... There seems to be a bug in tagIndexer.js. To reproduce with VSCode / GitLens plugin
Thanks @pmario, @linonetwo.
Making things measurable is the cornerstone of this technique. My focus so far has been measuring interactive performance, but I would see the scope of the plugin as being broad enough to accommodate any kind of performance measurement.
Can you not use
I will update the OP but my plan is to remove the example optimisations from this PR. We need to make each optimisation a separate PR so that we can easily git bisect problems. This PR is about establishing the shared infrastructure we need for performance measurement.
That is because the two wikis that I used for the experiments above are confidential to two of my clients who are working with enormous wikis. At this size performance is critical, but paradoxically easier to measure and therefore optimise. The value of using client data is that it reflects real world usage, complementing the synthetic test data that we will also need. It also benefits the clients while preserving their privacy.
We will certainly need to do that, but we also need to accommodate people who want to be able to optimise performance while using their private data. Downstream a bit, we can imagine TiddlyWiki end users being able to optimise their copy of TiddlyWiki for the best possible performance with their own, real data.
As noted above, each optimisation must be a separate PR. The scope of this PR is the infrastructure that we can share, which certainly includes the kind of test runner approach in #9729.
There is certainly a role for that approach, but it is orthogonal to the problem addressed here of automating interactive performance testing.
Perhaps it is more helpful to think of it as a bug in our test suite.
Sure, that's why I created several optimisation drafts which contain very similar info in the OP. But the test code and the test tiddlers it creates are different.
OK - that's what I was thinking about. I like the idea of having some documentation and "how to test" information in ./editions/test. The main difficulty I faced is that optimisation benchmarks are good to prove that a code change works. But once it is merged, that test is "outdated" and only wastes time on Netlify's side. In my code I included a version check -- but I think that's a bit hacky. Just some thoughts.
Allows us to test startup performance in a real browser or Playwright


This PR was created with assistance from Claude (using Claude Opus 4.6). It was prompted by Shopify/liquid#2056 which improves Shopify liquid template rendering with 53% faster parse+render, 61% fewer allocations.
The Performance Plugin provides a framework for measuring the performance of TiddlyWiki's refresh cycle — the process that updates the display when tiddlers are modified.
The idea is to capture a realistic workload by recording store modifications while a user interacts with a wiki in the browser, and then replaying those modifications under Node.js where the refresh cycle can be precisely measured in isolation.
Motivation
An important motivation for this framework is to enable LLMs to iteratively optimise TiddlyWiki's performance. The workflow is to edit the code and then run `--perf-replay` against a recorded timeline to measure the impact. This tight edit-measure-iterate loop works because `--perf-replay` runs entirely under Node.js with no browser required, produces machine-readable JSON output, and completes in seconds.
Initial Success
I used Claude to optimise a timeline from a test app of 150MB with 95K tiddlers and 20K tags. It came up with a change to the tag indexer that gives a significant improvement:
The specific filter impact: `[subfilter{$:/core/config/GlobalImportFilter}]` went from 1,172ms (the #1 bottleneck) to effectively 0ms — it dropped out of the top 20 entirely. That single filter was consuming 47% of total refresh time.
The root cause was that `TagSubIndexer.prototype.update()` was setting `this.index = null` on every tiddler change, forcing a complete rebuild (iterating all 95k tiddlers) on the next tag lookup. The fix makes it incremental — just removing/adding the changed tiddler's tags — which is O(number of tags on the changed tiddler) instead of O(all tiddlers in the wiki). Commit: 4b04688
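To make the asymptotic difference concrete, here is an illustrative sketch (not the actual TiddlyWiki indexer code; all names are hypothetical) of maintaining a tag-to-titles index incrementally, touching only the changed tiddler's old and new tags:

```javascript
// Illustrative incremental tag index: update() is O(tags on the changed
// tiddler) rather than O(all tiddlers), because it only removes the
// tiddler's previous tag entries and adds its new ones.
class TagIndex {
	constructor() {
		this.index = new Map();       // tag -> Set of tiddler titles
		this.tagsByTitle = new Map(); // title -> last known tags array
	}
	update(title, newTags) {
		// Remove stale entries for this tiddler's previous tags
		for (const tag of this.tagsByTitle.get(title) || []) {
			const titles = this.index.get(tag);
			if (titles) {
				titles.delete(title);
				if (titles.size === 0) {
					this.index.delete(tag);
				}
			}
		}
		// Add entries for the new tags
		for (const tag of newTags) {
			if (!this.index.has(tag)) {
				this.index.set(tag, new Set());
			}
			this.index.get(tag).add(title);
		}
		this.tagsByTitle.set(title, newTags);
	}
	lookup(tag) {
		return [...(this.index.get(tag) || [])];
	}
}
```

The contrast with the buggy behaviour is that setting the whole index to `null` forces the next lookup to re-scan every tiddler in the wiki.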
How the Performance Plugin Works
The framework has two parts:
1. Recording (Browser)
The plugin intercepts `wiki.addTiddler()` and `wiki.deleteTiddler()` to capture every store modification as it happens. Each operation is recorded with a sequence number, a timestamp, the operation type and tiddler fields, and a batch identifier assigned via `$tw.utils.nextTick()`.
The batch tracking is important because TiddlyWiki groups multiple store changes that occur in the same tick into a single refresh cycle. The recorder preserves these batch boundaries so that playback triggers the same pattern of refreshes.
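A simplified, synchronous model of that batch tracking (hypothetical names; the real plugin derives the boundary from `$tw.utils.nextTick()` rather than an explicit call) might look like:

```javascript
// Simplified recorder model: operations recorded before the next "tick"
// share a batch id. In the real plugin the tick boundary is asynchronous
// ($tw.utils.nextTick()); here we advance it explicitly for clarity.
class Recorder {
	constructor() {
		this.timeline = [];
		this.batch = 0;
		this.dirty = false; // true if any op was recorded in the current tick
	}
	record(op, title, fields) {
		this.timeline.push({
			seq: this.timeline.length,
			batch: this.batch,
			op: op,
			title: title,
			fields: fields
		});
		this.dirty = true;
	}
	// Called at the end of each tick: close the current batch if it
	// received any operations, so the next tick starts a new one.
	endOfTick() {
		if (this.dirty) {
			this.batch++;
			this.dirty = false;
		}
	}
}
```

Preserving these boundaries is what lets playback trigger one refresh per recorded batch instead of one per operation.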
2. Playback (Node.js)
The `--perf-replay` command loads a wiki and builds the full widget tree using TiddlyWiki's `$tw.fakeDocument` — the lightweight DOM implementation used for server-side rendering. It then replays the recorded timeline batch by batch, calling `widgetNode.refresh(changedTiddlers)` after each batch and measuring how long it takes.
This means we are measuring TiddlyWiki's own refresh logic (widget tree traversal, filter evaluation, DOM diffing) in isolation from browser layout and paint. This is intentional — it lets us identify performance bottlenecks within TiddlyWiki itself, independent of which browser is being used.
Why Store-Level Recording?
An alternative would be to record DOM events (clicks, keystrokes) and replay them in a headless browser. Store-level recording was chosen instead because it needs no browser for playback and measures the refresh cycle in isolation from browser layout and paint.
Recording
Recording in the browser produces a `timeline.json` file.
Draft Coalescing
When editing a tiddler, TiddlyWiki writes to draft tiddlers on every keystroke. By default, the recorder coalesces rapid draft updates within the same batch, keeping only the last update. This produces a more compact timeline that focuses on the refresh-relevant changes.
Uncheck "Coalesce rapid draft updates" to record every individual keystroke. This is useful when you specifically want to measure the performance impact of rapid typing.
Playback
Or from any edition that includes this plugin:
Playback runs at full speed with no delays between batches. The recorded timestamps are preserved in the timeline for reference but are not used for pacing.
What Gets Measured
How long `widgetNode.refresh(changedTiddlers)` takes for each batch of store modifications.
Output
The command produces two forms of output:
Text Report (stdout)
A human-readable table printed to the console showing per-batch timings, a summary with percentile statistics, and a breakdown of the most expensive filter executions.
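As an illustration of how the percentile statistics in that summary can be derived from the per-batch refresh times, here is a sketch using the nearest-rank method (the plugin's exact percentile method is an assumption here, and may differ, e.g. by interpolating):

```javascript
// Derive summary statistics from an array of per-batch refresh times
// (milliseconds), using the nearest-rank percentile definition.
function summarise(refreshTimes) {
	const sorted = [...refreshTimes].sort((a, b) => a - b);
	const pct = p =>
		sorted[Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1)];
	const total = sorted.reduce((a, b) => a + b, 0);
	return {
		totalRefreshTime: total,
		meanRefresh: total / sorted.length,
		p50: pct(50),
		p95: pct(95),
		p99: pct(99),
		maxRefresh: sorted[sorted.length - 1]
	};
}
```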
JSON Results File
A `<timeline-name>-results.json` file is written alongside the input timeline. This is the primary output for automated consumption. The file contains:

```json
{
  "wiki": { "tiddlerCount": 2076 },
  "timeline": { "operations": 156, "batches": 42 },
  "initialRender": 55.46,
  "summary": {
    "totalRefreshTime": 234.5,
    "meanRefresh": 5.58,
    "p50": 4.12,
    "p95": 18.7,
    "p99": 31.2,
    "maxRefresh": 31.2,
    "totalFilterInvocations": 4821
  },
  "batches": [
    { "batch": 1, "ops": 1, "changed": 1, "refreshMs": 12.3, "filters": 293, "tiddlers": ["$:/StoryList"] }
  ],
  "topFilters": [
    { "name": "filter: [subfilter{$:/core/config/GlobalImportFilter}]", "time": 5.65, "invocations": 5 }
  ]
}
```

All times are in milliseconds. The key fields for automated analysis:
- `summary.totalRefreshTime` — the single most important number: total time spent in refresh across all batches
- `summary.meanRefresh` — average refresh time per batch
- `summary.p95` / `summary.p99` — tail latency indicators
- `initialRender` — time to build the widget tree from scratch (measures startup cost)
- `batches[].refreshMs` — per-batch breakdown, useful for identifying which user actions are expensive
- `topFilters[]` — the most expensive filters by total execution time, useful for identifying optimisation targets

Example: LLM Optimisation Workflow
An LLM optimising TiddlyWiki performance would follow this pattern:
Step 1: Establish baseline
Read `timeline-results.json` and note the baseline `summary.totalRefreshTime`.
Step 2: Make a change
Edit a source file (e.g. optimise a filter operator in `core/modules/filters/`).
Step 3: Measure impact
Run the same `--perf-replay` command again and read the new `timeline-results.json`.
Step 4: Compare
Compare `summary.totalRefreshTime` and `summary.p95` between baseline and new results. If improved, keep the change. If regressed, revert and try a different approach.
Step 5: Iterate
Repeat steps 2-4 until the target metric is optimised.
The JSON results file makes step 4 straightforward — an LLM can read two JSON files and compare numeric fields directly without parsing tabular text output.
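A sketch of that comparison step, operating on two parsed results objects (in practice loaded with `JSON.parse(fs.readFileSync(...))`; the verdict rule here is an illustrative assumption, not a prescribed policy):

```javascript
// Compare two --perf-replay results objects on the key summary metrics.
// Negative deltas mean the candidate is faster than the baseline.
function compareResults(baseline, candidate) {
	const delta = key => candidate.summary[key] - baseline.summary[key];
	const improved = delta("totalRefreshTime") < 0 && delta("p95") <= 0;
	return {
		totalRefreshTimeDelta: delta("totalRefreshTime"),
		p95Delta: delta("p95"),
		verdict: improved ? "keep" : "revert"
	};
}
```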
Timeline Format
The timeline is a JSON array of operations:
[ { "seq": 0, "t": 123.45, "batch": 0, "op": "add", "title": "$:/StoryList", "isDraft": false, "fields": { "title": "$:/StoryList", "list": "GettingStarted", "text": "" } } ]seq— sequential operation numbert— milliseconds since recording startedbatch— batch identifier (operations in the same batch trigger a single refresh)op—"add"or"delete"isDraft— whether this is a draft tiddler (used for coalescing)fields— complete tiddler fields (null for delete operations)