
Performance plugin #9728

Draft
Jermolene wants to merge 5 commits into master from performance-plugin

Conversation

Member

@Jermolene Jermolene commented Mar 13, 2026

This PR was created with assistance from Claude (using Claude Opus 4.6). It was prompted by Shopify/liquid#2056, which improved Shopify Liquid template rendering: 53% faster parse+render and 61% fewer allocations.

The Performance Plugin provides a framework for measuring the performance of TiddlyWiki's refresh cycle — the process that updates the display when tiddlers are modified.

The idea is to capture a realistic workload by recording store modifications while a user interacts with a wiki in the browser, and then replaying those modifications under Node.js where the refresh cycle can be precisely measured in isolation.

Motivation

An important motivation for this framework is to enable LLMs to iteratively optimise TiddlyWiki's performance. The workflow is:

  1. An LLM makes a change to the TiddlyWiki codebase (e.g. optimising a filter operator, caching a computation, or restructuring a widget's refresh logic)
  2. The LLM runs --perf-replay against a recorded timeline to measure the impact
  3. The LLM reads the JSON results file to determine whether the change improved, regressed, or had no effect on performance
  4. The LLM iterates: tries another approach, measures again, and converges on the best solution

This tight edit-measure-iterate loop works because --perf-replay runs entirely under Node.js with no browser required, produces machine-readable JSON output, and completes in seconds.

Initial Success

I used Claude to optimise a timeline from a 150MB test app with 95K tiddlers and 20K tags. It came up with an improvement to the tag indexer that yields a substantial speedup:

| Metric | Before | After Tag Indexer | Improvement |
|--------|--------|-------------------|-------------|
| Initial render | 261.95ms | 83.51ms | 3.1x faster |
| Total refresh | 2,496.25ms | 1,338.84ms | 1.9x faster (46%) |
| Mean refresh | 416.04ms | 223.14ms | 1.9x faster |
| Max refresh | 480.76ms | 287.26ms | 1.7x faster |

The specific filter impact: [subfilter{$:/core/config/GlobalImportFilter}] went from 1,172ms (the #1 bottleneck) to effectively 0ms — it dropped out of the top 20 entirely. That single filter was consuming 47% of total refresh time.

The root cause was that TagSubIndexer.prototype.update() was setting this.index = null on every tiddler change, forcing a complete rebuild (iterating all 95k tiddlers) on the next tag lookup. The fix makes it incremental — just removing/adding the changed tiddler's tags — which is O(number of tags on the changed tiddler) instead of O(all tiddlers in the wiki).
4b04688
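The incremental strategy described above can be sketched like this. This is an illustrative model only; the class and method names are invented for the example and do not mirror the actual tag-indexer.js code:

```javascript
"use strict";

// Illustrative sketch of an incremental tag index. Instead of discarding the
// whole index on every change, only the changed tiddler's old and new tag
// entries are touched, which is O(tags on that tiddler).
class IncrementalTagIndex {
	constructor() {
		this.index = Object.create(null);        // tag -> Set of tiddler titles
		this.tagsByTitle = Object.create(null);  // title -> tags last seen
	}
	// Called on every store change for the affected title
	update(title, newTags) {
		const oldTags = this.tagsByTitle[title] || [];
		for(const tag of oldTags) {
			const titles = this.index[tag];
			if(titles) {
				titles.delete(title);
				if(titles.size === 0) {
					delete this.index[tag];
				}
			}
		}
		if(newTags) {
			for(const tag of newTags) {
				(this.index[tag] = this.index[tag] || new Set()).add(title);
			}
			this.tagsByTitle[title] = newTags;
		} else {
			delete this.tagsByTitle[title]; // tiddler was deleted
		}
	}
	lookup(tag) {
		return this.index[tag] ? Array.from(this.index[tag]) : [];
	}
}
```

The key point is that a tag lookup never triggers a rebuild that iterates every tiddler in the wiki.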

How the Performance Plugin Works

The framework has two parts:

1. Recording (Browser)

The plugin intercepts wiki.addTiddler() and wiki.deleteTiddler() to capture every store modification as it happens. Each operation is recorded with:

  • A sequence number and high-resolution timestamp
  • The full tiddler fields (so the exact state can be recreated)
  • A batch identifier that tracks TiddlyWiki's change batching via $tw.utils.nextTick()

The batch tracking is important because TiddlyWiki groups multiple store changes that occur in the same tick into a single refresh cycle. The recorder preserves these batch boundaries so that playback triggers the same pattern of refreshes.
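The recording mechanism can be sketched as follows. This is a simplified model: a plain object stands in for the real wiki, and a microtask approximates $tw.utils.nextTick(); the actual plugin also wraps wiki.deleteTiddler():

```javascript
"use strict";

// Simplified sketch of store-level recording: wrap addTiddler so every call
// is appended to a timeline with a sequence number and batch identifier.
// All operations recorded before the queued microtask runs share one batch,
// mirroring how changes in the same tick trigger a single refresh.
function attachRecorder(wiki) {
	const timeline = [];
	let seq = 0, batch = 0, batchOpen = false;
	const originalAdd = wiki.addTiddler.bind(wiki);
	wiki.addTiddler = function(fields) {
		timeline.push({
			seq: seq++,
			t: Date.now(),   // the plugin uses a high-resolution timestamp
			batch: batch,
			op: "add",
			title: fields.title,
			fields: fields
		});
		if(!batchOpen) {
			batchOpen = true;
			queueMicrotask(function() {
				// Close the batch at the end of the current tick
				batch++;
				batchOpen = false;
			});
		}
		return originalAdd(fields);
	};
	return timeline;
}
```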

2. Playback (Node.js)

The --perf-replay command loads a wiki and builds the full widget tree using TiddlyWiki's $tw.fakeDocument — the lightweight DOM implementation used for server-side rendering. It then replays the recorded timeline batch by batch, calling widgetNode.refresh(changedTiddlers) after each batch and measuring how long it takes.

This means we are measuring TiddlyWiki's own refresh logic (widget tree traversal, filter evaluation, DOM diffing) in isolation from browser layout and paint. This is intentional — it lets us identify performance bottlenecks within TiddlyWiki itself, independent of which browser is being used.

Why Store-Level Recording?

An alternative would be to record DOM events (clicks, keystrokes) and replay them in a headless browser. Store-level recording was chosen instead because:

  • The refresh cycle responds to store changes, not DOM events — store modifications are the natural input
  • Store changes are fully deterministic and reproducible
  • No DOM dependency means playback works in pure Node.js with no headless browser to install
  • A headless browser would add its own overhead, making measurements less precise

Recording

  1. Include this plugin in your wiki
  2. Open the Control Panel and find the "Performance Testing Recorder" tab
  3. Click "Start Recording"
  4. Interact with the wiki — open tiddlers, edit, type, navigate, switch tabs
  5. Click "Stop Recording"
  6. Download the timeline.json file

Draft Coalescing

When editing a tiddler, TiddlyWiki writes to draft tiddlers on every keystroke. By default, the recorder coalesces rapid draft updates within the same batch, keeping only the last update. This produces a more compact timeline that focuses on the refresh-relevant changes.

Uncheck "Coalesce rapid draft updates" to record every individual keystroke. This is useful when you specifically want to measure the performance impact of rapid typing.
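The coalescing rule can be sketched as follows. This is illustrative only: within each batch, only the last "add" for a given draft title survives:

```javascript
"use strict";

// Illustrative sketch of draft coalescing: rapid keystrokes produce many
// draft updates; within a batch, only the latest update per draft title is
// kept, so the replayed timeline stays compact.
function coalesceDrafts(timeline) {
	const keep = timeline.map(() => true);
	const lastDraftIndex = {}; // "batch|title" -> index of latest draft update
	timeline.forEach(function(op, i) {
		if(op.op === "add" && op.isDraft) {
			const key = op.batch + "|" + op.title;
			if(lastDraftIndex[key] !== undefined) {
				keep[lastDraftIndex[key]] = false; // superseded by a later update
			}
			lastDraftIndex[key] = i;
		}
	});
	return timeline.filter((op, i) => keep[i]);
}
```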

Playback

tiddlywiki editions/performance --load mywiki.html --perf-replay timeline.json

Or from any edition that includes this plugin:

tiddlywiki myedition --perf-replay timeline.json

Playback runs at full speed with no delays between batches. The recorded timestamps are preserved in the timeline for reference but are not used for pacing.

What Gets Measured

  • Initial render time — the time to build and render the full widget tree from scratch
  • Refresh time per batch — the time widgetNode.refresh(changedTiddlers) takes for each batch of store modifications
  • Filter execution — individual filter timings and invocation counts, showing which filters are the most expensive
  • Statistical summary — mean, P50, P95, P99, and maximum refresh times across all batches
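The statistical summary can be computed along these lines. This sketch uses the nearest-rank percentile definition; the plugin's exact percentile method may differ:

```javascript
"use strict";

// Sketch of the summary statistics over per-batch refresh times (ms),
// using the nearest-rank percentile definition.
function summarise(refreshTimes) {
	const sorted = refreshTimes.slice().sort((a, b) => a - b);
	const total = sorted.reduce((a, b) => a + b, 0);
	const percentile = p =>
		sorted[Math.min(sorted.length - 1, Math.max(0, Math.ceil((p / 100) * sorted.length) - 1))];
	return {
		totalRefreshTime: total,
		meanRefresh: total / sorted.length,
		p50: percentile(50),
		p95: percentile(95),
		p99: percentile(99),
		maxRefresh: sorted[sorted.length - 1]
	};
}
```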

Output

The command produces two forms of output:

Text Report (stdout)

A human-readable table printed to the console showing per-batch timings, a summary with percentile statistics, and a breakdown of the most expensive filter executions.

JSON Results File

A <timeline-name>-results.json file is written alongside the input timeline. This is the primary output for automated consumption. The file contains:

{
  "wiki": {
    "tiddlerCount": 2076
  },
  "timeline": {
    "operations": 156,
    "batches": 42
  },
  "initialRender": 55.46,
  "summary": {
    "totalRefreshTime": 234.5,
    "meanRefresh": 5.58,
    "p50": 4.12,
    "p95": 18.7,
    "p99": 31.2,
    "maxRefresh": 31.2,
    "totalFilterInvocations": 4821
  },
  "batches": [
    {
      "batch": 1,
      "ops": 1,
      "changed": 1,
      "refreshMs": 12.3,
      "filters": 293,
      "tiddlers": ["$:/StoryList"]
    }
  ],
  "topFilters": [
    {
      "name": "filter: [subfilter{$:/core/config/GlobalImportFilter}]",
      "time": 5.65,
      "invocations": 5
    }
  ]
}

All times are in milliseconds. The key fields for automated analysis:

  • summary.totalRefreshTime — the single most important number: total time spent in refresh across all batches
  • summary.meanRefresh — average refresh time per batch
  • summary.p95 / summary.p99 — tail latency indicators
  • initialRender — time to build the widget tree from scratch (measures startup cost)
  • batches[].refreshMs — per-batch breakdown, useful for identifying which user actions are expensive
  • topFilters[] — the most expensive filters by total execution time, useful for identifying optimisation targets

Example: LLM Optimisation Workflow

An LLM optimising TiddlyWiki performance would follow this pattern:

Step 1: Establish baseline

node ./tiddlywiki.js editions/performance --load mywiki.html --perf-replay timeline.json

Read timeline-results.json and note the baseline summary.totalRefreshTime.

Step 2: Make a change

Edit a source file (e.g. optimise a filter operator in core/modules/filters/).

Step 3: Measure impact

Run the same --perf-replay command again and read the new timeline-results.json.

Step 4: Compare

Compare summary.totalRefreshTime and summary.p95 between baseline and new results. If improved, keep the change. If regressed, revert and try a different approach.

Step 5: Iterate

Repeat steps 2-4 until the target metric is optimised.

The JSON results file makes step 4 straightforward — an LLM can read two JSON files and compare numeric fields directly without parsing tabular text output.

Timeline Format

The timeline is a JSON array of operations:

[
  {
    "seq": 0,
    "t": 123.45,
    "batch": 0,
    "op": "add",
    "title": "$:/StoryList",
    "isDraft": false,
    "fields": {
      "title": "$:/StoryList",
      "list": "GettingStarted",
      "text": ""
    }
  }
]
  • seq — sequential operation number
  • t — milliseconds since recording started
  • batch — batch identifier (operations in the same batch trigger a single refresh)
  • op — "add" or "delete"
  • isDraft — whether this is a draft tiddler (used for coalescing)
  • fields — complete tiddler fields (null for delete operations)
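A consumer of the format can check these invariants with a few lines. This is a sketch; the plugin itself may not ship such a validator:

```javascript
"use strict";

// Sketch of a timeline sanity check based on the invariants listed above:
// sequential seq numbers, known op values, null fields on deletes, and a
// title that matches the fields.
function validateTimeline(operations) {
	operations.forEach(function(op, i) {
		if(op.seq !== i) {
			throw new Error("Non-sequential seq at index " + i);
		}
		if(op.op !== "add" && op.op !== "delete") {
			throw new Error("Unknown op: " + op.op);
		}
		if(op.op === "delete" && op.fields !== null) {
			throw new Error("Delete operations must carry null fields");
		}
		if(op.op === "add" && op.fields.title !== op.title) {
			throw new Error("Title mismatch at seq " + op.seq);
		}
	});
	return operations.length;
}
```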


netlify bot commented Mar 13, 2026

Deploy Preview for tiddlywiki-previews ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | 1e06098 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/tiddlywiki-previews/deploys/69b6bd72a39d980008f8201e |
| 😎 Deploy Preview | https://deploy-preview-9728--tiddlywiki-previews.netlify.app |


Confirmed: Jermolene has already signed the Contributor License Agreement (see contributing.md)


github-actions bot commented Mar 13, 2026

📊 Build Size Comparison: empty.html

| Branch | Size |
|--------|------|
| Base (master) | 2487.2 KB |
| PR | 2496.6 KB |

Diff: ⬆️ Increase: +9.3 KB


⚠️ Change Note Status

This PR appears to contain code changes but doesn't include a change note.

Please add a change note by creating a .tid file in editions/tw5.com/tiddlers/releasenotes/<version>/

📚 Documentation: Release Notes and Changes

💡 Note: If this is a documentation-only change, you can ignore this message.

Member

pmario commented Mar 14, 2026

That's very interesting. I did also let Claude play with performance for the following elements.

  • finding transclusion backlinks
  • findDraft tiddler
  • getMissingTitles
  • getTiddlerBacklinks
  • getOrphanTitles

Making it measurable was the hardest thing to do. I did create a little test runner that could be activated with tiddlywiki editions/test --test, and also by starting the test-index.html file in the browser. The main problem in the browser is that the max resolution is 1ms.

Before testing, it created 10000 tiddlers with elements that should be found.

I will use this plugin to see if it comes up with the same results and/or solutions.
I did not upload the drafts yet, since I need to check the auto-created test tiddlers. They were created with Claude and I did not manually check them. So I am not sure yet "what is measured" ;)

-- Cool stuff!!

Member

pmario commented Mar 14, 2026

For getTiddlerBacklinks I got a 6x improvement ... with 10000 tids: 10% of them link each other, 20% have no links at all, 10% of link targets are non-existent, 1-5 random links per tiddler, 2 warmup runs, 5 measurement runs ...

For getOrphans it was between 4x-19x .. maybe still a warm-up problem -- will test with this plugin again

@pmario added the ⟲ admin-review label ("A label for admins, to review the issue again") on Mar 14, 2026
These figures stem from a wiki with 80K tiddlers and 5K tags, and a total wiki size of 150MB.

> The biggest win is the P50 (median) refresh dropping from 124ms to 46ms — a 63% improvement
Member

pmario commented Mar 14, 2026

@Jermolene ... Do you have something like a summary report of what the different changes do? IMO it would be nice to know what was actually going on.

Member

pmario commented Mar 14, 2026

@Jermolene ... I let Claude find out what the different optimisations do.

But there is still one problem left. This PR does not contain any info on how to reproduce any test runs.

So IMO there needs to be some info on how to create your test wiki, and a replay recipe that you used for testing. IMO that info should be somewhere in the ./editions/test edition.

Member

pmario commented Mar 14, 2026

@Jermolene ... Do you think we can combine some concepts from my draft PR: FindDraft-performcance-improvement #9729 with this plugin?

E.g. some documentation on how to reproduce the results on a different machine in the test edition.

The main problem I have with my approach is that the benchmark is only valid for one TW version.

The Jasmine test $:/tags/test-spec in test-finddraft-benchmark.js

It contains the following code snippet, which is suboptimal:

// only run for v5.5.0 and v5.5.0-prerelease
// TODO: Adjust the version check! Currently for the draft it is v5.4.0-pre..

if($tw.version.indexOf("5.4.0") === 0) {
	// ... benchmark spec runs only for matching versions ...
}

@linonetwo
Contributor

Chrome MCP has https://github.com/ChromeDevTools/chrome-devtools-mcp/blob/main/docs/tool-reference.md#performance

Could you use that in the debug workflow instead of the performance plugin? One benefit of the performance plugin is that it produces less data, so fewer tokens are consumed, whereas the Chrome performance flamechart uses tons of tokens while giving the LLM more insight.

Member

pmario commented Mar 14, 2026

hmmm, @linonetwo ... IMO the advantage is that this plugin runs without a browser. On my system I have no Chrome installed. There is Edge, because it is there; I only use it for the free Copilot chat.

I am 100% Firefox ;)

Member

pmario commented Mar 14, 2026

@Jermolene .. I did update the test runner benchmark code at PR: GetOrphanTitles-performance-improvement #9730

Now it runs with Jasmine and from the command line on Windows, and -benchmark-core.js can now be copy/pasted into any wiki's browser console. So manual tests would be easy too.

Member

pmario commented Mar 14, 2026

@Jermolene ... There seems to be a bug in tag-indexer.js

To reproduce with VSCode / GitLens plugin

  • Check out master
  • GitLens: Open branch performance-plugin
  • Open commit named: Update tag-indexer.js
    • tag-indexer.js -> right click
  • Apply changes from tag-indexer.js from performance-plugin branch
    • Make sure they are applied.
  • node tiddlywiki.js editions/tw5.com-server --listen
  • open: http://localhost:8080/#Acknowledgements
  • Click About tag
  • Drag License tiddler to top -> Problem (screenshot)
  • About list field is updated -> OK
  • But UI still shows old list (screenshot)

@Jermolene
Member Author

Thanks @pmario, @linonetwo.

Making it measurable was the hardest thing to do. I did create a little test runner that could be activated with tiddlywiki editions/test --test, and also by starting the test-index.html file in the browser.

Making things measurable is the cornerstone of this technique. My focus so far has been measuring interactive performance, but I would see the scope of the plugin as being broad enough to accommodate any kind of performance measurement.

The main problem in the browser is that the max resolution is 1ms.

Can you not use performance.now() for higher resolution?

@Jermolene ... Do you have something like a summary report, what the different changes do? IMO it would be nice, to know what actually was going on.

I will update the OP but my plan is to remove the example optimisations from this PR. We need to make each optimisation a separate PR so that we can easily git bisect problems. This PR is about establishing the shared infrastructure we need for performance measurement.

But there is still one problem left. This PR does not contain any info, how to reproduce any test runs.

That is because the two wikis that I used for the experiments above are confidential to two of my clients who are working with enormous wikis. At this size performance is critical, but paradoxically easier to measure and therefore optimise.

The value of using client data is that it reflects real-world usage, complementing the synthetic test data that we will also need. It also benefits the clients while preserving their privacy.

So IMO there needs to be some info on how to create your test wiki, and a replay recipe that you used for testing. IMO that info should be somewhere in the ./editions/test edition.

We will certainly need to do that, but we also need to accommodate people who want to be able to optimise performance while using their private data.

Downstream a bit, we can imagine TiddlyWiki end users being able to optimise their copy of TiddlyWiki for the best possible performance with their own, real data.

@Jermolene ... Do you think we can combine some concepts from my draft PR: FindDraft-performcance-improvement #9729 with this plugin?

As noted above, each optimisation must be a separate PR. The scope of this PR is the infrastructure that we can share, which certainly includes the kind of test runner approach in #9729.

Chrome MCP has https://github.com/ChromeDevTools/chrome-devtools-mcp/blob/main/docs/tool-reference.md#performance

Could you use that in the debug workflow instead of the performance plugin? One benefit of the performance plugin is that it produces less data, so fewer tokens are consumed, whereas the Chrome performance flamechart uses tons of tokens while giving the LLM more insight.

There is certainly a role for that approach, but it is orthogonal to the problem addressed here of automating interactive performance testing.

@Jermolene ... There seems to be a bug in the tagIndexer.js

Perhaps it is more helpful to think of it as a bug in our test suite.

Member

pmario commented Mar 14, 2026

@Jermolene ... Do you think we can combine some concepts from my draft PR: FindDraft-performcance-improvement #9729 with this plugin?

As noted above, each optimisation must be a separate PR. ...

Sure, that's why I created several optimisation drafts which contain very similar info in the OP. But the test code and the test tiddlers it creates are different.

... The scope of this PR is the infrastructure that we can share, which certainly includes the kind of test runner approach in #9729.

OK - That's what I was thinking about. I do like the idea of having some documentation and "how to test" information in ./editions/test

The main difficulty I faced is that optimisation benchmarks are good for proving that a code change works.

But once it is merged, that test is "outdated" and only wastes time on Netlify's side.

In my code I included a version check -- but I think that's a bit hacky.

Just some thoughts.
Mario

