Skip to content

Support all operation types in User Defined Index (UDI) interface#14399

Open
zaidoon1 wants to merge 3 commits intofacebook:mainfrom
zaidoon1:zaidoon/udi-index-full-operation-support
Open

Support all operation types in User Defined Index (UDI) interface#14399
zaidoon1 wants to merge 3 commits intofacebook:mainfrom
zaidoon1:zaidoon/udi-index-full-operation-support

Conversation

@zaidoon1
Copy link
Contributor

Remove the restriction that limited UDI to ingest-only, Puts-only use cases. This enables UDI plugins (including the trie index from #14310) to work with all operation types: Put, Delete, Merge, SingleDelete, PutEntity, etc.

Part of #12396

@meta-cla meta-cla bot added the CLA Signed label Feb 27, 2026
@github-actions
Copy link

github-actions bot commented Feb 27, 2026

⚠️ clang-tidy: 1 warning(s) on changed lines

Completed in 411.3s.

Summary by check

Check Count
concurrency-mt-unsafe 1
Total 1

Details

db_stress_tool/db_stress_tool.cc (1 warning(s))
db_stress_tool/db_stress_tool.cc:235:5: warning: function is not thread safe [concurrency-mt-unsafe]

@xingbowang xingbowang self-requested a review February 27, 2026 21:58
@zaidoon1
Copy link
Contributor Author

some notes: I don't have db level tests, currently i'm ussing the sstfilewriter for tests which doesn't support all operations so that's not great.

also we are missing things like seekToFirst, etc.. so i'll see if i can add that too as part of this

@zaidoon1 zaidoon1 force-pushed the zaidoon/udi-index-full-operation-support branch 2 times, most recently from ef0dd67 to 59b6943 Compare February 28, 2026 00:24
@zaidoon1
Copy link
Contributor Author

  • [ ]

some notes: I don't have db level tests, currently i'm ussing the sstfilewriter for tests which doesn't support all operations so that's not great.

also we are missing things like seekToFirst, etc.. so i'll see if i can add that too as part of this

this is done for now. I think what i have here is good for this PR.

@zaidoon1 zaidoon1 force-pushed the zaidoon/udi-index-full-operation-support branch 3 times, most recently from efb01b7 to 5bf0770 Compare March 1, 2026 08:12
Remove the restriction that limited UDI to ingest-only, Puts-only use cases.
This enables UDI plugins (including the trie index from facebook#14310) to work with
all operation types: Put, Delete, Merge, SingleDelete, PutEntity, etc.

Part of facebook#12396
@zaidoon1 zaidoon1 force-pushed the zaidoon/udi-index-full-operation-support branch from 5bf0770 to 83c30bc Compare March 5, 2026 03:28
@zaidoon1
Copy link
Contributor Author

zaidoon1 commented Mar 5, 2026

FYI, still working on some bugs i found. please ignore this PR for now.

…null comparator

Fix three bugs in the trie-based User Defined Index (UDI):

1. BFS overflow reordering in louds_trie.cc Finish(): overflow handles and
   seqnos were not BFS-reordered, causing wrong block handles when BFS
   traversal order differs from key-sorted insertion order. Refactored
   leaf emission into an emit_leaf lambda that also reorders overflow
   arrays.

2. NextAndGetResult in user_defined_index_wrapper.h returned a user key
   instead of an internal key, violating the InternalIteratorBase contract.
   Now sets result->key = key() (internal key with 8-byte suffix) and
   copies bound_check_result/value_prepared individually.

3. NewReader in trie_index_factory.cc crashed on null comparator even
   though NewBuilder already handled it. Now defaults to
   BytewiseComparator().

Also fixes minor inconsistencies: overflow_run_size_=1 initialization in
NewBuilder, SCOPED_TRACE brace style, and a stale comment in table_test.cc.

Tests: OverflowBfsReordering (verified fails without fix),
WrapperNextAndGetResultReturnsInternalKey (verified fails without fix),
expanded NullComparator test.

facebook#12396
@zaidoon1 zaidoon1 force-pushed the zaidoon/udi-index-full-operation-support branch from db5bacc to e8931c3 Compare March 5, 2026 05:54
@zaidoon1
Copy link
Contributor Author

zaidoon1 commented Mar 5, 2026

the current changes look good to me. I've run a 10 minute blackbox crash test without any issues. Currently running a 1 hour blackbox crash test then 1 hour whitebox crash test to really make sure we are good

The trie index builder incorrectly called FindShortSuccessor() on the
last block's separator key, producing a wider key range than the standard
ShortenedIndexBuilder (which uses the last key as-is with the default
kShortenSeparators mode). This caused the trie to claim coverage of keys
between the actual last key and its shortened successor, leading to
iterator desynchronization in test_batches_snapshots with deletes.

Fix: use last_key_in_current_block directly as the last block's separator,
matching the standard builder's behavior. Also use comparator->Compare()
instead of string equality for same-key-run detection.

Add regression tests (LastBlockSeparatorNotShortened,
LastBlockSeparatorWithDeletes) and BatchedPrefixScan tests that reproduce
the crash-test failure pattern.

Remove leftover debug cross-validation scaffolding from the UDI wrapper's
NewIterator. Fix all clang-tidy warnings across UDI/trie files: missing
override, uninitialized members, redundant virtual, missing destructors,
braces around statements, unnecessary copies, and static string init.

Ref: facebook#12396
// Last block: use the last key itself as the separator, NOT a shortened
// successor. The standard index builder (ShortenedIndexBuilder) only
// calls FindShortInternalKeySuccessor when shortening_mode ==
// kShortenSeparatorsAndSuccessor, which is not the default. With the
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was causing the crash tests to fail. I'm not sure what we want to do in general and whether we want to pass the block options that sets the mode and change behaviour depending on that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate more on the issue?

@zaidoon1
Copy link
Contributor Author

zaidoon1 commented Mar 5, 2026

@xingbowang this is now ready for review whenever you have a chance.

@xingbowang
Copy link
Contributor

M1 — key_type_log_ thread safety (2 agents agree): The TestUserDefinedIndexFactory::key_type_log_ shared vector has no synchronization. Not a production bug, but a latent data race in test infrastructure.

M2 — SeekToFirstAndGetResult default assumes BytewiseComparator (2 agents agree): The default implementation SeekAndGetResult(Slice()) is incorrect for non-bytewise comparators. Current risk is low (trie UDI overrides this and only supports BytewiseComparator), but third-party UDI implementations could hit silent bugs. Consider making it pure-virtual.

M3 — Simultaneous removal of all stress test restrictions (2 agents agree): All operation-type restrictions removed at once. The targeted DB-level tests (22 test cases) are thorough, but the crash test explores a much wider combinatorial space. Acceptable for EXPERIMENTAL, but worth monitoring.

M4 — prev_key_scratch_ string copy in NextAndGetResult: Could use std::move instead of copy to avoid per-Next() heap allocation for keys exceeding SSO threshold. One-line change, zero risk.

M5 — Missing degenerate SST edge cases at DB level: No DB-level tests for single-entry SST, deletion-only SST, or all-same-key SST. SST-level tests exist but don't exercise the full DB path.

Comment on lines -897 to -921
dest_params["use_merge"] = 0
dest_params["use_put_entity_one_in"] = 0
dest_params["use_timed_put_one_in"] = 0
dest_params["use_get_entity"] = 0
dest_params["use_multi_get_entity"] = 0
# TransactionDB ROLLBACK writes DELETE entries to WAL to undo
# uncommitted changes. These DELETEs violate UDI's Put-only restriction.
dest_params["use_txn"] = 0
# Trie UDI uses zero-copy pointers into block data, which is
# incompatible with mmap_read.
dest_params["mmap_read"] = 0
# Redistribute delete/delrange percents to write percent
dest_params["writepercent"] += dest_params["delpercent"]
dest_params["writepercent"] += dest_params["delrangepercent"]
dest_params["delpercent"] = 0
dest_params["delrangepercent"] = 0
# Ingestion with standalone range deletions is incompatible
dest_params["test_ingest_standalone_range_deletion_one_in"] = 0
# Parallel compression is incompatible with UDI
dest_params["compression_parallel_threads"] = 1
# Trie UDI has a known issue with prefix scanning where certain prefix
# patterns cause "SeekToFirst not supported" errors. Disable prefix
# scanning and redistribute its percentage to reads.
dest_params["readpercent"] += dest_params.get("prefixpercent", 0)
dest_params["prefixpercent"] = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are removing quite a lot of these. Could you update the PR with stress test results that have these configuration on? These configuration could be overridden in the db_crashtest.py command line. I want to make sure we have these well tested before removing them. It caused bunch of stress test internally before which created lots of work for us to identify the root cause and add these configs.

// Last block: use the last key itself as the separator, NOT a shortened
// successor. The standard index builder (ShortenedIndexBuilder) only
// calls FindShortInternalKeySuccessor when shortening_mode ==
// kShortenSeparatorsAndSuccessor, which is not the default. With the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate more on the issue?

@@ -260,8 +295,10 @@ class UserDefinedIndexIteratorWrapper
valid_ = result_.bound_check_result == IterBoundCheck::kInbound;
if (valid_) {
ikey_.Set(result_.key, 0, ValueType::kTypeValue);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment to explain this? Essentially set result with internal key using user key. This is a general for multiple other APIs. Maybe extract a function helper to make this more explicit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants