Potential Bug: Records permanently stuck in `unsyncedRecordIDs` due to out-of-order record delivery #415

timbueno · 2026-03-06T02:19:22Z


I'm putting this here as a discussion because I can't get this bug to reliably reproduce.

Summary

When many records are created quickly on one device, handleFetchedRecordZoneChanges on a receiving device can process child records before their parent records have been upserted. The child records are correctly added to unsyncedRecordIDs for retry on FK constraint failure, but the parent records that failed to upsert are silently swallowed by withErrorReporting and never queued for retry. This creates a permanent deadlock where child records retry forever but their parents never exist locally.

Steps to Reproduce

1. Set up two devices syncing via CKSyncEngine.
2. On Device A, rapidly create many parent records, each together with child records that hold foreign key references to that parent.
3. Observe Device B.

Expected Behavior

All records appear on Device B.

Actual Behavior

Some child records are permanently stuck in unsyncedRecordIDs along with their dependents. Their referenced parent records exist in sqlitedata_icloud_metadata but not in the user database. App restarts do not recover the stuck records.

Root Cause Analysis

The issue spans two code paths in handleFetchedRecordZoneChanges:

1. Parent records fail silently

In upsertFromServerRecord(_:db:) (line 1850), the entire body is wrapped in withErrorReporting, which catches and swallows errors. When a parent record fails to upsert, it is never added to unsyncedRecordIDs. It exists in sync metadata but not in the user database — effectively lost.

2. Child records are added to retry queue but can never succeed

When a child record hits a FK constraint violation (lines 1918-1930), it's correctly added to unsyncedRecordIDs. But on subsequent sync cycles, the retry logic (lines 1473-1522) re-fetches the child from CloudKit and tries to upsert again. It fails again because the parent still doesn't exist locally — and the parent was never queued for retry.
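To make the interaction between paths 1 and 2 concrete, here is a toy simulation. It is not the library's code: `ToyStore`, `Record`, `UpsertError`, and the `failOnce` set are all hypothetical stand-ins, and the reason the parent fails on its first upsert is simply assumed. It only shows that once a non-FK failure is swallowed, retrying the child alone can never converge.

```swift
// Toy model only: none of these types exist in the library.
struct Record: Hashable {
    let id: String
    let parentID: String?
}

enum UpsertError: Error {
    case foreignKeyViolation
    case other
}

final class ToyStore {
    var rows: [String: Record] = [:]         // stands in for the user database
    var unsyncedRecordIDs: Set<String> = []  // stands in for the retry queue
    var failOnce: Set<String> = []           // IDs whose first upsert fails (assumed cause)

    func upsert(_ record: Record) throws {
        if failOnce.remove(record.id) != nil { throw UpsertError.other }
        if let parentID = record.parentID, rows[parentID] == nil {
            throw UpsertError.foreignKeyViolation
        }
        rows[record.id] = record
    }

    // Mirrors the behavior described above: FK failures are queued,
    // every other failure is swallowed.
    func process(_ batch: [Record]) {
        for record in batch {
            do {
                try upsert(record)
                unsyncedRecordIDs.remove(record.id)
            } catch UpsertError.foreignKeyViolation {
                unsyncedRecordIDs.insert(record.id)
            } catch {
                // Swallowed: the record is not queued for retry.
            }
        }
    }
}

let parent = Record(id: "parent", parentID: nil)
let child = Record(id: "child", parentID: "parent")
let server = [parent.id: parent, child.id: child]  // stands in for CloudKit

let store = ToyStore()
store.failOnce = ["parent"]
store.process([parent, child])

// Later sync cycles re-fetch only what is in the retry queue.
for _ in 0..<3 {
    let retries = store.unsyncedRecordIDs.compactMap { server[$0] }
    store.process(retries)
}

print(store.rows.keys.sorted())          // []
print(store.unsyncedRecordIDs.sorted())  // ["child"]
```

After three retry cycles the toy database is still empty and the child is still queued, which matches the stuck state I'm seeing.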

3. The retry path also has a silent failure mode

If a retry fetch from CloudKit fails, the record is silently skipped:

case .failure:
    continue  // line 1515-1516

The record remains in unsyncedRecordIDs but is never actually retried successfully.
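One way this path could be made observable, sketched with hypothetical names (`handleRetryResults`, `RetryOutcome`, and `FetchError` are mine, not the library's): instead of a bare `continue`, keep the ID and count the consecutive failed fetches so a stuck record at least shows up somewhere.

```swift
// Hypothetical stand-ins; not the library's types.
enum FetchError: Error { case network }

struct RetryOutcome {
    var fetched: [String] = []
    var stillPending: [String: Int] = [:]  // record ID -> consecutive failed fetches
}

func handleRetryResults(
    _ results: [String: Result<String, FetchError>],
    previousFailures: [String: Int]
) -> RetryOutcome {
    var outcome = RetryOutcome()
    // Sort by ID so the order of processing and logging is deterministic.
    for (id, result) in results.sorted(by: { $0.key < $1.key }) {
        switch result {
        case .success:
            outcome.fetched.append(id)
        case .failure(let error):
            // Keep the ID and record the failure instead of skipping silently.
            let count = previousFailures[id, default: 0] + 1
            outcome.stillPending[id] = count
            print("retry fetch failed for \(id) (attempt \(count)): \(error)")
        }
    }
    return outcome
}

let outcome = handleRetryResults(
    ["a": .success("payload"), "b": .failure(.network)],
    previousFailures: ["b": 2]
)
```

Here `outcome.fetched` contains only `"a"`, and `"b"` is carried forward with a failure count of 3 rather than disappearing from view.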

Contributing Factor

The modifications array is sorted topologically at line 1524, but only after merging with previously unsynced records. The initial delivery from CKSyncEngine is not guaranteed to arrive in topological order, so parent records may appear after their children in the batch.
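For reference, a parents-first ordering of a single batch can be sketched with Kahn's algorithm. This is a minimal illustration, not the sort the library uses at line 1524: `Modification` and `parentsFirst` are hypothetical, record IDs are assumed unique within a batch, and parents that are not in the batch are ignored on the assumption that they already exist locally or will be retried later.

```swift
// Hypothetical shape of an incoming change; IDs are assumed unique per batch.
struct Modification {
    let id: String
    let parentIDs: [String]
}

/// Orders modifications so every parent in the batch precedes its children.
func parentsFirst(_ batch: [Modification]) -> [Modification] {
    let byID = Dictionary(batch.map { ($0.id, $0) }, uniquingKeysWith: { first, _ in first })
    var pendingParents: [String: Int] = [:]  // ID -> parents in this batch not yet emitted
    var children: [String: [String]] = [:]   // parent ID -> child IDs in this batch
    for mod in batch {
        let inBatch = Set(mod.parentIDs).filter { byID[$0] != nil && $0 != mod.id }
        pendingParents[mod.id] = inBatch.count
        for parent in inBatch { children[parent, default: []].append(mod.id) }
    }
    // Kahn's algorithm: start with records that have no parent in the batch.
    var ready = batch.map(\.id).filter { pendingParents[$0] == 0 }
    var sorted: [Modification] = []
    var index = 0
    while index < ready.count {
        let id = ready[index]
        index += 1
        sorted.append(byID[id]!)
        for child in children[id, default: []] {
            pendingParents[child]! -= 1
            if pendingParents[child] == 0 { ready.append(child) }
        }
    }
    // Records caught in a reference cycle cannot be ordered; append them unchanged.
    let emitted = Set(sorted.map(\.id))
    return sorted + batch.filter { !emitted.contains($0.id) }
}

let ordered = parentsFirst([
    Modification(id: "child", parentIDs: ["parent"]),
    Modification(id: "grandchild", parentIDs: ["child"]),
    Modification(id: "parent", parentIDs: []),
])
print(ordered.map(\.id))  // ["parent", "child", "grandchild"]
```

Sorting within a batch only helps when the parent arrives in the same batch as its children; a parent delivered in a later batch still depends on the retry queue working.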

Possible Fixes

1. Add failed parent records to unsyncedRecordIDs: When upsertFromServerRecord fails for any reason (not just FK constraints), add the record to the retry queue so it isn't silently lost.
2. Retry from local metadata instead of CloudKit: The _lastKnownServerRecordAllFields blob in sqlitedata_icloud_metadata already contains the full record data. The retry could upsert from this local copy instead of re-fetching from CloudKit, avoiding the silent case .failure: continue path.
3. Ensure topological ordering of the initial CKSyncEngine delivery: Sort the incoming modifications before processing, not just after merging with unsynced records.
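A rough sketch of the first fix, again as a toy model rather than a patch against the real code: `ToySync`, `Row`, and `UpsertFailure` are hypothetical, and the transient failure of the parent is an assumption. The only change from the current behavior is that the catch-all queues the record instead of swallowing the error.

```swift
// Toy model; all names are hypothetical.
struct Row { let id: String; let parentID: String? }
struct UpsertFailure: Error {}

struct ToySync {
    var database: [String: Row] = [:]          // stands in for the user database
    var unsyncedRecordIDs: Set<String> = []    // stands in for the retry queue
    var transientFailures: Set<String> = []    // IDs whose first upsert fails (assumed)

    mutating func upsert(_ row: Row) throws {
        if transientFailures.remove(row.id) != nil { throw UpsertFailure() }
        // Simulated FK constraint: the parent must already exist locally.
        if let parentID = row.parentID, database[parentID] == nil { throw UpsertFailure() }
        database[row.id] = row
    }

    mutating func process(_ rows: [Row]) {
        for row in rows {
            do {
                try upsert(row)
                unsyncedRecordIDs.remove(row.id)
            } catch {
                // Every failure, not only FK violations, is queued for retry.
                unsyncedRecordIDs.insert(row.id)
            }
        }
    }
}

let server = [  // stands in for CloudKit
    "parent": Row(id: "parent", parentID: nil),
    "child": Row(id: "child", parentID: "parent"),
]
var sync = ToySync()
sync.transientFailures = ["parent"]
sync.process([server["parent"]!, server["child"]!])

// Two retry cycles are enough here regardless of the order within a cycle.
for _ in 0..<2 {
    let retries = sync.unsyncedRecordIDs.compactMap { server[$0] }
    sync.process(retries)
}

print(sync.database.keys.sorted())     // ["child", "parent"]
print(sync.unsyncedRecordIDs.isEmpty)  // true
```

Because the parent is now in the queue, the child's retry eventually finds it, and both records land in the toy database.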

Workaround

Calling deleteLocalData() or deleting and reinstalling the app forces a full re-sync from CloudKit, which recovers the stuck records.
