release: v0.20260313.0 - SQLite Memory & SIF Reliability #125

Closed
ranaroussi wants to merge 11 commits into rc from develop

@ranaroussi
Member

Summary

Five bug fixes addressing SQLite memory retrieval and SIF container reliability.

Bug Fixes

  • SQLiteMemory search in single-user mode - search() and _search_internal() referenced a nonexistent default_user_id attribute. Memories were stored correctly but never retrieved. Fix: 4-way SQL branching when user_id=None.
  • Embedding model missing from SIF - all-MiniLM-L6-v2 was not pre-downloaded during Docker build. Fix: pre-download both models.
  • HuggingFace cache writes in read-only SIF — HF Hub wrote .no_exist cache files to read-only filesystem. Fix: HF_HUB_OFFLINE=1 + TRANSFORMERS_OFFLINE=1.
  • auto_decomposition default override — Hardcoded True overriding constructor False. Fix: defaults to enable_workflow_by_default.
  • sqlite-vec ELFCLASS32 on aarch64 — PyPI wheel ships 32-bit binary. Fix: compile from amalgamation source in Dockerfile.

Changes

  • 965e2b9 docs: add 0.20260313.0 changelog entry
  • 43c6970 fix: pre-download all-MiniLM-L6-v2 and set HF offline mode for SIF
  • 115caea fix: SQLiteMemory search fails in single-user mode (missing default_user_id)
  • 0580eb4 fix: compile sqlite-vec from source for aarch64 in Docker build
  • 8b4f5ee fix: auto_decomposition defaults to False unless explicitly enabled
  • 0b9fa1b Merge remote-tracking branch 'origin/main' into develop
  • 6c4c38b chore: merge main back to develop (v0.20260312.1) [skip ci]
  • 96c0314 chore: release v0.20260312.1
  • 7c7dd26 Merge branch 'rc'
  • 16b0ad5 chore: release v0.20260312.0

github-actions bot and others added 10 commits March 12, 2026 12:06
The overlord config loading path defaulted auto_decomposition to True,
overriding the constructor default of False. This caused simple messages
to trigger workflow planning even when the formation didn't configure it.

Now defaults to enable_workflow_by_default (False) unless the formation
explicitly sets overlord.workflow.auto_decomposition: true.
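The defaulting rule described above can be sketched as follows. This is a minimal illustration, not the code in overlord.py: the function name `resolve_auto_decomposition` and the plain-dict config shape are assumptions made for the example; only the key path `overlord.workflow.auto_decomposition` and the `enable_workflow_by_default` name come from the commit message.

```python
from typing import Any, Mapping


def resolve_auto_decomposition(
    formation_config: Mapping[str, Any],
    enable_workflow_by_default: bool = False,
) -> bool:
    """Return the explicit formation setting if present, else the default.

    Hypothetical helper: the real loader lives in overlord.py and may be
    structured differently.
    """
    workflow = formation_config.get("overlord", {}).get("workflow", {})
    # Only a value the formation actually sets may override the default.
    if "auto_decomposition" in workflow:
        return bool(workflow["auto_decomposition"])
    return enable_workflow_by_default
```

The point of the sketch is that the fallback is the constructor-level default rather than a literal `True`, so a formation that never mentions the key keeps workflow planning off.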

Also: surface persistent memory init failures in startup output instead
of silently swallowing them.

The sqlite-vec 0.1.6 PyPI package ships a 32-bit ARM binary in the
aarch64 wheel (known bug: asg017/sqlite-vec#211). This causes
'wrong ELF class: ELFCLASS32' at runtime, silently breaking persistent
memory in SIF containers on ARM64.
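One way to confirm this kind of mismatch before runtime is to read the ELF identification bytes of the shared object. The sketch below is not part of the PR; `elf_class` is a hypothetical helper. It relies only on the ELF header layout, where byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects.

```python
ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "ELFCLASS32", 2: "ELFCLASS64"}


def elf_class(path: str) -> str:
    """Return the ELF class name of a binary, based on its EI_CLASS byte."""
    with open(path, "rb") as fh:
        header = fh.read(5)  # 4 magic bytes followed by EI_CLASS
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError(f"{path} is not an ELF file")
    return ELF_CLASSES.get(header[4], "unknown")
```

Running such a check against the installed vec0.so during the image build would turn the runtime 'wrong ELF class' error into a build-time failure, assuming the build step asserts on the returned value.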

Fix: download the sqlite-vec amalgamation source and compile a correct
64-bit vec0.so during the Docker build (aarch64 only).

…ser_id)

In single-user mode, persistent_memory_manager passes user_id=None to
SQLiteMemory.search(). The search method tried to fall back to
self.default_user_id which doesn't exist, causing an AttributeError
that was silently caught. Memories were stored but never retrieved.

Fix: when user_id is None, search without user filtering (appropriate
for single-user mode). Removed the nonexistent default_user_id
references from both search() and _search_internal().

SQLiteMemory uses all-MiniLM-L6-v2 for local embeddings but it was not
pre-downloaded in the Docker build. In read-only SIF containers, HuggingFace
Hub failed with 'Read-only file system' when trying to download or cache
the model at runtime.

- Pre-download both embedding models at build time
- Set HF_HUB_OFFLINE=1 and TRANSFORMERS_OFFLINE=1 to prevent cache writes
- Bump version to 0.20260312.1

@greptile-apps

greptile-apps bot commented Mar 13, 2026

Greptile Summary

This release PR ships five targeted bug fixes for SQLite memory retrieval and SIF container reliability. The changes are generally correct and well-scoped, though two issues in the Dockerfile and one behavioral asymmetry in sqlite.py are worth addressing before or after merge.

Key changes:

  • sqlite.py — 4-way SQL branching in _search_internal: The core fix correctly adds a formation-scoped "search all users" branch when user_id=None, replacing the previously broken self.default_user_id reference. However, get_recent_memories() still falls back to default_user_id in the same user_id=None case, creating a behavioral inconsistency between the two methods.
  • overlord.py - auto_decomposition default: One-line fix replacing the hardcoded True with self.enable_workflow_by_default; clean and correct.
  • initialization.py — persistent memory init error visibility: Adds a print to surface init failures in console output alongside the existing observability event; straightforward improvement.
  • Dockerfile — sqlite-vec aarch64 recompilation: The approach of compiling from the amalgamation source is sound for working around the ELFCLASS32 PyPI wheel bug. The destination path /install/lib/python3.10/site-packages/... is hardcoded and fragile against Python version bumps, and the tarball download lacks a SHA-256 integrity check.
  • Dockerfile — HuggingFace offline mode: HF_HUB_OFFLINE=1 and TRANSFORMERS_OFFLINE=1 are set at image level rather than entrypoint level, which may silently break non-SIF Docker deployments that need to load models not baked into the image.

Confidence Score: 3/5

  • Safe to merge for SIF deployments; two Dockerfile issues (hardcoded Python path, missing tarball checksum) and a search/get_recent_memories behavioral asymmetry should be tracked as follow-ups.
  • All five described bugs have correct fixes. The main concerns are: (1) no SHA-256 check on the sqlite-vec tarball download introduces a supply-chain risk in the build, (2) the hardcoded python3.10 path in the cp command will silently fail on a Python version bump, and (3) get_recent_memories() now behaves differently from search() when user_id=None, which could confuse callers expecting symmetric semantics. None of these are regressions from the previous state, and the core memory retrieval and workflow fixes are sound.
  • Pay close attention to Dockerfile (tarball integrity and hardcoded Python path) and src/muxi/runtime/services/memory/sqlite.py (get_recent_memories inconsistency with search).

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/muxi/runtime/services/memory/sqlite.py | Fixes single-user mode search by removing the nonexistent default_user_id fallback in search() and adding a 4-way SQL branch in _search_internal(); however, get_recent_memories() still falls back to default_user_id, creating a behavioral asymmetry when user_id=None. |
| Dockerfile | Adds aarch64 sqlite-vec recompilation from source and pre-downloads the missing all-MiniLM-L6-v2 model; the compiled .so path hardcodes python3.10, and the tarball download lacks a checksum verification step. HF_HUB_OFFLINE=1 is set image-wide, which may unexpectedly break non-SIF Docker deployments. |
| src/muxi/runtime/formation/overlord/overlord.py | Fixes auto_decomposition defaulting to hardcoded True by substituting self.enable_workflow_by_default; clean one-liner change with no side effects. |
| src/muxi/runtime/formation/initialization.py | Adds a print statement in the persistent memory init error handler so failures are visible in console output; InitEventFormatter is already imported so no risk of secondary failures. |
| CHANGELOG.md | Documentation-only addition of the v0.20260313.0 changelog entry; accurately describes all five bug fixes in the release. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["search(query, user_id=None/str, collection=None/str)"] --> B{user_id provided?}
    B -- Yes --> C["get_or_create_user(user_id)\ninternal_user_id = int"]
    B -- No --> D["internal_user_id = None"]
    C --> E["_search_internal(embedding, k, collection, user_id)"]
    D --> E

    E --> F{collection AND user_id?}
    F -- Yes --> G["SQL: WHERE collection=? AND user_id=? AND formation_id=?"]
    F -- No --> H{collection only?}
    H -- Yes --> I["SQL: WHERE collection=? AND formation_id=?"]
    H -- No --> J{user_id only?}
    J -- Yes --> K["SQL: WHERE user_id=? AND formation_id=?"]
    J -- No --> L["SQL (new): WHERE formation_id=? — all users in formation"]

    G --> M[Return results]
    I --> M
    K --> M
    L --> M

    style L fill:#d4edda,stroke:#28a745,color:#000
    style D fill:#d4edda,stroke:#28a745,color:#000
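
The four branches in the flowchart can be sketched as a small filter builder. This is an illustrative reading of the diagram, not the code in sqlite.py: `build_search_filter` is a hypothetical name, and using `is not None` rather than truthiness is an assumption made so that an internal user ID of 0 would still count as provided.

```python
from typing import Any, List, Optional, Tuple


def build_search_filter(
    formation_id: str,
    collection: Optional[str] = None,
    user_id: Optional[int] = None,
) -> Tuple[str, List[Any]]:
    """Return a WHERE clause and its parameters for one of the four cases."""
    if collection is not None and user_id is not None:
        return (
            "WHERE collection = ? AND user_id = ? AND formation_id = ?",
            [collection, user_id, formation_id],
        )
    if collection is not None:
        return "WHERE collection = ? AND formation_id = ?", [collection, formation_id]
    if user_id is not None:
        return "WHERE user_id = ? AND formation_id = ?", [user_id, formation_id]
    # New branch: no user filter, scoped to the formation only.
    return "WHERE formation_id = ?", [formation_id]
```

Every branch keeps the formation_id condition, so the "all users" case still cannot read memories from another formation.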

Comments Outside Diff (1)

  1. src/muxi/runtime/services/memory/sqlite.py, line 516-523 (link)

    search() and get_recent_memories() behave differently when user_id=None

    After this fix, calling search(query) with no user_id will return results from all users in the formation (the new else branch in _search_internal), while calling get_recent_memories() with no user_id still falls back to self.default_user_id and returns only the default user's memories (lines 753–755).

    This asymmetry is unlikely to be intentional and may cause confusion: a single-user deployment that stores a memory via add() (which also uses default_user_id at line 419) will find it through search() — but the same query through get_recent_memories() could return an empty list if the default user was recreated with a new internal ID. Consider applying the same "search all formation users when user_id=None" approach to get_recent_memories() to keep the semantics consistent:

    # get_recent_memories() lines 744-755
    if user_id:
        cursor = self.conn.execute(
            "SELECT u.id FROM users u "
            "JOIN user_identifiers ui ON u.id = ui.user_id "
            "WHERE ui.identifier = ? AND ui.formation_id = ?",
            (user_id, self.formation_id),
        )
        user_row = cursor.fetchone()
        internal_user_id = user_row[0] if user_row else self.default_user_id
    else:
        internal_user_id = None  # search all formation users, same as search()

    And then branch the SELECT query the same way _search_internal does.
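
A minimal sketch of what that branching could look like for a recency query follows. The `memories` table, its columns, and the `recent_memories` function are assumptions invented for the example; the actual schema in sqlite.py is not shown in this review.

```python
import sqlite3
from typing import Any, List, Optional


def recent_memories(
    conn: sqlite3.Connection,
    formation_id: str,
    internal_user_id: Optional[int] = None,
    limit: int = 10,
) -> List[str]:
    """Return recent memory contents, newest first, optionally for one user."""
    sql = "SELECT content FROM memories WHERE formation_id = ?"
    params: List[Any] = [formation_id]
    if internal_user_id is not None:
        # Same rule as search(): filter by user only when one is given.
        sql += " AND user_id = ?"
        params.append(internal_user_id)
    sql += " ORDER BY created_at DESC, id DESC LIMIT ?"
    params.append(limit)
    return [row[0] for row in conn.execute(sql, params)]


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway in-memory database with the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE memories (id INTEGER PRIMARY KEY, formation_id TEXT, "
        "user_id INTEGER, content TEXT, created_at INTEGER)"
    )
    conn.executemany(
        "INSERT INTO memories (formation_id, user_id, content, created_at) "
        "VALUES (?, ?, ?, ?)",
        [("f1", 1, "a", 1), ("f1", 2, "b", 2), ("f2", 1, "c", 3)],
    )
    return conn
```

With this shape, a call without a user ID returns rows for every user in the formation, which matches the behavior search() has after this PR.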

Prompt To Fix All With AI
This is a comment left during a code review.
Path: Dockerfile
Line: 47-50

Comment:
**No integrity verification on downloaded tarball**

The sqlite-vec amalgamation tarball is downloaded from GitHub over HTTPS and compiled immediately, but there is no checksum verification step. If the GitHub release artifact or the CDN serving it were ever replaced by a malicious build (typosquatting, compromised release, MitM on the CDN edge), the container would silently compile and install the attacker's code.

Consider adding an expected SHA-256 check after the download:

```
python -c "import urllib.request; urllib.request.urlretrieve('https://github.com/asg017/sqlite-vec/releases/download/v${VEC_VERSION}/sqlite-vec-${VEC_VERSION}-amalgamation.tar.gz', 'sqlite-vec.tar.gz')" && \
echo "<expected-sha256>  sqlite-vec.tar.gz" | sha256sum -c - && \
tar xzf sqlite-vec.tar.gz && \
```

The expected SHA-256 can be fetched from the GitHub release's checksum file or computed locally against a known-good download.
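
The same check can be expressed in Python if the build already uses Python for the download. This is a sketch with hypothetical helper names (`sha256_of`, `verify_sha256`); it uses only the standard-library `hashlib` module, and the expected digest would still have to be pinned by the maintainers.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large tarballs are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path: str, expected_hex: str) -> None:
    """Raise ValueError if the file's digest differs from the pinned one."""
    actual = sha256_of(path)
    if actual != expected_hex.strip().lower():
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
```

Raising on mismatch makes the build step exit non-zero, which stops the image build before the tarball is extracted or compiled.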

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: Dockerfile
Line: 50

Comment:
**Hardcoded `python3.10` path is fragile**

The destination path `/install/lib/python3.10/site-packages/sqlite_vec/vec0.so` is hardcoded to Python 3.10. If the base image (`FROM python:3.10-slim`) is ever bumped to Python 3.11 or 3.12, the actual `site-packages` will live under the new version directory, so this `cp` will either fail because the 3.10 directory no longer exists or, if a stale directory is present, write there and leave the broken aarch64 wheel in place.

Consider deriving the path at build time:

```suggestion
        python_ver=$(python -c "import sys; print(f'python{sys.version_info.major}.{sys.version_info.minor}')") && \
        cp vec0.so /install/lib/${python_ver}/site-packages/sqlite_vec/vec0.so && \
```
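
For reference, the path derivation in the suggestion can be sketched as a standalone function. `site_packages_under` is a hypothetical name, and the `<prefix>/lib/pythonX.Y/site-packages` layout is assumed from the path used in the Dockerfile.

```python
import sys
from pathlib import PurePosixPath


def site_packages_under(prefix: str) -> PurePosixPath:
    """Build <prefix>/lib/pythonX.Y/site-packages for the running interpreter."""
    version_dir = f"python{sys.version_info.major}.{sys.version_info.minor}"
    return PurePosixPath(prefix) / "lib" / version_dir / "site-packages"
```

Because the version comes from the interpreter that runs the build, the path follows the base image automatically when it is bumped.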

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: Dockerfile
Line: 128-129

Comment:
**`HF_HUB_OFFLINE=1` baked into the image may surprise non-SIF users**

Setting `HF_HUB_OFFLINE=1` and `TRANSFORMERS_OFFLINE=1` as image-level `ENV` instructions means every consumer of this image — including plain Docker deployments, CI pipelines, and development environments — will have HuggingFace's network access silently disabled. Any agent that tries to load a model not baked into the image will receive an opaque offline-mode error.

The comment correctly describes the *intent* (SIF read-only filesystem), but the fix is applied at the wrong layer. A safer approach is to set these variables only in the container entrypoint when running under Singularity/SIF (e.g. check `SINGULARITY_CONTAINER` or a custom `MUXI_SIF_MODE` env var), leaving the Docker image itself network-capable:

```bash
# In docker-entrypoint.sh
if [ -n "$SINGULARITY_CONTAINER" ] || [ "$MUXI_SIF_MODE" = "1" ]; then
    export HF_HUB_OFFLINE=1
    export TRANSFORMERS_OFFLINE=1
fi
```

If the image is *only* ever used in SIF contexts this is a non-issue, but worth a comment clarifying that intent so future maintainers don't spend time debugging mysterious model-load failures in Docker.
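
If the runtime preferred to apply the gate from Python rather than from the shell entrypoint, the logic could look like the sketch below. `apply_sif_offline_mode` is a hypothetical function, not something in this PR. It also assumes the call happens before any HuggingFace library is imported, since those libraries are understood to read these variables at import time.

```python
import os
from typing import MutableMapping


def apply_sif_offline_mode(env: MutableMapping[str, str] = os.environ) -> bool:
    """Enable HF offline mode only when running under Singularity/SIF.

    Returns True if offline mode was applied.
    """
    in_sif = bool(env.get("SINGULARITY_CONTAINER")) or env.get("MUXI_SIF_MODE") == "1"
    if in_sif:
        env["HF_HUB_OFFLINE"] = "1"
        env["TRANSFORMERS_OFFLINE"] = "1"
    return in_sif
```

Passing the environment as a parameter keeps the function testable with a plain dict and leaves ordinary Docker runs untouched.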

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 965e2b9

…conditional HF offline

- Add SHA-256 checksum verification for sqlite-vec amalgamation download
- Derive Python version dynamically instead of hardcoding python3.10
- Move HF_HUB_OFFLINE/TRANSFORMERS_OFFLINE from Dockerfile ENV to
  entrypoint, gated on SINGULARITY_CONTAINER or MUXI_SIF_MODE