Skip to content

Fix/micro learning#137

Merged
lpi-tn merged 15 commits intomainfrom
Fix/micro-learning
Mar 10, 2026
Merged

Fix/micro learning#137
lpi-tn merged 15 commits intomainfrom
Fix/micro-learning

Conversation

@lpi-tn
Copy link
Contributor

@lpi-tn lpi-tn commented Mar 10, 2026

This pull request refactors the handling of embedding model IDs throughout the micro-learning API and related services. The main change is to support multiple embedding models for a given collection or language, instead of just one, which improves flexibility and accuracy in querying and retrieving subject and context documents. Several functions and endpoints now accept and process lists of embedding model IDs, and related tests have been updated accordingly.

API and Endpoint Changes:

  • Updated get_subject_list and get_full_journey endpoints to retrieve and use a list of embedding model IDs instead of a single ID, ensuring all relevant models are considered when fetching subjects and context documents. [1] [2]

Service and Helper Refactoring:

  • Refactored collection_and_model_id_according_lang to return a list of EmbeddingModel objects, and updated its consumers to handle lists of IDs. [1] [2]
  • Modified get_subject_vector to accept a model name and retrieve vectors using all corresponding embedding models, improving subject vector retrieval accuracy. [1] [2] [3]

Database Query Updates:

  • Changed get_subject, get_subjects, and get_context_documents to accept and filter by lists of embedding model IDs, ensuring queries return results for all applicable models. [1] [2] [3] [4]
  • Updated get_embeddings_model_id_according_name to return a list of EmbeddingModel objects instead of a single ID, removing caching logic and simplifying the query. [1] [2]

Testing Adjustments:

  • Modified tests to mock and validate the new behavior with lists of embedding models, ensuring correctness of the updated endpoints and service logic. [1] [2] [3]

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the micro-learning flow to support multiple embedding models per collection/language by propagating lists of embedding model IDs through helper functions, DB queries, and API endpoints.

Changes:

  • Updated micro-learning endpoints to derive and use a list of embedding model IDs when fetching subjects and context documents.
  • Refactored SQL query helpers to accept embedding_models_ids lists and changed the embedding-model lookup to return multiple matches.
  • Updated tests and search subject-vector retrieval to work with the new “multiple models” behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
src/app/tests/api/api_v1/test_micro_learning.py Updates mocks for new list-of-models return type from helper.
src/app/services/sql_db/queries.py Refactors subject/context queries to filter by multiple embedding model IDs; changes model lookup to return a list.
src/app/services/search.py Updates subject-vector lookup to use model name → embedding models → list of IDs.
src/app/services/helpers.py Refactors collection_and_model_id_according_lang to return a list of EmbeddingModel objects.
src/app/api/api_v1/endpoints/micro_learning.py Updates endpoints to pass lists of embedding model IDs into DB query helpers.
Comments suppressed due to low confidence (1)

src/app/services/helpers.py:197

  • collection_and_model_id_according_lang now returns a list of EmbeddingModel objects, but the function name/docstring still refer to a single "model id". Rename the function (and update the docstring) to reflect the new semantics (e.g., "models"/"embedding_models") so callers don’t assume a single ID.
async def collection_and_model_id_according_lang(
    lang: str | None, sp
) -> tuple[Collection, list[EmbeddingModel]]:
    """
    Get the collection info and model id according to the language.
    Args:
        sp: The search service.
        lang: The language to get the collection info and model id for. If None, the default language is used wich is multilingual.

    Returns:
        A tuple of the collection info and model id.
    """

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

from sqlalchemy import func, select
from welearn_database.data.enumeration import Step
from welearn_database.data.models import (
from welearn_database.data.models import ( # FilterType,; FilterUsedInQuery,
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inline comment in the welearn_database.data.models import (# FilterType,; FilterUsedInQuery,) looks like leftover editing noise and makes the import harder to read. Please remove it (and if those symbols are still needed, import them cleanly or document the reason separately).

Suggested change
from welearn_database.data.models import ( # FilterType,; FilterUsedInQuery,
from welearn_database.data.models import (

Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@lpi-tn lpi-tn merged commit 5e18f51 into main Mar 10, 2026
3 checks passed
@lpi-tn lpi-tn deleted the Fix/micro-learning branch March 10, 2026 10:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants