Description
We propose enabling a standardized experience for users to bring and utilize their own Machine Learning (ML) models within the Texera platform. To achieve this, we need to adopt a unified protocol for the entire lifecycle of model saving, loading, and execution.
After evaluating several standards, we recommend integrating MLflow as a starting protocol for model management in Texera.
Motivation & User Persona
Texera currently serves two user groups with distinct needs:
- Students, who use the platform to learn the fundamentals of Machine Learning and Data Science.
- Bioinformatics researchers, who require heavy computation for tasks such as sequence alignment and "shallow" machine learning (e.g., Scikit-Learn models, classic statistical models).
There is currently no standardized way for these users to import and run pre-trained models seamlessly. Implementing a standard protocol will streamline this workflow and enhance Texera's extensibility.
Evaluation of Alternatives
We explored several options before selecting MLflow:
- Hugging Face:
- Pros: Mature ecosystem with excellent ease of use; the industry standard for LLMs.
- Cons: Primarily focused on LLMs and Deep Learning. It does not offer a comprehensive solution for managing the full lifecycle (storage to loading) of general-purpose or "shallow" ML models often used by our target audience.
- ONNX (Open Neural Network Exchange):
- Pros: Great interoperability for deep learning models.
- Cons: Heavily focused on Neural Networks, making it less suitable for the broad range of general ML libraries (like Scikit-Learn) that our biomedical users rely on.
- MLflow (Selected):
- Pros: Supports a wide variety of libraries including TensorFlow, PyTorch, and Scikit-Learn. Crucially, it manages the entire lifecycle from standardizing the storage format to loading the model for inference.
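To make the lifecycle concrete, here is a minimal sketch of the save/load round trip we would standardize on (the model, data, and path are illustrative):

```python
# Minimal sketch of the MLflow model lifecycle (model and path are
# illustrative). The same two calls work for TensorFlow, PyTorch, and
# Scikit-Learn models alike.
import mlflow.pyfunc
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

# Save: MLflow writes a self-describing directory containing an MLmodel
# metadata file, environment specs, and the serialized model.
mlflow.sklearn.save_model(model, "iris_model")

# Load: the generic pyfunc interface hides the underlying library, so
# downstream code only ever calls predict().
loaded = mlflow.pyfunc.load_model("iris_model")
print(loaded.predict(X[:5]))
```

The same pyfunc load-and-predict pattern applies regardless of which library produced the model, which is the property the proposed operator below relies on.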
Proposed Implementation
The integration will leverage two existing architectural features within Texera:
1. Model Storage (via LakeFS)
- We will utilize our existing LakeFS integration to store MLflow artifacts.
- Models will be stored similarly to how we handle datasets, with one key difference: we will enforce the MLflow protocol/structure on the files during upload to ensure compatibility (see the validation sketch after this list).
2. Model Execution (New Operator)
- We will introduce a new operator type: MLflow.
- It will be built on our existing Python Native Operator infrastructure.
- The operator will automatically load the model using the standard mlflow library and run inference against the input data stream (an operator sketch also follows this list).
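For the storage side, a hedged sketch of the upload-time check. Note that validate_mlflow_model is a hypothetical helper, not an existing Texera function, and the exact rules we enforce are still open; the core idea is that an MLflow model directory must contain an MLmodel YAML file declaring at least one flavor.

```python
# Hypothetical upload-time validation (validate_mlflow_model does not
# exist in Texera yet); run before committing uploaded files to LakeFS.
from pathlib import Path

import yaml

def validate_mlflow_model(model_dir: str) -> None:
    """Raise ValueError if model_dir is not a valid MLflow model."""
    root = Path(model_dir)
    mlmodel = root / "MLmodel"
    # Every MLflow model directory carries an MLmodel metadata file.
    if not mlmodel.is_file():
        raise ValueError("Not an MLflow model: missing MLmodel file")
    # The MLmodel file is YAML; it must declare at least one "flavor"
    # so we know how to load the model for inference later.
    meta = yaml.safe_load(mlmodel.read_text())
    if not meta or not meta.get("flavors"):
        raise ValueError("MLmodel file declares no flavors")
```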
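For the execution side, an illustrative sketch of the new operator. The class name, constructor, and process_table hook are stand-ins for Texera's actual Python Native Operator interface, which may differ; the real work reduces to two MLflow calls.

```python
# Illustrative only: operator name and hooks are stand-ins for Texera's
# Python Native Operator API.
import mlflow.pyfunc
import pandas as pd

class MLflowInferenceOperator:
    def __init__(self, model_uri: str):
        # model_uri would point at the MLflow artifacts stored in LakeFS.
        self.model_uri = model_uri
        self.model = None

    def open(self) -> None:
        # pyfunc yields a library-agnostic model object with predict(),
        # whether the model came from Scikit-Learn, PyTorch, or TensorFlow.
        self.model = mlflow.pyfunc.load_model(self.model_uri)

    def process_table(self, table: pd.DataFrame) -> pd.DataFrame:
        # Append predictions as a new column on each incoming batch.
        out = table.copy()
        out["prediction"] = self.model.predict(table)
        return out
```

A workflow would configure the operator with the URI of a model previously uploaded to LakeFS, and downstream operators would consume the augmented table.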
Impact / Priority
(P2) Medium – useful enhancement
Affected Area
Workflow Engine (Amber)

