[MICCAI 2025] Multi-Agent Collaboration for Integrating Echocardiography Expertise in Multi-Modal Large Language Models
We will complete the TODO list in the following weeks.
- [ ] Release the pipeline code.
This automatic pipeline converts echocardiography PDFs into a concept-indexed database (the EchoCardiography Expertise Database). It covers:
- Images with their captions
- Pure text content
- Subfigure detection and splitting
- Medical keywords for searchability
Please check out the sheets here.
- For books, we obtained either physical or digital copies (PDF versions) through our clinical collaborators.
- For clinical guidelines, we accessed the PDF versions under an Open Access license via the institutional library using Elsevier's Text and Data Mining (TDM) services.
All resources used in this work were obtained and processed solely for non-commercial research purposes. We strongly encourage others to obtain these resources through their institutional libraries or through appropriate purchasing channels, and to use them in accordance with licensing terms and strictly for non-commercial research purposes.
pip install mistralai
pip install openai
pip install pillow
pip install pandas
pip install tqdm
pip install qwen-vl-utils

Create a key.py file in the project root:
MISTRAL_KEY = "your-mistral-api-key"
OPENAI_KEY = "your-azure-openai-api-key"
QWEN_KEY = "your-qwen-api-key"

Do not commit key.py to version control. Add it to .gitignore.
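Before running the pipeline, it can help to confirm that key.py exists and that none of the three keys is still a placeholder. The following is a minimal sketch, not part of the released code: the helper name `load_keys` is hypothetical, and treating any value that starts with "your-" as a placeholder is an assumption based on the template above.

```python
import importlib.util
from pathlib import Path

# Key names taken from the key.py template above.
REQUIRED_KEYS = ("MISTRAL_KEY", "OPENAI_KEY", "QWEN_KEY")


def load_keys(path="key.py"):
    """Load key.py from `path` and return the required keys as a dict.

    Hypothetical helper: raises if the file is missing or if a key is
    absent, empty, or still set to a "your-..." placeholder.
    """
    key_file = Path(path)
    if not key_file.is_file():
        raise FileNotFoundError(f"{key_file} not found; create it first")

    # Import the file as a module without touching sys.path.
    spec = importlib.util.spec_from_file_location("key", key_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    keys = {}
    for name in REQUIRED_KEYS:
        value = getattr(module, name, None)
        # Assumption: template values all begin with "your-".
        if not isinstance(value, str) or not value or value.startswith("your-"):
            raise ValueError(f"{name} is missing or still a placeholder in {key_file}")
        keys[name] = value
    return keys
```

Calling `load_keys()` from the project root either returns the three keys or fails early with a message naming the key that needs attention.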
release_v1/
├── step1-mistral_ocr.py
├── step2-rawContentSplit.py
├── step3-checkSplitSubcaption.py
├── step4-splitSubfigure.py
├── step5-cleanTextandKeywords.py
├── key.py (create this - not in repo)
├── Todo/
│   ├── ENG/
│   │   ├── Guideline/
│   │   └── Textbook/
│   └── CHN/
│       ├── Guideline/
│       └── Textbook/
├── Intermediate/
│   └── [Generated during processing]
└── final_output/
    └── [Final results with keywords]
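The input folders in the tree above can be created in one go. This is a small sketch using only the standard library; the function name `create_input_dirs` is hypothetical. It only creates the four Todo/ folders, since the README does not say whether Intermediate/ and final_output/ are created by the scripts themselves.

```python
from pathlib import Path

# Input folders as listed in the project tree above.
INPUT_DIRS = [
    "Todo/ENG/Guideline",
    "Todo/ENG/Textbook",
    "Todo/CHN/Guideline",
    "Todo/CHN/Textbook",
]


def create_input_dirs(root="."):
    """Create the four input folders under `root` and return their paths.

    Existing folders are left untouched.
    """
    created = []
    for rel in INPUT_DIRS:
        path = Path(root) / rel
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Running `create_input_dirs()` from the project root is safe to repeat, because `exist_ok=True` makes it a no-op for folders that already exist.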
Execute steps sequentially:
# Step 1: OCR Processing
python step1-mistral_ocr.py
# Step 2: Content Splitting
python step2-rawContentSplit.py
# Step 3: Subcaption Detection
python step3-checkSplitSubcaption.py
# Step 4: Subfigure Splitting
python step4-splitSubfigure.py
# Step 5: Keyword Extraction
python step5-cleanTextandKeywords.py

Place PDF files in the appropriate directories:
- English Guidelines: ./Todo/ENG/Guideline/
- English Textbooks: ./Todo/ENG/Textbook/
- Chinese Guidelines: ./Todo/CHN/Guideline/
- Chinese Textbooks: ./Todo/CHN/Textbook/
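The five steps and the input folders above can be tied together in a small driver script. The sketch below is not part of the release: `find_pdfs` and `run_pipeline` are hypothetical names, and it assumes each step script takes no arguments and is run from the project root, as in the commands shown earlier. It refuses to start when no PDF is found, and it stops at the first step that fails.

```python
import subprocess
import sys
from pathlib import Path

# Step scripts in the order given by the README.
STEPS = [
    "step1-mistral_ocr.py",
    "step2-rawContentSplit.py",
    "step3-checkSplitSubcaption.py",
    "step4-splitSubfigure.py",
    "step5-cleanTextandKeywords.py",
]

INPUT_DIRS = [
    "Todo/ENG/Guideline",
    "Todo/ENG/Textbook",
    "Todo/CHN/Guideline",
    "Todo/CHN/Textbook",
]


def find_pdfs(root="."):
    """Map each input folder to the sorted list of *.pdf files inside it."""
    found = {}
    for rel in INPUT_DIRS:
        folder = Path(root) / rel
        # Note: "*.pdf" is case-sensitive on most Linux filesystems.
        found[rel] = sorted(folder.glob("*.pdf")) if folder.is_dir() else []
    return found


def run_pipeline(root=".", runner=subprocess.run):
    """Run the five steps in order from `root`.

    `runner` defaults to subprocess.run; with check=True a non-zero exit
    code raises CalledProcessError, so later steps are not executed.
    """
    if not any(find_pdfs(root).values()):
        raise FileNotFoundError("no PDF files found under Todo/")
    for step in STEPS:
        runner([sys.executable, step], cwd=root, check=True)


if __name__ == "__main__":
    run_pipeline()
```

Using `sys.executable` keeps every step in the same Python environment where the dependencies were installed.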
Final outputs are stored in ./final_output/ with the following structure:
- Image-caption pairs with keywords
- Text content with keywords
- Split subfigure images
- Recheck files for manual validation
MIT License. Copyright (c) 2025 The Hong Kong University of Science and Technology
Contact yqinar@connect.ust.hk for any questions or feedback.