sethigeet/LG-SAM

Language Guided SAM - Advanced AI/ML Pipeline for Remote Sensing Image Analysis

🌟 Key Features

  • Multi-Modal AI Models: Integration of cutting-edge models including Qwen3-VL, Falcon, EarthMind, and RemoteSAM
  • Advanced Pipelines: Specialized pipelines for remote sensing tasks combining multiple AI models
  • Visual Question Answering: Natural language querying capabilities for satellite imagery
  • Image Classification: Automated classification between SAR and optical satellite imagery
  • FastAPI Backend: Production-ready REST API for model serving
  • Comprehensive Evaluation: Built-in metrics and testing frameworks for model performance assessment

🏗️ Architecture Overview

LG-SAM/
├── models/           # Model wrappers and implementations
├── pipelines/        # Task-specific pipeline combinations
├── vqa/              # Visual Question Answering modules
├── utils/            # Utility functions and metrics
├── app.py            # FastAPI application
├── api.py            # API endpoints
└── requirements.txt  # Dependencies

The framework follows a modular architecture where individual AI models are wrapped in standardized interfaces and combined into sophisticated pipelines for complex computer vision tasks.
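To illustrate the modular design described above, here is a minimal sketch of what a standardized wrapper interface and pipeline composition could look like. The class and method names (`ModelWrapper`, `Pipeline`, `predict`) are illustrative assumptions, not taken from the repository.

```python
# Hypothetical sketch of a standardized model wrapper interface.
# Names are illustrative assumptions; see models/ and pipelines/ for
# the actual implementations.
from abc import ABC, abstractmethod
from typing import Any


class ModelWrapper(ABC):
    """Common interface every wrapped model exposes to the pipelines."""

    def __init__(self, device: str = "cuda") -> None:
        self.device = device

    @abstractmethod
    def predict(self, image: Any, prompt: str) -> list[dict]:
        """Run inference on one image and return a list of result dicts."""


class Pipeline:
    """Chains wrapped models; each stage consumes the previous stage's output."""

    def __init__(self, stages: list[ModelWrapper]) -> None:
        self.stages = stages
```

Wrapping each model behind one interface is what lets pipelines mix, say, a vision-language model for grounding with SAM for segmentation without either knowing the other's internals.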

🚀 Installation

Automated Setup

# Run the setup script
chmod +x setup.sh
./setup.sh

The setup script will:

  1. Create a virtual environment using uv
  2. Install all dependencies
  3. Clone and install SAM3
  4. Extract model checkpoints
  5. Start the FastAPI server

API Usage

uv run uvicorn app:app --host 0.0.0.0 --port 8001

The API will be available at http://localhost:8001.

The API endpoints are:

  • POST /classify-image/ — Classifies the uploaded satellite image as SAR or OPTICAL.

  • POST /caption-query/ — Generates a descriptive caption for the uploaded image.

  • POST /binary-query/ — Answers yes/no (binary) questions about the uploaded image.

  • POST /semantic-query/ — Answers open-ended questions about scene features in the uploaded image.

  • POST /numeric-query/ — Returns numerical answers such as counts or quantities from the image.

  • POST /grounding-query/ — Generates bounding boxes and grounding results based on the prompt and image content.
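The endpoints above can be called from any HTTP client. Below is a minimal Python sketch using `requests`; the multipart field names (`file`, `question`) are assumptions — check the interactive docs at http://localhost:8001/docs for the exact request schema.

```python
# Minimal client sketch for the API. Field names ("file", "question")
# are assumptions about the FastAPI request schema, not confirmed values.
import requests

API_BASE = "http://localhost:8001"


def endpoint(name: str) -> str:
    """Build the full URL for a named endpoint, e.g. 'binary-query'."""
    return f"{API_BASE}/{name}/"


def binary_query(image_path: str, question: str) -> dict:
    """POST an image plus a yes/no question to /binary-query/."""
    with open(image_path, "rb") as f:
        resp = requests.post(
            endpoint("binary-query"),
            files={"file": f},
            data={"question": question},
        )
    resp.raise_for_status()
    return resp.json()
```

The other endpoints follow the same pattern, differing only in the endpoint name and the query fields they expect.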

Direct Pipeline Usage

from pipelines.remotesam_sam3 import RemoteSAMSAM3Pipeline

# Initialize pipeline
pipeline = RemoteSAMSAM3Pipeline(device="cuda")

# Process image with text prompt
results = pipeline.process_image(
    image="satellite.jpg",
    text_prompt="locate the airport runway",
)

print(f"Found {len(results)} objects")
for result in results:
    print(f"Score: {result['score']:.3f}")
    print(f"Bounding box: {result['oriented_bbox']}")

📊 Evaluation

Running Tests

# Test pipelines
python test_pipeline.py \
    --annotations_dir data/vrsbench/annotations \
    --images_dir data/vrsbench/images \
    --num_gpus 4 --num_workers_per_gpu 2 \
    --num_images 100 --batch_size 4 \
    --viz_dir results

# VQA evaluation
python vqa/object_count.py annotations.json results.json data/vrsbench/images
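For object-count VQA, a natural metric is exact-match accuracy between ground-truth and predicted counts. Here is a hedged sketch of such a metric; the JSON structure and field names (`image_id`, `count`) are assumptions, not the repository's actual format.

```python
# Sketch of a count-accuracy metric in the spirit of vqa/object_count.py.
# The JSON field names ("image_id", "count") are assumptions.
import json


def load_counts(path: str, key: str = "count") -> dict[str, int]:
    """Load {image_id: count} from a JSON list of records."""
    with open(path) as f:
        records = json.load(f)
    return {r["image_id"]: int(r[key]) for r in records}


def count_accuracy(gt: dict[str, int], pred: dict[str, int]) -> float:
    """Fraction of images where the predicted count exactly matches."""
    if not gt:
        return 0.0
    matched = sum(1 for k, v in gt.items() if pred.get(k) == v)
    return matched / len(gt)
```

Exact match is strict for counting; a tolerance-based variant (e.g. off-by-one counts as correct) is a common relaxation for cluttered scenes.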

Built with ❤️ for advancing India's space technology capabilities

About

A set of novel pipelines that guide SAM (Segment Anything Model) via language to ground objects in any image
