A multimodal AI security pipeline for detecting visual prompt injections.

ca7ai/ImageWarden

ImageWarden™

ImageWarden™ is an AI security tool designed to detect Visual Prompt Injections and Adversarial Attacks in images before they are processed by Multimodal LLMs (like GPT-4o, Gemini 1.5, or Claude 3.5 Sonnet).

It acts as a firewall for your vision model, flagging images that contain hidden textual instructions or mathematical adversarial noise (e.g., the "Panda/Monkey" attack).

🚀 Key Features

  • Module 1: OCR Text Scanner (EasyOCR) - Detects hidden or low-contrast text instructions embedded in the image pixels (e.g., "Ignore previous instructions"). Optimized for CPU servers.
  • Module 2: Adversarial Noise Breaker - Generates a "sanitized" version of the image (via Gaussian blur/resizing) to disrupt fragile adversarial pixel patterns.
  • Module 3: Semantic Judge (CLIP) - Uses a lightweight, secure vision model (CLIP) to cross-reference the content. If the LLM sees a "Monkey" but CLIP sees a "Panda," the image is flagged.
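The ideas behind Modules 1 and 2 can be illustrated without the heavy dependencies. The sketch below is not ImageWarden's code: `flag_injection_text`, `gaussian_blur_3x3`, and the pattern list are hypothetical names and values chosen for illustration. It assumes the OCR step has already produced a list of text lines (for example, EasyOCR's `readtext(..., detail=0)` returns plain strings) and that the image is a small grayscale grid. In the real pipeline the blur would more likely come from OpenCV.

```python
import re

# Hypothetical phrase list; the patterns ImageWarden actually checks are not documented here.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"disregard (the )?(above|system prompt)",
]


def flag_injection_text(ocr_lines, patterns=INJECTION_PATTERNS):
    """Return the OCR lines that match any injection pattern (case-insensitive)."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    hits = []
    for line in ocr_lines:
        # Collapse runs of whitespace so unevenly spaced OCR output still matches.
        normalized = " ".join(line.split())
        if any(rx.search(normalized) for rx in compiled):
            hits.append(normalized)
    return hits


# 3x3 Gaussian kernel; the weights sum to 16.
KERNEL = ((1, 2, 1), (2, 4, 2), (1, 2, 1))


def gaussian_blur_3x3(gray):
    """Blur a grayscale image given as a list of rows; border pixels are clamped."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += KERNEL[dy + 1][dx + 1] * gray[yy][xx]
            out[y][x] = acc / 16
    return out


hits = flag_injection_text(["Hello", "IGNORE   previous instructions now"])
blurred = gaussian_blur_3x3([[0, 0, 0], [0, 16, 0], [0, 0, 0]])
```

The blur spreads a single bright pixel over its neighbours, which is the property the sanitizer relies on: isolated, high-frequency pixel changes are averaged away while the overall picture stays recognizable.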

⚙️ Installation (Headless / Server Support)

This project is optimized for Headless Environments (AWS EC2, Docker, Linux Servers). It uses opencv-python-headless and easyocr to perform efficiently on low-memory instances (like AWS t2.micro).

1. Clone the repo

git clone https://github.com/ca7ai/ImageWarden.git
cd ImageWarden

2. Install Dependencies

pip install --no-cache-dir -r requirements.txt

3. System Requirements (Linux/EC2 only)

If you encounter ImportError: libxcb.so.1, your server is missing basic system-level X11/GL libraries. Install them with the command for your distribution:

# Amazon Linux 2023 / Fedora / CentOS
sudo dnf install libxcb libX11 libXext libSM libXrender -y

# Ubuntu / Debian
# (on newer releases where libgl1-mesa-glx is no longer packaged, install libgl1 instead)
sudo apt-get update && sudo apt-get install libgl1-mesa-glx -y

Note on Memory: This tool uses EasyOCR (Low RAM) and CLIP (Medium RAM). It runs comfortably on 1GB RAM instances if Swap is enabled.

🛠️ Usage

Run the tool from the command line. Provide the image path, the suspect label via --check_label (what the LLM incorrectly claims the image shows), and the expected label via --true_label.

python main.py <image_path> --check_label "<suspect_label>" --true_label "<actual_label>"
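For readers wondering how a command line of this shape could be parsed, here is a minimal `argparse` sketch. It is an assumption about the interface, not the actual contents of main.py: only the positional image path and the two flag names come from the command above, while `build_parser`, the help strings, and the choice to make both flags required are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser matching the documented command-line shape."""
    parser = argparse.ArgumentParser(
        description="Scan an image for visual prompt injection."
    )
    parser.add_argument("image_path", help="Path to the image to scan")
    # Assumption: both labels are mandatory; the real tool may treat them as optional.
    parser.add_argument("--check_label", required=True,
                        help="Suspect label reported by the LLM")
    parser.add_argument("--true_label", required=True,
                        help="Label the image is expected to show")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(
    ["panda.jpg", "--check_label", "monkey", "--true_label", "panda"]
)
```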

Output:

  1. OCR: Checks for hidden text.
  2. Judge: CLIP confirms "This is 100% a Panda." -> [!] MISMATCH DETECTED.
  3. Sanitizer: Saves sanitized_panda.jpg. Upload this clean version to the LLM to verify if the "Monkey" response disappears.
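The Judge step can be pictured as a comparison between the suspect label and the label that scores highest for the image. The sketch below assumes the vision model has already produced one raw similarity score per candidate label (as CLIP-style models do). The function names, the 0.5 threshold, and the example scores are hypothetical and are not taken from ImageWarden.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities, subtracting the max for numerical stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def judge(label_scores, check_label, threshold=0.5):
    """Return (best_label, probability, mismatch).

    label_scores maps each candidate label to a raw image-text similarity score.
    A mismatch is reported when the top label differs from check_label and its
    probability reaches the (hypothetical) threshold.
    """
    labels = list(label_scores)
    probs = dict(zip(labels, softmax([label_scores[name] for name in labels])))
    best = max(probs, key=probs.get)
    mismatch = best != check_label and probs[best] >= threshold
    return best, probs[best], mismatch


# Made-up scores for illustration: the image looks far more like a panda.
best, prob, mismatch = judge({"panda": 28.0, "monkey": 19.0}, check_label="monkey")
if mismatch:
    print(f"[!] MISMATCH DETECTED: model sees '{best}' ({prob:.1%})")
```

Using a probability threshold rather than a bare argmax avoids flagging images where the model is itself unsure between the two labels.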

Example:

Example Input Image:

[Image: image2]

Example Output:

[Image: 34]

Example Input Image:

[Image: panda_bear]

source: Wikipedia [https://tinyurl.com/2h85ybux]

Example Output:

[Image: 99]

❓ Troubleshooting (EC2 / Linux Common Errors)

If you are running on a Free Tier EC2 instance (t2.micro / t3.micro), you may encounter resource limits. Here is how to fix them.

1. Error: Killed

The Problem: Your server ran out of RAM while loading the AI model.

The Fix: Enable swap memory (virtual RAM).

# Run these commands to add 2GB of Swap space
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# Make it permanent
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
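To confirm that the swap file is active, you can inspect `/proc/meminfo`. The helper below is a small sketch rather than part of ImageWarden. `parse_meminfo` and `swap_total_gib` are hypothetical names, and the sample text mimics what a 1GB instance with a 2GB swap file might report.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most keys)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def swap_total_gib(text):
    """Return the configured swap size in GiB (0 if no SwapTotal entry is present)."""
    return parse_meminfo(text).get("SwapTotal", 0) / (1024 ** 2)


# Illustrative sample; on a Linux server, read the real file instead:
#   with open("/proc/meminfo") as f:
#       print(swap_total_gib(f.read()))
sample = "MemTotal:         987654 kB\nSwapTotal:       2097148 kB\nSwapFree:        2097148 kB\n"
swap_gib = swap_total_gib(sample)
```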

2. Error: [Errno 28] No space left on device

The Problem: The default PyTorch install pulls in large GPU (CUDA) packages that fill up your 8GB disk.

The Fix: Install the CPU-only build and disable pip's cache.

# 1. Clear junk files to free up space
rm -rf ~/.cache/pip

# 2. Force install the lightweight CPU version
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu --no-cache-dir

# 3. Install the rest
pip install -r requirements.txt --no-cache-dir
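Before retrying the install, it can help to check how much disk space is actually free. This is a small standard-library sketch, not part of ImageWarden. The function names and the 2 GiB default are hypothetical, so choose a threshold that suits your instance.

```python
import shutil


def free_gib(path="."):
    """Return the free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / (1024 ** 3)


def enough_space(path=".", required_gib=2.0):
    """True if at least required_gib GiB are free (the threshold is an assumption)."""
    return free_gib(path) >= required_gib


print(f"Free space: {free_gib('.'):.2f} GiB")
```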

📂 Project Structure

ImageWarden/
├── main.py              # Core logic pipeline
├── requirements.txt     # Python dependencies (headless optimized)
├── .gitignore           # Ignores venv and generated images
├── LICENSE              # PolyForm Noncommercial License
└── README.md            # Documentation

📜 License

This project is licensed under the PolyForm Noncommercial License 1.0.0.

ImageWarden™ is a trademark of ca7ai (Calistus Christian).

Free for: Researchers, students, hobbyists, and non-profit organizations.

Commercial Use: If you want to use this code in a commercial product or business context, you must purchase a Commercial License. Please contact me via LinkedIn.
