ImageInsight

A Python pipeline for extracting visual activations from images, processing them, and generating semantic descriptions using a neural network model. The pipeline uses PyTorch, a pre-trained AlexNet model (DOI:10.1145/3065386), and a custom recurrent neural network (RNN) decoder composed of bidirectional gated recurrent units (GRUs). The activations from the penultimate layer of AlexNet are passed through fully connected (FC) layers and then decoded by the RNN into semantic descriptors of the image (e.g., is red, is green, is round). Finally, the penultimate layer of the RNN can be extracted for further evaluation.
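
The data flow described above can be illustrated with a minimal PyTorch sketch. This is not the ImageInsight model itself: the class name `DecoderSketch`, the layer sizes, the number of decoding steps, the vocabulary size, and the choice to repeat the FC output across time steps are all assumptions made for illustration.

```python
import torch
from torch import nn


class DecoderSketch(nn.Module):
    """Hypothetical decoder: FC layers followed by a bidirectional GRU."""

    def __init__(self, in_dim=4096, hidden=256, steps=8, vocab=100):
        super().__init__()
        self.steps = steps  # assumed number of descriptor tokens per image
        self.fc = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.gru = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, vocab)  # 2x: forward + backward states

    def forward(self, activations):
        x = self.fc(activations)                     # (batch, hidden)
        x = x.unsqueeze(1).repeat(1, self.steps, 1)  # (batch, steps, hidden)
        states, _ = self.gru(x)                      # (batch, steps, 2 * hidden)
        logits = self.out(states)                    # (batch, steps, vocab)
        return logits, states


# Random stand-in for AlexNet penultimate-layer activations (4096 values per image)
fake_activations = torch.randn(3, 4096)
logits, states = DecoderSketch()(fake_activations)
print(logits.shape, states.shape)
```

In this sketch, `states` plays the role of the RNN's penultimate layer (the representation just before the output projection), which is the kind of tensor one would keep for further evaluation.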

For more information, please refer to the accompanying paper (currently in progress; link to be added).

Features

Activation Extraction: Extract visual activations from images using a pre-trained AlexNet model.
Semantic Descriptions: Generate descriptions from the extracted activations using an RNN.
Device Support: Run on either GPU or CPU.
Configurable Model: Easily switch between AlexNet layers for activation extraction.

Installation

pip install imageinsight

Download ImageInsight Model

Please download the semantic model first; its file path is passed to ImageInsight later (see Usage).

Download Model from Google Drive

Usage

After installing the required dependencies, you can run the pipeline using your own set of images and a pre-trained model. Here's an example of how to use the pipeline:

   from ImageInsight import ImageInsight

   # Define paths and settings for running the pipeline
   model_path = "path/to/your/model.pt"  # Path to the pre-trained model
   use_gpu = False  # Set to True if a GPU is available for faster processing
   image_folder = "path/to/your/images"  # Folder containing the input images
   model_name = "alexnet"  # Name of the pre-trained model to use
   layer_index = 4  # The index of the layer from which activations will be extracted
   csv_output_path = "path/to/output/folder"  # Path to the folder where the CSV output will be saved
   csv_file_name = "visual_activations_output.csv"  # Name of the CSV file for the visual activations

   # Initialize the ImageInsight model with the path to the pre-trained model and GPU usage option
   insight = ImageInsight(model_path=model_path, use_gpu=use_gpu)

   # Run the pipeline
   visual_activations, semantic_activations, image_descriptions = insight.run_pipeline(
      image_folder=image_folder,
      model_name=model_name,
      layer_index=layer_index,
      csv_output_path=csv_output_path,
      csv_file_name=csv_file_name
   )

Expected Image Folder Structure

Objects/
├── Apple/
│   ├── Apple_1.jpg
│   ├── Apple_two.jpg
│   └── apple_3.jpg
├── Cats/
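
Before running the pipeline, it can help to confirm that a folder matches this layout. The sketch below uses only the standard library; the function name `list_images_by_category` and the set of accepted file extensions are assumptions, not part of ImageInsight.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your data
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_images_by_category(root):
    """Map each category subfolder name to its sorted list of image paths."""
    root = Path(root)
    categories = sorted((p for p in root.iterdir() if p.is_dir()), key=lambda p: p.name)
    result = {}
    for category_dir in categories:
        images = [
            f for f in category_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        ]
        result[category_dir.name] = sorted(images, key=lambda f: f.name)
    return result
```

Calling `list_images_by_category("path/to/your/images")` returns a dictionary such as `{"Apple": [...], "Cats": [...]}`, so an empty list flags a category folder with no usable images.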

Directory Structure

my_pipeline/
│
├── my_pipeline/
│   ├── __init__.py               # Package initialization
│   ├── main.py                   # Main pipeline logic
│   ├── models.py                 # Model definitions (e.g., ActivationToDescriptionModel)
│   ├── utils.py                  # Utility functions (e.g., image activation extraction)
│   └── tokenizer.py              # Tokenizer setup and handling
│
├── README.md                     # Project documentation
├── requirements.txt              # Python dependencies
└── setup.py                      # Packaging information for pip

Dependencies

torch
torchvision
transformers
Pillow
numpy
scikit-learn
matplotlib
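
A quick way to check that these packages are importable is sketched below, using only the standard library. Note that two of them are imported under a different name than they are installed under (Pillow as `PIL`, scikit-learn as `sklearn`). The helper name `missing_dependencies` is hypothetical.

```python
from importlib.util import find_spec

# Installed (pip) name -> import name
IMPORT_NAMES = {
    "torch": "torch",
    "torchvision": "torchvision",
    "transformers": "transformers",
    "Pillow": "PIL",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
}


def missing_dependencies(import_names=IMPORT_NAMES):
    """Return the pip names of packages that cannot be found for import."""
    return [dist for dist, module in import_names.items() if find_spec(module) is None]


print(missing_dependencies())
```

An empty list means every dependency was found; any names printed can be installed with pip.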

License

This project is licensed under the following terms:

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, subject to the following conditions:

Citation Requirement: Any use of the Software in research or publications must cite the following GitHub repository:
- GitHub Repository Link
Attribution: The above copyright notice, this permission notice, and the citation requirement must be included in all copies or substantial portions of the Software.

See the full LICENSE file for more details.

kikiluvbrains/ImageInsight
