🐞 InsectAgent

InsectAgent is a demo iOS application that takes a hybrid approach to insect recognition, combining a traditional vision model with an on-device multimodal large language model (MLLM).

This app runs fully on-device (iOS 18.2+), using:

- A ResNet18 classifier trained on the IP102 dataset
- Apple’s FastVLM MLLM

When the vision model's confidence is low, the app retrieves taxonomic knowledge and uses FastVLM to refine the prediction, mimicking how entomologists resolve ambiguity.
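The confidence check described above can be sketched in a few lines of Python. This is a minimal illustration, not the app's code: the threshold value, the `classify` function, and the `refine` callback (standing in for the knowledge-retrieval and FastVLM step) are all assumptions made for the example.

```python
from typing import Callable, Sequence

# Hypothetical cutoff; the README does not state the value the app uses.
CONFIDENCE_THRESHOLD = 0.6


def classify(
    candidates: Sequence[tuple[str, float]],
    refine: Callable[[Sequence[str]], str],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> tuple[str, bool]:
    """Return (label, used_mllm) for one image.

    `candidates` holds (species, confidence) pairs from the vision model.
    `refine` is a stand-in for the knowledge-retrieval + MLLM step.
    """
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    top_label, top_conf = ranked[0]
    if top_conf >= threshold:
        # Confident enough: skip the more expensive MLLM call.
        return top_label, False
    # Low confidence: hand the ranked candidate names to the refinement step.
    return refine([name for name, _ in ranked]), True


def pick_second(names: Sequence[str]) -> str:
    """Placeholder refinement that always picks the runner-up."""
    return names[1]


# Made-up candidates: one confident case, one ambiguous case.
print(classify([("species_a", 0.90), ("species_b", 0.10)], pick_second))
print(classify([("species_a", 0.40), ("species_b", 0.35)], pick_second))
```

In the first call the top confidence clears the threshold, so the refinement step is never invoked. In the second it does not, and the placeholder picks the runner-up.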

🧠 Paper Abstract (ISVLSI 2025)

Insect Agent: Improving Insect Recognition through Dynamic Information Augmentation with Multimodal Large Language Models

Insect recognition remains a critical challenge for biodiversity monitoring, conservation efforts, and agricultural sustainability. Current computer vision approaches struggle with accurate species identification due to subtle morphological differences.
Our analysis reveals that while vision classifiers often fail to predict the correct species as their top choice, the true species is usually included in the top-k predictions.
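To make the top-k observation concrete, here is a small sketch that compares top-1 and top-k accuracy. The logits and labels are invented for illustration and do not come from the paper; the function names are likewise hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k):
    """Indices of the k largest probabilities, highest first."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]


def top_k_accuracy(batch_logits, labels, k):
    """Fraction of samples whose true label is among the top-k predictions."""
    hits = sum(
        label in top_k(softmax(logits), k)
        for logits, label in zip(batch_logits, labels)
    )
    return hits / len(labels)


# Made-up logits for three images over four classes.
logits = [[2.0, 1.9, 0.1, -1.0], [0.2, 0.1, 3.0, 0.0], [1.0, 1.2, 1.1, 0.9]]
labels = [1, 2, 2]
print(top_k_accuracy(logits, labels, k=1))
print(top_k_accuracy(logits, labels, k=2))
```

With this toy data only one of the three images is correct at top-1, while all three true labels appear in the top-2. That gap is the kind a second refinement stage could close.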

We introduce Insect Agent, a two-stage framework:

1. A vision classifier proposes candidate species with confidence scores.
2. If confidence is low, the system retrieves relevant expert knowledge and invokes a multimodal language model (MLLM) to refine the prediction.
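One way to picture the second stage is as prompt construction: the candidate names and the retrieved notes are combined into a single text query for the MLLM. The sketch below assumes a simple dictionary lookup as the knowledge source. The `KNOWLEDGE` table, its placeholder entries, the prompt wording, and the function names are all hypothetical and do not reflect the paper's retrieval method or prompt.

```python
# Hypothetical knowledge base; entries are placeholders, not real taxonomy.
KNOWLEDGE = {
    "species_a": "placeholder description of species A",
    "species_b": "placeholder description of species B",
}


def retrieve(names, knowledge=KNOWLEDGE):
    """Look up a description for each candidate, skipping unknown names."""
    return {name: knowledge[name] for name in names if name in knowledge}


def build_prompt(candidates, knowledge=KNOWLEDGE):
    """Format candidates and retrieved notes into a single text prompt."""
    names = [name for name, _ in candidates]
    notes = retrieve(names, knowledge)
    lines = ["Which of these species does the image show?"]
    for name, conf in candidates:
        note = notes.get(name, "no description available")
        lines.append(f"- {name} (classifier confidence {conf:.2f}): {note}")
    lines.append("Answer with exactly one of the names above.")
    return "\n".join(lines)


# Made-up candidates; "species_c" has no entry in the knowledge base.
print(build_prompt([("species_a", 0.41), ("species_c", 0.38)]))
```

Restricting the answer to the listed names keeps the MLLM's output within the classifier's candidate set, which fits the observation that the true species is usually among the top-k predictions.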

This dynamic invocation strategy minimizes computational cost while improving classification accuracy. Our experiments show that Insect Agent improves performance by 14.24% on average compared to vision-only models.

📱 Application Demo

The demo includes 3 sample images from the IP102 dataset. These can be used to test the full pipeline on-device.

⚡️ Pretrained FastVLM Models

InsectAgent supports multiple pre-trained FastVLM variants:

| Model | Size | Use Case |
| --- | --- | --- |
| FastVLM 0.5B | Small | Fastest and smallest – ideal for mobile |
| FastVLM 1.5B | Medium | Balanced in speed and accuracy |
| FastVLM 7B | Large | Most accurate – best for powerful devices |

🔽 Download Instructions

Use the get_pretrained_mlx_model.sh script:

Make the script executable:
```
chmod +x get_pretrained_mlx_model.sh
```

Download the desired model:

./get_pretrained_mlx_model.sh --model 0.5b --dest ./FastVLM/model

Open the project in Xcode, build, and run.

To switch models, rerun the script with a different --model value (e.g., --model 1.5b) and rebuild.

📄 License

This project includes software and model components from Apple’s FastVLM and related research projects.

Please refer to the following files for licensing and attribution:

- LICENSE
- LICENSE_MODEL
- ACKNOWLEDGEMENTS

⚠️ Model weights are provided for non-commercial research use only.

📚 Citation

If you use this work or build on it, please cite:

@inproceedings{insectAgent,
  title={Insect Agent: Improving Insect Recognition via Dynamic Knowledge Augmentation Using Multimodal Large Language Models},
  author={Zhao*, Shu and Narayanan Sridhar*, Ajay and Patch, Harland and Narayanan, Vijaykrishnan},
  booktitle={2025 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)},
  year={2025},
  organization={IEEE},
  note={(* indicates equal contribution)}
}
