HotkeyTagger

HotkeyTagger is a hackathon project for manually classifying images faster and more easily. It was built for a short 3-hour hackathon run by c0mpiled and the Penn State Builder's Collective on February 27, 2026.

Simply run hotkey_tagger.py and point it at a folder of images to get started. The hotkeys are customizable, and the classifications are saved in a CSV file in the same folder as the images. You should be able to add keybindings at any time without breaking the CSV. The purpose of the program is to classify images more quickly and easily for later use in supervised machine learning. The project is written in Python and uses PyQt5 for the GUI.
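As a rough illustration of how a CSV can keep working when new keybindings (and therefore new labels) appear mid-session, here is a minimal sketch using only the standard library. It is not the code in csv_manager.py; the function name save_label, the "filename" column, and the one-column-per-label layout are all assumptions made for this example.

```python
import csv
from pathlib import Path


def save_label(csv_path, filename, label, value="1"):
    """Record `label` for `filename`, adding a new column if the label is new.

    Hypothetical helper: the real project may lay out its CSV differently.
    """
    csv_path = Path(csv_path)
    rows = {}
    fields = ["filename"]

    # Load whatever has been saved so far, keyed by image file name.
    if csv_path.exists():
        with csv_path.open(newline="") as f:
            reader = csv.DictReader(f)
            fields = list(reader.fieldnames or ["filename"])
            rows = {row["filename"]: row for row in reader}

    # A label bound to a brand-new hotkey simply becomes a new column.
    if label not in fields:
        fields.append(label)

    rows.setdefault(filename, {"filename": filename})[label] = value

    # Rewrite the file; restval fills the new column for older rows.
    with csv_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields, restval="")
        writer.writeheader()
        writer.writerows(rows.values())
```

Rewriting the whole file on every keystroke is simple but slow for very large image sets; it is used here only because it makes the "new column at any time" behavior easy to see.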

A brief demo (rushed, due to time constraints): https://www.youtube.com/watch?v=62CccaczzBQ

A brief slide deck giving an overview of the project and its features: https://pennstateoffice365-my.sharepoint.com/:p:/g/personal/jad507_psu_edu/IQBQE9y4SEuzTJRWDDnQN1taAfqEADTHDavQ7fp_GkdA6D0?e=zjhQbf

You can run datasetDL.py to download the datasets used in the demo to your local machine for testing.

Before the hackathon, while looking for potential teammates, I posted the following synopsis of the project:

I'm Jeffrey, in Astronomy and Astrophysics

I want to make a tool that makes it easier to hand-classify a portion of an image set, so that you can train on that portion and have the machine do the rest. I'm planning to use Python and a PyQt GUI that will load up the image, and you can bind classifications to each keystroke.

For test data, I am thinking of using the handwritten digit dataset from scikit-learn https://scikit-learn.org/1.5/auto_examples/datasets/plot_digits_last_image.html#sphx-glr-auto-examples-datasets-plot-digits-last-image-py (hotkeys would be either 1-0, or have 5678 bound to qwer), and the Olivetti Faces dataset https://scikit-learn.org/1.5/datasets/real_world.html#olivetti-faces-dataset (hotkeys for glasses, mustache, beard, etc)

A major feature I'd like to make sure gets finished is that the hotkey settings are saveable, so that you can work on hand-classifying for a few hours, shut down, and then pick up right where you left off.

A major question I haven't figured out yet is "what's the best format to save the classifications?" I can imagine it being easy to just shove it into a CSV or JSON, or maybe I'll finally mess around with PostgreSQL or something. It might also make sense to use metadata attached to the image, but I don't really understand how platform-compatible that is.

I haven't done a hackathon before, so I'm not sure how people successfully split the work under such tight time constraints, but if anyone's interested in joining me, I'd be happy to work with them.

Ultimately, it remained a solo project, and was made with the help of GitHub Copilot and Microsoft Copilot. Part of the inspiration for the project was League of Legends, which standardized ability hotkeys on the QWER keys, unlike its predecessor, DotA. I did not win any prizes during this hackathon, but I hope that this project can be of use to others (and I will definitely be using it, and may work on further improvements). I believe it to be platform agnostic, due to the use of pathlib's Path, but have not tested it on anything except my local Windows machine.
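To show how Path keeps file handling platform neutral, and how hotkey settings could be saved so a session can be resumed later, here is a small sketch. It is not the code in settings.py; the file name hotkey_settings.json, the function names, and the choice of JSON are assumptions made for this example.

```python
import json
from pathlib import Path

SETTINGS_NAME = "hotkey_settings.json"  # hypothetical file name


def save_bindings(image_dir, bindings):
    """Write a {key: label} mapping next to the images and return its path."""
    path = Path(image_dir) / SETTINGS_NAME  # "/" joins correctly on any OS
    path.write_text(json.dumps(bindings, indent=2, sort_keys=True), encoding="utf-8")
    return path


def load_bindings(image_dir):
    """Return the saved {key: label} mapping, or an empty dict on first run."""
    path = Path(image_dir) / SETTINGS_NAME
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def list_images(image_dir, extensions=(".png", ".jpg", ".jpeg")):
    """Return image files in the folder, sorted so the order is repeatable."""
    return sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
```

Calling save_bindings(folder, {"q": "glasses", "w": "beard"}) before quitting and load_bindings(folder) on the next launch would restore the same bindings, under the assumptions above.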
