A rule-based approach can serve as a baseline: flag an account as a bot if its email contains the word "bot" or "no-reply". However, some bot emails, such as tensorflow-gardener@tensorflow.org, match neither pattern and are hard to find this way, so an ML model should be applied to detect them. Features derived from each account's commit time series can be used as model inputs.
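
As a rough illustration, the rule-based baseline and a couple of possible commit-time-series features might look like the sketch below. The function names, the token list, and the two features (entropy of the commit hour and variability of the gaps between commits) are assumptions made for illustration; the note above does not specify which features to use.

```python
import math
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev

# Tokens named in the rule above; the tuple itself is an illustrative choice.
BOT_TOKENS = ("bot", "no-reply")


def is_bot_email_rule(email: str) -> bool:
    """Baseline rule: the email contains "bot" or "no-reply" (case-insensitive)."""
    lowered = email.lower()
    return any(token in lowered for token in BOT_TOKENS)


def commit_time_features(timestamps: list[datetime]) -> dict[str, float]:
    """Hypothetical features computed from one account's commit timestamps."""
    ordered = sorted(timestamps)
    total = len(ordered)

    # Entropy (in bits) of the hour-of-day distribution of commits.
    hour_counts = Counter(ts.hour for ts in ordered)
    hour_entropy = 0.0
    for count in hour_counts.values():
        p = count / total
        hour_entropy -= p * math.log2(p)

    # Coefficient of variation of the gaps between consecutive commits.
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    gap_mean = mean(gaps) if gaps else 0.0
    gap_cv = pstdev(gaps) / gap_mean if gap_mean > 0 else 0.0

    return {"hour_entropy": hour_entropy, "gap_cv": gap_cv}
```

The substring rule is deliberately crude: it misses tensorflow-gardener@tensorflow.org and can also flag human addresses that happen to contain "bot", which is why it is only useful as a baseline to compare a learned model against.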