Let's say I want to classify domain documents into about 48 categories. Should I build a dataset like RVL-CDIP? What is the proper DPI for the document images? Should I convert them to grayscale?
For reference, RVL-CDIP has 400,000 grayscale images in 16 classes, with 25,000 images per class.
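To put the numbers in perspective, here is a rough sketch of the arithmetic. The helper names are made up for illustration, the US Letter page size is an assumption, and copying RVL-CDIP's 25,000 images per class is only a reference point, not a requirement.

```python
def balanced_dataset_size(num_classes: int, per_class: int) -> int:
    """Total images in a class-balanced dataset."""
    return num_classes * per_class


def page_pixels(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a page scanned at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


# RVL-CDIP layout: 16 classes x 25,000 images per class.
rvl_cdip_total = balanced_dataset_size(16, 25_000)

# Same per-class count applied to 48 categories (assumption, for scale only).
my_total = balanced_dataset_size(48, 25_000)

# Assumed US Letter page (8.5 x 11 inches) at two candidate scan resolutions.
low_res = page_pixels(8.5, 11, 100)
high_res = page_pixels(8.5, 11, 300)

print(rvl_cdip_total, my_total, low_res, high_res)
```

So matching RVL-CDIP's per-class count across 48 categories would mean 1.2 million images, and going from 100 to 300 DPI multiplies the pixel count per page by nine.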