This is my attempt at the Object Localization problem from the Flipkart GRiD challenge.
The training data was provided by Flipkart and consisted of 14k images.
https://dare2compete.com/o/Flipkart-GRiD-Teach-The-Machines-2019-74928

The proposed solution is a CNN with 4 neurons in the output layer, one for each bounding box coordinate to be predicted, so the task is treated as a regression problem. The architecture is based on the popular VGG-16 architecture commonly used for image classification. The final layer has been modified to output the four bounding box coordinates, and its activation has been changed from softmax to ReLU. The model was trained on Google Colab for 10 epochs and achieved 79% accuracy on the test set.
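
To make the modified output layer concrete, here is a minimal pure-Python sketch of a 4-neuron regression head with ReLU activation and a mean squared error loss. It is not the actual training code: the names `regression_head` and `mse`, the 8-element feature vector, the target values, and the choice of MSE as the loss are all assumptions for illustration, and the VGG-16 convolutional layers that would produce the features are left out.

```python
import random


def relu(x):
    """ReLU activation: negative values are clipped to zero."""
    return max(0.0, x)


def regression_head(features, weights, biases):
    """Dense layer with one ReLU output per row of `weights`.

    With 4 rows this gives 4 outputs, one per bounding box coordinate.
    """
    return [
        relu(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(weights, biases)
    ]


def mse(pred, target):
    """Mean squared error, a common regression loss (assumed here)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


if __name__ == "__main__":
    random.seed(0)
    # Stand-in for the flattened CNN features (hypothetical size of 8).
    features = [random.random() for _ in range(8)]
    # 4 output neurons, each with one weight per feature.
    weights = [[random.uniform(-1, 1) for _ in features] for _ in range(4)]
    biases = [0.0] * 4

    pred = regression_head(features, weights, biases)
    target = [0.1, 0.2, 0.6, 0.8]  # made-up box coordinates
    print("predicted box:", pred)
    print("loss:", mse(pred, target))
```

ReLU keeps every output non-negative, which matches pixel coordinates, whereas softmax would force the four outputs to sum to 1.
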
Because only a limited number of images were available for training such a model, the accuracy is somewhat low.
More data augmentation might also have helped.
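
As one example of the kind of augmentation that could be tried, a horizontal flip doubles the number of training samples, but the bounding box has to be flipped together with the image. The sketch below is an assumption and not part of the original solution: the function name `hflip_with_box`, the image as a list of pixel rows, and the box order `(x_min, y_min, x_max, y_max)` with coordinates measured from the left and top image edges are all hypothetical.

```python
def hflip_with_box(image, box):
    """Flip an image left-to-right and mirror its bounding box.

    image: list of rows, each row a list of pixel values.
    box:   (x_min, y_min, x_max, y_max), assumed order, measured
           from the left/top edges of the image.
    """
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    x_min, y_min, x_max, y_max = box
    # A point at distance x from the left edge ends up at distance
    # width - x, so the old right side becomes the new left side.
    new_box = (width - x_max, y_min, width - x_min, y_max)
    return flipped, new_box


if __name__ == "__main__":
    image = [[1, 2, 3, 4],
             [5, 6, 7, 8]]
    box = (0, 0, 1, 2)  # covers the leftmost column
    flipped, new_box = hflip_with_box(image, box)
    print(flipped)   # [[4, 3, 2, 1], [8, 7, 6, 5]]
    print(new_box)   # (3, 0, 4, 2)
```

Only the x coordinates change under a horizontal flip; the y coordinates stay as they are.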

