We propose a general object counting method that does not use any prior category information. We learn from local image divisions to predict global image-level counts without using any form of local annotations. Our method separates the input image into a set of image divisions, each of which fully covers the image. Each image division is composed of a set of region proposals or uniform grid cells. The approach is trained end-to-end in a deep learning architecture to predict global image-level counts from these local divisions. It incorporates a counting layer that predicts object counts for the complete image by enforcing consistency in counts across overlapping image regions. The counting layer is based on the inclusion-exclusion principle from set theory. We analyze the individual building blocks of our approach on Pascal-VOC2007 and evaluate the method on the large-scale generic object data set MS-COCO as well as on three class-specific counting data sets: the UCSD pedestrian data set and the CARPK and PUCPR+ car data sets.
- counting with region proposals
- fully convolutional networks
- generic-class object counting
- inclusion-exclusion principle
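
As a minimal illustration of the inclusion-exclusion principle named in the abstract, the sketch below counts the distinct objects covered by several overlapping regions. It is not the counting layer itself: in the method, per-region counts are predicted by the network, whereas here each region is assumed to be a known set of hypothetical object identifiers. The names `union_count` and `regions` are illustrative assumptions.

```python
from itertools import combinations


def union_count(regions):
    """Count distinct objects covered by overlapping regions.

    Applies inclusion-exclusion: add the counts of single regions,
    subtract the counts of pairwise overlaps, add triple overlaps, etc.
    Each region is assumed to be a set of object identifiers.
    """
    total = 0
    for k in range(1, len(regions) + 1):
        # Odd-sized intersections are added, even-sized ones subtracted.
        sign = 1 if k % 2 == 1 else -1
        for subset in combinations(regions, k):
            total += sign * len(set.intersection(*subset))
    return total


# Hypothetical example: three overlapping regions over five objects.
regions = [{"a", "b", "c"}, {"c", "d"}, {"d", "e", "a"}]
print(union_count(regions))
```

Summing the three region counts alone would give 8, because objects lying in overlaps are counted more than once; subtracting the overlap counts removes that double counting and recovers the image-level count.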