Although the orientation and scale properties of objects in remote sensing images have been widely considered in modern deep-learning-based object detection methods, the spatial distribution of objects has rarely been investigated. There is a distinct difference in spatial distribution between close-range objects and remote sensing objects: the former may exhibit extensive mutual occlusion and overlap, whereas the latter rarely overlap. A remote sensing object detection algorithm that ignores this difference may unnecessarily apply massive anchor-based proposal bounding box generation and non-maximum suppression (NMS) operations. In this paper, considering the unique spatial distribution of remote sensing objects, as well as their other spatial properties, we propose a novel, compact, and spatial-oriented object detection framework for remote sensing images. The proposed two-stage convolutional neural network (CNN) framework, which we call RSADet (the Remote-sensing Spatial Adaptation DETector), considers the spatial distribution, scale, and orientation/shape variations of the objects in remote sensing images. In the first stage, each object instance is inferred on scale-attention-boosted CNN heatmaps to generate candidate bounding boxes, instead of using anchor-based proposal box generation and NMS. In the second stage, deformable convolutions are introduced to adapt to the geometric variations of different object instances and to reduce the impact of complex and changeable backgrounds. A new bounding box confidence (IoU score) prediction branch is introduced as a convenient constraint for eliminating unreliable boxes and improving performance. Experiments were conducted on a large single-class remote sensing object detection dataset (the Ningbo Pylon dataset), built as part of this study, and on a very large open-source multi-class dataset (the DIOR dataset).
Compared with advanced detectors from both the computer vision and remote sensing communities, the proposed RSADet achieved state-of-the-art performance on both datasets.
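To make the anchor-free, NMS-free idea more concrete, the following is a minimal sketch and not the RSADet implementation. It assumes the first stage outputs a per-class heatmap of object-centre scores and that candidates are taken as 3x3 local maxima above a threshold, which is one common way to read instances off a heatmap. The function names `heatmap_peaks` and `box_iou`, the 3x3 window, and the 0.3 threshold are hypothetical choices for illustration. `box_iou` shows the quantity that an IoU-score branch would be trained to predict for each box.

```python
def heatmap_peaks(heatmap, threshold=0.3):
    """Return (row, col, score) for each cell that is a 3x3 local maximum.

    Assumption: `heatmap` is a 2D list of centre scores in [0, 1].
    Cells on a flat plateau all count as peaks in this simple sketch.
    """
    rows = len(heatmap)
    cols = len(heatmap[0]) if rows else 0
    peaks = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue  # too weak to be a candidate
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score >= n for n in neighbours):
                peaks.append((r, c, score))
    return peaks


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


# Toy 4x4 heatmap with two well-separated objects.
demo = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.2, 0.0],
    [0.0, 0.2, 0.1, 0.6],
    [0.0, 0.0, 0.1, 0.2],
]
candidates = heatmap_peaks(demo)
```

Because remote sensing objects rarely overlap, each peak can be treated as one instance, so no pairwise suppression between candidates is needed in this sketch.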
Please cite our paper if you find it helpful in your research: Dawen Yu, Shunping Ji. A New Spatial-Oriented Object Detection Framework for Remote Sensing Images[J]. IEEE Transactions on Geoscience and Remote Sensing (in press).
Figure. (left) The geographic locations of the images in the Ningbo Pylon dataset, and (right) the scale distribution of the object bounding boxes according to the MS COCO partition criteria.
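As a rough illustration of the MS COCO scale partition mentioned in the caption, the sketch below bins a bounding box by its pixel area using the COCO thresholds of 32² and 96². Two assumptions apply: the area is taken as box width times height (COCO itself measures segmentation area), and a box whose area falls exactly on a threshold is assigned to the larger bin. The function name `coco_scale` is hypothetical.

```python
SMALL_MAX_AREA = 32 ** 2    # below this area: "small"
MEDIUM_MAX_AREA = 96 ** 2   # below this area (and not small): "medium"


def coco_scale(width, height):
    """Classify a box as small, medium, or large by its pixel area."""
    area = width * height
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"
```

For example, `coco_scale(20, 20)` falls in the small bin and `coco_scale(100, 100)` in the large bin.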
Download the training dataset (images not cropped, VOC annotation style). Click HERE to download.
Download the test dataset (images not cropped, VOC annotation style). Click HERE to download.
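Since the annotations are described as VOC style, a minimal sketch of reading one annotation file with the Python standard library is given here. It assumes the usual Pascal VOC XML layout (`object` elements containing `name` and `bndbox` with `xmin`, `ymin`, `xmax`, `ymax`). The sample XML, the file name `example.jpg`, and the class label `pylon` are invented for illustration and may not match the actual files in the dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the standard Pascal VOC layout.
SAMPLE_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>1000</width><height>800</height><depth>3</depth></size>
  <object>
    <name>pylon</name>
    <bndbox><xmin>120</xmin><ymin>200</ymin><xmax>180</xmax><ymax>340</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        # Some tools write coordinates as floats, so convert via float first.
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"),) + coords)
    return boxes


boxes = parse_voc(SAMPLE_XML)
```

To read a file from disk instead, the same function can be called on the file's text content, for example `parse_voc(open(path, encoding="utf-8").read())`, where `path` points to one of the downloaded XML files.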