We introduce the OSV dataset, an omnidirectional image dataset of real street scenes with multi-class annotations for spherical object detection. The original RGB omnidirectional images were captured with a vehicle-borne PGR Ladybug3 camera in the cities of Kashiwa and Dagong, Japan. Objects of five classes of interest, comprising 1777 lights, 867 cars, 578 traffic signs, 867 crosswalks and 355 crosswalk warning lines (5636 objects in total), were manually labelled and cross-checked in the Driscoll-Healy images.
We provide the omnidirectional images in two formats, together with their category and object bounding-box annotations. The 2000×1000-pixel omnidirectional images (*.jpg) obtained by equirectangular projection and the corresponding annotation files (*.xml) are stored in the directory “equirectangular”. The processed 1000×1000-pixel Driscoll-Healy images and the corresponding annotation files are provided in the directory “DriscollHealy”. More details can be found in: Yu, Dawen, and Shunping Ji. "Grid Based Spherical CNN for Object Detection from Panoramic Images." Sensors 19.11 (2019): 2622.
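
For readers who want to relate pixel positions in the 2000×1000 equirectangular images to directions on the sphere, the following is a minimal sketch of the standard equirectangular mapping. The function names and the angle convention (longitude in [-pi, pi] from left to right, latitude from +pi/2 at the top to -pi/2 at the bottom, pixel centres at half-integer offsets) are assumptions made for illustration and are not taken from the dataset or the paper; the convention may need adjusting to match the annotation files.

```python
import math


def equirect_pixel_to_lonlat(u, v, width=2000, height=1000):
    """Map pixel coordinates (u, v) to (longitude, latitude) in radians.

    Assumed convention (hypothetical, not from the dataset documentation):
    longitude runs from -pi at the left image edge to +pi at the right edge,
    latitude runs from +pi/2 at the top edge to -pi/2 at the bottom edge,
    and integer (u, v) refer to pixel centres.
    """
    lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - (v + 0.5) / height) * math.pi
    return lon, lat


def lonlat_to_equirect_pixel(lon, lat, width=2000, height=1000):
    """Inverse of equirect_pixel_to_lonlat under the same assumed convention."""
    u = (lon / (2.0 * math.pi) + 0.5) * width - 0.5
    v = (0.5 - lat / math.pi) * height - 0.5
    return u, v
```

Under this convention the image centre corresponds to longitude 0 and latitude 0, and converting a pixel to angles and back returns the original pixel coordinates up to floating-point error.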