Pixel-wise semantic segmentation unifies distinct scene perception tasks in a coherent way, and has catalyzed notable progress in autonomous and assisted navigation, where a whole surrounding perception is vital. However, current mainstream semantic segmenters are normally benchmarked against datasets with narrow Field of View (FoV), and most vision-based navigation systems use only a forward-view camera. In this paper, we propose a Panoramic Annular Semantic Segmentation (PASS) framework to perceive the entire surrounding based on a compact panoramic annular lens system and an online panorama unfolding process. To facilitate the training of PASS models, we leverage conventional FoV imaging datasets, bypassing the effort entailed to create dense panoramic annotations. To consistently exploit the rich contextual cues in the unfolded panorama, we adapt our real-time ERF-PSPNet to predict semantically meaningful feature maps in different segments and fuse them to fulfill smooth and seamless panoramic scene parsing. Beyond the enlarged FoV, we extend focal length-related and style transfer-based data augmentations, to robustify the semantic segmenter against distortions and blurs in panoramic imagery. A comprehensive variety of experiments demonstrates the qualified robustness of our proposal for real-world surrounding understanding.
No information added