A Performance Comparison of Deep Learning Methods for Real-Time Localisation of Vehicle Lights in Video Frames

Chris Rapson1, Boon-chong Seet2, Muhammad Asif Naeem3, Jeong Eun Lee4, Reinhard Klette3

  • 1Auckland University of Technology
  • 2Auckland University of Technology
  • 3Auckland University of Technology
  • 4University of Auckland

Details

14:15 - 14:30 | Mon 28 Oct | The Great Room II | MoE-T3.2

Session: Regular Session on Object Detection and Classification (III)

Abstract

A vehicle's braking lights can help to infer its future trajectory. Visible light communication using vehicle lights can also transmit other safety information to assist with collision avoidance (whether the driver is human or the vehicle is autonomous). Both of these use cases require accurate localisation of vehicle lights by computer vision. Due to the large variation in lighting conditions (day, night, fog, snow, etc.), the shape and brightness of the light itself, as well as difficulties with occlusions and perspectives, conventional methods struggle and deep learning is a promising strategy. This paper presents a comparison of deep learning methods selected for their potential to process video in real time. The detection accuracy is shown to depend strongly on the size of the vehicle light within the image. A cascading approach is taken, where a downsampled image is used to detect vehicles, and then a second routine searches for vehicle lights at higher resolution within these Regions of Interest. This approach is demonstrated to improve detection, especially for small objects. Using YOLOv3 for the first stage and Tiny_YOLO for the second stage achieves satisfactory results across a wide range of conditions, and can execute at 37 frames per second. The ground truth used for training and evaluating the methods is available for other researchers to use and to compare their results against.
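The sketch below illustrates the two-stage cascade described in the abstract: vehicles are detected on a downsampled frame, and lights are then searched for at full resolution inside each vehicle Region of Interest. It is a minimal illustration, not the authors' implementation; `vehicle_detector` and `light_detector` are hypothetical callables standing in for YOLOv3 and Tiny_YOLO, assumed to return `(x, y, w, h, score)` boxes in the coordinates of the image they receive.

```python
import cv2

def detect_vehicle_lights(frame, vehicle_detector, light_detector,
                          downsample=0.5, vehicle_thresh=0.5, light_thresh=0.5):
    """Coarse-to-fine cascade on a single video frame (illustrative sketch)."""
    # Stage 1: detect vehicles on a downsampled copy of the frame.
    small = cv2.resize(frame, None, fx=downsample, fy=downsample)
    vehicles = [b for b in vehicle_detector(small) if b[4] >= vehicle_thresh]

    lights = []
    for (x, y, w, h, _) in vehicles:
        # Map the vehicle box back to full-resolution frame coordinates.
        x0, y0 = int(x / downsample), int(y / downsample)
        x1, y1 = int((x + w) / downsample), int((y + h) / downsample)
        roi = frame[y0:y1, x0:x1]
        if roi.size == 0:
            continue
        # Stage 2: search for vehicle lights at higher resolution within the ROI.
        for (lx, ly, lw, lh, score) in light_detector(roi):
            if score >= light_thresh:
                # Translate light boxes from ROI coordinates to frame coordinates.
                lights.append((x0 + lx, y0 + ly, lw, lh, score))
    return lights
```

The key design point is that the second-stage detector only ever sees small crops, so it can run at a higher effective resolution per object without the cost of processing the whole frame at full resolution.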