Focussing Learned Image Compression to Semantic Classes for V2X Applications

Abstract

Cooperative perception involving many sensors greatly improves the performance of perception systems in autonomous vehicles. However, the growing amount of sensor data creates a bottleneck because the capacity of vehicle-to-X (V2X) communication channels is limited. We leverage lossy learned image compression, using an autoencoder trained with an adversarial loss function, to reduce the overall bitrate. Our key contribution is to focus image compression on regions of interest (ROIs) governed by a binary mask. A transmitter-side semantic segmentation network extracts semantically important classes, which form the basis for generating the ROI. A second key contribution is that the mask is not transmitted as side information; only the quantized bottleneck data is transmitted. To train the network, we use a loss function that operates only on the pixels in the ROI. We report peak signal-to-noise ratio (PSNR) both over the entire image and over the ROI only, evaluating various fusion architectures and fusion operations that combine the input image and the mask. Demonstrating the high generalizability of our approach, we achieve consistent improvements in the ROI in all experiments on the Cityscapes dataset.
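
To make the two reported metrics concrete, the following is a minimal sketch of how PSNR over the entire image and PSNR restricted to the ROI might be computed. It is an illustration only, not the authors' evaluation code: the function name `psnr`, the parameters `mask` and `max_value`, and the assumption of 8-bit images with a per-pixel boolean mask are all hypothetical choices.

```python
import numpy as np

def psnr(original, reconstructed, mask=None, max_value=255.0):
    """Return PSNR in dB. Hypothetical helper, not taken from the paper.

    original, reconstructed: arrays of shape (H, W) or (H, W, C).
    mask: optional boolean array of shape (H, W); True marks ROI pixels.
    max_value: assumed peak signal value (255 for 8-bit images).
    """
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    squared_error = (original - reconstructed) ** 2

    if mask is not None:
        mask = np.asarray(mask, dtype=bool)
        if not mask.any():
            raise ValueError("ROI mask is empty")
        # Boolean indexing keeps only ROI pixels (all channels of each).
        squared_error = squared_error[mask]

    mse = squared_error.mean()
    if mse == 0:
        return float("inf")  # identical signals
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy example: the reconstruction is accurate inside the ROI, poor outside.
image = np.zeros((4, 4, 3))
recon = image.copy()
recon[2:, 2:, :] = 40.0   # large error outside the ROI
recon[0, 0, :] = 5.0      # small error inside the ROI
roi = np.zeros((4, 4), dtype=bool)
roi[:2, :2] = True

psnr_full = psnr(image, recon)            # entire image
psnr_roi = psnr(image, recon, mask=roi)   # ROI pixels only
```

In this toy case the ROI PSNR is higher than the full-image PSNR, which is the kind of gap the two reported figures are meant to expose when compression is focused on the ROI.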

Publication
In Proc. of IEEE Intelligent Vehicles Symposium