TY - GEN
T1 - Multi-scale Context Intertwining for Semantic Segmentation
AU - Lin, Di
AU - Ji, Yuanfeng
AU - Lischinski, Dani
AU - Cohen-Or, Daniel
AU - Huang, Hui
N1 - Publisher Copyright:
© 2018, Springer Nature Switzerland AG.
PY - 2018
Y1 - 2018
N2 - Accurate semantic image segmentation requires the joint consideration of local appearance, semantic information, and global scene context. In today’s age of pre-trained deep networks and their powerful convolutional features, state-of-the-art semantic segmentation approaches differ mostly in how they choose to combine these different kinds of information. In this work, we propose a novel scheme for aggregating features from different scales, which we refer to as Multi-Scale Context Intertwining (MSCI). In contrast to previous approaches, which typically propagate information between scales in a one-directional manner, we merge pairs of feature maps in a bidirectional and recurrent fashion, via connections between two LSTM chains. By training the parameters of the LSTM units on the segmentation task, the above approach learns how to extract powerful and effective features for pixel-level semantic segmentation, which are then combined hierarchically. Furthermore, rather than using fixed information propagation routes, we subdivide images into super-pixels, and use the spatial relationships between them to perform image-adapted context aggregation. Our extensive evaluation on public benchmarks indicates that all of the aforementioned components of our approach increase the effectiveness of information propagation throughout the network, and significantly improve its eventual segmentation accuracy.
KW - Convolutional neural network
KW - Deep learning
KW - Long short-term memory
KW - Semantic segmentation
UR - http://www.scopus.com/inward/record.url?scp=85055090814&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-01219-9_37
DO - 10.1007/978-3-030-01219-9_37
M3 - Conference contribution
AN - SCOPUS:85055090814
SN - 9783030012182
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 622
EP - 638
BT - Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
A2 - Ferrari, Vittorio
A2 - Sminchisescu, Cristian
A2 - Hebert, Martial
A2 - Weiss, Yair
PB - Springer Verlag
T2 - 15th European Conference on Computer Vision, ECCV 2018
Y2 - 8 September 2018 through 14 September 2018
ER -