From MFKP_wiki

Jump to: navigation, search

Segnet: a deep convolutional encoder-decoder architecture for robust semantic pixel-wise labelling

Vijay Badrinarayanan, Ankur Handa, Roberto Cipolla

We propose a novel deep architecture, SegNet, for semantic pixel wise image labelling. SegNet has several attractive properties; (i) it only requires forward evaluation of a fully learnt function to obtain smooth label predictions, (ii) with increasing depth, a larger context is considered for pixel labelling which improves accuracy, and (iii) it is easy to visualise the effect of feature activation(s) in the pixel label space at any depth. SegNet is composed of a stack of encoders followed by a corresponding decoder stack which feeds into a soft-max classification layer. The decoders help map low resolution feature maps at the output of the encoder stack to full input image size feature maps. This addresses an important drawback of recent deep learning approaches which have adopted networks designed for object categorization for pixel wise labelling. These methods lack a mechanism to map deep layer feature maps to input dimensions. They resort to ad hoc methods to upsample features, e.g. by replication. This results in noisy predictions and also restricts the number of pooling layers in order to avoid too much upsampling and thus reduces spatial context. SegNet overcomes these problems by learning to map encoder outputs to image pixel labels. We test the performance of SegNet on outdoor RGB scenes from CamVid, KITTI and indoor scenes from the NYU dataset. Our results show that SegNet achieves state-of-the-art performance even without use of additional cues such as depth, video frames or post-processing with CRF models.

Key: INRMM:14531671



Article-Level Metrics (Altmetrics)
arXiv code

Available versions (may include free-access full text)

arXiv (abstract), arXiv (PDF)

Versions of the publication are also available in Google Scholar.
Google Scholar code: GScluster:10021085892189469802

Works citing this publication (including grey literature)

An updated list of who cited this publication is available in Google Scholar.
Google Scholar code: GScites:10021085892189469802

Further search for available versions

Search in ResearchGate (or try with a fuzzier search in ResearchGate)
Search in Mendeley (or try with a fuzzier search in Mendeley)

Publication metadata

Bibtex, RIS, RSS/XML feed, Json, Dublin Core

Digital preservation of this INRMM-MiD record

Internet Archive

Meta-information Database (INRMM-MiD).
This database integrates a dedicated meta-information database in CiteULike (the CiteULike INRMM Group) with the meta-information available in Google Scholar, CrossRef and DataCite. The Altmetric database with Article-Level Metrics is also harvested. Part of the provided semantic content (machine-readable) is made even human-readable thanks to the DCMI Dublin Core viewer. Digital preservation of the meta-information indexed within the INRMM-MiD publication records is implemented thanks to the Internet Archive.
The library of INRMM related pubblications may be quickly accessed with the following links.
Search within the whole INRMM meta-information database:
Search only within the INRMM-MiD publication records:
Full-text and abstracts of the publications indexed by the INRMM meta-information database are copyrighted by the respective publishers/authors. They are subject to all applicable copyright protection. The conditions of use of each indexed publication is defined by its copyright owner. Please, be aware that the indexed meta-information entirely relies on voluntary work and constitutes a quite incomplete and not homogeneous work-in-progress.
INRMM-MiD was experimentally established by the Maieutike Research Initiative in 2008 and then improved with the help of several volunteers (with a major technical upgrade in 2011). This new integrated interface is operational since 2014.