Precise detection in densely packed scenes

Eran Goldman, Roei Herzig, Aviv Eisenschtat, Jacob Goldberger, Tal Hassner

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Abstract

    Man-made scenes are often densely packed, containing numerous objects, often identical, positioned in close proximity. We show that precise object detection in such scenes remains a challenging frontier even for state-of-the-art object detectors. We propose a novel, deep-learning-based method for precise object detection, designed for such challenging settings. Our contributions include: (1) a layer for estimating the Jaccard index as a detection quality score; (2) a novel EM merging unit, which uses our quality scores to resolve detection overlap ambiguities; and (3) an extensive, annotated data set, SKU-110K, representing packed retail environments, released for training and testing under such extreme settings. Detection tests on SKU-110K, and counting tests on the CARPK and PUCPR+ data sets, show that our method outperforms the existing state of the art by substantial margins.
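
    For readers unfamiliar with the quality score named in contribution (1): the Jaccard index of two boxes is the area of their intersection divided by the area of their union. The sketch below only computes that definition for two axis-aligned boxes; it is not the paper's learned layer, which estimates this quantity. The `(x1, y1, x2, y2)` corner convention and the function name `jaccard_index` are assumptions made for illustration.

```python
from typing import Tuple

# Assumed box convention (not taken from the paper): (x1, y1, x2, y2)
# with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def jaccard_index(a: Box, b: Box) -> float:
    """Return intersection area / union area for two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix1 = max(a[0], b[0])
    iy1 = max(a[1], b[1])
    ix2 = min(a[2], b[2])
    iy2 = min(a[3], b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Two 2x2 boxes sharing a 1x1 corner: intersection 1, union 7.
    print(jaccard_index((0, 0, 2, 2), (1, 1, 3, 3)))
```

    A value near 1 indicates a detection that tightly matches another box, while a value near 0 indicates little or no overlap, which is why the measure is a natural candidate for scoring detection quality in crowded scenes.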

    Original language: English
    Title of host publication: Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
    Publisher: IEEE Computer Society
    Pages: 5222-5231
    Number of pages: 10
    ISBN (Electronic): 9781728132938
    State: Published - Jun 2019
    Event: 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States
    Duration: 16 Jun 2019 to 20 Jun 2019

    Publication series

    Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
    Volume: 2019-June
    ISSN (Print): 1063-6919

    Conference

    Conference: 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
    Country/Territory: United States
    City: Long Beach
    Period: 16/06/19 to 20/06/19

    Keywords

    • Categorization
    • Recognition: Detection
    • Retrieval
