Cardinal Graph Convolution Framework for Document Information Extraction

Rinon Gal, Shai Ardazi, Roy Shilkrot

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Graph Convolutional Networks (GCNs) have been recognized as successful for processing pseudo-spatial graph representations of the underlying structure of documents. We present Cardinal Graph Convolutional Networks (CGCNs), an efficient and flexible extension of GCNs with cardinal-direction awareness of spatial node arrangement. The formulation of CGCNs retains the traditional GCN permutation invariance, ensuring directional neighbors are involved in learning abstract representations, even in the absence of a proper ordering of the nodes. We show that CGCNs achieve state-of-the-art results on an invoice information extraction task, jointly learning word-level tagging as well as document meta-level classification and regression. We also present a new multiscale Inception-like CGCN block-layer, as well as a Conv-Pool-DeConv-DePool UNet-like architecture, both of which increase the receptive field. We demonstrate the utility of CGCNs on private and public datasets, with respect to several baseline models: sequential LSTM, transformer classifier, non-cardinal GCNs, and an image-convolutional approach.
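
The abstract does not give the layer's equations, so the following is only a rough sketch of one way "cardinal-direction awareness" could be combined with permutation-invariant aggregation. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `cardinal_adjacency` and `cardinal_gcn_layer`, the dominant-axis rule for assigning a neighbor to right/left/down/up, the per-direction mean aggregation, and the ReLU activation.

```python
import numpy as np

# Hypothetical direction labels; image coordinates assumed (y grows downward).
DIRECTIONS = ("right", "left", "down", "up")


def cardinal_adjacency(pos, adj):
    """Split an (n, n) 0/1 adjacency into four direction-specific adjacencies.

    pos: (n, 2) array of node centers (e.g. word-box centers), assumed input.
    Neighbor j of node i is assigned by the dominant axis of pos[j] - pos[i].
    """
    dx = pos[None, :, 0] - pos[:, None, 0]  # dx[i, j] = x_j - x_i
    dy = pos[None, :, 1] - pos[:, None, 1]  # dy[i, j] = y_j - y_i
    horiz = np.abs(dx) >= np.abs(dy)
    masks = {
        "right": horiz & (dx > 0),
        "left": horiz & (dx < 0),
        "down": ~horiz & (dy > 0),
        "up": ~horiz & (dy < 0),
    }
    return {d: adj * masks[d] for d in DIRECTIONS}


def cardinal_gcn_layer(h, pos, adj, weights, w_self):
    """One illustrative layer: a self term plus one mean-aggregated term per direction.

    h: (n, f_in) node features; weights: dict direction -> (f_in, f_out);
    w_self: (f_in, f_out). Mean aggregation does not depend on neighbor order.
    """
    out = h @ w_self
    for d, a_d in cardinal_adjacency(pos, adj).items():
        deg = a_d.sum(axis=1, keepdims=True)
        a_norm = a_d / np.maximum(deg, 1.0)  # nodes with no neighbor in d get zeros
        out = out + a_norm @ h @ weights[d]
    return np.maximum(out, 0.0)  # ReLU


# Toy usage: three words on one text line and one word below the first.
rng = np.random.default_rng(0)
pos = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
adj = np.ones((4, 4)) - np.eye(4)
h = rng.normal(size=(4, 5))
weights = {d: rng.normal(size=(5, 3)) for d in DIRECTIONS}
w_self = rng.normal(size=(5, 3))
out = cardinal_gcn_layer(h, pos, adj, weights, w_self)
```

Because each direction's neighbors are averaged rather than read in sequence, relabeling the nodes only reorders the rows of `out`; this is the sense in which such a layer keeps the usual GCN insensitivity to node ordering while still separating, say, the word to the right from the word below.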

Original language: English
Title of host publication: Proceedings of the ACM Symposium on Document Engineering, DocEng 2020
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450380003
State: Published - 29 Sep 2020
Externally published: Yes
Event: 20th ACM Symposium on Document Engineering, DocEng 2020 - Virtual, Online, United States
Duration: 29 Sep 2020 to 1 Oct 2020

Publication series

Name: Proceedings of the ACM Symposium on Document Engineering, DocEng 2020

Conference

Conference: 20th ACM Symposium on Document Engineering, DocEng 2020
Country/Territory: United States
City: Virtual, Online
Period: 29/09/20 to 1/10/20

Keywords

  • document analysis
  • graph convolution neural networks
  • information extraction
