arXiv:2106.11539 [cs.CV]
Keywords: end-to-end transformer, document understanding, DocFormer achieves state-of-the-art results, novel multi-modal self-attention layer, shares learned spatial embeddings