Joint Handwritten Text Recognition and Word Classification for Tabular Information Extraction

Blomqvist, Christopher (author)
Lund University
Enflo, Kerstin (author)
Lund University; Growth, technological change, and inequality; Department of Economic History; Lund University School of Economics and Management (LUSEM)
Jakobsson, Andreas (author)
Lund University; Biomedical Modelling and Computation; Statistical Signal Processing Group; Lund University Research Groups; Mathematical Statistics; Centre for Mathematical Sciences; Departments at LTH; Faculty of Engineering, LTH; LTH Profile Area: AI and Digitalization; LTH Profile areas
Åström, Kalle (author)
Lund University; Mathematical Imaging Group; Stroke Imaging Research group; Lund University Research Groups; Mathematics (Faculty of Engineering); Centre for Mathematical Sciences; Departments at LTH; Faculty of Engineering, LTH; LTH Profile Area: AI and Digitalization; LTH Profile areas
2022
English, 6 pp.
In: 2022 26th International Conference on Pattern Recognition (ICPR). ISBN 9781665490627, 9781665490634; pp. 1564-1570
  • Conference paper (peer-reviewed)
Abstract
  • In this paper, we present a system for extracting tabular information from loosely structured handwritten documents. The system consists of three parts: (i) a U-Net-like CNN-based method for text detection and segmentation, (ii) a new attention-based method for simultaneous text recognition and classification of word parts, and (iii) a method for matching the word parts into a tabular structure for each entry. A key contribution is the observation that the new attention-based recognition and classification module enables improved spatial analysis of the tabular information. The method is evaluated on a unique historical document: the Swedish Wealth Tax of 1571, consisting of 11,453 pages of handwritten tax records. The evaluation shows that the system provides a significant improvement over the state of the art for the problem of tabular extraction from loosely structured historical documents.

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Vision and Robotics (hsv//eng)
SOCIAL SCIENCES  -- Economics and Business -- Economic History (hsv//eng)

Keywords

Histograms
Image segmentation
Text recognition
Finance
Writing
Information retrieval
Decoding

Publication and Content Type

kon (subject category)
ref (subject category)
