Joint Handwritten Text Recognition and Word Classification for Tabular Information Extraction
- Blomqvist, Christopher (author)
- Lund University
- Enflo, Kerstin (author)
- Lund University; Growth, technological change, and inequality; Department of Economic History; Lund University School of Economics and Management, LUSEM
- Jakobsson, Andreas (author)
- Lund University; Biomedical Modelling and Computation; Statistical Signal Processing Group; Lund University Research Groups; Mathematical Statistics; Centre for Mathematical Sciences; Departments at LTH; Faculty of Engineering, LTH; LTH Profile Area: AI and Digitalization; LTH Profile areas
- Åström, Kalle (author)
- Lund University; Mathematical Imaging Group; Stroke Imaging Research group; Lund University Research Groups; Mathematics (Faculty of Engineering); Centre for Mathematical Sciences; Departments at LTH; Faculty of Engineering, LTH; LTH Profile Area: AI and Digitalization; LTH Profile areas
- 2022
- English, 6 pp.
- In: 2022 26th International Conference on Pattern Recognition (ICPR). ISBN 9781665490634, 9781665490627; pp. 1564-1570
- Related links:
- http://dx.doi.org/10...
- https://lup.lub.lu.s...
- https://doi.org/10.1...
Abstract
- In this paper, we present a system for extracting tabular information from loosely structured handwritten documents. The system consists of three parts: (i) a U-Net-like CNN-based method for text detection and segmentation, (ii) a new attention-based method for simultaneous text recognition and classification of word parts, and (iii) a method for matching the word parts into a tabular structure for each entry. A key contribution is the observation that the new attention-based recognition and classification module enables improved spatial analysis of the tabular information. The method is evaluated on a unique historical document: the Swedish Wealth Tax of 1571, consisting of 11,453 pages of handwritten tax records. The evaluation shows that the system provides a significant improvement over the state of the art on the problem of tabular extraction from loosely structured historical documents.
Subject headings
- NATURAL SCIENCES -- Computer and Information Sciences -- Computer Vision and Robotics (hsv//eng)
- SOCIAL SCIENCES -- Economics and Business -- Economic History (hsv//eng)
Keywords
- Histograms
- Image segmentation
- Text recognition
- Finance
- Writing
- Information retrieval
- Decoding
Publication and content type
- kon (subject category)
- ref (subject category)