SwePub
Cascade Network with Deformable Composite Backbone for Formula Detection in Scanned Document Images

Hashmi, Khurram Azeem (author)
Department of Computer Science, Technical University, 67663 Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
Pagani, Alain (author)
German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
Liwicki, Marcus (author)
Luleå tekniska universitet,EISLAB
Stricker, Didier (author)
Department of Computer Science, Technical University, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
Afzal, Muhammad Zeshan (author)
Department of Computer Science, Technical University, 67663 Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
2021-08-19
2021
English.
Published in: Applied Sciences. MDPI. ISSN 2076-3417. 11:16
  • Journal article (peer-reviewed)
Abstract
  • This paper presents a novel architecture for detecting mathematical formulas in document images, an important step for reliable information extraction in several domains. Cascade Mask R-CNN networks were recently introduced for object detection in computer vision. In this paper, we propose modifications to the existing Cascade Mask R-CNN architecture. First, the proposed network uses deformable convolutions instead of conventional convolutions in the backbone network to better localize areas of interest. Second, it uses a dual ResNeXt-101 backbone with composite connections at the parallel stages. Finally, the proposed network is end-to-end trainable. We evaluate the approach on the ICDAR-2017 POD and Marmot datasets. It demonstrates state-of-the-art performance on ICDAR-2017 POD at a higher IoU threshold, with an F1-score of 0.917, reducing the relative error by 7.8%. Moreover, it achieves a correct-detection accuracy of 81.3% on embedded formulas in the Marmot dataset, a relative error reduction of 30%.
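
The abstract reports results at an IoU threshold and as relative error reductions. The following is a minimal sketch of how these two quantities are commonly computed, not the authors' evaluation code. The function names `iou` and `relative_error_reduction`, the box format `(x1, y1, x2, y2)`, and the definition of error as `1 - score` are all assumptions made for illustration; the baseline value in the usage note is hypothetical and does not come from this record.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed box format; the record does not specify one.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def relative_error_reduction(baseline_score, new_score):
    """Fraction by which the error (assumed to be 1 - score) shrinks.

    Both scores are expected in [0, 1], with baseline_score < 1.
    """
    baseline_error = 1.0 - baseline_score
    new_error = 1.0 - new_score
    return (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    # Two 2x2 boxes overlapping in a 1x1 square.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    # Hypothetical baseline of 0.90 compared with the reported F1-score of 0.917.
    print(relative_error_reduction(0.90, 0.917))
```

A detection is typically counted as correct when its IoU with a ground-truth box meets the chosen threshold, so a higher threshold demands tighter localization. Under the assumed definition, a relative error reduction compares the remaining error of two systems, not their raw scores.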

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Vision and Robotics (hsv//eng)

Keywords

formula detection
Cascade Mask R-CNN
mathematical expression detection
document image analysis
deep neural networks
computer vision
Machine Learning

Publication and content type

ref (subject category)
art (subject category)
