SwePub
Search the SwePub database

Results list for the search "WFRF:(Hashmi Khurram Azeem)"

Search: WFRF:(Hashmi Khurram Azeem)

  • Results 1-10 of 14
1.
  • Ahmed, Muhammad, et al. (author)
  • Survey and Performance Analysis of Deep Learning Based Object Detection in Challenging Environments
  • 2021
  • In: Sensors. - MDPI. - 1424-8220. ; 21:15
  • Research review (peer-reviewed), abstract:
    • Recent progress in deep learning has led to accurate and efficient generic object detection networks. Training of highly reliable models depends on large datasets with highly textured and rich images. However, in real-world scenarios, the performance of the generic object detection system decreases when (i) occlusions hide the objects, (ii) objects are present in low-light images, or (iii) they are merged with background information. In this paper, we refer to all these situations as challenging environments. With the recent rapid development in generic object detection algorithms, notable progress has been observed in the field of deep learning-based object detection in challenging environments. However, there is no consolidated reference to cover the state of the art in this domain. To the best of our knowledge, this paper presents the first comprehensive overview, covering recent approaches that have tackled the problem of object detection in challenging environments. Furthermore, we present a quantitative and qualitative performance analysis of these approaches and discuss the currently available challenging datasets. Moreover, this paper investigates the performance of current state-of-the-art generic object detection algorithms by benchmarking results on the three well-known challenging datasets. Finally, we highlight several current shortcomings and outline future directions.
2.
  • Hashmi, Khurram Azeem, et al. (author)
  • Cascade Network with Deformable Composite Backbone for Formula Detection in Scanned Document Images
  • 2021
  • In: Applied Sciences. - MDPI. - 2076-3417. ; 11:16
  • Journal article (peer-reviewed), abstract:
    • This paper presents a novel architecture for detecting mathematical formulas in document images, which is an important step for reliable information extraction in several domains. Recently, Cascade Mask R-CNN networks have been introduced to solve object detection in computer vision. In this paper, we suggest a couple of modifications to the existing Cascade Mask R-CNN architecture: First, the proposed network uses deformable convolutions instead of conventional convolutions in the backbone network to spot areas of interest better. Second, it uses a dual backbone of ResNeXt-101, having composite connections at the parallel stages. Finally, our proposed network is end-to-end trainable. We evaluate the proposed approach on the ICDAR-2017 POD and Marmot datasets. The proposed approach demonstrates state-of-the-art performance on ICDAR-2017 POD at a higher IoU threshold with an f1-score of 0.917, reducing the relative error by 7.8%. Moreover, we accomplished correct detection accuracy of 81.3% on embedded formulas on the Marmot dataset, which results in a relative error reduction of 30%.
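A minimal illustrative sketch of the deformable-convolution idea described in this entry, using torchvision's DeformConv2d: a small regular convolution predicts per-position sampling offsets, and the deformable convolution samples at those shifted locations. This is not the authors' implementation (they swap convolutions inside the composite ResNeXt-101 backbone of a Cascade Mask R-CNN); module names and tensor shapes here are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableConvBlock(nn.Module):
    """A 3x3 deformable convolution whose sampling offsets are predicted
    from the input feature map by a small regular convolution."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # 2 offsets (dx, dy) per kernel tap -> 2 * 3 * 3 = 18 channels
        self.offset_pred = nn.Conv2d(in_ch, 18, kernel_size=3, padding=1)
        self.deform_conv = DeformConv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        offsets = self.offset_pred(x)        # where each kernel tap should sample
        return self.deform_conv(x, offsets)  # convolve at the shifted locations

if __name__ == "__main__":
    feat = torch.randn(1, 64, 56, 56)                    # dummy backbone feature map
    print(DeformableConvBlock(64, 128)(feat).shape)      # torch.Size([1, 128, 56, 56])
```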
3.
  • Hashmi, Khurram Azeem, et al. (author)
  • CasTabDetectoRS: Cascade Network for Table Detection in Document Images with Recursive Feature Pyramid and Switchable Atrous Convolution
  • 2021
  • In: Journal of Imaging. - MDPI. - 2313-433X. ; 7:10
  • Journal article (peer-reviewed), abstract:
    • Table detection is a preliminary step in extracting reliable information from tables in scanned document images. We present CasTabDetectoRS, a novel end-to-end trainable table detection framework that builds on Cascade Mask R-CNN and incorporates a Recursive Feature Pyramid network and Switchable Atrous Convolution into the existing backbone architecture. By utilizing a comparatively lightweight backbone of ResNet-50, this paper demonstrates that superior results are attainable without relying on pre- and post-processing methods, heavier backbone networks (ResNet-101, ResNeXt-152), and memory-intensive deformable convolutions. We evaluate the proposed approach on five different publicly available table detection datasets. Our CasTabDetectoRS outperforms the previous state-of-the-art results on four datasets (ICDAR-19, TableBank, UNLV, and Marmot) and accomplishes comparable results on ICDAR-17 POD. Upon comparing with previous state-of-the-art results, we obtain a significant relative error reduction of 56.36%, 20%, 4.5%, and 3.5% on the datasets of ICDAR-19, TableBank, UNLV, and Marmot, respectively. Furthermore, this paper sets a new benchmark by performing exhaustive cross-dataset evaluations to exhibit the generalization capabilities of the proposed method.
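To make the Switchable Atrous Convolution component mentioned in this entry more concrete, here is a simplified, self-contained sketch of the idea: the same 3x3 weights are applied at dilation 1 and dilation 3, and a learned spatial switch blends the two responses. It omits the weight-averaging and global-context refinements of the full SAC design and is not the CasTabDetectoRS code; all names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchableAtrousConv(nn.Module):
    """Simplified Switchable Atrous Convolution: shared 3x3 weights run at two
    dilation rates, blended pixel-wise by a switch learned from the input."""
    def __init__(self, channels: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(channels, channels, 3, 3))
        nn.init.kaiming_normal_(self.weight)
        # The switch: local average pooling, then a 1x1 conv squashed by a sigmoid.
        self.switch = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = torch.sigmoid(self.switch(F.avg_pool2d(x, 5, stride=1, padding=2)))
        small_rf = F.conv2d(x, self.weight, padding=1, dilation=1)  # local context
        large_rf = F.conv2d(x, self.weight, padding=3, dilation=3)  # wider context
        return s * small_rf + (1 - s) * large_rf

if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    print(SwitchableAtrousConv(64)(x).shape)   # torch.Size([1, 64, 32, 32])
```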
4.
  • Hashmi, Khurram Azeem, et al. (author)
  • Current Status and Performance Analysis of Table Recognition in Document Images with Deep Neural Networks
  • 2021
  • In: IEEE Access. - IEEE. - 2169-3536. ; 9, pp. 87663-87685
  • Research review (peer-reviewed), abstract:
    • The first phase of table recognition is to detect the tabular area in a document. Subsequently, the tabular structures are recognized in the second phase in order to extract information from the respective cells. Table detection and structural recognition are pivotal problems in the domain of table understanding. However, table analysis is a perplexing task due to the colossal amount of diversity and asymmetry in tables. Therefore, it is an active area of research in document image analysis. Recent advances in the computing capabilities of graphical processing units have enabled the deep neural networks to outperform traditional state-of-the-art machine learning methods. Table understanding has substantially benefited from the recent breakthroughs in deep neural networks. However, there has not been a consolidated description of the deep learning methods for table detection and table structure recognition. This review paper provides a thorough analysis of the modern methodologies that utilize deep neural networks. Moreover, it presents a comprehensive understanding of the current state-of-the-art and related challenges of table understanding in document images. The leading datasets and their intricacies have been elaborated along with the quantitative results. Furthermore, a brief overview is given regarding the promising directions that can further improve table analysis in document images.
5.
  • Hashmi, Khurram Azeem, et al. (author)
  • Exploiting Concepts of Instance Segmentation to Boost Detection in Challenging Environments
  • 2022
  • In: Sensors. - MDPI. - 1424-8220. ; 22:10
  • Journal article (peer-reviewed), abstract:
    • In recent years, due to the advancements in machine learning, object detection has become a mainstream task in the computer vision domain. The first phase of object detection is to find the regions where objects can exist. With the improvements in deep learning, traditional approaches, such as sliding windows and manual feature selection techniques, have been replaced with deep learning techniques. However, like many other vision tasks, object detection struggles in low light, challenging weather, and crowded scenes. We refer to such settings as challenging environments. This paper exploits pixel-level information to improve detection under these challenging conditions. To this end, we exploit the recently proposed hybrid task cascade network, which works collaboratively with detection and segmentation heads at different cascade levels. We evaluate the proposed methods on three complex datasets, ExDark, CURE-TSD, and RESIDE, and achieve a mAP of 0.71, 0.52, and 0.43, respectively. Our experimental results assert the efficacy of the proposed approach.
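As a rough sketch of how one could run a Hybrid Task Cascade detector of the kind this entry builds on, the snippet below uses MMDetection's inference API. The config name matches MMDetection's stock HTC config, but the checkpoint and image paths are hypothetical, and this is not the authors' pipeline.

```python
# Minimal MMDetection (2.x) inference sketch, assuming a checkpoint fine-tuned
# on a challenging-environment dataset such as ExDark exists at the given path.
from mmdet.apis import init_detector, inference_detector

config_file = "configs/htc/htc_r50_fpn_1x_coco.py"        # stock HTC config
checkpoint_file = "checkpoints/htc_exdark_finetuned.pth"  # hypothetical weights

model = init_detector(config_file, checkpoint_file, device="cuda:0")
result = inference_detector(model, "demo/low_light_scene.jpg")   # hypothetical image
model.show_result("demo/low_light_scene.jpg", result, out_file="result.jpg")
```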
6.
  • Hashmi, Khurram Azeem, et al. (author)
  • Guided Table Structure Recognition through Anchor Optimization
  • 2021
  • In: IEEE Access. - IEEE. - 2169-3536. ; 9, pp. 113521-113534
  • Journal article (peer-reviewed), abstract:
    • This paper presents a novel approach to table structure recognition that leverages guided anchors. The concept differs from current state-of-the-art systems for table structure recognition that naively apply object detection methods. In contrast to prior techniques, we first estimate the viable anchors for table structure recognition. Subsequently, these anchors are exploited to locate the rows and columns in tabular images. Furthermore, the paper introduces a simple and effective method that improves the results using tabular layouts in realistic scenarios. The proposed method is exhaustively evaluated on two publicly available table structure recognition datasets: ICDAR-2013 and TabStructDB. Moreover, we empirically establish the validity of our method by applying it to previous approaches. We accomplish state-of-the-art results on the ICDAR-2013 dataset with an average F-measure of 94.19% (92.06% for rows and 96.32% for columns). Thus, a relative error reduction of more than 25% is achieved. Furthermore, our proposed post-processing improves the average F-measure to 95.46%, which results in a relative error reduction of more than 35%. Moreover, we surpass the baseline results on the TabStructDB dataset with an average F-measure of 94.57% (94.08% for rows and 95.06% for columns).
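The abstract above does not spell out how the viable anchors are estimated; as a loose illustration of the "estimate data-driven anchors first" step, the sketch below clusters ground-truth row/column box sizes into a few representative anchor shapes. K-means clustering of box dimensions is a common stand-in technique, not the guided-anchoring method used in the paper, and the data here is synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans

def estimate_anchors(boxes_wh: np.ndarray, num_anchors: int = 5) -> np.ndarray:
    """boxes_wh: (N, 2) array of ground-truth (width, height) pairs in pixels.
    Returns num_anchors representative (width, height) anchor shapes."""
    km = KMeans(n_clusters=num_anchors, n_init=10, random_state=0).fit(boxes_wh)
    return km.cluster_centers_

if __name__ == "__main__":
    # Hypothetical annotations: rows are wide and short, columns narrow and tall.
    rng = np.random.default_rng(0)
    rows = np.stack([rng.normal(900, 50, 200), rng.normal(25, 5, 200)], axis=1)
    cols = np.stack([rng.normal(120, 30, 200), rng.normal(600, 60, 200)], axis=1)
    print(estimate_anchors(np.vstack([rows, cols])))
```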
7.
  • Kallempudi, Goutham, et al. (author)
  • Toward Semi-Supervised Graphical Object Detection in Document Images
  • 2022
  • In: Future Internet. - MDPI. - 1999-5903. ; 14:6
  • Journal article (peer-reviewed), abstract:
    • Graphical page object detection classifies and localizes objects such as Tables and Figures in a document. As deep learning techniques for object detection become increasingly successful, many supervised deep neural network-based methods have been introduced to recognize graphical objects in documents. However, these models necessitate a substantial amount of labeled data for the training process. To address this limitation, this paper presents an end-to-end semi-supervised framework for graphical object detection in scanned document images. Our method is based on the recently proposed Soft Teacher mechanism and examines the effects of small fractions of labeled data on the classification and localization of graphical objects. On both the PubLayNet and the IIIT-AR-13K datasets, the proposed approach outperforms the supervised models by a significant margin at all labeling ratios (1%, 5%, and 10%). Furthermore, the 10% PubLayNet Soft Teacher model improves the average precision of Table, Figure, and List by +5.4, +1.2, and +3.2 points, respectively, with a total mAP similar to the Faster R-CNN baseline. Moreover, our model trained on 10% of the IIIT-AR-13K labeled data beats the previous fully supervised method by +4.5 points.
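A compact conceptual sketch of the teacher-student mechanism this entry relies on: the teacher is an exponential moving average (EMA) of the student, and its confident detections on unlabeled pages become pseudo-labels for the student. The detector output format and the 0.9 threshold are assumptions for illustration, not the Soft Teacher implementation.

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum: float = 0.999):
    """Teacher weights track the student as an exponential moving average."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.data.mul_(momentum).add_(s.data, alpha=1.0 - momentum)

@torch.no_grad()
def make_pseudo_labels(teacher, unlabeled_images, score_thr: float = 0.9):
    """Keep only the teacher's confident detections as targets for the student.
    Assumes a torchvision-style detector returning dicts with boxes/scores/labels."""
    teacher.eval()
    outputs = teacher(unlabeled_images)
    targets = []
    for out in outputs:
        keep = out["scores"] > score_thr
        targets.append({"boxes": out["boxes"][keep], "labels": out["labels"][keep]})
    return targets
```

The student is then trained on a weighted sum of the supervised loss on labeled pages and the loss computed against these pseudo-labels on unlabeled pages.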
8.
  • Mishra, Shashank, et al. (author)
  • Towards Robust Object Detection in Floor Plan Images: A Data Augmentation Approach
  • 2021
  • In: Applied Sciences. - MDPI. - 2076-3417. ; 11:23
  • Journal article (peer-reviewed), abstract:
    • Object detection is one of the most critical tasks in the field of computer vision. This task comprises identifying and localizing an object in an image. Architectural floor plans represent the layout of buildings and apartments. Floor plans consist of walls, windows, stairs, and furniture objects. While recognizing floor plan objects is straightforward for humans, automatically processing floor plans and recognizing objects is challenging. In this work, we investigate the performance of the recently introduced Cascade Mask R-CNN network for object detection in floor plan images. Furthermore, we experimentally establish that deformable convolution works better than conventional convolution in the proposed framework. Prior datasets for object detection in floor plan images are either publicly unavailable or contain few samples. To address this issue, we introduce SFPI, a novel synthetic floor plan dataset consisting of 10,000 images. Our proposed method comfortably exceeds the previous state-of-the-art results on the SESYD dataset with an mAP of 98.1%. Moreover, it sets impressive baseline results on our novel SFPI dataset with an mAP of 99.8%. We believe that introducing this modern dataset will enable researchers to advance research in this domain.
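The SFPI generation pipeline itself is not described in the abstract above. Purely as a hypothetical illustration of how a synthetic floor-plan sample with automatic annotations could be composed, the sketch below pastes furniture symbols onto an empty plan and records their bounding boxes; the file paths and compositing strategy are assumptions, not the paper's procedure.

```python
import random
from PIL import Image

def compose_sample(plan_path: str, symbol_paths: list[str], n_objects: int = 10):
    """Paste random furniture symbols onto an empty floor plan and return the
    composed image with automatically derived bounding boxes.
    Assumes every symbol image is smaller than the plan."""
    plan = Image.open(plan_path).convert("RGB")
    boxes = []
    for _ in range(n_objects):
        sym = Image.open(random.choice(symbol_paths)).convert("RGBA")
        x = random.randint(0, plan.width - sym.width)
        y = random.randint(0, plan.height - sym.height)
        plan.paste(sym, (x, y), sym)                         # alpha-composite the symbol
        boxes.append((x, y, x + sym.width, y + sym.height))  # box in pixel coordinates
    return plan, boxes
```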
9.
  • Muralidhara, Shishir, et al. (author)
  • Attention-Guided Disentangled Feature Aggregation for Video Object Detection
  • 2022
  • In: Sensors. - MDPI. - 1424-8220. ; 22:21
  • Journal article (peer-reviewed), abstract:
    • Object detection is a computer vision task that involves localisation and classification of objects in an image. Video data implicitly introduces several challenges, such as blur, occlusion and defocus, making video object detection more challenging in comparison to still image object detection, which is performed on individual and independent images. This paper tackles these challenges by proposing an attention-heavy framework for video object detection that aggregates the disentangled features extracted from individual frames. The proposed framework is a two-stage object detector based on the Faster R-CNN architecture. The disentanglement head integrates scale, spatial and task-aware attention and applies it to the features extracted by the backbone network across all the frames. Subsequently, the aggregation head incorporates temporal attention and improves detection in the target frame by aggregating the features of the support frames. These include the features extracted from the disentanglement network along with the temporal features. We evaluate the proposed framework using the ImageNet VID dataset and achieve a mean Average Precision (mAP) of 49.8 and 52.5 using the backbones of ResNet-50 and ResNet-101, respectively. The improvement in performance over the individual baseline methods validates the efficacy of the proposed approach.
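A simplified sketch of the temporal-aggregation idea from this entry: proposal features of the target frame attend over pooled features from the support frames via multi-head attention, and the attended features are added back residually. It ignores the scale-, spatial- and task-aware disentanglement heads and is not the authors' architecture; the dimensions are illustrative.

```python
import torch
import torch.nn as nn

class TemporalAggregation(nn.Module):
    """Aggregate support-frame features into the target frame with attention."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, target_feats: torch.Tensor, support_feats: torch.Tensor):
        """target_feats: (B, N_t, C) proposals of the frame being detected.
        support_feats: (B, N_s, C) proposals pooled from neighbouring frames."""
        aggregated, _ = self.attn(query=target_feats, key=support_feats,
                                  value=support_feats)
        return target_feats + aggregated   # residual: keep the original target features

if __name__ == "__main__":
    tgt, sup = torch.randn(2, 100, 256), torch.randn(2, 300, 256)
    print(TemporalAggregation()(tgt, sup).shape)   # torch.Size([2, 100, 256])
```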
10.
  • Naik, Shivam, et al. (author)
  • Investigating Attention Mechanism for Page Object Detection in Document Images
  • 2022
  • In: Applied Sciences. - MDPI. - 2076-3417. ; 12:15
  • Journal article (peer-reviewed), abstract:
    • Page object detection in scanned document images is a complex task due to varying document layouts and diverse page objects. In the past, traditional methods such as Optical Character Recognition (OCR)-based techniques have been employed to extract textual information. However, these methods fail to comprehend complex page objects such as tables and figures. This paper addresses the localization and classification of graphical objects that visually summarize vital information in documents. Furthermore, this work examines the benefit of incorporating attention mechanisms in different object detection networks for page object detection on scanned document images. The models are built with Detectron2, a PyTorch-based framework. The proposed pipelines can be optimized end-to-end and are exhaustively evaluated on publicly available datasets such as DocBank, PubLayNet, and IIIT-AR-13K. The achieved results reflect the effectiveness of incorporating the attention mechanism for page object detection in documents.
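Since this entry names Detectron2 as the underlying framework, here is a hedged sketch of what a framework-level setup for page object detection could look like. The attention-augmented architectures from the paper are not in the Detectron2 model zoo, so this just shows the plain training scaffolding; the dataset names are assumed to have been registered beforehand (e.g. via register_coco_instances).

```python
import os
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("publaynet_train",)   # hypothetical, pre-registered dataset name
cfg.DATASETS.TEST = ("publaynet_val",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 5         # PubLayNet classes: text, title, list, table, figure
cfg.OUTPUT_DIR = "./output_page_objects"
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```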