SwePub
Result list for search "FÖRF:(Johan Hall)"

  • Results 1-10 of 51
1.
  • Yavariabdi, Amir, et al. (authors)
  • CArDIS : A Swedish Historical Handwritten Character and Word Dataset
  • 2022
  • In: IEEE Access. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 2169-3536. ; 10, pp. 55338-55349
  • Journal article (peer-reviewed)
    • Abstract: This paper introduces a new publicly available image-based Swedish historical handwritten character and word dataset named Character Arkiv Digital Sweden (CArDIS) (https://cardisdataset.github.io/CARDIS/). The samples in CArDIS are collected from 64,084 Swedish historical documents written by several anonymous priests between 1800 and 1900. The character dataset contains 116,000 Swedish alphabet images in RGB color space with 29 classes, whereas the word dataset contains 30,000 image samples of ten popular Swedish names as well as 1,000 region names in Sweden. To examine the performance of different machine learning classifiers on the CArDIS dataset, three experiments are conducted. In the first experiment, classifiers such as Support Vector Machine (SVM), Artificial Neural Networks (ANN), k-Nearest Neighbor (k-NN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Random Forest (RF) are trained on existing character datasets, namely Extended Modified National Institute of Standards and Technology (EMNIST), IAM, and CVL, and tested on the CArDIS dataset. In the second and third experiments, the same classifiers as well as two pre-trained classifiers, VGG-16 and VGG-19, are trained and tested on the CArDIS character and word datasets. The experiments show that the machine learning methods trained on existing handwritten character datasets struggle to recognize characters efficiently on the CArDIS dataset, proving that characters in CArDIS contain unique features and characteristics. Moreover, in the last two experiments, the deep learning-based classifiers provide the best recognition rates.
2.
  • Cheddad, Abbas, et al. (authors)
  • SHIBR-The Swedish Historical Birth Records : a semi-annotated dataset
  • 2021
  • In: Neural Computing & Applications. - : Springer London. - 0941-0643 .- 1433-3058. ; 33:22, pp. 15863-15875
  • Journal article (peer-reviewed)
    • Abstract: This paper presents a digital image dataset of historical handwritten birth records stored in the archives of several parishes across Sweden, together with the corresponding metadata that supports the evaluation of document analysis algorithms' performance. The dataset is called SHIBR (the Swedish Historical Birth Records). The contribution of this paper is twofold. First, we believe it is the first and the largest Swedish dataset of its kind provided as open access (15,000 high-resolution colour images from the era between 1800 and 1840). We also perform some data mining of the dataset to uncover statistics and facts that might be of interest and use to genealogists. Second, we provide a comprehensive survey of contemporary datasets in the field that are open to the public, along with a compact review of word spotting techniques. The word transcription file contains 17 columns of information pertaining to each image (e.g., child's first name, birth date, date of baptism, father's first/last name, mother's first/last name, death records, town, job title of the father/mother, etc.). Moreover, we evaluate some deep learning models, pre-trained on two other renowned datasets, for word spotting in SHIBR. However, our dataset proved challenging due to the unique handwriting style. Therefore, the dataset could also be used for competitions dedicated to a large set of document analysis problems, including word spotting.
3.
  • Kusetogullari, Huseyin, et al. (authors)
  • DIGITNET : A Deep Handwritten Digit Detection and Recognition Method Using a New Historical Handwritten Digit Dataset
  • 2021
  • In: Big Data Research. - : Elsevier. - 2214-5796 .- 2214-580X. ; 23
  • Journal article (peer-reviewed)
    • Abstract: This paper introduces a novel deep learning architecture, named DIGITNET, and a large-scale digit dataset, named DIDA, to detect and recognize handwritten digits in historical document images written in the nineteenth century. To generate the DIDA dataset, digit images are collected from 100,000 Swedish handwritten historical document images, which were written by different priests with different handwriting styles. This dataset contains three sub-datasets: single digit, large-scale bounding box annotated multi-digit, and digit string, with 250,000, 25,000, and 200,000 samples in Red-Green-Blue (RGB) color space, respectively. Moreover, DIDA is used to train the DIGITNET network, which consists of two deep learning architectures, called DIGITNET-dect and DIGITNET-rec, to isolate digits and recognize digit strings in historical handwritten documents, respectively. In the DIGITNET-dect architecture, three residual units, each containing three convolutional neural network structures, are used to extract features from digits, and then a detection strategy based on the You Only Look Once (YOLO) algorithm is employed to detect handwritten digits at two different scales. In DIGITNET-rec, the detected isolated digits are passed through three differently designed Convolutional Neural Network (CNN) architectures, and the classification results of the three CNNs are then combined using a voting scheme to recognize digit strings. The proposed model is also trained with various existing handwritten digit datasets and then validated on historical handwritten digit strings. The experimental results show that the proposed architecture trained with DIDA (publicly available from: https://didadataset.github.io/DIDA/) outperforms the state-of-the-art methods.
4.
  • Liang, Xusheng, et al. (authors)
  • Comparative Study of Layout Analysis of Tabulated Historical Documents
  • 2021
  • In: Big Data Research. - : Elsevier Inc. - 2214-5796 .- 2214-580X. ; 24
  • Journal article (peer-reviewed)
    • Abstract: Nowadays, the field of multimedia retrieval systems has attracted a lot of attention, as it helps retrieve information more efficiently and accelerates daily tasks. Within this context, image processing techniques such as layout analysis and word recognition play an important role in transcribing content in printed or handwritten documents into digital data that can be further processed. This transcription procedure is called document digitization. This work stems from an industrial need: a Swedish company (ArkivDigital AB) has scanned more than 80 million pages of Swedish historical documents from all over the country, and there is a high demand to transcribe the contents into digital data. Such a process starts by locating the text, which, seen from another angle, is merely table layout analysis. In this study, the aim is to reveal the most effective solution for extracting document layout with respect to Swedish handwritten historical documents that are characterized by their tabular form. In short, the outcomes of public tools (i.e., Breuel's OCRopus method), traditional image processing techniques (e.g., Hessian/Gabor filters, Hough transform, Histograms of Oriented Gradients (HOG) features), and machine learning techniques (e.g., support vector machines, transfer learning) are studied and compared. Results show that the existing OCR tool cannot carry out the layout analysis task on our Swedish historical handwritten documents. Traditional image processing techniques are mildly capable of extracting the general table layout in these documents, but the accuracy is enhanced by introducing machine learning techniques. The best performing approach will be used in our future document mining research to allow for the development of scalable, resource-efficient systems for big data analytics. © 2021 Elsevier Inc.
5.
  • Kusetogullari, Hüseyin, 1981-, et al. (authors)
  • ARDIS : A Swedish Historical Handwritten Digit Dataset
  • 2020
  • In: Neural Computing & Applications. - : Springer Nature Switzerland. - 0941-0643 .- 1433-3058. ; 32:21, pp. 16505-16518
  • Journal article (peer-reviewed)
    • Abstract: This paper introduces a new image-based handwritten historical digit dataset named ARDIS (Arkiv Digital Sweden). The images in the ARDIS dataset are extracted from 15,000 Swedish church records which were written by different priests with various handwriting styles in the nineteenth and twentieth centuries. The constructed dataset consists of three single digit datasets and one digit strings dataset. The digit strings dataset includes 10,000 samples in Red-Green-Blue (RGB) color space, whereas the other datasets contain 7,600 single digit images in different color spaces. An extensive analysis of machine learning methods on several digit datasets is presented. Additionally, the correlation between ARDIS and the existing digit datasets Modified National Institute of Standards and Technology (MNIST) and United States Postal Service (USPS) is investigated. Experimental results show that machine learning algorithms, including deep learning methods, provide low recognition accuracy, as they face difficulties when trained on existing datasets and tested on the ARDIS dataset. Accordingly, a Convolutional Neural Network (CNN) trained on MNIST and USPS and tested on ARDIS provides the highest accuracies, 58.80% and 35.44%, respectively. Consequently, the results reveal that machine learning methods trained on existing datasets can have difficulty recognizing digits effectively on our dataset, which proves that the ARDIS dataset has unique characteristics. This dataset is publicly available for the research community to further advance handwritten digit recognition algorithms.
6.
  •  
7.
  •  
8.
  •  
9.
  •  
10.
  • Nivre, Joakim, 1962-, et al. (authors)
  • An Improved Oracle for Dependency Parsing with Online Reordering
  • 2009
  • In: Proceedings of the 11th International Conference on Parsing Technologies (IWPT). - Stroudsburg, PA, USA : Association for Computational Linguistics. ; , pp. 73-76
  • Conference paper (peer-reviewed)
    • Abstract: We present an improved training strategy for dependency parsers that use online reordering to handle non-projective trees. The new strategy improves both efficiency and accuracy by reducing the number of swap operations performed on non-projective trees by up to 80%. We present state-of-the-art results for five languages with the best ever reported results for Czech.
Publication type
conference paper (33)
journal article (7)
other publication (5)
report (3)
doctoral thesis (1)
book chapter (1)
licentiate thesis (1)
Content type
peer-reviewed (37)
other academic/artistic (7)
popular science, debate, etc. (7)
Author/editor
Hall, Johan (45)
Nivre, Joakim (33)
Nilsson, Jens (31)
Nivre, Joakim, 1962- (8)
Eryigit, Gülsen (6)
Hall, Johan, 1973- (5)
Yavariabdi, Amir (4)
Kübler, Sandra (4)
Marinov, Svetoslav (4)
Cheddad, Abbas (3)
Kusetogullari, Hüsey ... (3)
Nilsson, Jens, 1979- (3)
Nilsson, Mattias (2)
Megyesi, Beata (2)
Kuhlmann, Marco (2)
Yuret, Deniz (2)
Saers, Markus (2)
Lavelli, Alberto (2)
McDonald, Ryan (2)
Marsi, Erwin (2)
Chanev, Atanas (2)
Riedel, Sebastian (2)
Löwe, Welf (1)
Grahn, Håkan (1)
Lavesson, Niklas, Pr ... (1)
Lenci, Alessandro (1)
Nivre, Joakim, Profe ... (1)
Hilmkil, Agrin (1)
Sundin, Lena (1)
Aouache, Mustapha (1)
Löwe, Welf, Professo ... (1)
Nivre, Joakim, Profe ... (1)
Löwe, Welf, Professo ... (1)
Volk, Martin, Profes ... (1)
Kübler, Sandra, Assi ... (1)
Kusetogullari, Husey ... (1)
Johan, Hall (1)
Liang, Xusheng (1)
Celik, Turgay (1)
Simi, Maria (1)
Bosco, Cristina (1)
Montemagni, Simonett ... (1)
Mazzei, Alessandro (1)
Lombardo, Vincenzo (1)
Dell'Orletta, Felice (1)
Lesmo, Leonardo (1)
Attardi, Giuseppe (1)
Thummanapally, Shiva ... (1)
Rijwan, Sakib (1)
University
Linnéuniversitetet (26)
Uppsala universitet (21)
Blekinge Tekniska Högskola (5)
Högskolan i Skövde (2)
Linköpings universitet (1)
Jönköping University (1)
Language
English (51)
Research subject (UKÄ/SCB)
Natural sciences (38)
Engineering and technology (1)
Medical and health sciences (1)
