SwePub

id:"swepub:oai:DiVA.org:uu-474816"
 


Arabic named entity recognition in social media based on BiLSTM-CRF using an attention mechanism

Benali, B. Ait (author)
Hassan First Univ Settat, Fac Sci & Tech, IR2M Lab, Settat, Morocco.
Mihi, S. (author)
Hassan First Univ Settat, Fac Sci & Tech, IR2M Lab, Settat, Morocco.
Ait-Mlouk, Addi (author)
Uppsala universitet, Institutionen för informationsteknologi
El Bazi, I (author)
Sultan Moulay Slimane Univ, Natl Sch Business & Management, Beni Mellal, Morocco.
Laachfoubi, N. (author)
Hassan First Univ Settat, Fac Sci & Tech, IR2M Lab, Settat, Morocco.
IOS Press, 2022
English.
In: Journal of Intelligent & Fuzzy Systems. IOS Press. ISSN 1064-1246, 1875-8967. 42:6, pp. 5427-5436
  • Journal article (peer-reviewed)
Abstract
  • Named Entity Recognition (NER) is a key task in Natural Language Processing (NLP) that aims to find named entities in natural language text and classify them into predefined categories such as persons (PER), places (LOC), and organizations (ORG). For Arabic, current deep-learning NER approaches mainly take either word embeddings or character-level embeddings as input. However, a single-granularity representation suffers from out-of-vocabulary (OOV) words, word embedding errors, and relatively simple semantic content. This paper presents a multi-head self-attention mechanism implemented in the BiLSTM-CRF neural network structure to recognize Arabic named entities on social media using two embeddings. Unlike other state-of-the-art approaches, this approach combines character and word embeddings at the embedding layer, and the attention mechanism calculates the similarity over the entire sequence of characters and captures local context information. The proposed approach better recognizes NEs in Dialectal Arabic, reaching an F1 score of 74.15% on Darwish's dataset (a publicly available Arabic NER benchmark for social media). To the best of our knowledge, these results outperform the current state-of-the-art models for Arabic Named Entity Recognition on social media.
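The two ideas named in the abstract, concatenating word and character embeddings per token and letting self-attention score each position against the entire sequence, can be illustrated with a small pure-Python sketch. This is not the authors' implementation: it uses a single head with identity query/key/value projections (no learned weights), omits the BiLSTM and CRF layers, and the function names and toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def concat_embeddings(word_vecs, char_vecs):
    """Join each token's word-level and character-level vectors (hypothetical helper)."""
    return [w + c for w, c in zip(word_vecs, char_vecs)]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the token vectors themselves;
    a trained model would apply learned projections and several heads.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this position with every position in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of all token vectors.
        outputs.append([
            sum(row[j] * tokens[j][i] for j in range(len(tokens)))
            for i in range(dim)
        ])
    return outputs, weights

# Toy inputs: three tokens, 2-dim word vectors and 1-dim character vectors.
word_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
char_vecs = [[0.5], [0.5], [0.0]]
tokens = concat_embeddings(word_vecs, char_vecs)
attended, weights = self_attention(tokens)
```

Each row of `weights` is a probability distribution over all positions in the sequence. In a full model of the kind the abstract describes, such attended vectors would be combined with a BiLSTM encoder and decoded by a CRF layer.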

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences -- Language Technology (hsv//eng)

Keywords

Arabic named entity recognition (ANER)
natural language processing (NLP)
multi-head self-attention
BiLSTM
CRF
dialect Arabic
social media

Publication and content type

ref (subject category)
art (subject category)
