Exploring natural language processing for single-word and multi-word lexical complexity from a second language learner perspective


Alfter, David, 1986 (author); Gothenburg University (Göteborgs universitet), Department of Swedish (Institutionen för svenska språket)

Exploring natural language processing for single-word and multi-word lexical complexity from a second language learner perspective

Book, English, 2021

Publisher, year of publication, extent ...

Göteborgs universitet, 2021

Identifiers

LIBRIS-ID: oai:gup.ub.gu.se/304548
ISBN: 9789187850790
hdl: 2077/66861
URI: https://gup.ub.gu.se/publication/304548

Additional language information

Language: English

Included in sub-database

SwePub

Classification

Subject category: vet (swepub-contenttype)
Subject category: dok (swepub-publicationtype)

Series

Data linguistica, ISSN 0347-948X

Notes

In this thesis, we investigate how natural language processing (NLP) tools and techniques can be applied to vocabulary aimed at second language learners of Swedish, in order to classify vocabulary items into proficiency levels suited to learners at different stages. In the first part, we use feature engineering to represent words as vectors and feed these vectors into machine learning algorithms in order to (1) learn Common European Framework of Reference for Languages (CEFR) labels from the input data and (2) predict the CEFR level of unseen words. Our experiments corroborate the finding that feature-based classification models using 'traditional' machine learning still outperform deep learning architectures in the task of deciding how complex a word is. In the second part, we use crowdsourcing as a technique to generate ranked lists of multi-word expressions using both experts and non-experts (i.e. language learners). Our experiment shows that non-expert and expert rankings are highly correlated, suggesting that non-expert intuition can be seen as on par with expert knowledge, at least in the chosen experimental configuration. The main practical output of this research comes in two forms: prototypes and resources. We have implemented various prototype applications for (1) the automatic prediction of words based on the feature-engineering machine learning method, (2) language learning applications using graded word lists, and (3) an annotation tool for the manual annotation of expressions across a variety of linguistic factors.
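
The feature-based approach described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the three surface features in `featurize`, the toy training words and their CEFR labels, and the nearest-centroid classifier, which stands in here for whichever 'traditional' machine learning algorithms the author actually used.

```python
from collections import defaultdict
from math import dist

# Swedish vowels, used for one of the illustrative surface features.
VOWELS = set("aeiouyåäö")


def featurize(word):
    """Map a word to a small hand-crafted feature vector.

    The features (length, vowel count, distinct characters) are
    hypothetical stand-ins, not the feature set of the thesis.
    """
    w = word.lower()
    n_vowels = sum(ch in VOWELS for ch in w)
    return [float(len(w)), float(n_vowels), float(len(set(w)))]


def fit_centroids(labelled_words):
    """Average the feature vectors of the training words per CEFR label."""
    grouped = defaultdict(list)
    for word, level in labelled_words:
        grouped[level].append(featurize(word))
    return {
        level: [sum(column) / len(vectors) for column in zip(*vectors)]
        for level, vectors in grouped.items()
    }


def predict(word, centroids):
    """Return the CEFR label whose centroid is closest to the word's vector."""
    vector = featurize(word)
    return min(centroids, key=lambda level: dist(vector, centroids[level]))


# Toy training data; the level assignments are invented for illustration.
TRAIN = [
    ("hus", "A1"), ("bil", "A1"), ("katt", "A1"),
    ("universitet", "B2"), ("sammanhang", "B2"), ("utveckling", "B2"),
]

centroids = fit_centroids(TRAIN)
print(predict("bok", centroids))            # → A1
print(predict("förutsättning", centroids))  # → B2
```

The sketch only mirrors the two steps named in the abstract, learning labels from labelled input and predicting the level of unseen words. A realistic system would need far richer features and a properly graded word list.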

Subject headings and genre terms

NATURAL SCIENCES; Computer and Information Sciences; Language Technology (hsv)
HUMANITIES; Languages and Literature; General Language Studies and Linguistics (hsv)
natural language processing
lexical complexity
CEFR
second language learning
machine learning
crowdsourcing

Added entries (persons, institutions, conferences, titles ...)

Göteborgs universitet, Institutionen för svenska språket (creator_code:org_t)

Internet link

https://gup.ub.gu.se/publication/304548
