SwePub

  • Alfter, David, 1986, Gothenburg University (Göteborgs universitet), Department of Swedish (Institutionen för svenska språket) (author)

Exploring natural language processing for single-word and multi-word lexical complexity from a second language learner perspective

  • Book, English, 2021

Publisher, publication year, extent ...

  • Göteborgs universitet, 2021

Numbers

  • LIBRIS-ID: oai:gup.ub.gu.se/304548
  • ISBN: 9789187850790
  • hdl: 2077/66861
  • URI: https://gup.ub.gu.se/publication/304548

Supplementary language notes

  • Language: English

Classification

  • Subject category: vet (swepub-contenttype)
  • Subject category: dok (swepub-publicationtype)

Series

  • Data linguistica, ISSN 0347-948X

Notes

  • In this thesis, we investigate how natural language processing (NLP) tools and techniques can be applied to vocabulary aimed at second language learners of Swedish, in order to classify vocabulary items into proficiency levels suitable for learners at different stages. In the first part, we use feature engineering to represent words as vectors and feed these vectors into machine learning algorithms in order to (1) learn CEFR labels from the input data and (2) predict the CEFR level of unseen words. Our experiments corroborate the finding that feature-based classification models using 'traditional' machine learning still outperform deep learning architectures in the task of deciding how complex a word is. In the second part, we use crowdsourcing to generate ranked lists of multi-word expressions, drawing on both experts and non-experts (i.e. language learners). Our experiment shows that non-expert and expert rankings are highly correlated, suggesting that non-expert intuition can be seen as on par with expert knowledge, at least in the chosen experimental configuration. The main practical output of this research comes in two forms: prototypes and resources. We have implemented various prototype applications for (1) the automatic prediction of words based on the feature-engineering machine learning method, (2) language learning applications using graded word lists, and (3) an annotation tool for the manual annotation of expressions across a variety of linguistic factors.
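
    The comparison of expert and non-expert rankings mentioned in the abstract can be illustrated with a rank correlation. The record does not say which correlation measure the thesis uses, so the sketch below assumes Spearman's rho over two tie-free rankings of the same items; the function names and the placeholder item labels (`mwe_1` and so on) are hypothetical and are not taken from the thesis.

```python
def ranks(items_in_order):
    """Map each item to its 1-based position in a ranked list."""
    return {item: pos for pos, item in enumerate(items_in_order, start=1)}


def spearman_rho(ranking_a, ranking_b):
    """Spearman's rho for two rankings of the same items (no ties).

    Assumption: each ranking is a list ordered from easiest to hardest
    and contains exactly the same items, each appearing once.
    """
    items = set(ranking_a)
    if (items != set(ranking_b)
            or len(items) != len(ranking_a)
            or len(items) != len(ranking_b)):
        raise ValueError("rankings must contain the same unique items")
    n = len(ranking_a)
    if n < 2:
        raise ValueError("need at least two items to correlate")
    rank_a, rank_b = ranks(ranking_a), ranks(ranking_b)
    # Sum of squared rank differences, then the closed form for tie-free data.
    d_squared = sum((rank_a[x] - rank_b[x]) ** 2 for x in ranking_a)
    return 1 - (6 * d_squared) / (n * (n * n - 1))


# Hypothetical rankings of five placeholder expressions.
expert = ["mwe_1", "mwe_2", "mwe_3", "mwe_4", "mwe_5"]
non_expert = ["mwe_1", "mwe_3", "mwe_2", "mwe_4", "mwe_5"]
rho = spearman_rho(expert, non_expert)
print(round(rho, 3))
```

    A value close to 1 means the two groups order the expressions almost identically; a value close to -1 means the orders are reversed.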

Added entries (persons, corporate bodies, meetings, titles ...)

  • Göteborgs universitet, Institutionen för svenska språket (creator_code:org_t)
