Did the Names I Used within My Essay Affect My Score? Diagnosing Name Biases in Automated Essay Scoring
- Muñoz Sánchez, Ricardo, 1992 (author)
- Gothenburg University (Göteborgs universitet), Språkbanken Text, Department of Swedish, Multilingualism, Language Technology
- Dobnik, Simon, 1977 (author)
- Gothenburg University (Göteborgs universitet), Department of Philosophy, Linguistics and Theory of Science
- Szawerna, Maria Irena (author)
- Gothenburg University (Göteborgs universitet), Språkbanken Text, Department of Swedish, Multilingualism, Language Technology
- Lindström Tiedemann, Therese, 1976 (author)
- Volodina, Elena, 1973 (author)
- Gothenburg University (Göteborgs universitet), Språkbanken Text, Department of Swedish, Multilingualism, Language Technology
- Association for Computational Linguistics, 2024
- English.
In: Proceedings of the Workshop on Computational Approaches to Language Data Pseudonymization (CALD-pseudo 2024). Association for Computational Linguistics.
- Related link: https://gup.ub.gu.se...
Abstract
- Automated essay scoring (AES) of second-language learner essays is a high-stakes task, as it can affect the job and educational opportunities available to a student. It is therefore imperative that essays are graded on the students' language proficiency rather than on other factors, such as the personal names used in the text of the essay. Moreover, most of the research data for AES tends to contain personally identifiable information, which makes pseudonymization an important tool for ensuring that this data can be freely shared. Thus, our systems should not grade students based on which given names appear in the text of the essay, for reasons of both fairness and privacy. In this paper we explore how given names affect the CEFR level classification of essays by second-language learners of Swedish. We use essays containing just one personal name and replace it with names drawn from lists of given names of four different ethnic origins, namely Swedish, Finnish, Anglo-American, and Arabic. We find that changing the names within the essays has no apparent effect on the classification task, regardless of whether a feature-based or a transformer-based model is used.
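The substitution procedure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the name lists are hypothetical placeholders (the lists used in the paper are not reproduced in this record), `substitute_name` and `changed_predictions` are names invented for this sketch, and `classify` stands in for any CEFR-level classifier, feature-based or transformer-based.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical placeholder lists, NOT the name lists used in the paper.
NAME_LISTS: Dict[str, List[str]] = {
    "Swedish": ["Erik", "Anna"],
    "Finnish": ["Mikko", "Aino"],
    "Anglo-American": ["John", "Emily"],
    "Arabic": ["Omar", "Fatima"],
}


def substitute_name(essay: str, original: str, replacement: str) -> str:
    """Replace whole-word occurrences of `original` with `replacement`."""
    pattern = re.compile(r"\b" + re.escape(original) + r"\b")
    # A function is used as the replacement so that it is inserted literally.
    return pattern.sub(lambda _match: replacement, essay)


def changed_predictions(
    essay: str,
    original: str,
    classify: Callable[[str], str],
    name_lists: Dict[str, List[str]] = NAME_LISTS,
) -> Dict[Tuple[str, str], str]:
    """Return the substitutions whose predicted label differs from the baseline.

    `classify` is an assumed interface: it maps essay text to a CEFR label.
    An empty result means no substitution changed the prediction.
    """
    baseline = classify(essay)
    changed: Dict[Tuple[str, str], str] = {}
    for origin, names in name_lists.items():
        for name in names:
            label = classify(substitute_name(essay, original, name))
            if label != baseline:
                changed[(origin, name)] = label
    return changed
```

With a classifier that ignores names entirely, `changed_predictions` returns an empty dictionary, which corresponds to the "no apparent effect" outcome reported above. A real evaluation would aggregate over many essays and compare label distributions rather than inspect a single text.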
Subject terms
- NATURAL SCIENCES -- Computer and Information Sciences -- Language Technology (hsv//eng)
Keywords
- bias and fairness
- nlp
- natural language processing
- pseudonymization
- automated essay scoring
- second language assessment
Publication and content type
- ref (subject category)
- kon (subject category)