Classifying easy-to-read texts without parsing
- Falkenjack, Johan, 1986- (author)
- Linköpings universitet,Institutionen för datavetenskap,Tekniska högskolan
- Jönsson, Arne (author)
- Linköpings universitet,Institutionen för datavetenskap,Tekniska högskolan
- Association for Computational Linguistics, 2014
- English.
- In: Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR). Association for Computational Linguistics. ISBN 9781937284916, pp. 114-122
- Related links:
- https://urn.kb.se/re...
Abstract
- Document classification using automated linguistic analysis and machine learning (ML) has been shown to be a viable way forward for readability assessment. The best models can be trained to decide whether a text is easy to read with very high accuracy; for example, a model using 117 parameters from shallow, lexical, morphological and syntactic analyses achieves 98.9% accuracy. In this paper we compare models created by parameter optimization over subsets of that total model, to find out to what extent different high-performing models tend to consist of the same parameters and whether it is possible to find models that use only features not requiring parsing. We used a genetic algorithm to systematically optimize parameter sets of fixed sizes, using the accuracy of a Support Vector Machine classifier as the fitness function. Our results show that it is possible to find models almost as good as the currently best models while omitting parsing-based features.
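The search procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the operators (tournament selection, union-based crossover, single-swap mutation, elitism), all function names, and all default values are assumptions chosen for illustration, and `toy_fitness` is a stand-in for the Support Vector Machine accuracy that the paper uses as its fitness function.

```python
import random

def random_subset(n_features, k, rng):
    """Draw a random individual: a set of exactly k feature indices."""
    return frozenset(rng.sample(range(n_features), k))

def crossover(a, b, k, rng):
    """Child draws k features from the union of both parents, so size stays fixed."""
    return frozenset(rng.sample(sorted(a | b), k))

def mutate(subset, n_features, rate, rng):
    """With probability `rate`, swap one selected feature for an unselected one."""
    if rng.random() >= rate:
        return subset
    outside = sorted(set(range(n_features)) - subset)
    if not outside:
        return subset
    dropped = rng.choice(sorted(subset))
    added = rng.choice(outside)
    return (subset - {dropped}) | {added}

def tournament(population, scores, rng, size=3):
    """Pick the fittest of `size` randomly chosen individuals."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda i: scores[i])]

def optimize_subset(fitness, n_features, k, pop_size=30,
                    generations=40, mutation_rate=0.3, seed=0):
    """Search for a high-fitness feature subset of fixed size k."""
    rng = random.Random(seed)
    population = [random_subset(n_features, k, rng) for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        scores = [fitness(s) for s in population]
        for subset, score in zip(population, scores):
            if score > best_score:
                best, best_score = subset, score
        next_population = [best]  # elitism: carry the best subset forward
        while len(next_population) < pop_size:
            a = tournament(population, scores, rng)
            b = tournament(population, scores, rng)
            child = mutate(crossover(a, b, k, rng), n_features, mutation_rate, rng)
            next_population.append(child)
        population = next_population
    return best, best_score

# Hypothetical stand-in fitness: the fraction of "informative" features selected.
# In the setting of the paper this would instead return classifier accuracy.
INFORMATIVE = frozenset({0, 3, 5, 8})

def toy_fitness(subset):
    return len(subset & INFORMATIVE) / len(INFORMATIVE)

best, score = optimize_subset(toy_fitness, n_features=20, k=4)
```

To approximate the setup in the abstract, `fitness` would be replaced by a function that trains a Support Vector Machine on the selected feature columns only and returns its accuracy, with `n_features=117`. Restricting the candidate indices to features that do not require parsing would correspond to the parser-free models the paper investigates.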
Subject headings
- NATURVETENSKAP -- Data- och informationsvetenskap -- Språkteknologi (hsv//swe)
- NATURAL SCIENCES -- Computer and Information Sciences -- Language Technology (hsv//eng)
Keywords
- Readability
- Readability Assessment
- Genetic optimization
- Machine Learning
- Support Vector Machine
Publication and Content Type
- ref (peer-reviewed)
- kon (conference paper)