2.
- Falkenjack, Johan, et al. (author)
- Implicit readability ranking using the latent variable of a Bayesian Probit model
- 2016
- In: CL4LC 2016 - Computational Linguistics for Linguistic Complexity. - Uppsala universitet Humanistisk-samhällsvetenskapliga vetenskapsområdet. - ISBN 9784879747099, pp. 104-112
- Conference paper (peer-reviewed)
- Abstract: Data-driven approaches to readability analysis for languages other than English have been plagued by a scarcity of suitable corpora. Often, relevant corpora consist only of easy-to-read texts with no rank information or empirical readability scores, making only binary approaches, such as classification, applicable. We propose a Bayesian latent variable approach to get the most out of these kinds of corpora. In this paper we present results on using such a model for readability ranking. The model is evaluated on a preliminary corpus of ranked student texts with encouraging results. We also assess the model by showing that it performs readability classification on par with a state-of-the-art classifier while at the same time being transparent enough to allow more sophisticated interpretations.