SwePub
Search the SwePub database


Hit list for search "WFRF:(Linusson Henrik)"

  • Results 1-10 of 21
1.
  • Ahlberg, Ernst, et al. (author)
  • Using conformal prediction to prioritize compound synthesis in drug discovery
  • 2017
  • In: Proceedings of Machine Learning Research. - Stockholm : Machine Learning Research. ; pp. 174-184
  • Conference paper (peer-reviewed), abstract:
    • The choice of how much money and resources to spend to understand certain problems is of high interest in many areas. This work illustrates how computational models can be more tightly coupled with experiments to generate decision data at lower cost without reducing the quality of the decision. Several different strategies are explored to illustrate the trade-off between lowering costs and maintaining decision quality. AUC is used as a performance metric, and the number of objects that can be learnt from is constrained. Some of the strategies described reach AUC values over 0.9 and outperform strategies that are more random. The strategies that use conformal predictor p-values show varying results, although some are top performing. The application studied is taken from the drug discovery process. In the early stages of this process, compounds that could potentially become marketed drugs are routinely tested in experimental assays to understand their distribution and interactions in humans.
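The abstract above uses AUC to compare decision strategies. As a minimal illustration (not the paper's code), AUC can be computed directly from ranked scores via the Mann-Whitney rank-sum identity:

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney identity: the probability
    that a randomly chosen positive is scored above a randomly chosen negative
    (ties count as half a win)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A strategy reaching AUC over 0.9, as reported above, thus ranks actives above inactives in over 90% of random positive-negative pairs.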
2.
  • Boström, Henrik, et al. (author)
  • Accelerating difficulty estimation for conformal regression forests
  • 2017
  • In: Annals of Mathematics and Artificial Intelligence. - : Springer Netherlands. - 1012-2443 .- 1573-7470. ; 81:1-2, pp. 125-144
  • Journal article (peer-reviewed), abstract:
    • The conformal prediction framework allows for specifying the probability of making incorrect predictions by a user-provided confidence level. In addition to a learning algorithm, the framework requires a real-valued function, called nonconformity measure, to be specified. The nonconformity measure does not affect the error rate, but the resulting efficiency, i.e., the size of output prediction regions, may vary substantially. A recent large-scale empirical evaluation of conformal regression approaches showed that using random forests as the learning algorithm together with a nonconformity measure based on out-of-bag errors normalized using a nearest-neighbor-based difficulty estimate, resulted in state-of-the-art performance with respect to efficiency. However, the nearest-neighbor procedure incurs a significant computational cost. In this study, a more straightforward nonconformity measure is investigated, where the difficulty estimate employed for normalization is based on the variance of the predictions made by the trees in a forest. A large-scale empirical evaluation is presented, showing that both the nearest-neighbor-based and the variance-based measures significantly outperform a standard (non-normalized) nonconformity measure, while no significant difference in efficiency between the two normalized approaches is observed. The evaluation moreover shows that the computational cost of the variance-based measure is several orders of magnitude lower than when employing the nearest-neighbor-based nonconformity measure. The use of out-of-bag instances for calibration does, however, result in nonconformity scores that are distributed differently from those obtained from test instances, questioning the validity of the approach. An adjustment of the variance-based measure is presented, which is shown to be valid and also to have a significant positive effect on the efficiency. For conformal regression forests, the variance-based nonconformity measure is hence a computationally efficient and theoretically well-founded alternative to the nearest-neighbor procedure.
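The variance-based normalization described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the per-instance tree predictions stand in for a real random forest, and the beta smoothing term is an assumption to avoid division by zero.

```python
import math
import statistics

def variance_nonconformity(y_true, tree_preds, beta=0.1):
    """Absolute error normalized by the spread of the per-tree predictions,
    which serves as the difficulty estimate for this instance."""
    mu = statistics.mean(tree_preds)
    sigma = statistics.stdev(tree_preds)
    return abs(y_true - mu) / (sigma + beta)

def prediction_interval(tree_preds, cal_scores, epsilon=0.1, beta=0.1):
    """Conformal interval: scale the calibration-score quantile by the
    test instance's own estimated difficulty."""
    s = sorted(cal_scores)
    k = math.ceil((1 - epsilon) * (len(s) + 1)) - 1
    alpha = s[min(k, len(s) - 1)]
    mu = statistics.mean(tree_preds)
    sigma = statistics.stdev(tree_preds)
    return mu - alpha * (sigma + beta), mu + alpha * (sigma + beta)
```

Easy instances (trees agree, small sigma) get tighter intervals than hard ones, which is the efficiency gain the abstract reports.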
3.
  • Boström, Henrik, et al. (author)
  • Evaluation of a variance-based nonconformity measure for regression forests
  • 2016
  • In: 5th International Symposium on Conformal and Probabilistic Prediction with Applications, COPA 2016. - Cham : Springer. - 9783319333946 - 9783319333953 ; pp. 75-89
  • Conference paper (peer-reviewed), abstract:
    • In a previous large-scale empirical evaluation of conformal regression approaches, random forests using out-of-bag instances for calibration together with a k-nearest neighbor-based nonconformity measure, was shown to obtain state-of-the-art performance with respect to efficiency, i.e., average size of prediction regions. However, the use of the nearest-neighbor procedure not only requires that all training data be retained in conjunction with the underlying model, but also incurs a significant computational overhead during both training and testing. In this study, a more straightforward nonconformity measure is investigated, where the difficulty estimate employed for normalization is based on the variance of the predictions made by the trees in a forest. A large-scale empirical evaluation is presented, showing that both the nearest-neighbor-based and the variance-based measures significantly outperform a standard (non-normalized) nonconformity measure, while no significant difference in efficiency between the two normalized approaches is observed. Moreover, the evaluation shows that state-of-the-art performance is achieved by the variance-based measure at a computational cost that is several orders of magnitude lower than when employing the nearest-neighbor-based nonconformity measure.
4.
  • Boström, Henrik, et al. (author)
  • Mondrian Predictive Systems for Censored Data
  • 2023
  • In: Proceedings of the 12th Symposium on Conformal and Probabilistic Prediction with Applications, COPA 2023. - : ML Research Press. ; pp. 399-412
  • Conference paper (peer-reviewed), abstract:
    • Conformal predictive systems output predictions in the form of well-calibrated cumulative distribution functions (conformal predictive distributions). In this paper, we apply conformal predictive systems to the problem of time-to-event prediction, where the conformal predictive distribution for a test object may be used to obtain the expected time until an event occurs, as well as p-values for an event to take place earlier (or later) than some specified time points. Specifically, we target right-censored time-to-event prediction tasks, i.e., situations in which the true time-to-event for a particular training example may be unknown due to observation of the example ending before any event occurs. By leveraging the Kaplan-Meier estimator, we develop a procedure for constructing Mondrian predictive systems that are able to produce well-calibrated cumulative distribution functions for right-censored time-to-event prediction tasks. We show that the proposed procedure is guaranteed to produce conservatively valid predictive distributions, and provide empirical support using simulated censoring on benchmark data. The proposed approach is contrasted with established techniques for survival analysis, including random survival forests and censored quantile regression forests, using both synthetic and non-synthetic censoring.
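The Kaplan-Meier estimator that the procedure above leverages can be sketched in a few lines (an illustrative stand-alone implementation, not the paper's code):

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve. times: observed times; events: 1 if the
    event occurred, 0 if the observation was right-censored. Returns a list of
    (event_time, survival_probability) pairs; survival drops only at event
    times, while censored observations just leave the risk set."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = censored = 0
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= deaths + censored
    return curve
```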
5.
  • Carlsson, Lars, et al. (author)
  • Modifications to p-Values of conformal predictors
  • 2015
  • In: Statistical learning and data sciences. - Cham : Springer. - 9783319170909 - 9783319170916 ; pp. 251-259
  • Conference paper (peer-reviewed), abstract:
    • The original definition of a p-value in a conformal predictor can sometimes lead to overly conservative prediction regions when the number of training or calibration examples is small. The situation can be improved by using a modification to define an approximate p-value. Two modified p-values are presented that converge to the original p-value as the number of training or calibration examples goes to infinity. Numerical experiments empirically support the use of a p-value we call the interpolated p-value for conformal prediction. The interpolated p-value appears to produce prediction sets whose error rate corresponds well to the prescribed significance level.
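The standard conformal p-value that the paper modifies is simple to state (a minimal sketch; the interpolated variant proposed above essentially replaces this step function with linear interpolation between neighbouring calibration scores):

```python
def conformal_p_value(cal_scores, test_score):
    """Standard conformal p-value: the fraction of nonconformity scores,
    test object included, at least as nonconforming as the test object."""
    n_at_least = sum(1 for a in cal_scores if a >= test_score)
    return (n_at_least + 1) / (len(cal_scores) + 1)
```

With n calibration scores this p-value can only move in steps of 1/(n + 1), which is what makes small calibration sets overly conservative and motivates the approximate p-values studied above.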
6.
  • Johansson, Ulf, et al. (author)
  • Efficient Venn predictors using random forests
  • 2019
  • In: Machine Learning. - : Springer. - 0885-6125 .- 1573-0565. ; 108:3, pp. 535-550
  • Journal article (peer-reviewed), abstract:
    • Successful use of probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. In addition, a probabilistic classifier must, of course, also be as accurate as possible. In this paper, Venn predictors, and their special case Venn-Abers predictors, are evaluated for probabilistic classification, using random forests as the underlying models. Venn predictors output multiple probabilities for each label, i.e., the predicted label is associated with a probability interval. Since all Venn predictors are valid in the long run, the size of the probability intervals is very important, with tighter intervals being more informative. The standard solution when calibrating a classifier is to employ an additional step, transforming the outputs from a classifier into probability estimates, using a labeled data set not employed for training of the models. For random forests, and other bagged ensembles, it is, however, possible to use the out-of-bag instances for calibration, making all training data available for both model learning and calibration. This procedure has previously been successfully applied to conformal prediction, but was here evaluated for the first time for Venn predictors. The empirical investigation, using 22 publicly available data sets, showed that all four versions of the Venn predictors were better calibrated than both the raw estimates from the random forest, and the standard techniques Platt scaling and isotonic regression. Regarding both informativeness and accuracy, the standard Venn predictor calibrated on out-of-bag instances was the best setup evaluated. Most importantly, calibrating on out-of-bag instances, instead of using a separate calibration set, resulted in tighter intervals and more accurate models on every data set, for both the Venn predictors and the Venn-Abers predictors.
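A Venn predictor's multiprobability output can be illustrated with the simplest possible taxonomy, grouping objects by the underlying model's predicted label. This is a toy sketch for binary labels, not the paper's random-forest setup:

```python
def simple_venn(calibration, test_pred):
    """calibration: list of (predicted_label, true_label) pairs.
    For each hypothesized label of the test object, place it in its taxonomy
    category (objects sharing its predicted label) and report that category's
    empirical frequency of class 1. The returned (lower, upper) interval is
    the Venn predictor's multiprobability for class 1."""
    probs = []
    for assumed in (0, 1):
        augmented = calibration + [(test_pred, assumed)]
        category = [t for p, t in augmented if p == test_pred]
        probs.append(sum(category) / len(category))
    return min(probs), max(probs)
```

The interval's width shrinks as the category grows, mirroring the abstract's point that tighter intervals are more informative while validity holds regardless.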
7.
  • Johansson, Ulf, et al. (author)
  • Handling small calibration sets in Mondrian inductive conformal regressors
  • 2015
  • In: Statistical Learning and Data Sciences. - Cham : Springer. - 9783319170909 ; pp. 271-280
  • Conference paper (peer-reviewed), abstract:
    • In inductive conformal prediction, calibration sets must contain an adequate number of instances to support the chosen confidence level. This problem is particularly prevalent when using Mondrian inductive conformal prediction, where the input space is partitioned into independently valid prediction regions. In this study, Mondrian conformal regressors, in the form of regression trees, are used to investigate two problematic aspects of small calibration sets. If there are too few calibration instances to support the significance level, we suggest using either extrapolation or altering the model. In situations where the desired significance level is between two calibration instances, the standard procedure is to choose the more nonconforming one, thus guaranteeing validity, but producing conservative conformal predictors. The suggested solution is to use interpolation between calibration instances. All proposed techniques are empirically evaluated and compared to the standard approach on 30 benchmark data sets. The results show that while extrapolation often results in invalid models, interpolation works extremely well and provides increased efficiency with preserved empirical validity.
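The interpolation idea above can be sketched for the quantile-selection step of an inductive conformal regressor (illustrative only; the paper's exact definitions may differ):

```python
import math

def calibration_alpha(scores, epsilon, interpolate=False):
    """Pick the calibration nonconformity score that determines interval width.
    Standard: round the real-valued rank up to the next (more nonconforming)
    score, which is valid but conservative. Interpolated: take a linear
    interpolation between the two neighbouring calibration scores."""
    s = sorted(scores)
    n = len(s)
    rank = (1 - epsilon) * (n + 1)      # 1-indexed real-valued rank
    if rank > n:
        raise ValueError("calibration set too small for this significance level")
    if not interpolate:
        return s[math.ceil(rank) - 1]   # conservative rounding
    lower = math.floor(rank)
    frac = rank - lower
    if frac == 0:
        return s[lower - 1]
    return s[lower - 1] + frac * (s[lower] - s[lower - 1])
```

When the desired rank falls between two calibration scores, interpolation yields a smaller alpha, and hence tighter intervals, than conservative rounding.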
8.
  • Johansson, Ulf, et al. (author)
  • Interpretable regression trees using conformal prediction
  • 2018
  • In: Expert Systems with Applications. - : Elsevier. - 0957-4174 .- 1873-6793. ; 97, pp. 394-404
  • Journal article (peer-reviewed), abstract:
    • A key property of conformal predictors is that they are valid, i.e., their error rate on novel data is bounded by a preset level of confidence. For regression, this is achieved by turning the point predictions of the underlying model into prediction intervals. Thus, the most important performance metric for evaluating conformal regressors is not the error rate, but the size of the prediction intervals, where models generating smaller (more informative) intervals are said to be more efficient. State-of-the-art conformal regressors typically utilize two separate predictive models: the underlying model providing the center point of each prediction interval, and a normalization model used to scale each prediction interval according to the estimated level of difficulty for each test instance. When using a regression tree as the underlying model, this approach may cause test instances falling into a specific leaf to receive different prediction intervals. This clearly deteriorates the interpretability of a conformal regression tree compared to a standard regression tree, since the path from the root to a leaf can no longer be translated into a rule explaining all predictions in that leaf. In fact, the model cannot even be interpreted on its own, i.e., without reference to the corresponding normalization model. Current practice effectively presents two options for constructing conformal regression trees: to employ a (global) normalization model, and thereby sacrifice interpretability; or to avoid normalization, and thereby sacrifice both efficiency and individualized predictions. In this paper, two additional approaches are considered, both employing local normalization: the first approach estimates the difficulty by the standard deviation of the target values in each leaf, while the second approach employs Mondrian conformal prediction, which results in regression trees where each rule (path from root node to leaf node) is independently valid. An empirical evaluation shows that the first approach is as efficient as current state-of-the-art approaches, thus eliminating the efficiency vs. interpretability trade-off present in existing methods. Moreover, it is shown that if a validity guarantee is required for each single rule, as provided by the Mondrian approach, a penalty with respect to efficiency has to be paid, but it is only substantial at very high confidence levels.
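The Mondrian approach above, where each leaf is its own conformal category, can be sketched as follows (a simplified illustration with hypothetical leaf identifiers, not the authors' code):

```python
import math

def mondrian_leaf_intervals(calibration, test, epsilon=0.1):
    """calibration: (leaf_id, absolute_residual) pairs; test: (leaf_id,
    point_prediction) pairs. Each leaf is calibrated only on its own residuals,
    so the rule behind every leaf carries its own validity guarantee and every
    instance in a leaf gets the same interval width."""
    residuals = {}
    for leaf, r in calibration:
        residuals.setdefault(leaf, []).append(r)
    intervals = []
    for leaf, pred in test:
        s = sorted(residuals[leaf])
        k = math.ceil((1 - epsilon) * (len(s) + 1)) - 1
        half_width = s[min(k, len(s) - 1)]
        intervals.append((pred - half_width, pred + half_width))
    return intervals
```

Because only the residuals of the matching leaf are consulted, the path from root to leaf still explains every prediction made in that leaf, which is the interpretability property the paper is after.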
9.
  • Johansson, Ulf, et al. (author)
  • Regression conformal prediction with random forests
  • 2014
  • In: Machine Learning. - : Springer-Verlag New York. - 0885-6125 .- 1573-0565. ; 97:1-2, pp. 155-176
  • Journal article (peer-reviewed), abstract:
    • Regression conformal prediction produces prediction intervals that are valid, i.e., the probability of excluding the correct target value is bounded by a predefined confidence level. The most important criterion when comparing conformal regressors is efficiency; the prediction intervals should be as tight (informative) as possible. In this study, the use of random forests as the underlying model for regression conformal prediction is investigated and compared to existing state-of-the-art techniques, which are based on neural networks and k-nearest neighbors. In addition to their robust predictive performance, random forests allow for determining the size of the prediction intervals by using out-of-bag estimates instead of requiring a separate calibration set. An extensive empirical investigation, using 33 publicly available data sets, was undertaken to compare the use of random forests to existing state-of-the-art conformal predictors. The results show that the suggested approach, on almost all confidence levels and using both standard and normalized nonconformity functions, produced significantly more efficient conformal predictors than the existing alternatives.
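The out-of-bag calibration trick above can be illustrated with a toy bagged ensemble whose "trees" are just means of bootstrap samples (a stand-in for real regression trees; all names here are hypothetical):

```python
import random
import statistics

def oob_nonconformity(y, n_models=50, seed=0):
    """For each training instance, predict using only the bootstrap models
    whose sample did not include it, and return the absolute out-of-bag
    residuals. These serve as calibration scores, so no separate calibration
    set has to be carved out of the training data."""
    rng = random.Random(seed)
    n = len(y)
    samples = [[rng.randrange(n) for _ in range(n)] for _ in range(n_models)]
    preds = [statistics.mean(y[i] for i in s) for s in samples]
    scores = []
    for i in range(n):
        oob_preds = [p for p, s in zip(preds, samples) if i not in s]
        if oob_preds:  # roughly 37% of models are out-of-bag for each instance
            scores.append(abs(y[i] - statistics.mean(oob_preds)))
    return scores
```

With a real random forest the same bookkeeping applies per tree, which is what lets all training data be used for both model fitting and calibration.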
10.
  • Johansson, Ulf, et al. (author)
  • Regression Trees for Streaming Data with Local Performance Guarantees
  • 2014
  • Conference paper (peer-reviewed), abstract:
    • Online predictive modeling of streaming data is a key task for big data analytics. In this paper, a novel approach for efficient online learning of regression trees is proposed, which continuously updates, rather than retrains, the tree as more labeled data become available. A conformal predictor outputs prediction sets instead of point predictions, which for regression translates into prediction intervals. The key property of a conformal predictor is that it is always valid, i.e., the error rate, on novel data, is bounded by a preset significance level. Here, we suggest applying Mondrian conformal prediction on top of the resulting models, in order to obtain regression trees where not only the tree, but also each and every rule, corresponding to a path from the root node to a leaf, is valid. Using Mondrian conformal prediction, it becomes possible to analyze and explore the different rules separately, knowing that their accuracy, in the long run, will not be below the preset significance level. An empirical investigation, using 17 publicly available data sets, confirms that the resulting rules are independently valid, but also shows that the prediction intervals are smaller, on average, than when only the global model is required to be valid. All in all, the suggested method provides a data miner or a decision maker with highly informative predictive models of streaming data.