SwePub

Result list for search "L773:0957 4174 OR L773:1873 6793"

  • Results 1-10 of 108
1.
  • Afzal, Wasif, et al. (author)
  • On the application of genetic programming for software engineering predictive modeling : A systematic review
  • 2011
  • In: Expert Systems with Applications. Pergamon-Elsevier Science Ltd. ISSN 0957-4174, 1873-6793. 38:9, pp. 11984-11997
  • Research review (peer-reviewed). Abstract:
    • The objective of this paper is to investigate the evidence for symbolic regression using genetic programming (GP) being an effective method for prediction and estimation in software engineering, when compared with regression/machine learning models and other comparison groups (including comparisons with different improvements over the standard GP algorithm). We performed a systematic review of literature that compared genetic programming models with comparative techniques based on different independent project variables. A total of 23 primary studies were obtained after searching different information sources in the time span 1995-2008. The results of the review show that symbolic regression using genetic programming has been applied in three domains within software engineering predictive modeling: (i) software quality classification (eight primary studies), (ii) software cost/effort/size estimation (seven primary studies), and (iii) software fault prediction/software reliability growth modeling (eight primary studies). While there is evidence in support of using genetic programming for software quality classification, software fault prediction and software reliability growth modeling, the results are inconclusive for software cost/effort/size estimation.
2.
  • Aler, Ricardo, et al. (author)
  • Study of Hellinger Distance as a splitting metric for Random Forests in balanced and imbalanced classification datasets
  • 2020
  • In: Expert Systems with Applications. Elsevier. ISSN 0957-4174, 1873-6793. 149
  • Journal article (peer-reviewed). Abstract:
    • Hellinger Distance (HD) is a splitting metric that has been shown to have excellent performance for imbalanced classification problems for methods based on Bagging of trees, while also showing good performance for balanced problems. Given that Random Forests (RF) use Bagging as one of two fundamental techniques to create diversity in the ensemble, it could be expected that HD is also effective for this ensemble method. The main aim of this article is to carry out an extensive investigation of important aspects of the use of HD in RF, including handling of multi-class problems, hyper-parameter optimization, metrics comparison, probability estimation, and metrics combination. In particular, HD is compared to other commonly used splitting metrics (Gini and Gain Ratio) in several contexts: balanced/imbalanced and two-class/multi-class. Two aspects related to classification problems are assessed: classification itself and probability estimation. HD is defined for two-class problems, but there are several ways in which it can be extended to deal with multi-class problems, and this article studies the performance of the available options. Finally, even though HD can be used as an alternative to other splitting metrics, there is no reason to limit RF to use just one of them. Therefore, the final study of this article is to determine whether selecting the splitting metric using cross-validation on the training data can improve results further. Results show HD to be a robust measure for RF, with some weakness for balanced multi-class datasets (especially for probability estimation). Combining metrics results in more robust performance. However, experiments of HD with text datasets show Gini to be more suitable than HD for this kind of problem.
3.
  • Altarabichi, Mohammed Ghaith, 1981-, et al. (author)
  • Fast Genetic Algorithm for feature selection — A qualitative approximation approach
  • 2023
  • In: Expert Systems with Applications. Oxford: Elsevier. ISSN 0957-4174, 1873-6793. 211
  • Journal article (peer-reviewed). Abstract:
    • Evolutionary Algorithms (EAs) are often challenging to apply in real-world settings since evolutionary computations involve a large number of evaluations of a typically expensive fitness function. For example, an evaluation could involve training a new machine learning model. An approximation (also known as meta-model or a surrogate) of the true function can be used in such applications to alleviate the computation cost. In this paper, we propose a two-stage surrogate-assisted evolutionary approach to address the computational issues arising from using Genetic Algorithm (GA) for feature selection in a wrapper setting for large datasets. We define “Approximation Usefulness” to capture the necessary conditions to ensure correctness of the EA computations when an approximation is used. Based on this definition, we propose a procedure to construct a lightweight qualitative meta-model by the active selection of data instances. We then use a meta-model to carry out the feature selection task. We apply this procedure to the GA-based algorithm CHC (Cross generational elitist selection, Heterogeneous recombination and Cataclysmic mutation) to create a Qualitative approXimations variant, CHCQX. We show that CHCQX converges faster to feature subset solutions of significantly higher accuracy (as compared to CHC), particularly for large datasets with over 100K instances. We also demonstrate the applicability of the thinking behind our approach more broadly to Swarm Intelligence (SI), another branch of the Evolutionary Computation (EC) paradigm with results of PSOQX, a qualitative approximation adaptation of the Particle Swarm Optimization (PSO) method. A GitHub repository with the complete implementation is available. © 2022 The Author(s)
4.
  • Argyrou, Argyris, et al. (author)
  • A semi-supervised tool for clustering accounting databases with applications to internal controls
  • 2011
  • In: Expert Systems with Applications. Elsevier. ISSN 0957-4174, 1873-6793. 38:9, pp. 11176-11181
  • Journal article (peer-reviewed). Abstract:
    • A considerable body of literature attests to the significance of internal controls; however, little is known about how the clustering of accounting databases can function as an internal control procedure. To explore this issue further, this paper puts forward a semi-supervised tool that is based on a self-organizing map and the IASB XBRL Taxonomy. The paper validates the proposed tool via a series of experiments on an accounting database provided by a shipping company. Empirical results suggest the tool can cluster accounting databases in homogeneous and well-separated clusters that can be interpreted within an accounting context. Further investigations reveal that the tool can compress a large number of similar transactions, and also provide information comparable to that of financial statements. The findings demonstrate that the tool can be applied to verify the processing of accounting transactions as well as to assess the accuracy of financial statements, and thus supplement internal controls.
5.
  • Bacauskiene, Marija, et al. (author)
  • Random forests based monitoring of human larynx using questionnaire data
  • 2012
  • In: Expert Systems with Applications. Amsterdam: Elsevier. ISSN 0957-4174, 1873-6793. 39:5, pp. 5506-5512
  • Journal article (peer-reviewed). Abstract:
    • This paper is concerned with noninvasive monitoring of the human larynx, based on soft computing techniques and using subjects' questionnaire data. By applying random forests (RF), questionnaire data are categorized into a healthy class and several classes of disorders including: cancerous, noncancerous, diffuse, nodular, paralysis, and an overall pathological class. The most important questionnaire statements are determined using RF variable importance evaluations. To explore data represented by variables used by RF, the t-distributed stochastic neighbor embedding (t-SNE) and the multidimensional scaling (MDS) are applied to the RF data proximity matrix. When testing the developed tools on a set of data collected from 109 subjects, 100% classification accuracy was obtained on unseen data in binary classification into the healthy and pathological classes. An accuracy of 80.7% was achieved when classifying the data into the healthy, cancerous, and noncancerous classes. The t-SNE and MDS mapping techniques applied allow obtaining two-dimensional maps of data and facilitate data exploration aimed at identifying subjects belonging to a “risk group”. It is expected that the developed tools will be of great help in preventive health care in laryngology.
6.
  • Bagloee, S. A., et al. (author)
  • A hybrid machine-learning and optimization method for contraflow design in post-disaster cases and traffic management scenarios
  • 2019
  • In: Expert Systems with Applications. Elsevier. ISSN 0957-4174, 1873-6793. 124, pp. 67-81
  • Journal article (peer-reviewed). Abstract:
    • The growing number of man-made and natural disasters in recent years has made disaster management a focal point of interest and research. To assist and streamline emergency evacuation, changing the directions of the roads (called contraflow, a traffic control measure) is proven to be an effective, quick and affordable scheme in the action list of disaster management. Contraflow design is a computationally challenging problem (known to be NP-hard), hence developing an efficient method applicable to real-world and large-sized cases is a significant challenge in the literature. To cope with its complexities and to tailor to practical applications, a hybrid heuristic method based on a machine-learning model and bilevel optimization is developed. The idea is to try and test several contraflow scenarios providing a training dataset for a supervised learning (regression) model which is then used in an optimization framework to find a better scenario in an iterative process. This method is coded as a single computer program synchronized with GAMS (for optimization), MATLAB (for machine learning), EMME3 (for traffic simulation), MS-Access (for data storage) and MS-Excel (as an interface), and it is tested using a real dataset from Winnipeg, and Sioux-Falls as benchmarks. The algorithm managed to find globally optimal solutions for the Sioux-Falls example and improved accessibility to the dense and congested central areas of Winnipeg just by changing the direction of some roads.
7.
  • Bahnsen, Alejandro Correa, et al. (author)
  • Example-dependent cost-sensitive decision trees
  • 2015
  • In: Expert Systems with Applications. Elsevier BV. ISSN 0957-4174, 1873-6793. 42:19, pp. 6609-6619
  • Journal article (peer-reviewed). Abstract:
    • Several real-world classification problems are example-dependent cost-sensitive in nature, where the costs due to misclassification vary between examples. However, standard classification methods do not take these costs into account, and assume a constant cost of misclassification errors. State-of-the-art example-dependent cost-sensitive techniques only introduce the cost to the algorithm either before or after training, therefore leaving opportunities to investigate the potential impact of algorithms that take into account the real financial example-dependent costs during algorithm training. In this paper, we propose an example-dependent cost-sensitive decision tree algorithm, by incorporating the different example-dependent costs into a new cost-based impurity measure and a new cost-based pruning criterion. Then, using three different databases from three real-world applications (credit card fraud detection, credit scoring and direct marketing), we evaluate the proposed method. The results show that the proposed algorithm is the best performing method for all databases. Furthermore, when compared against a standard decision tree, our method builds significantly smaller trees in only a fifth of the time, while having a superior performance measured by cost savings, leading to a method that not only has more business-oriented results, but also creates simpler models that are easier to analyze.
8.
  • Bahnsen, Alejandro Correa, et al. (author)
  • Feature engineering strategies for credit card fraud detection
  • 2016
  • In: Expert Systems with Applications. Elsevier BV. ISSN 0957-4174, 1873-6793. 51, pp. 134-142
  • Journal article (peer-reviewed). Abstract:
    • Every year billions of Euros are lost worldwide due to credit card fraud, forcing financial institutions to continuously improve their fraud detection systems. In recent years, several studies have proposed the use of machine learning and data mining techniques to address this problem. However, most studies used some sort of misclassification measure to evaluate the different solutions, and do not take into account the actual financial costs associated with the fraud detection process. Moreover, when constructing a credit card fraud detection model, it is very important to extract the right features from the transactional data. This is usually done by aggregating the transactions in order to observe the spending behavioral patterns of the customers. In this paper we expand the transaction aggregation strategy, and propose to create a new set of features based on analyzing the periodic behavior of the time of a transaction using the von Mises distribution. Then, using a real credit card fraud dataset provided by a large European card processing company, we compare state-of-the-art credit card fraud detection models, and evaluate how the different sets of features have an impact on the results. By including the proposed periodic features into the methods, the results show an average increase in savings of 13%. © 2016 Elsevier Ltd. All rights reserved.
9.
  • Bandaru, Sunith, et al. (author)
  • Data mining methods for knowledge discovery in multi-objective optimization : Part A - Survey
  • 2017
  • In: Expert Systems with Applications. Elsevier. ISSN 0957-4174, 1873-6793. 70, pp. 139-159
  • Research review (peer-reviewed). Abstract:
    • Real-world optimization problems typically involve multiple objectives to be optimized simultaneously under multiple constraints and with respect to several variables. While multi-objective optimization itself can be a challenging task, equally difficult is the ability to make sense of the obtained solutions. In this two-part paper, we deal with data mining methods that can be applied to extract knowledge about multi-objective optimization problems from the solutions generated during optimization. This knowledge is expected to provide deeper insights about the problem to the decision maker, in addition to assisting the optimization process in future design iterations through an expert system. The current paper surveys several existing data mining methods and classifies them by methodology and type of knowledge discovered. Most of these methods come from the domain of exploratory data analysis and can be applied to any multivariate data. We specifically look at methods that can generate explicit knowledge in a machine-usable form. A framework for knowledge-driven optimization is proposed, which involves both online and offline elements of knowledge discovery. One of the conclusions of this survey is that while there are a number of data mining methods that can deal with data involving continuous variables, only a few ad hoc methods exist that can provide explicit knowledge when the variables involved are of a discrete nature. Part B of this paper proposes new techniques that can be used with such datasets and applies them to discrete variable multi-objective problems related to production systems. 
10.
  • Bandaru, Sunith, et al. (author)
  • Data mining methods for knowledge discovery in multi-objective optimization : Part B - New developments and applications
  • 2017
  • In: Expert Systems with Applications. Elsevier. ISSN 0957-4174, 1873-6793. 70, pp. 119-138
  • Journal article (peer-reviewed). Abstract:
    • The first part of this paper served as a comprehensive survey of data mining methods that have been used to extract knowledge from solutions generated during multi-objective optimization. The current paper addresses three major shortcomings of existing methods, namely, lack of interactiveness in the objective space, inability to handle discrete variables and inability to generate explicit knowledge. Four data mining methods are developed that can discover knowledge in the decision space and visualize it in the objective space. These methods are (i) sequential pattern mining, (ii) clustering-based classification trees, (iii) hybrid learning, and (iv) flexible pattern mining. Each method uses a unique learning strategy to generate explicit knowledge in the form of patterns, decision rules and unsupervised rules. The methods are also capable of taking the decision maker's preferences into account to generate knowledge unique to preferred regions of the objective space. Three realistic production systems involving different types of discrete variables are chosen as application studies. A multi-objective optimization problem is formulated for each system and solved using NSGA-II to generate the optimization datasets. Next, all four methods are applied to each dataset. In each application, the methods discover similar knowledge for specified regions of the objective space. Overall, the unsupervised rules generated by flexible pattern mining are found to be the most consistent, whereas the supervised rules from classification trees are the most sensitive to user-preferences. 
Publication type
journal article (103)
research review (5)
Content type
peer-reviewed (107)
other academic/artistic (1)
Author/editor
Verikas, Antanas, 19 ... (9)
Bacauskiene, Marija (8)
Nikolakopoulos, Geor ... (7)
Gelzinis, Adas (7)
Tiwari, Prayag, 1991 ... (5)
Boström, Henrik (4)
Nowaczyk, Sławomir, ... (3)
Johnsson, Magnus (3)
Verikas, Antanas (3)
Vaiciukynas, Evaldas (3)
Gil, David (3)
Ng, Amos H. C. (2)
Papapetrou, Panagiot ... (2)
Ottersten, Björn, 19 ... (2)
Andersson, Karl, 197 ... (2)
Lavesson, Niklas (2)
Vinuesa, Ricardo (2)
Mansouri, Sina Shari ... (2)
Kanellakis, Christof ... (2)
Hilletofth, Per (2)
Johansson, Ulf (2)
Englund, Cristofer (2)
Aouada, Djamila (2)
Boldt, Martin (2)
Borg, Anton (2)
Barua, Shaibal (2)
Pashami, Sepideh, 19 ... (2)
Lundström, Jens, 198 ... (2)
Bouguelia, Mohamed-R ... (2)
Patriksson, Michael, ... (2)
Sheikholharam Mashha ... (2)
Bandaru, Sunith (2)
Rydén, Patrik (2)
Deb, Kalyanmoy (2)
Lindström, Erik (2)
Papadimitriou, Andre ... (2)
Asadi, M. (2)
Kourentzes, Nikolaos (2)
Uloza, Virgilijus (2)
Bagloee, S. A. (2)
Bahnsen, Alejandro C ... (2)
Nystrup, Peter (2)
Eivazi, Hamidreza (2)
Löfström, Tuwe, 1977 ... (2)
Koval, Anton (2)
Deegalla, Sampath (2)
Walgama, Keerthi (2)
Saberi-Movahed, Fari ... (2)
Fries, Niklas (2)
Hosseini, Ahmad (2)
Higher education institution
Högskolan i Halmstad (22)
Kungliga Tekniska Högskolan (13)
Luleå tekniska universitet (11)
Jönköping University (10)
Chalmers tekniska högskola (9)
Mälardalens universitet (7)
Linköpings universitet (7)
Lunds universitet (7)
Umeå universitet (6)
Högskolan i Skövde (6)
Göteborgs universitet (5)
Blekinge Tekniska Högskola (5)
RISE (4)
Uppsala universitet (3)
Stockholms universitet (2)
Högskolan i Gävle (2)
Örebro universitet (2)
Malmö universitet (2)
Högskolan i Borås (2)
Karlstads universitet (2)
Sveriges Lantbruksuniversitet (2)
Handelshögskolan i Stockholm (1)
Mittuniversitetet (1)
Linnéuniversitetet (1)
Högskolan Dalarna (1)
VTI - Statens väg- och transportforskningsinstitut (1)
Language
English (108)
Research subject (UKÄ/SCB)
Natural sciences (74)
Engineering and technology (42)
Social sciences (9)
Medical and health sciences (3)
Agricultural sciences (2)
