SwePub
Search the SwePub database

  Advanced search

Results list for the search "WFRF:(Ericsson Morgan Docent 1973 )"

  • Results 1-10 of 33
1.
  • Olsson, Tobias, 1974- (author)
  • Incremental Clustering of Source Code : a Machine Learning Approach
  • 2022
  • Doctoral thesis (other academic/artistic) abstract
    • Technical debt at the architectural level is a severe threat to software development projects. Uncontrolled technical debt that is allowed to accumulate will undoubtedly hinder speedy development and maintenance, introduce bugs and problems in the software product, and may ultimately result in the abandonment of the source code. It is possible to detect debt accumulation by analyzing the source code and the intended modules in the software architecture. However, this is seldom done in practice, since it requires a correct and up-to-date mapping from source code to intended modules in the architecture. This mapping requires significant manual effort to create and maintain, something often considered too costly and laborious. We investigate how to automate the mapping from source code to intended modules. The state of the art considers it an incremental clustering problem, where source code entities should be clustered to the intended modules based on some similarity measure. As the system evolves and source code entities are added or modified, the clustering needs to be updated. The state-of-the-art techniques determine similarity based on either syntactic or semantic features, e.g., dependencies or identifier names. Large sets of parameters modify these features, e.g., weights for various types of dependencies. These parameters have a significant impact on how well the clustering performs. Unfortunately, we have not been able to identify any heuristics to help human experts determine a good set of parameters for a given system. Based on the parameters determined by, e.g., genetic optimization, it seems unlikely that general heuristics exist. Instead, we compute the similarity using a multinomial naïve Bayes text classifier trained on tokens from the source code entities. We also include a novel feature that captures dependencies as text to add syntactic features. Our classifier, which relies on significantly fewer parameters, outperforms the state-of-the-art techniques even with their parameters set to near-optimal values. We find that machine learning provides better mapping performance with fewer required parameters. We can successfully combine syntactic information with semantic information without additional parameters. We provide an open-source tool suite with a reference implementation of different techniques and a curated set of systems that can act as a ground-truth benchmark.
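The mapping approach the abstract describes, scoring each unmapped source-code entity against the modules of already-mapped entities with a multinomial naive Bayes text classifier over source tokens (with dependencies encoded as extra text tokens), can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the thesis's actual tool suite; the module names and the "dep:" token convention are invented for the example.

```python
from collections import Counter
import math

def train(mapped):
    """mapped: module name -> list of token lists (one list per mapped entity).
    Tokens are identifiers from the source plus dependency targets encoded
    as plain text tokens (the abstract's novel syntactic feature)."""
    counts = {module: Counter() for module in mapped}
    vocab = set()
    for module, entities in mapped.items():
        for tokens in entities:
            counts[module].update(tokens)
            vocab.update(tokens)
    priors = {module: len(entities) for module, entities in mapped.items()}
    return counts, priors, vocab

def classify(tokens, model):
    """Return the module with the highest posterior log-probability."""
    counts, priors, vocab = model
    total_entities = sum(priors.values())
    best_module, best_lp = None, -math.inf
    for module, token_counts in counts.items():
        lp = math.log(priors[module] / total_entities)   # class prior
        denom = sum(token_counts.values()) + len(vocab)  # Laplace smoothing
        for t in tokens:
            lp += math.log((token_counts[t] + 1) / denom)
        if lp > best_lp:
            best_module, best_lp = module, lp
    return best_module
```

As the system evolves, a new or modified entity is simply tokenized and passed to `classify`, which matches the incremental-clustering framing of the problem.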
2.
  • Ambrosius, Robin, et al. (author)
  • Interviews Aided with Machine Learning
  • 2018
  • In: Perspectives in Business Informatics Research. BIR 2018. - Cham : Springer. - 9783319999500 - 9783319999517, pp. 202-216
  • Conference paper (peer-reviewed) abstract
    • We have designed and implemented a Computer Aided Personal Interview (CAPI) system that learns from expert interviews and can support less experienced interviewers by, for example, suggesting questions to ask or skip. We were particularly interested in streamlining the due diligence process when estimating the value of software startups. For our design we evaluated several machine learning algorithms and their trade-offs, and in a small case study we evaluated their implementation and performance. We find that, while there is room for improvement, the system can learn and recommend questions. The CAPI system can in principle be applied to any domain in which long interview sessions should be shortened without sacrificing the quality of the assessment.
3.
  • Ericsson, Morgan, Docent, 1973-, et al. (author)
  • TDMentions : A Dataset of Technical Debt Mentions in Online Posts
  • 2019
  • In: 2019 IEEE/ACM International Conference on Technical Debt (TechDebt 2019). - : IEEE. - 9781728133713, pp. 123-124
  • Conference paper (peer-reviewed) abstract
    • The term technical debt is easy to understand as a metaphor, but it can quickly grow complex in practice. We contribute a dataset, TDMentions, that enables researchers to study how developers and end users use the term technical debt in online posts and discussions. The dataset consists of posts from news aggregators and Q&A sites, blog posts, and issues and commits on GitHub.
4.
  • Hönel, Sebastian, et al. (author)
  • Activity-Based Detection of (Anti-)Patterns : An Embedded Case Study of the Fire Drill
  • 2024
  • In: e-Informatica Software Engineering Journal. - : Wroclaw University of Science and Technology. - 1897-7979, 2084-4840; 18:1
  • Journal article (peer-reviewed) abstract
    • Background: Nowadays, expensive, error-prone, expert-based evaluations are needed to identify and assess software process anti-patterns. Without exact ground truth, process artifacts cannot be used to quantitatively analyze phenomena and train prediction models. Aim: Develop a replicable methodology for organizational learning from process (anti-)patterns, demonstrating the mining of reliable ground truth and the exploitation of process artifacts. Method: We conduct an embedded case study to find manifestations of the Fire Drill anti-pattern in n = 15 projects. To ensure quality, the assessments of three human experts are brought into agreement. Their evaluation and the process artifacts are used to establish a quantitative understanding and to train a prediction model. Results: The qualitative review shows many project issues. (i) Expert assessments consistently provide credible ground truth. (ii) Phenomenological descriptions of the Fire Drill match the time projects spend on activities (for example, development). (iii) Regression models trained on ≈ 12-25 examples are sufficiently stable. Conclusion: The approach is independent of the data source (source code or issue tracking). It allows leveraging process artifacts to establish additional knowledge of the phenomenon and to train robust predictive models. The results indicate the aptness of the methodology for identifying instances of the Fire Drill and similar anti-patterns modeled using activities. Such identification could be used in post mortem process analysis, supporting organizational learning for process improvement.
5.
  • Hönel, Sebastian, et al. (author)
  • Bayesian Regression on segmented data using Kernel Density Estimation
  • 2019
  • In: 5th Annual Big Data Conference. - : Zenodo.
  • Conference paper (other academic/artistic) abstract
    • The challenge of dealing with dependent variables in classification and regression using techniques based on Bayes' theorem is often avoided by assuming strong independence between them; hence, such techniques are said to be naive. While analytical solutions supporting classification on arbitrary amounts of discrete and continuous random variables exist, practical solutions are scarce. We evaluate a few Bayesian models empirically and consider their computational complexity. To overcome the often-assumed independence, those models attempt to resolve the dependencies using empirical joint conditional probabilities and joint conditional probability densities. These are obtained from the posterior probabilities of the dependent variable after segmenting the dataset for each random variable's value. We demonstrate the advantages of these models, such as their deterministic nature (no randomization or weights required), that no training is required, that each random variable may have any kind of probability distribution, how robustness is upheld without having to impute missing data, and that online learning is effortlessly possible. We compare such Bayesian models against well-established classifiers and regression models, using some well-known datasets. We conclude that our evaluated models can outperform other models in certain classification settings. The regression models deliver respectable performance without leading the field.
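A loose sketch in the spirit of the abstract above: estimating E[y | x] as a kernel-weighted (posterior-mean-like) average of observed y values. This is a Nadaraya-Watson-style illustration under simplifying assumptions, not the authors' exact models; the bandwidth and data are invented.

```python
import math

def gaussian_kernel(u, bandwidth):
    # Unnormalized Gaussian kernel; the normalization cancels in the ratio below.
    return math.exp(-0.5 * (u / bandwidth) ** 2)

def kde_regress(x_query, xs, ys, bandwidth=1.0):
    """Estimate E[y | x = x_query] as a kernel-weighted average of the
    observed y values, i.e. a posterior mean over the kernel-weighted data."""
    weights = [gaussian_kernel(x_query - x, bandwidth) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total
```

Note that, as the abstract emphasizes for its models, this estimator is deterministic and requires no training pass: each prediction is computed directly from the stored data.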
6.
  • Hönel, Sebastian, et al. (author)
  • Contextual Operationalization of Metrics as Scores : Is My Metric Value Good?
  • 2022
  • In: Proceedings of the 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS). - : IEEE. - 9781665477048, pp. 333-343
  • Conference paper (peer-reviewed) abstract
    • Software quality models aggregate metrics to indicate quality. Most metrics reflect counts derived from events or attributes that cannot directly be associated with quality. Worse, what constitutes a desirable value for a metric may vary across contexts. We demonstrate an approach to transforming arbitrary metrics into absolute quality scores by leveraging metrics captured from similar contexts. In contrast to metrics, scores represent freestanding quality properties that are also comparable. We provide a web-based tool for obtaining contextualized scores for metrics as obtained from one’s software. Our results indicate that significant differences among various metrics and contexts exist. The suggested approach works with arbitrary contexts. Given sufficient contextual information, it allows for answering the question of whether a metric value is good/bad or common/extreme.
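The central idea, turning a raw metric value into a context-relative score on [0, 1], can be sketched with an empirical distribution of distances from an ideal value. The function and the choice of ideal are illustrative assumptions, not the paper's exact operationalization.

```python
def metric_score(value, context_values, ideal):
    """Score a metric value in [0, 1] against values observed in a similar
    context: 1.0 means the value sits at the ideal, 0.0 means it is at
    least as far from the ideal as every observed context value."""
    distances = [abs(v - ideal) for v in context_values]
    d = abs(value - ideal)
    not_worse = sum(1 for x in distances if x <= d)
    return 1.0 - not_worse / len(distances)
```

Because scores are freestanding and share the same scale, the same metric value can be scored against several contexts to see where it counts as common or extreme.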
7.
  • Hönel, Sebastian (author)
  • Efficient Automatic Change Detection in Software Maintenance and Evolutionary Processes
  • 2020
  • Licentiate thesis (other academic/artistic) abstract
    • Software maintenance is such an integral part of the evolutionary process that it consumes much of the total resources available. Some estimate the cost of maintenance to be up to 100 times the cost of developing the software. Software that is not maintained builds up technical debt, and if that debt is not paid off in time, it will eventually outweigh the value of the software unless countermeasures are taken. A software system must adapt to changes in its environment and to new and changed requirements. It must further receive corrections for emerging faults and vulnerabilities. Constant maintenance can prepare a software system to accommodate future changes. While there may be plenty of rationale for future changes, the reasons behind historical changes may no longer be accessible. Understanding change in software evolution provides valuable insights into, e.g., the quality of a project or aspects of the underlying development process. These are worth exploiting for, e.g., fault prediction, managing the composition of the development team, or effort estimation models. The size of software is a metric often used in such models, yet it is not well defined. In this thesis, we seek to establish a robust, versatile, and computationally cheap metric that quantifies the size of changes made during maintenance. We operationalize this new metric and exploit it for automated and efficient commit classification. Our results show that the density of a commit, that is, the ratio between its net and gross size, is a metric that can replace other, more expensive metrics in existing classification models. Models using this metric represent the current state of the art in automatic commit classification. The density provides a more fine-grained and detailed insight into the types of maintenance activities in a software project. Additional properties of commits, such as their relation or intermediate sojourn times, have not previously been exploited for improved classification of changes. We reason about their potential, and suggest and implement dependent mixture and Bayesian models that exploit joint conditional densities. These models each have their own trade-offs with regard to computational cost, complexity, and prediction accuracy. Such models can outperform well-established classifiers, such as Gradient Boosting Machines. All of our empirical evaluations comprise large datasets, software, and experiments, all of which we have published alongside the results as open access. We have reused, extended, and created datasets, and released software packages for change detection and the Bayesian models used in all of the studies conducted.
8.
  • Hönel, Sebastian, et al. (author)
  • Importance and Aptitude of Source code Density for Commit Classification into Maintenance Activities
  • 2019
  • In: 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS). - : IEEE. - 9781728139272 - 9781728139289, pp. 109-120
  • Conference paper (peer-reviewed) abstract
    • Commit classification, the automatic classification of the purpose of changes to software, can support the understanding and quality improvement of software and its development process. We introduce code density of a commit, a measure of the net size of a commit, as a novel feature and study how well it is suited to determine the purpose of a change. We also compare the accuracy of code-density-based classifications with existing size-based classifications. By applying standard classification models, we demonstrate the significance of code density for the accuracy of commit classification. We achieve up to 89% accuracy and a Kappa of 0.82 for the cross-project commit classification where the model is trained on one project and applied to other projects. Such highly accurate classification of the purpose of software changes helps to improve the confidence in software (process) quality analyses exploiting this classification information.
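The code density feature itself is easy to sketch. Assuming, for illustration, that the net size counts changed lines that are neither blank nor comment-only (the published tooling uses a fuller source-code model):

```python
def is_effective(line):
    """Heuristic: a changed line counts toward the net size unless it is
    blank or a comment-only line. This filter is illustrative, not the
    paper's exact definition."""
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith(("//", "/*", "*", "#"))

def commit_density(changed_lines):
    """Density = net size / gross size of a commit's changed lines."""
    gross = len(changed_lines)
    if gross == 0:
        return 0.0
    net = sum(1 for line in changed_lines if is_effective(line))
    return net / gross
```

A commit that mostly reshuffles whitespace and comments thus gets a low density, while a commit of mostly effective code changes approaches 1.0, which is the signal the classification models exploit.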
9.
  • Hönel, Sebastian, et al. (author)
  • Metrics As Scores : A Tool- and Analysis Suite and Interactive Application for Exploring Context-Dependent Distributions
  • 2023
  • In: Journal of Open Source Software. - : Open Journals. - 2475-9066; 8:88
  • Journal article (peer-reviewed) abstract
    • Metrics As Scores can be thought of as an interactive, multiple analysis of variance (abbr. "ANOVA", Chambers et al., 2017). An ANOVA might be used to estimate the goodness-of-fit of a statistical model. Beyond ANOVA, which is used to analyze the differences among hypothesized group means for a single quantity (feature), Metrics As Scores seeks to answer the question of whether a sample of a certain feature is more or less common across groups. This approach to data visualization and exploration has been used previously (e.g., Jiang et al., 2022). Beyond this, Metrics As Scores can determine what might constitute a good/bad, acceptable/alarming, or common/extreme value, and how distant the sample is from that value, for each group. This is expressed in terms of a percentile (a standardized scale of [0, 1]), which we call a score. Considering all available features among the existing groups furthermore allows the user to assess how different the groups are from each other, or whether they are indistinguishable from one another. The name Metrics As Scores was derived from its initial application: examining differences of software metrics across application domains (Hönel et al., 2022). A software metric is an aggregation of one or more raw features according to some well-defined standard, method, or calculation. In software processes, such aggregations are often counts of events or certain properties (Florac & Carleton, 1999). However, without the aggregation that is done in a quality model, raw data (samples) and software metrics are rarely of great value to analysts and decision-makers. This is because quality models are conceived to establish a connection between software metrics and certain quality goals (Kaner & Bond, 2004). It is, therefore, difficult to answer the question "is my metric value good?". With Metrics As Scores we present an approach that, given some ideal value and a sample of sufficiently many relevant values, can transform any sample into a score. While earlier work attempted to derive such ideal values for software metrics from, e.g., experience or surveys (Benlarbi et al., 2000), benchmarks (Alves et al., 2010), or by setting practical values (Grady, 1992), with Metrics As Scores we suggest additionally deriving ideal values in non-parametric, statistical ways. To do so, data first needs to be captured in a relevant context (group). A feature value might be good in one context, while it is less so in another. Therefore, we suggest generalizing and contextualizing the approach taken by Ulan et al. (2021), in which a score is defined to always have a range of [0, 1] and linear behavior. This means that scores can now also be compared and that a fixed increment in any score is equally valuable among scores. This is otherwise not the case for raw features. Metrics As Scores consists of a tool and analysis suite and an interactive application that allows researchers to explore and understand differences in scores across groups. The operationalization of features as scores lies in gathering values that are context-specific (group-typical), determining an ideal value non-parametrically or by user preference, and then transforming the observed values into distances. Metrics As Scores enables this procedure by unifying the way of obtaining probability densities/masses and conducting appropriate statistical tests. More than 120 different parametric distributions (approx. 20 of which are discrete) are fitted through a common interface. Those distributions are part of the scipy package for the Python programming language, which Metrics As Scores makes extensive use of (Virtanen et al., 2020). While fitting continuous distributions is straightforward using maximum likelihood estimation, many discrete distributions have integral parameters. For these, Metrics As Scores solves a mixed-variable global optimization problem using a genetic algorithm in pymoo (Blank & Deb, 2020). In addition, empirical distributions (continuous and discrete) and smooth approximate kernel density estimates are available. Applicable statistical tests for assessing the goodness-of-fit are automatically performed. These tests are used to select a best-fitting random variable in the interactive web application. As an application written in Python, Metrics As Scores is made available as a package that is installable from the Python Package Index (PyPI): pip install metrics-as-scores. As such, the application can be used in a stand-alone manner and does not require additional packages, such as a web server or third-party libraries.
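The score transformation described above can also be sketched parametrically: fit a distribution to the context values and read a sample's percentile from its CDF. Below is a stdlib-only illustration using a method-of-moments normal fit; the actual package fits over 120 scipy distributions and selects among them via goodness-of-fit tests, so both the fixed choice of a normal and the fitting method are simplifying assumptions.

```python
import math
import statistics

def fit_normal(values):
    """Method-of-moments fit of a normal distribution to context values."""
    return statistics.fmean(values), statistics.pstdev(values)

def normal_cdf(x, mu, sigma):
    # CDF of the normal distribution via the error function.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def percentile_score(value, context_values):
    """Percentile of `value` within the fitted context distribution,
    on the standardized [0, 1] scale the paper calls a score."""
    mu, sigma = fit_normal(context_values)
    return normal_cdf(value, mu, sigma)
```

A value at the context mean lands at 0.5, while values in the tails approach 0 or 1, making percentiles from different contexts directly comparable.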
10.
  • Hönel, Sebastian (author)
  • Quantifying Process Quality : The Role of Effective Organizational Learning in Software Evolution
  • 2023
  • Doctoral thesis (other academic/artistic) abstract
    • Real-world software applications must constantly evolve to remain relevant. This evolution occurs when developing new applications or adapting existing ones to meet new requirements, make corrections, or incorporate future functionality. Traditional methods of software quality control involve software quality models and continuous code inspection tools. These measures focus on directly assessing the quality of the software. However, there is a strong correlation and causation between the quality of the development process and the resulting software product. Therefore, improving the development process indirectly improves the software product, too. To achieve this, effective learning from past processes is necessary, often embraced through post mortem organizational learning. While qualitative evaluation of large artifacts is common, smaller quantitative changes captured by application lifecycle management are often overlooked. In addition to software metrics, these smaller changes can reveal complex phenomena related to project culture and management. Leveraging these changes can help detect and address such complex issues. Software evolution was previously measured by the size of changes, but the lack of consensus on a reliable and versatile quantification method prevents its use as a dependable metric. Different size classifications fail to reliably describe the nature of evolution. While application lifecycle management data is rich, identifying which artifacts can model detrimental managerial practices remains uncertain. Approaches such as simulation modeling, discrete-event simulation, or Bayesian networks have only limited ability to exploit continuous-time process models of such phenomena. Even worse, the accessibility and mechanistic insight into such gray- or black-box models are typically very low. To address these challenges, we suggest leveraging objectively captured digital artifacts from application lifecycle management, combined with qualitative analysis, for efficient organizational learning. A new language-independent metric is proposed to robustly capture the size of changes, significantly improving the accuracy of determining the nature of a change. The classified changes are then used to explore, visualize, and suggest maintenance activities, enabling solid prediction of the presence and severity of malpractices, even with limited data. Finally, parts of the automatic quantitative analysis are made accessible, potentially replacing expert-based qualitative analysis in part.
