SwePub
Search the SwePub database


Results list for the search "WFRF:(Gay Gregory)"


  • Result 1-37 of 37
1.
  • Almulla, Hussein, et al. (author)
  • Generating Diverse Test Suites for Gson Through Adaptive Fitness Function Selection
  • 2020
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - Cham : Springer International Publishing. - 1611-3349 .- 0302-9743. - 9783030597610 - 9783030597627 ; SSBSE 2020, s. 246-252
  • Conference paper (peer-reviewed), abstract:
    • Many fitness functions, such as those targeting test suite diversity, do not yield sufficient feedback to drive test generation. We propose that diversity can instead be improved through adaptive fitness function selection (AFFS), an approach that varies the fitness functions used throughout the generation process in order to strategically increase diversity. We have evaluated our AFFS framework, EvoSuiteFIT, on a set of 18 real faults from Gson, a JSON (de)serialization library. Ultimately, we find that AFFS creates test suites that are more diverse than those created using static fitness functions. We also observe that increased diversity may lead to small improvements in the likelihood of fault detection.
  •  
2.
  • Almulla, H., et al. (author)
  • Learning how to search: generating effective test cases through adaptive fitness function selection
  • 2022
  • In: Empirical Software Engineering. - : Springer Science and Business Media LLC. - 1382-3256 .- 1573-7616. ; 27:2
  • Journal article (peer-reviewed), abstract:
    • Search-based test generation is guided by feedback from one or more fitness functions: scoring functions that judge solution optimality. Choosing informative fitness functions is crucial to meeting the goals of a tester. Unfortunately, many goals, such as forcing the class-under-test to throw exceptions, increasing test suite diversity, and attaining Strong Mutation Coverage, do not have effective fitness function formulations. We propose that meeting such goals requires treating fitness function identification as a secondary optimization step. An adaptive algorithm that can vary the selection of fitness functions could adjust its selection throughout the generation process to maximize goal attainment, based on the current population of test suites. To test this hypothesis, we have implemented two reinforcement learning algorithms in the EvoSuite unit test generation framework, and used these algorithms to dynamically set the fitness functions used during generation for the three goals identified above. We have evaluated our framework, EvoSuiteFIT, on a set of Java case examples. EvoSuiteFIT techniques attain significant improvements for two of the three goals, and show limited improvements on the third when the number of generations of evolution is fixed. Additionally, for two of the three goals, EvoSuiteFIT detects faults missed by the other techniques. The ability to adjust fitness functions allows strategic choices that efficiently produce more effective test suites, and examining these choices offers insight into how to attain our testing goals. We find that adaptive fitness function selection is a powerful technique to apply when an effective fitness function does not already exist for achieving a testing goal.
  •  
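
To make the adaptive fitness function selection (AFFS) idea in the entry above concrete, the following minimal Python sketch uses an epsilon-greedy reinforcement learning loop to pick, each generation, which fitness function guides a toy test-suite search. The fitness functions, the list-of-integers "suite" representation, and the reward signal are invented placeholders, not the EvoSuiteFIT implementation.

    import random

    # Toy fitness functions over a "test suite" (here just a list of integers).
    # In EvoSuiteFIT these would be tester goals such as diversity or exception count.
    def diversity(suite):
        return len(set(suite)) / len(suite)

    def magnitude(suite):
        return sum(abs(x) for x in suite) / (100 * len(suite))

    FITNESS_FUNCTIONS = {"diversity": diversity, "magnitude": magnitude}

    def mutate(suite):
        clone = list(suite)
        clone[random.randrange(len(clone))] = random.randint(-100, 100)
        return clone

    def affs_search(generations=200, epsilon=0.2, alpha=0.5):
        suite = [random.randint(-100, 100) for _ in range(10)]
        value = {name: 0.0 for name in FITNESS_FUNCTIONS}  # learned value of each fitness function
        for _ in range(generations):
            # Epsilon-greedy choice of the fitness function to use this generation.
            if random.random() < epsilon:
                name = random.choice(list(FITNESS_FUNCTIONS))
            else:
                name = max(value, key=value.get)
            fitness = FITNESS_FUNCTIONS[name]
            candidate = mutate(suite)
            reward = fitness(candidate) - fitness(suite)  # improvement under the chosen function
            if reward > 0:
                suite = candidate
            value[name] = (1 - alpha) * value[name] + alpha * reward
        return suite, value

    if __name__ == "__main__":
        final_suite, learned_values = affs_search()
        print("final suite:", final_suite)
        print("learned fitness-function values:", learned_values)
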
3.
  • Almulla, Hussein, et al. (author)
  • Learning How to Search: Generating Exception-Triggering Tests Through Adaptive Fitness Function Selection
  • 2020
  • In: Proceedings - 2020 IEEE 13th International Conference on Software Testing, Verification and Validation, ICST 2020. - Porto, Portugal : IEEE. ; , s. 63-73
  • Conference paper (peer-reviewed), abstract:
    • Search-based test generation is guided by feedback from one or more fitness functions—scoring functions that judge solution optimality. Choosing informative fitness functions is crucial to meeting the goals of a tester. Unfortunately, many goals—such as forcing the class-under-test to throw exceptions—do not have a known fitness function formulation. We propose that meeting such goals requires treating fitness function identification as a secondary optimization step. An adaptive algorithm that can vary the selection of fitness functions could adjust its selection throughout the generation process to maximize goal attainment, based on the current population of test suites. To test this hypothesis, we have implemented two reinforcement learning algorithms in the EvoSuite framework, and used these algorithms to dynamically set the fitness functions used during generation. We have evaluated our framework, EvoSuiteFIT, on a set of 386 real faults. EvoSuiteFIT discovers and retains more exception-triggering input and produces suites that detect a variety of faults missed by the other techniques. The ability to adjust fitness functions allows EvoSuiteFIT to make strategic choices that efficiently produce more effective test suites.
  •  
4.
  • Bauer, Andreas (author)
  • Towards Collaborative GUI-based Testing
  • 2023
  • Licentiate thesis (other academic/artistic), abstract:
    • Context: Contemporary software development is a socio-technical activity requiring extensive collaboration among individuals with diverse expertise. Software testing is an integral part of software development that also depends on various expertise. GUI-based testing allows assessing a system’s GUI and its behavior through its graphical user interface. Collaborative practices in software development, like code reviews, not only improve software quality but also promote knowledge exchange within teams. Similar benefits could be extended to other areas of software engineering, such as GUI-based testing. However, collaborative practices for GUI-based testing necessitate a unique approach since general software development practices, perceivably, can not be directly transferred to software testing. Goal: This thesis contributes towards a tool-supported approach enabling collaborative GUI-based testing. Our distinct goals are (1) to identify processes and guidelines to enable collaboration on GUI-based testing artifacts and (2) to operationalize tool support to aid this collaboration. Method: We conducted a systematic literature review identifying code review guidelines for GUI-based testing. Further, we conducted a controlled experiment to assess the efficiency and potential usability issues of Augmented Testing. Results: We provided guidelines for reviewing GUI-based testing artifacts, which aid contributors and reviewers during code reviews. We further provide empirical evidence that Augmented Testing is not only an efficient approach to GUI-based testing but also usable for non-technical users, making it a promising subject for further research in collaborative GUI-based testing. Conclusion: Code review guidelines aid collaboration through discussions, and a suitable testing approach can serve as a platform to operationalize collaboration. Collaborative GUI-based testing has the potential to improve the efficiency and effectiveness of such testing.
  •  
5.
  • Berglund, Lukas, et al. (author)
  • Test Maintenance for Machine Learning Systems: A Case Study in the Automotive Industry
  • 2023
  • In: 2023 IEEE Conference on Software Testing, Verification and Validation (ICST). - 2159-4848. - 9781665456661 ; , s. 410-421
  • Conference paper (peer-reviewed), abstract:
    • Machine Learning (ML) systems have seen widespread use for automated decision making. Testing is essential to ensure the quality of these systems, especially safety-critical autonomous systems in the automotive domain. ML systems introduce new challenges with the potential to affect test maintenance, the process of updating test cases to match the evolving system. We conducted an exploratory case study in the automotive domain to identify factors that affect test maintenance for ML systems, as well as to make recommendations to improve the maintenance process. Based on interview and artifact analysis, we identified 14 factors affecting maintenance, including five especially relevant for ML systems—with the most important relating to non-determinism and large input spaces. We also proposed ten recommendations for improving test maintenance, including four targeting ML systems—in particular, emphasizing the use of test oracles tolerant to acceptable non-determinism. The study’s findings expand our knowledge of test maintenance for an emerging class of systems, benefiting the practitioners testing these systems.
  •  
6.
  • Bisht, Rohini, et al. (author)
  • Identifying Redundancies and Gaps Across Testing Levels During Verification of Automotive Software
  • 2023
  • In: 2023 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW). - 2159-4848. - 9798350333350 ; , s. 131-139
  • Conference paper (peer-reviewed), abstract:
    • Testing of automotive systems usually follows the V-Model, a process where sequential testing activities progress from low-level code structures to high-level integrated systems. In theory, the V-Model should reduce redundant testing and prevent gaps in verification. To assess whether such benefits translate in practice, in a case study at Scania CV AB, we have developed a framework to identify redundancies and gaps in test cases across V-model test levels. Our framework identified both redundancies and gaps in Scania’s scripted testing efforts. Deviating cases were also identified where, e.g., requirements were outdated or contained incorrect details. Factors contributing to redundancy include re-verification in a new context, difficulties mapping requirements across levels, and lack of test case documentation. Both redundancies and gaps result from a lack of communication and traceability of test results across test levels. We recommend active collaboration across levels, as well as use of coverage matrices to alleviate these issues. We offer our framework to help refine testing practices and to inspire process improvements.
  •  
7.
  • Bollina, Srujana, et al. (author)
  • Bytecode-Based Multiple Condition Coverage: An Initial Investigation
  • 2020
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - Cham : Springer International Publishing. - 1611-3349 .- 0302-9743. - 9783030597627 ; SSBSE 2020, s. 220-236
  • Conference paper (peer-reviewed), abstract:
    • Masking occurs when one condition prevents another from influencing the output of a Boolean expression. Adequacy criteria such as Multiple Condition Coverage (MCC) overcome masking within one expression, but offer no guarantees about subsequent expressions. As a result, a Boolean expression written as a single complex statement will yield more effective test cases than when written as a series of simple expressions. Many approaches to automated test case generation for Java operate not on the source code, but on bytecode. The transformation to bytecode simplifies complex expressions into multiple expressions, introducing masking. We propose Bytecode-MCC, a new adequacy criterion designed to group bytecode expressions and reformulate them into complex expressions. Bytecode-MCC should produce test obligations that are more likely to reveal faults in program logic than tests covering the simplified bytecode. A preliminary study shows potential improvements from attaining Bytecode-MCC coverage. However, Bytecode-MCC is difficult to optimize, and means of increasing coverage are needed before the technique can make a difference in practice. We propose potential methods to improve coverage.
  •  
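
As a concrete illustration of the masking problem described in the entry above (Bytecode-MCC itself targets Java bytecode; this is a hypothetical Python analogue): when a compound Boolean expression is split into a sequence of simple conditions, an early false condition prevents later conditions from ever influencing the outcome, so tests covering the simple conditions individually can miss the combinations that Multiple Condition Coverage would demand.

    # Compound form: both conditions sit in one expression.
    def alarm_armed_compound(pressure_ok, temperature_ok):
        return pressure_ok and temperature_ok

    # "Simplified" form, similar in spirit to what compilation to bytecode produces:
    # the second condition is only evaluated when the first one is true.
    def alarm_armed_simplified(pressure_ok, temperature_ok):
        if not pressure_ok:        # when this is False, temperature_ok is masked
            return False
        if not temperature_ok:
            return False
        return True

    # These tests toggle temperature_ok only while pressure_ok is False, so they
    # never observe temperature_ok affecting the result of the simplified form.
    for p, t in [(False, False), (False, True)]:
        assert alarm_armed_compound(p, t) == alarm_armed_simplified(p, t)

    # MCC-style obligations require the combinations where the second condition
    # actually determines the outcome.
    assert alarm_armed_simplified(True, False) is False
    assert alarm_armed_simplified(True, True) is True
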
8.
  • Borg, Markus, et al. (author)
  • Summary of the 4th International Workshop on Requirements Engineering and Testing (RET 2017)
  • 2018
  • In: Software Engineering Notes. - : Association for Computing Machinery (ACM). - 0163-5948 .- 1943-5843. ; 42:4, s. 28-31
  • Journal article (other academic/artistic), abstract:
    • The RET (Requirements Engineering and Testing) workshop series provides a meeting point for researchers and practitioners from the two separate fields of Requirements Engineering (RE) and Testing. The long term aim is to build a community and a body of knowledge within the intersection of RE and Testing, i.e., RET. The 4th workshop was co-located with the 25th International Requirements Engineering Conference (RE'17) in Lisbon, Portugal and attracted about 20 participants. In line with the previous workshop instances, RET 2017 offered an interactive setting with a keynote, an invited talk, paper presentations, and a concluding hands-on exercise.
  •  
9.
  • Ebadi, Hamid, et al. (author)
  • Efficient and Effective Generation of Test Cases for Pedestrian Detection - Search-based Software Testing of Baidu Apollo in SVL
  • 2021
  • In: 2021 IEEE International Conference on Artificial Intelligence Testing (AITest). - : IEEE. - 9781665434812 ; , s. 103-110
  • Conference paper (peer-reviewed), abstract:
    • With the growing capabilities of autonomous vehicles, there is a higher demand for sophisticated and pragmatic quality assurance approaches for machine learning-enabled systems in the automotive AI context. The use of simulation-based prototyping platforms provides the possibility for early-stage testing, enabling inexpensive testing and the ability to capture critical corner-case test scenarios. Simulation-based testing properly complements conventional on-road testing. However, due to the large space of test input parameters in these systems, the efficient generation of effective test scenarios leading to the unveiling of failures is a challenge. This paper presents a study on testing pedestrian detection and emergency braking system of the Baidu Apollo autonomous driving platform within the SVL simulator. We propose an evolutionary automated test generation technique that generates failure-revealing scenarios for Apollo in the SVL environment. Our approach models the input space using a generic and flexible data structure and benefits a multi-criteria safety-based heuristic for the objective function targeted for optimization. This paper presents the results of our proposed test generation technique in the 2021 IEEE Autonomous Driving AI Test Challenge. In order to demonstrate the efficiency and effectiveness of our approach, we also report the results from a baseline random generation technique. Our evaluation shows that the proposed evolutionary test case generator is more effective at generating failure-revealing test cases and provides higher diversity between the generated failures than the random baseline.
  •  
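
The sketch below is a loose, hypothetical illustration of the kind of evolutionary scenario search described in the entry above: scenario parameters (vehicle speed, pedestrian speed, crossing offset) are evolved toward a safety-based objective, here a crude surrogate margin instead of real simulator measurements. It does not use Apollo, SVL, or the authors' data structures.

    import random

    # A scenario is (vehicle_speed_kmh, pedestrian_speed_ms, crossing_offset_m).
    def random_scenario():
        return (random.uniform(20, 80), random.uniform(0.5, 2.5), random.uniform(5, 60))

    def safety_margin(scenario):
        # Crude surrogate fitness: a smaller margin means a more critical scenario.
        # A real setup would execute the scenario in the simulator and measure distances.
        vehicle_speed_kmh, ped_speed, offset = scenario
        v = vehicle_speed_kmh / 3.6                     # m/s
        braking_distance = v ** 2 / (2 * 7.0)           # v^2 / (2 * assumed deceleration)
        time_to_conflict = offset / max(ped_speed, 0.1)
        distance_travelled = v * time_to_conflict
        return abs(distance_travelled - braking_distance)

    def mutate(scenario, sigma=0.1):
        return tuple(value * random.uniform(1 - sigma, 1 + sigma) for value in scenario)

    def evolve(pop_size=20, generations=50):
        population = [random_scenario() for _ in range(pop_size)]
        for _ in range(generations):
            population.sort(key=safety_margin)          # lower margin is fitter
            survivors = population[: pop_size // 2]
            population = survivors + [mutate(random.choice(survivors)) for _ in survivors]
        return min(population, key=safety_margin)

    if __name__ == "__main__":
        best = evolve()
        print("most critical scenario found:", best, "margin:", round(safety_margin(best), 2))
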
10.
  • Enoiu, Eduard Paul, PhD, et al. (author)
  • Understanding Problem Solving in Software Testing : An Exploration of Tester Routines and Behavior
  • 2023
  • In: Lecture Notes in Computer Science. - : Springer Science and Business Media Deutschland GmbH. - 9783031432392 ; 14131 LNCS, s. 143-159
  • Conference paper (peer-reviewed), abstract:
    • Software testing is a difficult, intellectual activity performed in a social environment. Naturally, testers use and allocate multiple cognitive resources towards this task. The goal of this study is to understand better the routine and behaviour of human testers and their mental models when performing testing. We investigate this topic by surveying 38 software testers and developers in Sweden. The survey explores testers’ cognitive processes when performing testing by investigating the knowledge they bring, the activities they select and perform, and the challenges they face in their routine. By analyzing the survey results, we provide a characterization of tester practices and identify insights regarding the problem-solving process. We use these descriptions to further enhance a cognitive model of software testing. © 2023, IFIP International Federation for Information Processing.
  •  
11.
  • Fontes, Afonso, 1987, et al. (author)
  • Automated Support for Unit Test Generation
  • 2023
  • In: Natural Computing Series. - Singapore : Springer. - 1619-7127. ; , s. 179-219
  • Book chapter (peer-reviewed), abstract:
    • Unit testing is a stage of testing where the smallest segment of code that can be tested in isolation from the rest of the system—often a class—is tested. Unit tests are typically written as executable code, often in a format provided by a unit testing framework such as pytest for Python. Creating unit tests is a time and effort-intensive process with many repetitive, manual elements. To illustrate how AI can support unit testing, this chapter introduces the concept of search-based unit test generation. This technique frames the selection of test input as an optimization problem—we seek a set of test cases that meet some measurable goal of a tester—and unleashes powerful metaheuristic search algorithms to identify the best possible test cases within a restricted timeframe. This chapter introduces two algorithms that can generate pytest-formatted unit tests, tuned towards coverage of source code statements. The chapter concludes by discussing more advanced concepts and gives pointers to further reading for how artificial intelligence can support developers and testers when unit testing software.
  •  
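
As a very small illustration of framing test input selection as search, the sketch below uses random search to find inputs covering both branches of a stand-in function under test and emits a pytest-style test file with observed outputs as assertions. It shows the general shape of the idea only; it is not the chapter's algorithms, and 'example_module' is a hypothetical module name.

    import random

    # Stand-in function under test: the goal is to cover both branches.
    def classify(x):
        if x < 0:
            return "negative"
        return "non-negative"

    def generate_tests(budget=100):
        """Random search: keep the first input reaching each branch, then emit pytest code."""
        covered = {}
        for _ in range(budget):
            x = random.randint(-1000, 1000)
            covered.setdefault(classify(x), x)
            if len(covered) == 2:
                break
        lines = ["from example_module import classify", ""]
        for i, (outcome, x) in enumerate(covered.items()):
            lines += [f"def test_classify_{i}():",
                      f"    assert classify({x}) == {outcome!r}",
                      ""]
        return "\n".join(lines)

    if __name__ == "__main__":
        code = generate_tests()
        with open("test_classify_generated.py", "w") as fh:   # pytest will collect this file
            fh.write(code)
        print(code)
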
12.
  • Fontes, Afonso, 1987, et al. (author)
  • The integration of machine learning into automated test generation: A systematic mapping study
  • 2023
  • In: Software Testing Verification and Reliability. - 0960-0833 .- 1099-1689. ; 33:4
  • Journal article (peer-reviewed), abstract:
    • Machine learning (ML) may enable effective automated test generation. We characterize emerging research, examining testing practices, researcher goals, ML techniques applied, evaluation, and challenges in this intersection by performing a systematic mapping study on a sample of 124 publications. ML generates input for system, GUI, unit, performance, and combinatorial testing or improves the performance of existing generation methods. ML is also used to generate test verdicts, property-based, and expected output oracles. Supervised learning—often based on neural networks—and reinforcement learning—often based on Q-learning—are common, and some publications also employ unsupervised or semi-supervised learning. (Semi-/Un-)Supervised approaches are evaluated using both traditional testing metrics and ML-related metrics (e.g., accuracy), while reinforcement learning is often evaluated using testing metrics tied to the reward function. The work-to-date shows great promise, but there are open challenges regarding training data, retraining, scalability, evaluation complexity, ML algorithms employed—and how they are applied—benchmarks, and replicability. Our findings can serve as a roadmap and inspiration for researchers in this field.
  •  
13.
  • Fontes, Afonso, 1987, et al. (author)
  • Using Machine Learning to Generate Test Oracles: A Systematic Literature Review
  • 2021
  • In: TORACLE 2021 - Proceedings of the 1st International Workshop on Test Oracles, co-located with ESEC/FSE 2021. - New York, NY, USA : ACM. - 9781450386265 ; , s. 1-10
  • Conference paper (peer-reviewed), abstract:
    • Machine learning may enable the automated generation of test oracles. We have characterized emerging research in this area through a systematic literature review examining oracle types, researcher goals, the ML techniques applied, how the generation process was assessed, and the open research challenges in this emerging field. Based on a sample of 22 relevant studies, we observed that ML algorithms generated test verdict, metamorphic relation, and - most commonly - expected output oracles. Almost all studies employ a supervised or semi-supervised approach, trained on labeled system executions or code metadata - including neural networks, support vector machines, adaptive boosting, and decision trees. Oracles are evaluated using the mutation score, correct classifications, accuracy, and ROC. Work-to-date shows great promise, but there are significant open challenges regarding the requirements imposed on training data, the complexity of modeled functions, the ML algorithms employed - and how they are applied - the benchmarks used by researchers, and replicability of the studies. We hope that our findings will serve as a roadmap and inspiration for researchers in this field.
  •  
14.
  • Gay, Gregory, 1987, et al. (author)
  • Defects4J as a Challenge Case for the Search-Based Software Engineering Community
  • 2020
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - Cham : Springer International Publishing. - 1611-3349 .- 0302-9743. - 9783030597627 ; 12420 LNCS, s. 255-261
  • Conference paper (peer-reviewed), abstract:
    • Defects4J is a collection of reproducible bugs, extracted from real-world Java software systems, together with a supporting infrastructure for using these bugs. Defects4J has been widely used to evaluate software engineering research, including research on automated test generation, program repair, and fault localization. Defects4J has recently grown substantially, both in number of software systems and number of bugs. This report proposes that Defects4J can serve as a benchmark for Search-Based Software Engineering (SBSE) research as well as a catalyst for new innovations. Specifically, it outlines the current Defects4J dataset and infrastructure, and details how it can serve as a challenge case to support SBSE research and to expand Defects4J itself.
  •  
15.
  • Gay, Gregory, 1987, et al. (author)
  • How Closely are Common Mutation Operators Coupled to Real Faults?
  • 2023
  • In: Proceedings - 2023 IEEE 16th International Conference on Software Testing, Verification and Validation, ICST 2023. ; , s. 129-140
  • Conference paper (peer-reviewed), abstract:
    • In mutation testing, faulty versions of a program are generated through automated modifications of source code. These mutants are used to assess and improve test suite quality, under the assumption that detection of mutants is indicative of a test suite's ability to detect real faults - i.e., that mutants and faults have a semantic relationship. Improving the effectiveness - in both cost and quality - of mutation testing may lie in better understanding this relationship, in particular with regard to how individual mutation operators (types) couple to real faults. In this study, we examine coupling between 32,002 mutants produced by 31 mutation operators and 144 real faults, using a scale based on number of failing tests and reasons for failure. Ultimately, we observed that 9.92% of the mutants are strongly coupled to real faults, and 51.03% of the faults have at least one strongly coupled mutant. We identify and examine mutation operators with the highest median coupling, as well as the operators that tend to produce non-compiling mutants, undetected mutants, and mutants that cause tests other than those that detect the actual fault to fail. We also examine how coupling could be used to filter the set of operators employed, leading to potentially significant cost savings during mutation testing. Our findings could lead to improvements in how mutation testing is applied, improved implementation of specific mutation operators, and inspiration for new mutation operators.
  •  
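
To make the notion of operator-level coupling concrete (an illustrative example, not one of the study's 31 operators verbatim), the sketch below applies a relational-operator-replacement mutant to a small function and shows which kind of test kills it. A mutant is coupled to a real fault when tests that detect the fault also tend to detect the mutant.

    # Original implementation.
    def is_adult(age):
        return age >= 18

    # Mutant produced by relational operator replacement: '>=' becomes '>'.
    def is_adult_mutant(age):
        return age > 18

    def test_boundary():
        # Kills the mutant: the original and the mutant disagree at age 18.
        assert is_adult(18) is True

    def test_obvious_case():
        # Does NOT kill the mutant: both versions agree for age 30.
        assert is_adult(30) is True

    if __name__ == "__main__":
        print("mutant killed by boundary input:", is_adult(18) != is_adult_mutant(18))  # True
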
16.
  • Gay, Gregory, 1987 (author)
  • Improving the Readability of Generated Tests Using GPT-4 and ChatGPT Code Interpreter
  • 2024
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - 1611-3349 .- 0302-9743. ; 14415 LNCS, s. 140-146
  • Conference paper (peer-reviewed), abstract:
    • A major challenge in automated test generation is the readability of generated tests. Emerging large language models (LLMs) excel at language analysis and transformation tasks. We propose that improving test readability is such a task and explore the capabilities of the GPT-4 LLM in improving readability of tests generated by the Pynguin search-based generation framework. Our initial results are promising. However, there are remaining research and technical challenges.
  •  
17.
  •  
18.
  • Gereziher, Teklit Berihu, et al. (author)
  • Search-Based Test Generation Targeting Non-Functional Quality Attributes of Android Apps
  • 2023
  • In: GECCO '23: Proceedings of the Genetic and Evolutionary Computation Conference. - 9798400701191
  • Conference paper (peer-reviewed), abstract:
    • Mobile apps form a major proportion of the software marketplace and it is crucial to ensure that they meet both functional and nonfunctional quality thresholds. Automated test input generation can reduce the cost of the testing process. However, existing Android test generation approaches are focused on code coverage and cannot be customized to a tester's diverse goals, in particular quality attributes such as resource use. We propose a flexible multi-objective search-based test generation framework for interface testing of Android apps: STGFA-SMOG. This framework allows testers to target a variety of fitness functions, corresponding to different software quality attributes, code coverage, and other test case properties. We find that STGFA-SMOG outperforms random test generation in exposing potential quality issues and triggering crashes. Our study also offers insights on how different combinations of fitness functions can affect test generation for Android apps.
  •  
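
As a generic illustration of what multi-objective selection means here (not STGFA-SMOG itself), the sketch below scores candidate tests on two invented objectives, coverage to maximize and energy cost to minimize, and keeps the Pareto front of non-dominated candidates.

    # Invented scores: (coverage, energy_cost); maximize the first, minimize the second.
    candidates = {
        "test_a": (0.60, 120.0),
        "test_b": (0.75, 150.0),
        "test_c": (0.75, 110.0),
        "test_d": (0.40, 90.0),
    }

    def dominates(p, q):
        """p dominates q if it is at least as good on both objectives and better on one."""
        cov_p, cost_p = p
        cov_q, cost_q = q
        return (cov_p >= cov_q and cost_p <= cost_q) and (cov_p > cov_q or cost_p < cost_q)

    def pareto_front(scored):
        return [name for name, score in scored.items()
                if not any(dominates(other, score)
                           for other_name, other in scored.items() if other_name != name)]

    if __name__ == "__main__":
        print("non-dominated tests:", pareto_front(candidates))  # ['test_c', 'test_d']
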
19.
  • Götharsson, Malte, et al. (author)
  • Exploring the Role of Automation in Duplicate Bug Report Detection: An Industrial Case Study
  • 2024
  • In: Proceedings - 2024 IEEE/ACM International Conference on Automation of Software Test, AST 2024. ; , s. 193-203
  • Conference paper (peer-reviewed), abstract:
    • Duplicate bug reports can increase technical debt and tester workload in long-running software projects. Many automated techniques have been proposed to detect potential duplicate reports. However, such techniques have not seen widespread industrial adoption. Our objective in this study is to better understand how automated techniques could effectively be employed within a tester's duplicate detection workflow. We are particularly interested in exploring the potential of a human-in-the-loop scenario where tools and humans work together to make duplicate determinations. We have conducted an industrial case study where we characterize the current tester workflow. Based on this characterization, we have developed Bugle, an automated technique based on a complex language model that suggests potential duplicates to testers based on an input bug description that can be freely reformulated if the initial suggestions are irrelevant. We compare the assessments of Bugle and testers of varying experience, capturing how often, and why, opinions might differ between the two, and comparing the strengths and limitations of automated techniques to the current tester workflow. We additionally examine the influence of knowledge and biases on accuracy, the suitability of language models, and the limitations affecting duplicate detection techniques.
  •  
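
The sketch below shows one common baseline for suggesting potential duplicate bug reports: ranking existing reports by TF-IDF cosine similarity to an incoming description with scikit-learn. Bugle itself is based on a language model; this is only a simplified stand-in for the suggestion step of the workflow, and the report texts are invented.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Invented reports standing in for an issue-tracker backlog.
    existing_reports = [
        "App crashes when exporting report to PDF on Windows",
        "Login button unresponsive after session timeout",
        "Crash on PDF export when file name contains unicode characters",
        "Dark mode colors not applied to settings dialog",
    ]

    def suggest_duplicates(new_report, reports, top_k=3):
        """Rank existing reports by cosine similarity of their TF-IDF vectors to the new report."""
        vectorizer = TfidfVectorizer(stop_words="english")
        matrix = vectorizer.fit_transform(reports)
        query = vectorizer.transform([new_report])
        scores = cosine_similarity(query, matrix)[0]
        return sorted(zip(scores, reports), reverse=True)[:top_k]

    if __name__ == "__main__":
        for score, report in suggest_duplicates("Exporting to PDF makes the application crash",
                                                existing_reports):
            print(f"{score:.2f}  {report}")
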
20.
  • Habibullah, Khan Mohammad, et al. (author)
  • Non-Functional Requirements for Machine Learning: An Exploration of System Scope and Interest
  • 2022
  • In: Proceedings - Workshop on Software Engineering for Responsible AI, SE4RAI 2022. - New York, NY, USA : ACM. ; , s. 29-36
  • Conference paper (peer-reviewed), abstract:
    • Systems that rely on Machine Learning (ML systems) have differing demands on quality - non-functional requirements (NFRs) - compared to traditional systems. NFRs for ML systems may differ in their definition, scope, and importance. Despite the importance of NFRs for ML systems, our understanding of their definitions and scope - and of the extent of existing research - is lacking compared to our understanding in traditional domains. Building on an investigation into importance and treatment of ML system NFRs in industry, we make three contributions towards narrowing this gap: (1) we present clusters of ML system NFRs based on shared characteristics, (2) we use Scopus search results - as well as inter-coder reliability on a sample of NFRs - to estimate the number of relevant studies on a subset of the NFRs, and (3), we use our initial reading of titles and abstracts in each sample to define the scope of NFRs over parts of the system (e.g., training data, ML model). These initial findings form the groundwork for future research in this emerging domain.
  •  
21.
  • Habibullah, Khan Mohammad, et al. (author)
  • Non-functional requirements for machine learning: understanding current use and challenges among practitioners
  • 2023
  • In: Requirements Engineering. - : Springer Science and Business Media LLC. - 0947-3602 .- 1432-010X. ; 28, s. 283-316
  • Journal article (peer-reviewed), abstract:
    • Systems that rely on Machine Learning (ML systems) have differing demands on quality—known as non-functional requirements (NFRs)—from traditional systems. NFRs for ML systems may differ in their definition, measurement, scope, and comparative importance. Despite the importance of NFRs in ensuring the quality of ML systems, our understanding of all of these aspects is lacking compared to our understanding of NFRs in traditional domains. We have conducted interviews and a survey to understand how NFRs for ML systems are perceived among practitioners from both industry and academia. We have identified the degree of importance that practitioners place on different NFRs, including cases where practitioners are in agreement or have differences of opinion. We explore how NFRs are defined and measured over different aspects of a ML system (i.e., model, data, or whole system). We also identify challenges associated with NFR definition and measurement. Finally, we explore differences in perspective between practitioners in industry, academia, or a blended context. This knowledge illustrates how NFRs for ML systems are treated in current practice, and helps to guide future RE for ML efforts.
  •  
22.
  • Habibullah, Khan Mohammad, et al. (author)
  • Requirements and software engineering for automotive perception systems: an interview study
  • 2024
  • In: REQUIREMENTS ENGINEERING. - 0947-3602 .- 1432-010X. ; 29:1, s. 25-48
  • Journal article (peer-reviewed), abstract:
    • Driving automation systems, including autonomous driving and advanced driver assistance, are an important safety-critical domain. Such systems often incorporate perception systems that use machine learning to analyze the vehicle environment. We explore new or differing topics and challenges experienced by practitioners in this domain, which relate to requirements engineering (RE), quality, and systems and software engineering. We have conducted a semi-structured interview study with 19 participants across five companies and performed thematic analysis of the transcriptions. Practitioners have difficulty specifying upfront requirements and often rely on scenarios and operational design domains (ODDs) as RE artifacts. RE challenges relate to ODD detection and ODD exit detection, realistic scenarios, edge case specification, breaking down requirements, traceability, creating specifications for data and annotations, and quantifying quality requirements. Practitioners consider performance, reliability, robustness, user comfort, and, most importantly, safety as important quality attributes. Quality is assessed using statistical analysis of key metrics, and quality assurance is complicated by the addition of ML, simulation realism, and evolving standards. Systems are developed using a mix of methods, but these methods may not be sufficient for the needs of ML. Data quality methods must be a part of development methods. ML also requires a data-intensive verification and validation process, introducing data, analysis, and simulation challenges. Our findings contribute to understanding RE, safety engineering, and development methodologies for perception systems. This understanding and the collected challenges can drive future research for driving automation and other ML systems.
  •  
23.
  •  
24.
  • Habibullah, Khan Mohammad, 1990, et al. (author)
  • Requirements Engineering for Automotive Perception Systems: An Interview Study
  • 2023
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - 1611-3349 .- 0302-9743. - 9783031297854 ; 13975 LNCS, s. 189-205
  • Conference paper (peer-reviewed), abstract:
    • Background: Driving automation systems (DAS), including autonomous driving and advanced driver assistance, are an important safety-critical domain. DAS often incorporate perception systems that use machine learning (ML) to analyze the vehicle environment. Aims: We explore new or differing requirements engineering (RE) topics and challenges that practitioners experience in this domain. Method: We have conducted an interview study with 19 participants across five companies and performed thematic analysis. Results: Practitioners have difficulty specifying upfront requirements, and often rely on scenarios and operational design domains (ODDs) as RE artifacts. Challenges relate to ODD detection and ODD exit detection, realistic scenarios, edge case specification, breaking down requirements, traceability, creating specifications for data and annotations, and quantifying quality requirements. Conclusions: Our findings contribute to understanding how RE is practiced for DAS perception systems and the collected challenges can drive future research for DAS and other ML-enabled systems.
  •  
25.
  • Habibullah, Khan Mohammad, et al. (author)
  • Scoping of Non-Functional Requirements for Machine Learning Systems
  • 2024
  • In: Proceedings of the IEEE International Conference on Requirements Engineering. - 2332-6441 .- 1090-705X. ; , s. 496-497
  • Conference paper (peer-reviewed), abstract:
    • Machine Learning (ML) systems increasingly perform complex decision-making and prediction tasks - e.g., in autonomous driving - based on patterns inferred from large quantities of data. The inclusion of ML increases the capabilities of software systems, but also introduces or exacerbates challenges. ML systems can be more complex, time-consuming and expensive to specify, develop, and test than traditional systems, and can suffer from issues related to safety, lack of explainability, limited maintainability, and bias [1], [2]. As in other domains, ML systems must satisfy certain quality requirements - known as non-functional requirements (NFRs) - to be considered fit for purpose [1].
  •  
26.
  • Istanbuly, Dia, et al. (author)
  • How Do Different Types of Testing Goals Affect Test Case Design?
  • 2023
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - 0302-9743 .- 1611-3349. - 9783031432392
  • Conference paper (peer-reviewed), abstract:
    • Test cases are designed in service of goals, e.g., functional correctness or performance. Unfortunately, we lack a clear understanding of how specific goal types influence test design. In this study, we explore this relationship through interviews and a survey with software developers, with a focus on identification and importance of goal types, quantitative relations between goals and tests, and personal, organizational, methodological, and technological factors. We identify nine goal types and their importance, and perform further analysis of three—correctness, reliability, and quality. We observe that test design for correctness forms a “default” design process that is modified when pursuing other goals. For the examined goal types, test cases tend to be simple, with many tests targeting a single goal and each test focusing on 1–2 goals at a time. We observe differences in testing practices, tools, and targeted system types between goal types. In addition, we observe that test design can be influenced by organization, process, and team makeup. This study provides a foundation for future research on test design and testing goals.
  •  
27.
  • Lyu, Haozhou, et al. (author)
  • Developer Views on Software Carbon Footprint and Its Potential for Automated Reduction
  • 2023
  • In: Search-Based Software Engineering. SSBSE 2023. Lecture Notes in Computer Science, vol 14415.
  • Conference paper (peer-reviewed), abstract:
    • Reducing software carbon footprint could contribute to efforts to avert climate change. Past research indicates that developers lack knowledge on energy consumption and carbon footprint, and existing reduction guidelines are difficult to apply. Therefore, we propose that automated reduction methods should be explored, e.g., through genetic improvement. However, such tools must be voluntarily adopted and regularly used to have an impact. In this study, we have conducted interviews and a survey (a) to explore developers’ existing opinions, knowledge, and practices with regard to carbon footprint and energy consumption, and (b), to identify the requirements that automated reduction tools must meet to ensure adoption. Our findings offer a foundation for future research on practices, guidelines, and automated tools that address software carbon footprint.
  •  
28.
  • Lyu, Haozhou, et al. (author)
  • Exploring Genetic Improvement of the Carbon Footprint of Web Pages
  • 2023
  • In: Search-Based Software Engineering. SSBSE 2023. Lecture Notes in Computer Science, vol 14415.
  • Conference paper (peer-reviewed), abstract:
    • In this study, we explore automated reduction of the carbon footprint of web pages through genetic improvement, a process that produces alternative versions of a program by applying program transformations intended to optimize qualities of interest. We introduce a prototype tool that imposes transformations to HTML, CSS, and JavaScript code, as well as image resources, that minimize the quantity of data transferred and memory usage while also minimizing impact to the user experience (measured through loading time and number of changes imposed).
  •  
29.
  • Lyu, Haozhou, et al. (author)
  • Exploring Genetic Improvement of the Carbon Footprint of Web Pages
  • 2024
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - 1611-3349 .- 0302-9743. ; 14415 LNCS, s. 67-83
  • Conference paper (peer-reviewed), abstract:
    • In this study, we explore automated reduction of the carbon footprint of web pages through genetic improvement, a process that produces alternative versions of a program by applying program transformations intended to optimize qualities of interest. We introduce a prototype tool that imposes transformations to HTML, CSS, and JavaScript code, as well as image resources, that minimize the quantity of data transferred and memory usage while also minimizing impact to the user experience (measured through loading time and number of changes imposed).
  •  
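
As a toy, single-individual version of the genetic-improvement loop described above (not the authors' prototype), the sketch below mutates an HTML string with two simple transformations, dropping comments and collapsing whitespace, and accepts variants that shrink the page while leaving the visible text unchanged, a rough stand-in for reducing transferred bytes without harming the user experience.

    import random
    import re

    ORIGINAL_HTML = """<!-- promotional banner -->
    <html>
      <body>
          <h1>  Welcome  </h1>
          <!-- analytics snippet placeholder -->
          <p>  Lower-carbon   pages transfer fewer bytes.  </p>
      </body>
    </html>"""

    def strip_comments(html):
        return re.sub(r"<!--.*?-->", "", html, flags=re.S)

    def collapse_whitespace(html):
        return re.sub(r"[ \t]+", " ", html)

    TRANSFORMATIONS = [strip_comments, collapse_whitespace]

    def visible_text(html):
        # Very rough "user experience" proxy: the rendered text content.
        return " ".join(re.sub(r"<[^>]+>", " ", strip_comments(html)).split())

    def improve(html, generations=20):
        best = html
        for _ in range(generations):
            variant = random.choice(TRANSFORMATIONS)(best)
            # Accept only variants that are smaller and preserve the visible text.
            if len(variant) < len(best) and visible_text(variant) == visible_text(html):
                best = variant
        return best

    if __name__ == "__main__":
        improved = improve(ORIGINAL_HTML)
        print(f"bytes before: {len(ORIGINAL_HTML)}, after: {len(improved)}")
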
30.
  • Meng, Ying, et al. (author)
  • Understanding The Impact of Solver Choice in Model-Based Test Generation
  • 2020
  • In: International Symposium on Empirical Software Engineering and Measurement. - New York, NY, USA : ACM. - 1949-3789 .- 1949-3770. ; ESEM '20
  • Conference paper (peer-reviewed), abstract:
    • Background: In model-based test generation, SMT solvers explore the state-space of the model in search of violations of specified properties. If the solver finds that a predicate can be violated, it produces a partial test specification demonstrating the violation. Aims: The choice of solvers is important, as each may produce differing counterexamples. We aim to understand how solver choice impacts the effectiveness of generated test suites at finding faults. Method: We have performed experiments examining the impact of solver choice across multiple dimensions, examining the ability to attain goal satisfaction and fault detection when satisfaction is achieved, varying the source of test goals, data types of model input, and test oracle. Results: The results of our experiment show that solvers vary in their ability to produce counterexamples and, for models where all solvers achieve goal satisfaction, in the resulting fault detection of the generated test suites. The choice of solver has an impact on the resulting test suite, regardless of the oracle, model structure, or source of testing goals. Conclusions: The results of this study identify factors that impact fault-detection effectiveness, and offer advice that could improve future approaches to model-based test generation.
  •  
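
To illustrate how an SMT solver's counterexample becomes a partial test specification, the sketch below uses the Z3 Python bindings (the z3-solver package is assumed to be installed) on an invented toy model: the solver searches for an input that violates an asserted safety property, and the satisfying assignment it returns is exactly the kind of counterexample that different solvers may produce differently.

    from z3 import Int, Solver, And, Implies, Not, sat

    # Toy "model": controller output y computed from sensor input x, plus an
    # intended safety property stating that y stays below 100 for in-range x.
    x = Int("x")
    y = Int("y")

    model = And(x >= 0, x <= 50, y == 3 * x)
    safety_property = Implies(And(x >= 0, x <= 50), y < 100)

    solver = Solver()
    solver.add(model)
    solver.add(Not(safety_property))      # search for a violation of the property

    if solver.check() == sat:
        # The returned model doubles as a partial test specification: an input (x)
        # and the observation (y) demonstrating the violation, e.g. x = 34, y = 102.
        print("violation found:", solver.model())
    else:
        print("property holds for this model")
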
31.
  • Orgard, Jonathan, et al. (author)
  • Mutation Testing in Continuous Integration: An Exploratory Industrial Case Study
  • 2023
  • In: Proceedings - 2023 IEEE 16th International Conference on Software Testing, Verification and Validation Workshops, ICSTW 2023. - 9798350333350 ; , s. 324-333
  • Conference paper (peer-reviewed), abstract:
    • Despite its potential quality benefits, the cost of mutation testing and the immaturity of mutation tools for many languages have led to a lack of adoption in industrial software development. In an exploratory case study at Zenseact - a company in the automotive domain - we have explored how mutation testing could be effectively applied in a typical Continuous Integration-based workflow. We evaluated the capabilities of C++ mutation tools, and demonstrate their use in GitHub Actions-based CI workflows. Our investigation reveals that Dextool and Mull could be used in a CI workflow. Additionally, we conducted an interview study to understand how developers would use mutation testing in their CI workflows. Based on our qualitative analysis and practices proposed in the literature, we discuss recommendations to integrate mutation testing in a CI workflow. For instance, visualising trends in the mutation score enables practitioners to understand how test quality is evolving. Moreover, tools should have a balance between offering fast feedback and keeping or flagging relevant mutants. Lastly, practitioners raised the need for mutation testing to be applied at the commit level, and for developers inexperienced with mutation testing to be trained in the implications of the practice.
  •  
32.
  • Salahirad, Alireza, et al. (author)
  • Mapping the structure and evolution of software testing research over the past three decades
  • 2023
  • In: Journal of Systems and Software. - 0164-1212. ; 195
  • Journal article (peer-reviewed), abstract:
    • Background: The field of software testing is growing and rapidly-evolving. Aims: Based on keywords assigned to publications, we seek to identify predominant research topics and understand how they are connected and have evolved. Methods: We apply co-word analysis to map the topology of testing research as a network where author-assigned keywords are connected by edges indicating co-occurrence in publications. Keywords are clustered based on edge density and frequency of connection. We examine the most popular keywords, summarize clusters into high-level research topics, examine how topics connect, and examine how the field is changing. Results: Testing research can be divided into 16 high-level topics and 18 subtopics. Creation guidance, automated test generation, evolution and maintenance, and test oracles have particularly strong connections to other topics, highlighting their multidisciplinary nature. Emerging keywords relate to web and mobile apps, machine learning, energy consumption, automated program repair and test generation, while emerging connections have formed between web apps, test oracles, and machine learning with many topics. Random and requirements-based testing show potential decline. Conclusions: Our observations, advice, and map data offer a deeper understanding of the field and inspiration regarding challenges and connections to explore. Editor's note: Open Science material was validated by the Journal of Systems and Software Open Science Board.
  •  
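
The following sketch shows the co-word method described above in miniature: author keywords from a few invented publication records become nodes, edges are weighted by co-occurrence, and a modularity-based community detection pass from networkx groups them into candidate topic clusters. It is a schematic of the method, not the paper's pipeline or data.

    from itertools import combinations

    import networkx as nx
    from networkx.algorithms.community import greedy_modularity_communities

    # Invented author-keyword lists standing in for indexed publications.
    publications = [
        ["test generation", "search-based software engineering", "unit testing"],
        ["test generation", "machine learning", "test oracles"],
        ["mutation testing", "unit testing", "test oracles"],
        ["machine learning", "test oracles", "deep learning"],
        ["search-based software engineering", "genetic improvement"],
    ]

    # Build the co-word network: keywords are nodes, edge weights count co-occurrences.
    graph = nx.Graph()
    for keywords in publications:
        for a, b in combinations(sorted(set(keywords)), 2):
            if graph.has_edge(a, b):
                graph[a][b]["weight"] += 1
            else:
                graph.add_edge(a, b, weight=1)

    # Cluster keywords into candidate research topics by modularity.
    for i, cluster in enumerate(greedy_modularity_communities(graph, weight="weight"), start=1):
        print(f"topic {i}: {sorted(cluster)}")
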
33.
  • Salahirad, Alireza, et al. (author)
  • Mapping the structure and evolution of software testing research over the past three decades
  • 2023
  • In: Journal of Systems and Software. - : Elsevier BV. - 0164-1212. ; 195
  • Journal article (peer-reviewed), abstract:
    • Background: The field of software testing is growing and rapidly-evolving. Aims: Based on keywords assigned to publications, we seek to identify predominant research topics and understand how they are connected and have evolved. Methods: We apply co-word analysis to map the topology of testing research as a network where author-assigned keywords are connected by edges indicating co-occurrence in publications. Keywords are clustered based on edge density and frequency of connection. We examine the most popular keywords, summarize clusters into high-level research topics, examine how topics connect, and examine how the field is changing. Results: Testing research can be divided into 16 high-level topics and 18 subtopics. Creation guidance, automated test generation, evolution and maintenance, and test oracles have particularly strong connections to other topics, highlighting their multidisciplinary nature. Emerging keywords relate to web and mobile apps, machine learning, energy consumption, automated program repair and test generation, while emerging connections have formed between web apps, test oracles, and machine learning with many topics. Random and requirements-based testing show potential decline. Conclusions: Our observations, advice, and map data offer a deeper understanding of the field and inspiration regarding challenges and connections to explore. Editor's note: Open Science material was validated by the Journal of Systems and Software Open Science Board.
  •  
34.
  • Staron, Miroslaw, 1977, et al. (author)
  • Testing, Debugging, and Log Analysis With Modern AI Tools
  • 2024
  • In: IEEE Software. - 0740-7459 .- 1937-4194. ; 41, s. 99-102
  • Journal article (peer-reviewed), abstract:
    • This edition of the Practitioners Digest covers recent papers employing generative artificial intelligence in support of testing, debugging, and log analysis that were presented at the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023) and the 16th IEEE International Conference on Software, Testing, Verification and Validation (ICST 2023). Feedback or suggestions are welcome. In addition, if you try or adopt any of the practices included in the column, please send us and the authors of the paper(s) a note about your experiences.
  •  
35.
  • Unterkalmsteiner, Michael, et al. (author)
  • Summary of the 5th International Workshop on Requirements Engineering and Testing (RET 2018)
  • 2019
  • In: Software Engineering Notes. - : Association for Computing Machinery. - 0163-5948 .- 1943-5843. ; 44:1, s. 31-34
  • Journal article (pop. science, debate, etc.), abstract:
    • The RET (Requirements Engineering and Testing) workshop series provides a meeting point for researchers and practitioners from the two separate fields of Requirements Engineering (RE) and Testing. The goal is to improve the connection and alignment of these two areas through an exchange of ideas, challenges, practices, experiences and results. The long term aim is to build a community and a body of knowledge within the intersection of RE and Testing, i.e. RET. The 5th workshop was held in co-location with ICSE 2018 in Gothenburg, Sweden. The workshop continued in the same interactive vein as the predecessors. We introduced a new format for the presentations in which the paper authors had the opportunity to interact extensively with the audience. Each author was supported by a member of the organization committee to prepare either an extensive demo, collect more data in the form of a questionnaire, or perform a hands-on tutorial. We named this new format the "X-ray session". In order to create an RET knowledge base, this cross-cutting area elicits contributions from both RE and Testing, and from both researchers and practitioners. A range of papers were presented, from short position papers to full research papers, that cover connections between the two fields. The workshop attracted 27 participants and the positive feedback on the new format encourages us to organize the workshop again next year.
  •  
36.
  • Niemi, MEK, et al. (author)
  • 2021
  •  
37.
  • Kanai, M, et al. (author)
  • 2023
  • swepub:Mat__t
  •  
