SwePub
Search the SwePub database


Result list for the search "WFRF:(Munappy Aiswarya Raj 1990)"

Search: WFRF:(Munappy Aiswarya Raj 1990)

  • Results 1-14 of 14
1.
  • Munappy, Aiswarya Raj, 1990, et al. (authors)
  • From Ad-Hoc Data Analytics to DataOps
  • 2020
  • In: Proceedings - 2020 IEEE/ACM International Conference on Software and System Processes, ICSSP 2020. - New York, NY, USA: ACM, pp. 165-174
  • Conference paper (peer-reviewed). Abstract:
    • The collection of high-quality data provides a key competitive advantage to companies in their decision-making processes. It helps them understand customer behavior and enables the use and deployment of new technologies based on machine learning. However, the process of collecting data and then cleaning and preparing it for use by data scientists and applications is often manual, non-optimized, and error-prone, which increases the time it takes for the data to deliver value to the business. To reduce this time, companies are looking into automation and validation of their data processes, the operational side of the data analytics workflow. DataOps, a term recently coined by data scientists, data analysts, and data engineers, refers to a general process aimed at shortening the end-to-end data analytics life-cycle time by introducing automation into data collection, validation, and verification. Despite its increasing popularity among practitioners, research on this topic has been limited and does not provide a clear definition of the term, or of how a data analytics process evolves from ad-hoc data collection to the fully automated data analytics envisioned by DataOps. This research provides three main contributions. First, utilizing a multi-vocal literature review, we provide a definition and a scope for the general process referred to as DataOps. Second, based on a case study with a large mobile telecommunications organization, we analyze how multiple data analytics teams evolve their infrastructure and processes towards DataOps. Third, we provide a stairway showing the different stages of this evolution. With this evolution model, companies can identify the stage they currently belong to and work towards the next stage by overcoming the challenges they encounter at the current stage. (An illustrative sketch of the automation idea follows this entry.)
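The publication above does not include code; purely to illustrate the kind of automation DataOps introduces into data collection, validation, and verification, here is a minimal, hypothetical Python sketch. The function names, fields, and validation rules are assumptions made for this example, not the authors' implementation.

```python
# Hypothetical illustration of DataOps-style automation: every batch passes
# through automated collection, validation, and verification steps instead
# of manual, ad-hoc checks. Names and rules are illustrative assumptions.

def collect() -> list[dict]:
    """Stand-in for a data source (e.g., device logs pulled on a schedule)."""
    return [
        {"device_id": "A1", "latency_ms": 12.5},
        {"device_id": "A2", "latency_ms": -3.0},   # invalid: negative latency
        {"device_id": None, "latency_ms": 8.1},    # invalid: missing id
    ]

def validate(record: dict) -> bool:
    """Automated per-record validation: required fields and value ranges."""
    return (
        record.get("device_id") is not None
        and isinstance(record.get("latency_ms"), (int, float))
        and record["latency_ms"] >= 0
    )

def verify(batch: list[dict], min_rows: int = 1) -> None:
    """Automated batch-level verification before the data is published."""
    if len(batch) < min_rows:
        raise RuntimeError(f"verification failed: only {len(batch)} valid rows")

if __name__ == "__main__":
    raw = collect()
    clean = [r for r in raw if validate(r)]
    verify(clean)
    # In a DataOps setup this report would feed monitoring, not a person.
    print(f"collected={len(raw)} valid={len(clean)} rejected={len(raw) - len(clean)}")
```

In a DataOps setting, a job of this shape would run on every batch and feed its report into monitoring rather than relying on manual inspection.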
2.
  • Dakkak, Anas, et al. (authors)
  • Customer Support In The Era of Continuous Deployment: A Software-Intensive Embedded Systems Case Study
  • 2022
  • In: Proceedings - 2022 IEEE 46th Annual Computers, Software, and Applications Conference, COMPSAC 2022. - Institute of Electrical and Electronics Engineers (IEEE), pp. 914-923
  • Conference paper (peer-reviewed). Abstract:
    • Supporting customers after they acquire the product is essential for companies producing and selling software-intensive embedded systems products. Generally, customer support is the first interaction point between the product users and the product vendor. Customer support is often engaged with answering customers' questions, troubleshooting, fault identification, and fixing product faults. While continuous deployment advocates for closer cooperation between the ones operating the software and the ones developing it, the means of such collaboration in general, and the role of customer support in particular, have not been addressed in the context of software-intensive embedded systems. Therefore, to better understand the impact that continuous deployment has on customer support and the role customer support should play in this context, we conducted a case study at a multinational company developing and selling telecommunications network infrastructure. We focused on the 4th and 5th Generation (4G and 5G) Radio Access Network (RAN) products, which can be considered high-volume products as they cover more than 80% of the world's population. Our study reveals that customer support needs to transition from a transaction-based, passive function triggered by customer support requests to an active role characterized by being proactive and preemptive, in order to cope with the shorter operational time of a software version introduced by continuous deployment. In addition, customer support plays an essential role in making feedback actionable by aggregating and consolidating feedback data for the R&D organization.
3.
  • Lwakatare, Lucy, 1987, et al. (authors)
  • A taxonomy of software engineering challenges for machine learning systems: An empirical investigation
  • 2019
  • In: Lecture Notes in Business Information Processing. - Cham: Springer International Publishing. - ISSN 1865-1356, 1865-1348. Vol. 355, pp. 227-243
  • Conference paper (peer-reviewed). Abstract:
    • Artificial intelligence (AI) enabled systems have become an integral part of everyday life. However, efficient software engineering principles and processes need to be considered and extended when developing AI-enabled systems. The objective of this study is to identify and classify the software engineering challenges faced by different companies when developing software-intensive systems that incorporate machine learning components. Using a case study approach, we explored the development of machine learning systems at six different companies across various domains and identified the main software engineering challenges. The challenges are mapped into a proposed taxonomy that depicts the evolution of the use of ML components in software-intensive systems in industrial settings. Our study provides insights to the software engineering community and researchers to guide discussions and future research into applied machine learning.
4.
  • Lwakatare, Lucy, 1987, et al. (authors)
  • Large-scale machine learning systems in real-world industrial settings: A review of challenges and solutions
  • 2020
  • In: Information and Software Technology. - Elsevier BV. - ISSN 0950-5849, 1873-6025. Vol. 127
  • Journal article (peer-reviewed). Abstract:
    • Background: Developing and maintaining large-scale machine learning (ML) based software systems in an industrial setting is challenging. There are no well-established development guidelines, but the literature contains reports on how companies develop and maintain deployed ML-based software systems. Objective: This study aims to survey the literature related to the development and maintenance of large-scale ML-based systems in industrial settings in order to provide a synthesis of the challenges that practitioners face. In addition, we identify solutions used to address some of these challenges. Method: A systematic literature review was conducted, and we identified 72 papers related to the development and maintenance of large-scale ML-based software systems in industrial settings. The selected articles were qualitatively analyzed by extracting challenges and solutions. The challenges and solutions were thematically synthesized into four quality attributes: adaptability, scalability, safety, and privacy. The analysis was done in relation to the ML workflow, i.e. data acquisition, training, evaluation, and deployment. Results: We identified a total of 23 challenges and 8 solutions related to the development and maintenance of large-scale ML-based software systems in industrial settings across six different domains. Challenges were most often reported in relation to adaptability and scalability. Safety and privacy challenges had the fewest reported solutions. Conclusion: The development and maintenance of large-scale ML-based systems in industrial settings introduce new challenges specific to ML and require new methods for overcoming the known challenges characteristic of these types of systems. The identified challenges highlight important concerns in ML system development practice, and the lack of solutions points to directions for future research.
5.
  • Munappy, Aiswarya Raj, 1990 (author)
  • Data management and Data Pipelines: An empirical investigation in the embedded systems domain
  • 2021
  • Licentiate thesis (other academic/artistic). Abstract:
    • Context: Companies are increasingly collecting data from all possible sources to extract insights that help in data-driven decision-making. Increased data volume, variety, and velocity, together with the impact of poor-quality data on the development of data products, are leading companies to look for an improved data management approach that can accelerate the development of high-quality data products. Further, AI is being applied in a growing number of fields and is thus evolving into a horizontal technology. Consequently, AI components are increasingly being integrated into embedded systems along with electronics and software. We refer to these systems as AI-enhanced embedded systems. Given the strong dependence of AI on data, this expansion also creates a new space for applying data management techniques. Objective: The overall goal of this thesis is to empirically identify the data management challenges encountered during the development and maintenance of AI-enhanced embedded systems, propose an improved data management approach, and empirically validate the proposed approach. Method: To achieve this goal, we conducted the research in close collaboration with Software Center companies using a combination of different empirical research methods: case studies, literature reviews, and action research. Results and conclusions: This research provides five main results. First, it identifies key data management challenges specific to deep learning models developed at embedded system companies. Second, it examines practices such as DataOps and data pipelines that help to address data management challenges. We observed that DataOps is the data management practice that best improves data quality and reduces the time to develop data products. The data pipeline is the critical component of DataOps that manages the data life cycle activities. The study also identifies potential faults at each step of the data pipeline and the corresponding mitigation strategies. Finally, a small part of the data pipeline model was realized, and the percentage of saved data dumps was calculated through the implementation. Future work: As future work, we plan to realize the conceptual data pipeline model so that companies can build customized, robust data pipelines. We also plan to analyze the impact and value of data pipelines in cross-domain AI systems and data applications, and to develop an AI-based fault detection and mitigation system suitable for data pipelines.
6.
  • Munappy, Aiswarya Raj, 1990, et al. (authors)
  • Data Management Challenges for Deep Learning
  • 2019
  • In: Proceedings - 45th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2019. - IEEE, pp. 140-147
  • Conference paper (peer-reviewed). Abstract:
    • Deep learning is one of the most exciting and fast-growing techniques in Artificial Intelligence. The unique capacity of deep learning models to automatically learn patterns from the data differentiates deep learning from other machine learning techniques, and it is responsible for a significant number of recent breakthroughs in AI. However, deep learning models are highly dependent on the underlying data, so the consistency, accuracy, and completeness of the data are essential for a deep learning model. Thus, data management principles and practices need to be adopted throughout the development process of deep learning models. The objective of this study is to identify and categorise the data management challenges faced by practitioners in different stages of end-to-end development. In this paper, a case study approach is employed to explore the data management issues faced by practitioners across various domains when they use real-world data for training and deploying deep learning models. Our case study is intended to provide valuable insights to the deep learning community as well as to data scientists, to guide discussion and future research in applied deep learning with real-world data. (An illustrative data-quality sketch follows this entry.)
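As a rough illustration of the consistency, accuracy, and completeness concerns discussed in the abstract above, the hypothetical Python sketch below runs a few basic checks on training records before they reach a deep learning model. The field names and thresholds are invented for the example and do not come from the study.

```python
# Hypothetical data-quality gate for a training set: completeness (no missing
# fields), consistency (uniform schema), and a crude accuracy proxy (values
# within an expected range). Field names and thresholds are assumptions.

EXPECTED_FIELDS = {"image_path", "label", "sensor_temp_c"}

def check_completeness(rows: list[dict]) -> list[str]:
    return [f"row {i}: missing {EXPECTED_FIELDS - row.keys()}"
            for i, row in enumerate(rows) if EXPECTED_FIELDS - row.keys()]

def check_consistency(rows: list[dict]) -> list[str]:
    return [f"row {i}: unexpected fields {row.keys() - EXPECTED_FIELDS}"
            for i, row in enumerate(rows) if row.keys() - EXPECTED_FIELDS]

def check_accuracy(rows: list[dict]) -> list[str]:
    # Example range check: temperatures far outside -40..125 C are suspect.
    return [f"row {i}: implausible sensor_temp_c={row['sensor_temp_c']}"
            for i, row in enumerate(rows)
            if "sensor_temp_c" in row and not -40 <= row["sensor_temp_c"] <= 125]

if __name__ == "__main__":
    rows = [
        {"image_path": "img/001.png", "label": "ok", "sensor_temp_c": 21.0},
        {"image_path": "img/002.png", "label": "fault"},                 # incomplete
        {"image_path": "img/003.png", "label": "ok", "sensor_temp_c": 999.0},
    ]
    issues = check_completeness(rows) + check_consistency(rows) + check_accuracy(rows)
    for issue in issues:
        print("DATA ISSUE:", issue)
```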
7.
  • Munappy, Aiswarya Raj, 1990, et al. (authors)
  • Data management for production quality deep learning models: Challenges and solutions
  • 2022
  • In: Journal of Systems and Software. - Elsevier BV. - ISSN 0164-1212, 1873-1228. Vol. 191
  • Journal article (peer-reviewed). Abstract:
    • Deep learning (DL) based software systems are difficult to develop and maintain in industrial settings due to several challenges. Data management is one of the most prominent challenges complicating DL in industrial deployments. DL models are data-hungry and require high-quality data, so the volume, variety, velocity, and quality of data cannot be compromised. This study aims to explore the data management challenges encountered by practitioners developing systems with DL components, identify potential solutions from the literature, and validate the solutions through a multiple case study. We identified 20 data management challenges experienced by DL practitioners through a multiple interpretive case study. Further, we identified 48 articles through a systematic literature review that discuss solutions for the data management challenges. In a second round of the multiple case study, we show that many of these solutions have limitations and are not used in practice due to a combination of four factors: high cost, lack of skill-set and infrastructure, inability to solve the problem completely, and incompatibility with certain DL use cases. Thus, data management for data-intensive DL models in production is complicated. Although DL technology has achieved very promising results, there is still a significant need for further research in the field of data management to build the high-quality datasets and streams needed for production-ready DL systems. Furthermore, we have classified the data management challenges into four categories based on the availability of the solutions.
8.
  • Munappy, Aiswarya Raj, 1990, et al. (authors)
  • Data Pipeline Management in Practice: Challenges and Opportunities
  • 2020
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - Cham: Springer International Publishing. - ISSN 1611-3349, 0302-9743. Vol. 12562, pp. 168-184
  • Conference paper (peer-reviewed). Abstract:
    • Data pipelines involve a complex chain of interconnected activities that starts with a data source and ends in a data sink. Data pipelines are important for data-driven organizations, since a data pipeline can process data in multiple formats from distributed data sources with minimal human intervention, accelerate data life cycle activities, and enhance productivity in data-driven enterprises. However, there are both challenges and opportunities in implementing data pipelines, and practical industry experiences are seldom reported. The findings of this study are derived from a qualitative multiple-case study and interviews with representatives of three companies. The challenges include data quality issues, infrastructure maintenance problems, and organizational barriers. On the other hand, data pipelines are implemented to enable traceability and fault tolerance and to reduce human error by maximizing automation, thereby producing high-quality data. Based on multiple-case study research with five use cases from three case companies, this paper identifies the key challenges and benefits associated with the implementation and use of data pipelines. (An illustrative source-to-sink sketch follows this entry.)
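To make the "chain of interconnected activities that starts with a data source and ends in a data sink" concrete, here is a minimal, hypothetical Python sketch that composes pipeline steps as plain functions. The step names and data are assumptions for illustration, not code from the studied companies.

```python
# Hypothetical source-to-sink chain: each step is a small function, and the
# pipeline is simply their composition. Step names and data are illustrative.

from typing import Callable, Iterable

Step = Callable[[Iterable[dict]], Iterable[dict]]

def source() -> list[dict]:
    return [{"id": 1, "value": " 42 "}, {"id": 2, "value": ""}]

def clean(rows: Iterable[dict]) -> Iterable[dict]:
    # Drop empty values and strip whitespace.
    return [{**r, "value": r["value"].strip()} for r in rows if r["value"].strip()]

def transform(rows: Iterable[dict]) -> Iterable[dict]:
    return [{**r, "value": int(r["value"])} for r in rows]

def sink(rows: Iterable[dict]) -> None:
    # Stand-in for a warehouse or feature-store write.
    for r in rows:
        print("stored:", r)

def run_pipeline(steps: list[Step]) -> None:
    data: Iterable[dict] = source()
    for step in steps:
        data = step(data)     # each activity hands its output to the next
    sink(data)

if __name__ == "__main__":
    run_pipeline([clean, transform])
```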
9.
  • Munappy, Aiswarya Raj, 1990, et al. (authors)
  • Maturity Assessment Model for Industrial Data Pipelines
  • 2023
  • In: Proceedings - Asia-Pacific Software Engineering Conference, APSEC. - IEEE Computer Society Digital Library. - ISSN 1530-1362. 30th Asia-Pacific Software Engineering Conference, APSEC 2023, pp. 503-513
  • Conference paper (peer-reviewed). Abstract:
    • Data pipelines can be defined as a complex chain of interconnected activities that starts with a data source and ends in a data sink. They can process data in multiple formats from various data sources with minimal human intervention, speed up data life cycle operations, and enhance productivity in data-driven organizations. As a result, companies place a high value on strengthening the maturity of their data pipelines. The available literature, however, falls short of providing a comprehensive roadmap to guide companies in assessing the maturity of their data pipelines. Therefore, this case study focuses on developing a data pipeline maturity assessment model that can evaluate the maturity of data pipelines in a staged manner, from maturity level 1 to maturity level 5. We conducted empirical research to develop the maturity assessment model on the basis of five determinants that address the specific needs of each data pipeline maturity level. Accordingly, the model aims to support organizations in assessing their current data pipeline maturity, determining the challenges at each stage, and preparing an extensive roadmap and suggestions for improving data pipeline maturity. In future work, we plan to employ the maturity model in different companies as a case study to evaluate its applicability and usefulness. (An illustrative staged-scoring sketch follows this entry.)
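The paper's maturity model is not published as code; purely as an illustration of staged assessment, the hypothetical sketch below scores a pipeline on five example determinants and maps the result to a level from 1 to 5. The determinant names and the min-based aggregation rule are invented for this example and are not the determinants or rules defined in the study.

```python
# Hypothetical staged maturity scoring: five example determinants are scored
# 1-5 and the pipeline's level is taken as the weakest determinant (a pipeline
# is only as mature as its least mature aspect). Everything here is invented.

EXAMPLE_DETERMINANTS = ["automation", "monitoring", "fault_handling",
                        "data_quality_controls", "documentation"]

def maturity_level(scores: dict[str, int]) -> int:
    missing = [d for d in EXAMPLE_DETERMINANTS if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    if not all(1 <= s <= 5 for s in scores.values()):
        raise ValueError("each determinant score must be between 1 and 5")
    return min(scores.values())

if __name__ == "__main__":
    assessment = {
        "automation": 3,
        "monitoring": 2,
        "fault_handling": 2,
        "data_quality_controls": 4,
        "documentation": 3,
    }
    level = maturity_level(assessment)
    print(f"data pipeline maturity: level {level} of 5")
    print("weakest areas:", [d for d, s in assessment.items() if s == level])
```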
10.
  • Munappy, Aiswarya Raj, 1990, et al. (authors)
  • Modelling Data Pipelines
  • 2020
  • In: Proceedings - 46th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2020. - IEEE, pp. 13-20
  • Conference paper (peer-reviewed). Abstract:
    • Data is the new currency and a key to success. However, collecting high-quality data from multiple distributed sources requires much effort, and there are several other challenges involved in transporting data from its source to the destination. Data pipelines are implemented to increase the overall efficiency of the data flow from source to destination, since they are automated and reduce the human involvement that would otherwise be required. Despite existing research on ETL (Extract-Transform-Load) and ELT (Extract-Load-Transform) pipelines, research on end-to-end data pipelines is limited; ETL/ELT pipelines are abstract representations of such end-to-end data pipelines. To utilize the full potential of a data pipeline, we should understand the activities in it and how they are connected in an end-to-end data pipeline. This study gives an overview of how to design a conceptual model of a data pipeline, which can further be used as a language of communication between different data teams. Furthermore, it can be used to automate monitoring, fault detection, mitigation, and alarming at different steps of the data pipeline. (An illustrative sketch of such a conceptual model follows this entry.)
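As a rough, hypothetical illustration of a conceptual pipeline model that can serve both as a shared vocabulary between data teams and as a hook for monitoring, the sketch below represents a pipeline as named, connected steps. The structure and step names are assumptions for illustration; the paper's actual model is described in the publication itself.

```python
# Hypothetical conceptual model: a pipeline is a list of named steps, each
# declaring what it consumes and produces, so teams can discuss the same
# structure and monitoring can be attached per step. Purely illustrative.

from dataclasses import dataclass

@dataclass
class PipelineStep:
    name: str          # e.g. "collection", "validation", "aggregation"
    consumes: str      # logical input of the step
    produces: str      # logical output of the step

def check_connectivity(steps: list[PipelineStep]) -> None:
    """Verify that each step's input is the previous step's output."""
    for prev, curr in zip(steps, steps[1:]):
        if prev.produces != curr.consumes:
            raise ValueError(f"gap between '{prev.name}' and '{curr.name}'")

if __name__ == "__main__":
    model = [
        PipelineStep("collection", consumes="raw device logs", produces="raw records"),
        PipelineStep("validation", consumes="raw records", produces="validated records"),
        PipelineStep("aggregation", consumes="validated records", produces="daily metrics"),
        PipelineStep("delivery", consumes="daily metrics", produces="dashboard tables"),
    ]
    check_connectivity(model)
    for step in model:
        # A monitoring/alarming hook could be attached at each named step.
        print(f"{step.name}: {step.consumes} -> {step.produces}")
```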
11.
  • Munappy, Aiswarya Raj, 1990, et al. (authors)
  • On the Impact of ML use cases on Industrial Data Pipelines
  • 2021
  • In: Proceedings - Asia-Pacific Software Engineering Conference, APSEC. - IEEE. - ISSN 1530-1362. Vol. 2021-December, pp. 463-472
  • Conference paper (peer-reviewed). Abstract:
    • The impact of the Artificial Intelligence revolution is undoubtedly substantial in our society, life, firms, and employment. With data being a critical element, organizations are working towards obtaining high-quality data to train their AI models. Although data, data management, and data pipelines have been part of industrial practice since before the introduction of ML models, the significance of data has increased further with the advent of ML models, which forces data pipeline developers to go beyond the traditional focus on data quality. The objective of this study is to analyze the impact of ML use cases on data pipelines. We assume that data pipelines that serve ML models are given more importance than conventional data pipelines. We report on a study that we conducted by observing software teams at three companies as they develop both conventional (non-ML) data pipelines and data pipelines that serve ML-based applications. We study six data pipelines from the three companies and categorize them based on their criticality and purpose. Further, we identify the determinants that can be used to compare the development and maintenance of these data pipelines. Finally, we map these factors in a two-dimensional space to illustrate their importance on a scale of low, moderate, and high.
12.
  • Munappy, Aiswarya Raj, 1990, et al. (authors)
  • On the Trade-off Between Robustness and Complexity in Data Pipelines
  • 2021
  • In: Quality of Information and Communications Technology. - Cham: Springer. - ISBN 9783030853464, 9783030853471. Vol. 1439 CCIS, pp. 401-415
  • Conference paper (peer-reviewed). Abstract:
    • Data pipelines play an important role throughout the data management process, whether they are used for data analytics or machine learning. Data-driven organizations can make use of data pipelines to produce good-quality data applications. Moreover, data pipelines ensure end-to-end velocity by automating the processes involved in extracting, transforming, combining, validating, and loading data for further analysis and visualization. However, the robustness of data pipelines is equally important, since unhealthy data pipelines can add more noise to the input data. This paper identifies the essential elements for a robust data pipeline and analyses the trade-off between data pipeline robustness and complexity. (An illustrative sketch of this trade-off follows this entry.)
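To illustrate the robustness-versus-complexity trade-off referred to in the abstract above, the hypothetical sketch below wraps a pipeline step with two common robustness elements, retries and output validation; each added safeguard also adds code and configuration. The wrapper is an illustrative assumption, not the mechanism proposed in the paper.

```python
# Hypothetical robustness wrapper: retries and output validation make a step
# more robust, but each safeguard adds complexity (more code, more knobs).
# This illustrates the trade-off; it is not the paper's mechanism.

import time
from typing import Callable

def robust_step(step: Callable[[list], list],
                is_valid: Callable[[list], bool],
                retries: int = 3,
                backoff_s: float = 0.1) -> Callable[[list], list]:
    def wrapped(data: list) -> list:
        last_error: Exception | None = None
        for attempt in range(1, retries + 1):
            try:
                out = step(data)
                if is_valid(out):
                    return out
                last_error = ValueError("output failed validation")
            except Exception as exc:           # broad catch, for illustration only
                last_error = exc
            time.sleep(backoff_s * attempt)    # simple linear backoff
        raise RuntimeError(f"step failed after {retries} attempts") from last_error
    return wrapped

if __name__ == "__main__":
    flaky_calls = {"n": 0}

    def parse(rows: list) -> list:
        flaky_calls["n"] += 1
        if flaky_calls["n"] < 2:               # fail once to show the retry
            raise IOError("transient source error")
        return [int(r) for r in rows]

    safe_parse = robust_step(parse, is_valid=lambda out: len(out) > 0)
    print(safe_parse(["1", "2", "3"]))          # -> [1, 2, 3]
```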
13.
  • Munappy, Aiswarya Raj, 1990 (author)
  • Synergizing Data Management, DataOps, and Data Pipelines for AI-Enhanced Embedded Systems
  • 2024
  • Doctoral thesis (other academic/artistic). Abstract:
    • Context: Data management is a critical aspect of any artificial intelligence (AI) initiative, playing a pivotal role in the development, training, and deployment of AI models. A well-structured approach to data management ensures that AI models are trained on reliable data, comply with ethical standards, and contribute positively to decision-making processes in embedded systems. Objectives: This thesis is structured around three primary objectives. The first objective is to comprehensively understand and address the data management challenges associated with embedded systems. Building upon this understanding, the second objective is to explore the data management practices that can help alleviate these challenges. Finally, the third objective is to develop and validate implementation approaches for enhanced data management. Method: To achieve these objectives, we conducted research in close collaboration with industry and used a combination of different empirical research methods, such as interpretive case studies, literature reviews, and action research. Results: This thesis presents six main results. First, it identifies and categorizes data management challenges, solutions, and limitations. Second, it presents a stairway model delineating the stages of the evolution towards DataOps. Third, it proposes a model for evaluating the maturity of data pipelines and identifies determinants to assess the impact of machine learning (ML) on data pipelines. Fourth, it identifies the differences between unidirectional and bidirectional data pipelines and the significance, benefits, and challenges of bidirectional data pipelines; the thesis also provides a roadmap for a smooth migration from unidirectional to bidirectional data pipelines. Fifth, it presents and validates a conceptual model of an end-to-end data pipeline for ML/DL models. Finally, it presents and validates fault-tolerant data pipelines and an AI-powered four-stage model for automated fault recovery in data pipelines. Conclusion: This thesis demonstrates that a well-structured approach to data management in AI-enhanced embedded systems, supported by innovative practices and robust implementation approaches, is essential for ensuring the reliability and effectiveness of data in decision-making processes.
14.
  • Munappy, Aiswarya Raj, 1990, et al. (authors)
  • Towards automated detection of data pipeline faults
  • 2020
  • In: Proceedings - Asia-Pacific Software Engineering Conference, APSEC. - IEEE. - ISSN 1530-1362. Vol. 2020-December, pp. 346-355
  • Conference paper (peer-reviewed). Abstract:
    • Data pipelines play an important role throughout the data management process. They automate the steps ranging from data generation to data reception, thereby reducing human intervention. A failure or fault in a single step of a data pipeline has cascading effects that can result in hours of manual intervention and clean-up. Data pipeline failures due to faults at different stages of the pipeline are a common challenge that eventually leads to significant performance degradation of data-intensive systems. To ensure early detection of these faults and to increase the quality of data products, continuous monitoring and fault detection mechanisms should be included in the data pipeline. In this study, we explored the need for incorporating automated fault detection mechanisms and mitigation strategies at different stages of the data pipeline. Further, we identified faults at different stages of the data pipeline and possible mitigation strategies that can be adopted to reduce the impact of data pipeline faults, thereby improving the quality of data products. The idea of incorporating fault detection and mitigation strategies is validated by realizing a small part of the data pipeline using action research in the analytics team at a large software-intensive organization within the telecommunications domain. (An illustrative per-stage fault handling sketch follows this entry.)
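As a loose illustration of per-stage fault detection with a mitigation strategy, the hypothetical Python sketch below runs each pipeline stage, flags a fault when the stage raises an error or produces no data, and falls back to that stage's last known good output. The stages and the fallback strategy are assumptions for illustration, not the mechanism realized in the authors' action research.

```python
# Hypothetical per-stage fault detection and mitigation: each stage is
# monitored, a fault is flagged when it raises or yields no data, and the
# mitigation here is to reuse the stage's last known good output.
# Stages and strategy are illustrative assumptions.

from typing import Callable

Stage = Callable[[list], list]
last_good: dict[str, list] = {}   # cache of each stage's last good output

def run_stage(name: str, stage: Stage, data: list) -> list:
    try:
        out = stage(data)
        if not out:
            raise ValueError("empty output")
        last_good[name] = out              # remember good output for mitigation
        return out
    except Exception as exc:
        print(f"FAULT in stage '{name}': {exc} -- mitigating with cached output")
        return last_good.get(name, [])     # mitigation: last known good data

def ingest(_: list) -> list:
    return [{"kpi": 0.93}, {"kpi": 0.88}]

def enrich(rows: list) -> list:
    raise IOError("lookup service unreachable")   # simulated stage fault

if __name__ == "__main__":
    # Pretend a previous run of 'enrich' succeeded, so a fallback exists.
    last_good["enrich"] = [{"kpi": 0.91, "note": "from previous successful run"}]
    data: list = []
    for name, stage in [("ingest", ingest), ("enrich", enrich)]:
        data = run_stage(name, stage, data)
    print("delivered downstream:", data)
```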