SwePub

DQSOps : Data Quality Scoring Operations Framework for Data-Driven Applications

Bayram, Firas (author)
Karlstads universitet, Institutionen för matematik och datavetenskap (from 2013)
Ahmed, Bestoun S., 1982- (author)
Karlstads universitet, Institutionen för matematik och datavetenskap (from 2013)
Hallin, Erik (author)
Uddeholms AB, Sweden
Engman, Anton (author)
Uddeholms AB, Sweden
Association for Computing Machinery (ACM), 2023
2023
English.
In: EASE '23: Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering. - : Association for Computing Machinery (ACM). - ISBN 9798400700446 ; pp. 32-41
  • Conference paper (peer-reviewed)
Abstract
  • Data quality assessment has become a prominent component in the successful execution of complex data-driven artificial intelligence (AI) software systems. In practice, real-world applications generate huge volumes of data at high speed. These data streams require analysis and preprocessing before being permanently stored or used in a learning task. Therefore, significant attention has been paid to the systematic management and construction of high-quality datasets. Nevertheless, managing voluminous and high-velocity data streams is usually performed manually (i.e., offline), making it an impractical strategy in production environments. To address this challenge, DataOps has emerged to achieve life-cycle automation of data processes using DevOps principles. However, determining the data quality based on a fitness scale constitutes a complex task within the framework of DataOps. This paper presents a novel Data Quality Scoring Operations (DQSOps) framework that yields a quality score for production data in DataOps workflows. The framework incorporates two scoring approaches: an ML prediction-based approach that predicts the data quality score, and a standard-based approach that periodically produces the ground-truth scores by assessing several data quality dimensions. We deploy the DQSOps framework in a real-world industrial use case. The results show that DQSOps achieves significant computational speedups compared to the conventional approach of data quality scoring while maintaining high prediction performance.
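
The two-path design described in the abstract (a cheap predicted score for most data, with a periodically recomputed ground-truth score) can be illustrated with a minimal Python sketch. This is not the authors' implementation. Every detail below is a hypothetical assumption made for illustration: the two dimensions (completeness and validity), their unweighted mean, the value range, the `period` parameter, the names `standard_score` and `HybridScorer`, and the one-nearest-neighbour lookup that stands in for the ML predictor.

```python
from statistics import mean


def completeness(window):
    """Fraction of values in the window that are present (not None)."""
    return sum(v is not None for v in window) / len(window)


def validity(window, low, high):
    """Fraction of present values that fall inside [low, high]."""
    present = [v for v in window if v is not None]
    if not present:
        return 0.0
    return sum(low <= v <= high for v in present) / len(present)


def standard_score(window, low=0.0, high=100.0):
    """Assumed ground-truth score: unweighted mean of two dimension scores."""
    return mean([completeness(window), validity(window, low, high)])


class HybridScorer:
    """Predicts most scores; recomputes the ground truth every `period` windows."""

    def __init__(self, period=3):
        self.period = period
        self.seen = 0
        self.labelled = []  # (feature, ground-truth score) pairs seen so far

    @staticmethod
    def feature(window):
        # One cheap feature (an assumption): the fraction of missing values.
        return sum(v is None for v in window) / len(window)

    def score(self, window):
        self.seen += 1
        x = self.feature(window)
        if not self.labelled or self.seen % self.period == 0:
            # Standard-based path: full assessment, stored as a training label.
            y = standard_score(window)
            self.labelled.append((x, y))
            return y, "standard"
        # Prediction path: stand-in predictor (1-nearest neighbour on the feature).
        _, y = min(self.labelled, key=lambda pair: abs(pair[0] - x))
        return y, "predicted"


if __name__ == "__main__":
    scorer = HybridScorer(period=3)
    stream = [
        [1.0, 2.0, None, 4.0],
        [5.0, None, 7.0, 8.0],
        [1.0, 2.0, 300.0, 500.0],
        [1.0, 2.0, 3.0, 4.0],
    ]
    for window in stream:
        print(scorer.score(window))
```

In this sketch the expensive dimension checks run only on the first window and on every third window after that; all other windows are scored by lookup, which is where a speedup of the kind the abstract reports would come from. The stand-in predictor also shows the trade-off: the last window in the example is fully valid, yet it inherits the score of the nearest labelled window.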

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences -- Software Engineering (hsv//eng)
NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Sciences (hsv//eng)
ENGINEERING AND TECHNOLOGY  -- Electrical Engineering, Electronic Engineering, Information Engineering -- Computer Systems (hsv//eng)

Keywords

Data reduction
Quality control
Automated data
Automated data scoring
Data assessment
Data quality
Data quality dimensions
Data stream
Data-driven applications
DataOps
Mutation testing
Real-world
Life cycle
Computer Science

Publication and content type

ref (subject category)
kon (subject category)
