Autonomous Monitors for Detecting Failures Early and Reporting Interpretable Alerts in Cloud Operations
- Hrusto, Adha (author)
- Lund University; Software Engineering Research Group; Department of Computer Science; Departments at LTH; Faculty of Engineering, LTH; LTH Profile Area: AI and Digitalization; LTH Profile areas
- Runeson, Per (author)
- Lund University; Software Engineering Research Group; Department of Computer Science; Departments at LTH; Faculty of Engineering, LTH; LTH Profile Area: AI and Digitalization; LTH Profile areas; LU Profile Area: Natural and Artificial Cognition; Lund University Profile areas
- Ohlsson, Magnus C (author)
- System Verification Sweden AB
- 2024
- English, 11 pp.
In: 46th International Conference on Software Engineering: Software Engineering in Practice. - 9798400705014
- Related link:
- https://portal.resea... (primary) (free)
- http://dx.doi.org/10... (free)
- https://lup.lub.lu.s...
- https://doi.org/10.1...
Abstract
- Detecting failures early in cloud-based software systems is highly significant as it can reduce operational costs, enhance service reliability, and improve user experience. Many existing approaches include anomaly detection in metrics or a blend of metric and log features. However, such approaches tend to be very complex and hardly explainable, and consequently non-trivial for implementation and evaluation in industrial contexts. In collaboration with a case company and their cloud-based system in the domain of PIM (Product Information Management), we propose and implement autonomous monitors for proactive monitoring across multiple services of distributed software architecture, fused with anomaly detection in performance metrics and log analysis using GPT-3. We demonstrated that operations engineers tend to be more efficient by having access to interpretable alert notifications based on detected anomalies that contain information about implications and potential root causes. Additionally, proposed autonomous monitors turned out to be beneficial for the timely identification and revision of potential issues before they propagate and cause severe consequences.
Subject headings
- ENGINEERING AND TECHNOLOGY -- Electrical Engineering, Electronic Engineering, Information Engineering -- Computer Systems (hsv//eng)
Publication and content type
- kon (subject category)
- ref (subject category)