SwePub

Result list for search "WFRF:(Stadler Rolf Prof.)"

  • Results 1-10 of 29
1.
  • Yanggratoke, Rerngvit, 1983- (author)
  • Data-driven Performance Prediction and Resource Allocation for Cloud Services
  • 2016
  • Doctoral thesis (other academic/artistic). Abstract:
    • Cloud services, which provide online entertainment, enterprise resource management, tax filing, etc., are becoming essential for consumers, businesses, and governments. The key functionalities of such services are provided by backend systems in data centers. This thesis focuses on three fundamental problems related to management of backend systems. We address these problems using data-driven approaches: triggering dynamic allocation by changes in the environment, obtaining configuration parameters from measurements, and learning from observations. The first problem relates to resource allocation for large clouds with potentially hundreds of thousands of machines and services. We developed and evaluated a generic gossip protocol for distributed resource allocation. Extensive simulation studies suggest that the quality of the allocation is independent of the system size for the management objectives considered. The second problem focuses on performance modeling of a distributed key-value store, and we study specifically the Spotify backend for streaming music. We developed analytical models for system capacity under different data allocation policies and for response time distribution. We evaluated the models by comparing model predictions with measurements from our lab testbed and from the Spotify operational environment. We found the prediction error to be below 12% for all investigated scenarios. The third problem relates to real-time prediction of service metrics, which we address through statistical learning. Service metrics are learned from observing device and network statistics. We performed experiments on a server cluster running video streaming and key-value store services. We showed that feature set reduction significantly improves the prediction accuracy, while simultaneously reducing model computation time. Finally, we designed and implemented a real-time analytics engine, which produces model predictions through online learning.
2.
  • Gonzalez Prieto, Alberto, 1977- (author)
  • Adaptive Real-time Monitoring for Large-scale Networked Systems
  • 2008
  • Doctoral thesis (other academic/artistic). Abstract:
    • Large-scale networked systems, such as the Internet and server clusters, are omnipresent today. They increasingly deliver services that are critical to both businesses and society at large, and therefore their continuous and correct operation must be guaranteed. Achieving this requires the realization of adaptive management systems, which continuously reconfigure such large-scale dynamic systems in order to maintain their state near a desired operating point, despite changes in the networking conditions. The focus of this thesis is continuous real-time monitoring, which is essential for the realization of adaptive management systems in large-scale dynamic environments. Real-time monitoring provides the necessary input to the decision-making process of network management, enabling management systems to perform self-configuration and self-healing tasks. We have developed, implemented, and evaluated a design for real-time continuous monitoring of global metrics with performance objectives, such as monitoring overhead and estimation accuracy. Global metrics describe the state of the system as a whole, in contrast to local metrics, such as device counters or local protocol states, which capture the state of a local entity. Global metrics are computed from local metrics using aggregation functions, such as SUM, AVERAGE and MAX. Our approach is based on in-network aggregation, where global metrics are incrementally computed using spanning trees. Performance objectives are achieved through filtering updates to local metrics that are sent along that tree. A key part in the design is a model for the distributed monitoring process that relates performance metrics to parameters that tune the behavior of a monitoring protocol. The model allows us to describe the behavior of individual nodes in the spanning tree in their steady state. The model has been instrumental in designing a monitoring protocol that is controllable and achieves given performance objectives. We have evaluated our protocol, called A-GAP, experimentally, through simulation and testbed implementation. It has proved to be effective in meeting performance objectives, efficient, adaptive to changes in the networking conditions, controllable along different performance dimensions, and scalable. We have implemented a prototype on a testbed of commercial routers. The testbed measurements are consistent with simulation studies we performed for different topologies and network sizes. This proves the feasibility of the design, and, more generally, the feasibility of effective and efficient real-time monitoring in large network environments.
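The idea described in the abstract above, in-network aggregation over a spanning tree with filtered updates, can be illustrated with a minimal sketch. This is not the A-GAP protocol itself: the `Node` class, the `filter_width` parameter, and the rule "report to the parent only when the partial sum has moved by more than the filter width since the last report" are assumptions made here for illustration, and only the SUM aggregate is shown.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """One node of a spanning tree that aggregates a global SUM (illustrative only)."""
    name: str
    filter_width: float = 0.0            # hypothetical tuning parameter, not from the source
    parent: Optional["Node"] = None
    local: float = 0.0                   # local metric, e.g. a device counter
    child_reports: Dict[str, float] = field(default_factory=dict)
    last_sent: float = 0.0               # partial sum most recently reported upward
    messages_sent: int = 0               # crude measure of monitoring overhead

    def partial_sum(self) -> float:
        # Aggregate of this node's subtree, as far as this node knows.
        return self.local + sum(self.child_reports.values())

    def set_local(self, value: float) -> None:
        self.local = value
        self._maybe_report()

    def _receive(self, child_name: str, value: float) -> None:
        self.child_reports[child_name] = value
        self._maybe_report()

    def _maybe_report(self) -> None:
        if self.parent is None:
            return  # the root only holds the global estimate
        current = self.partial_sum()
        # Filter: suppress the update unless it differs enough from the last report.
        if abs(current - self.last_sent) > self.filter_width:
            self.last_sent = current
            self.messages_sent += 1
            self.parent._receive(self.name, current)


if __name__ == "__main__":
    root = Node("root")
    a = Node("a", parent=root)
    b = Node("b", filter_width=2.0, parent=a)
    for value in (1.0, 5.0, 6.0):
        b.set_local(value)
        print(value, root.partial_sum(), b.messages_sent)
```

In this sketch a wider filter suppresses more updates (lower overhead) at the cost of a larger possible gap between the root's estimate and the true sum, which is the kind of overhead/accuracy trade-off the abstract refers to.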
3.
  • Ahmed, J., et al. (author)
  • Automated diagnostic of virtualized service performance degradation
  • 2018
  • In: Proceedings 2018 IEEE/IFIP Network Operations and Management Symposium, NOMS 2018. New York: Institute of Electrical and Electronics Engineers (IEEE), pp. 1-9
  • Conference paper (peer-reviewed). Abstract:
    • Service assurance for cloud applications is a challenging task and is an active area of research for academia and industry. One promising approach is to utilize machine learning for service quality prediction and fault detection so that suitable mitigation actions can be executed. In our previous work, we have shown how to predict service-level metrics in real time just from operational data gathered at the server side. This gives the service provider early indications on whether the platform can support the current load demand. This paper provides the logical next step where we extend our work by proposing an automated detection and diagnostic capability for the performance faults manifesting themselves in cloud and datacenter environments. This is a crucial task to maintain the smooth operation of running services and minimize downtime. We demonstrate the effectiveness of our approach, which exploits the interpretative capabilities of Self-Organizing Maps (SOMs) to automatically detect and localize different performance faults for cloud services.
4.
  • Burgess, Mark, et al. (author)
  • Network patterns in cfengine and scalable data aggregation
  • 2007
  • In: USENIX ASSOCIATION PROCEEDING OF THE 21ST LARGE INSTALLATION SYSTEMS ADMINISTRATION CONFERENCE. USENIX ASSOC., pp. 275-
  • Conference paper (peer-reviewed). Abstract:
    • Network patterns are based on generic algorithms that execute on tree-based overlays. A set of such patterns has been developed at KTH to support distributed monitoring in networks with non-trivial topologies. We consider the use of this approach in logical peer networks in cfengine as a way of scaling aggregation of data to large organizations. Use of 'deep' network structures can lead to temporal anomalies. We show how to minimize temporal fragmentation during data aggregation by using time offsets and what effect these choices might have on power consumption. We offer proof of concept for this technology to initiate either multicast or inverse multicast pulses through sensor networks.
5.
  • Chemouil, Prosper, et al. (author)
  • Special Issue on Advances in Artificial Intelligence and Machine Learning for Networking
  • 2020
  • In: IEEE Journal on Selected Areas in Communications. IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. ISSN 0733-8716, 1558-0008. 38:10, pp. 2229-2233
  • Journal article (other academic/artistic). Abstract:
    • Artificial Intelligence (AI) and Machine Learning (ML) approaches have emerged in the networking domain with great expectation. They can be broadly divided into AI/ML techniques for network engineering and management, network designs for AI/ML applications, and system concepts. AI/ML techniques for networking and management improve the way we address networking. They support efficient, rapid, and trustworthy engineering, operations, and management. As such, they meet the current interest in softwarization and network programmability that fuels the need for improved network automation in agile infrastructures, including edge and fog environments. Network design and optimization for AI/ML applications addresses the complementary topic of supporting AI/ML-based systems through novel networking techniques, including new architectures and algorithms. The third topic area is system implementation and open-source software development.
6.
  •  
7.
  • Di Fatta, G., et al. (author)
  • Preface
  • 2011
  • In: IEEE International Conference on Data Mining. Proceedings. Institute of Electrical and Electronics Engineers (IEEE). ISSN 1550-4786, pp. xlviii-xlvix
  • Journal article (peer-reviewed)
8.
  • Hammar, Kim, et al. (author)
  • A System for Interactive Examination of Learned Security Policies
  • 2022
  • In: Proceedings of the IEEE/IFIP Network Operations and Management Symposium 2022. IEEE.
  • Conference paper (peer-reviewed). Abstract:
    • We present a system for interactive examination of learned security policies. It allows a user to traverse episodes of Markov decision processes in a controlled manner and to track the actions triggered by security policies. Similar to a software debugger, a user can continue or halt an episode at any time step and inspect parameters and probability distributions of interest. The system enables insight into the structure of a given policy and into the behavior of a policy in edge cases. We demonstrate the system with a network intrusion use case. We examine the evolution of an IT infrastructure's state and the actions prescribed by security policies while an attack occurs. The policies for the demonstration have been obtained through a reinforcement learning approach that includes a simulation system where policies are incrementally learned and an emulation system that produces statistics that drive the simulation runs.
9.
  • Hammar, Kim, et al. (author)
  • An Online Framework for Adapting Security Policies in Dynamic IT Environments
  • 2022
  • In: 2022 18th International Conference on Network and Service Management (CNSM 2022). IEEE.
  • Conference paper (peer-reviewed). Abstract:
    • We present an online framework for learning and updating security policies in dynamic IT environments. It includes three components: a digital twin of the target system, which continuously collects data and evaluates learned policies; a system identification process, which periodically estimates system models based on the collected data; and a policy learning process that is based on reinforcement learning. To evaluate our framework, we apply it to an intrusion prevention use case that involves a dynamic IT infrastructure. Our results demonstrate that the framework automatically adapts security policies to changes in the IT infrastructure and that it outperforms a state-of-the-art method.
10.
  • Hammar, Kim, et al. (author)
  • Digital Twins for Security Automation
  • 2023
  • In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2023, NOMS 2023. Institute of Electrical and Electronics Engineers (IEEE).
  • Conference paper (peer-reviewed). Abstract:
    • We present a novel emulation system for creating high-fidelity digital twins of IT infrastructures. The digital twins replicate key functionality of the corresponding infrastructures and allow security scenarios to be played out in a safe environment. We show that this capability can be used to automate the process of finding effective security policies for a target infrastructure. In our approach, a digital twin of the target infrastructure is used to run security scenarios and collect data. The collected data is then used to instantiate simulations of Markov decision processes and to learn effective policies through reinforcement learning; the performance of these policies is validated in the digital twin. This closed-loop learning process executes iteratively and provides continuously evolving and improving security policies. We apply our approach to an intrusion response scenario. Our results show that the digital twin provides the necessary evaluative feedback to learn near-optimal intrusion response policies.