SwePub
Search the SwePub database


Result list for search "WFRF:(Ali Eldin Ahmed)"

Search: WFRF:(Ali Eldin Ahmed)

  • Results 1-50 of 51
1.
  • Ademuyiwa, Adesoji O., et al. (author)
  • Determinants of morbidity and mortality following emergency abdominal surgery in children in low-income and middle-income countries
  • 2016
  • Part of: BMJ Global Health. - : BMJ Publishing Group Ltd. - 2059-7908. ; 1:4
  • Journal article (peer-reviewed) abstract
    • Background: Child health is a key priority on the global health agenda, yet the provision of essential and emergency surgery in children is patchy in resource-poor regions. This study aimed to determine the mortality risk for emergency abdominal paediatric surgery in low-income countries globally. Methods: Multicentre, international, prospective, cohort study. Self-selected surgical units performing emergency abdominal surgery submitted prespecified data for consecutive children aged <16 years during a 2-week period between July and December 2014. The United Nations' Human Development Index (HDI) was used to stratify countries. The main outcome measure was 30-day postoperative mortality, analysed by multilevel logistic regression. Results: This study included 1409 patients from 253 centres in 43 countries; 282 children were under 2 years of age. Of these patients, 265 (18.8%) were from low-HDI, 450 (31.9%) from middle-HDI and 694 (49.3%) from high-HDI countries. The most common operations performed were appendectomy, small bowel resection, pyloromyotomy and correction of intussusception. After adjustment for patient and hospital risk factors, child mortality at 30 days was significantly higher in low-HDI (adjusted OR 7.14 (95% CI 2.52 to 20.23), p<0.001) and middle-HDI (4.42 (1.44 to 13.56), p=0.009) countries compared with high-HDI countries, translating to 40 excess deaths per 1000 procedures performed. Conclusions: Adjusted mortality in children following emergency abdominal surgery may be as high as 7 times greater in low-HDI and middle-HDI countries compared with high-HDI countries. Effective provision of emergency essential surgery should be a key priority for global child health agendas.
  •  
2.
  • Thomas, HS, et al. (author)
  • 2019
  •  
4.
  • Naghavi, Mohsen, et al. (author)
  • Global, regional, and national age-sex specific all-cause and cause-specific mortality for 240 causes of death, 1990-2013: a systematic analysis for the Global Burden of Disease Study 2013
  • 2015
  • Part of: The Lancet. - 1474-547X .- 0140-6736. ; 385:9963, pp. 117-171
  • Journal article (peer-reviewed) abstract
    • Background Up-to-date evidence on levels and trends for age-sex-specific all-cause and cause-specific mortality is essential for the formation of global, regional, and national health policies. In the Global Burden of Disease Study 2013 (GBD 2013) we estimated yearly deaths for 188 countries between 1990 and 2013. We used the results to assess whether there is epidemiological convergence across countries. Methods We estimated age-sex-specific all-cause mortality using the GBD 2010 methods with some refinements to improve accuracy applied to an updated database of vital registration, survey, and census data. We generally estimated cause of death as in the GBD 2010. Key improvements included the addition of more recent vital registration data for 72 countries, an updated verbal autopsy literature review, two new and detailed data systems for China, and more detail for Mexico, UK, Turkey, and Russia. We improved statistical models for garbage code redistribution. We used six different modelling strategies across the 240 causes; cause of death ensemble modelling (CODEm) was the dominant strategy for causes with sufficient information. Trends for Alzheimer's disease and other dementias were informed by meta-regression of prevalence studies. For pathogen-specific causes of diarrhoea and lower respiratory infections we used a counterfactual approach. We computed two measures of convergence (inequality) across countries: the average relative difference across all pairs of countries (Gini coefficient) and the average absolute difference across countries. To summarise broad findings, we used multiple decrement life-tables to decompose probabilities of death from birth to exact age 15 years, from exact age 15 years to exact age 50 years, and from exact age 50 years to exact age 75 years, and life expectancy at birth into major causes. For all quantities reported, we computed 95% uncertainty intervals (UIs). We constrained cause-specific fractions within each age-sex-country-year group to sum to all-cause mortality based on draws from the uncertainty distributions. Findings Global life expectancy for both sexes increased from 65.3 years (UI 65.0-65.6) in 1990 to 71.5 years (UI 71.0-71.9) in 2013, while the number of deaths increased from 47.5 million (UI 46.8-48.2) to 54.9 million (UI 53.6-56.3) over the same interval. Global progress masked variation by age and sex: for children, average absolute differences between countries decreased but relative differences increased. For women aged 25-39 years and older than 75 years and for men aged 20-49 years and 65 years and older, both absolute and relative differences increased. Decomposition of global and regional life expectancy showed the prominent role of reductions in age-standardised death rates for cardiovascular diseases and cancers in high-income regions, and reductions in child deaths from diarrhoea, lower respiratory infections, and neonatal causes in low-income regions. HIV/AIDS reduced life expectancy in southern sub-Saharan Africa. For most communicable causes of death both numbers of deaths and age-standardised death rates fell, whereas for most non-communicable causes, demographic shifts have increased numbers of deaths but decreased age-standardised death rates. Global deaths from injury increased by 10.7%, from 4.3 million deaths in 1990 to 4.8 million in 2013; but age-standardised rates declined over the same period by 21%. For some causes of more than 100 000 deaths per year in 2013, age-standardised death rates increased between 1990 and 2013, including HIV/AIDS, pancreatic cancer, atrial fibrillation and flutter, drug use disorders, diabetes, chronic kidney disease, and sickle-cell anaemias. Diarrhoeal diseases, lower respiratory infections, neonatal causes, and malaria are still in the top five causes of death in children younger than 5 years. The most important pathogens are rotavirus for diarrhoea and pneumococcus for lower respiratory infections. Country-specific probabilities of death over three phases of life varied substantially between and within regions. Interpretation For most countries, the general pattern of reductions in age-sex-specific mortality has been associated with a progressive shift towards a larger share of the remaining deaths caused by non-communicable disease and injuries. Assessing epidemiological convergence across countries depends on whether an absolute or relative measure of inequality is used. Nevertheless, age-standardised death rates for seven substantial causes are increasing, suggesting the potential for reversals in some countries. Important gaps exist in the empirical data for cause of death estimates for some countries; for example, no national data for India are available for the past decade.
  •  
5.
  • Ali-Eldin, Ahmed, 1985-, et al. (author)
  • How will your workload look like in 6 years? : Analyzing Wikimedia's workload
  • 2014
  • Part of: Proceedings of the 2014 IEEE International Conference on Cloud Engineering (IC2E 2014). - : IEEE Computer Society. - 9781479937660 ; pp. 349-354
  • Conference paper (peer-reviewed) abstract
    • Accurate understanding of workloads is key to efficient cloud resource management as well as to the design of large-scale applications. We analyze and model the workload of Wikipedia, one of the world's largest web sites. With descriptive statistics, time-series analysis, and polynomial splines, we study the trend and seasonality of the workload, its evolution over the years, and also investigate patterns in page popularity. Our results indicate that the workload is highly predictable with a strong seasonality. Our short term prediction algorithm is able to predict the workload with a Mean Absolute Percentage Error of around 2%.
  •  
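The roughly 2% figure in the abstract above refers to the Mean Absolute Percentage Error (MAPE) of a short-term workload predictor. A minimal sketch of how such an error is scored; the naive seasonal predictor and the request counts are invented for illustration, not the paper's model:

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical hourly request counts with a strong daily seasonality.
history = [100, 120, 300, 280, 110, 105, 310, 290]

# A naive seasonal predictor: predict the value observed one period
# (here, 4 samples) earlier. This is an illustration, not the paper's algorithm.
period = 4
predicted = history[:period]   # previous "day"
actual = history[period:]      # current "day"

print(round(mape(actual, predicted), 2))
```

A strongly seasonal workload, as the paper finds Wikipedia's to be, is exactly the case where such simple seasonal baselines already score low MAPE values.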
6.
  • Rahmanian, Ali, 1989-, et al. (author)
  • CVF : Cross-Video Filtration on the edge
  • 2024
  • Part of: MMSys '24. - : Association for Computing Machinery (ACM). - 9798400704123 ; pp. 231-242
  • Conference paper (peer-reviewed) abstract
    • Many edge applications rely on expensive Deep-Neural-Network (DNN) inference-based video analytics. Typically, a single instance of an inference service analyzes multiple real-time camera streams concurrently. In many cases, only a fraction of these streams contain objects-of-interest at a given time. Hence, it is a waste of computational resources to process all frames from all cameras using the DNNs. On-camera filtration of frames has been suggested as a possible solution to improve the system efficiency and reduce resource wastage. However, many cameras do not have on-camera processing or filtering capabilities. In addition, filtration can be enhanced if frames across the different feeds are selected and prioritized for processing based on the system load and the available resource capacity. This paper introduces CVF, a Cross-Video Filtration framework designed around video content and resource constraints. The CVF pipeline leverages compressed-domain data from encoded video formats, lightweight binary classification models, and an efficient prioritization algorithm. This enables the effective filtering of cross-camera frames from multiple sources, processing only a fraction of frames using resource-intensive DNN models. Our experiments show that CVF is capable of reducing the overall response time of video analytics pipelines by up to 50% compared to state-of-the-art solutions while increasing the throughput by up to 120%.
  •  
7.
  • Rahmanian, Ali, 1989- (author)
  • Edge orchestration for latency-sensitive applications
  • 2024
  • Doctoral thesis (other academic/artistic) abstract
    • The emerging edge computing infrastructure provides distributed and heterogeneous resources closer to where data is generated and where end-users are located, thereby significantly reducing latency. With the recent advances in telecommunication systems, software architecture, and machine learning, there is a noticeable increase in applications that require processing times within tight latency constraints, i.e. latency-sensitive applications. For instance, numerous video analytics applications, such as traffic control systems, necessitate real-time processing capabilities. Orchestrating such applications at the edge offers numerous advantages, including lower latency, optimized bandwidth utilization, and enhanced scalability. However, despite its potential, effectively managing such latency-sensitive applications at the edge poses several challenges, such as constrained compute resources, which hold back the full promise of edge computing. This thesis proposes approaches to efficiently deploy latency-sensitive applications on the edge infrastructure. It partly addresses general applications with microservice architectures and partly addresses the increasingly important video analytics applications for the edge. To do so, this thesis proposes various application- and system-level solutions aiming to efficiently utilize constrained compute capacity on the edge while meeting prescribed latency constraints. These solutions primarily focus on effective resource management approaches and optimizing incoming workload inputs, considering the constrained compute capacity of edge resources. Additionally, the thesis explores the synergy effects of employing both application- and system-level resource optimization approaches together. The results demonstrate the effectiveness of the proposed solutions in enhancing the utilization of edge resources for latency-sensitive applications while adhering to application constraints. The proposed resource management solutions, alongside application-level optimization techniques, significantly improve resource efficiency while satisfying application requirements. Our results show that our solutions for microservice architectures significantly improve end-to-end latency by up to 800% while minimizing edge resource usage. Additionally, the results indicate that our application- and system-level optimizations for orchestrating edge resources for video analytics applications can increase the overall throughput by up to 60%.
  •  
8.
  • Rahmanian, Ali, et al. (author)
  • Microsplit : efficient splitting of microservices on edge clouds
  • 2022
  • Part of: 2022 IEEE/ACM 7th Symposium on Edge Computing (SEC). - : IEEE. - 9781665486118 - 9781665486125 ; pp. 252-264
  • Conference paper (peer-reviewed) abstract
    • Edge cloud systems reduce the latency between users and applications by offloading computations to a set of small-scale computing resources deployed at the edge of the network. However, since edge resources are constrained, they can become saturated and bottlenecked due to increased load, resulting in an exponential increase in response times or failures. In this paper, we argue that an application can be split between the edge and the cloud, allowing for better performance compared to full migration to the cloud, releasing precious resources at the edge. We model an application's internal call-graph as a directed acyclic graph. We use this model to develop MicroSplit, a tool for efficient splitting of microservices between constrained edge resources and large-scale distant backend clouds. MicroSplit analyzes the dependencies between the microservices of an application, and using the Louvain method for community detection---a popular algorithm from Network Science---decides how to split the microservices between the constrained edge and distant data centers. We test MicroSplit with four microservice-based applications in various realistic cloud-edge settings. Our results show that MicroSplit migrates up to 60% of the microservices of an application with a slight increase in the mean response time compared to running on the edge, and a latency reduction of up to 800% compared to migrating the entire application to the cloud. Compared to other methods from the state-of-the-art, MicroSplit reduces the total number of services on the edge by up to five times, with minimal reduction in response times.
  •  
9.
  • Rahmanian, Ali, et al. (author)
  • RAVAS: interference-aware model selection and resource allocation for live edge video analytics
  • 2023
  • Part of: 2023 IEEE/ACM Symposium on Edge Computing (SEC). - : Institute of Electrical and Electronics Engineers (IEEE). - 9798400701238 ; pp. 27-39
  • Conference paper (peer-reviewed) abstract
    • Numerous edge applications that rely on video analytics demand precise, low-latency processing of multiple video streams from cameras. When these cameras are mobile, such as when mounted on a car or a robot, the processing load on the shared edge GPU can vary considerably. Provisioning the edge with GPUs for the worst-case load can be expensive and, for many applications, not feasible. In this paper, we introduce RAVAS, a Real-time Adaptive stream Video Analytics System that enables efficient edge GPU sharing for processing streams from various mobile cameras. RAVAS uses Q-Learning to choose between a set of Deep Neural Network (DNN) models with varying accuracy and processing requirements based on the current GPU utilization and workload. RAVAS employs an innovative resource allocation strategy to mitigate interference during concurrent GPU execution. Compared to state-of-the-art approaches, our results show that RAVAS incurs 57% less compute overhead, achieves 41% improvement in latency, and 43% savings in total GPU usage for a single video stream. Processing multiple concurrent video streams results in up to 99% and 40% reductions in latency and overall GPU usage, respectively, while meeting the accuracy constraints.
  •  
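The Q-Learning-based model selection described in the RAVAS abstract can be sketched as a small tabular learner. The states, actions, and reward function below are illustrative placeholders, not RAVAS's actual formulation:

```python
import random

# Tabular Q-learning sketch: states are coarse GPU-utilization buckets,
# actions are DNN model variants of increasing cost/accuracy.
# All names and the reward function are invented for illustration.
states = ["low_util", "mid_util", "high_util"]
actions = ["small_dnn", "medium_dnn", "large_dnn"]
Q = {(s, a): 0.0 for s in states for a in actions}

alpha, gamma, epsilon = 0.1, 0.9, 0.2

def choose(state):
    """Epsilon-greedy action selection."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """Standard Q-learning backup."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

# Toy interaction loop with a made-up reward: prefer the accurate (large)
# model when the GPU is idle, and the cheap (small) model when it is loaded.
random.seed(0)
for _ in range(2000):
    s = random.choice(states)
    a = choose(s)
    r = {"low_util": {"large_dnn": 1.0},
         "high_util": {"small_dnn": 1.0}}.get(s, {}).get(a, 0.0)
    update(s, a, r, random.choice(states))

print(max(actions, key=lambda a: Q[("high_util", a)]))
```

After training, the greedy action under high utilization is the cheap model, which is the kind of utilization-aware trade-off the abstract describes.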
11.
  • Zhang, Huaifeng, 1994, et al. (author)
  • Machine learning systems are bloated and vulnerable
  • 2024
  • Part of: Proceedings of the ACM on Measurement and Analysis of Computing Systems. - 2476-1249. ; 8:1
  • Journal article (peer-reviewed) abstract
    • Today's software is bloated with both code and features that are not used by most users. This bloat is prevalent across the entire software stack, from operating systems and applications to containers. Containers are lightweight virtualization technologies used to package code and dependencies, providing portable, reproducible and isolated environments. For their ease of use, data scientists often utilize machine learning containers to simplify their workflow. However, this convenience comes at a cost: containers are often bloated with unnecessary code and dependencies, resulting in very large sizes. In this paper, we analyze and quantify bloat in machine learning containers. We develop MMLB, a framework for analyzing bloat in software systems, focusing on machine learning containers. MMLB measures the amount of bloat at both the container and package levels, quantifying the sources of bloat. In addition, MMLB integrates with vulnerability analysis tools and performs package dependency analysis to evaluate the impact of bloat on container vulnerabilities. Through experimentation with 15 machine learning containers from TensorFlow, PyTorch, and Nvidia, we show that bloat accounts for up to 80% of machine learning container sizes, increasing container provisioning times by up to 370% and exacerbating vulnerabilities by up to 99%.
  •  
12.
  • Zhang, Huaifeng, 1994, et al. (author)
  • Machine learning systems are bloated and vulnerable
  • 2024
  • Part of: SIGMETRICS/PERFORMANCE 2024 - Abstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems. ; pp. 37-38
  • Conference paper (peer-reviewed) abstract
    • Today's software is bloated with both code and features that are not used by most users. This bloat is prevalent across the entire software stack, from operating systems and applications to containers. Containers are lightweight virtualization technologies used to package code and dependencies, providing portable, reproducible and isolated environments. For their ease of use, data scientists often utilize machine learning containers to simplify their workflow. However, this convenience comes at a cost: containers are often bloated with unnecessary code and dependencies, resulting in very large sizes. In this paper, we analyze and quantify bloat in machine learning containers. We develop MMLB, a framework for analyzing bloat in software systems, focusing on machine learning containers. MMLB measures the amount of bloat at both the container and package levels, quantifying the sources of bloat. In addition, MMLB integrates with vulnerability analysis tools and performs package dependency analysis to evaluate the impact of bloat on container vulnerabilities. Through experimentation with 15 machine learning containers from TensorFlow, PyTorch, and Nvidia, we show that bloat accounts for up to 80% of machine learning container sizes, increasing container provisioning times by up to 370% and exacerbating vulnerabilities by up to 99%. For more detail, see the full paper, Huaifeng Zhang, Mohannad Alhanahnah, Fahmi Abdulqadir Ahmed, Dyako Fatih, Philipp Leitner, and Ahmed Ali-Eldin. 2024. Machine Learning Systems are Bloated and Vulnerable. Proceedings of the ACM on Measurement and Analysis of Computing Systems, Vol. 8, 1 (2024), 1--30.
  •  
13.
  • Zhang, Huaifeng, 1994, et al. (author)
  • Machine Learning Systems are Bloated and Vulnerable
  • 2024
  • Part of: Performance Evaluation Review. - 0163-5999. ; 52:1, pp. 37-38
  • Journal article (peer-reviewed) abstract
    • Today's software is bloated with both code and features that are not used by most users. This bloat is prevalent across the entire software stack, from operating systems and applications to containers. Containers are lightweight virtualization technologies used to package code and dependencies, providing portable, reproducible and isolated environments. For their ease of use, data scientists often utilize machine learning containers to simplify their workflow. However, this convenience comes at a cost: containers are often bloated with unnecessary code and dependencies, resulting in very large sizes. In this paper, we analyze and quantify bloat in machine learning containers. We develop MMLB, a framework for analyzing bloat in software systems, focusing on machine learning containers. MMLB measures the amount of bloat at both the container and package levels, quantifying the sources of bloat. In addition, MMLB integrates with vulnerability analysis tools and performs package dependency analysis to evaluate the impact of bloat on container vulnerabilities. Through experimentation with 15 machine learning containers from TensorFlow, PyTorch, and Nvidia, we show that bloat accounts for up to 80% of machine learning container sizes, increasing container provisioning times by up to 370% and exacerbating vulnerabilities by up to 99%. For more detail, see the full paper, [15].
  •  
14.
  • Ali-Eldin, Ahmed, et al. (author)
  • An adaptive hybrid elasticity controller for cloud infrastructures
  • 2012
  • Part of: 2012 IEEE Network Operations and Management Symposium (NOMS). - : IEEE Communications Society. - 9781467302685 ; pp. 204-212
  • Conference paper (peer-reviewed) abstract
    • Cloud elasticity is the ability of the cloud infrastructure to rapidly change the amount of resources allocated to a service in order to meet the actual varying demands on the service while enforcing SLAs. In this paper, we focus on horizontal elasticity, the ability of the infrastructure to add or remove virtual machines allocated to a service deployed in the cloud. We model a cloud service using queuing theory. Using that model we build two adaptive proactive controllers that estimate the future load on a service. We explore the different possible scenarios for deploying a proactive elasticity controller coupled with a reactive elasticity controller in the cloud. Using simulation with workload traces from the FIFA world-cup web servers, we show that a hybrid controller that incorporates a reactive controller for scale up coupled with our proactive controllers for scale down decisions reduces SLA violations by a factor of 2 to 10 compared to a regression based controller or a completely reactive controller.
  •  
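The reactive-for-scale-up, proactive-for-scale-down split described in the abstract above can be illustrated with a toy decision function. The per-VM capacity, thresholds, and predictor inputs are assumptions for illustration, not the paper's queueing-theoretic controller:

```python
def hybrid_decision(current_vms, observed_load, predicted_load,
                    capacity_per_vm=100):
    """Reactive scale-up on observed load, proactive scale-down on
    predicted load: the split the paper found to reduce SLA violations.
    The capacity figure and predictor input are illustrative assumptions."""
    needed_now = -(-observed_load // capacity_per_vm)    # ceiling division
    needed_soon = -(-predicted_load // capacity_per_vm)
    if needed_now > current_vms:
        return needed_now    # reactive: scale up immediately on a spike
    if needed_soon < current_vms:
        return needed_soon   # proactive: release VMs only when the
                             # prediction says they will stay idle
    return current_vms

# A spike triggers immediate scale-up from 3 to 5 VMs...
print(hybrid_decision(current_vms=3, observed_load=450, predicted_load=300))
# ...but a momentary dip does not trigger scale-down if load is predicted back.
print(hybrid_decision(current_vms=5, observed_load=120, predicted_load=480))
```

Keeping scale-down decisions on the predicted load is what prevents the premature release of resources during transient dips.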
15.
  • Ali-Eldin, Ahmed, 1985-, et al. (author)
  • Analysis and characterization of a Video-on-Demand service workload
  • 2015
  • Part of: Proceedings of the 6th ACM Multimedia Systems Conference, MMSys 2015. - New York, NY, USA : ACM Digital Library. - 9781450333511 ; pp. 189-200
  • Conference paper (peer-reviewed) abstract
    • Video-on-Demand (VoD) and video sharing services account for a large percentage of the total downstream Internet traffic. In order to provide a better understanding of the load on these services, we analyze and model a workload trace from a VoD service provided by a major Swedish TV broadcaster. The trace contains over half a million requests generated by more than 20000 unique users. Among other things, we study the request arrival rate, the inter-arrival time, the spikes in the workload, the video popularity distribution, the streaming bit-rate distribution and the video duration distribution. Our results show that the user and the session arrival rates for the TV4 workload do not follow a Poisson process. The arrival rate distribution is modeled using a log-normal distribution while the inter-arrival time distribution is modeled using a stretched exponential distribution. We observe the “impatient user” behavior where users abandon streaming sessions after minutes or even seconds of starting them. Both very popular videos and non-popular videos are particularly affected by impatient users. We investigate if this behavior is an invariant for VoD workloads.
  •  
16.
  • Ali-Eldin, Ahmed, 1985- (author)
  • Capacity Scaling for Elastic Compute Clouds
  • 2013
  • Licentiate thesis (other academic/artistic) abstract
    • Cloud computing is a computing model that allows better management, higher utilization and reduced operating costs for datacenters while providing on demand resource provisioning for different customers. Data centers are often enormous in size and complexity. In order to fully realize the cloud computing model, efficient cloud management software systems that can deal with the datacenter size and complexity need to be designed and built. This thesis studies automated cloud elasticity management, one of the main and crucial datacenter management capabilities. Elasticity can be defined as the ability of cloud infrastructures to rapidly change the amount of resources allocated to an application in the cloud according to its demand. This work introduces algorithms, techniques and tools that a cloud provider can use to automate dynamic resource provisioning allowing the provider to better manage the datacenter resources. We design two automated elasticity algorithms for cloud infrastructures that predict the future load for an application running on the cloud. It is assumed that a request is either serviced or dropped after one time unit, that all requests are homogeneous and that it takes one time unit to add or remove resources. We discuss the different design approaches for elasticity controllers and evaluate our algorithms using real workload traces. We compare the performance of our algorithms with a state-of-the-art controller. We extend on the design of the best performing controller out of our two controllers and drop the assumptions made during the first design. The controller is evaluated with a set of different real workloads. All controllers are designed using certain assumptions on the underlying system model and operating conditions. This limits a controller's performance if the model or operating conditions change. With this as a starting point, we design a workload analysis and classification tool that assigns a workload to its most suitable elasticity controller out of a set of implemented controllers. The tool has two main components, an analyzer and a classifier. The analyzer analyzes a workload and feeds the analysis results to the classifier. The classifier assigns a workload to the most suitable elasticity controller based on the workload characteristics and a set of predefined business level objectives. The tool is evaluated with a set of collected real workloads and a set of generated synthetic workloads. Our evaluation results show that the tool can help a cloud provider to improve the QoS provided to the customers.
  •  
17.
  • Ali-Eldin, Ahmed, et al. (author)
  • Efficient provisioning of bursty scientific workloads on the cloud using adaptive elasticity control
  • 2012
  • Part of: Proceedings of the 3rd workshop on Scientific Cloud Computing Date. - New York, NY, USA : Association for Computing Machinery (ACM). - 9781450313407 - 145031340X ; pp. 31-40
  • Conference paper (peer-reviewed) abstract
    • Elasticity is the ability of a cloud infrastructure to dynamically change the amount of resources allocated to a running service as load changes. We build an autonomous elasticity controller that changes the number of virtual machines allocated to a service based on both monitored load changes and predictions of future load. The cloud infrastructure is modeled as a G/G/N queue. This model is used to construct a hybrid reactive-adaptive controller that quickly reacts to sudden load changes, prevents premature release of resources, takes into account the heterogeneity of the workload, and avoids oscillations. Using simulations with Web and cluster workload traces, we show that our proposed controller lowers the number of delayed requests by a factor of 70 for the Web traces and 3 for the cluster traces when compared to a reactive controller. Our controller also decreases the average number of queued requests by a factor of 3 for both traces, and reduces oscillations by a factor of 7 for the Web traces and 3 for the cluster traces. This comes at the expense of between 20% and 30% over-provisioning, as compared to a few percent for the reactive controller.
  •  
18.
  • Ali-Eldin, Ahmed, 1985-, et al. (author)
  • Measuring cloud workload burstiness
  • 2014
  • Part of: 2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing (UCC). - : IEEE conference proceedings. - 9781479978816 ; pp. 566-572
  • Conference paper (peer-reviewed) abstract
    • Workload burstiness and spikes are among the main reasons for service disruptions and decrease in the Quality-of-Service (QoS) of online services. They are hurdles that complicate autonomic resource management of datacenters. In this paper, we review the state-of-the-art in online identification of workload spikes and quantifying burstiness. The applicability of some of the proposed techniques is examined for Cloud systems where various workloads are co-hosted on the same platform. We discuss Sample Entropy (SampEn), a measure used in biomedical signal analysis, as a potential measure for burstiness. A modification to the original measure is introduced to make it more suitable for Cloud workloads.
  •  
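Sample Entropy (SampEn), the burstiness measure discussed in the abstract above, can be computed directly from its standard definition. This sketch implements the original biomedical formulation, not the Cloud-specific modification the paper introduces:

```python
import math

def sample_entropy(series, m=2, r=0.2):
    """Sample Entropy (SampEn) of a time series.
    m: template length, r: tolerance as a fraction of the standard deviation.
    Standard formulation: -ln(A/B), where B counts matching template pairs
    of length m and A counts matching pairs of length m+1 (no self-matches)."""
    n = len(series)
    mean = sum(series) / n
    std = (sum((x - mean) ** 2 for x in series) / n) ** 0.5
    tol = r * std

    def count_matches(length):
        templates = [series[i:i + length] for i in range(n - length + 1)]
        hits = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between the two templates
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= tol:
                    hits += 1
        return hits

    b, a = count_matches(m), count_matches(m + 1)
    return float("inf") if a == 0 or b == 0 else -math.log(a / b)

# A perfectly regular signal scores low; a spiky, bursty one scores high.
regular = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
bursty  = [1, 1, 1, 9, 1, 1, 7, 1, 1, 1, 8, 1]
print(sample_entropy(regular), sample_entropy(bursty))
```

Higher SampEn indicates less self-similarity in the trace, which is why the measure is a plausible proxy for workload burstiness.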
19.
  • Ali-Eldin, Ahmed, 1985-, et al. (author)
  • Optimizing Replica Placement in Peer-Assisted Cloud Stores
  • 2011
  • Conference paper (peer-reviewed) abstract
    • Peer-assisted cloud storage systems use the unutilized resources of the clients subscribed to a storage cloud to offload the servers of the cloud. The provider distributes data replicas on the clients instead of replicating on the local infrastructure. These replicas allow the provider to provide a highly available, reliable and cheap service at a reduced cost. In this work we introduce NileStore, a protocol for replication management in peer-assisted cloud storage. The protocol converts the replica placement problem into a linear task assignment problem. We design five utility functions to optimize placement taking into account the bandwidth, free storage and the size of data in need of replication on each peer. The problem is solved using a suboptimal greedy optimization algorithm. We show our simulation results using the different utilities under realistic network conditions. Our results show that using our approach offloads the cloud servers by about 90% compared to a random placement algorithm while consuming 98.5% less resources compared to a normal storage cloud.
  •  
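The task-assignment view of replica placement with a suboptimal greedy solver, as described above, can be sketched in a few lines. The single utility function, peer names, and sizes below are invented for illustration and are not one of NileStore's five utilities:

```python
# Greedy assignment sketch for the replica-placement problem: each file
# replica (task) goes to the peer with the highest utility that still has
# room. The utility (upload bandwidth weighted by remaining storage) is an
# illustrative assumption, not one of the paper's utility functions.

peers = {
    # name: [free_storage_gb, upload_mbps] -- hypothetical peers
    "peer_a": [50, 10],
    "peer_b": [20, 100],
    "peer_c": [5, 50],
}
replicas = [("video.mp4", 4), ("backup.tar", 30), ("doc.pdf", 1)]  # (file, size_gb)

def utility(peer, size):
    free, bw = peers[peer]
    if free < size:
        return float("-inf")     # peer cannot hold this replica
    return bw * (free - size)    # favor fast peers that keep headroom

placement = {}
for name, size in sorted(replicas, key=lambda r: -r[1]):  # largest first
    best = max(peers, key=lambda p: utility(p, size))
    if utility(best, size) == float("-inf"):
        continue                 # no peer fits; replica stays on the cloud
    placement[name] = best
    peers[best][0] -= size       # consume the chosen peer's free storage

print(placement)
```

A greedy pass like this is suboptimal compared with solving the assignment problem exactly, which matches the paper's characterization of its solver.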
20.
  • Ali-Eldin, Ahmed, 1985-, et al. (author)
  • Replica Placement in Peer-Assisted Clouds : An Economic Approach
  • 2011
  • Part of: Lecture Notes in Computer Science. - Berlin, Heidelberg : Springer. ; pp. 208-213
  • Conference paper (peer-reviewed) abstract
    • We introduce NileStore, a replica placement algorithm based on an economic model for use in peer-assisted cloud storage. The algorithm uses storage and bandwidth resources of peers to offload the cloud provider’s resources. We formulate the placement problem as a linear task assignment problem where the aim is to minimize the time needed for file replicas to reach a certain desired threshold. Using simulation, we reduce the probability of a file being served from the provider’s servers by more than 97.5% under realistic network conditions.
  •  
21.
  • Ali-Eldin, Ahmed, 1985-, et al. (författare)
  • WAC : A Workload analysis and classification tool for automatic selection of cloud auto-scaling methods
  • Annan publikation (övrigt vetenskapligt/konstnärligt)abstract
    • Autoscaling algorithms for elastic cloud infrastructures dynamically change the amount of resources allocated to a service according to the current and predicted future load. Since there are no perfect predictors, no single elasticity algorithm is suitable for accurate predictions of all workloads. To improve the quality of workload predictions and increase the Quality-of-Service (QoS) guarantees of a cloud service, multiple autoscalers suitable for different workload classes need to be used. In this work, we introduce WAC, a Workload Analysis and Classification tool that assigns workloads to the most suitable elasticity autoscaler out of a set of pre-deployed autoscalers. The workload assignment is based on the workload characteristics and a set of user-defined Business-Level-Objectives (BLO). We describe the tool design and its main components. We implement WAC and evaluate its precision using various workloads, BLO combinations and state-of-the-art autoscalers. Our experiments show that, when the classifier is tuned carefully, WAC assigns between 87% and 98.3% of the workloads to the most suitable elasticity autoscaler.
  •  
22.
  •  
23.
  • Ali-Eldin, Ahmed, 1985-, et al. (författare)
  • Workload Classification for Efficient Auto-Scaling of Cloud Resources
  • 2013
  • Annan publikation (övrigt vetenskapligt/konstnärligt)abstract
    • Elasticity algorithms for cloud infrastructures dynamically change the amount of resources allocated to a running service according to the current and predicted future load. Since there is no perfect predictor, and since different applications’ workloads have different characteristics, no single elasticity algorithm is suitable for future predictions for all workloads. In this work, we introduce WAC, a Workload Analysis and Classification tool that analyzes workloads and assigns them to the most suitable elasticity controllers based on the workloads’ characteristics and a set of business level objectives. WAC has two main components, the analyzer and the classifier. The analyzer analyzes workloads to extract some of the features used by the classifier, namely, workloads’ autocorrelations and sample entropies, which measure the periodicity and the burstiness of the workloads respectively. These two features are used with the business level objectives by the classifier as the features used to assign workloads to elasticity controllers. We start by analyzing 14 real workloads available from different applications. In addition, a set of 55 workloads is generated to test WAC on more workload configurations. We implement four state of the art elasticity algorithms. The controllers are the classes to which the classifier assigns workloads. We use a K nearest neighbors classifier and experiment with different workload combinations as training and test sets. Our experiments show that, when the classifier is tuned carefully, WAC correctly classifies between 92% and 98.3% of the workloads to the most suitable elasticity controller.
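The burstiness feature the abstract mentions, Sample Entropy, can be computed naively as below. This is a generic textbook-style sketch, assuming the common convention of a tolerance r expressed as a fraction of the standard deviation; the paper's exact parameter choices are not shown here.

```python
import math

def sample_entropy(series, m=2, r=0.2):
    """Naive O(n^2) Sample Entropy: -ln(A/B), where B counts pairs of
    length-m subsequences within tolerance and A counts pairs of
    length-(m+1) subsequences. Higher values mean a burstier,
    less predictable series."""
    n = len(series)
    mean = sum(series) / n
    sd = (sum((x - mean) ** 2 for x in series) / n) ** 0.5
    tol = r * sd if sd > 0 else r  # fallback for constant series

    def matches(length):
        templates = [series[i:i + length] for i in range(n - length + 1)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b)
                       for a, b in zip(templates[i], templates[j])) <= tol:
                    count += 1
        return count

    b = matches(m)
    a = matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined for too-short or too-irregular input
    return -math.log(a / b)
```

Regular, periodic request streams score low on this measure, while spiky traces score high, which is what makes it usable as a classification feature.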
  •  
24.
  • Ali-Eldin Hassan, Ahmed, 1985, et al. (författare)
  • CAVE: Caching 360° Videos at the Edge
  • 2022
  • Ingår i: NOSSDAV 2022 - Proceedings of the 2022 Workshop on Network and Operating System Support for Digital Audio and Video, Part of MMSys 2022. - New York, NY, USA : ACM. ; , s. 50-56
  • Konferensbidrag (refereegranskat)abstract
    • While 360° videos are gaining popularity due to the emergence of VR technologies, storing and streaming such videos can incur up to 20X higher overheads than traditional HD content. Edge caching, which involves caching and serving 360° videos from edge servers, is one possible approach for addressing these overheads. Prior work on 360° video caching has been based on using past history to cache tiles that are likely to be in a viewer's field of view and has not considered methods to intelligently share a limited edge cache across a set of videos that exhibit large variations in their popularity, size, content, and user abandonment patterns. Towards this end, we present CAVE, an adaptive edge caching framework that intelligently optimizes cache allocation across a set of videos taking into account video content, size, and popularity. Our experiments using realistic video workloads show that CAVE improves cache hit-rates, and thus network savings, by up to 50% over state-of-the-art approaches, while also scaling to up to two thousand videos per edge cache. In addition, the algorithm is embarrassingly parallel, allowing CAVE to scale beyond state-of-the-art solutions that typically do not support parallelization.
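Allocating a shared edge cache across videos of differing popularity and size can be sketched as a greedy marginal-benefit loop. This is an illustration only, assuming a simple linear hit-rate curve; CAVE's actual allocation model and field names are not reproduced here.

```python
def allocate_cache(videos, cache_size, step=1):
    """Greedily give `step` units of cache to whichever video currently
    offers the largest popularity-weighted hit-rate gain."""
    alloc = {v["name"]: 0 for v in videos}

    def hit(cached, v):
        # Linear stand-in for the hit-rate curve; real systems would
        # model tile popularity and abandonment within each video.
        return cached / v["size"]

    def marginal_gain(v):
        s = alloc[v["name"]]
        if s + step > v["size"]:
            return 0.0  # video already fully cached
        return v["popularity"] * (hit(s + step, v) - hit(s, v))

    remaining = cache_size
    while remaining >= step:
        best = max(videos, key=marginal_gain)
        if marginal_gain(best) <= 0:
            break  # nothing left worth caching
        alloc[best["name"]] += step
        remaining -= step
    return alloc
```

Because each allocation decision depends only on per-video state, loops like this parallelize naturally across videos, in the spirit of the "embarrassingly parallel" claim above.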
  •  
25.
  •  
26.
  • Ali-Eldin Hassan, Ahmed, 1985, et al. (författare)
  • The hidden cost of the edge: A performance comparison of edge and cloud latencies
  • 2021
  • Ingår i: International Conference for High Performance Computing, Networking, Storage and Analysis, SC. - New York, NY, USA : ACM. - 2167-4337 .- 2167-4329.
  • Konferensbidrag (refereegranskat)abstract
    • Edge computing has emerged as a popular paradigm for running latency-sensitive applications due to its ability to offer lower network latencies to end-users. In this paper, we argue that despite its lower network latency, the resource-constrained nature of the edge can result in higher end-to-end latency, especially at higher utilizations, when compared to cloud data centers. We study this edge performance inversion problem through an analytic comparison of edge and cloud latencies and analyze conditions under which the edge can yield worse performance than the cloud. To verify our analytic results, we conduct a detailed experimental comparison of the edge and the cloud latencies using a realistic application and real cloud workloads. Both our analytical and experimental results show that even at moderate utilizations, the edge queuing delays can offset the benefits of lower network latencies, and even result in performance inversion where running in the cloud would provide superior latencies. We finally discuss practical implications of our results and provide insights into how application designers and service providers should design edge applications and systems to avoid these pitfalls.
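The performance-inversion argument can be illustrated with a back-of-the-envelope queueing model: end-to-end latency is network delay plus queueing delay, and a small edge site saturates long before a cloud data center. The M/M/1 model and all numbers below are illustrative assumptions, not figures from the paper.

```python
def end_to_end_latency(net_ms, service_rate, arrival_rate):
    """Network delay plus M/M/1 sojourn time 1/(mu - lambda), in ms.
    Rates are in requests per second."""
    if arrival_rate >= service_rate:
        return float("inf")  # queue is unstable; latency grows unboundedly
    return net_ms + 1000.0 / (service_rate - arrival_rate)

# Edge site: low network latency (5 ms) but limited capacity (120 req/s).
def edge(lam):
    return end_to_end_latency(net_ms=5.0, service_rate=120.0, arrival_rate=lam)

# Cloud: higher network latency (40 ms) but ample capacity (1000 req/s).
def cloud(lam):
    return end_to_end_latency(net_ms=40.0, service_rate=1000.0, arrival_rate=lam)
```

At light load the edge wins on its network-latency advantage, but as the arrival rate approaches the edge's service rate, the 1/(mu - lambda) term dominates and the cloud becomes the faster option, which is exactly the inversion the abstract describes.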
  •  
27.
  • Ali-Eldin Hassan, Ahmed, 1985- (författare)
  • Workload characterization, controller design and performance evaluation for cloud capacity autoscaling
  • 2015
  • Doktorsavhandling (övrigt vetenskapligt/konstnärligt)abstract
    • This thesis studies cloud capacity auto-scaling, or how to provision and release resources to a service running in the cloud based on its actual demand using an automatic controller. As the performance of server systems depends on the system design, the system implementation, and the workloads the system is subjected to, we focus on these aspects with respect to designing auto-scaling algorithms. Towards this goal, we design and implement two auto-scaling algorithms for cloud infrastructures. The algorithms predict the future load for an application running in the cloud. We discuss the different approaches to designing an auto-scaler that combines reactive and proactive control methods and is able to handle long-running requests, e.g., tasks running for longer than the actuation interval, in a cloud. We compare the performance of our algorithms with state-of-the-art auto-scalers and evaluate the controllers’ performance with a set of workloads. As any controller is designed with an assumption on the operating conditions and system dynamics, the performance of an auto-scaler varies with different workloads. In order to better understand the workload dynamics and evolution, we analyze a 6-year-long workload trace of the sixth most popular Internet website. In addition, we analyze a workload from one of the largest Video-on-Demand streaming services in Sweden. We discuss the popularity of objects served by the two services, the spikes in the two workloads, and the invariants in the workloads. We also introduce a measure for the disorder in a workload, i.e., the amount of burstiness. The measure is based on Sample Entropy, an empirical statistic used in biomedical signal processing to characterize biomedical signals. The introduced measure can be used to characterize workloads based on their burstiness profiles. We compare our measure with the literature on quantifying burstiness in a server workload, and show its advantages. To better understand the tradeoffs between using different auto-scalers with different workloads, we design a framework to compare auto-scalers and give probabilistic guarantees on the performance in worst-case scenarios. Using different evaluation criteria and more than 700 workload traces, we compare six state-of-the-art auto-scalers that we believe represent the development of the field in the past 8 years. Knowing that the auto-scalers’ performance depends on the workloads, we design a workload analysis and classification tool that assigns a workload to its most suitable elasticity controller out of a set of implemented controllers. The tool has two main components: an analyzer and a classifier. The analyzer analyzes a workload and feeds the analysis results to the classifier. The classifier assigns a workload to the most suitable elasticity controller based on the workload characteristics and a set of predefined business level objectives. The tool is evaluated with a set of collected real workloads and a set of generated synthetic workloads. Our evaluation results show that the tool can help a cloud provider improve the QoS provided to customers.
  •  
28.
  • Bauer, André, et al. (författare)
  • Chameleon : A Hybrid, Proactive Auto-Scaling Mechanism on a Level-Playing Field
  • 2019
  • Ingår i: IEEE Transactions on Parallel and Distributed Systems. - : IEEE Computer Society. - 1045-9219 .- 1558-2183. ; 30:4, s. 800-813
  • Tidskriftsartikel (refereegranskat)abstract
    • Auto-scalers for clouds promise stable service quality at low costs when facing changing workload intensity. The major public cloud providers provide trigger-based auto-scalers based on thresholds. However, trigger-based auto-scaling has reaction times in the order of minutes. Novel auto-scalers from the literature try to overcome the limitations of reactive mechanisms by employing proactive prediction methods. However, the adoption of proactive auto-scalers in production is still very low due to the high risk of relying on a single proactive method. This paper tackles the challenge of reducing this risk by proposing a new hybrid auto-scaling mechanism, called Chameleon, combining multiple different proactive methods coupled with a reactive fallback mechanism. Chameleon employs on-demand, automated time-series-based forecasting methods to predict the arriving load intensity in combination with run-time service demand estimation to calculate the required resource consumption per work unit without the need for application instrumentation. We benchmark Chameleon against five different state-of-the-art proactive and reactive auto-scalers in three different private and public cloud environments. We generate five different representative workloads, each taken from different real-world system traces. Overall, Chameleon achieves the best scaling behavior based on user and elasticity performance metrics, analyzing the results from 400 hours of aggregated experiment time.
  •  
29.
  • Chen, Bo, et al. (författare)
  • Deep Contextualized Compressive Offloading for Images
  • 2021
  • Ingår i: SenSys 2021 - Proceedings of the 2021 19th ACM Conference on Embedded Networked Sensor Systems. - New York, NY, USA : ACM. ; , s. 467-473
  • Konferensbidrag (refereegranskat)abstract
    • Recent years have witnessed sensors becoming an indispensable part of our life, with the camera being one of the most popular and widely deployed sensors. The camera gives rise to numerous vision-based IoT applications that generate high-level understandings of a live video stream by performing analysis on end devices like mobile or embedded devices. Typically, these applications are built with deep learning (DL) models to conduct complex vision tasks, e.g., image classification and object detection. Due to the prohibitive cost of running DL models on end devices close to the camera and with limited computation capabilities, it is widely adopted to offload the computation to a nearby powerful edge server. However, there is a gap between the restricted offloading bandwidth of the end device and the large volume of image data incurred by the live video stream. In this paper, we present Deep Contextualized Compressive Offloading for Images (DCCOI), a lightweight, context-aware, and bandwidth-efficient offloading framework for images. DCCOI consists of the spatial-adaptive encoder, a lightweight neural network that compresses the image in a spatially adaptive manner, and the generative decoder, which reconstructs the image from the compressed data. In contrast to existing DL-based encoders, the spatial-adaptive encoder allows an image region to be encoded into different numbers of feature values based on the information in it. This offers a variable-length coding method for image compression, which is more effective than the fixed-length coding taken by existing DL-based compression approaches and demonstrates superior accuracy-compression rate trade-offs. We evaluate DCCOI against several baseline compression techniques while serving an object-detection-based application. The results show that DCCOI reduces the offloading size of JPEG by roughly a factor of 9, and that of DeepCOD, the state-of-the-art offloading approach, by 20%, with similar accuracy and a compression overhead of less than 50 ms.
  •  
30.
  •  
31.
  • Elmroth, Erik, 1964-, et al. (författare)
  • Self-management challenges for multi-cloud architectures
  • 2011
  • Ingår i: Towards a Service-Based Internet. - Berlin, Heidelberg : Springer Berlin/Heidelberg. - 9783642247545 - 9783642247552 ; , s. 38-49
  • Konferensbidrag (refereegranskat)abstract
    • Addressing the management challenges for a multitude of distributed cloud architectures, we focus on the three complementary cloud management problems of predictive elasticity, admission control, and placement (or scheduling) of virtual machines. As these problems are intrinsically intertwined we also propose an approach to optimize the overall system behavior by policy-tuning for the tools handling each of them. Moreover, in order to facilitate the execution of some of the management decisions, we also propose new algorithms for live migration of virtual machines with very high workload and/or over low-bandwidth networks, using techniques such as caching, compression, and prioritization of memory pages.
  •  
32.
  • Fuerst, Alexander, et al. (författare)
  • Cloud-scale VM-deflation for Running Interactive Applications On Transient Servers
  • 2020
  • Ingår i: HPDC 2020 - Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing. - New York, NY, USA : ACM. ; , s. 53-64
  • Konferensbidrag (refereegranskat)abstract
    • Transient computing has become popular in public cloud environments for running delay-insensitive batch and data processing applications at low cost. Since transient cloud servers can be revoked at any time by the cloud provider, they are considered unsuitable for running interactive applications such as web services. In this paper, we present VM deflation as an alternative mechanism to server preemption for reclaiming resources from transient cloud servers under resource pressure. Using real traces from top-tier cloud providers, we show the feasibility of using VM deflation as a resource reclamation mechanism for interactive applications in public clouds. We show how current hypervisor mechanisms can be used to implement VM deflation and present cluster deflation policies for resource management of transient and on-demand cloud VMs. Experimental evaluation of our deflation system on a Linux cluster shows that microservice-based applications can be deflated by up to 50% with negligible performance overhead. Our cluster-level deflation policies allow overcommitment levels as high as 50%, with less than a 1% decrease in application throughput, and can enable cloud platforms to increase revenue by 30%.
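The core idea of a cluster deflation policy, shrinking transient VMs proportionally rather than preempting them, can be sketched as below. The field names and the proportional rule are illustrative assumptions for the example, not the paper's exact policies.

```python
def deflate(vms, capacity):
    """Scale transient VMs down so the total CPU allocation fits
    `capacity`. On-demand VMs always keep their full allocation."""
    fixed = sum(v["cpus"] for v in vms if not v["transient"])
    flexible = sum(v["cpus"] for v in vms if v["transient"])
    spare = capacity - fixed
    if spare < 0:
        raise ValueError("on-demand VMs alone exceed capacity")
    # Deflate all transient VMs by the same factor; 1.0 means no pressure.
    factor = min(1.0, spare / flexible) if flexible else 1.0
    return {v["name"]: (v["cpus"] if not v["transient"]
                        else v["cpus"] * factor)
            for v in vms}
```

In a real hypervisor this resizing would be actuated through mechanisms such as CPU quota changes or memory ballooning rather than a dictionary update.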
  •  
33.
  • Ilyushkin, Alexey, et al. (författare)
  • An Experimental Performance Evaluation of Autoscalers for Complex Workflows
  • 2018
  • Ingår i: ACM Transactions on Modeling and Performance Evaluation of Computing Systems. - : Association for Computing Machinery (ACM). - 2376-3639 .- 2376-3647. ; 3:2
  • Tidskriftsartikel (refereegranskat)abstract
    • Elasticity is one of the main features of cloud computing, allowing customers to scale their resources based on the workload. Many autoscalers have been proposed in the past decade to decide on behalf of cloud customers when and how to provision resources to a cloud application based on the workload, utilizing cloud elasticity features. However, in prior work, when a new policy is proposed, it is seldom compared to the state-of-the-art, and is often compared only to static provisioning using a predefined quality of service target. This reduces the ability of cloud customers and of cloud operators to choose and deploy an autoscaling policy, as there is seldom enough analysis on the performance of the autoscalers in different operating conditions and with different applications. In our work, we conduct an experimental performance evaluation of autoscaling policies, using workflows as the application model, a popular formalism for automating resource management for applications with well-defined yet complex structures. We present a detailed comparative study of general state-of-the-art autoscaling policies, along with two new workflow-specific policies. To understand the performance differences between the seven policies, we conduct various experiments and compare their performance in both pairwise and group comparisons. We report both individual and aggregated metrics. As many workflows have deadline requirements on the tasks, we study the effect of autoscaling on workflow deadlines. Additionally, we look into the effect of autoscaling on the accounted and hourly-charged costs, and we evaluate the performance variability caused by the autoscaler selection for each group of workflow sizes. Our results highlight the trade-offs between the suggested policies, how they can impact meeting the deadlines, and how they perform in different operating conditions, thus enabling a better understanding of the current state-of-the-art.
  •  
34.
  •  
35.
  • Ilyushkin, Alexey, et al. (författare)
  • An Experimental Performance Evaluation of Autoscaling Algorithms for Complex Workflows
  • 2017
  • Ingår i: ICPE '17 Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering. - New York, NY, USA : ACM. - 9781450344043 ; , s. 75-86
  • Konferensbidrag (refereegranskat)abstract
    • Simplifying the task of resource management and scheduling for customers, while still delivering complex Quality-of-Service (QoS), is key to cloud computing. Many autoscaling policies have been proposed in the past decade to decide on behalf of cloud customers when and how to provision resources to a cloud application utilizing cloud elasticity features. However, in prior work, when a new policy is proposed, it is seldom compared to the state-of-the-art, and is often compared only to static provisioning using a predefined QoS target. This reduces the ability of cloud customers and of cloud operators to choose and deploy an autoscaling policy. In our work, we conduct an experimental performance evaluation of autoscaling policies, using workflows as the application model, a commonly used formalism for automating resource management for applications with well-defined yet complex structure. We present a detailed comparative study of general state-of-the-art autoscaling policies, along with two new workflow-specific policies. To understand the performance differences between the seven policies, we conduct various forms of pairwise and group comparisons. We report both individual and aggregated metrics. Our results highlight the trade-offs between the suggested policies, and thus enable a better understanding of the current state-of-the-art.
  •  
36.
  •  
37.
  • Krzywda, Jakub, 1989- (författare)
  • Analysing, modelling and controlling power-performance tradeoffs in data center infrastructures
  • 2017
  • Licentiatavhandling (övrigt vetenskapligt/konstnärligt)abstract
    • The aim of this thesis is to analyse the power-performance tradeoffs in data center servers, create models that capture these tradeoffs, and propose controllers to optimise the use of data center infrastructures taking the tradeoffs into consideration. The main research problem that we investigate in this thesis is how to increase the power efficiency of data center servers taking into account the power-performance tradeoffs. The main motivation for this research is the massive power consumption of data centers, which is a concern from both financial and environmental-footprint perspectives. Irrespective of the approaches taken to enhance data center power efficiency, substantial reductions in the power consumption of data center servers easily lead to performance degradation of hosted applications, which causes customer dissatisfaction. Therefore, it is crucial for data center operators to understand and control the power-performance tradeoffs. The research methods used in this thesis include experiments on real testbeds, applying statistical methods to create power-performance models, development of various optimisation techniques to improve the energy-efficiency of servers, and simulations to evaluate proposed solutions at scale. As a result of the research presented in this thesis, we propose taxonomies for selected aspects of data center configurations, events, management actions, and monitored metrics. We discuss the relationships between these elements and, to support the analysis, present results from a set of testbed experiments. We show limitations in the applicability of various data center management actions, including Dynamic Voltage Frequency Scaling (DVFS), Running Average Power Limit (RAPL), CPU pinning, and horizontal and vertical scaling. Finally, we propose a power budgeting controller that minimizes performance degradation while enforcing power limits. The outcomes of this thesis can be used by data center operators to improve the energy-efficiency of servers and reduce the overall power consumption with minimized performance degradation. Moreover, the software artifacts, including virtual machine images, scripts, and a simulator, are available online. Future work includes further investigation of the problem of graceful performance degradation under power limits, incorporating multi-layer applications spread among several servers, and a load-balancing controller.
  •  
38.
  • Krzywda, Jakub, 1989- (författare)
  • May the power be with you : managing power-performance tradeoffs in cloud data centers
  • 2019
  • Doktorsavhandling (övrigt vetenskapligt/konstnärligt)abstract
    • The overall goal of the work presented in this thesis was to find ways of managing power-performance tradeoffs in cloud data centers. To this end, the relationships between the power consumption of data center servers and the performance of applications hosted in data centers are analyzed, models that capture these relationships are developed, and controllers to optimize the use of data center infrastructures are proposed. The studies were motivated by the massive power consumption of modern data centers, which is a matter of significant financial and environmental concern. Various strategies for improving the power efficiency of data centers have been proposed, including server consolidation, server throttling, and power budgeting. However, no matter what strategy is used to enhance data center power efficiency, substantial reductions in the power consumption of data center servers can easily degrade the performance of hosted applications, causing customer dissatisfaction. It is therefore crucial for data center operators to understand and control power-performance tradeoffs. The research methods used in this work include experiments on real testbeds, the application of statistical methods to create power-performance models, development of various optimization techniques to improve the power efficiency of servers, and simulations to evaluate the proposed solutions at scale. This thesis makes multiple contributions. First, it introduces taxonomies for various aspects of data center configuration, events, management actions, and monitored metrics. We discuss the relationships between these elements and support our analysis with results from a set of testbed experiments. We demonstrate limitations on the usefulness of various data center management actions for controlling power consumption, including Dynamic Voltage Frequency Scaling (DVFS) and Running Average Power Limit (RAPL). We also demonstrate similar limitations on common measures for controlling application performance, including variation of operating system scheduling parameters, CPU pinning, and horizontal and vertical scaling. Finally, we propose a set of power budgeting controllers that act at the application, server, and cluster levels to minimize performance degradation while enforcing power limits. The results and analysis presented in this thesis can be used by data center operators to improve the power-efficiency of servers and reduce overall operational costs while minimizing performance degradation. All of the software generated during this work, including controller source code, virtual machine images, scripts, and simulators, has been open-sourced.
  •  
39.
  • Krzywda, Jakub, 1989-, et al. (författare)
  • Modeling and Simulation of QoS-Aware Power Budgeting in Cloud Data Centers
  • 2020
  • Ingår i: 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP). - : IEEE conference proceedings. ; , s. 88-93
  • Konferensbidrag (refereegranskat)abstract
    • Power budgeting is a commonly employed solution to reduce the negative consequences of high power consumption of large scale data centers. While various power budgeting techniques and algorithms have been proposed at different levels of data center infrastructures to optimize the power allocation to servers and hosted applications, testing them has been challenging with no available simulation platform that enables such testing for different scenarios and configurations. To facilitate evaluation and comparison of such techniques and algorithms, we introduce a simulation model for Quality-of-Service aware power budgeting and its implementation in CloudSim. We validate the proposed simulation model against a deployment on a real testbed, showcase simulator capabilities, and evaluate its scalability.
  •  
40.
  • Krzywda, Jakub, 1989-, et al. (författare)
  • Modeling and Simulation of QoS-Aware Power Budgeting in Cloud Data Centers
  • Annan publikation (övrigt vetenskapligt/konstnärligt)abstract
    • Power budgeting is a commonly employed solution to reduce the negative consequences of high power consumption of large scale data centers. While various power budgeting techniques and algorithms have been proposed at different levels of data center infrastructures to optimize the power allocation to servers and hosted applications, testing them has been challenging with no available simulation platform that enables such testing for different scenarios and configurations. To facilitate evaluation and comparison of such techniques and algorithms, we introduce a simulation model for Quality-of-Service aware power budgeting and its implementation in CloudSim. We validate the proposed simulation model against a deployment on a real testbed, showcase simulator capabilities, and evaluate its scalability.
  •  
41.
  • Krzywda, Jakub, 1989-, et al. (författare)
  • Power-performance tradeoffs in data center servers : DVFS, CPU pinning, horizontal, and vertical scaling
  • 2018
  • Ingår i: Future generations computer systems. - : Elsevier BV. - 0167-739X .- 1872-7115. ; 81, s. 114-128
  • Tidskriftsartikel (refereegranskat)abstract
    • Dynamic Voltage and Frequency Scaling (DVFS), CPU pinning, horizontal scaling, and vertical scaling are four techniques that have been proposed as actuators to control the performance and energy consumption on data center servers. This work investigates the utility of these four actuators, and quantifies the power-performance tradeoffs associated with them. Using replicas of the German Wikipedia running on our local testbed, we perform a set of experiments to quantify the influence of DVFS, vertical and horizontal scaling, and CPU pinning on end-to-end response time (average and tail), throughput, and power consumption with different workloads. The results of the experiments show that DVFS rarely reduces the power consumption of underloaded servers by more than 5%, but it can be used to limit the maximal power consumption of a saturated server by up to 20% (at a cost of performance degradation). CPU pinning reduces the power consumption of an underloaded server (by up to 7%) at the cost of performance degradation, which can be limited by choosing an appropriate CPU pinning scheme. Horizontal and vertical scaling improve both the average and tail response time, but the improvement is not proportional to the amount of resources added. The load balancing strategy has a big impact on the tail response time of horizontally scaled applications.
  •  
42.
  • Krzywda, Jakub, 1989-, et al. (författare)
  • Power Shepherd : Application Performance Aware Power Shifting
  • 2019
  • Ingår i: Proceedings of the International Conference on Cloud Computing Technology and Science, CloudCom. - : IEEE Computer Society. - 9781728150116 ; , s. 45-53
  • Konferensbidrag (refereegranskat)abstract
    • The constantly growing power consumption of data centers is a major concern for environmental and economic reasons. Current approaches to reducing the negative consequences of high power consumption focus on limiting the peak power consumption. During high workload periods, the power consumption of highly utilized servers is throttled to stay within the power budget. However, the peak power reduction affects the performance of hosted applications and thus leads to Quality of Service violations. In this paper, we introduce Power Shepherd, a hierarchical system for application-performance-aware power shifting. Power Shepherd reduces data center operational costs by redistributing the available power among applications hosted in the cluster. This is achieved by assigning server power budgets with the cluster controller, enforcing these power budgets using Running Average Power Limit (RAPL), and prioritizing applications within each server by adjusting the CPU scheduling configuration. We implement a prototype of the proposed solution and evaluate it in a real testbed equipped with power meters and using representative cloud applications. Our experiments show that Power Shepherd has the potential to manage a cluster consisting of thousands of servers and to significantly limit the increase in operational costs when the cluster power budget is limited and the system is overutilized. Finally, we identify some outstanding challenges regarding model sensitivity and the fact that this approach in its current form is not beneficial in all situations, e.g., when the system is underutilized.
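The cluster-level step of such a hierarchy, splitting one cluster power budget into per-server budgets, can be sketched as below. The demand-proportional rule, the idle-power floor, and the field names are illustrative assumptions; in the system described above, enforcement of each resulting budget would be delegated to RAPL on the individual host.

```python
def split_power_budget(servers, cluster_budget):
    """Give each server its idle power, then share the remaining
    headroom in proportion to demand above idle."""
    idle_total = sum(s["idle_w"] for s in servers)
    if cluster_budget < idle_total:
        raise ValueError("budget below aggregate idle power")
    headroom = cluster_budget - idle_total
    wants = {s["name"]: max(0.0, s["demand_w"] - s["idle_w"])
             for s in servers}
    total_want = sum(wants.values()) or 1.0  # avoid division by zero
    return {s["name"]: s["idle_w"] + headroom * wants[s["name"]] / total_want
            for s in servers}
```

A controller would rerun this split periodically as demand estimates change, shifting power toward the servers hosting the highest-priority or most power-hungry applications.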
  •  
43.
  • Li, Zheng, et al. (authors)
  • A Survey on Modeling Energy Consumption of Cloud Applications : Deconstruction, State of the Art, and Trade-Off Debates
  • 2017
  • Published in: IEEE Transactions on Sustainable Computing. - Institute of Electrical and Electronics Engineers (IEEE). - 2377-3782 ; 2:3, pp. 255-274
  • Journal article (peer-reviewed) abstract
    • Given the complexity and heterogeneity of Cloud computing scenarios, modeling has been widely employed to investigate and analyze the energy consumption of Cloud applications, by abstracting real-world objects and processes that are difficult to observe or understand directly. Such abstraction sacrifices, and usually does not need, a complete reflection of the reality being modeled. Consequently, current energy consumption models vary in their purposes, assumptions, application characteristics, and environmental conditions, with possible overlaps between different research works. It is therefore necessary and valuable to reveal the state of the art of existing modeling efforts, so as to weave different models together and facilitate the understanding and further investigation of application energy consumption in the Cloud domain. By systematically selecting, assessing, and synthesizing 76 relevant studies, we rationalized and organized over 30 energy consumption models with unified notations. To help investigate the existing models and facilitate future modeling work, we deconstructed the runtime execution and deployment environment of Cloud applications, and identified 18 environmental factors and 12 workload factors that influence energy consumption. In particular, there are complicated trade-offs, and even debates, when dealing with the combined impacts of multiple factors.
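One of the most widely cited model families in this literature is the linear utilization-based server power model, P(u) = P_idle + (P_peak − P_idle)·u, with application energy obtained by integrating over a utilization trace. A minimal sketch (the wattage defaults are made-up placeholders, not values from any surveyed study):

```python
def server_power(util: float, p_idle: float = 100.0, p_peak: float = 200.0) -> float:
    """Linear utilization-based power model (watts); util is in [0, 1]."""
    return p_idle + (p_peak - p_idle) * util

def trace_energy(utils, dt_s: float, **model_kw) -> float:
    """Energy in joules for a utilization trace sampled every dt_s seconds."""
    return sum(server_power(u, **model_kw) * dt_s for u in utils)
```

The survey's point is precisely that this simple linear form is only one choice: other models add frequency, memory, network, or virtualization terms, and their predictions can disagree when several factors interact.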
  •  
44.
  • Liang, Qianlin, et al. (authors)
  • Dělen: Enabling Flexible and Adaptive Model-serving for Multi-tenant Edge AI
  • 2023
  • Published in: ACM International Conference Proceeding Series, pp. 209-221
  • Conference paper (peer-reviewed) abstract
    • Model-serving systems expose machine learning (ML) models to applications programmatically via a high-level API. Cloud platforms use these systems to mask the complexities of optimally managing resources and servicing inference requests across multiple applications. Model serving at the edge is now also becoming increasingly important to support inference workloads with tight latency requirements. However, edge model serving differs substantially from cloud model serving in its latency, energy, and accuracy constraints: these systems must support multiple applications with widely different latency and accuracy requirements on embedded edge accelerators with limited computational and energy resources. To address this problem, this paper presents Dělen, a flexible and adaptive model-serving system for multi-tenant edge AI. Dělen exposes a high-level API that enables individual edge applications to specify a bound at runtime on the latency, accuracy, or energy of their inference requests. We efficiently implement Dělen using conditional execution in multi-exit deep neural networks (DNNs), which enables granular control over inference requests, and evaluate it on a resource-constrained Jetson Nano edge accelerator. We evaluate Dělen's flexibility by implementing state-of-the-art adaptation policies using its API, and its adaptability under different workload dynamics and goals when running single and multiple applications.
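The conditional-execution idea — evaluate exit heads in order and stop as soon as a caller-specified confidence bound is met — can be sketched independently of any DNN framework. The stub exits in the usage below are hypothetical stand-ins for real early-exit classifiers, not Dělen's API:

```python
def multi_exit_infer(x, exits, confidence_bound: float):
    """Run early-exit classifiers in order; return (label, confidence,
    exit_index) for the first exit meeting the bound. The final exit
    always answers, so easy inputs leave early and hard ones do not."""
    last = len(exits) - 1
    for i, exit_fn in enumerate(exits):
        label, conf = exit_fn(x)
        if conf >= confidence_bound or i == last:
            return label, conf, i
```

Tightening the bound shifts work to deeper (more accurate, more energy-hungry) exits, which is the latency/accuracy/energy knob such an API can expose per application.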
  •  
45.
  • Papadopoulos, Alessandro, Professor, et al. (authors)
  • Methodological Principles for Reproducible Performance Evaluation in Cloud Computing
  • 2021
  • Published in: IEEE Transactions on Software Engineering. - Institute of Electrical and Electronics Engineers Inc. - 0098-5589 .- 1939-3520 ; 47:8, pp. 1528-1543
  • Journal article (peer-reviewed) abstract
    • The rapid adoption and diversification of cloud computing technology exacerbate the importance of a sound experimental methodology for this domain. This work investigates how to measure and report performance in the cloud, and how well the cloud research community is already doing it. We propose a set of eight important methodological principles that combine best practices from nearby fields with concepts applicable only to clouds, and with new ideas about the time-accuracy trade-off. We show how these principles are applicable using a practical use-case experiment. To this end, we analyze the ability of the newly released SPEC Cloud IaaS benchmark to follow the principles, and showcase real-world experimental studies in common cloud environments that meet them. Lastly, we report on a systematic literature review covering top conferences and journals in the field from 2012 to 2017, analyzing whether the practice of reporting cloud performance measurements follows the proposed eight principles. Worryingly, this systematic survey and the subsequent two-round human reviews reveal that few of the published studies follow the eight experimental principles. We conclude that, although these important principles are simple and basic, the cloud community has yet to adopt them broadly to deliver sound measurements of cloud environments.
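One concrete instance of the time-accuracy trade-off such principles address is deciding how many repetitions a cloud measurement needs: report a mean with a confidence interval, and repeat until the interval is tight enough or the time budget runs out. A minimal sketch using the normal approximation (an illustration, not the paper's prescribed formula):

```python
import statistics

def mean_with_ci(samples, z: float = 1.96):
    """Mean of repeated measurements plus a normal-approximation 95%
    confidence half-width; more repetitions shrink the half-width."""
    mean = statistics.mean(samples)
    half_width = z * statistics.stdev(samples) / len(samples) ** 0.5
    return mean, half_width
```

Reporting the half-width alongside the mean is what lets a reader judge whether an observed difference between two systems is meaningful or just cloud-induced noise.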
  •  
46.
  •  
47.
  • Papadopoulos, Alessandro, et al. (authors)
  • PEAS : A Performance Evaluation framework for Auto-Scaling strategies in cloud applications
  • 2016
  • Published in: ACM Transactions on Modeling and Performance Evaluation of Computing Systems. - United States : Association for Computing Machinery (ACM). - 2376-3639 .- 2376-3647 ; :4
  • Journal article (peer-reviewed) abstract
    • Numerous auto-scaling strategies have been proposed in the last few years for improving various Quality of Service (QoS) indicators of cloud applications, e.g., response time and throughput, by adapting the amount of resources assigned to the application to meet the workload demand. However, the evaluation of a proposed auto-scaler is usually achieved through experiments under specific conditions, and seldom includes extensive testing to account for uncertainties in the workloads and unexpected behaviors of the system. Such tests by no means provide guarantees about the behavior of the system under general conditions. In this paper, we present PEAS, a Performance Evaluation framework for Auto-Scaling strategies in the presence of uncertainties. The evaluation is formulated as a chance-constrained optimization problem, which is solved using scenario theory. The adoption of this technique allows one to give probabilistic guarantees on the obtainable performance. Six different auto-scaling strategies have been selected from the literature for extensive test evaluation and compared using the proposed framework. We build a discrete event simulator and parameterize it based on real experiments. Using the simulator, each auto-scaler's performance is evaluated using 796 distinct real workload traces from projects hosted on the Wikimedia Foundation's servers, and their performance is compared using PEAS. The evaluation is carried out using different performance metrics, highlighting the flexibility of the framework, while providing probabilistic bounds on the evaluation and the performance of the algorithms. Our results highlight the problem of generalizing the conclusions of the original published studies and show that, based on the evaluation criteria, one controller can be shown to be better than others.
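The probabilistic guarantee comes from the scenario approach to chance-constrained optimization: solving the problem over N sampled workload scenarios bounds the violation probability of the result. The standard sample-size bound from that literature (not necessarily the exact variant PEAS uses) is N ≥ (2/ε)(ln(1/β) + d):

```python
import math

def scenario_count(eps: float, beta: float, d: int) -> int:
    """Scenario-approach sample bound: with this many i.i.d. scenarios,
    the optimized solution violates the chance constraint with probability
    at most eps, with confidence 1 - beta (d = number of decision
    variables)."""
    return math.ceil((2.0 / eps) * (math.log(1.0 / beta) + d))
```

The bound grows only logarithmically in 1/β, which is why evaluating auto-scalers on hundreds of real traces can already support high-confidence statements.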
  •  
48.
  • Savasci, Mehmet, et al. (authors)
  • DDPC: Automated Data-Driven Power-Performance Controller Design on-the-fly for Latency-sensitive Web Services
  • 2023
  • Published in: ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023. - 9781450394161, pp. 3067-3076
  • Conference paper (peer-reviewed) abstract
    • Traditional power reduction techniques such as DVFS or RAPL are challenging to use with web services because they significantly affect the services' latency and throughput. Previous work suggested the use of controllers based on control theory or machine learning to reduce performance degradation under constrained power. However, generating these controllers is challenging, as every web service application running in a data center requires its own power-performance model and a fine-tuned controller. In this paper, we present DDPC, a system for autonomic data-driven controller generation for power-latency management. DDPC automates the process of designing and deploying controllers for dynamic power allocation to manage the power-performance trade-offs of latency-sensitive web applications such as a social network. For each application, DDPC uses system identification techniques to learn an adaptive power-performance model that captures the application's power-latency trade-offs, which is then used to generate and deploy a Proportional-Integral (PI) power controller with gain scheduling that dynamically manages the power allocation of the server running the application using RAPL. We evaluate DDPC with two realistic latency-sensitive web applications under varying load scenarios. Our results show that DDPC is capable of autonomically generating and deploying controllers within a few minutes, reducing the active power allocation of a web server by more than 50% compared to state-of-the-art techniques while maintaining the latency well below the application's target.
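The core control loop here — a discrete PI controller whose gains are swapped by operating region (gain scheduling) — can be sketched as follows. The gain values, region names, and sign convention below are illustrative assumptions, not DDPC's identified parameters:

```python
class GainScheduledPI:
    """Discrete PI controller: u = Kp*e + Ki*sum(e), with (Kp, Ki)
    selected per operating region, e.g. per load level."""

    def __init__(self, gains):
        self.gains = gains      # region -> (kp, ki)
        self.integral = 0.0

    def step(self, target, measured, region):
        kp, ki = self.gains[region]
        error = measured - target  # latency above target -> raise power budget
        self.integral += error
        return kp * error + ki * self.integral  # power-budget adjustment (W)
```

Gain scheduling matters because a web application's latency responds very differently to a watt of extra power at low load than near saturation; one fixed (Kp, Ki) pair cannot be well tuned for both regions.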
  •  
49.
  • Souza, Abel, et al. (authors)
  • CASPER: Carbon-Aware Scheduling and Provisioning for Distributed Web Services
  • 2023
  • Published in: ACM International Conference Proceeding Series. 28 October 2023, pp. 67-73
  • Conference paper (peer-reviewed) abstract
    • There has been a significant societal push towards sustainable practices, including in computing. Modern interactive workloads such as geo-distributed web services exhibit spatiotemporal and performance flexibility, making it possible to adapt the location, time, and intensity of processing to align with the availability of renewable and low-carbon energy. An example is a web application hosted across multiple cloud regions, each with a carbon intensity that varies with its local electricity mix. Distributed load balancing enables the exploitation of low-carbon energy through load migration across regions, reducing web applications' carbon footprint. In this paper, we present CASPER, a carbon-aware scheduling and provisioning system that minimizes the carbon footprint of distributed web services while respecting their Service Level Objectives (SLOs). We formulate CASPER as a multi-objective optimization problem that considers both the variable carbon intensity and the latency constraints of the network. Our evaluation reveals the significant potential of CASPER to achieve substantial reductions in carbon emissions: compared to baseline methods, CASPER demonstrates improvements of up to 70% with no degradation in latency performance.
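The routing decision at the heart of such a system can be sketched as a tiny greedy rule: among the regions that currently satisfy the latency SLO, send load to the one with the lowest carbon intensity. CASPER's actual formulation is a multi-objective optimization; the region data and field names below are invented for illustration:

```python
def pick_region(regions, slo_ms: float) -> str:
    """Choose the lowest-carbon region that meets the latency SLO;
    if no region qualifies, fall back to the fastest one."""
    feasible = [r for r in regions if r["latency_ms"] <= slo_ms]
    candidates = feasible or regions
    key = (lambda r: r["gco2_per_kwh"]) if feasible else (lambda r: r["latency_ms"])
    return min(candidates, key=key)["name"]
```

A looser SLO enlarges the feasible set and therefore the achievable carbon savings, which is exactly the flexibility-versus-performance trade-off the paper exploits.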
  •  
50.
  • Wang, Bin, et al. (authors)
  • LaSS: Running Latency Sensitive Serverless Computations at the Edge
  • 2021
  • Published in: HPDC 2021 - Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing. - New York, NY, USA : ACM, pp. 239-251
  • Conference paper (peer-reviewed) abstract
    • Serverless computing has emerged as a new paradigm for running short-lived computations in the cloud. Due to its ability to handle IoT workloads, there has been considerable interest in running serverless functions at the edge. However, the constrained nature of the edge and the latency sensitive nature of workloads result in many challenges for serverless platforms. In this paper, we present LaSS, a platform that uses model-driven approaches for running latency-sensitive serverless computations on edge resources. LaSS uses principled queuing-based methods to determine an appropriate allocation for each hosted function and auto-scales the allocated resources in response to workload dynamics. LaSS uses a fair-share allocation approach to guarantee a minimum of allocated resources to each function in the presence of overload. In addition, it utilizes resource reclamation methods based on container deflation and termination to reassign resources from over-provisioned functions to under-provisioned ones. We implement a prototype of our approach on an OpenWhisk serverless edge cluster and conduct a detailed experimental evaluation. Our results show that LaSS can accurately predict the resources needed for serverless functions in the presence of highly dynamic workloads, and reprovision container capacity within hundreds of milliseconds while maintaining fair share allocation guarantees.
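The principled queueing-based sizing idea can be illustrated with the simplest possible model: treat each container as an M/M/1 server with arrivals split evenly, giving mean response time 1/(μ − λ/c), and pick the smallest container count c that meets the SLO. This is a toy stand-in for LaSS's actual models, with made-up rates in the test values:

```python
import math

def min_containers(arrival_rate: float, service_rate: float, slo_s: float) -> int:
    """Smallest container count c such that the mean M/M/1 response time
    1/(service_rate - arrival_rate/c) stays within slo_s, assuming the
    arrival stream (req/s) is split evenly across c containers."""
    if service_rate <= 1.0 / slo_s:
        raise ValueError("SLO tighter than a single request's service time")
    return max(1, math.ceil(arrival_rate / (service_rate - 1.0 / slo_s)))
```

Evaluating this formula as the measured arrival rate changes is one way an auto-scaler can reprovision proactively rather than reacting only after latency has already degraded.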
  •  