SwePub
Results for search "L773:0743 7315 OR L773:1096 0848"

  • Results 1-10 of 50
1.
  • Johnsson, Lennart, et al. (author)
  • Generalized Shuffle Permutations on Boolean Cubes
  • 1992
  • In: Journal of Parallel and Distributed Computing. - 0743-7315 .- 1096-0848. ; 16:1, pp. 1-14
  • Journal article (peer-reviewed). Abstract:
    • In a generalized permutation an address $(a_{q-1}a_{q-2} \ldots a_0)$ receives its content from an address obtained through a cyclic shift on a subset of the q dimensions used for the encoding of the addresses. Bit-complementation may be combined with the shift. We give an algorithm that requires K/2 + 2 exchanges for K elements per processor, when storage dimensions are part of the permutation, and concurrent communication on all ports of every processor is possible. The number of element exchanges in sequence is independent of the number of processor dimensions $\omega_r$ in the permutation.
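As a rough illustration of the address mapping described in the abstract above (a cyclic shift on a subset of the address bits, optionally combined with bit-complementation), the following Python sketch computes the source address for each destination address. The function names, the shift direction, and the use of an XOR mask for complementation are assumptions made for this example; they are not taken from the paper, and the sketch does not model the paper's communication algorithm or its exchange counts.

```python
from typing import List, Sequence


def source_address(addr: int, dims: Sequence[int], shift: int = 1,
                   complement_mask: int = 0) -> int:
    """Return the address whose content is received by `addr`.

    `dims` lists the bit positions (dimensions) taking part in the cyclic
    shift; all other bits are left unchanged. The shift direction and the
    XOR-based complementation are assumptions of this sketch.
    """
    k = len(dims)
    bits = [(addr >> d) & 1 for d in dims]
    result = addr
    for i, d in enumerate(dims):
        # Position dims[i] takes the bit that sat at dims[(i - shift) % k].
        b = bits[(i - shift) % k]
        result = (result & ~(1 << d)) | (b << d)
    # Optional bit-complementation, expressed as an XOR mask.
    return result ^ complement_mask


def permute(data: Sequence, dims: Sequence[int], shift: int = 1,
            complement_mask: int = 0) -> List:
    """Apply the permutation to `data`, whose length must be a power of two."""
    return [data[source_address(a, dims, shift, complement_mask)]
            for a in range(len(data))]


if __name__ == "__main__":
    # Shifting both bits of a 2-bit address swaps the two middle elements.
    print(permute(list("abcd"), dims=[0, 1]))  # → ['a', 'c', 'b', 'd']
```

Because a bit rotation and an XOR with a fixed mask are both invertible, every destination address gets exactly one source, so the mapping is a permutation of the address space.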
2.
  • Johnsson, Lennart (author)
  • Performance Modeling of Distributed Memory Architectures
  • 1991
  • In: Journal of Parallel and Distributed Computing. - 0743-7315 .- 1096-0848. ; 12:4, pp. 300-312
  • Journal article (peer-reviewed). Abstract:
    • We provide performance models for several primitive operations on data structures distributed over memory units interconnected by a Boolean cube network. In particular, we model single-source and multiple-source concurrent broadcasting or reduction, concurrent gather and scatter operations, shifts along several axes of multidimensional arrays, and emulation of butterfly networks. We also show how the processor configuration, the data aggregation, and the encoding of the address space affect the performance for two important basic computations: the multiplication of arbitrarily shaped matrices and the Fast Fourier Transform. We also give an example of the performance behavior for local matrix operations for a processor with a single path to local memory and a set of processor registers. The analytic models are verified by measurements on the Connection Machine Model CM-2.
3.
  • Nordström, Tomas, 1963-, et al. (author)
  • Using and designing massively parallel computers for artificial neural networks
  • 1992
  • In: Journal of Parallel and Distributed Computing. - Orlando : Academic Press. - 0743-7315 .- 1096-0848. ; 14:3, pp. 260-285
  • Journal article (peer-reviewed). Abstract:
    • During the past 10 years the fields of artificial neural networks (ANNs) and massively parallel computing have been evolving rapidly. The authors study the attempts to make ANN algorithms run on massively parallel computers as well as designs of new parallel systems tuned for ANN computing. Following a brief survey of the most commonly used models, the different dimensions of parallelism in ANN computing are identified, and the possibilities for mapping onto the structures of different parallel architectures are analyzed. Different classes of parallel architectures used or designed for ANN are identified. Reported implementations are reviewed and discussed. It is concluded that the regularity of ANN computations suits SIMD architectures perfectly and that broadcast or ring communication can be very efficiently utilized. Bit-serial processing is very interesting for ANN, but hardware support for multiplication should be included. Future artificial neural systems for real-time applications will require flexible processing modules that can be put together to form MIMSIMD systems.
4.
  • Aartsen, M. G., et al. (author)
  • The IceProd framework : Distributed data processing for the IceCube neutrino observatory
  • 2015
  • In: Journal of Parallel and Distributed Computing. - : Elsevier BV. - 0743-7315 .- 1096-0848. ; 75, pp. 198-211
  • Journal article (peer-reviewed). Abstract:
    • IceCube is a one-gigaton instrument located at the geographic South Pole, designed to detect cosmic neutrinos, identify the particle nature of dark matter, and study high-energy neutrinos themselves. Simulation of the IceCube detector and processing of data require a significant amount of computational resources. This paper presents the first detailed description of IceProd, a lightweight distributed management system designed to meet these requirements. It is driven by a central database in order to manage mass production of simulations and analysis of data produced by the IceCube detector. IceProd runs as a separate layer on top of other middleware and can take advantage of a variety of computing resources, including grids and batch systems such as CREAM, HTCondor, and PBS. This is accomplished by a set of dedicated daemons that process job submission in a coordinated fashion through the use of middleware plugins that serve to abstract the details of job submission and job management from the framework. (C) 2014 Elsevier Inc. All rights reserved.
5.
  • Araujo, Victor, et al. (author)
  • Performance evaluation of FIWARE : A cloud-based IoT platform for smart cities
  • 2019
  • In: Journal of Parallel and Distributed Computing. - : Elsevier. - 0743-7315 .- 1096-0848. ; 132, pp. 250-261
  • Journal article (peer-reviewed). Abstract:
    • As the Internet of Things (IoT) becomes a reality, millions of devices will be connected to IoT platforms in smart cities. These devices will cater to several areas within a smart city such as healthcare, logistics, and transportation. These devices are expected to generate significant amounts of data requests at high data rates, therefore, necessitating the performance benchmarking of IoT platforms to ascertain whether they can efficiently handle such devices. In this article, we present our results gathered from extensive performance evaluation of the cloud-based IoT platform, FIWARE. In particular, to study FIWARE’s performance, we developed a testbed and generated CoAP and MQTT data to emulate large-scale IoT deployments, crucial for future smart cities. We performed extensive tests and studied FIWARE’s performance regarding vertical and horizontal scalability. We present bottlenecks and limitations regarding FIWARE components and their cloud deployment. Finally, we discuss cost-efficient FIWARE deployment strategies that can be extremely beneficial to stakeholders aiming to deploy FIWARE as an IoT platform for smart cities.
6.
  • Baker, Thar, et al. (author)
  • Enabling Technologies for Energy Cloud
  • 2021
  • In: Journal of Parallel and Distributed Computing. - : Elsevier. - 0743-7315 .- 1096-0848. ; 152, pp. 108-110
  • Journal article (other academic/artistic). Abstract:
    • We are thrilled and delighted to present this special issue, which focuses on the novel area of Enabling Technologies for Energy Cloud. This guest editorial provides an overview of all articles accepted for publication in this special issue.
7.
  • Cao, Liang, et al. (author)
  • GCHAR : An efficient Group-based Context–aware human activity recognition on smartphone
  • 2018
  • In: Journal of Parallel and Distributed Computing. - : Elsevier. - 0743-7315 .- 1096-0848. ; 118:part-1, pp. 67-80
  • Journal article (peer-reviewed). Abstract:
    • With smartphones increasingly becoming ubiquitous and being equipped with various sensors, nowadays, there is a trend towards implementing HAR (Human Activity Recognition) algorithms and applications on smartphones, including health monitoring, self-managing system and fitness tracking. However, one of the main issues of the existing HAR schemes is that the classification accuracy is relatively low, and in order to improve the accuracy, high computation overhead is needed. In this paper, an efficient Group-based Context-aware classification method for human activity recognition on smartphones, GCHAR is proposed, which exploits hierarchical group-based scheme to improve the classification efficiency, and reduces the classification error through context awareness rather than the intensive computation. Specifically, GCHAR designs the two-level hierarchical classification structure, i.e., inter-group and inner-group, and utilizes the previous state and transition logic (so-called context awareness) to detect the transitions among activity groups. In comparison with other popular classifiers such as RandomTree, Bagging, J48, BayesNet, KNN and Decision Table, thorough experiments on the realistic dataset (UCI HAR repository) demonstrate that GCHAR achieves the best classification accuracy, reaching 94.1636%, and time consumption in training stage of GCHAR is four times shorter than the simple Decision Table and is decreased by 72.21% in classification stage in comparison with BayesNet.
8.
  • Dhamal, Swapnil Vilas, 1988, et al. (author)
  • Strategic Investments in Distributed Computing: A Stochastic Game Perspective
  • 2022
  • In: Journal of Parallel and Distributed Computing. - : Elsevier BV. - 1096-0848 .- 0743-7315. ; 169, pp. 317-333
  • Journal article (peer-reviewed). Abstract:
    • We study a stochastic game with a dynamic set of players, for modeling and analyzing their computational investment strategies in distributed computing. Players obtain a certain reward for solving a problem, while incurring a certain cost based on the invested time and computational power. We present our framework while considering a contemporary application of blockchain mining, and show that the framework is applicable to certain other distributed computing settings as well. For an in-depth analysis, we consider a particular yet natural scenario where the rate of solving the problem is proportional to the total computational power invested by the players. We show that, in Markov perfect equilibrium, players with cost parameters exceeding a certain threshold, do not invest; while those with cost parameters less than this threshold, invest maximal power. We arrive at an interesting conclusion that the players need not have information about the system state as well as each others' parameters, namely, cost parameters and arrival/departure rates. With extensive simulations and insights through mean field approximation, we study the effects of players' arrival/departure rates and the system parameters on the players' utilities.
9.
  • Feliu, Josue, et al. (author)
  • Speculative inter-thread store-to-load forwarding in SMT architectures
  • 2023
  • In: Journal of Parallel and Distributed Computing. - : Elsevier. - 0743-7315 .- 1096-0848. ; 173, pp. 94-106
  • Journal article (peer-reviewed). Abstract:
    • Applications running on out-of-order cores have benefited for decades from store-to-load forwarding, which accelerates communication of store values to loads of the same thread. Although threads running on a simultaneous multithreading (SMT) core could also access the load queues (LQ) and store queues (SQ) / store buffers (SB) of other threads to allow inter-thread store-to-load forwarding, we have skipped exploiting it because if we allow communication of different SMT threads via their LQs and SQs/SBs, write atomicity may be violated with respect to the outside world beyond the acceptable model of read-own-write-early multiple-copy atomicity (rMCA). In our prior work, we leveraged this idea to propose inter-thread store-to-load forwarding (ITSLF). ITSLF accelerates synchronization and communication of threads running in a simultaneous multi-threading processor by allowing stores in the store-queue of a thread to forward data to loads of another thread running in the same core without violating rMCA. In this work, we extend the original ITSLF mechanism to allow inter-thread forwarding from speculative stores (Spec-ITSLF). Spec-ITSLF allows forwarding store values to other threads earlier, which further accelerates synchronization. Spec-ITSLF outperforms a baseline SMT core by 15%, which is 2% better on average (and up to 5% for the TATP workload) than the original ITSLF mechanism. More importantly, Spec-ITSLF is on par with the original ITSLF mechanism regarding storage overhead but does not need to keep track of the speculative state of stores, which was an important source of overhead and complexity in the original mechanism. (c) 2022 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
10.
  • García Martín, Eva, et al. (author)
  • Estimation of energy consumption in machine learning
  • 2019
  • In: Journal of Parallel and Distributed Computing. - : Academic Press. - 0743-7315 .- 1096-0848. ; 134, pp. 75-88
  • Journal article (peer-reviewed). Abstract:
    • Energy consumption has been widely studied in the computer architecture field for decades. While the adoption of energy as a metric in machine learning is emerging, the majority of research is still primarily focused on obtaining high levels of accuracy without any computational constraint. We believe that one of the reasons for this lack of interest is due to their lack of familiarity with approaches to evaluate energy consumption. To address this challenge, we present a review of the different approaches to estimate energy consumption in general and machine learning applications in particular. Our goal is to provide useful guidelines to the machine learning community giving them the fundamental knowledge to use and build specific energy estimation methods for machine learning algorithms. We also present the latest software tools that give energy estimation values, together with two use cases that enhance the study of energy consumption in machine learning.
Publication type
journal article (50)
Content type
peer-reviewed (45)
other academic/artistic (5)
Author/editor
Bohm, Christian (1)
Kim, S. H. (1)
Kolanoski, H. (1)
Sander, H. G. (1)
Vallecorsa, S. (1)
Koepke, L. (1)
Christov, A. (1)
Schmitz, M. (1)
Boeser, S. (1)
Zarzhitsky, P. (1)
Bai, X. (1)
Kaminsky, B. (1)
Landsman, H. (1)
Kowalski, M. (1)
Kim, D. (1)
Van Eijndhoven, N. (1)
Aartsen, M. G. (1)
Ackermann, M. (1)
Adams, J. (1)
Aguilar, J. A. (1)
Altmann, D. (1)
Arguelles, C. (1)
Auffenberg, J. (1)
Barwick, S. W. (1)
Baum, V. (1)
Bay, R. (1)
Beatty, J. J. (1)
Tjus, J. Becker (1)
Hultqvist, Klas (1)
BenZvi, S. (1)
Berghaus, P. (1)
Berley, D. (1)
Bernardini, E. (1)
Bernhard, A. (1)
Besson, D. Z. (1)
Binder, G. (1)
Bindig, D. (1)
Bissok, M. (1)
Blaufuss, E. (1)
Blumenthal, J. (1)
Boersma, David J. (1)
Bose, D. (1)
Botner, Olga (1)
Brayeur, L. (1)
Bretz, H. -P (1)
Brown, A. M. (1)
Casey, J. (1)
Casier, M. (1)
Chirkin, D. (1)
Christy, B. (1)
University
Kungliga Tekniska Högskolan (12)
Chalmers tekniska högskola (10)
Blekinge Tekniska Högskola (7)
Uppsala universitet (5)
Luleå tekniska universitet (5)
Umeå universitet (3)
Mälardalens universitet (3)
Högskolan i Borås (2)
Karlstads universitet (2)
Högskolan i Halmstad (1)
Stockholms universitet (1)
Linköpings universitet (1)
RISE (1)
Language
English (50)
Research subject (UKÄ/SCB)
Natural sciences (44)
Engineering and technology (7)
Medical and health sciences (1)