SwePub
Results list for search "hsv:(NATURVETENSKAP) hsv:(Data och informationsvetenskap) ;pers:(Stenström Per 1957)"

  • Results 1-10 of 169
1.
  • Magnusson, Peter S., et al. (author)
  • SimICS/sun4m : A virtual workstation
  • 2019
  • In: USENIX 1998 Annual Technical Conference. - New Orleans, LA, USA : USENIX Association.
  • Conference paper (peer-reviewed)
    • System level simulators allow computer architects and system software designers to recreate an accurate and complete replica of the program behavior of a target system, regardless of the availability, existence, or instrumentation support of such a system. Applications include evaluation of architectural design alternatives as well as software engineering tasks such as traditional debugging and performance tuning. We present an implementation of a simulator acting as a virtual workstation fully compatible with the sun4m architecture from Sun Microsystems. Built using the system-level SPARC V8 simulator SimICS, SimICS/sun4m models one or more SPARC V8 processors, supports user-developed modules for data cache and instruction cache simulation and execution profiling of all code, and provides a symbolic and performance debugging environment for operating systems. SimICS/sun4m can boot unmodified operating systems, including Linux 2.0.30 and Solaris 2.6, directly from snapshots of disk partitions. To support essentially arbitrary code, we implemented binary-compatible simulators for several devices, including SCSI, console, interrupt, timers, EEPROM, and Ethernet. The Ethernet simulation hooks into the host and allows the virtual workstation to appear on the local network with full services available (NFS, NIS, rsh, etc). Ethernet and console traffic can be recorded for future playback. The performance of SimICS/sun4m is sufficient to run realistic workloads, such as the database benchmark TPC-D, scaling factor 1/100, or an interactive network application such as Mozilla. The slowdown in relation to native hardware is in the range of 25 to 75 (measured using SPECint95). We also demonstrate some applications, including modeling an 8-processor sun4m version (which does not exist), modeling future memory hierarchies, and debugging an operating system.
2.
  • Hollmann, Jochen, 1970, et al. (author)
  • An Evaluation of Document Prefetching in a Distributed Digital Library
  • 2003
  • In: Research and Advanced Technology for Digital Libraries / Lecture Notes in Computer Science. - Berlin, Heidelberg : Springer Berlin Heidelberg. - 0302-9743 .- 1611-3349. - 9783540407263 ; 2769, pp. 276-287
  • Report (other academic/artistic)
    • Latency is a fundamental problem for all distributed systems, including digital libraries. To reduce user-perceived delays, both caching -- keeping accessed objects for future use -- and prefetching -- transferring objects ahead of access time -- can be used. In a previous paper we reported that caching is not worthwhile for digital libraries due to low re-access frequencies. In this paper we evaluate our previous findings that prefetching can be used instead. To do this we have set up an experimental prefetching proxy which is able to retrieve documents from remote fulltext archives before the user demands them. Using a simple prediction to limit the overhead of unnecessarily transferred data, we find that it is possible to cut the user-perceived average delay by a factor of two.
3.
  • Hollmann, Jochen, 1970, et al. (author)
  • Empirical Observations regarding Predictability in User Access-Behavior in a Distributed Digital Library System
  • 2003
  • Journal article (other academic/artistic)
    • Today document archives are geographically distributed but often not replicated. This can potentially result in a low quality of service in terms of reduced availability and long user-perceived access times. Instead of indiscriminate replication we study the effectiveness of caching techniques such as prefetching and selective preloading. Our technique analyzes whether user access behavior is predictable enough to guess what articles to prefetch or to preload based on access logs from DADS, a digital library system for scientific journal articles developed at DTV, the Technical Knowledge Center of Denmark. We have found that once a literature search has been narrowed to up to ten articles, there is a high likelihood that some of them will be eventually downloaded. This suggests that prefetching can be used to hide the article transfer latency. We have also found that 80% of the article downloads are confined to less than 20% of the journals, so preloading a small fraction of the digital library database could significantly shorten the access latency and improve the availability.
4.
  • Hollmann, Jochen, 1970, et al. (author)
  • Empirical Observations regarding Predictability in User Access-Behavior in a Distributed Digital Library System
  • 2002
  • In: Proceedings of the 16th International Parallel and Distributed Processing Symposium. - 0769515738 ; , pp. 221-228
  • Conference paper (peer-reviewed)
    • Document archives are today geographically distributed but often not replicated. This can potentially result in a low quality of service in terms of reduced availability and long user-perceived access times, especially during peak hours. Indiscriminate replication is not feasible due to the sheer size of the database and its administration. In an ongoing project, the goal is to study the effectiveness of caching techniques like prefetching and selective preloading to improve quality of service of digital library systems. In this paper, we analyze whether user access behavior is predictable enough to use it to guess what articles to prefetch or to preload based on user access logs from DADS, a digital library system developed at the Technical Knowledge Center of Denmark, DTV. We have found that once a literature search has been narrowed down to fewer than ten articles, there is a high likelihood that some of them will eventually be downloaded. This suggests that prefetching can be used to hide the article transfer latency. We have also found that as many as 80% of the article downloads are confined to less than 20% of the journals. This suggests that preloading a small fraction of the digital library database can significantly shorten the access latency as well as improve the availability.
5.
  • Islam, Mafijul, 1975, et al. (author)
  • Zero-Value Caches: Cancelling Loads that Return Zero
  • 2009
  • In: Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT. - 1089-795X. - 9780769537719 ; , pp. 237-245
  • Report (other academic/artistic)
    • The speed gap between processor and memory continues to limit performance. To address this problem, we explore the potential of eliminating Zero Loads — loads accessing memory locations that contain the value “zero” — to improve performance and energy dissipation. Our study shows that such loads comprise as many as 18% of the total number of dynamic loads. We show that a significant fraction of zero loads ends up on the critical memory-access path in out-of-order cores. We propose a non-speculative microarchitectural technique — Zero-Value Cache (ZVC) — to capitalize on zero loads and explore critical design options of such caches. We show that with modest investment (typically a 576-byte structure), we can obtain speedups up to 78% and reduce the overall energy dissipation up to 39%. Most importantly, zero-value caches never cause performance loss.
6.
  • Alvarez, Lluc, et al. (author)
  • eProcessor: European, Extendable, Energy-Efficient, Extreme-Scale, Extensible, Processor Ecosystem
  • 2023
  • In: Proceedings of the 20th ACM International Conference on Computing Frontiers 2023, CF 2023. ; , pp. 309-314
  • Conference paper (peer-reviewed)
    • The eProcessor project aims at creating a RISC-V full stack ecosystem. The eProcessor architecture combines a high-performance out-of-order core with energy-efficient accelerators for vector processing and artificial intelligence with reduced-precision functional units. The design of this architecture follows a hardware/software co-design approach with relevant application use cases from the high-performance computing, bioinformatics and artificial intelligence domains. Two eProcessor prototypes will be developed based on two fabricated eProcessor ASICs integrated into a computer-on-module.
7.
  • Angerd, Alexandra, 1988, et al. (author)
  • A GPU Register File using Static Data Compression
  • 2020
  • In: ACM International Conference Proceeding Series. - New York, NY, USA : ACM.
  • Conference paper (peer-reviewed)
    • GPUs rely on large register files to unlock thread-level parallelism for high throughput. Unfortunately, large register files are power hungry, making it important to seek new approaches to improve their utilization. This paper introduces a new register file organization for efficient register-packing of narrow integer and floating-point operands, designed to leverage advances in static analysis. We show that the hardware/software co-designed register file organization yields a performance improvement of up to 79%, and 18.6% on average, at a modest output-quality degradation.
8.
  • Angerd, Alexandra, 1988, et al. (author)
  • GBDI: Going Beyond Base-Delta-Immediate Compression with Global Bases
  • 2022
  • In: Proceedings - International Symposium on High-Performance Computer Architecture. - 1530-0897. - 9781665420273 ; 2022-April, pp. 1115-1127
  • Conference paper (peer-reviewed)
    • Memory bandwidth is limiting performance for many emerging applications. While compression techniques can unlock a higher memory bandwidth, prior art offers only modestly better bandwidth. This paper contributes a new compression method - Global Base Delta Immediate compression (GBDI) - that offers substantially higher memory bandwidth by, unlike prior art, selecting base values across memory blocks. GBDI uses a novel clustering algorithm through data analysis in the background. The presented accelerator infrastructure offers low area overhead and latency. This paper shows that GBDI offers a compression ratio of 2.3×, and yields 1.5× higher bandwidth and 1.1× higher performance compared with a baseline without compression support, on average, for SPEC2017 benchmarks requiring medium to high memory bandwidth.
9.
  • Grahn, Håkan, et al. (author)
  • A Comparative Evaluation of Hardware-Only and Software-Only Directory Protocols in Shared-Memory Multiprocessors
  • 2004
  • In: Journal of Systems Architecture. - : Elsevier BV. - 1383-7621. ; 50:9, pp. 537-561
  • Journal article (peer-reviewed)
    • The hardware complexity of hardware-only directory protocols in shared-memory multiprocessors has motivated many researchers to emulate directory management by software handlers executed on the compute processors, called software-only directory protocols. In this paper, we evaluate the performance and design trade-offs between these two approaches in the same architectural simulation framework driven by eight applications from the SPLASH-2 suite. Our evaluation reveals some common case operations that can be supported by simple hardware mechanisms and can make the performance of software-only directory protocols competitive with that of hardware-only protocols. These mechanisms aim at either reducing the software handler latency or hiding it by overlapping it with the message latencies associated with inter-node memory transactions. Further, we evaluate the effects of cache block sizes between 16 and 256 bytes as well as two different page placement policies. Overall, we find that a software-only directory protocol enhanced with these mechanisms can reach between 63% and 97% of the baseline hardware-only protocol performance at a lower design complexity.
10.
  • Holtryd, Nadja, 1988, et al. (author)
  • DELTA: Distributed Locality-Aware Cache Partitioning for Tile-based Chip Multiprocessors
  • 2020
  • In: Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium, IPDPS 2020. ; , pp. 578-589
  • Conference paper (peer-reviewed)
    • Cache partitioning in tile-based CMP architectures is a challenging problem because of i) the need to determine capacity allocations with low computational overhead and ii) the need to place allocations close to where they are used, in order to reduce access latency. Although previous solutions have addressed the problem of reducing the computational overhead and incorporating locality-awareness, they suffer from the overheads of centrally determining allocations. In this paper, we propose DELTA, a novel distributed and locality-aware cache partitioning solution which works by exchanging asynchronous challenges among cores. The distributed nature of the algorithm coupled with the low computational complexity allows for frequent reconfigurations at negligible cost and for the scheme to be implemented directly in hardware. The allocation algorithm is supported by an enforcement mechanism which enables locality-aware placement of data. We evaluate DELTA on 16- and 64-core tiled CMPs with multi-programmed workloads. Our evaluation shows that DELTA improves performance by 9% and 16%, respectively, on average, compared to an unpartitioned shared last-level cache.
Publication type
conference paper (84)
journal article (50)
report (13)
edited collection (editorship) (12)
patent (8)
book (1)
research review (1)
Content type
peer-reviewed (132)
other academic/artistic (37)
Author/editor
Manivannan, Madhavan ... (19)
Islam, Mafijul, 1975 (13)
Negi, Anurag, 1980 (12)
Thuresson, Martin, 1 ... (12)
Pericas, Miquel, 197 ... (11)
Dubois, Michel (10)
Papaefstathiou, Vasi ... (7)
Själander, Magnus, 1 ... (7)
Warg, Fredrik, 1974 (7)
Garcia, J. M. (7)
Arelakis, Angelos, 1 ... (6)
Titos Gil, Ruben, 19 ... (6)
Ekman, Magnus, 1977 (6)
Pathan, Risat, 1979 (6)
Larsson-Edefors, Per ... (4)
Svensson, Lars, 1960 (4)
Ardö, Anders (4)
Vajda, András (4)
Björk, Magnus, 1977 (4)
McKee, Sally A, 1963 (4)
Chen, Guancheng (4)
Dybdahl, Haakon (4)
Angerd, Alexandra, 1 ... (3)
Sintorn, Erik, 1980 (3)
Cristal, Adrian (3)
Bardine, Alessandro (3)
Holtryd, Nadja, 1988 (3)
Busck, Alexander (3)
Engbom, Mikael (3)
Chen, Jianwei (3)
Hughes, John, 1958 (2)
Grahn, Håkan (2)
Nilsson, Jim (2)
Marazakis, Manolis (2)
Goel, Bhavishya, 198 ... (2)
Jeppson, Kjell, 1947 (2)
Mueller, Frank (2)
Dahlgren, Fredrik, 1 ... (2)
Sheeran, Mary, 1959 (2)
Azhar, Muhammad Waqa ... (2)
Jeong, J (2)
Whalley, David (2)
Foglia, PieroFrances ... (2)
Gabrielli, G (2)
Prete, Antonio (2)
Vallejo, F (2)
Karlsson, Jonas, 197 ... (2)
Gaydadjiev, Georgi, ... (2)
De Bosschere, Koen (2)
Higher education institution
Chalmers tekniska högskola (169)
Blekinge Tekniska Högskola (2)
Göteborgs universitet (1)
Lunds universitet (1)
RISE (1)
Language
English (168)
Swedish (1)
Research subject (UKÄ/SCB)
Natural sciences (169)
Engineering and technology (28)
