
Result list for search "WFRF:(Perais Arthur)"


Result 1-3 of 3


Enumeration	Reference
1.	Asgharzadeh, Ashkan, et al. (author) Free Atomics : Hardware Atomic Operations without Fences 2022 In: PROCEEDINGS OF THE 2022 THE 49TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA '22). - New York, NY, USA : Association for Computing Machinery (ACM). - ISBN 9781450386104 ; pp. 14-26. Conference paper (peer-reviewed). Abstract: Atomic Read-Modify-Write (RMW) instructions are primitive synchronization operations implemented in hardware that provide the building blocks for higher-abstraction synchronization mechanisms to programmers. According to publicly available documentation, current x86 implementations serialize atomic RMW operations, i.e., the store buffer is drained before issuing atomic RMWs and subsequent memory operations are stalled until the atomic RMW commits. This serialization, carried out by memory fences, incurs a performance cost which is expected to increase with deeper pipelines. This work proposes Free atomics, a lightweight, speculative, deadlock-free implementation of atomic operations that removes the need for memory fences, thus improving performance, while preserving atomicity and consistency. Free atomics is, to the best of our knowledge, the first proposal to enable store-to-load forwarding for atomic RMWs. Free atomics only requires simple modifications and incurs a small area overhead (15 bytes). Our evaluation using gem5-20 shows that, for a 32-core configuration, Free atomics improves performance by 12.5%, on average, for a large range of parallel workloads and 25.2%, on average, for atomic-intensive parallel workloads over a fenced atomic RMW implementation.
2.	Perais, Arthur, et al. (author) Cost-effective speculative scheduling in high performance processors 2015 In: Proc. 42nd International Symposium on Computer Architecture. - New York : ACM Press. - ISBN 9781450334020 ; pp. 247-259. Conference paper (peer-reviewed). Abstract: To maximize performance, out-of-order execution processors sometimes issue instructions without having the guarantee that operands will be available in time; e.g. loads are typically assumed to hit in the L1 cache and dependent instructions are issued accordingly. This form of speculation - that we refer to as speculative scheduling - has been used for two decades in real processors, but has received little attention from the research community. In particular, as pipeline depth grows, and the distance between the Issue and the Execute stages increases, it becomes critical to issue instructions dependent on variable-latency instructions as soon as possible rather than wait for the actual cycle at which the result becomes available. Unfortunately, due to the uncertain nature of speculative scheduling, the scheduler may wrongly issue an instruction that will not have its source(s) available on the bypass network when it reaches the Execute stage. In that event, the instruction is canceled and replayed, potentially impairing performance and increasing energy consumption. In this work, we do not present a new replay mechanism. Rather, we focus on ways to reduce the number of replays that are agnostic of the replay scheme. First, we propose an easily implementable, low-cost solution to reduce the number of replays caused by L1 bank conflicts. Schedule shifting always assumes that, given a dual-load issue capacity, the second load issued in a given cycle will be delayed because of a bank conflict. Its dependents are thus always issued with the corresponding delay. Second, we also improve on existing L1 hit/miss prediction schemes by taking into account instruction criticality. That is, for some criterion of criticality and for loads whose hit/miss behavior is hard to predict, we show that it is more cost-effective to stall dependents if the load is not predicted critical.
3.	Sembrant, Andreas, et al. (author) Long Term Parking (LTP) : Criticality-aware Resource Allocation in OOO Processors 2015 In: Proc. 48th International Symposium on Microarchitecture. - New York, NY, USA : ACM. - ISBN 9781450340342 ; pp. 334-346. Conference paper (peer-reviewed). Abstract: Modern processors employ large structures (IQ, LSQ, register file, etc.) to expose instruction-level parallelism (ILP) and memory-level parallelism (MLP). These resources are typically allocated to instructions in program order. This wastes resources by allocating resources to instructions that are not yet ready to be executed and by eagerly allocating resources to instructions that are not part of the application’s critical path. This work explores the possibility of allocating pipeline resources only when needed to expose MLP, and thereby enabling a processor design with significantly smaller structures, without sacrificing performance. First we identify the classes of instructions that should not reserve resources in program order and evaluate the potential performance gains we could achieve by delaying their allocations. We then use this information to “park” such instructions in a simpler, and therefore more efficient, Long Term Parking (LTP) structure. The LTP stores instructions until they are ready to execute, without allocating pipeline resources, and thereby keeps the pipeline available for instructions that can generate further MLP. LTP can accurately and rapidly identify which instructions to park, park them before they execute, wake them when needed to preserve performance, and do so using a simple queue instead of a complex IQ. We show that even a very simple queue-based LTP design allows us to significantly reduce IQ (64 → 32) and register file (128 → 96) sizes while retaining MLP performance and improving energy efficiency.


Refine your search

Type of publication: conference paper (3)

Type of content: peer-reviewed (3)

Author/Editor: Perais, Arthur (3); Hagersten, Erik (2); Sembrant, Andreas (2); Seznec, Andre (2); Michaud, Pierre (2); Kaxiras, Stefanos (1); Ros, Alberto (1); Carlson, Trevor E. (1); Black-Schaffer, Davi ... (1); Asgharzadeh, Ashkan (1); Cebrian, Juan M. (1)

University: Uppsala University (3)

Language: English (3)

Research subject (UKÄ/SCB): Natural sciences (2); Engineering and Technology (1)

