
Result list for search "WFRF:(Acacio Manuel E.)"

Search: WFRF:(Acacio Manuel E.)

Result 1-4 of 4

1.	Cebrián, Juan M., et al. (author) A dedicated private-shared cache design for scalable multiprocessors 2017 In: Concurrency and Computation. - : Wiley. - 1532-0626 .- 1532-0634. ; 29:2 Journal article (peer-reviewed)
2.	Feliu, Josue, et al. (author) ITSLF : Inter-Thread Store-to-Load Forwarding in Simultaneous Multithreading 2021 In: Proceedings of 54th Annual IEEE/ACM International Symposium on Microarchitecture, Micro 2021. - New York, NY, USA : Association for Computing Machinery (ACM). - 9781450385572 ; , pp. 1296-1308 Conference paper (peer-reviewed) Abstract: In this paper, we argue that, for a class of fine-grain, synchronization-intensive, parallel workloads, it is advantageous to consolidate synchronization and communication as much as possible among the threads of simultaneous multithreading (SMT) cores. While, today, the shared L1 is the closest coherent level where synchronization and communication between SMT threads can take place, we observe that there is an even closer shared level, entirely inside a single core. This level comprises the load queues (LQ) and store queues (SQ) / store buffers (SB) of the SMT threads and, to the best of our knowledge, it has never been used as such. The reason is that if we allow communication of different SMT threads via their LQs and SQs/SBs, i.e., inter-thread store-to-load forwarding (ITSLF), we violate write atomicity with respect to the outside world, beyond the acceptable model of read-own-write-early multiple-copy atomicity (rMCA). The key insight of our work is that we can accelerate synchronization and communication among SMT threads with inter-thread store-to-load forwarding, without affecting the memory model, in particular without violating rMCA. We demonstrate how we can achieve this entirely through speculative interactions between LQs and SQs/SBs of different threads, while ensuring deadlock-free execution. Without changing the architectural model, the ISA, or the software, and without adding extra hardware in the form of a specialized accelerator, our insight enables a new design point for a standard architecture. We demonstrate that with ITSLF, workloads scale better on a single 8-way SMT core (with the resources of a single-threaded core) than on a baseline SMT (with or without optimizations), or on 8 single-threaded cores.
3.	Feliu, Josue, et al. (author) Speculative inter-thread store-to-load forwarding in SMT architectures 2023 In: Journal of Parallel and Distributed Computing. - : Elsevier. - 0743-7315 .- 1096-0848. ; 173, pp. 94-106 Journal article (peer-reviewed) Abstract: Applications running on out-of-order cores have benefited for decades from store-to-load forwarding, which accelerates communication of store values to loads of the same thread. Although threads running on a simultaneous multithreading (SMT) core could also access the load queues (LQ) and store queues (SQ) / store buffers (SB) of other threads to allow inter-thread store-to-load forwarding, this has not been exploited because, if we allow communication of different SMT threads via their LQs and SQs/SBs, write atomicity may be violated with respect to the outside world beyond the acceptable model of read-own-write-early multiple-copy atomicity (rMCA). In our prior work, we leveraged this idea to propose inter-thread store-to-load forwarding (ITSLF). ITSLF accelerates synchronization and communication of threads running in a simultaneous multithreading processor by allowing stores in the store queue of a thread to forward data to loads of another thread running in the same core without violating rMCA. In this work, we extend the original ITSLF mechanism to allow inter-thread forwarding from speculative stores (Spec-ITSLF). Spec-ITSLF allows forwarding store values to other threads earlier, which further accelerates synchronization. Spec-ITSLF outperforms a baseline SMT core by 15%, which is 2% better on average (and up to 5% for the TATP workload) than the original ITSLF mechanism. More importantly, Spec-ITSLF is on par with the original ITSLF mechanism regarding storage overhead but does not need to keep track of the speculative state of stores, which was an important source of overhead and complexity in the original mechanism. © 2022 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
4.	Shimchenko, Marina, et al. (author) Analysing software prefetching opportunities in hardware transactional memory 2022 In: Journal of Supercomputing. - : Springer Nature. - 0920-8542 .- 1573-0484. ; 78:1, pp. 919-944 Journal article (peer-reviewed) Abstract: Hardware transactional memory emerged to make parallel programming more accessible. However, the performance pitfall of this technique is squashing speculatively executed instructions and re-executing them in case of aborts, ultimately resorting to serialization in case of repeated conflicts. A significant fraction of aborts occurs due to conflicts (concurrent reads and writes to the same memory location performed by different threads). Our proposal aims to reduce conflict aborts by reducing the window of time during which transactional regions can suffer conflicts. We achieve this by using software prefetching instructions inserted automatically at compile-time. Through these prefetch instructions, we intend to bring the necessary data for each transaction from the main memory to the cache before the transaction itself starts to execute, thus converting the otherwise long latency cache misses into hits during the execution of the transaction. The obtained results show that our approach decreases the number of aborts by 30% on average and improves performance by up to 19% and 10% for two out of the eight evaluated benchmarks. We provide insights into when our technique is beneficial given certain characteristics of the transactional regions, the advantages and disadvantages of our approach, and finally, discuss potential solutions to overcome some of its limitations.


Refine your search

Type of publication: journal article (3); conference paper (1)

Type of content: peer-reviewed (4)

Author/Editor: Ros, Alberto (4); Acacio, Manuel E. (4); Kaxiras, Stefanos (3); Jimborean, Alexandra (2); Fernández-Pascual, R ... (2); Feliu, Josue (2); Cebrian, Juan M. (1); Titos-Gil, Ruben (1); Shimchenko, Marina (1)

University: Uppsala University (4)

Language: English (4)

Research subject (UKÄ/SCB): Natural sciences (4)

