SwePub

id:"swepub:oai:DiVA.org:uu-523379"
 


ITSLF: Inter-Thread Store-to-Load Forwarding in Simultaneous Multithreading

Feliu, Josue (author)
Univ Murcia, Dept Comp Engn, Murcia, Spain.
Ros, Alberto (author)
Univ Murcia, Dept Comp Engn, Murcia, Spain.
Acacio, Manuel E. (author)
Univ Murcia, Dept Comp Engn, Murcia, Spain.
Kaxiras, Stefanos (author)
Uppsala universitet, Datorarkitektur och datorkommunikation, Avdelningen för datorteknik, Datorteknik
2021-10-17
2021
English.
In: Proceedings of 54th Annual IEEE/ACM International Symposium on Microarchitecture, Micro 2021. - New York, NY, USA : Association for Computing Machinery (ACM). - ISBN 9781450385572, pp. 1296-1308
  • Conference paper (peer-reviewed)
Abstract
  • In this paper, we argue that, for a class of fine-grain, synchronization-intensive, parallel workloads, it is advantageous to consolidate synchronization and communication as much as possible among the threads of simultaneous multithreading (SMT) cores. While, today, the shared L1 is the closest coherent level where synchronization and communication between SMT threads can take place, we observe that there is an even closer shared level, entirely inside a single core. This level comprises the load queues (LQ) and store queues (SQ) / store buffers (SB) of the SMT threads and, to the best of our knowledge, it has never been used as such. The reason is that if we allow communication of different SMT threads via their LQs and SQs/SBs, i.e., inter-thread store-to-load forwarding (ITSLF), we violate write atomicity with respect to the outside world, beyond the acceptable model of read-own-write-early multiple-copy atomicity (rMCA). The key insight of our work is that we can accelerate synchronization and communication among SMT threads with inter-thread store-to-load forwarding, without affecting the memory model, in particular without violating rMCA. We demonstrate how we can achieve this entirely through speculative interactions between LQs and SQs/SBs of different threads, while ensuring deadlock-free execution. Without changing the architectural model, the ISA, or the software, and without adding extra hardware in the form of a specialized accelerator, our insight enables a new design point for a standard architecture. We demonstrate that with ITSLF, workloads scale better on a single 8-way SMT core (with the resources of a single-threaded core) than on a baseline SMT (with or without optimizations), or on 8 single-threaded cores.

Subject headings

NATURAL SCIENCES -- Computer and Information Sciences -- Computer Sciences (hsv//eng)
NATURAL SCIENCES -- Computer and Information Sciences -- Computer Engineering (hsv//eng)

Keywords

Simultaneous multithreading
memory consistency
store-to-load forwarding
multiple-copy atomicity

Publication and content type

ref (subject category)
kon (subject category)
