SwePub
id:"swepub:oai:DiVA.org:kth-159997"
 


Parallel distributed scalable runtime address generation scheme for a coarse grain reconfigurable computation and storage fabric

Farahini, Nasim (author)
KTH,Elektroniksystem,ESY ELECTRONICS AND EMBEDDED SYSTEMS
Hemani, Ahmed (author)
KTH,Elektroniksystem,ESY ELECTRONICS AND EMBEDDED SYSTEMS
Sohofi, Hassan (author)
KTH,Elektroniksystem,ESY ELECTRONICS AND EMBEDDED SYSTEMS
Jafri, Syed M. A. H. (author)
KTH,Elektroniksystem
Tajammul, Muhammad Adeel (author)
KTH,Elektroniksystem
Paul, Kolin (author)
Elsevier BV, 2014
English.
In: Microprocessors and microsystems. - : Elsevier BV. - 0141-9331 .- 1872-9436. ; 38:8, pp. 788-802
  • Journal article (peer-reviewed)
Abstract
  • This paper presents a hardware-based, scalable runtime address generation scheme for DSP applications mapped to a parallel distributed coarse-grain reconfigurable computation and storage fabric. The scheme also handles non-affine functions of multiple variables, which typically correspond to multiple nested loops. The key innovation is the judicious use of two categories of address generation resources. The first category is a low-cost address generation unit (AGU) that generates addresses, within given address bounds, for affine functions of up to two variables. These low-cost AGUs are distributed, with one associated with every read/write port in the distributed memory architecture. The second category is more complex; it is also distributed, but shared among a few storage units. It handles more demanding address generation requirements, such as dynamically computing the address bounds that are then used to configure the AGUs and transforming non-affine functions into affine functions by computing the affine factor outside the loop. Computing the address constraints at runtime adds negligible overhead in latency, area and energy, while substantially reducing program storage and energy and improving reconfiguration agility compared to the prevalent pre-computation of address constraints. The efficacy of the proposed method has been validated against the prevalent address generation schemes for a set of six realistic DSP functions. Compared to the pre-computation method, the proposed solution achieved 75% average code compaction; compared to the centralized runtime address generation scheme, it achieved 32.7% average performance improvement.
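
The division of labour described in the abstract can be illustrated with a small software sketch. This is not the paper's hardware design: the function names `affine_agu` and `shared_unit_addresses`, their parameters, and the example access pattern `i*i + j` are all assumptions chosen for illustration. The first function plays the role of a low-cost AGU (an affine function of up to two loop variables, within given bounds); the second plays the role of the shared resource, which computes the non-affine term and the address bound once per outer iteration and then configures the simple AGU with them.

```python
from typing import Iterator


def affine_agu(base: int, stride_i: int, count_i: int,
               stride_j: int = 0, count_j: int = 1) -> Iterator[int]:
    """Hypothetical low-cost AGU: yields base + i*stride_i + j*stride_j
    for i in [0, count_i) and j in [0, count_j), with j as the inner loop.
    Only additions are used, as a simple address generator might do."""
    row = base
    for _ in range(count_i):
        addr = row
        for _ in range(count_j):
            yield addr
            addr += stride_j
        row += stride_i


def shared_unit_addresses(n: int) -> Iterator[int]:
    """Hypothetical shared resource: handles the pattern addr = i*i + j
    with 0 <= j <= i, which is non-affine in i and has a data-dependent bound."""
    for i in range(n):
        base = i * i   # non-affine term, computed once outside the inner loop
        count = i + 1  # address bound computed at runtime
        # Inside the inner loop the pattern is affine (base + j), so the
        # simple AGU can be configured with the values computed above.
        yield from affine_agu(base, stride_i=0, count_i=1,
                              stride_j=1, count_j=count)


if __name__ == "__main__":
    # Affine case: a 2 x 3 block with row pitch 4.
    print(list(affine_agu(base=0, stride_i=4, count_i=2, stride_j=1, count_j=3)))
    # Non-affine case, reduced to per-iteration affine configurations.
    print(list(shared_unit_addresses(3)))
```

In this sketch only the base and the count are recomputed per outer iteration, which mirrors the idea of configuring AGUs at runtime rather than storing every address constraint in advance.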

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Sciences (hsv//eng)

Keywords

Streaming address generation
CGRA
Parallel distributed DSP
Code compaction

Publication and content type

ref (subject category)
art (subject category)
