Parallel distributed scalable runtime address generation scheme for a coarse grain reconfigurable computation and storage fabric


Farahini, Nasim (author): KTH, Elektroniksystem, ESY ELECTRONICS AND EMBEDDED SYSTEMS

Hemani, Ahmed (author): KTH, Elektroniksystem, ESY ELECTRONICS AND EMBEDDED SYSTEMS

Sohofi, Hassan (author): KTH, Elektroniksystem, ESY ELECTRONICS AND EMBEDDED SYSTEMS

Jafri, Syed M. A. H. (author): KTH, Elektroniksystem

Tajammul, Muhammad Adeel (author): KTH, Elektroniksystem

Paul, Kolin (author)


Elsevier BV, 2014
English.
In: Microprocessors and Microsystems. Elsevier BV. ISSN 0141-9331, 1872-9436. 38:8, pp. 788-802

Related links: https://urn.kb.se/re...; https://doi.org/10.1...

Journal article (peer-reviewed)

Abstract

This paper presents a hardware-based solution for scalable runtime address generation for DSP applications mapped to a parallel, distributed, coarse-grain reconfigurable computation and storage fabric. The scheme can also handle non-affine functions of multiple variables, which typically correspond to multiple nested loops. The key innovation is the judicious use of two categories of address generation resources. The first category is a low-cost address generation unit (AGU) that generates addresses within given address bounds for affine functions of up to two variables. Such low-cost AGUs are distributed and associated with every read/write port in the distributed memory architecture. The second category is more complex; it is also distributed, but shared among a few storage units, and it handles more demanding address generation requirements, such as dynamically computing the address bounds that are then used to configure the AGUs and transforming non-affine functions into affine functions by computing the affine factor outside the loop. Computing the address constraints at runtime adds negligible overhead in latency, area and energy, while substantially reducing program storage and energy and improving reconfiguration agility compared with the prevalent pre-computation of address constraints. The efficacy of the proposed method has been validated against the prevalent address generation schemes on a set of six realistic DSP functions. Compared with the pre-computation method, the proposed solution achieved 75% average code compaction; compared with the centralized runtime address generation scheme, it achieved a 32.7% average performance improvement.
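The two-tier idea described in the abstract can be illustrated with a small software sketch. This is not the paper's hardware design or code: the function names `affine_agu` and `non_affine_addresses`, the parameter names, and the example access pattern `k*k + i` are all assumptions chosen for illustration. The first function models a simple unit that steps through an affine function of up to two loop variables within given bounds; the second models a shared, more capable resource that computes a non-affine term once, outside the inner loop, and hands it to the simple unit as a base address.

```python
from typing import Iterator, List


def affine_agu(base: int, stride_i: int, n_i: int,
               stride_j: int = 0, n_j: int = 1) -> Iterator[int]:
    """Hypothetical model of a low-cost AGU.

    Yields base + i*stride_i + j*stride_j for 0 <= i < n_i and
    0 <= j < n_j, i.e. an affine function of at most two variables.
    """
    for i in range(n_i):
        for j in range(n_j):
            yield base + i * stride_i + j * stride_j


def non_affine_addresses(n_outer: int, n_inner: int) -> List[int]:
    """Hypothetical model of the shared, more complex resource.

    The pattern k*k + i is non-affine in k. For a fixed k the term k*k
    is a constant, so it is computed once outside the inner loop and
    used as the base address that configures the simple AGU.
    """
    addresses: List[int] = []
    for k in range(n_outer):
        base = k * k  # non-affine factor, evaluated outside the inner loop
        addresses.extend(affine_agu(base=base, stride_i=1, n_i=n_inner))
    return addresses


if __name__ == "__main__":
    # Two rows of three consecutive words, rows four words apart.
    print(list(affine_agu(base=0, stride_i=4, n_i=2, stride_j=1, n_j=3)))
    # Non-affine outer term, affine inner loop.
    print(non_affine_addresses(n_outer=3, n_inner=2))
```

In this sketch the inner generator never sees the non-affine expression; it only receives a precomputed base and bounds, which mirrors the division of labor the abstract describes between the per-port AGUs and the shared resource.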


