SwePub
Search the SwePub database


Hit list for the search "WFRF:(Modarressi Mehdi)"

Search: WFRF:(Modarressi Mehdi)

  • Result 1-7 of 7
1.
  • Bidgoli, Ali M., et al. (author)
  • NeuroPIM : Flexible Neural Accelerator for Processing-in-Memory Architectures
  • 2023
  • In: Proceedings - 2023 26th International Symposium on Design and Diagnostics of Electronic Circuits and Systems, DDECS 2023. - : Institute of Electrical and Electronics Engineers Inc. - 9798350332773 ; , s. 51-56
  • Conference paper (peer-reviewed) abstract
    • The performance of microprocessors under many modern workloads is mainly limited by the off-chip memory bandwidth. The emerging processing-in-memory paradigm presents a unique opportunity to reduce data movement overheads by moving computation closer to memory. State-of-the-art processing-in-memory proposals stack a logic layer on top of one or multiple memory layers in a 3D fashion and leverage the logic layer to build near-memory processing units. Such processing units are either application-specific accelerators or general-purpose cores. In this paper, we present NeuroPIM, a new processing-in-memory architecture that uses a neural network as the memory-side general-purpose accelerator. This design is mainly motivated by the observation that in many real-world applications, some program regions, or even the entire program, can be replaced by a neural network that is trained to approximate the program's output. NeuroPIM benefits from both the flexibility of general-purpose processors and the superior performance of application-specific accelerators. Experimental results show that NeuroPIM provides up to 41% speedup over a processor-side neural network accelerator and up to 8x speedup over a general-purpose processor. A toy sketch of this idea follows below.
  •  
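The following toy sketch (my own simplification, not the authors' implementation) illustrates the core idea behind the abstract above: a small numpy MLP is trained to approximate a hypothetical program_region() function, so the learned model could stand in for the original code on a memory-side accelerator. All names, sizes, and hyperparameters are illustrative assumptions.

```python
# Toy illustration of replacing a program region with a learned approximator.
import numpy as np

def program_region(x):
    """The exact computation we want to approximate (hypothetical example)."""
    return np.sin(x) + 0.5 * x

# Training data sampled from the region's input domain.
rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=(2000, 1))
y = program_region(x)

# One-hidden-layer MLP trained with plain full-batch gradient descent.
hidden = 32
W1 = rng.normal(scale=0.5, size=(1, hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.5, size=(hidden, 1)); b2 = np.zeros(1)
lr = 0.05
for _ in range(5000):
    h = np.tanh(x @ W1 + b1)              # forward pass
    pred = h @ W2 + b2
    err = pred - y                        # mean-squared-error gradient
    gW2 = h.T @ err / len(x); gb2 = err.mean(axis=0)
    gh = err @ W2.T * (1 - h ** 2)        # backprop through tanh
    gW1 = x.T @ gh / len(x); gb1 = gh.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

def neural_substitute(x):
    """Approximate replacement for program_region(), suitable for offloading."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

test = np.array([[1.5]])
print(program_region(test), neural_substitute(test))  # outputs should be roughly close
```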
2.
  • Dabiri, Bita, et al. (author)
  • Network-on-ReRAM for Scalable Processing-in-Memory Architecture Design
  • 2021
  • In: Proceedings - 2021 24th Euromicro Conference on Digital System Design, DSD 2021. - 9781665427036 ; , s. 143-149
  • Conference paper (peer-reviewed) abstract
    • The non-volatile metal-oxide resistive random access memory (ReRAM) is an emerging alternative to current memory technologies. The unique capability of ReRAM to perform analog and digital arithmetic and logic operations has enabled this technology to incorporate both computation and memory capabilities in the same unit. Due to this interesting property, there is a growing trend in recent years to implement emerging data-intensive applications on ReRAM structures. A typical ReRAM-based processing-in-memory architecture may consist of tens to hundreds of ReRAM units (mats) that can either store or process data. To support such a large-scale ReRAM structure, this paper proposes a scalable network-on-ReRAM architecture. The proposed network employs a novel associative router architecture designed around ReRAM-based content-addressable memories. With its in-memory packet processing capability, this router yields higher throughput and resource utilization levels than a conventional router. The router is technology-compatible with ReRAM and, as our evaluations show, employing it to build a network-on-ReRAM makes the emerging ReRAM-based processing-in-memory architectures more scalable and performance-efficient. A small illustrative model follows this entry.
  •  
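The sketch below is a loose software model of the associative (CAM-style) routing lookup mentioned in the abstract above; the class name, table contents, and match semantics are my own illustrative assumptions, not the paper's router design.

```python
# Simplified model of a content-addressable routing lookup.

class AssociativeRouteTable:
    """Mimics a ReRAM content-addressable memory: every stored entry is
    compared against the search key at once; the first match wins."""

    def __init__(self):
        self.entries = []  # list of (destination_id, output_port)

    def program(self, destination_id, output_port):
        self.entries.append((destination_id, output_port))

    def lookup(self, destination_id):
        # A real CAM performs all comparisons in parallel inside the array;
        # here we only model the match semantics, sequentially.
        for stored_dest, port in self.entries:
            if stored_dest == destination_id:
                return port
        return None  # no match: e.g. fall back to a default/escape port

table = AssociativeRouteTable()
table.program(destination_id=0x2A, output_port="north")
table.program(destination_id=0x3B, output_port="east")
print(table.lookup(0x2A))  # -> "north"
```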
3.
  •  
4.
  • Mahdiani, Hoda, et al. (author)
  • ΔNN : Power-Efficient Neural Network Acceleration Using Differential Weights
  • 2020
  • In: IEEE Micro. - : IEEE COMPUTER SOC. - 0272-1732 .- 1937-4143. ; 40:1, s. 67-74
  • Journal article (peer-reviewed) abstract
    • The enormous and ever-increasing complexity of state-of-the-art neural networks has impeded the deployment of deep learning on resource-limited embedded and mobile devices. To reduce the complexity of neural networks, this article presents Delta NN, a power-efficient architecture that leverages a combination of the approximate value locality of neuron weights and the algorithmic structure of neural networks. Delta NN keeps each weight as its difference (Delta) to the nearest smaller weight: each weight reuses the calculations of the smaller weight, followed by a calculation on the Delta value to make up the difference. We also round the Delta up or down to the closest power of two to further reduce complexity. The experimental results show that Delta NN boosts the average performance by 14%-37% and reduces the average power consumption by 17%-49% over some state-of-the-art neural network designs. A brief sketch of the weight encoding follows below.
  •  
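The sketch below illustrates the differential-weight encoding as I read it from the abstract above (my own simplification, not the authors' code): each weight is stored as its difference to the nearest smaller weight, and that difference is rounded to the closest power of two so it could be applied with a cheap shift.

```python
# Minimal sketch of delta-encoding weights with power-of-two deltas.
import math

def delta_encode(weights):
    """Return (base, deltas) where deltas[i] is a power-of-two approximation
    of the gap between consecutive ascending-sorted weights."""
    ordered = sorted(weights)
    base = ordered[0]
    deltas = []
    for prev, cur in zip(ordered, ordered[1:]):
        diff = cur - prev
        if diff == 0:
            deltas.append(0.0)
        else:
            exp = round(math.log2(diff))   # round delta to nearest power of two
            deltas.append(2.0 ** exp)
    return base, deltas

def delta_decode(base, deltas):
    """Rebuild approximate weights by accumulating the rounded deltas,
    mirroring how each weight reuses the smaller weight's computation."""
    weights = [base]
    for d in deltas:
        weights.append(weights[-1] + d)
    return weights

original = [0.11, 0.13, 0.20, 0.34, 0.35]
base, deltas = delta_encode(original)
print(delta_decode(base, deltas))  # approximate reconstruction of the sorted weights
```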
5.
  • Rezaei, Seyyed Hossein Seyyedaghaei, et al. (author)
  • A Three-Dimensional Networks-on-Chip Architecture with Dynamic Buffer Sharing
  • 2016
  • In: 2016 24TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP). - 9781467387767 ; , s. 771-776
  • Conference paper (peer-reviewed) abstract
    • 3D integration is a practical solution for overcoming the failure of Dennard scaling in future technology generations. This emerging technology stacks several die slices on top of each other on a single chip in order to provide higher bandwidth and lower latency than a 2D design, thanks to the extremely short inter-layer distances in the third dimension. In this paper, we leverage the low-latency vertical links to address buffer management, one of the most important design and management issues in Network-on-Chip (NoC). To this end, we present VerBuS, an architecture for 3D routers with Vertical BUffer Sharing capability enabled by the ultra-low-latency vertical links of a 3D chip. VerBuS can share virtual channels (VCs) between vertically stacked routers. This way, the buffering capacity of a highly loaded router is increased by using idle VCs of vertically adjacent routers. Experimental results show up to 20% improvement in NoC performance metrics over state-of-the-art 3D router designs. A short behavioural sketch follows this entry.
  •  
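The following is a rough behavioural sketch of vertical virtual-channel sharing as described in the abstract above; the data structures and policy are hypothetical illustrations, not the VerBuS microarchitecture.

```python
# A loaded router that runs out of local VC buffers borrows an idle VC
# from a router stacked directly above or below it.

class Router:
    def __init__(self, name, num_vcs=4, vc_depth=4):
        self.name = name
        self.vcs = [[] for _ in range(num_vcs)]  # each VC is a bounded flit FIFO
        self.vc_depth = vc_depth
        self.vertical_neighbors = []              # routers directly above/below

    def _free_vc(self):
        for vc in self.vcs:
            if len(vc) < self.vc_depth:
                return vc
        return None

    def accept_flit(self, flit):
        vc = self._free_vc()
        if vc is not None:
            vc.append(flit)
            return f"{self.name}: buffered locally"
        # all local VCs are full: try to borrow an idle VC over the
        # low-latency vertical link
        for neighbor in self.vertical_neighbors:
            vc = neighbor._free_vc()
            if vc is not None:
                vc.append(flit)
                return f"{self.name}: borrowed a VC from {neighbor.name}"
        return f"{self.name}: flit stalled (no buffer available)"

lower = Router("R(0,0,0)", num_vcs=1, vc_depth=1)
upper = Router("R(0,0,1)")
lower.vertical_neighbors.append(upper)
print(lower.accept_flit("flit-A"))  # buffered locally
print(lower.accept_flit("flit-B"))  # borrowed a VC from R(0,0,1)
```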
6.
  • Rezaei, Seyyed Hossein Seyyedaghaei, et al. (author)
  • Fault-Tolerant 3-D Network-on-Chip Design using Dynamic Link Sharing
  • 2016
  • In: PROCEEDINGS OF THE 2016 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE). - : IEEE conference proceedings. - 9783981537079 ; , s. 1195-1200
  • Conference paper (peer-reviewed) abstract
    • The most important challenge in the emerging 3D integration technology is the higher temperature, particularly in the layers that are more distant from the heat sink, compared to planar 2D chips. High temperature, in turn, increases a circuit's susceptibility to permanent and intermittent faults. On the other hand, the fast and high-bandwidth vertical links in the 3D integration technology have opened new horizons for network-on-chip (NoC) design innovations. In this paper, we leverage these ultra-low-latency vertical links to design a fault-tolerant 3D NoC architecture. In this architecture, permanent and intermittent defects on links and crossbars are bypassed by borrowing the idle bandwidth of vertically adjacent links and crossbars. Evaluation results under synthetic and realistic workloads show that the proposed fault-tolerance mechanism offers higher reliability and lower performance loss when compared with state-of-the-art fault-tolerant 3D NoC designs. A small illustrative sketch follows below.
  •  
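The tiny decision sketch below is my own simplification of the fault-bypass idea in the abstract above: when a link is marked faulty, the flit is detoured over the fast vertical link and sent on the corresponding link of the adjacent layer if that link is idle. The function and its parameters are hypothetical.

```python
# Illustrative fault-bypass decision for one output direction of a 3D router.

def forward(flit, link_faulty, neighbor_link_busy):
    """Decide how a flit leaves the router on a given output direction."""
    if not link_faulty:
        return "sent on local link"
    if not neighbor_link_busy:
        # borrow the idle bandwidth of the vertically adjacent router's link
        return "detoured via vertical link, sent on neighbor's link"
    return "stalled until a link becomes available"

print(forward("flit-A", link_faulty=True, neighbor_link_busy=False))
```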
7.
  • SeyyedAghaei Rezaei, S. H., et al. (author)
  • NoM : Network-on-Memory for Inter-bank Data Transfer in Highly-banked Memories
  • 2020
  • In: IEEE Computer Architecture Letters. - : Institute of Electrical and Electronics Engineers (IEEE). - 1556-6056. ; 19:1, s. 80-83
  • Journal article (peer-reviewed) abstract
    • Data copy is a widely used memory operation in many programs and operating system services. In conventional computers, data copy is often carried out by two separate read and write transactions that pass data back and forth between the memory hierarchy and processor registers. Some prior mechanisms propose to avoid this unnecessary data movement by using the shared internal bus in the DRAM chip to directly copy data between two DRAM banks. While these methods exhibit superior performance compared to conventional techniques, they do not allow data copy across different DRAM channels. Hence, they have limited benefit for the emerging 3D-stacked memories (such as HMC and HBM) that contain tens of banks across multiple memory controllers. In this paper, we present Network-on-Memory (NoM), a lightweight inter-bank communication scheme that enables direct data copy within memory. NoM adopts a TDM-based circuit-switching design, where circuit setup is done by the memory controller. Compared to previous state-of-the-art approaches, NoM enables both data copy over multiple DRAM channels and concurrent copy operations. Our evaluation shows that NoM improves the performance of data-intensive workloads by 3.8X on average compared to the state-of-the-art techniques. A brief sketch of the slot-reservation idea follows this entry.
  •  
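The sketch below illustrates a TDM-style circuit setup for inter-bank copies in the spirit of the abstract above; the slot-table layout, frame size, and link names are assumptions for illustration, not NoM's actual design.

```python
# The memory controller reserves one common time slot on every link along
# the source-to-destination path before a copy starts streaming.

NUM_SLOTS = 8  # slots per TDM frame (assumed)

class CopyScheduler:
    def __init__(self, links):
        # slot_table[link][slot] holds the copy occupying that slot, or None
        self.slot_table = {link: [None] * NUM_SLOTS for link in links}

    def setup_circuit(self, copy_id, path):
        """Find a slot index that is free on every link of the path and
        reserve it, so the copy proceeds without further arbitration."""
        for slot in range(NUM_SLOTS):
            if all(self.slot_table[link][slot] is None for link in path):
                for link in path:
                    self.slot_table[link][slot] = copy_id
                return slot
        return None  # no common free slot: the copy must wait

links = ["bank0-sw0", "sw0-sw1", "sw1-bank5", "sw1-bank7"]
sched = CopyScheduler(links)
print(sched.setup_circuit("copyA", ["bank0-sw0", "sw0-sw1", "sw1-bank5"]))  # -> 0
print(sched.setup_circuit("copyB", ["sw0-sw1", "sw1-bank7"]))               # -> 1 (concurrent copy)
```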
  • Result 1-7 of 7


 