SwePub

Results list for search "WFRF:(Hemani Ahmed)"

  • Results 181-190 of 284
181.
  • Malik, Omer, et al. (author)
  • High Level Synthesis Framework for a Coarse Grain Reconfigurable Architecture
  • 2010
  • In: 28th Norchip Conference, NORCHIP 2010. - 9781424489732 ; , p. 5669439-
  • Conference paper (peer-reviewed)
    • A High Level Synthesis Framework for mapping DSP algorithms on a Coarse Grain Reconfigurable Architecture is presented. The behavioral specification of the algorithm is written in C and annotated with pragmas in comments, and the tool generates configware after performing timing and synchronization synthesis. Pragmas identify SIMD type concurrency and sweep the architectural space with allocation and binding annotations to produce implementations from fully serial to fully parallel. This allows the user to stay at the algorithmic level and guide the HLS tool to search a restricted architectural space bounded by the pragmas, thus making the synthesis process more efficient and predictable.
  •  
182.
  •  
183.
  • Malik, Omer, et al. (author)
  • Synchronizing distributed state machines in a coarse grain reconfigurable architecture
  • 2011
  • In: 2011 International Symposium on System on Chip, SoC 2011. - 9781457706721 ; , pp. 128-135
  • Conference paper (peer-reviewed)
    • This work presents a methodology for synchronizing distributed FSMs (Finite State Machines) that are generated while implementing different algorithms on a coarse grain reconfigurable architecture. These FSMs interact with and depend upon each other while executing algorithms; thus they need to be synchronized with each other to execute correctly. The algorithms presented in this paper make appropriate use of the different strategies available for synchronizing these FSMs. The tool hides all low-level details from the programmer. It lets the designer focus on the details of the algorithm (at a higher level of abstraction), and cycle-by-cycle timings are resolved automatically.
  •  
184.
  •  
185.
  •  
186.
  • Meincke, Thomas, et al. (author)
  • Globally asynchronous locally synchronous architecture for large high-performance ASICs
  • 1999
  • In: ; 2, pp. 512-515
  • Conference paper (peer-reviewed)
    • Clock nets are the major source of power consumption in large, high-performance ASICs and a design bottleneck when it comes to tolerable clock skew. A way to obviate the global clock net is to partition the design into large synchronous blocks, each having its own clock. Data is exchanged with other blocks asynchronously using handshake signals. Adopting such a strategy requires a methodology that supports: 1) a partitioning method dividing a design into the number of synchronous blocks such that the gain due to global clock net removal exceeds the communication overhead and 2) synthesis of handshake protocols to implement the data transfer between synchronous blocks. We describe this methodology and present results of applying it to a realistic design done in 0.25 micron, ranging in operating frequencies from 20 MHz to 1 GHz. The results show that the net power savings compared to fully synchronous designs are on average about 30%.
  •  
187.
  • Mirsalari, Seyed Ahmad, et al. (author)
  • Optimizing Self-Organizing Maps for Bacterial Genome Identification on Parallel Ultra-Low-Power Platforms
  • 2023
  • In: ICECS 2023 - 2023 30th IEEE International Conference on Electronics, Circuits and Systems: Technosapiens for Saving Humanity. - : Institute of Electrical and Electronics Engineers (IEEE).
  • Conference paper (peer-reviewed)
    • Pathogenic bacteria significantly threaten human health, highlighting the need for precise and efficient methods for swiftly identifying bacterial species. This paper addresses the challenges associated with performing genomics computations for pathogen identification on embedded systems with limited computational power. We propose an optimized implementation of Self-Organizing Maps (SOMs) targeting a parallel ultra-low-power platform based on the RISC-V instruction set architecture. We propose two mapping methods for implementing the SOM algorithm on a parallel cluster, coupled with software techniques to improve the throughput. Orthogonally to parallelization, we investigate the impact of smaller-than-32-bit floating-point formats (smallFloats) on energy savings, precision, and performance. Our experimental results show that all smallFloat formats exhibit a 100% classification accuracy. The parallel variants achieve a speed-up of 1.98×, 3.79×, and 6.83× on 2, 4, and 8 cores, respectively. Compared with a 16-bit fixed-point implementation on a coarse grain reconfigurable architecture (CGRA), the FP8 implementation achieves, on average, 1.42× energy efficiency, 1.51× speed-up, and a 50% reduction in memory footprint. Furthermore, FP8 vectorization increases the average speed-up by 2.5×.
  •  
188.
  • Ngyen, T., et al. (author)
  • FIST : A framework to interleave spiking neural networks on CGRAs
  • 2015
  • In: Proceedings - 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2015. - : IEEE. ; , pp. 751-758
  • Conference paper (peer-reviewed)
    • Coarse Grained Reconfigurable Architectures (CGRAs) are emerging as enabling platforms to meet the high performance demanded by modern embedded applications. In many application domains (e.g. robotics and cognitive embedded systems), the CGRAs are required to simultaneously host processing (e.g. audio/video acquisition) and estimation (e.g. audio/video/image recognition) tasks. Recent works have revealed that the efficiency and scalability of the estimation algorithms can be significantly improved by using neural networks. However, existing CGRAs commonly employ homogeneous processing resources for both tasks. To realize the best of both worlds (conventional processing and neural networks), we present FIST. FIST allows the processing elements and the network to dynamically morph into either a conventional CGRA or a neural network, depending on the hosted application. We have chosen the DRRA as a vehicle to study the feasibility and overheads of our approach. Synthesis results reveal that the proposed enhancements incur negligible overheads (4.4% area and 9.1% power) compared to the original DRRA cell.
  •  
189.
  • Nidhi, U., et al. (author)
  • High performance 3D-FFT implementation
  • 2013
  • In: Circuits and Systems (ISCAS), 2013 IEEE International Symposium on. - : IEEE. - 9781467357609 ; , pp. 2227-2230
  • Conference paper (peer-reviewed)
    • 3D FFT is a highly data- and compute-intensive kernel encountered in many applications. We report a high-performance design and implementation of 3D-FFT on a CGRA which supports partial reconfiguration. The hardware/software multi-clock design uses dynamic reconfiguration to reduce the required communication bandwidth and achieves a sustained throughput of 40 GOPS on a word size of 48 bits. Performance metrics, including overheads and speed-up over software, for implementations of up to 256-point 3D-FFT are presented in the paper.
  •  
190.
  •  