SwePub
Hit list for search "WFRF:(Plosila J.)"

  • Results 1-50 of 64
1.
  • Mohamed, S. A. S., et al. (author)
  • Asynchronous Corner Tracking Algorithm Based on Lifetime of Events for DAVIS Cameras
  • 2020
  • In: 15th International Symposium on Visual Computing, ISVC 2020. - Cham : Springer Science and Business Media Deutschland GmbH. ; , pp. 530-541
  • Conference paper (peer-reviewed)
    • Event cameras, i.e., the Dynamic and Active-pixel Vision Sensor (DAVIS) ones, capture the intensity changes in the scene and generate a stream of events in an asynchronous fashion. The output rate of such cameras can reach up to 10 million events per second in highly dynamic environments. DAVIS cameras use novel vision sensors that mimic human eyes. Their attractive attributes, such as high output rate, High Dynamic Range (HDR), and high pixel bandwidth, make them an ideal solution for applications that require high-frequency tracking. Moreover, applications that operate in challenging lighting scenarios can benefit from the high dynamic range of event cameras, i.e., 140 dB compared to 60 dB of traditional cameras. In this paper, a novel asynchronous corner tracking method is proposed that uses both events and intensity images captured by a DAVIS camera. The Harris algorithm is used to extract features, i.e., frame-corners from keyframes, i.e., intensity images. Afterward, a matching algorithm is used to extract event-corners from the stream of events. Events are solely used to perform asynchronous tracking until the next keyframe is captured. Neighboring events, within a window size of 5 × 5 pixels around the event-corner, are used to calculate the velocity and direction of extracted event-corners by fitting a 2D plane using a randomized Hough transform algorithm. Experimental evaluation showed that our approach is able to update the location of the extracted corners up to 100 times during the blind time of traditional cameras, i.e., between two consecutive intensity images.
2.
  • Mohamed, S. A. S., et al. (author)
  • DBA-Filter : A Dynamic Background Activity Noise Filtering Algorithm for Event Cameras
  • 2022
  • In: Proceedings of the 2021 Computing Conference, Volume 1. - Cham : Springer Science and Business Media Deutschland GmbH. ; , pp. 685-696
  • Conference paper (peer-reviewed)
    • Newly emerged dynamic vision sensors (DVS) offer a great potential over traditional sensors (e.g. CMOS) since they have a high temporal resolution in the order of μs, ultra-low power consumption and high dynamic range up to 140 dB compared to 60 dB in frame cameras. Unlike traditional cameras, the output of DVS cameras is a stream of events that encodes the location of the pixel, time, and polarity of the brightness change. An event is triggered when the change of brightness, i.e. log intensity, of a pixel exceeds a certain threshold. The output of event cameras often contains a significant amount of noise (outlier events) alongside the signal (inlier events). The main cause of that is transistor switch leakage and noise. This paper presents a dynamic background activity filtering, called DBA-filter, for event cameras based on an adaptation of the K-nearest neighbor (KNN) algorithm and the optical flow. Results show that the proposed algorithm is able to achieve a high signal to noise ratio up to 13.64 dB. 
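The event-triggering rule described in the abstract above (an event fires when the log intensity of a pixel changes by more than a threshold, and carries pixel location, time, and polarity) can be sketched as follows. This is a simplified illustration, not the DBA-filter itself: the `Event` type, the function name, and the threshold value 0.2 are assumptions, and at most one event is emitted per sample even for large changes.

```python
import math
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    x: int         # pixel column
    y: int         # pixel row
    t: float       # timestamp in seconds
    polarity: int  # +1 for brighter, -1 for darker


def events_for_pixel(
    x: int,
    y: int,
    samples: List[Tuple[float, float]],
    threshold: float = 0.2,  # hypothetical contrast threshold
) -> List[Event]:
    """Emit an event whenever the log intensity has moved by at least
    `threshold` from the level stored at the previous event.

    `samples` is a list of (timestamp, intensity) pairs with intensity > 0.
    """
    events: List[Event] = []
    _, first_intensity = samples[0]
    reference = math.log(first_intensity)
    for t, intensity in samples[1:]:
        delta = math.log(intensity) - reference
        if abs(delta) >= threshold:
            events.append(Event(x, y, t, 1 if delta > 0 else -1))
            reference = math.log(intensity)  # reset the reference level
    return events


samples = [(0.0, 100.0), (0.001, 105.0), (0.002, 130.0), (0.003, 128.0), (0.004, 90.0)]
events = events_for_pixel(3, 7, samples)
for event in events:
    print(event)
```

With these made-up samples, the small fluctuations (100 to 105, 130 to 128) produce no events, while the jump to 130 and the drop to 90 each produce one.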
3.
  • Mohamed, S. A. S., et al. (author)
  • Dynamic resource-aware corner detection for bio-inspired vision sensors
  • 2020
  • In: 2020 25th International Conference on Pattern Recognition, (ICPR). - : Institute of Electrical and Electronics Engineers (IEEE). ; , pp. 10465-10472
  • Conference paper (peer-reviewed)
    • Event-based cameras are vision devices that transmit only brightness changes with low latency and ultra-low power consumption. Such characteristics make event-based cameras attractive in the field of localization and object tracking in resource-constrained systems. Since the number of generated events in such cameras is huge, the selection and filtering of the incoming events are beneficial for both increasing the accuracy of the features and reducing the computational load. In this paper, we present an algorithm to detect asynchronous corners from a stream of events in real-time on embedded systems. The algorithm is called the Three Layer Filtering-Harris or TLF-Harris algorithm. The algorithm is based on an events' filtering strategy whose purpose is 1) to increase the accuracy by deliberately eliminating some incoming events, i.e., noise and 2) to improve the real-time performance of the system, i.e., preserving a constant throughput in terms of input events per second, by discarding unnecessary events with a limited accuracy loss. An approximation of the Harris algorithm, in turn, is used to exploit its high-quality detection capability with a low-complexity implementation to enable seamless real-time performance on embedded computing platforms. The proposed algorithm is capable of selecting the best corner candidate among neighbors and achieves an average execution time savings of 59% compared with the conventional Harris score. Moreover, our approach outperforms the competing methods, such as eFAST, eHarris, and FA-Harris, in terms of real-time performance, and surpasses Arc* in terms of accuracy.
4.
  • Yasin, J. N., et al. (author)
  • Dynamic Formation Reshaping Based on Point Set Registration in a Swarm of Drones
  • 2021
  • In: Advances in Intelligent Systems and Computing. - Cham : Springer Nature. ; , pp. 577-588
  • Conference paper (peer-reviewed)
    • This work focuses on formation reshaping in an optimized manner in an autonomous swarm of drones. Here, the two main problems are: 1) how to break and reshape the initial formation in an optimal manner, and 2) how to do such reformation while minimizing the overall deviation of the drones and the overall time, i.e. without slowing down. To address the first problem, we introduce a set of routines for the drones/agents to follow while reshaping to a secondary formation shape. The second problem is resolved by utilizing the temperature function reduction technique, originally used in the point set registration process. The goal is to be able to dynamically reform the shape of a multi-agent based swarm in a near-optimal manner while going through narrow openings between, for instance, obstacles, and then bringing the agents back to their original shape after passing through the narrow passage using the point set registration technique.
5.
  • Yasin, J. N., et al. (author)
  • Low-cost ultrasonic based object detection and collision avoidance method for autonomous robots
  • 2021
  • In: International Journal of Information Technology (Singapore). - : Springer Nature. - 2511-2104 .- 2511-2112. ; 13:1, pp. 97-107
  • Journal article (peer-reviewed)
    • This work focuses on the development of an effective collision avoidance algorithm that detects and avoids obstacles autonomously in the vicinity of a potential collision by using a single ultrasonic sensor and controlling the movement of the vehicle. The objectives are to minimise the deviation from the vehicle’s original path and to develop an algorithm utilising one of the cheapest sensors available for very low-cost systems. For instance, in a scenario where the main ranging sensor malfunctions, a backup low-cost sensor is required for safe navigation of the vehicle while keeping the deviation to a minimum. The developed algorithm utilises only one ultrasonic sensor and approximates the front shape of the detected object by sweeping the sensor mounted on top of the unmanned vehicle. In this proposed approach, the sensor is rotated for shape approximation and edge detection instead of moving the robot around the encountered obstacle. It has been tested in various indoor situations using different shapes of objects, stationary objects, moving objects, and soft or irregularly shaped objects. The results show that the algorithm provides satisfactory outcomes by entirely avoiding obstacles and rerouting the vehicle with a minimal deviation.
6.
  • Yasin, J N, et al. (author)
  • Navigation of Autonomous Swarm of Drones Using Translational Coordinates
  • 2020
  • In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - Cham : Springer. ; , pp. 353-362
  • Conference paper (other academic/artistic)
    • This work focuses on an autonomous swarm of drones, a multi-agent system, where the leader agent has the capability of intelligent decision making while the other agents in the swarm follow the leader blindly. The proposed algorithm helps with cost cutting especially in the multi-drone systems, i.e., swarms, by reducing the power consumption and processing requirements of each individual agent. It is shown that by applying a pre-specified formation design with feedback cross-referencing between the agents, the swarm as a whole can not only maintain the desired formation and navigate but also avoid collisions with obstacles and other drones. Furthermore, the power consumed by the nodes in the considered test scenario, is reduced by 50% by utilising the proposed methodology. 
7.
  • Guang, L., et al. (author)
  • Coarse and fine-grained monitoring and reconfiguration for energy-efficient NoCs
  • 2012
  • In: System on Chip (SoC), 2012 International Symposium on. - : IEEE. - 9781467328951 ; , p. 6376351-
  • Conference paper (peer-reviewed)
    • Comparative evaluations of centralized, clustered and distributed architectures, for energy management in NoCs, are presented. The paper starts with the systematic examination of the monitoring, decision-making, and reconfiguration processes in building coarse and fine-grained self-adaptation architectures. With examining the physical support in modern technology, network-wide, cluster-wide and per-node energy-management architectures on NoCs are presented, utilizing either voltage regulators or multiple on-chip power delivery networks (MPNs). To identify the effectiveness and efficiency of energy-performance tradeoffs, extensive quantitative simulations are performed with various temporal and spatially changing traffics. Based on the results, we can first observe that the centralized architecture can not adapt to the traffic's spatial locality for effective energy-performance tradeoff. Second, the distributed energy management has the lowest energy-delay product mostly attributed to the fast voltage switching of MPNs, while the synchronization incurs noticeable energy overhead. The clustered architecture, last but not least, is a suitable alternative when the advanced MPN technology is not available. It has low energy and energy-delay product, with very small energy overhead from the monitoring communication.
8.
  • Guang, L., et al. (author)
  • HLS-DoNoC : High-level simulator for dynamically organizational NoCs
  • 2012
  • In: Design and Diagnostics of Electronic Circuits & Systems (DDECS), 2012 IEEE 15th International Symposium on. - : IEEE. - 9781467311854 ; , pp. 89-94
  • Conference paper (peer-reviewed)
    • A high-level simulator is presented for the design and analysis of dynamically organizational Networks-on-Chip (DoNoCs). The DoNoC is able to organize statically or dynamically different network nodes for run-time coarse and fine grained reconfiguration, in particular power management. As an important step in the design flow, a simulator for early-stage design exploration is the focus of the paper. Built upon classic wormhole-based NoC architecture, the simulator is capable of experimenting diverse run-time monitoring and reconfiguration methods. In particular, dynamic clusterization can be performed with inter-cluster interfaces properly configured at the run-time. The simulator is flit-level accurate, trace-driven, and easy-to-reconfigure. It supports both synchronous and ratiochronous timing, and can provide the communication performance and power/energy consumption. The paper demonstrates the usage of the simulator in the design of various cluster-based power management schemes.
9.
  • Guang, L., et al. (author)
  • Survey of self-adaptive NoCs with energy-efficiency and dependability
  • 2012
  • In: International Journal of Embedded and Real-Time Communication Systems. - : IGI Global. - 1947-3176 .- 1947-3184. ; 3:2, pp. 1-22
  • Research review (peer-reviewed)
    • The self-adaptive Network-on-Chip (NoC) is a promising communication architecture for massively parallel embedded systems. With constant technology scaling and the consequent stronger influence of process variations, the necessity of run-time monitoring and adaptive reconfiguration becomes widely acknowledged. This article presents a survey of existing techniques and methods, in particular for energy efficiency and dependability. The article firstly examines the motivation of self-adaptive computing in parallel embedded systems. A self-adaptive system model is abstracted, which is composed of goals, monitoring interface, and self-adaptation. Based on the model, the authors extensively survey previous works addressing adaptive NoCs with different monitoring techniques and reconfiguration methods, for power/energy optimization and dependability enhancement. Several design examples are elaborated which serve proper guiding purposes. The authors also identify important issues which are often overlooked or deserve more attention. The article provides review and insight for future design on this topic.
10.
  • Jafri, Syed, et al. (author)
  • Implementation and evaluation of configuration scrubbing on CGRAs : A case study
  • 2013
  • In: 2013 International Symposium on System-on-Chip, SoC 2013 - Proceedings. - : IEEE Computer Society. ; , p. 6675262-
  • Conference paper (peer-reviewed)
    • This paper investigates the overhead imposed by various configuration scrubbing techniques used in fault-tolerant Coarse Grained Reconfigurable Arrays (CGRAs). Today, reconfigurable architectures host large configuration memories. As we progress further in the nanometer regime, these configuration memories have become increasingly susceptible to single event upsets caused e.g. by cosmic radiation. Configuration scrubbing is a frequently used technique to protect these configuration memories against single event upsets. Existing works on configuration scrubbing deal only with FPGAs without any reference to CGRAs (in which configuration memories consume up to 50% of silicon area). Moreover, the known literature lacks a comprehensive comparison of various configuration scrubbing techniques to guide system designers about the merits/demerits of different scrubbing methods which could be applied to CGRAs. To address these problems, in this paper we classify various configuration scrubbing techniques and quantify their trade-offs when implemented on a CGRA. Synthesis results reveal that scrubbing logic incurs negligible silicon overhead (up to 3% of the area of computational units). Simulation results obtained for a few algorithms/applications (FFT, FIR, matrix multiplication, and WLAN) show that the choice of the configuration scrubbing scheme (external vs. internal) has significant impact on both the size of configuration memory and the number of reconfiguration cycles (respectively 20-80% more and up to 38 times more for the former).
11.
  • Mohamed, S. A. S., et al. (author)
  • A Survey on Odometry for Autonomous Navigation Systems
  • 2019
  • In: IEEE Access. - : Institute of Electrical and Electronics Engineers Inc.. - 2169-3536. ; 7, pp. 97466-97486
  • Journal article (peer-reviewed)
    • The development of a navigation system is one of the major challenges in building a fully autonomous platform. Full autonomy requires a dependable navigation capability not only in a perfect situation with clear GPS signals but also in situations, where the GPS is unreliable. Therefore, self-contained odometry systems have attracted much attention recently. This paper provides a general and comprehensive overview of the state of the art in the field of self-contained, i.e., GPS denied odometry systems, and identifies the out-coming challenges that demand further research in future. Self-contained odometry methods are categorized into five main types, i.e., wheel, inertial, laser, radar, and visual, where such categorization is based on the type of the sensor data being used for the odometry. Most of the research in the field is focused on analyzing the sensor data exhaustively or partially to extract the vehicle pose. Different combinations and fusions of sensor data in a tightly/loosely coupled manner and with filtering or optimizing fusion method have been investigated. We analyze the advantages and weaknesses of each approach in terms of different evaluation metrics, such as performance, response time, energy efficiency, and accuracy, which can be a useful guideline for researchers and engineers in the field. In the end, some future research challenges in the field are discussed.
12.
  • Nigussie, E., et al. (author)
  • Boosting performance of self-timed delay-insensitive bit parallel on-chip interconnects
  • 2011
  • In: IET CIRC DEVICE SYST. - : Institution of Engineering and Technology (IET). - 1751-858X. ; 5:6, pp. 505-517
  • Journal article (peer-reviewed)
    • The authors present a performance boosting technique with a better power efficiency for delay-insensitive on-chip interconnects. The increase in signal propagation delay uncertainty with technology scaling makes self-timed delay-insensitive on-chip interconnects the most appropriate alternative. However, achieving high-performance communication in self-timed delay-insensitive links is difficult, especially for large bit parallel transmission because of the time-consuming detection of each bit validity. The authors present a high-speed completion detection technique along with its circuit implementation and two on-chip interconnects which use the proposed completion detection circuit. The performance, power consumption, power efficiency and area of the presented on-chip interconnects are analysed and compared with the conventionally implemented delay-insensitive interconnects. For 64-bit parallel transmission, 2.07 and 1.72 times throughput improvement with 47 and 39% more power efficiency have been achieved for the two interconnects compared to their conventional counterparts. The interconnect circuits are designed and simulated using Cadence Analog Spectre and Hspice with 65 nm complementary metal-oxide semiconductor technology from STMicroelectronics.
13.
  • Nigussie, E., et al. (author)
  • Semi-Serial On-Chip Link Implementation for Energy Efficiency and High Throughput
  • 2012
  • In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems. - : Institute of Electrical and Electronics Engineers (IEEE). - 1063-8210 .- 1557-9999. ; 20:12, pp. 2265-2277
  • Journal article (peer-reviewed)
    • A high-throughput and low-energy semi-serial on-chip communication link based on novel design techniques and circuit solutions is presented. This self-timed link is designed using high-speed serialization/deserialization and pulse dual-rail encoding techniques. The link also employs wave-pipelined differential pulse current-mode signaling to maintain the high-speed data intake from the serializer. The energy efficiency of the proposed semi-serial link, which consists of bit-serial links in parallel, mainly comes from the sharing of the novel serializer's control circuit among the bit-serial links. In addition, the integration of pulse signaling with wave-pipelining, the use of a new low-complexity data validity detection technique, and the avoidance of data decoding logic also contribute to the power reduction. Furthermore, the formulated pulse dual-rail encoding provides an opportunity to implement pulse signaling at no cost. The ability to detect data validity at bit level allows acknowledgment per word without losing the delay-insensitivity of the transmission. The proposed semi-serial link is analyzed and compared with bit-serial and fully bit-parallel links for 64-bit data and communication distances of 1 to 8 mm. The semi-serial link which consists of eight bit-serial links provides 72.72 Gbps throughput with 286 fJ/bit energy dissipation for 8 mm transmission. It dissipates the lowest energy per bit compared to fully bit-parallel links while achieving the same throughput. The links are designed and simulated in Cadence Analog Spectre using 65-nm technology from STMicroelectronics.
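The figures quoted in the abstract above can be combined into two derived quantities with a few lines of arithmetic. The per-link rate and the implied power are not stated in the abstract; they are computed here under the assumptions that throughput is split evenly across the eight bit-serial links and that the energy per bit stays constant at full throughput.

```python
# Values taken from the abstract
throughput_bps = 72.72e9     # 72.72 Gbps aggregate throughput
energy_per_bit_j = 286e-15   # 286 fJ/bit at 8 mm
num_serial_links = 8         # eight bit-serial links in parallel

# Derived values (assumptions: even split, constant energy per bit)
per_link_gbps = throughput_bps / num_serial_links / 1e9
link_power_mw = throughput_bps * energy_per_bit_j * 1e3

print(f"per-link throughput: {per_link_gbps:.2f} Gbps")
print(f"implied power at full rate: {link_power_mw:.1f} mW")
```

Under these assumptions each bit-serial link carries roughly 9.09 Gbps, and the whole link dissipates on the order of 20.8 mW when running at the full quoted rate.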
14.
  • Tahir, A., et al. (author)
  • Active suspension system for heavy vehicles
  • 2015
  • In: 2014 International Symposium on Fundamentals of Electrical Engineering, ISFEE 2014. - 9781479968213
  • Conference paper (peer-reviewed)
    • An Active Suspension System has the capacity to introduce, accumulate, and disperse energy to the system. Depending on the functional circumstances, the system may vary its parameters. This paper seeks to explain the designing of an Active Suspension System for heavy vehicles in the form of a case study and is focused on three methodological approaches: Proportional Integral Derivative control, Linear Quadratic Regulator control, and chattering free Sliding Mode Control. The findings should make an important contribution to the field of automation and control engineering. The upshots are also accentuated to evaluate the performances of control designs.
15.
  • Daneshtalab, M., et al. (author)
  • A Low-Latency and Memory-Efficient On-chip Network
  • 2010
  • In: NOCS 2010. ; , pp. 99-106
  • Conference paper (peer-reviewed)
    • Using multiple SDRAMs in MPSoCs and NoCs to increase memory parallelism is very common nowadays. In-order delivery, resource utilization, and latency are the most critical issues in such architectures. In this paper, we present a novel network interface architecture to cope with these issues efficiently. The proposed network interface exploits a resourceful reordering mechanism to handle the in-order delivery and to increase the resource utilization. A brilliant memory controller is efficiently integrated into this network interface to improve the memory utilization and reduce both memory and network latencies. In addition, to bring compatibility with existing IP cores the proposed network interface utilizes AXI transaction based protocol. Experimental results with synthetic test cases demonstrate that the proposed architecture gives significant improvements in average network latency (12%), average memory access latency (19%), and average memory utilization (22%).
16.
  • Daneshtalab, M., et al. (author)
  • CMIT : A novel cluster-based topology for 3D stacked architectures
  • 2010
  • In: IEEE 3D System Integration Conference 2010, 3DIC 2010.
  • Conference paper (peer-reviewed)
    • Combining the benefits of 3D IC and Network-on-Chip (NoC) schemes, provides a significant performance gain for 3D stacked architectures. In recent years, Through-Silicon-Via (TSV), employed for inter-layer connectivity (vertical channel), has attracted a lot of interest since it enables faster and more power efficient inter-layer communication across multiple stacked layers. However, the area overhead of TSVs reduces wafer utilization and yield which impact design of 3D architectures using a large number of TSVs. In this paper, we propose a novel stacked topology, named CMIT (Cluster Mesh Inter-layer Topology) for 3D architectures to reduce the area overhead of TSVs and power dissipation on each layer with minimal performance penalty. Experimental results with synthetic test cases demonstrate that the presented topology can save more than 75% of TSV area footprint and reduces more than 10% of power consumption with a negligible performance overhead.
17.
  • Daneshtalab, M., et al. (author)
  • High-performance on-chip network platform for memory-on-processor architectures
  • 2011
  • In: 6th International Workshop on Reconfigurable Communication-Centric Systems-on-Chip, ReCoSoC 2011 - Proceedings.
  • Conference paper (peer-reviewed)
    • Three Dimensional Integrated Circuits (3D ICs) are emerging to improve existing Two Dimensional (2D) designs by providing smaller chip areas, higher performance and lower power consumption. Stacking memory layers on top of a multiprocessor layer (logic layer) is a potential solution to reduce wire delay and increase the bandwidth. To fully employ this capability, an efficient on-chip communication platform is required to be integrated in the logic layer. In this paper, we present an on-chip network platform for the logic layer utilizing an efficient network interface to exploit the potential bandwidth of stacked memory-on-processor architectures. Experimental results demonstrate that the platform equipped with the presented network interface increases the performance considerably.
18.
  • Daneshtalab, M., et al. (author)
  • High-Performance TSV Architecture for 3-D ICs
  • 2010
  • In: Proceedings - IEEE Annual Symposium on VLSI, ISVLSI 2010. - : Institute of Electrical and Electronics Engineers (IEEE). - 9781424473212 ; , pp. 467-468
  • Conference paper (peer-reviewed)
    • Three-dimensional integrated circuits (3-D ICs) outperform traditional planar ICs in terms of performance, packaging density, interconnection power consumption, and functionality. Since the performance of 3-D ICs employing Through Silicon Vias (TSVs) depends on vertical interlayer interconnects, in this paper we present a high-performance bus architecture for TSVs.
19.
  • Daneshtalab, M., et al. (author)
  • Input-Output Selection Based Router for Networks-on-Chip
  • 2010
  • In: IEEE Annual Symposium on VLSI, ISVLSI 2010. ; , pp. 92-97
  • Conference paper (peer-reviewed)
    • In this paper, we propose a novel on-chip router architecture for avoiding congested areas in regular two-dimensional on-chip networks. This architecture takes advantage of an efficient adaptive routing model based on the Hamiltonian path for both the multicast and unicast traffic. The output selection of the proposed architecture is based on the congestion condition of neighboring routers and the input selection is based on the Weighted Round Robin mechanism which allows packets to be serviced from each input port according to its congestion level. The simulation results show that in multicast, unicast, and mixed traffic profiles the proposed model has lower average delays and lower average and peak power compared to previously proposed models.
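The Weighted Round Robin input selection mentioned in the abstract above can be illustrated with a generic software sketch. This is textbook WRR, not the router's actual arbiter: the port names, the packet labels, and the idea of using a fixed integer weight per port (standing in for its congestion level) are assumptions, and every weight must be at least 1.

```python
from collections import deque
from typing import Deque, Dict, List


def weighted_round_robin(
    queues: Dict[str, Deque[str]], weights: Dict[str, int]
) -> List[str]:
    """Drain all queues, letting each port send up to weights[port]
    packets per round before the next port is served."""
    service_order: List[str] = []
    while any(queues.values()):  # an empty deque is falsy
        for port, queue in queues.items():
            for _ in range(weights[port]):
                if not queue:
                    break
                service_order.append(queue.popleft())
    return service_order


# Hypothetical input ports; "north" is treated as the most congested
queues = {
    "north": deque(["N0", "N1", "N2"]),
    "east": deque(["E0", "E1"]),
    "local": deque(["L0"]),
}
weights = {"north": 2, "east": 1, "local": 1}

service_order = weighted_round_robin(queues, weights)
print(service_order)
```

In the first round the "north" port is served twice and the others once, so a port with a higher weight drains faster without starving the rest.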
20.
  • Daneshtalab, M., et al. (author)
  • Memory-Efficient On-Chip Network With Adaptive Interfaces
  • 2012
  • In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. - : Institute of Electrical and Electronics Engineers (IEEE). - 0278-0070 .- 1937-4151. ; 31:1, pp. 146-159
  • Journal article (peer-reviewed)
    • To achieve higher memory bandwidth in network-based multiprocessor architectures, multiple dynamic random access memories can be accessed simultaneously. In such architectures, not only are resource utilization and latency critical issues, but a reordering mechanism is also required to deliver the response transactions of concurrent memory accesses in order. In this paper, we present a memory-efficient on-chip network architecture to cope with these issues efficiently. Each node of the network is equipped with a novel network interface (NI) to deal with out-of-order delivery, and a priority-based router to decrease the network latency. The proposed NI exploits a streamlined reordering mechanism to handle the in-order delivery and utilizes the Advanced eXtensible Interface (AXI) transaction-based protocol to maintain compatibility with existing intellectual property cores. To improve the memory utilization and reduce the memory latency, an optimized memory controller is integrated in the presented NI. Experimental results with synthetic test cases demonstrate that the proposed on-chip network architecture provides significant improvements in average network latency (16%), average memory access latency (19%), and average memory utilization (22%).
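The need for a reordering mechanism described in the abstract above can be illustrated with a minimal software sketch: responses from concurrent memory accesses arrive out of order and are held until all earlier ones have been delivered. This is a generic reorder buffer, not the paper's network interface; the sequence numbering from 0, the function name, and the unbounded dictionary (real hardware would use a fixed-size buffer) are assumptions.

```python
from typing import Dict, Iterable, Iterator, Tuple


def deliver_in_order(
    responses: Iterable[Tuple[int, str]]
) -> Iterator[Tuple[int, str]]:
    """Yield (sequence number, payload) pairs in sequence order,
    buffering any response that arrives before its predecessors."""
    pending: Dict[int, str] = {}
    next_seq = 0
    for seq, payload in responses:
        pending[seq] = payload
        # Release every response that is now contiguous with the last one sent
        while next_seq in pending:
            yield next_seq, pending.pop(next_seq)
            next_seq += 1


# Hypothetical arrival order of four memory responses
arrivals = [(1, "B"), (0, "A"), (3, "D"), (2, "C")]
delivered = list(deliver_in_order(arrivals))
print(delivered)
```

Response 1 waits for response 0, and response 3 waits for response 2, so the consumer always sees the sequence 0, 1, 2, 3 regardless of arrival order.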
21.
  • Daneshtalab, M., et al. (author)
  • Pipeline-based interlayer bus structure for 3D networks-on-chip
  • 2010
  • In: Proceedings - 15th CSI International Symposium on Computer Architecture and Digital Systems, CADS 2010. ; , pp. 35-41
  • Conference paper (peer-reviewed)
    • The structure of direct vertical interconnections, called Through Silicon Vias (TSVs), is an important issue in the realm of 3D ICs. The bus-based and network-based structures are the two dominant architectures for implementing TSVs as interlayer connection in 3D ICs. Both implementations have some disadvantages. The former suffers from poor scalability and deteriorates the performance at high injection rates, and the latter consumes more area and power dissipation. In this paper, we propose a novel pipeline bus structure for TSVs to improve the performance of the prior bus-based architecture. The presented structure can utilize bi-synchronous FIFO for synchronization between stacked layers if each layer is fabricated by different technologies. Experimental results with synthetic test cases demonstrate that the proposed architecture gives significant improvements in average network latency. Also, the hardware area and power consumption of the presented bus structure are 9% and 11% less than the typical bus structure of TSVs, respectively.
22.
  • Dytckov, S., et al. (author)
  • Exploring NoC jitter effect on simulation of spiking neural networks
  • 2014
  • In: Proceedings of the 2014 International Conference on High Performance Computing and Simulation, HPCS 2014. - 9781479953134 ; , pp. 693-696
  • Conference paper (peer-reviewed)
    • The major bottleneck in simulation of large-scale neural networks is the communication problem due to one-to-many neuron connectivity. Network-on-Chip concept has been proposed to address the problem. This work explores the drawback that is introduced by interconnection networks - a delay jitter. The preliminary experiment is held in the spiking neural network simulator introducing variable communicational delay to the simulation. The performance degradation is reported.
23.
  • Ebrahimi, M., et al. (author)
  • A High-Performance Network Interface Architecture for NoCs Using Reorder Buffer Sharing
  • 2010
  • In: 18th Euromicro Conference on Parallel, Distributed and Network-Based Processing, PDP 2010. ; , pp. 546-550
  • Conference paper (peer-reviewed)
    • Increasing memory parallelism in MPSoCs to provide higher memory bandwidth is achieved by accessing multiple memories simultaneously. Inasmuch as the response transactions of concurrent memory accesses must be in-order, a reordering mechanism is required. To our knowledge the resource utilization of conventional reordering mechanisms is low. In this paper, we present a novel network interface architecture for on-chip networks to increase the resource utilization and to improve overall performance. Also, based on the proposed architecture, a hybrid network interface is presented to integrate both memory and processor in a tile. The proposed architecture exploits AXI transaction based protocol to be compatible with existing IP cores. Experimental results with synthetic test cases demonstrate that the proposed architecture outperforms the conventional architecture in terms of latency. Also, the cost of the presented architecture is evaluated with UMC 0.09μm technology.
  •  
24.
  • Ebrahimi, M., et al. (author)
  • Agent-based on-chip network using efficient selection method
  • 2011
  • In: 2011 IEEE/IFIP 19th International Conference on VLSI and System-on-Chip, VLSI-SoC 2011. - : IEEE. ; , pp. 284-289
  • Conference paper (peer-reviewed)
    • Congestion in on-chip networks may cause many drawbacks in multiprocessor systems including throughput reduction, increase in latency, and additional power consumption. Furthermore, conventional congestion control methods, employed for on-chip networks, cannot efficiently collect congestion information and distribute them over the on-chip network. In this paper, we present a novel structure for on-chip networks, named Agent-based Network-on-Chip (ANoC), to diagnose the congested areas. In addition to the presented structure, an efficient Congestion-Aware Selection (CAS) method is proposed to reduce overall network latency. CAS is capable of selecting an appropriate output channel to route packets along a less congested path. 29% average and 35% maximum latency reduction are achieved on SPLASH-2 and PARSEC benchmarks running on a 36-core Chip Multi-Processor.
  •  
25.
  • Ebrahimi, M., et al. (author)
  • An Efficient Dynamic Multicast Routing Protocol for Distributing Traffic in NOCs
  • 2009
  • In: 2009 Design, Automation and Test in Europe Conference and Exhibition, DATE '09. ; , pp. 1064-1069
  • Conference paper (peer-reviewed)
    • Nowadays, in MPSoCs and NoCs, multicast protocol is significantly used for many parallel applications such as cache coherency in distributed shared-memory architectures, clock synchronization, replication, or barrier synchronization. Among several multicast schemes proposed in on chip interconnection networks, path-based multicast scheme has been proven to be more efficient than the tree-based, and unicast-based. In this paper a low distance path-based multicast scheme is proposed. The proposed method takes advantage of the network partitioning, and utilizing of an efficient destination ordering algorithm. The results in performance, and power consumption show that the proposed method outstands the previous on chip path-based multicasting algorithms.
  •  
26.
  • Ebrahimi, M., et al. (author)
  • Efficient congestion-aware selection method for on-chip networks
  • 2011
  • In: 6th International Workshop on Reconfigurable Communication-Centric Systems-on-Chip, ReCoSoC 2011 - Proceedings. - : Institute of Electrical and Electronics Engineers (IEEE). - 9781457706400
  • Conference paper (peer-reviewed)
    • The choice of routing algorithm can have a large impact on the performance of on-chip networks. As adaptive routing algorithms may return a set of output channels, a selection method (routing policy) is employed to choose the appropriate output channel from the given set. In this paper, we present a novel on-chip network structure to detect the local and non-local congested areas. Based on the presented structure, an efficient congestion-aware selection method is proposed to choose an output channel that allows a packet to be routed through a less congested area.
  •  
27.
  • Ebrahimi, M., et al. (author)
  • Exploring partitioning methods for 3D Networks-on-Chip utilizing adaptive routing model
  • 2011
  • In: 5th ACM/IEEE International Symposium on Networks-on-Chip, NOCS 2011. - New York, NY, USA : ACM. ; , pp. 73-80
  • Conference paper (peer-reviewed)
    • Three-Dimensional (3D) integration is a solution to the interconnect bottleneck in Two-Dimensional (2D) MultiProcessor System on Chip (MPSoC). 3D IC design improves performance and decreases power consumption by replacing long horizontal interconnects with shorter vertical ones. As the multicast communication is utilized commonly in various parallel applications, the performance can be significantly improved by supporting of multicast operations at the hardware level. In this paper, we propose a set of partitioning approaches each with a different level of efficiency. In addition, we present an advantageous method named Recursive Partitioning (RP) in which the network is recursively partitioned until all partitions contain comparable number of nodes. By this approach, the multicast traffic is distributed among several subsets and the network latency is considerably decreased. We also present Minimal Adaptive Routing (MAR) algorithm for the unicast and multicast traffic in 3D-mesh Networks-on-Chip (NoCs). The idea behind the MAR algorithm is utilizing the Hamiltonian path to provide a set of alternative paths.
  •  
28.
  • Ebrahimi, M., et al. (author)
  • HARAQ : Congestion-Aware Learning Model for Highly Adaptive Routing Algorithm in On-Chip Networks
  • 2012
  • In: Proceedings of the 2012 6th IEEE/ACM International Symposium on Networks-on-Chip, NoCS 2012. ; , pp. 19-26
  • Conference paper (peer-reviewed)
    • The occurrence of congestion in on-chip networks can severely degrade the performance due to increased message latency. In mesh topology, minimal methods can propagate messages over two directions at each switch. When shortest paths are congested, sending more messages through them can deteriorate the congestion condition considerably. In this paper, we present an adaptive routing algorithm for on-chip networks that provide a wide range of alternative paths between each pair of source and destination switches. Initially, the algorithm determines all permitted turns in the network including 180-degree turns on a single channel without creating cycles. The implementation of the algorithm provides the best usage of all allowable turns to route messages more adaptively in the network. On top of that, for selecting a less congested path, an optimized and scalable learning method is utilized. The learning method is based on local and global congestion information and can estimate the latency from each output channel to the destination region.
  •  
29.
  • Fattah, M., et al. (author)
  • A low-overhead, fully-distributed, guaranteed-delivery routing algorithm for faulty network-on-chips
  • 2015
  • In: Proceedings - 2015 9th IEEE/ACM International Symposium on Networks-on-Chip, NOCS 2015. - New York, NY, USA : ACM Digital Library. - 9781450333962
  • Conference paper (peer-reviewed)
    • This paper introduces a new, practical routing algorithm, Maze-routing, to tolerate faults in network-on-chips. The algorithm is the first to provide all of the following properties at the same time: 1) fully-distributed with no centralized component, 2) guaranteed delivery (it guarantees to deliver packets when a path exists between nodes, or otherwise indicate that destination is unreachable, while being deadlock and livelock free), 3) low area cost, 4) low reconfiguration overhead upon a fault. To achieve all these properties, we propose Maze-routing, a new variant of face routing in on-chip networks and make use of deflections in routing. Our evaluations show that Maze-routing has 16X less area overhead than other algorithms that provide guaranteed delivery. Our Maze-routing algorithm is also high performance: for example, when up to 5 links are broken, it provides 50% higher saturation throughput compared to the state-of-the-art.
  •  
30.
  • Guang, L., et al. (author)
  • Dual Monitoring Communication for Self-Aware Network-on-Chip : Architecture and Case Study
  • 2012
  • In: International Journal of Adaptive, Resilient and Autonomic Systems. - : IGI Global. - 1947-9220 .- 1947-9239. ; 3
  • Journal article (peer-reviewed)
    • Self-aware and adaptive Network-on-Chip (NoC) with dual monitoring networks is presented. Proper monitoring interface is an essential prerequisite to adaptive system reconfiguration in parallel on-chip computing. This work proposes a DMC (dual monitoring communication) architecture to support self-awareness on the NoC platform. One type of monitoring communication is integrated with data channel, in order to trace the run-time profile of data communication in high-speed on-chip networking. The other type is separate from the data communication, and is needed to report the run-time profile to the supervising monitor. Direct latency monitoring on mesochronous NoC is presented as a case study and is directly traced in the integrated communication with a novel latency monitoring table in each router. The latency information is reported by the separate monitoring communication to the supervising monitor, which reconfigures the system to adjust the latency, for instance by dynamic voltage and frequency scaling. With quantitative evaluation using synthetic traces and real applications, the effectiveness and efficiency of direct latency monitoring with DMC architecture is demonstrated. The area overhead of DMC architecture is estimated to be small in 65nm CMOS technology.
  •  
31.
  •  
32.
  • Guang, Liang, et al. (author)
  • Hierarchical Agent Monitored Parallel On-Chip System : A Novel Design Paradigm and its Formal Specification
  • 2010
  • In: International Journal of Embedded and Real-Time Communication Systems (IJERTCS). - : IGI Global. - 1947-3176 .- 1947-3184. ; 1:2, pp. 86-105
  • Journal article (peer-reviewed)
    • In this paper, the authors present a formal specification of a novel design paradigm, hierarchical agent monitored SoCs (HAMSOC). The paradigm motivates dynamic monitoring in a hierarchical and distributed manner, with adaptive agents embedded for local and global operations. Formal methods are of essential importance to the development of such a novel and complex platform. As the initial effort, functional specification is indispensable to the non-ambiguous system modeling before potential property verification. The formal specification defines the manner by which the system can be constructed with hierarchical components and the representation of run-time information in modeling entities and every type of the monitoring operations. The syntax follows the standard set theory with additional glossary and notations introduced to facilitate practical SoC design process. A case study of hierarchical monitoring for power management in NoC (Network-on-chip), written with the formal specification, is demonstrated.
  •  
33.
  • Guang, Liang, et al. (author)
  • Hierarchical Agent Monitoring Design Platform - towards Self-aware and Adaptive Embedded Systems
  • 2011
  • In: PECCS 2011 - Proceedings of the 1st International Conference on Pervasive and Embedded Computing and Communication Systems. ; , pp. 573-581
  • Conference paper (peer-reviewed)
    • Hierarchical agent monitoring design platform (HAM) is presented as a generic design approach for the emerging self-aware and adaptive embedded systems. Such systems, with various existing proposals for different advanced features, call for a concrete, practical and portable design approach. HAM addresses this necessity by providing a scalable and generically applicable design platform. This paper elaborately describes the hierarchical agent monitoring architecture, with extensive reference to the state-of-the-art technology in embedded systems. Two case studies are exemplified to demonstrate the design process and benefits of HAM design platform. One is about hierarchical agent monitored Network-on-Chip with quantitative experiments of hierarchical energy management. The other one is a projectional study of applying HAM on smart house systems, focusing on the design for enhanced dependability.
  •  
34.
  •  
35.
  • Guang, L., et al. (author)
  • Hierarchical Monitoring in Smart House : Design Scalability, Dependability and Energy-Efficiency
  • 2012
  • In: Communications in Information Science and Management Engineering. - 2222-1859. ; 2
  • Journal article (peer-reviewed)
    • Hierarchical monitoring is presented on smart house platforms to provide scalability, dependability and energy efficiency. Hierarchical monitoring is a scalable and generic approach for optimization and diagnostic operations in distributed embedded systems. The paper studies the design of hierarchical monitoring on smart house platforms as an example of WPANs (wireless personal area networks). We present the functional partition of hierarchical agents in a smart house, and show that the architecture can be conveniently built upon the widely used Zigbee standard. We give a qualitative discussion of the design scalability and dependability compared to the centralized monitoring. In addition, we quantitatively compare the energy consumption of monitoring communication in hierarchical and centralized architectures, with the classic free space propagation model. The qualitative discussion and quantitative analysis demonstrate the scalability, dependability and energy efficiency of hierarchical monitoring in a domestic environment.
  •  
36.
  • Guang, Liang, et al. (author)
  • Hierarchical Monitoring in Smart House : Design Scalability, Dependability and Energy-Efficiency
  • 2011
  • In: Proc. of the 3rd International Conference on Information Science and Engineering (ICISE2011). ; , pp. 291-296
  • Conference paper (peer-reviewed)
    • Energy-efficient hierarchical monitoring is presented on smart house platforms. The rapid expansion of embedded systems requires scalable and portable design interfaces to tackle the increasing complexity. Hierarchical monitoring is a scalable and generic approach for optimization and diagnostic operations in distributed embedded systems. The paper studies the design of hierarchical monitoring on smart house platforms built upon the Zigbee standard of PANs (personal area networks). It presents the functional partition of hierarchical agents in a smart house, and gives a qualitative discussion of the design scalability and dependability, in particular compared to centralized monitoring. In addition, quantitative evaluation of the energy efficiency of monitoring communication in a smart house is performed using a PAN simulator with Zigbee routing configuration. We demonstrate that hierarchical monitoring is more energy efficient than centralized monitoring in various scenarios of a domestic environment.
  •  
37.
  • Guang, Liang, et al. (author)
  • Hierarchical power monitoring on NoC - a case study for hierarchical agent monitoring design approach
  • 2010
  • In: 28th Norchip Conference, NORCHIP 2010. - 9781424489732 ; , p. 5669428-
  • Conference paper (peer-reviewed)
    • A case study is presented for hierarchical agent monitoring design approach, which provides a high level abstraction for designing monitoring functions on massively parallel and distributed systems. The case study features hierarchical power monitoring on NoC platforms, where each level of agents perform specific monitoring operations based on their granularity. The monitoring hierarchy and operations are specified by a formal language for consistent and non-ambiguous system design. Various benchmarks are mapped onto NoCs, running with hierarchical power monitoring agents. Quantitative evaluations are performed in terms of energy efficiency, communication latency, and silicon overhead.
  •  
38.
  • Guang, L., et al. (author)
  • Positioning antifragility for clouds on public infrastructures
  • 2014
  • In: Procedia Computer Science. - : Elsevier BV. - 1877-0509. ; , pp. 856-861
  • Conference paper (peer-reviewed)
    • Cloud computing scalably and sustainably utilizes computing and communication resources. One segment of the cloud ecosystem is the services built upon public infrastructures to address general benefits. This segment itself is an open system, involving many contributors and stakeholders, and its growth and development is an unpredictable process influenced by economical, societal and technological factors. This paper argues the antifragility as an indispensable feature for cloud computing, and proposes a development process for the open system to maintain, improve and prosper under contradicting interests of users, companies and governments. The proposal emphasizes multi-player's roles and interaction, and the temporal and spatial interleaving of development stages of different application domains.
  •  
39.
  • Guang, L., et al. (author)
  • Self-adaptive SoCs for dependability : Review and prospects
  • 2014
  • In: Advancing Embedded Systems and Real-Time Communications with Emerging Technologies. - : IGI Global. - 9781466660366 ; , pp. 1-21
  • Book chapter (other academic/artistic)
    • Dependability is a primary concern for emerging billion-transistor SoCs (Systems-on-Chip), especially when the constant technology scaling introduces an increasing rate of faults and errors. Considering the time-dependent device degradation (e.g. caused by aging and run-time voltage and temperature variations), self-adaptive circuits and architectures to improve dependability is promising and very likely inevitable. This chapter extensively surveys existing works on monitoring, decision-making, and reconfiguration addressing different dependability threats to Very Large Scale Integration (VLSI) chips. Centralized, distributed, and hierarchical fault management, utilizing various redundancy schemes and exploiting logical or physical reconfiguration methods, are all examined. As future research directions, the challenge of integrating different error management schemes to account for multifold threats and the great promise of error resilient computing are identified. This chapter provides, for chip designers, much needed insights on applying a self-adaptive computing paradigm to approach dependability on error-prone, cost-sensitive SoCs.
  •  
40.
  • Guang, L., et al. (author)
  • Vertical and horizontal integration towards collective adaptive system : a visionary approach
  • 2012
  • In: Proceedings of the 2012 ACM Conference on Ubiquitous Computing. ; , pp. 762-765
  • Conference paper (peer-reviewed)
    • Hybrid multi-domain computing systems are emerging. While the context-aware self-adaptive system models are under intensive research in individual computing domains, their integration into a collective adaptive system still remains a major challenge. This position paper visions a meet-in-the-middle approach, where horizontal integration is applied to sub-system models extracted from vertical integration. The integration relies on orthogonal behavior and execution models respectively capturing the functional and non-functional features of sub-systems. The construction towards guaranteed services can be achieved with composition of static (worst-case) execution models, while best-effort services can be constructed with statistical models. Given that each computing domain has, to some extent, formulated its own design flow of context-aware systems, the envisaged meet-in-the-middle integration approach maximizes the reuse of existing models and platforms, thus is promising for the highly-complex system design process.
  •  
41.
  • Haghbayan, M. -H, et al. (author)
  • Dark silicon aware power management for manycore systems under dynamic workloads
  • 2014
  • In: 2014 32nd IEEE International Conference on Computer Design, ICCD 2014. ; , pp. 509-512
  • Conference paper (peer-reviewed)
    • Dark Silicon denotes the phenomenon that, due to thermal and power constraints, the fraction of transistors that can operate at full frequency is decreasing with each technology generation. We propose a PID (Proportional Integral Derivative) controller based dynamic power management method that considers an upper bound on power consumption (called the Thermal Design Power (TDP)). To avoid violation of the TDP constraint for manycore systems running highly dynamic workloads, it provides fine-grained DVFS (Dynamic Voltage and Frequency Scaling) including near-threshold operation. In addition, the method distinguishes applications with hard Real-Time, soft Real-Time and no Real-Time constraints and treats them with appropriate priorities. In simulations with dynamic workloads mixed-critical application profiles, we show that the method is effective in honoring the TDP bound and it can boost system throughput by over 43% compared to a naive TDP scheduling policy.
  •  
42.
  • Haghbayan, M. -H, et al. (author)
  • Power-aware online testing of manycore systems in the dark silicon era
  • 2015
  • In: Proceedings - Design, Automation and Test in Europe, DATE. - : IEEE conference proceedings. - 9783981537048 ; , pp. 435-440
  • Conference paper (peer-reviewed)
    • Online defect screening techniques to detect runtime faults are becoming a necessity in current and near future technologies. At the same time, due to aggressive technology scaling into the nanometer regime, power consumption is becoming a significant burden. Most of today's chips employ advanced power management features to monitor the power consumption and apply dynamic power budgeting (i.e., capping) accordingly to prevent over-heating of the chip. Given the notable power dissipation of existing testing methods, one needs to efficiently manage the power budget to cover test process of a many-core system in runtime. In this paper, we propose a power-aware online testing method for many-core systems benefiting from advanced power management capabilities. The proposed power-aware method uses non-intrusive online test scheduling strategy to functionally test the cores in their idle period. In addition, we propose a test-aware utilization-oriented runtime mapping technique that considers the utilization of cores and their test criticality in the mapping process. Our extensive experimental results reveal that the proposed power-aware online testing approach can efficiently utilize temporarily free resources and available power budget for the testing purposes, within less than 1% penalty on system throughput for the 16nm technology.
  •  
43.
  • Jafri, Syed Mohammad Asad Hassan, et al. (author)
  • Energy-Aware Fault-Tolerant CGRAs Addressing Application with Different Reliability Needs
  • 2013
  • In: Digital System Design (DSD), 2013 Euromicro Conference on. - : IEEE conference proceedings. ; , pp. 525-534
  • Conference paper (peer-reviewed)
    • In this paper, we propose a polymorphic fault tolerant architecture that can be tailored to efficiently support the reliability needs of multiple applications at run-time. Today, coarse-grained reconfigurable architectures (CGRAs) host multiple applications with potentially different reliability needs. Providing platform-wide worst-case (maximum) protection to all the applications is neither optimal nor desirable. To reduce the fault-tolerance overhead, adaptive fault-tolerance strategies have been proposed. The proposed techniques access the reliability requirements of each application and adjust the fault-tolerance intensity (and hence overhead), accordingly. However, existing flexible reliability schemes only allow to shift between different levels of modular redundancy (duplication, triplication, etc.) and deal with only a single class of faults (e.g. soft errors). To complement these strategies, we propose energy-aware fault-tolerance that, in addition to modular redundancy, can also provide low cost, sub-modular (e.g. residue mod 3) redundancy, to cater both permanent and temporary faults. Our solution relies on an agent based control layer and a configurable fault-tolerance data path. The control layer identifies the application class and configures the data path to provide the needed reliability. Simulation results using a few selected algorithms (FFT, matrix multiplication, and FIR filter) showed that the proposed method provides flexible protection with energy overhead ranging from 3.125% to 107% for different reliability levels. Synthesis results have confirmed that the proposed architecture significantly reduces the area overhead for self-checking (59.1%) and fault tolerant (7.1%) versions, compared to the state of the art adaptive reliability techniques.
  •  
44.
  • Jafri, Syed M. A. H., et al. (author)
  • Private reliability environments for efficient fault-tolerance in CGRAs
  • 2014
  • In: Design automation for embedded systems. - : Springer Science and Business Media LLC. - 0929-5585 .- 1572-8080. ; 18:3-4, pp. 295-327
  • Journal article (peer-reviewed)
    • In the era of platforms hosting multiple applications with variable reliability needs, worst-case platform-wide fault-tolerance decisions are neither optimal nor desirable. As a solution to this problem, designs commonly employ adaptive fault-tolerance strategies that provide each application with the reliability level actually needed. However, in the CGRA domain, the existing schemes either only allow to shift between different levels of modular redundancy (duplication, triplication, etc.) or protect only a particular region of a device (e.g. configuration memory, computation, or data memory). To complement these strategies, we propose private fault-tolerance environments which, in addition to modular redundancy, also provide low cost sub-modular (e.g. residue mod 3) redundancy capable of handling both permanent and temporary faults in configuration memory, computation, communication, and data memory. In addition, we also present adaptive configuration scrubbing techniques which prevent fault accumulation in the configuration memory. Simulation results using a few selected algorithms (FFT, matrix multiplication, and FIR filter) show that the approach proposed is capable of providing flexible protection with energy overhead ranging from 3.125 % to 107 % for different reliability levels. Synthesis results have confirmed that the proposed architecture reduces the area overhead for self-checking (58 %) and fault-tolerant (7.1 %) versions, compared to the state of the art adaptive reliability techniques.
  •  
45.
  • Jafri, Syed Mohammad Asad Hassan, et al. (author)
  • RuRot : Run-time rotatable-expandable partitions for efficient mapping in CGRAs
  • 2014
  • In: Proceedings - International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, SAMOS 2014. - 9781479937707 ; , pp. 233-241
  • Conference paper (peer-reviewed)
    • Today, Coarse Grained Reconfigurable Architectures (CGRAs) host multiple applications, with arbitrary communication and computation patterns. Compile-time mapping decisions are neither optimal nor desirable to efficiently support the diverse and unpredictable application requirements. As a solution to this problem, recently proposed architectures offer run-time remapping. The run-time remappers displace or expand (parallelize/serialize) an application to optimize different parameters (such as platform utilization). However, the existing remappers support application displacement or expansion in either horizontal or vertical direction. Moreover, most of the works only address dynamic remapping in packet-switched networks and therefore are not applicable to the CGRAs that exploit circuit-switching for low-power and high predictability. To enhance the optimality of the run-time remappers, this paper presents a design framework called Run-time Rotatable-expandable Partitions (RuRot). RuRot provides architectural support to dynamically remap or expand (i.e. parallelize) the hosted applications in CGRAs with circuit-switched interconnects. Compared to state of the art, the proposed design supports application rotation (in clockwise and anticlockwise directions) and displacement (in horizontal and vertical directions), at run-time. Simulation results using a few applications reveal that the additional flexibility enhances the device utilization, significantly (on average 50 % for the tested applications). Synthesis results confirm that the proposed remapper has negligible silicon (0.2 % of the platform) and timing (2 cycles per application) overheads.
  •  
46.
  • Jafri, Syed M. A. H., et al. (author)
  • TEA : Timing and Energy Aware compression architecture for Efficient Configuration in CGRAs
  • 2015
  • In: Microprocessors and microsystems. - : Elsevier. - 0141-9331 .- 1872-9436.
  • Journal article (peer-reviewed)
    • Coarse Grained Reconfigurable Architectures (CGRAs) are emerging as enabling platforms to meet the high performance demanded by modern applications (e.g. 4G, CDMA, etc.). Recently proposed CGRAs offer time-multiplexing and dynamic applications parallelism to enhance device utilization and reduce energy consumption at the cost of additional memory (up to 50% area of the overall platform). To reduce the memory overheads, novel CGRAs employ either statistical compression, intermediate compact representation, or multicasting. Each compaction technique has different properties (i.e. compression ratio, decompression time and decompression energy) and is best suited for a particular class of applications. However, existing research only deals with these methods separately. Moreover, they only analyze the compaction ratio and do not evaluate the associated energy overheads. To tackle these issues, we propose a polymorphic compression architecture that interleaves these techniques in a unique platform. The proposed architecture allows each application to take advantage of a separate compression/decompression hierarchy (consisting of various types and implementations of hardware/software decoders) tailored to its needs. Simulation results, using different applications (FFT, Matrix multiplication, and WLAN), reveal that the choice of compression hierarchy has a significant impact on compression ratio (up to 52%), decompression energy (up to 4 orders of magnitude), and configuration time (from 33 ns to 1.5 s) for the tested applications. Synthesis results reveal that introducing adaptivity incurs negligible additional overheads (1%) compared to the overall platform area.
  •  
47.
  • Kakakhel, S. R. U., et al. (author)
  • A qualitative comparison model for application layer IoT protocols
  • 2019
  • In: 2019 4th International Conference on Fog and Mobile Edge Computing, FMEC 2019. - : Institute of Electrical and Electronics Engineers Inc. - 9781728117966 ; , pp. 210-215
  • Conference paper (peer-reviewed)
    • Protocols enable things to connect and communicate, thus making the Internet of Things possible. The performance aspect of the Internet of Things protocols, vital to its widespread utilization, has received much attention. However, one aspect of IoT protocols, essential to its adoption in the real world, is a protocol's feature set. Comparative analysis based on competing features and properties is rarely, if ever, discussed in the literature. In this paper, we define 19 attributes in 5 categories that are essential for IoT stakeholders to consider. These attributes are then used to contrast four IoT protocols, MQTT, HTTP, CoAP and XMPP. Furthermore, we discuss scenarios where an assessment based on comparative strengths and weaknesses would be beneficial. The provided comparison model can be easily extended to include protocols like MQTT-SN, AMQP and DDS.
  •  
48.
  • Majd, A., et al. (author)
  • Hierarchal Placement of Smart Mobile Access Points in Wireless Sensor Networks Using Fog Computing
  • 2017
  • In: Proceedings - 2017 25th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2017. - : Institute of Electrical and Electronics Engineers Inc. - 9781509060580 ; , pp. 176-180
  • Conference paper (peer-reviewed)
    • Recent advances in computing and sensor technologies have facilitated the emergence of increasingly sophisticated and complex cyber-physical systems and wireless sensor networks. Moreover, integration of cyber-physical systems and wireless sensor networks with other contemporary technologies, such as unmanned aerial vehicles (i.e. drones) and fog computing, enables the creation of completely new smart solutions. By building upon the concept of a Smart Mobile Access Point (SMAP), which is a key element for a smart network, we propose a novel hierarchical placement strategy for SMAPs to improve the scalability of SMAP-based monitoring systems. SMAPs predict communication behavior based on information collected from the network, and select the best approach to support the network at any given time. In order to improve the network performance, they can autonomously change their positions. Therefore, placement of SMAPs has an important role in such systems. Initial placement of SMAPs is an NP problem. We solve it using a parallel implementation of the genetic algorithm with an efficient evaluation phase. The adopted hierarchical placement approach is scalable: it enables construction of arbitrarily large SMAP-based systems.
49.
  • Phong, N. D. B., et al. (author)
  • Silicon synapse designs for VLSI neuromorphic platform
  • 2014
  • In: NORCHIP 2014 - 32nd NORCHIP Conference. - : IEEE. - 9781479954421 ; , p. 7004745-
  • Conference paper (peer-reviewed)
    • Analog silicon neurons have proven to be a promising solution for VLSI neuromorphic platforms that implement massively scalable computing systems. They have the advantage of consuming less power and silicon area than digitally designed neurons. This paper compares the power and area consumption of two methods of synapse design for analog neuron models: time-based modulation and current-based modulation. The obtained results demonstrate that, under the same technology process (ST CMOS 65nm), the neuron that uses time-based modulation consumes almost six times less power and about thirty times less silicon area, but twelve times more energy, than the neuron that uses current-based modulation.
50.
  • Rahmani, Amir, et al. (author)
  • ARB-NET : A novel adaptive monitoring platform for stacked mesh 3D NoC architectures
  • 2012
  • In: Design Automation Conference (ASP-DAC), 2012 17th Asia and South Pacific. - 9781467307703 ; , pp. 413-418
  • Conference paper (peer-reviewed)
    • Emerging three-dimensional integrated circuits (3D ICs) offer a promising solution to mitigate the barriers of interconnect scaling in modern systems. In order to exploit the intrinsic capability of reducing the wire length in 3D ICs, the 3D NoC-Bus Hybrid mesh architecture was proposed. Besides its various advantages in terms of area, power consumption, and performance, this architecture offers a unique and hitherto unexplored way to implement an efficient system-wide monitoring network. In this paper, an integrated low-cost monitoring platform for 3D stacked mesh architectures is proposed which can be efficiently used for various system management purposes. The proposed generic monitoring platform, called ARB-NET, utilizes bus arbiters to exchange the monitoring information directly with each other without using the data network. As a test case, based on the proposed monitoring platform, a fully congestion-aware adaptive routing algorithm named AdaptiveXYZ is presented, taking advantage of viable information generated within bus arbiters. Our extensive simulations with synthetic and real benchmarks reveal that our architecture using the AdaptiveXYZ routing can help achieve significant power and performance improvements compared to recently proposed stacked mesh 3D NoCs.