SwePub

Result list for search "WFRF:(Jantsch Axel Professor)"


  • Results 1-10 of 12
1.
  • Holsmark, Rickard, 1970- (author)
  • Deadlock Free Routing in Mesh Networks on Chip with Regions
  • 2009
  • Licentiate thesis (other academic/artistic). Abstract:
    • There is a seemingly endless miniaturization of electronic components, which has enabled designers to build sophisticated computing structures on silicon chips. Consequently, electronic systems are continuously improving with new and more advanced functionalities. Design complexity of these Systems on Chip (SoC) is reduced by the use of pre-designed cores. However, several problems related to the interconnection of cores remain. Network on Chip (NoC) is a new SoC design paradigm, which targets the interconnect problems using classical network concepts. Still, SoC cores show large variance in size and functionality, whereas several NoC benefits relate to regularity and homogeneity. This thesis studies some network aspects which are characteristic to NoC systems. One is the issue of area wastage in NoC due to cores of various sizes. We elaborate on using oversized regions in regular mesh NoC and identify several new design possibilities. Adverse effects of regions on communication are outlined and evaluated by simulation. Deadlock freedom is an important region issue, since it affects both the usability and performance of routing algorithms. The concept of faulty blocks, used in deadlock free fault-tolerant routing algorithms has similarities with rectangular regions. We have improved and adopted one such algorithm to provide deadlock free routing in NoC with regions. This work also offers a methodology for designing topology agnostic, deadlock free, highly adaptive application specific routing algorithms. The methodology exploits information about communication among tasks of an application. This is used in the analysis of deadlock freedom, such that fewer deadlock preventing routing restrictions are required. A comparative study of the two proposed routing algorithms shows that the application specific algorithm gives significantly higher performance. But, the fault-tolerant algorithm may be preferred for systems requiring support for general communication.
Several extensions to our work are proposed, for example in areas such as core mapping and efficient routing algorithms. The region concept can be extended for supporting reuse of a pre-designed NoC as a component in a larger hierarchical NoC.
2.
  • Jiang, Ke (author)
  • Security-Driven Design of Real-Time Embedded Systems
  • 2015
  • Doctoral thesis (other academic/artistic). Abstract:
    • Real-time embedded systems (RTESs) have been widely used in modern society. And it is also very common to find them in safety and security critical applications, such as transportation and medical equipment. There are, usually, several constraints imposed on an RTES, for example, timing, resource, energy, and performance, which must be satisfied simultaneously. This makes the design of such systems a difficult problem. More recently, the security of RTESs emerges as a major design concern, as more and more attacks have been reported. However, RTES security, as a parameter to be considered during the design process, has been overlooked in the past. This thesis approaches the design of secure RTESs focusing on aspects that are particularly important in the context of RTES, such as communication confidentiality and side-channel attack resistance. Several techniques are presented in this thesis for designing secure RTESs, including hardware/software co-design techniques for communication confidentiality on distributed platforms, a global framework for secure multi-mode real-time systems, and a scheduling policy for thwarting differential power analysis attacks. All the proposed solutions have been extensively evaluated in a large number of experiments, including two real-life case studies, which demonstrate the efficiency of the presented techniques.
3.
  • Eslami Kiasari, Abbas (author)
  • Performance Analysis and Design Space Exploration of On-Chip Interconnection Networks
  • 2013
  • Doctoral thesis (other academic/artistic). Abstract:
    • The advance of semiconductor technology, which has led to more than one billion transistors on a single chip, has enabled designers to integrate dozens of IP (intellectual property) blocks together with large amounts of embedded memory. These advances, along with the fact that traditional communication architectures do not scale well have led to significant changes in the architecture and design of integrated circuits. One solution to these problems is to implement such a complex system using an on-chip interconnection network or network-on-chip (NoC). The multiple concurrent connections of such networks mean that they have extremely high bandwidth. Regularity can lead to design modularity providing a standard interface for easier component reuse and improved interoperability. The present thesis addresses the performance analysis and design space exploration of NoCs using analytical and simulation-based performance analysis approaches. At first, we developed a simulator aimed at performance analysis of interconnection networks. The simulator is then used to evaluate the performance of network topologies and routing algorithms since their choice heavily affects the performance of NoCs. Then, we surveyed popular mathematical formalisms – queueing theory, network calculus, schedulability analysis, and dataflow analysis – and how they have been applied to the analysis of on-chip communication performance in NoCs. We also addressed research problems related to modelling and design space exploration of NoCs. In the next step, analytical router models were developed that analyse NoC performance. In addition to providing aggregate performance metrics such as latency and throughput, our approach also provides feedback about the network characteristics at a fine-level of granularity. Our approach explicates the impact that various design parameters have on the performance, thereby providing invaluable insight into NoC design.
This makes it possible to use the proposed models as a powerful design and optimisation tool. We then used the proposed analytical models to address the design space exploration and optimisation problem. System-level frameworks to address the application mapping and to design routing algorithms for NoCs were presented. We first formulated an optimisation problem of minimizing average packet latency in the network, and then solved this problem using the simulated annealing heuristic. The proposed framework can also address other design space exploration problems such as topology selection and buffer dimensioning.
4.
  • Liu, Ming, 1982- (author)
  • A High-end Reconfigurable Computation Platform for Particle Physics Experiments
  • 2008
  • Licentiate thesis (other academic/artistic). Abstract:
    • Modern nuclear and particle physics experiments run at a very high reaction rate and are able to deliver a data rate of up to hundred GBytes/s. This data rate is far beyond the storage and on-line analysis capability. Fortunately physicists have only interest in a very small proportion among the huge amounts of data. Therefore in order to select the interesting data and reject the background by sophisticated pattern recognition processing, it is essential to realize an efficient data acquisition and trigger system which results in a reduced data rate by several orders of magnitude. Motivated by the requirements from multiple experiment applications, we are developing a high-end reconfigurable computation platform for data acquisition and triggering. The system consists of a scalable number of compute nodes, which are fully interconnected by high-speed communication channels. Each compute node features 5 Xilinx Virtex-4 FX60 FPGAs and up to 10 GBytes DDR2 memory. A hardware/software co-design approach is proposed to develop custom applications on the platform, partitioning performance-critical calculation to the FPGA hardware fabric while leaving flexible and slow controls to the embedded CPU plus the operating system. The system is expected to be high-performance and general-purpose for various applications especially in the physics experiment domain. As a case study, the particle track reconstruction algorithm for HADES has been developed and implemented on the computation platform in the format of processing engines. The Tracking Processing Unit (TPU) recognizes peak bins on the projection plane and reconstructs particle tracks in realtime. Implementation results demonstrate its acceptable resource utilization and the feasibility to implement the module together with the system design on the FPGA.
Experimental results show that the online track reconstruction computation achieves 10.8 - 24.3 times performance acceleration per TPU module when compared to the software solution on a Xeon 2.4 GHz commodity server.
5.
  • Liu, Ming, 1982- (author)
  • Adaptive Computing based on FPGA Run-time Reconfigurability
  • 2011
  • Doctoral thesis (other academic/artistic). Abstract:
    • In the past two decades, FPGA has been witnessed from its restricted use as glue logic towards real System-on-Chip (SoC) platforms. Profiting from the great development on semiconductor and IC technologies, the programmability of FPGAs enables themselves wide adoption in all kinds of aspects of embedded designs. Modern FPGAs provide the additional capability of being dynamically and partially reconfigured during the system run-time. The run-time reconfigurability enhances FPGA designs from the sole spatial to both spatial and temporal parallelism, providing more design flexibility for advanced system features. Adaptive computing delegates an advanced computing paradigm in which computation tasks and resources are intelligently managed in correspondence with conditional requirements. In this thesis, we investigate adaptive designs on FPGA platforms: We present a comprehensive and practical design framework for adaptive computing based on the FPGA run-time reconfigurability. It concerns several design key issues in different hardware/software layers, specifically hardware architecture, run-time reconfiguration technical support, OS and device drivers, hardware process scheduler, context switching as well as Inter-Process Communications (IPC). Targeting a special application of data acquisition (DAQ) and trigger systems in nuclear and particle physics experiments, we set up the data streaming model and conduct theoretical analysis on the adaptive system. Three application studies are employed to verify the proposed adaptive design framework: The first application demonstrates a peripheral controller adaptable system aiming at general embedded designs. Through dynamically loading/unloading a NOR flash memory controller and an SRAM controller, both flash memory and SRAM accesses may be accomplished with less resource consumption than in traditional static designs. 
In the second case, two real algorithm processing engines are adaptively time-multiplexed in the same reconfigurable slot for particle recognition computation. Experimental results reveal the reduced on-chip resource requirements, as well as an approximate processing capability of the peer static design. Taking advantage of the FPGA dynamic reconfigurability, we present in the third application a novel on-FPGA interconnection microarchitecture named RouterLess NoC (RL-NoC). RL-NoC employs the novel design concept of Move Logic Not Data (MLND), and significantly distinguishes itself from the existing interconnection architectures such as buses, crossbars or NoCs. It does not rely on routers to deliver packets hop by hop as canonical NoCs do, but buffers data packets in virtual channels and brings various nodes using run-time reconfiguration to produce or consume them. In comparison with canonical packet-switching NoCs, the routerless architecture features lower design complexity, less resource consumption, higher work frequency, more efficient power dissipation as well as comparable or even higher packet delivery efficiency. It is regarded as a promising interconnection approach in some design scenarios on FPGAs, especially for light-weight applications.
6.
  • Shaoteng, Liu, 1984- (author)
  • New circuit switching techniques in on-chip networks
  • 2015
  • Doctoral thesis (other academic/artistic). Abstract:
    • Network on Chip (NoC) is proposed as a promising technology to address the communication challenges in deep sub-micron era. NoC brings network-based communication into the on-chip environment and tackles the problems like long wire complexities, bandwidth scaling and so on. After more than a decade's evolution and development, there are many NoC architectures and solutions available. Nevertheless, NoCs can be classified into two categories: packet switched NoC and circuit switched NoC. In this thesis, targeting circuit switched NoC, we present our innovations and considerations on circuit switched NoCs in three areas, namely, connection setup method, time division multiplexing (TDM) technology and spatial division multiplexing (SDM) technology. Connection setup technique deeply influences the architecture and performance of a circuit switched NoC, since circuit switched NoC requires to set up connections before launching data transfer. We propose a novel parallel probe based method for dynamic distributed connection setup. This setup method on one hand searches all the possible minimal paths in parallel. On the other hand, it also has a mechanism to reduce resource occupation during the path search process by reclaiming redundant paths. With this setup method, connections are more likely to be established because of the exploration on the path diversity. TDM based NoC constitutes a sub-category of circuit switched NoC. We propose a double time-wheel technique to facilitate a probe based connection setup in TDM NoCs. With this technique, path search algorithms used in connection setup are no longer limited to deterministic routing algorithms. Moreover, the hardware cost can be reduced, since setup requests and data flows can co-exist in one network. Apart from the double time-wheel technique for connection setup, we also propose a highway technique that can enhance the slot utilization during data transfer.
This technique can accelerate the transfer of a data flow while maintaining the throughput guarantee and the packet order. SDM based NoC constitutes another sub-category of circuit switched NoC. SDM NoC can benefit from high clock frequency and simple synchronization efforts. To better support the dynamic connection setup in SDM NoCs, we design a single cycle allocator for channel allocation inside each router. This allocator can guarantee both strong fairness and maximal matching quality. We also build up a circuit switched NoC, which can support multiple channels and multiple networks, to study different ways of organizing channels and setting up connections. Finally, we make a comparison between circuit switched NoC and packet switched NoC. We show the strengths and weaknesses on each of them by analysis and evaluation.
7.
  • She, Huimin, 1982- (author)
  • Network-Calculus-based Performance Analysis for Wireless Sensor Networks
  • 2009
  • Licentiate thesis (other academic/artistic). Abstract:
    • Recently, wireless sensor network (WSN) has become a promising technology with a wide range of applications such as supply chain monitoring and environment surveillance. It is typically composed of multiple tiny devices equipped with limited sensing, computing and wireless communication capabilities. Design of such networks presents several technique challenges while dealing with various requirements and diverse constraints. Performance analysis techniques are required to provide insight on design parameters and system behaviors. Based on network calculus, we present a deterministic analysis method for evaluating the worst-case delay and buffer cost of sensor networks. To this end, three general traffic flow operators are proposed and their delay and buffer bounds are derived. These operators can be used in combination to model any complex traffic flowing scenarios. Furthermore, the method integrates a variable duty cycle to allow the sensor nodes to operate at low rates thus saving power. In an attempt to balance traffic load and improve resource utilization and performance, traffic splitting mechanisms are introduced for mesh sensor networks. Based on network calculus, the delay and buffer bounds are derived in non-splitting and splitting scenarios. In addition, analysis of traffic splitting mechanisms are extended to sensor networks with general topologies. To provide reliable data delivery in sensor networks, retransmission has been adopted as one of the most popular schemes. We propose an analytical method to evaluate the maximum data transmission delay and energy consumption of two types of retransmission schemes: hop-by-hop retransmission and end-to-end retransmission. We perform a case study of using sensor networks for a fresh food tracking system. Several experiments are carried out in the Omnet++ simulation environment.
In order to validate the tightness of the two bounds obtained by the analysis method, the simulation results and analytical results are compared in the chain and mesh scenarios with various input traffic loads. From the results, we show that the analytic bounds are correct and tight. Therefore, network calculus is useful and accurate for performance analysis of wireless sensor network.
8.
  • Zhu, Jun, 1976- (author)
  • Performance Analysis and Implementation of Predictable Streaming Applications on Multiprocessor Systems-on-Chip
  • 2010
  • Doctoral thesis (other academic/artistic). Abstract:
    • Driven by the increasing capacity of integrated circuits, multiprocessor systems-on-chip (MPSoCs) are widely used in modern consumer electronics devices. In this thesis, the performance analysis and implementation methodologies are explored to design predictable streaming applications on MPSoCs computing platforms. The application functionality and concurrency are described in synchronous data flow (SDF) computational models, and two state-of-the-art architecture templates are adopted as multiprocessor architectures, i.e., network-on-chip (NoC) based MPSoC and hybrid reconfigurable CPU/FPGA platforms. Based on the author’s contributions on simulation and formal analytical methods, performance analysis and design space exploration for embedded MPSoCs architectures have been addressed. An energy efficient design space exploration flow is proposed for streaming applications with guaranteed throughput on NoC based MPSoCs, in which both application throughput analysis and system energy calculation are carried out by simulation on a multi-clocked synchronous modelling framework. On the other hand, based on event models of data streams, a formal analytical scheduling framework for real-time streaming applications with minimal buffer requirement on hybrid CPU/FPGA architectures is exploited. The scheduling problem has been formalized declaratively by constraint base techniques, and solved by a public domain constraint solver. Consecutively, the constraint based method has been extended to solve problems ranging from global computation/communication scheduling and reconfiguration analysis to Pareto efficient design. Finally, a prototype of stream processing system on FPGA based MPSoC is built to substantiate the results from theoretical studies in this thesis.
9.
  • Al Khatib, Iyad, 1975- (author)
  • Performance Analysis of Application-Specific Multicore Systems on Chip
  • 2008
  • Doctoral thesis (other academic/artistic). Abstract:
    • The last two decades have witnessed the birth of revolutionary technologies in data communications including wireless technologies, System on Chip (SoC), Multi Processor SoC (MPSoC), Network on Chip (NoC), and more. At the same time we have witnessed that performance does not always keep pace with expectations in many services like multimedia services and biomedical applications. Moreover, the IT market has suffered from some crashes. Hence, this triggered us to think of making use of available technologies and developing new ones so that the performance level is suitable for given applications and services. In the medical field, from a statistical viewpoint, the biggest diseases in number of deaths are heart diseases, namely Cardiovascular Disease (CVD) and Stroke. The application with the largest market for CVD is the electrocardiogram (ECG/EKG) analysis. According to the World Health Organization (WHO) report in 2003, 29.2% of global deaths are due to CVD and Stroke, half of which could be prevented if there was proper monitoring. We found in the new advance in microelectronics, NoC, SoC, and MPSoC, a chance of a solution for such a big problem. We look at the communication technologies, wireless networks, and MPSoC and realize that many projects can be founded, and they may affect people's lives positively, as for example, curing people more rapidly, as well as homecare of such large scale diseases. These projects have a medical impact as well as economic and social impacts. The intention is to use performance analysis of interconnected microelectronic systems and combine it with MPSoC and NoC technologies in order to evolve to new systems on chip that may make a difference. Technically, we aim at rendering more computations in less time, on a chip with smaller volume, and with less expense. The performance demand and the vision of having a market success, i.e.
contributing to lower healthcare costs, pose many challenges on the hardware/software co-design to meet these goals. This calls upon the development of new integrated circuits featuring increased energy efficiency while providing higher computation capabilities, i.e. better performance. The biomedical application of ECG analysis is an ideal target for an application-specific SoC implementation. However, new 12-lead ECG analyses algorithms are needed to meet the aforementioned goals. In this thesis, we present two novel algorithms for ECG analysis, namely the Autocorrelation-Function (ACF) based algorithm and the Fast Fourier Transform (FFT) based algorithm. In this respect, we explore the design space by analyzing different hardware and software architectures. As a result, we realize a design with twelve processors that can compute 3.5 million arithmetic computations and respect the real time hard deadline for our biomedical application (3.5-4 seconds), and that can deploy the ACF-based and FFT-based algorithms. Then, we investigate the configuration space looking for the most effective solution, performance and energy-wise. Consequently, we present three interconnect architectures (Single Bus, Full Crossbar, and Partial Crossbar) and compare them with existing solutions. The sampling frequencies of 2.2 KHz and 4 KHz, with 12 DSPs, are found to be the critical points for our Shared-Bus design and Crossbar architecture, respectively. We also show how our performance analysis methods can be applied to such a field of SoC design and with a specific purpose application in order to converge to a solution that is acceptable from a performance viewpoint, meets the real-time demands, and can be implemented with the present technologies while at the same time paving the way for easier and faster development.
In order to connect our MPSoC solution to communication networks to transmit the medical results to a healthcare center, we come up with new protocols that will allow the integration of multiple networks on chips in a communication network. Finally, we present a methodology for HW/SW Codesign for application-specific systems (with focus on biomedical applications) that require a large number of computations since this will foster the convergence to solutions that are acceptable from a performance point of view.
10.
  • Henriksson, Tomas, 1974- (author)
  • Hardware Architecture for Protocol Processing
  • 2001
  • Licentiate thesis (other academic/artistic). Abstract:
    • Protocol processing is increasingly important. Through the years the hardware architectures for network equipment have evolved constantly. It is important to make a difference between terminals and routers and the different processing tasks they encounter. It is also important to analyze in detail the functional coverage of a hardware architecture. The maximal supported line speed is also interesting and especially which functionality can be kept at this line speed. There are some types of hardware architectures that have gained much attention in research and from industry. Among these application specific instruction set computers, RISC with optimized instruction sets and reconfigurable hardware architectures are most often used. Very many network processors have been presented that aim for routers. So far not many protocol processors for terminals have been suggested. In terminals the requirements are different, for example low power consumption is very important for battery powered terminals. I and my colleagues have proposed a novel way to build a protocol processor for a terminal. The main concept is to use an array of reconfigurable functional pages, which are connected in a deep pipeline. This deep pipeline serial processor is supported by a micro controller for exception handling and configuration tasks. The most performance-critical functional page in an Ethernet TCP/IP environment is the cyclic redundancy check. We allocated and scheduled the cyclic redundancy check in parallel with other functions. After having investigated different solutions we found that our functional page for cyclic redundancy check can manage 10 Gb/s, if a 0.15 micron manufacturing process is used in combination with optimized RTL code and synthesis. Our architecture allows extensive parallel operation. The functionality is partitioned into the autonomous functional pages, which work in parallel. This reduces control overhead and simplifies the verification process.
Low control overhead and extensively parallel computations admit low-power operation. The designed processor handles reception processing on a single packet or frame. It works in parallel with the host processor and significantly reduces the workload on the host processor. The designed processor always operates at line speed and supports up to 10 Gb/s.