↓ Direkt till sidans innehåll
↓ Direkt till sidans sekundära innehåll (sidomenyn)

Träfflista för sökning "(db:Swepub) pers:(Lu Zhonghai) conttype:(scientificother) srt2:(2020-2022)"

Sökning: (db:Swepub) pers:(Lu Zhonghai) conttype:(scientificother) > (2020-2022)

Resultat 1-4 av 4

Sortera/gruppera träfflistan

Sortering: Träffar per sida:

Numrering	Referens	Omslagsbild	Hitta
1.	Lu, Zhonghai (författare) Guest Editorial : IEEE TC Special Issue On Communications for Many-core Processors and Accelerators 2021 Ingår i: IEEE Transactions on Computers. - : Institute of Electrical and Electronics Engineers (IEEE). - 0018-9340 .- 1557-9956. ; 70:6, s. 817-818 Tidskriftsartikel (övrigt vetenskapligt/konstnärligt)
2.	Wang, Boqian, 1990- (författare) High-Performance Network-on-Chip Design for Many-Core Processors 2020 Licentiatavhandling (övrigt vetenskapligt/konstnärligt)abstract With the development of on-chip manufacturing technologies and the requirements of high-performance computing, the core count is growing quickly in Chip Multi/Many-core Processors (CMPs) and Multiprocessor System-on-Chip (MPSoC) to support larger scale parallel execution. Network-on-Chip (NoC) has become the de facto solution for CMPs and MPSoCs in addressing the communication challenge. In the thesis, we tackle a few key problems facing high-performance NoC designs.For general-purpose CMPs, we encompass a full system perspective to design high-performance NoC for multi-threaded programs. By exploring the cache coherence under the whole system scenario, we present a smart communication service called Advance Virtual Channel Reservation (AVCR) to provide a highway to target packets, which can greatly reduce their contention delay in NoC. AVCR takes advantage of the fact that we can know or predict the destination of some packets ahead of their arrival at the Network Interface (NI). Exploiting the time interval before a packet is ready, AVCR establishes an end-to-end highway from the source NI to the destination NI. This highway is built up by reserving the Virtual Channel (VC) resources ahead of the target packet transmission and offering priority service to flits in the reserved VC in the wormhole router, which can avoid the target packets’ VC allocation and switch arbitration delay. Besides, we also propose an admission control method in NoC with a centralized Artificial Neural Network (ANN) admission controller, which can improve system performance by predicting the most appropriate injection rate of each node using the network performance information. In the online control process, a data preprocessing unit is applied to simplify the ANN architecture and make the prediction results more accurate. Based on the preprocessed information, the ANN predictor determines the control strategy and broadcasts it to each node where the admission control will be applied.For application-specific MPSoCs, we focus on developing high-performance NoC and NI compatible with the common AMBA AXI4 interconnect protocol. To offer the possibility of utilizing the AXI4 based processors and peripherals in the on-chip network based system, we propose a whole system architecture solution to make the AXI4 protocol compatible with the NoC based communication interconnect in the many-core system. Due to possible out-of-order transmission in the NoC interconnect, which conflicts with the ordering requirements specified by the AXI4 protocol, in the first place, we especially focus on the design of the transaction ordering units, realizing a high-performance and low cost solution to the ordering requirements. The microarchitectures and the functionalities of the transaction ordering units are also described and explained in detail for ease of implementation. Then, we focus on the NI and the Quality of Service (QoS) support in NoC. In our design, the NI is proposed to make the NoC architecture independent from the AXI4 protocol via message format conversion between the AXI4 signal format and the packet format, offering high flexibility to the NoC design. The NoC based communication architecture is designed to support high-performance multiple QoS schemes. The NoC system contains Time Division Multiplexing (TDM) and VC subnetworks to apply multiple QoS schemes to AXI4 signals with different QoS tags and the NI is responsible for traffic distribution between two subnetworks. Besides, a QoS inheritance mechanism is applied in the slave-side NI to support QoS during packets’ round-trip transfer in NoC.
3.	Yang, Yu (författare) High-Level Synthesis for SiLago : Advances in Optimization of High-Level Synthesis Tool and Neural Network Algorithms 2022 Doktorsavhandling (övrigt vetenskapligt/konstnärligt)abstract Embedded hardware designs and their automation improve energy and engineering efficiency. However, these two goals are often contradictory. The attempts to improve energy efficiency often come at the cost of engineering efficiency and vice-versa. High-level synthesis (HLS) is a good example of this challenge. It has been researched for more than three decades. Nevertheless, it has not become a mainstream design flow component concerning custom hardware synthesis due to the big efficiency gap between the HLS-generated hardware design and the manual RTL design.This thesis attempts to address the HLS challenge. We divide the research challenge of improving state-of-the-art HLS into three components: 1) the hardware architecture and its underlying VLSI design style, 2) the design automation algorithms and data structures, and 3) the optimization of the algorithm to be mapped.The SiLago hardware platform has been reported as a prominent hardware architecture that can deliver ASIC-like efficiency and could be an ideal HLS hardware platform. It has the following features: 1) SiLago embodies parallel distributed two-level control. 2) SiLago blocks are hardened blocks that can create valid VLSI designs by abutment without involving logic or physical synthesis.Consequently, when targeting the SiLago hardware platform, the SiLago HLS tool generates not a single controller but multiple collaborative controllers, each of which is a hierarchy of two levels. The distributed two-level control scheme poses unique challenges in synchronization and scheduling tasks. Unique data structures and instruction scheduling models are developed for the SiLago HLS tool to support the distributed two-level control scheme. The SiLago HLS tool also generates a valid GDSII macro whose average energy, area, and performance are not estimated but known with post-layout accuracy thanks to the predictable SiLago hardware blocks. Moreover, the SiLago HLS tool is not intended for the end-user. It is designed to develop a library of algorithm implementations used by the application-level synthesis (ALS) tool in the SiLago framework. The application is defined as a hierarchy of algorithms. This library would include algorithms that vary in their function, dimension, and degree of parallelism. The ALS tool explores the design space in terms of number and type of algorithm implementation, rather than arithmetic resources, as HLS tools do.Algorithms are often developed by domain experts. For efficient implementation in hardware, such algorithms often need to be optimized with the hardware platform in mind. Two algorithm instances have been chosen for demonstration purposes. The first instance is a self-organizing map (SOM) based genome recognition algorithm. The second example concerns a complex model of cortex called Bayesian confidence propagation neural network (BCPNN). As developed by computational neuroscientists, the original model demands too much memory storage and memory access.This thesis addresses the latter two components because the first component has been addressed in the literature. We will first demonstrate the design of the SiLago HLS tool to support the hardware features like the distributed two-level control system. Moreover, we will use the two complex algorithm instances -- SOM and BCPNN, to demonstrate both general-purpose and algorithm-specific hardware-oriented algorithm optimization techniques. With the research carried out in this thesis, the SiLago HLS framework is greatly improved.
4.	Yu, Yang, 1991- (författare) Design and Security Analysis of TRNGs and PUFs 2022 Doktorsavhandling (övrigt vetenskapligt/konstnärligt)abstract True Random Number Generators (TRNGs) and Physical Unclonable Functions (PUFs) are two important types of cryptographic primitives. TRNGs create a hardware-based, non-deterministic noise that is often used for generating keys, initialization vectors, and nonces for various applications that require cryptographic protection. PUFs have been proposed as a tamper-resistant alternative to the traditional secret key generation and challenge-response authentication methods. A compromised TRNG or PUF can lead to a system-wide loss of security.The conventional TRNG or PUF designs are challenged by new attack vectors such as deep learning-based side-channel analysis. In this dissertation, we propose several new PUF and TRNG designs and evaluations of their performance and security.The first PUF we introduce is called threshold PUF. We show that, in principle, any n-input threshold logic gate can be used as a base for building an n-input PUF. We implement and evaluate a threshold PUF based on recently proposed threshold logic flip-flops using SPICE simulation as a proof of concept. Threshold PUFs open up the possibility of using the rich body of knowledge on threshold logic implementations for designing PUFs. The second proposed design is a lightweight PUF construction called CRC-PUF, which focuses on protecting PUFs against machine learning-based modeling attacks. In CRC-PUF, input challenges are de-synchronized from output responses to make the PUF model difficult to learn. The input transformation which does the de-synchronization is based on a Cyclic Redundancy Check (CRC), thus the name CRC-PUF. By changing the CRC generator polynomial for each new response, we assure that recovering the transforming challenge has a success probability of at most 2-86 for 128-bit challenge-response pairs.The first TRNG design we introduce is based on a Non-Linear Feedback Ring Oscillator (NLFRO). The proposed NLFRO-TRNG structure harvests randomness from noise and unpredictable variations in delay cells and bi-stable elements, which is further amplified by the formation of non-linear feedback loops. The NLFRO outputs have chaotic behavior, allowing the construction of TRNGs with high entropy and speed. We implement three NLFRO-TRNGs on FPGA and evaluate the properties of the implementations with the NIST 800-90B entropy estimation and NIST 800-22 statistical test suits. The second proposed TRNG design is based on a strong PUF. The PUF based TRNG exploits the inherent determinism of PUF to enable in-field testing of the entropy sources by known answer tests. We present a prototype FPGA implementation of the proposed TRNG based on an arbiter PUF that passes all NIST 800-22 statistical tests and has the minimal entropy of 0.918 estimated according to NIST 800-90B recommendations.Apart from TRNG and PUF designs, it is crucial to consider potential attack vectors that can be created leveraging recently emerged technologies. To that end, in the second part of this dissertation, we introduce a novel attack on FPGA-based PUF and TRNG implementations that combines bitstream modification along with deep learning-based side-channel analysis. We evaluate this new attack vector on the design of an arbiter PUF and a ring oscillator-based TRNG implemented on Xilinx Artix-7 28nm FPGAs. In both cases, we are able to achieve close to 100% classification accuracy to recover the output or response. In the case of the arbiter PUF, the attack can even overcome countermeasures that are based on encrypting the challenges or responses.With such potent attack vectors readily available, the construction of strong countermeasures is necessary. Unfortunately, many of the state-of-the-art countermeasures are one-sided. In the final part of the dissertation, we use a countermeasure proposed for the protection of the Advanced Encryption Standard as an example. We conduct experiments and conclude that it can assist another type of side-channel attack that is not considered by the countermeasure.

Skapa referenser, mejla, bekava och länka

Länka till träfflistan

Resultat 1-4 av 4

Avgränsa träffmängd

Typ av publikation: doktorsavhandling (2); tidskriftsartikel (1); licentiatavhandling (1)

Typ av innehåll: övrigt vetenskapligt/konstnärligt (4)

Författare/redaktör: Lu, Zhonghai (2); Lu, Zhonghai, Profes ... (2); Hemani, Ahmed, 1961- (1); Yang, Yu (1); Dubrova, Elena, Prof ... (1); Wang, Boqian, 1990- (1); visa fler...; Chen, Kun-Chih, Asso ... (1); Purnaprajna, Madhura ... (1); Yu, Yang, 1991- (1); Polian, Ilia, Profes ... (1); visa färre...

Lärosäte: Kungliga Tekniska Högskolan (4)

Språk: Engelska (4)

Forskningsämne (UKÄ/SCB): Teknik (3); Naturvetenskap (2)

År

Kungliga biblioteket hanterar dina personuppgifter i enlighet med EU:s dataskyddsförordning (2018), GDPR. Läs mer om hur det funkar här.
Så här hanterar KB dina uppgifter vid användning av denna tjänst.

Copyright © LIBRIS - Nationella bibliotekssystem
LIBRIS.kb.se

pil uppåt

Stäng

Kopiera och spara länken för att återkomma till aktuell vy