SwePub

Result list for search "WFRF:(Hemani Ahmed)"

  • Result 1-50 of 287
1.
  • Abbas, Haider, et al. (author)
  • A Structured Approach for Internalizing Externalities Caused by IT Security Mechanisms
  • 2010
  • In: IEEE ETCS 2010. - Wuhan, China. ; , s. 149-153
  • Conference paper (peer-reviewed)abstract
    • Organizations relying on Information Technology for their business processes have to employ various security mechanisms (authentication, authorization, hashing, encryption, etc.) to achieve their organizational security objectives of data confidentiality, integrity and availability. Apart from their intended role of increasing the organization's security level, these mechanisms may also affect other systems outside the organization, positively or negatively; such effects are called externalities. Externalities emerge in several forms, i.e., direct cost, direct benefit, indirect cost and indirect benefit. Organizations barely consider positive externalities, although they can be beneficial, and simply ignore the negative externalities that could create vulnerabilities. In this paper, we present an infrastructure to streamline information security externalities that appear dynamically for an organization.
  •  
2.
  • Abbas, Haider, et al. (author)
  • Adaptability Infrastructure for Bridging IT Security Evaluation and Options Theory
  • 2009
  • In: ACM- IEEE SIN 2009 International Conference on Security of Information and Networks. - North Cyprus : ACM Press. - 9781605584126
  • Conference paper (peer-reviewed)abstract
    • The constantly rising threats in IT infrastructure raise many concerns for an organization; the most striking are altering security requirements according to a dynamically changing environment, the need for midcourse decision management, and deliberate evaluation of security measures. Common Criteria for IT security evaluation has long been considered to suffer from uncertain IT infrastructure and is regarded as a resource-hungry, complex and time-consuming process. Considering this aspect, we have continued our research into the opportunities to empower the IT security evaluation process using Real Options thinking. The focus of our research is not only the applicability of real options analysis in IT security evaluation but also its implications in various domains, including IT security investments and risk management. We find it worthwhile to use an established method from corporate finance, i.e., real options, and to utilize its rule-of-thumb technique as a road map to counter uncertainty issues in the evaluation of IT products. We believe employing options theory in security evaluation will provide the intended benefits, i.e., i) managing dynamically changing security requirements, ii) accelerating the evaluation process, and iii) midcourse decision management. Although it has all the capabilities of effective uncertainty management, options theory follows work procedures based on mathematical calculations that are quite different from information security work processes. In this paper, we address the differences between the work processes of security evaluation and real options analysis. We present an adaptability infrastructure to bridge the gap and make them coherent with each other. This liaison transforms real options concepts into a compatible mode that provides grounds to target IT security evaluation and Common Criteria issues. We use the ESAM system as an example to illustrate the applicability of the concepts.
  •  
3.
  •  
4.
  • Abbas, Haider, et al. (author)
  • Addressing Dynamic Issues in Information Security Management
  • 2011
  • In: Information Management & Computer Security. - UK : Emerald Group Publishing Limited. - 0968-5227 .- 1758-5805. ; 19:1, s. 5-24
  • Journal article (peer-reviewed)abstract
    • A framework for handling uncertainty in information security management systems is presented. The framework is based on theories from corporate finance. A case study shows how the framework can be applied.
  •  
5.
  •  
6.
  •  
7.
  • Abbas, Haider, et al. (author)
  • Architectural Description of an Automated System for Uncertainty Issues Management in Information Security
  • 2010
  • In: International Journal of computer Science and Information Security. - USA. - 1947-5500. ; 8:3, s. 59-67
  • Journal article (peer-reviewed)abstract
    • Information technology evolves at a fast pace, giving organizations limited scope to comprehend and effectively react to the steady flux of its progress. Consequently, the rapid technological progression raises various concerns for the IT system of an organization, i.e., obsolescence of existing hardware/software, uncertain system behavior, interoperability of various components/methods, sudden changes in IT security requirements and expiration of security evaluations. These issues are continuous and critical in nature; they create uncertainty in IT infrastructure and threaten the IT security measures of an organization. In this research, options theory is applied to address uncertainty issues in IT security management, and the concepts have been developed and validated through real cases on the SHS (Spridnings- och Hämtningssystem) and ESAM (E-society) systems. AUMSIS (Automated Uncertainty Management System in Information Security), the ultimate objective of this research, provides an automated system for uncertainty management in information security. The paper presents the architectural description of AUMSIS, its various components, information flow, storage and information processing details using options valuation techniques. It also presents heterogeneous information retrieval problems and their solution. The architecture is validated with examples from the SHS system.
  •  
8.
  • Abbas, Haider, et al. (author)
  • DUDE: Decryption, Unpacking, Deobfuscation, and Endian Conversion Framework for Embedded Devices Firmware
  • 2023
  • In: IEEE Transactions on Dependable and Secure Computing. - : Institute of Electrical and Electronics Engineers (IEEE). - 1545-5971 .- 1941-0018.
  • Journal article (peer-reviewed)abstract
    • Commercial-Off-The-Shelf (COTS) embedded devices rely on vendor-specific firmware to perform essential tasks. These firmware have been under active analysis by researchers to check security features and identify possible vendor backdoors. However, consistently unpacking newly created filesystem formats has been exceptionally challenging. To thwart attempts at unpacking, vendors frequently use encryption and obfuscation methods. On the other hand, when handling encrypted, obfuscated, big endian cramfs, or custom filesystem formats found in firmware under test, the available literature and tools are insufficient. This study introduces DUDE, an automated framework that provides novel functionalities, outperforming cutting-edge tools in the decryption, unpacking, deobfuscation, and endian conversion of firmware. For big endian compressed romfs filesystem formats, DUDE supports endian conversion. It also supports deobfuscating obfuscated signatures for successful unpacking. Moreover, decryption support for encrypted binaries from the D-Link and MOXA series has also been added, allowing for easier analysis and access to the contents of these firmware files. Additionally, the framework offers unpacking assistance by supporting the extraction of special filesystem formats commonly found in firmware samples from various vendors. A remarkable 78% (1424 out of 1814) firmware binaries from different vendors were successfully unpacked using the suggested framework. This performance surpasses the capabilities of commercially available tools combined on a single platform.
  •  
9.
  •  
10.
  •  
11.
  • Abbas, Haider, et al. (author)
  • Option Based Evaluation: Security Evaluation of IT Products Based on Options Theory
  • 2009
  • In: IEEE  ECBS-EERC 2009. - New York : IEEE. - 9781424446773 ; , s. 134-141
  • Conference paper (peer-reviewed)abstract
    • Reliability of IT systems and infrastructure is a critical need for organizations to trust their business processes. This makes security evaluation of IT systems a prime concern for these organizations. Common Criteria (CC) is an elaborate, globally accepted security evaluation process that fulfills this need. However, CC rigidly follows the initial specification and security threats, takes too long to evaluate and, as such, is also very expensive. Rapid development in technology, and with it new security threats, further aggravates the long evaluation time problem of CC to the extent that by the time a CC evaluation is done, it may no longer be valid because new security threats have emerged that have not been factored in. To address these problems, we propose a novel Option Based Evaluation (OBE) methodology for security of IT systems that can also be considered an enhancement to the CC process. The objective is to address uncertainty issues in the IT environment and speed up the slow CC-based evaluation processes. OBE follows an incremental evaluation model and addresses the following main concerns based on options theory: i) managing dynamic security requirements with mid-course decision management, ii) devising evaluation as an improvement process, and iii) reducing cost and time for evaluation of an IT product.
  •  
12.
  • Abbas, Haider, 1979- (author)
  • Options-Based Security-Oriented Framework for Addressing Uncertainty Issues in IT Security
  • 2010
  • Doctoral thesis (other academic/artistic)abstract
    • Continuous development and innovation in Information Technology introduces novel configuration methods, software development tools and hardware components. This steady state of flux is very desirable as it improves productivity and the overall quality of life in societies. However, the same phenomenon also gives rise to unseen threats, vulnerabilities and security concerns that are becoming more critical with the passage of time. As an implication, technological progress strongly impacts organizations’ existing information security methods, policies and techniques, making obsolete existing security measures and mandating reevaluation, which results in an uncertain IT infrastructure. In order to address these critical concerns, options-based reasoning borrowed from corporate finance is proposed and adapted for evaluation of security architecture and decision-making to handle them at organizational level. Options theory has provided significant guidance for uncertainty management in several domains, such as Oil & Gas, government R&D and IT security investment projects. We have applied options valuation technique in a different context to formalize optimal solutions in uncertain situations for three specific and identified uncertainty issues in IT security. In the research process, we formulated an adaptation model for expressing options theory in terms useful for IT security which provided knowledge to formulate and propose a framework for addressing uncertainty issues in information security. To validate the efficacy of this proposed framework, we have applied this approach to the SHS (Spridnings- och Hämtningssystem) and ESAM (E-Society) systems used in Sweden. 
As an ultimate objective of this research, we intend to develop a solution that is amenable to automation for the three main problem areas caused by technological uncertainty in information security: i) dynamically changing security requirements, ii) externalities caused by a security system, iii) obsoleteness of evaluation. The framework is general and capable of dealing with other uncertainty management issues and their solutions, but in this work we primarily deal with the three aforementioned uncertainty problems. The thesis presents an in-depth background and analysis study for a proposed options-based security-oriented framework with case studies for SHS and ESAM systems. It has also been assured that the framework formulation follows the guidelines from industry best practices criteria/metrics. We have also proposed how the whole process can be automated as the next step in development.
  •  
13.
  •  
14.
  • Abbas, Haider, et al. (author)
  • Security Evaluation of IT Products : Bridging the Gap between Common Criteria (CC) and Real Option Thinking
  • 2008
  • In: WCECS 2008. - 9789889867102 ; , s. 530-533
  • Conference paper (peer-reviewed)abstract
    • Information security has long been considered a key concern for organizations benefiting from the electronic era. Rapid technological developments over the last decade have given rise to novel security threats, making IT an uncertain infrastructure. For this reason, business organizations have an acute need to evaluate the security aspects of their IT infrastructure. For many years, CC (Common Criteria) has been widely used and accepted for evaluating the security of IT products. It does not impose predefined security rules that a product should exhibit but provides a language for security evaluation. CC has certain advantages over ITSEC, CTCPEC and TCSEC due to its ability to address all three dimensions: a) it provides an opportunity for users to specify their security requirements, b) it provides an implementation guide for developers, and c) it provides comprehensive criteria to evaluate the security requirements. Among the few notable shortcomings of CC are the amount of resources and time it consumes. Another drawback of CC is that the security requirements in this uncertain IT environment must be defined before the project starts. ROA (real options analysis) is a well-known modern methodology used to make investment decisions for projects under uncertainty. It is based on options theory, which not only provides strategic flexibility but also helps to consider hidden options during uncertainty. ROA comes in two flavors: the first for financial option pricing and the second for more uncertain real-world problems where the end results are not deterministic. Information security is one of the core areas in which researchers are employing ROA to make security investment decisions. In this paper, we give a brief introduction to ROA and its use in various domains. 
We evaluate the use of real-options-based methods to enhance the Common Criteria evaluation methodology, to manage dynamic security requirement specification and to reduce the required time and resources. We analyze the possibilities to overcome CC limitations from the perspective of the end user, developer and evaluator. We believe that the ROA-enhanced capabilities will potentially be able to stop and possibly reverse this trend and strengthen CC usage with a more effective and responsive evaluation methodology.
  •  
15.
  •  
16.
  • Altayo Gonzalez, u1dr0yqp, et al. (author)
  • Synthesis of Predictable Global NoC by Abutment in Synchoros VLSI Design
  • 2021
  • In: Proceedings - 2021 15th IEEE/ACM International Symposium on Networks-on-Chip, NOCS 2021. - New York, NY, USA : Association for Computing Machinery (ACM). ; , s. 61-66
  • Conference paper (peer-reviewed)abstract
    • The synchoros VLSI design style has been proposed as an alternative to the standard cell-based design style; the word synchoros is derived from the Greek word choros for space. Synchoricity discretises space with a virtual grid, the way synchronicity discretises time with clock ticks. SiLago (Silicon Lego) blocks are atomic synchoros building blocks, like Lego bricks. SiLago blocks absorb all metal layer details, i.e., all wires, to enable composition by abutment of valid VLSI designs; valid in the sense of being technology design rules compliant, timing clean and OCV ruggedized. Effectively, composition by abutment eliminates logic and physical synthesis for the end user. Like the Lego system, synchoricity does need a finite number of SiLago block types to cater to different types of designs. Global NoCs are important system-level design components. In this paper, we show how, with a small library of SiLago blocks for global NoCs, it is possible to automatically synthesize arbitrary global NoCs of different types, dimensions, and topology. The synthesized global NoCs are not only valid VLSI designs, but their cost metrics (area, latency, and energy) are known with post-layout accuracy in linear time. We argue that this is essential to be able to do chip-level design space exploration. We show how the abstract timing model of such global NoC SiLago blocks can be built and used to analyse the timing of global NoC links with post-layout accuracy and in linear time. We validate this claim by subjecting the same VLSI designs of global NoC to commercial EDA static timing analysis and show that the abstract timing analysis enabled by synchoros VLSI design gives the same results as the commercial EDA tools.
  •  
17.
  • Anagnostopoulos, I., et al. (author)
  • Power-Aware Dynamic Memory Management on Many-Core Platforms Utilizing DVFS
  • 2013
  • In: ACM Transactions on Embedded Computing Systems. - : Association for Computing Machinery (ACM). - 1539-9087 .- 1558-3465. ; 13:1, s. 40-
  • Journal article (peer-reviewed)abstract
    • Today, multicore platforms are already prevalent solutions for modern embedded systems. In the future, embedded platforms will have an even higher processor core count, composing many-core platforms. In addition, applications are becoming more complex and dynamic and try to efficiently utilize the available resources on the embedded platforms. Efficient memory utilization is a key challenge for application developers, especially since memory is a scarce resource and often becomes the system's bottleneck. To cope with this dynamism and achieve better memory footprint utilization (low memory fragmentation), application developers resort to dynamic memory (heap) management techniques, allocating and deallocating data at runtime. Moreover, overall power consumption is another key challenge that needs to be taken into consideration. To this end, designers employ Dynamic Voltage and Frequency Scaling (DVFS) mechanisms, adapting to the application's computational demands at runtime. In this article, we propose the combination of dynamic memory management techniques with DVFS ones. This is performed by integrating, within the memory manager, runtime monitoring mechanisms that steer the DVFS mechanisms to adjust clock frequency and voltage supply based on heap performance. The proposed approach has been evaluated on a distributed shared-memory many-core platform composed of multiple LEON3 processors interconnected by a Network-on-Chip infrastructure, supporting DVFS. Experimental results show that by using the proposed method for monitoring and applying DVFS mechanisms, the power consumption concerning dynamic memory management was reduced by approximately 37%. In addition, we present the trade-offs of the proposed approach. Last, by combining the developed method with heap fragmentation-aware dynamic memory managers, we achieve low heap fragmentation values combined with low power consumption.
  •  
18.
  • Anwar, Hassan, et al. (author)
  • Exploring Spiking Neural Network on Coarse-Grain Reconfigurable Architectures
  • 2014
  • In: ACM International Conference Proceeding Series. - New York, NY, USA : ACM. - 9781450328227 ; , s. 64-67
  • Conference paper (peer-reviewed)abstract
    • Today, reconfigurable architectures are becoming increasingly popular as the candidate platforms for neural networks. Existing works that map neural networks on reconfigurable architectures only address either FPGAs or Networks-on-chip, without any reference to Coarse-Grain Reconfigurable Architectures (CGRAs). In this paper we investigate the overheads imposed by implementing spiking neural networks on a Coarse-Grained Reconfigurable Architecture (CGRA). Experimental results (using point-to-point connectivity) reveal that up to 1000 neurons can be connected, with an average response time of 4.4 msec.
  •  
19.
  • Azad, S. P., et al. (author)
  • Customization methodology of a Coarse Grained Reconfigurable architecture
  • 2015
  • In: NORCHIP 2014 - 32nd NORCHIP Conference. - 9781479954421
  • Conference paper (peer-reviewed)abstract
    • Mapping algorithms on CGRAs can lead to an inefficient implementation and hardware under-utilization if there is a mismatch between the granularity of the reconfigurable processing unit and the algorithm. In this paper, we introduce a tool that takes the hardware configuration of a set of applications, identifies the unused parts of the CGRA, and lets the user sweep the design space from fully programmable to fully customized by eliminating the unused components. The user can select among multiple design points according to the application specification. This method is very useful for designing multi-mode ASIC accelerators. The fully customized hardware generated using our tool has a negligible area and power overhead compared to the equivalent ASIC but can be generated significantly faster.
  •  
20.
  • Baccelli, Guido, et al. (author)
  • NACU : A Non-Linear Arithmetic Unit for Neural Networks
  • 2020
  • In: PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC). - : IEEE.
  • Conference paper (peer-reviewed)abstract
    • Reconfigurable architectures targeting neural networks are an attractive option. They allow multiple neural networks of different types to be hosted on the same hardware, in parallel or in sequence. Reconfigurability also grants the ability to morph into different micro-architectures to meet varying power-performance constraints. In this context, the need for a reconfigurable non-linear computational unit has not been widely researched. In this work, we present a formal and comprehensive method to select the optimal fixed-point representation to achieve the highest accuracy against the floating-point implementation benchmark. We also present a novel design of an optimised reconfigurable arithmetic unit for calculating non-linear functions. The unit can be dynamically configured to calculate the sigmoid, hyperbolic tangent, and exponential function using the same underlying hardware. We compare our work with the state-of-the-art and show that our unit can calculate all three functions without loss of accuracy.
  •  
21.
  • Badawi, Mohammad, et al. (author)
  • A Coarse-Grained Reconfigurable Protocol Processor
  • 2011
  • In: International Symposium on System-on-Chip, 2011. Proceedings.
  • Conference paper (peer-reviewed)abstract
    • The trade-off between flexibility and performance has become an important factor in characterizing modern protocol processing architectures. While some solutions, like GPPs, tend to be more flexible and less computationally efficient, other solutions, like custom ASIC devices, provide high computational efficiency while losing the ability to cope with the diversity of current and evolving protocols. We propose a reconfigurable protocol processor that is flexible and highly adaptable to the needs of the required protocol, with the ability to operate individually or integrated with other processors as a multi-core. We show how a common protocol processing task that consumes one third of RISC CPU time can be performed on our processor at high speed and low energy cost.
  •  
22.
  • Badawi, Mohammad, 1981- (author)
  • Adaptive Coarse-grain Reconfigurable Protocol Processing Architecture
  • 2016
  • Doctoral thesis (other academic/artistic)abstract
    • Digital signal processors and their variants have provided significant benefit to efficient implementation of the Physical Layer (PHY) of the Open Systems Interconnection (OSI) model’s seven-layer protocol processing stack compared to general purpose processors. Protocol processors promise to provide a similar advantage for implementing higher layers in the OSI seven-layer model. This thesis addresses the problem of designing customizable coarse-grain reconfigurable protocol processing fabrics as a solution to achieving high performance and computational efficiency. A key requirement that this thesis addresses is the ability to adapt not only to varying applications and standards, and different modes in each standard, but also to time-varying load and performance demands while maintaining quality of service. This thesis presents a tile-based multicore protocol processing architecture that can be customized at design time to meet the requirements of the target application. The architecture can then be reconfigured at boot time and tuned to suit the desired use-case. This architecture includes a packet-oriented memory system that has deterministic access time and access energy costs, and hence can be accurately dimensioned to fulfill the requirements of the desired use-case. Moreover, to maintain quality of service as predicted, while minimizing the use of energy and resources, this architecture encompasses an elastic management scheme that controls run-time configuration to deploy processing resources based on use-case and traffic demands. To evaluate the architecture presented in this thesis, different case studies were conducted while quantitative and qualitative metrics were used for assessment. Energy-delay product, energy efficiency, area efficiency and throughput show the improvements that were achieved using the processing cores and the memory of the presented architecture, compared with other solutions. 
Furthermore, the results show the reduction in latency and power consumption required to evaluate controlling states when using the elastic management scheme. The elasticity of the scheme also resulted in reducing the total area required for the controllers that serve multiple processing cores in comparison with other designs. Finally, the results validate the ability of the presented architecture to support quality of service without misutilizing available energy during a real-life case study of a multi-participant Voice Over Internet Protocol (VOIP) call.
  •  
23.
  • Badawi, Mohammad, et al. (author)
  • Customizable Coarse-grained Energy-efficient Reconfigurable Packet Processing Architecture
  • 2014
  • In: Proceedings Of The 2014 IEEE 25th International Conference on Application-specific Systems, Architectures and Processors (ASAP). - : IEEE. ; , s. 30-35
  • Conference paper (peer-reviewed)abstract
    • In this paper, we present a highly customizable and rapidly reconfigurable multi-core packet processing architecture that provides energy and area efficiency while retaining flexibility. The presented architecture, with its agile reconfigurability, permits time-critical adaptability where resources can be re-clustered at run time in a few cycles, hence maintaining efficiency if requirements of the use-case change. We elaborate on the flexibility and adaptability of our architecture and report its evaluation results. For evaluation, we performed widely used UDP/IP processing and compared our proposed architecture to low-power 32-bit general purpose processors, a custom ASIC implementation and a programmable protocol processor. Compared to GPP-based solutions, our architecture is 20-34 times more energy efficient while providing 2.4-4.1 times higher throughput. While retaining programmability, the proposed solution achieved 78% of the energy efficiency of the hardwired ASIC implementation. Compared to a programmable protocol processor, our solution has 2.6 times more throughput and requires only a third of the gate count. Lastly, we quantified the worst-case and average-case time required for time-critical adaptability when reconfiguration occurs during real-life Voice-over-IP traffic.
  •  
24.
  • Badawi, Mohammad, et al. (author)
  • Elastic Management and QoS Provisioning Scheme for Adaptable Multi-core Protocol Processing Architecture
  • 2016
  • In: 19TH EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN (DSD 2016). - : IEEE. - 9781509028160 ; , s. 575-583
  • Conference paper (peer-reviewed)abstract
    • Adaptable protocol processing architectures can offer quality-of-service (QoS) while improving energy efficiency and resource utilization. However, a key condition for adaptable architectures to support QoS is that the latency required for processor adaptation does not result in violating the packet processing delay bound. Moreover, adaptation latency must not cause packets to accumulate until memory becomes full and packets are dropped. In this paper, we present an elastic management scheme for an agile adaptable multi-core protocol processing architecture to facilitate processor adaptation when QoS has to be maintained. The proposed management scheme encompasses a set of reconfigurable finite state machines (FSMs), each dimensioned to be associated with a single processing element (PE). During processor adaptation, the needed FSMs can rapidly be clustered to provide the control needed for the newly adapted structure. We use a real-life application to demonstrate how our proposed management scheme supports maintaining QoS during processor adaptation. We also quantify the time needed for processor adaptation as well as the reduction in energy, latency and area achieved when using our scheme.
  •  
25.
  • Badawi, Mohammad, et al. (author)
  • Quality-of-service-aware adaptation scheme for multi-core protocol processing architecture
  • 2017
  • In: Microprocessors and microsystems. - : Elsevier. - 0141-9331 .- 1872-9436. ; 54, s. 47-59
  • Journal article (peer-reviewed)abstract
    • Employing adaptable protocol processing architectures has shown a high potential in provisioning Quality-of-Service (QoS) while retaining efficient use of the available energy budget. Nevertheless, successful QoS provisioning using adaptable protocol processing architectures requires adaptation to be agile and to have low latency. That is, a long adaptation latency might lead to violating the desired packet processing latency or the desired throughput, or to loss of packets if the memory fails to accommodate packet accumulation. This paper presents an elastic management scheme to permit agile and QoS-aware adaptation of processing elements (PEs) within the protocol processing architecture, such that the desired QoS is maintained. Moreover, our proposed scheme has the potential to reduce energy consumption since it employs the PEs upon demand. We quantify the latency required for PE adaptation, the reduction in energy and the reduction in area that can be achieved using our scheme. We also consider two different real-life use cases to demonstrate the effectiveness of our proposed management scheme in maintaining QoS while conserving available energy.
  •  
26.
  • Badawi, Mohammad, et al. (author)
  • Service-Guaranteed Multi-Port Packet Memory for Parallel Protocol Processing Architecture
  • 2016
  • In: Proceedings - 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2016. - : Institute of Electrical and Electronics Engineers (IEEE). - 9781467387750 ; , s. 408-412
  • Conference paper (peer-reviewed)abstract
    • Parallel processing architectures have been increasingly utilized due to their potential for improving performance and energy efficiency. Unfortunately, the anticipated improvement often suffers from a limitation caused by memory access latency and latency variation, which consequently impact Quality of Service (QoS). This paper presents a service-guaranteed multi-port packet memory system to boost parallelism in protocol processing architectures. In this proposed memory system, all arriving packets are guaranteed a memory space, such that, a packet memory space can be allocated in a bounded number of cycles and each of its locations is accessible in a single cycle. We consider a real-time Voice Over Internet Protocol (VOIP) call as a case-study to evaluate our service-guaranteed memory system.
  •  
27.
  • Candaele, Bernard, et al. (author)
  • Mapping Optimisation for Scalable multi-core ARchiTecture : The MOSART approach
  • 2010
  • In: Proceedings - IEEE Annual Symposium on VLSI, ISVLSI 2010. - 9780769540764 ; , s. 518-523
  • Conference paper (peer-reviewed)abstract
    • The project will address two main challenges of prevailing architectures: 1) The global interconnect and memory bottleneck due to a single, globally shared memory with high access times and power consumption; 2) The difficulties in programming heterogeneous, multi-core platforms, in particular in dynamically managing data structures in distributed memory. MOSART aims to overcome these through a multi-core architecture with distributed memory organisation, a Network-on-Chip (NoC) communication backbone and configurable processing cores that are scaled, optimised and customised together to achieve diverse energy, performance, cost and size requirements of different classes of applications. MOSART achieves this by: 1) Providing platform support for management of abstract data structures, including middleware services and a run-time data manager for NoC based communication infrastructure; 2) Developing tool support for parallelizing and mapping applications on the multi-core target platform and customizing the processing cores for the application.
  •  
28.
  • Candaele, Bernard, et al. (author)
  • The MOSART Mapping Optimization for multi-core Architectures
  • 2011
  • In: VLSI 2010 Annual Symposium. - Dordrecht : Springer Publishing Company. ; , s. 181-195
  • Conference paper (peer-reviewed)abstract
    • MOSART project addresses two main challenges of prevailing architectures: (i) The global interconnect and memory bottleneck due to a single, globally shared memory with high access times and power consumption; (ii) The difficulties in programming heterogeneous, multi-core platforms. MOSART aims to overcome these through a multi-core architecture with distributed memory organization, a Network-on-Chip (NoC) communication backbone and configurable processing cores that are scaled, optimized and customized together to achieve diverse energy, performance, cost and size requirements of different classes of applications. MOSART achieves this by: (i) Providing platform support for management of abstract data structures including middleware services and a run-time data manager for NoC based communication infrastructure; (ii) Developing tool support for parallelizing and mapping applications on the multi-core target platform and customizing the processing cores for the application.
  •  
29.
  • Chabloz, Jean-Michel, et al. (author)
  • A Flexible Communication Scheme for Rationally-Related Clock Frequencies
  • 2009
  • In: 2009 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN. - 9781424450299 ; , s. 109-116
  • Conference paper (peer-reviewed)abstract
    • As a replacement for the fast-fading Globally-Synchronous model, we have defined a flexible design style for SoCs, called GRLS, for Globally-Ratiochronous, Locally-Synchronous, which does not rely on global synchronization and is based on using rationally-related clock frequencies derived from the same source. In this paper, using the special periodical properties of rationally-related systems, we build a latency-insensitive, maximal-throughput, low-overhead communication method, based on the idea of using both clock edges to sample data at the Receiver. The validity of the method and its resistance to non-idealities such as jitter, misalignments and clock drifts are formally proven while experimental results including overhead are presented for 90 nm technology. Despite allowing much greater flexibility, the overhead of our method is comparable to that of state-of-the-art mesochronous communication techniques. We also show performances, complexity and overhead improvements over all other approaches that have so far been proposed for rationally-related clock frequencies.
  •  
30.
  • Chabloz, Jean-Michel, et al. (author)
  • A GALS Network-on-Chip based on Rationally-Related Frequencies
  • 2011
  • In: 2011 IEEE 29TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD). - LOS ALAMITOS : IEEE COMPUTER SOC. - 9781457719523 ; , s. 12-18
  • Conference paper (peer-reviewed)abstract
    • GALS Networks-on-Chip (NoCs) in which the frequency of every switch can be set independently would enable per-node DVFS without requiring asynchronous switch design. However, traditional GALS interfaces introduce high latency penalties and are therefore ill-suited for inter-switch links in a NoC. In this paper we introduce and study a GALS Network-on-Chip based on the Globally-Ratiochronous, Locally-Synchronous (GRLS) paradigm. GRLS constrains all switch frequencies to be rationally-related but enables the use of efficient interfaces which reduce the latency of the network by 60% compared to GALS solutions and obtains better throughput-per-power ratios compared to synchronous and mesochronous solutions.
  •  
31.
  • Chabloz, Jean-Michel, et al. (author)
  • Distributed DVFS using rationally-related frequencies and discrete voltage levels
  • 2010
  • In: Proceedings of the 16th ACM/IEEE international symposium on Low power electronics and design. - New York, NY, USA : IEEE. - 9781450301466 ; , s. 247-252
  • Conference paper (peer-reviewed)abstract
    • We have defined a flexible latency-insensitive design style called Globally Ratiochronous Locally Synchronous (GRLS), based on quantized voltage levels and rationally-related clock frequencies. In this paper we present the infrastructure necessary to enable Distributed DVFS in such a system and analyze its overheads, quantitatively showing how, with minimal overheads, we obtain energy benefits that are close to those of a totally ideal GALS approach. The benefits that we show, coupled with the complexity and performance benefits of GRLS, which we briefly analyze, show how this approach is a strong competitor to GALS.
  •  
32.
  • Chabloz, Jean-Michel, 1982- (author)
  • Globally-Ratiochronous, Locally-Synchronous Systems
  • 2012
  • Doctoral thesis (other academic/artistic)abstract
    • It is well recognized in the literature that the fully-synchronous design style, once the best choice due especially to the simplicity of its design flow, is not suitable for present-day systems, which contain many more gates compared to their predecessors, and has to be superseded to meet the new needs of the industry. The alternative solution that has enjoyed more success in industry and the literature consists in breaking down a system into several fully-synchronous modules clocked with independent clocks. Such systems go under the name of Globally-non-Synchronous (GnS) and make no assumption on the phase alignment between the clocks in the individual modules. GnS design styles do not require a globally balanced clock tree and employ special synchronizers to achieve latency-insensitivity. The individual modules, whose sizes are relatively small, remain fully-synchronous, thus easy to design and maintain. Two main classes of GnS systems have been proposed: the GALS (for Globally-Asynchronous, Locally-Synchronous) design style allows each module to be clocked at its own independent clock frequency; the mesochronous design style constrains all modules to run at the same frequency. GALS systems support per-module Dynamic Voltage-Frequency Scaling (DVFS), but GALS interfaces are complex and introduce high performance penalties; mesochronous systems do not support per-module DVFS but support simpler and faster interfaces. It is well recognized that neither of the two design styles can fully satisfy all the contrasting needs of the electronic industry, and often hybrid solutions are deployed as a trade-off. We propose Globally-Ratiochronous, Locally-Synchronous (GRLS) systems, where GRLS is a design style intermediate between the mesochronous and the GALS design paradigms: local frequencies in a GRLS system do not need to be identical, but are required to be rationally-related (such as one being 3/4 or 2/5 of the other). 
The periodic properties of rationally-related systems allow the deployment of interfaces that do not use any form of handshake and, thanks to this, are much more performant than GALS interfaces; on the other hand, GRLS supports quantized per-module DVFS. In this work we deploy and analyse all the components of the GRLS design style: the frequency regulation system, the voltage regulation system, and the GRLS latency-insensitive interfaces. We perform a theoretical analysis of DVFS efficiency in different GRLS systems, and then study a GRLS NoC-based platform. We also develop a complete GRLS power management system for a GRLS Network-on-Chip (NoC)-based platform. Experimental results show that GRLS performances are close to those of mesochronous systems and GRLS flexibility is close to that of GALS systems, which results in high figures of merit for GRLS systems. As an example, the GRLS NoC-based platform we study in this work has at least ≈ 21% lower latency-power product compared to alternative mesochronous-GALS hybrid platforms, and respectively ≈ 32% and ≈ 48% better latency-power product compared to mesochronous and GALS platforms.
  •  
33.
  • Chabloz, Jean-Michel, et al. (author)
  • Low-latency and low-overhead mesochronous and plesiochronous synchronizers
  • 2011
  • Conference paper (peer-reviewed)abstract
    • In this paper we present efficient Mesochronous and Plesiochronous interfaces targeting low-latency and low-overhead links. Our source-synchronous scheme can easily be integrated in traditional design flows, supports maximal throughput, has low latency and has an overhead of only three flipflops per data line. With one additional flipflop per data line, the Plesiochronous interface allows the synchronizer to cope with clock drifts. The simple synchronization scheme is validated through formal analysis and simulation.
  •  
34.
  • Chabloz, Jean-Michel, et al. (author)
  • Low-Latency Maximal-Throughput Communication Interfaces for Rationally Related Clock Domains
  • 2014
  • In: IEEE Transactions on Very Large Scale Integration (vlsi) Systems. - 1063-8210 .- 1557-9999. ; 22:3, s. 641-654
  • Journal article (peer-reviewed)abstract
    • In this paper, we introduce a source-synchronous adaptive interface for the globally ratiochronous, locally synchronous design style, a subset of the globally asynchronous, locally synchronous (GALS) design style in which the frequencies of all clocks are not phase-aligned but are constrained to be rationally related, i.e., they are all submultiple of the same physical or virtual frequency. The interface can be designed using only standard cells and guarantees maximal throughput in addition to an average latency four times lower compared with state-of-the-art asynchronous first-input, first-output GALS interfaces. Several properties of the interface are formally stated and proved. We also demonstrate that the interface has a low area overhead, with only four flip-flops per data line, and is robust against nonidealities such as clock jitters and propagation delay misalignments. For a realistic link in 90-nm application-specific integrated circuit technology, we derive a 1-GHz upper bound for the least common multiple among the frequencies.
  •  
35.
  • Chabloz, Jean-Michel, et al. (author)
  • Low-latency no-handshake GALS interfaces for fast-receiver links
  • 2012
  • In: Proceedings of the IEEE International Conference on VLSI Design. - : IEEE. - 9780769546384 ; , s. 191-196
  • Conference paper (peer-reviewed)abstract
    • In this paper we introduce a novel interface for Globally-Asynchronous, Locally-Synchronous systems which does not use any form of handshake to cross the gap between the clock domains. In particular, links in which the Receiver runs faster than the Transmitter are targeted. The interface works by finding an approximate ratio between the clock frequencies. Then, ratiochronous synchronizers that can tolerate clock drifts are employed to transmit data from the Transmitter to the Receiver clock domain. Thanks to the periodic properties of rationally-related systems, no handshake is employed and the average latency of the interface is decreased ∼ 75% compared to state-of-the-art GALS interfaces. Additionally, the interface uses only standard cells and, save for a delay line, can be designed at Register Transfer Level.
  •  
36.
  • Chabloz, Jean-Michel, et al. (author)
  • Lowering the Latency of Interfaces for Rationally-Related Frequencies
  • 2010
  • In: 2010 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN. - 9781424489350 ; , s. 23-30
  • Conference paper (peer-reviewed)abstract
    • We have introduced the Globally-Ratiochronous, Locally-Synchronous (GRLS) design paradigm, a design style based on rationally-related frequencies, with the objective to overcome the limitations of traditional multi-frequency systems by providing a flexibility close to that of Globally-Asynchronous, Locally-Synchronous (GALS) systems but introducing performance penalties and overheads close to those of mesochronous systems. In this paper we focus on performances and improve the latency figures of our original GRLS interfaces by introducing two new interfaces, called GRLS-F and GRLS-noF, the first suitable for blocks with long computation time and the second for blocks with short computation time. The latency figures of the original GRLS interfaces are improved up to 50% without increasing complexity. The average latency figures of the resulting interfaces are lower than 1 Receiver clock cycle, the latency of a synchronous interface.
  •  
37.
  • Chabloz, J. -M, et al. (author)
  • Power management architecture in McNoC
  • 2012
  • In: Scalable Multi-core Architectures. - New York, NY : Springer Science+Business Media B.V.. - 9781441967787 ; , s. 55-80
  • Book chapter (other academic/artistic)abstract
    • In this chapter we present the power management architecture of the McNoC platform. The power management architecture of McNoC offers distributed Dynamic Voltage Frequency Scaling (DVFS) and power down services to the platform at a fine level of granularity, allowing independent setting of frequency and supply voltage to all switch and resource nodes in the platform. The design style enables hierarchical physical design and solves the clock-domain-crossing problem with a solution based on rationally-related frequencies, which avoids the overhead associated with handshake. The architecture allows arbitrary power management regions to be defined and region-wide power management commands affecting all nodes in a region can be issued by the software layer that we call Power Management Intelligence (PMINT).
  •  
38.
  • Daneshtalab, M., et al. (author)
  • Message from the chairs
  • 2013
  • In: MES '13: Proceedings of the first International Workshop on Many-core Embedded Systems. - 9781450320634
  • Conference paper (peer-reviewed)
  •  
39.
  • Daneshtalab, Masoud, et al. (author)
  • Special issue on many-core embedded systems
  • 2014
  • In: Microprocessors and microsystems. - : Elsevier BV. - 0141-9331 .- 1872-9436. ; 38:6, s. 525-525
  • Journal article (other academic/artistic)
  •  
40.
  •  
41.
  •  
42.
  • Deb, Abhijit Kumar, et al. (author)
  • Hardware software codesign of DSP system using grammar based approach
  • 2001
  • In: VLSI Design, 2001. Fourteenth International Conference on. ; , s. 42-47
  • Conference paper (peer-reviewed)abstract
    • Embedded cores are gaining widespread use to deal with the complex DSP systems where flexibility is of utmost importance. The design of such a system offers several problems, which are not addressed by the existing methodology. The authors previously presented an integrated grammar based DSP design methodology that separates architectural and functional specification, can create a virtual prototype and has a smooth link to the implementation phase. In this paper we present the extension of the work to handle embedded cores. Here we capture the host peripheral interface (HPI) of TMS320C6x core at higher level of abstraction and provide a single simulation environment, which facilitates faster analysis of hardware software components. Our results reveal that the proposed methodology offers simulation time speed-up of 5 times and design time speed-up of 8 times, while keeping the architectural specification separated from functionality
  •  
43.
  • Dhilleswararao, Pudi, et al. (author)
  • Efficient Implementation of 2-D Convolution on DRRA and DiMArch Architectures
  • 2023
  • In: Proceedings of the 13th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, HEART 2023. - : Association for Computing Machinery (ACM). ; , s. 86-92
  • Conference paper (peer-reviewed)abstract
    • Convolution has been widely employed in image processing and computer vision applications such as picture augmentation, smoothing, and structure extraction. In addition, convolution operations are the most prevalent computing patterns in machine learning domains. Convolutions, for example, are used in a substantial chunk of state-of-the-art convolutional neural network operations. Therefore, effectively mapping convolution operations onto hardware architectures is crucial for achieving superior performance while accelerating convolutional neural networks. In this paper, we proposed various algorithms to efficiently map the 2-D convolution operation onto a dynamically reconfigurable resource array and distributed memory architecture. Furthermore, we have discussed the mapping of 2-D convolution on the target architecture for an input matrix of arbitrary size, as well as the generalization of the proposed approaches for multi-column DRRA architectures.
  •  
44.
  •  
45.
  •  
46.
  •  
47.
  • Ellervee, Peeter, et al. (author)
  • Exploiting data transfer locality in memory mapping
  • 1999
  • In: EUROMICRO Conference, 1999. Proceedings. 25th. ; , s. 14-21
  • Conference paper (peer-reviewed)abstract
    • System-level exploration of memory architectures is one of the key issues in successful implementation of data-transfer dominated applications. Usually, one of the main design bottlenecks is the memory access bandwidth. Transformations, rearranging the layout of the data records stored in memory, are very effective to improve the locality of the data transfers but usually lead to a large memory bit-wastage when not performed carefully. In this paper, a methodology which reduces memory bandwidth requirements without sacrificing storage space is proposed. The methodology exploits parallelism in the data-transfers to rearrange the layout of the data records. Distributed memory organization combined with our proposed layout rearrangement methodology allows the memory bandwidth bottleneck in data-transfer dominated applications to be reduced effectively
  •  
48.
  • Ellervee, Peeter, et al. (author)
  • Exploring ASIC Design Space at System Level with a Neural Network Estimator
  • 1994
  • In: Proc. of IEEE ASIC-conference, 1994.
  • Conference paper (peer-reviewed)abstract
    • Estimators are critical tools in doing architectural level exploration of the design space. We present a novel approach to estimation based on the multilayer perceptron which builds the estimation function during the learning process and thus allows arbitrarily complex functions to be described. We also describe how the control data flow graph is encoded for the neural network input and we present results of the first experiments made with realistic design examples.
  •  
49.
  •  
50.
  •  
Type of publication
conference paper (213)
journal article (45)
doctoral thesis (11)
reports (8)
book chapter (4)
other publication (2)
licentiate thesis (2)
editorial collection (1)
editorial proceedings (1)
Type of content
peer-reviewed (249)
other academic/artistic (38)
Author/Editor
Hemani, Ahmed (225)
Hemani, Ahmed, 1961- (50)
Jantsch, Axel (44)
Tenhunen, Hannu (43)
Öberg, Johnny (41)
Ellervee, Peeter (36)
Paul, Kolin (30)
Stathis, Dimitrios (24)
Kumar, Shashi (24)
Plosila, Juha (20)
Farahini, Nasim (20)
Postula, Adam (20)
Svantesson, Bengt (19)
Abbas, Haider (17)
Yngström, Louise (16)
Yang, Yu (16)
Li, Shuo (15)
Jafri, Syed Mohammad ... (14)
Kumar, Anshul (13)
Jafri, Syed (11)
O'Nils, Mattias (10)
Daneshtalab, Masoud (10)
Chabloz, Jean-Michel (10)
Penolazzi, Sandro (10)
Hemani, Ahmed, Profe ... (9)
Tajammul, Muhammad A ... (9)
Lu, Zhonghai (8)
Zou, Zhuo (8)
Liu, Pei (8)
Lindqvist, Dan (8)
Meincke, Thomas (8)
Jafri, Syed M. A. H. (8)
Magnusson, Christer (7)
Sander, Ingo (7)
Badawi, Mohammad (7)
Xu, Jiawei (7)
Zheng, Li-Rong (6)
Lansner, Anders, Pro ... (6)
Lansner, Anders, Pro ... (6)
Shami, Muhammad Ali (6)
Wang, Deyu (6)
Malik, Jamshaid Sarw ... (6)
Malik, Omer (6)
Olsson, Thomas (5)
Nilsson, Peter (5)
Li, Feng (5)
Deb, Abhijit Kumar (5)
Sohofi, Hassan (5)
Isoaho, Jouni (5)
Mokhtari, Mehran (5)
University
Royal Institute of Technology (275)
Stockholm University (13)
Lund University (4)
Mid Sweden University (3)
Uppsala University (2)
Halmstad University (2)
Umeå University (1)
Linköping University (1)
Jönköping University (1)
Language
English (285)
Undefined language (2)
Research subject (UKÄ/SCB)
Engineering and Technology (226)
Natural sciences (52)
Medical and Health Sciences (2)
