SwePub

Result list for search "WFRF:(Hemani Ahmed) ;pers:(Liu Pei)"

  • Results 1-8 of 8
1.
  • Liu, Pei, et al. (author)
  • 3D-stacked many-core architecture for biological sequence analysis problems
  • 2015
  • In: Embedded Computer Systems. - : IEEE conference proceedings. ; , pp. 211-220
  • Conference paper (peer-reviewed). Abstract:
    • Sequence analysis plays a critical role in bioinformatics, and most of its applications have compute-intensive kernels that consume over 70% of total execution time. By exploiting the compute-intensive execution stages of popular sequence analysis applications, we present and evaluate a VLSI architecture focused on applications that target biological sequences directly, including pairwise alignment, multiple sequence alignment, database search, and short read sequence mapping. Building on a coarse-grained reconfigurable array (CGRA), we propose the use of many-core and 3D-stacked technologies to further improve the memory subsystem, which gives another order of magnitude of speedup from high bandwidth and low access latency. We analyze our approach in terms of its throughput and efficiency for different application mappings. Initial experimental results are obtained from a stripped-down implementation in a commodity FPGA, and we then scale the results to estimate the performance of our architecture with 9 layers of 68 mm² stacked wafers in a 45-nm process. We demonstrate estimated speedups of at least 39 times over existing hardware accelerators for the entire range of applications and datasets of interest. In comparison, alternative FPGA-based accelerators deliver improvement only for a single application, while GPGPUs do not perform well enough when accelerating program kernels with random memory access and integer addition/comparison operations.
2.
  • Liu, Pei, et al. (author)
  • 3D-Stacked Many-Core Architecture for Biological Sequence Analysis Problems
  • 2017
  • In: International journal of parallel programming. - : SPRINGER/PLENUM PUBLISHERS. - 0885-7458 .- 1573-7640. ; 45:6, pp. 1420-1460
  • Journal article (peer-reviewed). Abstract:
    • Sequence analysis plays an extremely important role in bioinformatics, and most of its applications have compute-intensive kernels that consume over 70% of total execution time. By exploiting the compute-intensive execution stages of popular sequence analysis applications, we present and evaluate a VLSI architecture focused on applications that target biological sequences directly, including pairwise sequence alignment, multiple sequence alignment, database search, and short read sequence mapping. Building on a coarse-grained reconfigurable array, we propose the use of many-core and 3D-stacked technologies to further improve the memory subsystem, which gives another order of magnitude of speedup from high bandwidth and low access latency. We analyze our approach in terms of its throughput and efficiency for different application mappings. Initial experimental results are obtained from a stripped-down implementation in a commodity FPGA, and we then scale the results to estimate the performance of our architecture with 9 layers of stacked wafers in a 45-nm process. We demonstrate estimated speedups of at least 40 times over the corresponding existing hardware accelerator platforms for the entire range of applications and datasets of interest. In comparison, alternative FPGA-based accelerators deliver improvement only for a single application, while GPGPUs do not perform well enough when accelerating program kernels with random memory access and integer addition/comparison operations.
3.
  • Liu, Pei, et al. (author)
  • A Coarse Grain Reconfigurable Architecture for sequence alignment problems in bio-informatics
  • 2010
  • In: Proceedings of the 2010 IEEE 8th Symposium on Application Specific Processors, SASP'10. - 9781424479535 ; , pp. 50-57
  • Conference paper (peer-reviewed). Abstract:
    • A Coarse Grain Reconfigurable Architecture (CGRA) tailored for accelerating bio-informatics algorithms is proposed. The key innovation is a lightweight bio-informatics processor that can be reconfigured to perform the different Add Compare and Select operations of popular sequencing algorithms. A programmable and scalable architectural platform instantiates an array of such processing elements, allows arbitrary partitioning and scheduling schemes, and is capable of solving complete sequencing algorithms, including the sequential phases, and of dealing with arbitrarily large sequences. The key difference of the proposed CGRA-based solution compared to FPGA- and GPU-based solutions is a much better match between architecture and algorithm, both for the core computational need and for the system-level architectural need. This claim is quantified for three popular sequencing algorithms: Needleman-Wunsch, Smith-Waterman and HMMER. For the same degree of parallelism, we provide 5X and 15X speed-ups compared to FPGA and GPU respectively. For the same size of silicon, the advantage grows by another factor of 10X.
4.
  • Liu, Pei, et al. (author)
  • A Coarse-Grained Reconfigurable Processor for Sequencing and Phylogenetic Algorithms in Bioinformatics
  • 2011
  • In: Proceedings. - 9781457717345 ; , pp. 190-197
  • Conference paper (peer-reviewed). Abstract:
    • A coarse-grained reconfigurable processor tailored for accelerating multiple bioinformatics algorithms is proposed. In this paper, a programmable and scalable architectural platform instantiates an array of coarse-grained lightweight processing elements, which allows arbitrary partitioning and scheduling schemes and is capable of completely solving four popular bioinformatics algorithms: Needleman-Wunsch, Smith-Waterman, and HMMER for sequencing, and Maximum Likelihood for phylogenetics. The key difference of the proposed CGRA-based solution compared to FPGA- and GPU-based solutions is a much better match between architecture and algorithms, both for the core computational needs and for the system-level architectural needs. For the same degree of parallelism, we provide 5X to 14X speed-ups compared to FPGA solutions and 15X to 78X compared to GPU acceleration on the 3 sequencing algorithms. We also provide a 2.8X speed-up compared to FPGA with the same amount of core logic and 70X compared to GPU with the same silicon area for Maximum Likelihood.
5.
  • Liu, Pei, et al. (author)
  • A Customized Many-Core Hardware Acceleration Platform for Short Read Mapping Problems Using Distributed Memory Interface with 3D-Stacked Architecture
  • 2017
  • In: Journal of Signal Processing Systems. - : Springer. - 1939-8018 .- 1939-8115. ; 87:3, pp. 327-341
  • Journal article (peer-reviewed). Abstract:
    • Rapidly developing Next Generation Sequencing technologies produce huge amounts of short reads consisting of randomly fragmented DNA base pair strings. Assembling those short reads poses a challenge for mapping reads to a reference genome in terms of both sensitivity and execution time. In this paper, we propose a customized many-core hardware acceleration platform for short read mapping problems based on the hash-index method. The processing core is highly customized to suit both 2-hit string matching and banded Smith-Waterman sequence alignment operations, while a distributed memory interface with 3D-stacked architecture provides high bandwidth and low access latency for highly customized dataset partitioning and memory access scheduling. Conforming to the original BFAST program, our design provides a 45,012 times speedup over the software approach for single-end short reads and 21,102 times for paired-end short reads, and also beats a similar single-FPGA solution by 1466 times for single-end reads. Optimized seed generation gives much better sensitivity while the performance boost remains impressive.
6.
  • Liu, Pei, et al. (author)
  • A many-core hardware acceleration platform for short read mapping problem using distributed memory interface with 3D-stacked architecture
  • 2014
  • In: 2014 International Symposium on System-on-Chip, SoC 2014. ; , pp. 1-8
  • Conference paper (peer-reviewed). Abstract:
    • Next Generation Sequencing technologies produce huge amounts of short reads consisting of randomly fragmented DNA base pair strings, and assembling them poses a challenge for mapping short reads to a reference genome in terms of both sensitivity and execution time. In this paper, we propose a many-core hardware acceleration platform for short read mapping based on the hash-index method, which benefits from a distributed memory interface with 3D-stacked architecture for local memory access. Our design provides a 45,012 times speedup over the software approach for single-end short reads and 21,102 times for paired-end reads, and also beats a similar single-FPGA solution by 1466 times for single-end reads.
7.
  • Liu, Pei, et al. (author)
  • A reconfigurable processor for phylogenetic inference
  • 2011
  • In: VLSI Design (VLSI Design), 2011 24th International Conference on. - : IEEE. - 9780769543482 - 9781612843278 ; , pp. 226-231
  • Conference paper (peer-reviewed). Abstract:
    • A reconfigurable processor tailored for accelerating phylogenetic inference is proposed. In this paper, a programmable and scalable architectural platform instantiates an array of coarse-grained lightweight processing elements, which allows arbitrary partitioning and scheduling schemes and is capable of solving the complete Maximum Likelihood algorithm for arbitrarily large sequences. The key difference of the proposed CGRA-based solution compared to FPGA- and GPU-based solutions is a much better match between architecture and algorithm, both for the core computational need and for the system-level architectural need. For the same degree of parallelism, we provide a 2.27X speed-up compared to FPGA with the same amount of logic, and an 81.87X speed-up compared to GPU with the same silicon area.
8.
  • Liu, Pei, et al. (author)
  • Improved Bioinformatics Processing Unit for Multiple Applications
  • 2012
  • In: Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012. - : IEEE. - 9780769546766 ; , pp. 390-396
  • Conference paper (peer-reviewed). Abstract:
    • This paper presents a coarse-grain reconfigurable unit for accelerating multiple widely used bioinformatics algorithms. Our design is a highly efficient, programmable bioinformatics processing unit called BiCell v2. Based on a specialized multimode multiplier, this unit provides three different working modes in order to accelerate four popular bioinformatics algorithms: Maximum Likelihood based phylogenetic inference, and Needleman-Wunsch, Smith-Waterman, and HMMER in sequence alignment. BiCell v2 supports both single- and double-precision floating-point computation, which significantly increases the accuracy of bioinformatics algorithm acceleration while retaining silicon area efficiency. Making use of this improved processing unit, our platform gives about a 10X speedup compared to our previous design in single precision, and a 23.2X speedup compared with GPGPU at the same precision.