microTaboo : a general and practical solution to the k-disjoint problem
- Al-Jaff, Mohammed (author)
- Uppsala universitet, Institutionen för medicinsk biokemi och mikrobiologi
- Sandström, Eric (author)
- Uppsala universitet, Institutionen för medicinsk biokemi och mikrobiologi
- Grabherr, Manfred (author)
- Uppsala universitet, Institutionen för medicinsk biokemi och mikrobiologi, Uppsala Univ, Bioinformat Infrastruct Life Sci, S-75123 Uppsala, Sweden.
- 2017-05-02
- 2017
- English.
- Published in: BMC Bioinformatics. - : BIOMED CENTRAL LTD. - 1471-2105. ; 18
- Related link:
- https://doi.org/10.1...
- https://uu.diva-port... (primary) (Raw object)
- https://bmcbioinform...
- https://urn.kb.se/re...
- https://doi.org/10.1...
Abstract
- Background: A common challenge in bioinformatics is to identify short sub-sequences that are unique in a set of genomes or reference sequences, which can efficiently be achieved by k-mer (k consecutive nucleotides) counting. However, there are several areas that would benefit from a more stringent definition of "unique", requiring that these sub-sequences of length W differ by more than k mismatches (i.e. a Hamming distance greater than k) from any other sub-sequence, which we term the k-disjoint problem. Examples include finding sequences unique to a pathogen for probe-based infection diagnostics; reducing off-target hits for re-sequencing or genome editing; detecting sequence (e.g. phage or viral) insertions; and multiple substitution mutations. Since both sensitivity and specificity are critical, an exhaustive, yet efficient solution is desirable.
- Results: We present microTaboo, a method that allows for efficient and extensive sequence mining of unique (k-disjoint) sequences of up to 100 nucleotides in length. On a number of simulated and real data sets ranging from microbe- to mammalian-size genomes, we show that microTaboo is able to efficiently find all sub-sequences of a specified length W that do not occur within a threshold of k mismatches in any other sub-sequence. We exemplify that microTaboo has many practical applications, including point substitution detection, sequence insertion detection, padlock probe target search, and candidate CRISPR target mining.
- Conclusions: microTaboo implements a solution to the k-disjoint problem in an alignment- and assembly-free manner. microTaboo is available for Windows, Mac OS X, and Linux, running Java 7 and higher, under the GNU GPLv3 license, at: https://MohammedAlJaff.github.io/microTaboo
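The k-disjoint definition in the abstract can be illustrated with a naive brute-force check. This is only a sketch of the problem statement, not microTaboo's algorithm, which the abstract does not describe. The function names, the split into a `target` and a `reference` sequence, and the omission of reverse complements are assumptions made for illustration; the quadratic scan would not scale to genome-size inputs.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def k_disjoint_windows(target: str, reference: str, w: int, k: int) -> list:
    """Return (offset, window) for each length-w window of `target` whose
    Hamming distance to every length-w window of `reference` exceeds k.

    Hypothetical helper: compares one target against one reference only,
    and ignores reverse-complement strands.
    """
    # All overlapping windows of length w in the reference.
    ref_windows = [reference[j:j + w] for j in range(len(reference) - w + 1)]
    hits = []
    for i in range(len(target) - w + 1):
        win = target[i:i + w]
        # k-disjoint: more than k mismatches against every reference window.
        if all(hamming(win, r) > k for r in ref_windows):
            hits.append((i, win))
    return hits
```

For instance, `k_disjoint_windows("ACGTACGT", "TTTTTTTT", w=4, k=2)` keeps all five windows of the target, because each differs from `TTTT` at three positions, while raising k to 3 leaves none.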
Subject headings
- NATURVETENSKAP -- Data- och informationsvetenskap -- Bioinformatik (hsv//swe)
- NATURAL SCIENCES -- Computer and Information Sciences -- Bioinformatics (hsv//eng)
Keywords
- k-disjoint problem
- Software
- Sequence mining
Publication and content type
- ref (subject category)
- art (subject category)