GenFamClust : an accurate, synteny-aware and reliable homology inference algorithm
- Ali, Raja Hashim, 1985- (author)
- KTH, Beräkningsvetenskap och beräkningsteknik (CST), Lars Arvestad
- Muhammad, Sayyed Auwn (author)
- KTH, Beräkningsvetenskap och beräkningsteknik (CST)
- Arvestad, Lars (author)
- KTH, Stockholms universitet, Numerisk analys och datalogi (NADA), Science for Life Laboratory (SciLifeLab), Swedish e-Science Research Centre, Sweden, Beräkningsvetenskap och beräkningsteknik (CST)
- 2016-06-04
- 2016
- English.
In: BMC Evolutionary Biology. - : Springer Science and Business Media LLC. - ISSN 1471-2148. ; vol. 16
- Related links:
https://doi.org/10.1...
https://bmcevolbiol....
https://urn.kb.se/re...
Abstract
- Background: Homology inference is pivotal to evolutionary biology and is primarily based on significant sequence similarity, which is generally a good indicator of homology. Algorithms have also been designed to use conservation of gene order as an indication of homologous regions. We have developed GenFamClust, a method that quantifies both gene order conservation and sequence similarity.
- Results: In this study, we validate GenFamClust by comparing it to well-known homology inference algorithms on a synthetic dataset. We applied several popular clustering algorithms to homologs inferred by GenFamClust and by other algorithms on a metazoan dataset and studied the outcomes. Accuracy, similarity, dependence, and other characteristics were investigated for the gene families yielded by the clustering algorithms. GenFamClust was also applied to genes from a set of complete fungal genomes, and gene families were inferred using clustering. The resulting gene families were compared with a manually curated gold standard of pillars from the Yeast Gene Order Browser. We found that the gene-order component of GenFamClust is simple, yet biologically realistic, and captures local synteny information for homologs.
- Conclusions: The study shows that GenFamClust is a more accurate, informed, and comprehensive pipeline for inferring homologs and gene families than other commonly used homology and gene-family inference methods.
Subject headings
- NATURAL SCIENCES -- Biological Sciences (hsv//eng)
- NATURAL SCIENCES -- Biological Sciences -- Bioinformatics and Systems Biology (hsv//eng)
Keywords
- Homology inference
- Gene synteny
- Gene similarity
- Gene family
- Clustering
- Gene order conservation
- Computer Science
Publication and content type
- ref (subject category)
- art (subject category)