GenFamClust : an accurate, synteny-aware and reliable homology inference algorithm
- Ali, Raja Hashim, 1985- (author)
- KTH, Computational Science and Technology (CST), Lars Arvestad
- Muhammad, Sayyed Auwn (author)
- KTH, Computational Science and Technology (CST)
- Arvestad, Lars (author)
- KTH, Stockholm University, Numerical Analysis and Computer Science (NADA), Science for Life Laboratory (SciLifeLab), Swedish e-Science Research Centre, Sweden, Computational Science and Technology (CST)
- 2016-06-04
- 2016
- English.
In: BMC Evolutionary Biology. - : Springer Science and Business Media LLC. - 1471-2148. ; 16
- Related links:
- https://doi.org/10.1...
- https://bmcevolbiol....
- https://urn.kb.se/re...
- https://doi.org/10.1...
- https://urn.kb.se/re...
Abstract
- Background: Homology inference is pivotal to evolutionary biology and is primarily based on significant sequence similarity, which is generally a good indicator of homology. Algorithms have also been designed to use conservation of gene order as an indication of homologous regions. We have developed GenFamClust, a method that quantifies both gene order conservation and sequence similarity.
- Results: In this study, we validate GenFamClust by comparing it with well-known homology inference algorithms on a synthetic dataset. We applied several popular clustering algorithms to homologs inferred by GenFamClust and by other algorithms on a metazoan dataset and studied the outcomes. Accuracy, similarity, dependence, and other characteristics were investigated for the gene families yielded by the clustering algorithms. GenFamClust was also applied to genes from a set of complete fungal genomes, and gene families were inferred using clustering. The resulting gene families were compared with a manually curated gold standard of pillars from the Yeast Gene Order Browser. We found that the gene-order component of GenFamClust is simple, yet biologically realistic, and captures local synteny information for homologs.
- Conclusions: The study shows that GenFamClust is a more accurate, informed, and comprehensive pipeline for inferring homologs and gene families than other commonly used homology and gene-family inference methods.
Subject headings
- NATURVETENSKAP -- Biologi (hsv//swe)
- NATURAL SCIENCES -- Biological Sciences (hsv//eng)
- NATURVETENSKAP -- Biologi -- Bioinformatik och systembiologi (hsv//swe)
- NATURAL SCIENCES -- Biological Sciences -- Bioinformatics and Systems Biology (hsv//eng)
Keyword
- Homology inference
- Gene synteny
- Gene similarity
- Gene family
- Clustering
- Gene order conservation
- Computer Science
Publication and Content Type
- ref (subject category)
- art (subject category)