SwePub
Search the SwePub database

Result list for search "L773:1471 2105"

Search: L773:1471 2105

  • Results 1-10 of 237
1.
  • Fontes, Magnus, et al. (author)
  • The projection score - an evaluation criterion for variable subset selection in PCA visualization
  • 2011
  • In: BMC Bioinformatics. - : Springer Science and Business Media LLC. - 1471-2105. ; 12
  • Journal article (peer-reviewed). Abstract:
    • Background In many scientific domains, it is becoming increasingly common to collect high-dimensional data sets, often with an exploratory aim, to generate new and relevant hypotheses. The exploratory perspective often makes statistically guided visualization methods, such as Principal Component Analysis (PCA), the methods of choice. However, the clarity of the obtained visualizations, and thereby the potential to use them to formulate relevant hypotheses, may be confounded by the presence of the many non-informative variables. For microarray data, more easily interpretable visualizations are often obtained by filtering the variable set, for example by removing the variables with the smallest variances or by only including the variables most highly related to a specific response. The resulting visualization may depend heavily on the inclusion criterion, that is, effectively the number of retained variables. To our knowledge, there exists no objective method for determining the optimal inclusion criterion in the context of visualization. Results We present the projection score, which is a straightforward, intuitively appealing measure of the informativeness of a variable subset with respect to PCA visualization. This measure can be universally applied to find suitable inclusion criteria for any type of variable filtering. We apply the presented measure to find optimal variable subsets for different filtering methods in both microarray data sets and synthetic data sets. We note also that the projection score can be applied in general contexts, to compare the informativeness of any variable subsets with respect to visualization by PCA. Conclusions We conclude that the projection score provides an easily interpretable and universally applicable measure of the informativeness of a variable subset with respect to visualization by PCA, that can be used to systematically find the most interpretable PCA visualization in practical exploratory analysis.
2.
  • Malmström, Lars, et al. (author)
  • 2DDB – a bioinformatics solution for analysis of quantitative proteomics data
  • 2006
  • In: BMC Bioinformatics. - : Springer Science and Business Media LLC. - 1471-2105. ; 7:158
  • Journal article (peer-reviewed). Abstract:
    • Background We present 2DDB, a bioinformatics solution for storage, integration and analysis of quantitative proteomics data. As the data complexity and the rate with which it is produced increase in the proteomics field, the need for flexible analysis software increases. Results 2DDB is based on a core data model describing fundamentals such as experiment description and identified proteins. The extended data models are built on top of the core data model to capture more specific aspects of the data. A number of public databases and bioinformatical tools have been integrated giving the user access to large amounts of relevant data. A statistical and graphical package, R, is used for statistical and graphical analysis. The current implementation handles quantitative data from 2D gel electrophoresis and multidimensional liquid chromatography/mass spectrometry experiments. Conclusion The software has successfully been employed in a number of projects ranging from quantitative liquid-chromatography-mass spectrometry based analysis of transforming growth factor-beta stimulated fibroblasts to 2D gel electrophoresis/mass spectrometry analysis of biopsies from human cervix. The software is available for download at SourceForge.
3.
  • Soneson, Charlotte, et al. (author)
  • Integrative analysis of gene expression and copy number alterations using canonical correlation analysis
  • 2010
  • In: BMC Bioinformatics. - : Springer Science and Business Media LLC. - 1471-2105. ; 11:191, pp. 1-20
  • Journal article (peer-reviewed). Abstract:
    • Background: With the rapid development of new genetic measurement methods, several types of genetic alterations can be quantified in a high-throughput manner. While the initial focus has been on investigating each data set separately, there is an increasing interest in studying the correlation structure between two or more data sets. Multivariate methods based on Canonical Correlation Analysis (CCA) have been proposed for integrating paired genetic data sets. The high dimensionality of microarray data imposes computational difficulties, which have been addressed for instance by studying the covariance structure of the data, or by reducing the number of variables prior to applying the CCA. In this work, we propose a new method for analyzing high-dimensional paired genetic data sets, which mainly emphasizes the correlation structure and still permits efficient application to very large data sets. The method is implemented by translating a regularized CCA to its dual form, where the computational complexity depends mainly on the number of samples instead of the number of variables. The optimal regularization parameters are chosen by cross-validation. We apply the regularized dual CCA, as well as a classical CCA preceded by a dimension-reducing Principal Components Analysis (PCA), to a paired data set of gene expression changes and copy number alterations in leukemia. Results: Using the correlation-maximizing methods, regularized dual CCA and PCA+CCA, we show that without pre-selection of known disease-relevant genes, and without using information about clinical class membership, an exploratory analysis singles out two patient groups, corresponding to well-known leukemia subtypes. Furthermore, the variables showing the highest relevance to the extracted features agree with previous biological knowledge concerning copy number alterations and gene expression changes in these subtypes. Finally, the correlation-maximizing methods are shown to yield results which are more biologically interpretable than those resulting from a covariance-maximizing method, and provide different insight compared to when each variable set is studied separately using PCA. Conclusions: We conclude that regularized dual CCA as well as PCA+CCA are useful methods for exploratory analysis of paired genetic data sets, and can be efficiently implemented also when the number of variables is very large.
4.
  • Eklund, Martin, 1978-, et al. (author)
  • An eScience-Bayes strategy for analyzing omics data
  • 2010
  • In: BMC Bioinformatics. - : BioMed Central. - 1471-2105. ; 11, p. 282-
  • Journal article (peer-reviewed). Abstract:
    • Background: The omics fields promise to revolutionize our understanding of biology and biomedicine. However, their potential is compromised by the challenge to analyze the huge datasets produced. Analysis of omics data is plagued by the curse of dimensionality, resulting in imprecise estimates of model parameters and performance. Moreover, the integration of omics data with other data sources is difficult to shoehorn into classical statistical models. This has resulted in ad hoc approaches to address specific problems. Results: We present a general approach to omics data analysis that alleviates these problems. By combining eScience and Bayesian methods, we retrieve scientific information and data from multiple sources and coherently incorporate them into large models. These models improve the accuracy of predictions and offer new insights into the underlying mechanisms. This "eScience-Bayes" approach is demonstrated in two proof-of-principle applications, one for breast cancer prognosis prediction from transcriptomic data and one for protein-protein interaction studies based on proteomic data. Conclusions: Bayesian statistics provide the flexibility to tailor statistical models to the complex data structures in omics biology as well as permitting coherent integration of multiple data sources. However, Bayesian methods are in general computationally demanding and require specification of possibly thousands of prior distributions. eScience can help us overcome these difficulties. The eScience-Bayes approach thus permits us to fully leverage the advantages of Bayesian methods, resulting in models with improved predictive performance that give more information about the underlying biological system.
5.
  • Khan, Mehmood Alam, et al. (author)
  • fastphylo : Fast tools for phylogenetics
  • 2013
  • In: BMC Bioinformatics. - : BioMed Central. - 1471-2105. ; 14:1, p. 334-
  • Journal article (peer-reviewed). Abstract:
    • Background: Distance methods are ubiquitous tools in phylogenetics. Their primary purpose may be to reconstruct evolutionary history, but they are also used as components in bioinformatic pipelines. However, poor computational efficiency has been a constraint on the applicability of distance methods on very large problem instances. Results: We present fastphylo, a software package containing implementations of efficient algorithms for two common problems in phylogenetics: estimating DNA/protein sequence distances and reconstructing a phylogeny from a distance matrix. We compare fastphylo with other neighbor joining based methods and report the results in terms of speed and memory efficiency. Conclusions: Fastphylo is a fast, memory efficient, and easy to use software suite. Due to its modular architecture, fastphylo is a flexible tool for many phylogenetic studies.
6.
  • Rantalainen, Mattias, et al. (author)
  • Piecewise multivariate modelling of sequential metabolic profiling data
  • 2008
  • In: BMC Bioinformatics. - : BioMed Central. - 1471-2105. ; 9
  • Journal article (peer-reviewed). Abstract:
    • Background: Modelling the time-related behaviour of biological systems is essential for understanding their dynamic responses to perturbations. In metabolic profiling studies, the sampling rate and number of sampling points are often restricted due to experimental and biological constraints. Results: A supervised multivariate modelling approach with the objective to model the time-related variation in the data for short and sparsely sampled time-series is described. A set of piecewise Orthogonal Projections to Latent Structures (OPLS) models are estimated, describing changes between successive time points. The individual OPLS models are linear, but the piecewise combination of several models accommodates modelling and prediction of changes which are non-linear with respect to the time course. We demonstrate the method on both simulated and metabolic profiling data, illustrating how time related changes are successfully modelled and predicted. Conclusion: The proposed method is effective for modelling and prediction of short and multivariate time series data. A key advantage of the method is model transparency, allowing easy interpretation of time-related variation in the data. The method provides a competitive complement to commonly applied multivariate methods such as OPLS and Principal Component Analysis (PCA) for modelling and analysis of short time-series data.
7.
  • Wagener, Johannes, et al. (author)
  • XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services
  • 2009
  • In: BMC Bioinformatics. - : Springer Science and Business Media LLC. - 1471-2105. ; 10, p. 279-
  • Journal article (peer-reviewed). Abstract:
    • BACKGROUND: Life sciences make heavy use of the web for both data provision and analysis. However, the increasing amount of available data and the diversity of analysis tools call for machine accessible interfaces in order to be effective. HTTP-based Web service technologies, like the Simple Object Access Protocol (SOAP) and REpresentational State Transfer (REST) services, are today the most common technologies for this in bioinformatics. However, these methods have severe drawbacks, including lack of discoverability, and the inability for services to send status notifications. Several complementary workarounds have been proposed, but the results are ad-hoc solutions of varying quality that can be difficult to use. RESULTS: We present a novel approach based on the open standard Extensible Messaging and Presence Protocol (XMPP), consisting of an extension (IO Data) to comprise discovery, asynchronous invocation, and definition of data types in the service. That XMPP cloud services are capable of asynchronous communication implies that clients do not have to poll repetitively for status, but the service sends the results back to the client upon completion. Implementations for Bioclipse and Taverna are presented, as are various XMPP cloud services in bio- and cheminformatics. CONCLUSION: XMPP with its extensions is a powerful protocol for cloud services that demonstrates several advantages over traditional HTTP-based Web services: 1) services are discoverable without the need of an external registry, 2) asynchronous invocation eliminates the need for ad-hoc solutions like polling, and 3) input and output types defined in the service allow for generation of clients on the fly without the need of an external semantics description. The many advantages over existing technologies make XMPP a highly interesting candidate for next generation online services in bioinformatics.
8.
  • Westholm, Jakub Orzechowski, et al. (author)
  • Genome-scale study of the importance of binding site context for transcription factor binding and gene regulation.
  • 2008
  • In: BMC Bioinformatics. - : Springer Science and Business Media LLC. - 1471-2105. ; 9, p. 484-
  • Journal article (peer-reviewed). Abstract:
    • BACKGROUND The rate of mRNA transcription is controlled by transcription factors that bind to specific DNA motifs in promoter regions upstream of protein coding genes. Recent results indicate that not only the presence of a motif but also motif context (for example the orientation of a motif or its location relative to the coding sequence) is important for gene regulation. RESULTS In this study we present ContextFinder, a tool that is specifically aimed at identifying cases where motif context is likely to affect gene regulation. We used ContextFinder to examine the role of motif context in S. cerevisiae both for DNA binding by transcription factors and for effects on gene expression. For DNA binding we found significant patterns of motif location bias, whereas motif orientations did not seem to matter. Motif context appears to affect gene expression even more than it affects DNA binding, as biases in both motif location and orientation were more frequent in promoters of co-expressed genes. We validated our results against data on nucleosome positioning, and found a negative correlation between preferred motif locations and nucleosome occupancy. CONCLUSION We conclude that the requirement for stable binding of transcription factors to DNA and their subsequent function in gene regulation can impose constraints on motif context.
9.
  • Kuhn, Thomas, et al. (author)
  • CDK-Taverna : an open workflow environment for cheminformatics
  • 2010
  • In: BMC Bioinformatics. - : Springer Science and Business Media LLC. - 1471-2105. ; 11, p. 159-
  • Journal article (peer-reviewed). Abstract:
    • Background Small molecules are of increasing interest for bioinformatics in areas such as metabolomics and drug discovery. The recent release of large open access chemistry databases generates a demand for flexible tools to process them and discover new knowledge. To freely support open science based on these data resources, it is desirable for the processing tools to be open-source and available for everyone. Results Here we describe a novel combination of the workflow engine Taverna and the cheminformatics library Chemistry Development Kit (CDK) resulting in an open source workflow solution for cheminformatics. We have implemented more than 160 different workers to handle specific cheminformatics tasks. We describe the applications of CDK-Taverna in various usage scenarios. Conclusions The combination of the workflow engine Taverna and the Chemistry Development Kit provides the first open source cheminformatics workflow solution for the biosciences. With the Taverna-community working towards a more powerful workflow engine and a more user-friendly user interface, CDK-Taverna has the potential to become a free alternative to existing proprietary workflow tools.
10.
  • Anisimov, Sergey, et al. (author)
  • Incidence of "quasi-ditags" in catalogs generated by Serial Analysis of Gene Expression (SAGE)
  • 2004
  • In: BMC Bioinformatics. - : Springer Science and Business Media LLC. - 1471-2105. ; 5
  • Journal article (peer-reviewed). Abstract:
    • Background: Serial Analysis of Gene Expression (SAGE) is a functional genomic technique that quantitatively analyzes the cellular transcriptome. The analysis of SAGE libraries relies on the identification of ditags from sequencing files; however, the software used to examine SAGE libraries cannot distinguish between authentic and false ditags ("quasi-ditags"). Results: We provide examples of quasi-ditags that originate from cloning and sequencing artifacts (i.e. genomic contamination or random combinations of nucleotides) that are included in SAGE libraries. We have employed a mathematical model to predict the frequency of quasi-ditags in random nucleotide sequences, and our data show that clones containing less than or equal to 2 ditags (which include chromosomal cloning artifacts) should be excluded from the analysis of SAGE catalogs. Conclusions: Cloning and sequencing artifacts contaminating SAGE libraries could be eliminated using a simple pre-screening procedure to increase the reliability of the data.
Publication type
journal article (235)
conference paper (1)
research review (1)
Content type
peer-reviewed (235)
other academic/artistic (2)
Author/editor
Trygg, Johan (7)
Lagergren, Jens (7)
Arvestad, Lars (7)
Kristiansson, Erik, ... (4)
Nielsen, Jens B, 196 ... (4)
Eklund, Martin (4)
Gustafsson, Mats G. (4)
Spjuth, Ola (4)
Warringer, Jonas, 19 ... (4)
Repsilber, Dirk, 197 ... (4)
Mostad, Petter, 1964 (3)
Hellander, Andreas (3)
Sonnhammer, Erik L L (3)
Persson, Bengt (3)
Bengtsson, Henrik (3)
Sjödin, Andreas (3)
Friedman, Ran (3)
Sennblad, Bengt (3)
Orešič, Matej, 1967- (3)
Höglund, Mattias (3)
Häkkinen, Jari (3)
Vallon-Christersson, ... (3)
Fontes, Magnus (3)
Vezzi, Francesco (3)
Vihinen, Mauno (3)
Eriksson, Daniel (2)
Roca, J (2)
Nilsson, R. Henrik, ... (2)
Larsson, Karl-Henrik ... (2)
Nilsson, Mats (2)
Sunnerhagen, Per, 19 ... (2)
Harris, RA (2)
Sonnhammer, ELL (2)
Lundeberg, Joakim (2)
Maier, D (2)
Agathangelidis, Andr ... (2)
Rosenquist, Richard (2)
Stamatopoulos, Kosta ... (2)
Carninci, P (2)
Buetti-Dinh, Antoine ... (2)
Kiani, NA (2)
Veerla, Srinivas (2)
Lindgren, David (2)
Ringnér, Markus (2)
Alexeyenko, Andrey (2)
Delhomme, Nicolas (2)
Policriti, Alberto (2)
Ali, Raja Hashim (2)
Muhammad, Sayyed Auw ... (2)
Cloarec, Olivier (2)
University
Uppsala universitet (60)
Karolinska Institutet (52)
Lunds universitet (32)
Göteborgs universitet (31)
Kungliga Tekniska Högskolan (26)
Stockholms universitet (23)
Chalmers tekniska högskola (19)
Linköpings universitet (18)
Umeå universitet (17)
Örebro universitet (11)
Sveriges Lantbruksuniversitet (7)
Högskolan i Skövde (5)
Linnéuniversitetet (5)
Högskolan i Halmstad (1)
Malmö universitet (1)
Gymnastik- och idrottshögskolan (1)
Naturhistoriska riksmuseet (1)
Language
English (237)
Research subject (UKÄ/SCB)
Natural sciences (147)
Medical and health sciences (46)
Engineering and technology (9)
Agricultural sciences (4)
