SwePub
Search the SwePub database


Result list for search "AMNE:(NATURVETENSKAP) AMNE:(Data och informationsvetenskap) AMNE:(Bioinformatik)"


  • Result 1-50 of 2074
2.
  • Liu, Yuanhua, 1971, et al. (author)
  • Considering the importance of user profiles in interface design
  • 2009
  • In: User Interfaces. ; s. 23-
  • Book chapter (other academic/artistic)
    • User profile is a popular term widely employed during product design processes by industrial companies. Such a profile is normally intended to represent real users of a product. The ultimate purpose of a user profile is actually to help designers to recognize or learn about the real user by presenting them with a description of a real user’s attributes, for instance: the user’s gender, age, educational level, attitude, technical needs and skill level. The aim of this chapter is to provide information on the current knowledge and research about user profile issues, as well as to emphasize the importance of considering these issues in interface design. In this chapter, we mainly focus on how users’ differences in expertise affect their performance or activity in various interaction contexts. Considering the complex interaction situations in practice, novice and expert users’ interactions with medical user interfaces of different technical complexity will be analyzed as examples: one focuses on novice and expert users’ differences when interacting with simple medical interfaces, and the other focuses on differences when interacting with complex medical interfaces. Four issues will be analyzed and discussed: (1) how novice and expert users differ in terms of performance during the interaction; (2) how novice and expert users differ in the perspective of cognitive mental models during the interaction; (3) how novice and expert users should be defined in practice; and (4) what the main differences between novice and expert users imply for interface design. Besides describing the effect of users’ expertise differences during the interface design process, we will also pinpoint some potential problems for the research on interface design, as well as some future challenges that academic researchers and industrial engineers should face in practice.
  •  
3.
  • Munappy, Aiswarya Raj, 1990, et al. (author)
  • On the Trade-off Between Robustness and Complexity in Data Pipelines
  • 2021
  • In: Quality of Information and Communications Technology. - Cham : Springer. - 9783030853464 - 9783030853471 ; 1439, s. 401-415
  • Conference paper (peer-reviewed)
    • Data pipelines play an important role throughout the data management process whether these are used for data analytics or machine learning. Data-driven organizations can make use of data pipelines for producing good quality data applications. Moreover, data pipelines ensure end-to-end velocity by automating the processes involved in extracting, transforming, combining, validating, and loading data for further analysis and visualization. However, the robustness of data pipelines is equally important since unhealthy data pipelines can add more noise to the input data. This paper identifies the essential elements for a robust data pipeline and analyses the trade-off between data pipeline robustness and complexity.
  •  
4.
  • Nilsson, R. Henrik, 1976, et al. (author)
  • Mycobiome diversity: high-throughput sequencing and identification of fungi.
  • 2019
  • In: Nature reviews. Microbiology. - : Springer Science and Business Media LLC. - 1740-1534 .- 1740-1526. ; 17, s. 95-109
  • Research review (peer-reviewed)
    • Fungi are major ecological players in both terrestrial and aquatic environments by cycling organic matter and channelling nutrients across trophic levels. High-throughput sequencing (HTS) studies of fungal communities are redrawing the map of the fungal kingdom by hinting at its enormous - and largely uncharted - taxonomic and functional diversity. However, HTS approaches come with a range of pitfalls and potential biases, cautioning against unwary application and interpretation of HTS technologies and results. In this Review, we provide an overview and practical recommendations for aspects of HTS studies ranging from sampling and laboratory practices to data processing and analysis. We also discuss upcoming trends and techniques in the field and summarize recent and noteworthy results from HTS studies targeting fungal communities and guilds. Our Review highlights the need for reproducibility and public data availability in the study of fungal communities. If the associated challenges and conceptual barriers are overcome, HTS offers immense possibilities in mycology and elsewhere.
  •  
5.
  • Sanli, Kemal, et al. (author)
  • Metagenomic Sequencing of Marine Periphyton: Taxonomic and Functional Insights into Biofilm Communities
  • 2015
  • In: Frontiers in Microbiology. - : Frontiers Media SA. - 1664-302X. ; 6:1192
  • Journal article (peer-reviewed)
    • Periphyton communities are complex phototrophic, multispecies biofilms that develop on surfaces in aquatic environments. These communities harbor a large diversity of organisms comprising viruses, bacteria, algae, fungi, protozoans and metazoans. However, thus far the total biodiversity of periphyton has not been described. In this study, we use metagenomics to characterize periphyton communities from the marine environment of the Swedish west coast. Although we found approximately ten times more eukaryotic rRNA marker gene sequences compared to prokaryotic, the whole metagenome-based similarity searches showed that bacteria constitute the most abundant phyla in these biofilms. We show that marine periphyton encompass a range of heterotrophic and phototrophic organisms. Heterotrophic bacteria, including the majority of proteobacterial clades and Bacteroidetes, and eukaryotic macro-invertebrates were found to dominate periphyton. The phototrophic groups comprise Cyanobacteria and the alpha-proteobacterial genus Roseobacter, followed by different micro- and macro-algae. We also assess the metabolic pathways that predispose these communities to an attached lifestyle. Functional indicators of the biofilm form of life in periphyton involve genes coding for enzymes that catalyze the production and degradation of extracellular polymeric substances, mainly in the form of complex sugars such as starch and glycogen-like meshes together with chitin. Genes for 278 different transporter proteins were detected in the metagenome, constituting the most abundant protein complexes. Finally, genes encoding enzymes that participate in anaerobic pathways, such as denitrification and methanogenesis, were detected suggesting the presence of anaerobic or low-oxygen micro-zones within the biofilms.
  •  
6.
  • Isaksson, Martin, et al. (author)
  • Adaptive Expert Models for Federated Learning
  • 2023
  • In: Lecture Notes in Computer Science. - Cham : Springer Science and Business Media Deutschland GmbH. - 9783031289958 ; 13448 LNAI, s. 1-16
  • Conference paper (peer-reviewed)
    • Federated Learning (FL) is a promising framework for distributed learning when data is private and sensitive. However, the state-of-the-art solutions in this framework are not optimal when data is heterogeneous and non-IID. We propose a practical and robust approach to personalization in FL that adjusts to heterogeneous and non-IID data by balancing exploration and exploitation of several global models. To achieve our aim of personalization, we use a Mixture of Experts (MoE) that learns to group clients that are similar to each other, while using the global models more efficiently. We show that our approach achieves an accuracy up to 29.78% better than the state-of-the-art and up to 4.38% better compared to a local model in a pathological non-IID setting, even though we tune our approach in the IID setting. © 2023, The Author(s)
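As a hedged illustration of the mixture-of-experts gating idea this abstract describes (not the authors' implementation: the linear experts, the loss-based gate, and all data below are hypothetical), a client can weight several global models by how well each fits its local data:

```python
import math
import random

# Minimal pure-Python sketch: a client mixes K global "expert" models via a
# softmax gate over negative local losses, so well-fitting experts get weight.
random.seed(0)
K, d, n = 3, 5, 40
dot = lambda u, v: sum(a * b for a, b in zip(u, v))

# K global linear models (the "experts") shared across the federation
experts = [[random.gauss(0, 1) for _ in range(d)] for _ in range(K)]

# One client's local (non-IID) data; this client happens to match expert 1
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
y = [dot(x, experts[1]) + random.gauss(0, 0.01) for x in X]

# Gate: softmax over negative per-expert local losses (shifted for stability)
losses = [sum((dot(x, w) - t) ** 2 for x, t in zip(X, y)) / n for w in experts]
z = [math.exp(-(l - min(losses))) for l in losses]
gate = [v / sum(z) for v in z]

# Personalized prediction: gate-weighted mixture of the experts' outputs
y_hat = [sum(g * dot(x, w) for g, w in zip(gate, experts)) for x in X]
print("gate:", [round(g, 3) for g in gate])
```

The gate concentrates on the expert that explains the client's data, which is the balance of exploration (trying all global models) and exploitation (favoring the best fit) the abstract refers to.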
  •  
7.
  • Abarenkov, Kessy, et al. (author)
  • Protax-fungi: A web-based tool for probabilistic taxonomic placement of fungal internal transcribed spacer sequences
  • 2018
  • In: New Phytologist. - : Wiley. - 0028-646X .- 1469-8137. ; 220:2, s. 517-525
  • Journal article (peer-reviewed)
    • © 2018 New Phytologist Trust. Incompleteness of reference sequence databases and unresolved taxonomic relationships complicate taxonomic placement of fungal sequences. We developed Protax-fungi, a general tool for taxonomic placement of fungal internal transcribed spacer (ITS) sequences, and implemented it into the PlutoF platform of the UNITE database for molecular identification of fungi. With empirical data on root- and wood-associated fungi, Protax-fungi reliably identified (with at least 90% identification probability) the majority of sequences to the order level but only around one-fifth of them to the species level, reflecting the current limited coverage of the databases. Protax-fungi outperformed the Sintax and Rdb classifiers in terms of increased accuracy and decreased calibration error when applied to data on mock communities representing species groups with poor sequence database coverage. We applied Protax-fungi to examine the internal consistencies of the Index Fungorum and UNITE databases. This revealed inconsistencies in the taxonomy database as well as mislabelling and sequence quality problems in the reference database. The corresponding improvements were implemented in both databases. Protax-fungi provides a robust tool for performing statistically reliable identifications of fungi in spite of the incompleteness of extant reference sequence databases and unresolved taxonomic relationships.
  •  
8.
  • Fu, Keren, et al. (author)
  • Deepside: A general deep framework for salient object detection
  • 2019
  • In: Neurocomputing. - : Elsevier BV. - 0925-2312 .- 1872-8286. ; 356, s. 69-82
  • Journal article (peer-reviewed)
    • Deep learning-based salient object detection techniques have shown impressive results compared to conventional saliency detection by handcrafted features. Integrating hierarchical features of Convolutional Neural Networks (CNN) to achieve fine-grained saliency detection is a current trend, and various deep architectures are proposed by researchers, including “skip-layer” architecture, “top-down” architecture, “short-connection” architecture and so on. While these architectures have achieved progressive improvement on detection accuracy, the underlying distinctions and connections between these schemes remain unclear. In this paper, we review and draw underlying connections between these architectures, and show that they actually could be unified into a general framework, which simply has side structures with different depths. Based on the idea of designing deeper side structures for better detection accuracy, we propose a unified framework called Deepside that can be deeply supervised to incorporate hierarchical CNN features. Additionally, to fuse multiple side outputs from the network, we propose a novel fusion technique based on segmentation-based pooling, which serves as a built-in component in the CNN architecture and guarantees more accurate boundary details of detected salient objects. The effectiveness of the proposed Deepside scheme against state-of-the-art models is validated on 8 benchmark datasets.
  •  
9.
  • Robinson, Jonathan, 1986, et al. (author)
  • An atlas of human metabolism
  • 2020
  • In: Science Signaling. - : American Association for the Advancement of Science (AAAS). - 1945-0877 .- 1937-9145. ; 13:624
  • Journal article (peer-reviewed)
    • Genome-scale metabolic models (GEMs) are valuable tools to study metabolism and provide a scaffold for the integrative analysis of omics data. Researchers have developed increasingly comprehensive human GEMs, but the disconnect among different model sources and versions impedes further progress. We therefore integrated and extensively curated the most recent human metabolic models to construct a consensus GEM, Human1. We demonstrated the versatility of Human1 through the generation and analysis of cell- and tissue-specific models using transcriptomic, proteomic, and kinetic data. We also present an accompanying web portal, Metabolic Atlas (https://www.metabolicatlas.org/), which facilitates further exploration and visualization of Human1 content. Human1 was created using a version-controlled, open-source model development framework to enable community-driven curation and refinement. This framework allows Human1 to be an evolving shared resource for future studies of human health and disease.
  •  
10.
  • Gerken, Jan, 1991, et al. (author)
  • Equivariance versus augmentation for spherical images
  • 2022
  • In: Proceedings of Machine Learning Research. ; s. 7404-7421
  • Conference paper (peer-reviewed)
    • We analyze the role of rotational equivariance in convolutional neural networks (CNNs) applied to spherical images. We compare the performance of the group equivariant networks known as S2CNNs and standard non-equivariant CNNs trained with an increasing amount of data augmentation. The chosen architectures can be considered baseline references for the respective design paradigms. Our models are trained and evaluated on single or multiple items from the MNIST- or FashionMNIST dataset projected onto the sphere. For the task of image classification, which is inherently rotationally invariant, we find that by considerably increasing the amount of data augmentation and the size of the networks, it is possible for the standard CNNs to reach at least the same performance as the equivariant network. In contrast, for the inherently equivariant task of semantic segmentation, the non-equivariant networks are consistently outperformed by the equivariant networks with significantly fewer parameters. We also analyze and compare the inference latency and training times of the different networks, enabling detailed tradeoff considerations between equivariant architectures and data augmentation for practical problems.
  •  
11.
  • Martinsson, John, et al. (author)
  • Automatic blood glucose prediction with confidence using recurrent neural networks
  • 2018
  • In: CEUR Workshop Proceedings. - : CEUR. ; 2148, s. 64-68
  • Conference paper (peer-reviewed)
    • Low-cost sensors continuously measuring blood glucose levels in intervals of a few minutes and mobile platforms combined with machine-learning (ML) solutions enable personalized precision health and disease management. ML solutions must be adapted to different sensor technologies, analysis tasks and individuals. This raises the issue of scale for creating such adapted ML solutions. We present an approach for predicting blood glucose levels for diabetics up to one hour into the future. The approach is based on recurrent neural networks trained in an end-to-end fashion, requiring nothing but the glucose level history for the patient. The model outputs the prediction along with an estimate of its certainty, helping users to interpret the predicted levels. The approach needs no feature engineering or data pre-processing, and is computationally inexpensive.
  •  
12.
  • Hamon, Thierry, et al. (author)
  • Combining Compositionality and Pagerank for the Identification of Semantic Relations between Biomedical Words
  • 2012
  • In: BioNLP. - 9781937284206 - 1937284204 ; s. 109-117
  • Conference paper (peer-reviewed)
    • The acquisition of semantic resources and relations is an important task for several applications, such as query expansion, information retrieval and extraction, and machine translation. However, their validity should also be computed and indicated, especially for automatic systems and applications. We exploit compositionality-based methods for the acquisition of synonymy relations and of indicators of these synonyms. We then apply a PageRank-derived algorithm to the obtained semantic graph in order to filter out the acquired synonyms. Evaluation performed with two independent experts indicates that the quality of synonyms is systematically improved by 10 to 15% after filtering.
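As a hedged sketch of the filtering step this abstract describes (a plain power-iteration PageRank over a synonym graph; the terms, graph, and keep-fraction below are hypothetical, not the paper's method or data):

```python
# Minimal pure-Python PageRank over an undirected synonym graph; candidate
# synonym pairs are then ranked by node scores and the weakest are filtered out.
def pagerank(edges, damping=0.85, iters=100, tol=1e-10):
    """Power-iteration PageRank over a graph given as (word, word) edge pairs."""
    nodes = sorted({v for e in edges for v in e})
    nbrs = {v: [] for v in nodes}
    for a, b in edges:          # synonymy is symmetric: link both ways
        nbrs[a].append(b)
        nbrs[b].append(a)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            if nbrs[v]:
                share = damping * rank[v] / len(nbrs[v])
                for w in nbrs[v]:
                    new[w] += share
            else:               # dangling node: spread its mass uniformly
                for w in nodes:
                    new[w] += damping * rank[v] / n
        if sum(abs(new[v] - rank[v]) for v in nodes) < tol:
            return new
        rank = new
    return rank

def filter_synonyms(candidate_pairs, edges, keep_fraction=0.85):
    """Keep only the best-ranked fraction of candidate synonym pairs."""
    scores = pagerank(edges)
    ranked = sorted(candidate_pairs,
                    key=lambda p: scores[p[0]] + scores[p[1]],
                    reverse=True)
    return ranked[: max(1, int(len(ranked) * keep_fraction))]

# Hypothetical candidate synonyms among biomedical words
edges = [("cardiac", "heart"), ("heart", "myocardial"),
         ("cardiac", "myocardial"), ("heart", "pump")]
print(filter_synonyms(edges, edges, keep_fraction=0.5))
```

Pairs involving well-connected hub terms survive the cut, which mirrors the idea of using graph centrality as a validity indicator for acquired synonyms.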
  •  
13.
  • Buckland, Philip I., 1973-, et al. (author)
  • The Strategic Environmental Archaeology Database : a resource for international, multiproxy and transdisciplinary studies of environmental and climatic change
  • 2015
  • Conference paper (peer-reviewed)
    • Climate and environmental change are global challenges which require global data and infrastructure to investigate. These challenges also require a multi-proxy approach, integrating evidence from Quaternary science and archaeology with information from studies on modern ecology and physical processes among other disciplines. The Strategic Environmental Archaeology Database (SEAD http://www.sead.se) is a Swedish-based international research e-infrastructure for storing, managing, analysing and disseminating palaeoenvironmental data from an almost unlimited number of analysis methods. The system currently makes available raw data from over 1500 sites (>5300 datasets) and the analysis of Quaternary fossil insects, plant macrofossils, pollen, geochemistry and sediment physical properties, dendrochronology and wood anatomy, ceramic geochemistry and bones, along with numerous dating methods. This capacity will be expanded in the near future to include isotopes, multi-spectral and archaeo-metallurgical data. SEAD also includes expandable climate and environment calibration datasets, a complete bibliography and extensive metadata and services for linking these data to other resources. All data are available as Open Access through http://qsead.sead.se and downloadable software. SEAD is maintained and managed at the Environmental Archaeology Lab and HUMlab at Umeå University, Sweden. Development and data ingestion is progressing in cooperation with The Laboratory for Ceramic Research and the National Laboratory for Wood Anatomy and Dendrochronology at Lund University, Sweden, the Archaeological Research Laboratory, Stockholm University, the Geoarchaeological Laboratory, Swedish National Historical Museums Agency and several international partners and research projects. Current plans include expanding its capacity to serve as a data source for any system and integration with the Swedish National Heritage Board's information systems.
SEAD is partnered with the Neotoma palaeoecology database (http://www.neotomadb.org) and a new initiative for building cyberinfrastructure for transdisciplinary research and visualization of the long-term human ecodynamics of the North Atlantic funded by the National Science Foundation (NSF).
  •  
14.
  • Abarenkov, Kessy, et al. (author)
  • Annotating public fungal ITS sequences from the built environment according to the MIxS-Built Environment standard – a report from a May 23-24, 2016 workshop (Gothenburg, Sweden)
  • 2016
  • In: MycoKeys. - : Pensoft Publishers. - 1314-4057 .- 1314-4049. ; 16, s. 1-15
  • Journal article (peer-reviewed)
    • Recent molecular studies have identified substantial fungal diversity in indoor environments. Fungi and fungal particles have been linked to a range of potentially unwanted effects in the built environment, including asthma, decay of building materials, and food spoilage. The study of the built mycobiome is hampered by a number of constraints, one of which is the poor state of the metadata annotation of fungal DNA sequences from the built environment in public databases. In order to enable precise interrogation of such data – for example, “retrieve all fungal sequences recovered from bathrooms” – a workshop was organized at the University of Gothenburg (May 23-24, 2016) to annotate public fungal barcode (ITS) sequences according to the MIxS-Built Environment annotation standard (http://gensc.org/mixs/). The 36 participants assembled a total of 45,488 data points from the published literature, including the addition of 8,430 instances of countries of collection from a total of 83 countries, 5,801 instances of building types, and 3,876 instances of surface-air contaminants. The results were implemented in the UNITE database for molecular identification of fungi (http://unite.ut.ee) and were shared with other online resources. Data obtained from human/animal pathogenic fungi will furthermore be verified on culture based metadata for subsequent inclusion in the ISHAM-ITS database (http://its.mycologylab.org).
  •  
15.
  • Buckland, Philip I., 1973-, et al. (author)
  • BugsCEP, an entomological database twenty-five years on
  • 2014
  • In: Antenna (Journal of the Royal Entomological Society). - London : Royal Entomological Society of London. - 0140-1890. ; 38:1, s. 21-28
  • Journal article (peer-reviewed)
  •  
16.
  • Boulund, Fredrik, et al. (author)
  • Computational and Statistical Considerations in the Analysis of Metagenomic Data
  • 2018
  • In: Metagenomics: Perspectives, Methods, and Applications. - 9780081022689 ; s. 81-102
  • Book chapter (other academic/artistic)
    • In shotgun metagenomics, microbial communities are studied by random DNA fragments sequenced directly from environmental and clinical samples. The resulting data is massive, potentially consisting of billions of sequence reads describing millions of microbial genes. The data interpretation is therefore nontrivial and dependent on dedicated computational and statistical methods. In this chapter we discuss the many challenges associated with the analysis of shotgun metagenomic data. First, we address computational issues related to the quantification of genes in metagenomes. We describe algorithms for efficient sequence comparisons, recommended practices for setting up data workflows and modern high-performance computer resources that can be used to perform the analysis. Next, we outline the statistical aspects, including removal of systematic errors and how to identify differences between microbial communities from different experimental conditions. We conclude by underlining the increasing importance of efficient and reliable computational and statistical solutions in the analysis of large metagenomic datasets.
  •  
18.
  • Willighagen, Egon, 1974-, et al. (author)
  • Linking the Resource Description Framework to cheminformatics and proteochemometrics
  • 2011
  • In: Journal of Biomedical Semantics. - 2041-1480. ; 2:Suppl 1, s. 6-
  • Journal article (peer-reviewed)
    • BACKGROUND: Semantic web technologies are finding their way into the life sciences. Ontologies and semantic markup have already been used for more than a decade in molecular sciences, but have not yet found widespread use. The semantic web technology Resource Description Framework (RDF) and related methods are proving sufficiently versatile to change that situation. RESULTS: The work presented here focuses on linking RDF approaches to existing molecular chemometrics fields, including cheminformatics, QSAR modeling and proteochemometrics. Applications are presented that link RDF technologies to methods from statistics and cheminformatics, including data aggregation, visualization, chemical identification, and property prediction. They demonstrate how this can be done using various existing RDF standards and cheminformatics libraries. For example, we show how IC50 and Ki values are modeled for a number of biological targets using data from the ChEMBL database. CONCLUSIONS: We have shown that existing RDF standards can suitably be integrated into existing molecular chemometrics methods. Platforms that unite these technologies, like Bioclipse, make this even simpler and more transparent. Being able to create and share workflows that integrate data aggregation and analysis (visual and statistical) is beneficial to interoperability and reproducibility. The current work shows that RDF approaches are sufficiently powerful to support molecular chemometrics workflows.
  •  
19.
  • Spjuth, Ola, 1977-, et al. (author)
  • E-Science technologies in a workflow for personalized medicine using cancer screening as a case study
  • 2017
  • In: JAMIA Journal of the American Medical Informatics Association. - : Oxford University Press. - 1067-5027 .- 1527-974X. ; 24:5, s. 950-957
  • Journal article (peer-reviewed)
    • Objective: We provide an e-Science perspective on the workflow from risk factor discovery and classification of disease to evaluation of personalized intervention programs. As case studies, we use personalized prostate and breast cancer screenings. Materials and Methods: We describe an e-Science initiative in Sweden, e-Science for Cancer Prevention and Control (eCPC), which supports biomarker discovery and offers decision support for personalized intervention strategies. The generic eCPC contribution is a workflow with 4 nodes applied iteratively, and the concept of e-Science signifies systematic use of tools from the mathematical, statistical, data, and computer sciences. Results: The eCPC workflow is illustrated through 2 case studies. For prostate cancer, an in-house personalized screening tool, the Stockholm-3 model (S3M), is presented as an alternative to prostate-specific antigen testing alone. S3M is evaluated in a trial setting and plans for rollout in the population are discussed. For breast cancer, new biomarkers based on breast density and molecular profiles are developed and the US multicenter Women Informed to Screen Depending on Measures (WISDOM) trial is referred to for evaluation. While current eCPC data management uses a traditional data warehouse model, we discuss eCPC-developed features of a coherent data integration platform. Discussion and Conclusion: E-Science tools are a key part of an evidence-based process for personalized medicine. This paper provides a structured workflow from data and models to evaluation of new personalized intervention strategies. The importance of multidisciplinary collaboration is emphasized. Importantly, the generic concepts of the suggested eCPC workflow are transferable to other disease domains, although each disease will require tailored solutions.
  •  
20.
  • Younes, Sara (author)
  • Uncovering biomarkers and molecular heterogeneity of complex diseases : Utilizing the power of Data Science
  • 2021
  • Doctoral thesis (other academic/artistic)
    • Uncovering causal drivers of complex diseases is yet a difficult challenge. Unlike single-gene disorders, complex diseases are heterogeneous and are caused by a combination of genetic, environmental, and lifestyle factors, which complicates the identification of patient subgroups and the disease causal drivers. In order to study the dimensions of complex diseases, analyzing different omics data is a necessity. The main goal of this thesis is to provide computational approaches for analyzing omics data of two complex diseases: Acute Myeloid Leukaemia (AML) and Systemic Lupus Erythematosus (SLE). Additionally, we aim at providing a method that deals with integration issues that usually arise when combining complex disease omics (specifically metabolomics) data from multiple data sources. AML is a cancer of the myeloid blood cells that is known for its heterogeneity. Patients usually respond to treatment and achieve a complete remission state. However, a majority of patients relapse or develop treatment resistance. In paper I, we focus on investigating recurrent genomic alterations in adult and pediatric relapsed and primary resistant AML that may explain disease progression. In paper II, we characterize changes in the transcriptome of AML over the course of the disease, incorporating machine learning analysis. SLE is a heterogeneous autoimmune disease characterized by unpredictable periods of flares. The flares are presented as different SLE disease activities (DA). Studies on the combinatorial effects of genes towards the manifestation of SLE DAs in patient subgroups have been limited. In paper III, we analyze gene expression data of pediatric SLE using interpretable machine learning. The aim was to study the co-predictive transcriptomic factors driving disease progression, discover the disease subtypes, and explore the relationship between transcriptomic factors and the phenotypes associated with the discovered subtypes. Recently, metabolomics has become a crucial dimension in major multi-omics complex disease studies. Small-compound databases contain a large amount of information for metabolites. However, the existing redundancy of information in the databases leads to major standardization issues. In paper IV, we aim at resolving the inconsistencies that exist when linking and combining metabolomics data from several databases by introducing the new R package MetaFetcheR.
  •  
21.
  • de Dios, Eddie, et al. (author)
  • Introduction to Deep Learning in Clinical Neuroscience
  • 2022
  • In: Acta Neurochirurgica, Supplement. - Cham : Springer International Publishing. - 2197-8395 .- 0065-1419. ; 134, s. 79-89
  • Book chapter (other academic/artistic)
    • The use of deep learning (DL) is rapidly increasing in clinical neuroscience. The term denotes models with multiple sequential layers of learning algorithms, architecturally similar to neural networks of the brain. We provide examples of DL in analyzing MRI data and discuss potential applications and methodological caveats. Important aspects are data pre-processing, volumetric segmentation, and specific task-performing DL methods, such as CNNs and AEs. Additionally, GAN-expansion and domain mapping are useful DL techniques for generating artificial data and combining several smaller datasets. We present results of DL-based segmentation and accuracy in predicting glioma subtypes based on MRI features. Dice scores range from 0.77 to 0.89. In mixed glioma cohorts, IDH mutation can be predicted with a sensitivity of 0.98 and specificity of 0.97. Results in test cohorts have shown improvements of 5–7% in accuracy, following GAN-expansion of data and domain mapping of smaller datasets. The provided DL examples are promising, although not yet in clinical practice. DL has demonstrated usefulness in data augmentation and for overcoming data variability. DL methods should be further studied, developed, and validated for broader clinical use. Ultimately, DL models can serve as effective decision support systems, and are especially well-suited for time-consuming, detail-focused, and data-ample tasks.
  •  
22.
  • Lindgren, Erik, 1980, et al. (author)
  • Analysis of industrial X-ray computed tomography data with deep neural networks
  • 2021
  • In: Proceedings of SPIE - The International Society for Optical Engineering. - : SPIE. - 0277-786X .- 1996-756X. ; 11840
  • Conference paper (peer-reviewed)
    • X-ray computed tomography (XCT) is increasingly utilized industrially at material- and process development as well as in non-destructive quality control; XCT is important to many emerging manufacturing technologies, for example metal additive manufacturing. These trends lead to increased needs of safe automatic or semi-automatic data interpretation, considered an open research question for many critical high value industrial products such as within the aerospace industry. By safe, we mean that the interpretation is not allowed to unawarely or unexpectedly fail; specifically the algorithms must react sensibly to inputs dissimilar to the training data, so called out-of-distribution (OOD) inputs. In this work we explore data interpretation with deep neural networks to address: robust safe data interpretation which includes a confidence estimate with respect to OOD data, an OOD detector; generation of realistic synthetic material flaw indications for the material science and nondestructive evaluation community. We have focused on industrial XCT related challenges, addressing difficulties with spatially correlated X-ray quantum noise. Results are reported on training auto-encoders (AE) and generative adversarial networks (GAN), on a publicly available XCT dataset of additively manufactured metal. We demonstrate that adding modeled X-ray noise during training reduces artefacts in the generated imperfection indications as well as improves the OOD detector performance. In addition, we show that the OOD detector can detect real and synthetic OOD data and still model the accepted in-distribution data down to the X-ray noise levels.
  •  
23.
  • Al Sabbagh, Khaled, 1987, et al. (author)
  • Improving Data Quality for Regression Test Selection by Reducing Annotation Noise
  • 2020
  • In: Proceedings - 46th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2020. ; , s. 191-194
  • Conference paper (peer-reviewed)abstract
    • Big data and machine learning models have been increasingly used to support software engineering processes and practices. One example is the use of machine learning models to improve test case selection in continuous integration. However, one of the challenges in building such models is the identification and reduction of noise that often comes in large data. In this paper, we present a noise reduction approach that deals with the problem of contradictory training entries. We empirically evaluate the effectiveness of the approach in the context of selective regression testing. For this purpose, we use a curated training set as input to a tree-based machine learning ensemble and compare the classification precision, recall, and f-score against a non-curated set. Our study shows that using the noise reduction approach on the training instances gives better results in prediction with an improvement of 37% on precision, 70% on recall, and 59% on f-score.
  •  
24.
  • Murtagh, Fionn, et al. (author)
  • Core conflictual relationship : text mining to discover what and when
  • 2018
  • In: Language and Psychoanalysis. - Edinburgh : University of Edinburgh. - 2049-324X. ; 7:2, s. 4-28
  • Journal article (peer-reviewed)abstract
    • Following a detailed presentation of the Core Conflictual Relationship Theme (CCRT), the objective is to apply relevant methods for what has been described as verbalization and visualization of data, also termed data mining, text mining, and knowledge discovery in data. The Correspondence Analysis methodology, also termed Geometric Data Analysis, is shown in a case study to be comprehensive and revealing. Quite innovative here is how the analysis process is structured. For both the illustrative and revealing aspects of the case study, relatively extensive dream reports are used. The dream reports are from an open source repository of dream reports, and the current study proposes a possible framework for the analysis of dream report narratives and, further, how such an analysis could be relevant within the psychotherapeutic context. The Geometric Data Analysis here confirms the validity of the CCRT method.
  •  
25.
  • van der Tak, F. F. S., et al. (author)
  • The Leiden Atomic and Molecular Database (LAMDA): Current status, recent updates, and future plans
  • 2020
  • In: Atoms. - : MDPI AG. - 2218-2004. ; 8:2
  • Research review (peer-reviewed)abstract
    • The Leiden Atomic and Molecular Database (LAMDA) collects spectroscopic information and collisional rate coefficients for molecules, atoms, and ions of astrophysical and astrochemical interest. We describe the developments of the database since its inception in 2005, and outline our plans for the near future. Such a database is constrained both by the nature of its uses and by the availability of accurate data: we suggest ways to improve the synergies among users and suppliers of data. We summarize some recent developments in computation of collisional cross sections and rate coefficients. We consider atomic and molecular data that are needed to support astrophysics and astrochemistry with upcoming instruments that operate in the mid- and far-infrared parts of the spectrum.
  •  
26.
  • Daoud, Adel, 1981, et al. (author)
  • Using Satellite Images and Deep Learning to Measure Health and Living Standards in India
  • 2023
  • In: Social Indicators Research. - : SPRINGER. - 0303-8300 .- 1573-0921. ; 167:1-3, s. 475-505
  • Journal article (peer-reviewed)abstract
    • Using deep learning with satellite images enhances our understanding of human development at a granular spatial and temporal level. Most studies have focused on Africa and on a narrow set of asset-based indicators. This article leverages georeferenced village-level census data from across 40% of the population of India to train deep models that predict 16 indicators of human well-being from Landsat 7 imagery. Based on the principles of transfer learning, the census-based model is used as a feature extractor to train another model that predicts an even larger set of developmental variables—over 90 variables—included in two rounds of the National Family Health Survey (NFHS). The census-based-feature-extractor model outperforms the current standard in the literature for most of these NFHS variables. Overall, the results show that combining satellite data with Indian Census data unlocks rich information for training deep models that track human development at an unprecedented geographical and temporal resolution.
  •  
27.
  • Issa Mattos, David, 1990, et al. (author)
  • Statistical Models for the Analysis of Optimization Algorithms with Benchmark Functions
  • 2021
  • In: IEEE Transactions on Evolutionary Computation. - : IEEE. - 1089-778X .- 1941-0026. ; 25:6, s. 1163-1177
  • Journal article (peer-reviewed)abstract
    • Frequentist statistical methods, such as hypothesis testing, are standard practices in studies that provide benchmark comparisons. Unfortunately, these methods have often been misused, e.g., without testing for their statistical test assumptions or without controlling for familywise errors in multiple group comparisons, among several other problems. Bayesian data analysis (BDA) addresses many of the previously mentioned shortcomings but its use is not widely spread in the analysis of empirical data in the evolutionary computing community. This article provides three main contributions. First, we motivate the need for utilizing BDA and provide an overview of this topic. Second, we discuss the practical aspects of BDA to ensure that our models are valid and the results are transparent. Finally, we provide five statistical models that can be used to answer multiple research questions. The online Appendix provides a step-by-step guide on how to perform the analysis of the models discussed in this article, including the code for the statistical models, the data transformations, and the discussed tables and figures. 
  •  
28.
  • Henriksson, Jens, 1991, et al. (author)
  • Performance analysis of out-of-distribution detection on trained neural networks
  • 2020
  • In: Information and Software Technology. - : Elsevier B.V.. - 0950-5849 .- 1873-6025.
  • Journal article (peer-reviewed)abstract
    • Context: Deep Neural Networks (DNN) have shown great promise in various domains, for example to support pattern recognition in medical imagery. However, DNNs need to be tested for robustness before being deployed in safety critical applications. One common challenge occurs when the model is exposed to data samples outside of the training data domain, which can yield outputs with high confidence despite no prior knowledge of the given input. Objective: The aim of this paper is to investigate how the performance of detecting out-of-distribution (OOD) samples changes for outlier detection methods (e.g., supervisors) when DNNs become better on training samples. Method: Supervisors are components aiming at detecting out-of-distribution samples for a DNN. The experimental setup in this work compares the performance of supervisors using metrics and datasets that reflect the most common setups in related works. Four different DNNs with three different supervisors are compared during different stages of training, to detect at what point during training the performance of the supervisors begins to deteriorate. Results: We found that the outlier detection performance of the supervisors increased as the accuracy of the underlying DNN improved. However, all supervisors showed a large variation in performance, even for variations of network parameters that marginally changed the model accuracy. The results showed that understanding the relationship between training results and supervisor performance is crucial to improve a model's robustness. Conclusion: Analyzing DNNs for robustness is a challenging task. Results showed that variations in model parameters that have small effects on model predictions can have a large impact on the out-of-distribution detection performance. This kind of behavior needs to be addressed when DNNs are part of a safety critical application and hence, the necessary safety argumentation for such systems needs to be structured accordingly.
  •  
29.
  • Lindén, Joakim, et al. (author)
  • Evaluating the Robustness of ML Models to Out-of-Distribution Data Through Similarity Analysis
  • 2023
  • In: Commun. Comput. Info. Sci.. - : Springer Science and Business Media Deutschland GmbH. - 9783031429408 ; , s. 348-359
  • Conference paper (peer-reviewed)abstract
    • In Machine Learning systems, several factors impact the performance of a trained model. The most important ones include model architecture, the amount of training time, the dataset size and diversity. We present a method for analyzing datasets from a use-case scenario perspective, detecting and quantifying out-of-distribution (OOD) data on dataset level. Our main contribution is the novel use of similarity metrics for the evaluation of the robustness of a model by introducing relative Fréchet Inception Distance (FID) and relative Kernel Inception Distance (KID) measures. These relative measures are relative to a baseline in-distribution dataset and are used to estimate how the model will perform on OOD data (i.e. estimate the model accuracy drop). We find a correlation between our proposed relative FID/relative KID measure and the drop in Average Precision (AP) accuracy on unseen data.
  •  
30.
  • Mashad Nemati, Hassan, 1982-, et al. (author)
  • Bayesian Network Representation of Meaningful Patterns in Electricity Distribution Grids
  • 2016
  • In: 2016 IEEE International Energy Conference (ENERGYCON). - : IEEE. - 9781467384636
  • Conference paper (peer-reviewed)abstract
    • The diversity of components in electricity distribution grids makes it impossible, or at least very expensive, to deploy monitoring and fault diagnostics to every individual element. Therefore, power distribution companies are looking for cheap and reliable approaches that can help them to estimate the condition of their assets and to predict when and where faults may occur. In this paper we propose a simplified representation of failure patterns within a historical faults database, which facilitates visualization of association rules using Bayesian Networks. Our approach is based on exploring the failure history and detecting correlations between different features available in those records. We show that a small subset of the most interesting rules is enough to obtain a good and sufficiently accurate approximation of the original dataset. A Bayesian Network created from those rules can serve as an easy-to-understand visualization of the most relevant failure patterns. In addition, by varying the threshold values of support and confidence that we consider interesting, we are able to control the tradeoff between accuracy of the model and its complexity in an intuitive way. © 2016 IEEE
  •  
31.
  • Jayasiri, Subashini C., et al. (author)
  • The Faces of Fungi database: fungal names linked with morphology, phylogeny and human impacts
  • 2015
  • In: Fungal diversity. - : Springer Science and Business Media LLC. - 1560-2745 .- 1878-9129. ; 74:1, s. 3-18
  • Journal article (peer-reviewed)abstract
    • Taxonomic names are key links between various databases that store information on different organisms. Several global fungal nomenclatural and taxonomic databases (notably Index Fungorum, Species Fungorum and MycoBank) can be sourced to find taxonomic details about fungi, while DNA sequence data can be sourced from NCBI, EBI and UNITE databases. Although the sequence data may be linked to a name, the quality of the metadata is variable and generally there is no corresponding link to images, descriptions or herbarium material. There is generally no way to establish the accuracy of the names in these genomic databases, other than whether the submission is from a reputable source. To tackle this problem, a new database (FacesofFungi), accessible at www.facesoffungi.org (FoF), has been established. This fungal database allows deposition of taxonomic data, phenotypic details and other useful data, which will enhance our current taxonomic understanding and ultimately enable mycologists to gain better and updated insights into the current fungal classification system. In addition, the database will also allow access to comprehensive metadata including descriptions of voucher and type specimens. This database is user-friendly, providing links and easy access between taxonomic ranks, with the classification system based primarily on molecular data (from the literature and via updated web-based phylogenetic trees), and to a lesser extent on morphological data when molecular data are unavailable. In FoF, species are not only linked to the closest phylogenetic representatives, but relevant data are also provided, wherever available, on various applied aspects, such as ecological, industrial, quarantine and chemical uses. The data include the three main fungal groups (Ascomycota, Basidiomycota, Basal fungi) and fungus-like organisms. The FoF webpage is an output funded by the Mushroom Research Foundation, which is an NGO with seven directors with mycological expertise.
The webpage has 76 curators, and with the help of these specialists, FoF will provide an updated natural classification of the fungi, with illustrated accounts of species linked to molecular data. The present paper introduces the FoF database to the scientific community and briefly reviews some of the problems associated with classification and identification of the main fungal groups. The structure and use of the database is then explained. We would like to invite all mycologists to contribute to these web pages.
  •  
32.
  • Lidstrom, D, et al. (author)
  • Agent based match racing simulations : Starting practice
  • 2022
  • In: SNAME 24th Chesapeake Sailing Yacht Symposium, CSYS 2022. - : Society of Naval Architects and Marine Engineers.
  • Conference paper (peer-reviewed)abstract
    • Match racing starts in sailing are strategically complex and of great importance for the outcome of a race. With the return of the America's Cup to upwind starts and the World Match Racing Tour attracting young and development sailors, the tactical skills necessary to master the starts could be trained and learned by means of computer simulations to assess a large range of approaches to the starting box. This project used game theory to model the start of a match race, intending to develop and study strategies using Monte-Carlo tree search to estimate the utility of a player's potential moves throughout a race. Strategies that utilised the estimated utility in different ways were defined and tested against each other through simulation and with expert advice on match racing start strategy from a sailor's perspective. The results show that the strategies that put greater emphasis on what the opponent might do performed better than those that did not. It is concluded that Monte-Carlo tree search can provide a basis for decision making in match races and that it has potential for further use.
  •  
33.
  • Strannegård, Claes, 1962, et al. (author)
  • Ecosystem Models Based on Artificial Intelligence
  • 2022
  • In: 34th Workshop of the Swedish Artificial Intelligence Society, SAIS 2022. - : IEEE.
  • Conference paper (peer-reviewed)abstract
    • Ecosystem models can be used for understanding general phenomena of evolution, ecology, and ethology. They can also be used for analyzing and predicting the ecological consequences of human activities on specific ecosystems, e.g., the effects of agriculture, forestry, construction, hunting, and fishing. We argue that powerful ecosystem models need to include reasonable models of the physical environment and of animal behavior. We also argue that several well-known ecosystem models are unsatisfactory in this regard. Then we present the open-source ecosystem simulator Ecotwin, which is built on top of the game engine Unity. To model a specific ecosystem in Ecotwin, we first generate a 3D Unity model of the physical environment, based on topographic or bathymetric data. Then we insert digital 3D models of the organisms of interest into the environment model. Each organism is equipped with a genome and capable of sexual or asexual reproduction. An organism dies if it runs out of some vital resource or reaches its maximum age. The animal models are equipped with behavioral models that include sensors, actions, reward signals, and mechanisms of learning and decision-making. Finally, we illustrate how Ecotwin works by building and running one terrestrial and one marine ecosystem model.
  •  
34.
  • Natalino Da Silva, Carlos, 1987, et al. (author)
  • Microservice-Based Unsupervised Anomaly Detection Loop for Optical Networks
  • 2016
  • In: Optics InfoBase Conference Papers. ; 2016
  • Conference paper (peer-reviewed)abstract
    • Unsupervised learning (UL) is a technique to detect previously unseen anomalies without needing labeled datasets. We propose the integration of a scalable UL-based inference component in the monitoring loop of an SDN-controlled optical network.
  •  
35.
  • Uhen, Mark D., et al. (author)
  • The EarthLife Consortium API: an extensible, open-source service for accessing fossil data and taxonomies from multiple community paleodata resources
  • 2021
  • In: Frontiers of Biogeography. - : International Biogeography Society. ; 13:2
  • Journal article (peer-reviewed)abstract
    • Paleobiologists and paleoecologists interested in studying biodiversity dynamics over broad spatial and temporal scales have built multiple community-curated data resources, each emphasizing a particular spatial domain, timescale, or taxonomic group(s). This multiplicity of data resources is understandable, given the enormous diversity of life across Earth's history, but creates a barrier to achieving a truly global understanding of the diversity and distribution of life across time. Here we present the Earth Life Consortium Application Programming Interface (ELC API), a lightweight data service designed to search and retrieve fossil occurrence and taxonomic information from across multiple paleobiological resources. Key endpoints include Occurrences (returns spatiotemporal locations of fossils for selected taxa), Locales (returns information about sites with fossil data), References (returns bibliographic information), and Taxonomy (returns names of subtaxa associated with selected taxa). Data objects are returned in JSON or CSV format. The ELC API supports tectonic-driven shifts in geographic position back to 580 Ma using services from Macrostrat and GPlates. The ELC API has been implemented first for the Paleobiology Database and Neotoma Paleoecology Database, with a test extension to the Strategic Environmental Archaeology Database. The ELC API is designed to be readily extensible to other paleobiological data resources, with all endpoints fully documented and following open-source standards (e.g., Swagger, OGC). The broader goal is to help build an interlinked and federated ecosystem of paleobiological and paleoenvironmental data resources, which together provide paleobiologists, macroecologists, biogeographers, and other interested scientists with full coverage of the diversity and distribution of life across time.
  •  
36.
  • A Slip of the Machinic Tongue : Performative Soundscape Installation
  • 2022
  • Artistic work (peer-reviewed)abstract
    • The performance was premiered at 'Digital Existence III: Living with Automation', an international conference at the Sigtuna Foundation in Sweden, organized by the BioMe project led by prof. Amanda Lagerkvist at Uppsala University and featuring Joanna Zylinska, Katherine Hayles, Nick Couldry, Sarah Pink and others. The performance is based on an essay written for the book 'The Computer as Seen at the End of the Human Age', edited by Olle Essvik (Rojal Förlag, Göteborg 2022). A dialog between several repurposed 'smart' speakers is accompanied by a spontaneously evolving soundscape generated by those devices' electromagnetic fields and the micro-currents that make them operational. The piece is a component of an artistic inquiry into other, imperceptible voices and sonic realms that underlie and sustain contemporary technologies of voice recognition, synthesis and biometric capture. The project is part of BioMe: Existential Challenges and Ethical Imperatives of Biometric AI in Everyday Lifeworlds, led by professor Amanda Lagerkvist at the Informatics and Media Department of Uppsala University.
  •  
37.
  • Jurcevic, Sanja, 1971-, et al. (author)
  • Bioinformatics analysis of miRNAs in the neuroblastoma 11q-deleted region reveals a role of miR-548l in both 11q-deleted and MYCN amplified tumour cells
  • 2022
  • In: Scientific Reports. - : Springer Nature. - 2045-2322. ; 12:1
  • Journal article (peer-reviewed)abstract
    • Neuroblastoma is a childhood tumour that is responsible for approximately 15% of all childhood cancer deaths. Neuroblastoma tumours with amplification of the oncogene MYCN are aggressive, however, another aggressive subgroup without MYCN amplification also exists; rather, they have a deleted region at chromosome arm 11q. Twenty-six miRNAs are located within the breakpoint region of chromosome 11q and have been checked for a possible involvement in development of neuroblastoma due to the genomic alteration. Target genes of these miRNAs are involved in pathways associated with cancer, including proliferation, apoptosis and DNA repair. We could show that miR-548l found within the 11q region is downregulated in neuroblastoma cell lines with 11q deletion or MYCN amplification. In addition, we showed that the restoration of miR-548l level in a neuroblastoma cell line led to a decreased proliferation of these cells as well as a decrease in the percentage of cells in the S phase. We also found that miR-548l overexpression suppressed cell viability and promoted apoptosis, while miR-548l knockdown promoted cell viability and inhibited apoptosis in neuroblastoma cells. Our results indicate that 11q-deleted neuroblastoma and MYCN amplified neuroblastoma coalesce by downregulating miR-548l.
  •  
38.
  • Zhang, Jin, et al. (author)
  • Deep Learning-Based Conformal Prediction of Toxicity
  • 2021
  • In: Journal of Chemical Information and Modeling. - : American Chemical Society (ACS). - 1549-9596 .- 1549-960X. ; 61:6, s. 2648-2657
  • Journal article (peer-reviewed)abstract
    • Predictive modeling for toxicity can help reduce risks in a range of applications and potentially serve as the basis for regulatory decisions. However, the utility of these predictions can be limited if the associated uncertainty is not adequately quantified. With recent studies showing great promise for deep learning-based models also for toxicity predictions, we investigate the combination of deep learning-based predictors with the conformal prediction framework to generate highly predictive models with well-defined uncertainties. We use a range of deep feedforward neural networks and graph neural networks in a conformal prediction setting and evaluate their performance on data from the Tox21 challenge. We also compare the results from the conformal predictors to those of the underlying machine learning models. The results indicate that highly predictive models can be obtained that result in very efficient conformal predictors even at high confidence levels. Taken together, our results highlight the utility of conformal predictors as a convenient way to deliver toxicity predictions with confidence, adding both statistical guarantees on the model performance as well as better predictions of the minority class compared to the underlying models.
  •  
39.
  • Kuhn, Thomas, et al. (author)
  • CDK-Taverna : an open workflow environment for cheminformatics
  • 2010
  • In: BMC Bioinformatics. - : Springer Science and Business Media LLC. - 1471-2105. ; 11, s. 159-
  • Journal article (peer-reviewed)abstract
    • Background: Small molecules are of increasing interest for bioinformatics in areas such as metabolomics and drug discovery. The recent release of large open access chemistry databases generates a demand for flexible tools to process them and discover new knowledge. To freely support open science based on these data resources, it is desirable for the processing tools to be open-source and available for everyone. Results: Here we describe a novel combination of the workflow engine Taverna and the cheminformatics library Chemistry Development Kit (CDK), resulting in an open source workflow solution for cheminformatics. We have implemented more than 160 different workers to handle specific cheminformatics tasks. We describe the applications of CDK-Taverna in various usage scenarios. Conclusions: The combination of the workflow engine Taverna and the Chemistry Development Kit provides the first open source cheminformatics workflow solution for the biosciences. With the Taverna community working towards a more powerful workflow engine and a more user-friendly user interface, CDK-Taverna has the potential to become a free alternative to existing proprietary workflow tools.
  •  
40.
  • Wiqvist, Samuel, et al. (author)
  • Partially Exchangeable Networks and architectures for learning summary statistics in Approximate Bayesian Computation
  • 2019
  • In: Proceedings of the 36th International Conference on Machine Learning. - : PMLR. ; 2019-June, s. 11795-11804
  • Conference paper (peer-reviewed)abstract
    • We present a novel family of deep neural architectures, named partially exchangeable networks (PENs) that leverage probabilistic symmetries. By design, PENs are invariant to block-switch transformations, which characterize the partial exchangeability properties of conditionally Markovian processes. Moreover, we show that any block-switch invariant function has a PEN-like representation. The DeepSets architecture is a special case of PEN and we can therefore also target fully exchangeable data. We employ PENs to learn summary statistics in approximate Bayesian computation (ABC). When comparing PENs to previous deep learning methods for learning summary statistics, our results are highly competitive, both considering time series and static models. Indeed, PENs provide more reliable posterior samples even when using less training data.
  •  
41.
  • Checinska, Aleksandra, et al. (author)
  • Microbiomes of the dust particles collected from the International Space Station and Spacecraft Assembly Facilities
  • 2015
  • In: Microbiome. - : Springer Science and Business Media LLC. - 2049-2618. ; 3
  • Journal article (peer-reviewed)abstract
    • Background - The International Space Station (ISS) is a unique built environment due to the effects of microgravity, space radiation, elevated carbon dioxide levels, and especially continuous human habitation. Understanding the composition of the ISS microbial community will facilitate further development of safety and maintenance practices. The primary goal of this study was to characterize the viable microbiome of the ISS-built environment. A second objective was to determine if the built environments of Earth-based cleanrooms associated with space exploration are an appropriate model of the ISS environment. Results - Samples collected from the ISS and two cleanrooms at the Jet Propulsion Laboratory (JPL, Pasadena, CA) were analyzed by traditional cultivation, adenosine triphosphate (ATP), and propidium monoazide–quantitative polymerase chain reaction (PMA-qPCR) assays to estimate viable microbial populations. The 16S rRNA gene Illumina iTag sequencing was used to elucidate microbial diversity and explore differences between ISS and cleanroom microbiomes. Statistical analyses showed that members of the phyla Actinobacteria, Firmicutes, and Proteobacteria were dominant in the samples examined but varied in abundance. Actinobacteria were predominant in the ISS samples whereas Proteobacteria, least abundant in the ISS, dominated in the cleanroom samples. The viable bacterial populations seen by PMA treatment were greatly decreased. However, the treatment did not appear to have an effect on the bacterial composition (diversity) associated with each sampling site. Conclusions - The results of this study provide strong evidence that specific human skin-associated microorganisms make a substantial contribution to the ISS microbiome, which is not the case in Earth-based cleanrooms. 
For example, Corynebacterium and Propionibacterium (Actinobacteria) but not Staphylococcus (Firmicutes) species are dominant on the ISS in terms of viable and total bacterial community composition. The results obtained will facilitate future studies to determine how stable the ISS environment is over time. The present results also demonstrate the value of measuring viable cell diversity and population size at any sampling site. This information can be used to identify sites that can be targeted for more stringent cleaning. Finally, the results will allow comparisons with other built sites and facilitate future improvements on the ISS that will ensure astronaut health.
  •  
42.
  • Stathis, Dimitrios, et al. (author)
  • eBrainII : a 3 kW Realtime Custom 3D DRAM Integrated ASIC Implementation of a Biologically Plausible Model of a Human Scale Cortex
  • 2020
  • In: Journal of Signal Processing Systems. - : Springer. - 1939-8018 .- 1939-8115. ; 92:11, s. 1323-1343
  • Journal article (peer-reviewed)abstract
    • The Artificial Neural Networks (ANNs), like CNN/DNN and LSTM, are not biologically plausible. Despite their initial success, they cannot attain the cognitive capabilities enabled by the dynamic hierarchical associative memory systems of biological brains. The biologically plausible spiking brain models, e.g., cortex, basal ganglia, and amygdala, have a greater potential to achieve biological brain-like cognitive capabilities. Bayesian Confidence Propagation Neural Network (BCPNN) is a biologically plausible spiking model of the cortex. A human-scale model of BCPNN in real-time requires 162 TFlop/s and 50 TB of synaptic weight storage, accessed with a bandwidth of 200 TB/s. The spiking bandwidth is relatively modest at 250 GB/s. A hand-optimized implementation of rodent-scale BCPNN on Tesla K80 GPUs requires 3 kW; we extrapolate that a human-scale network would require 3 MW. These power numbers rule out such implementations for field deployment as cognition engines in embedded systems. The key innovation that this paper reports is that it is feasible and affordable to implement real-time BCPNN as a custom tiled application-specific integrated circuit (ASIC) in 28 nm technology with custom 3D DRAM - eBrainII - that consumes 3 kW at human scale and 12 W at rodent scale. Such implementations eminently fulfill the demands for field deployment.
  •  
43.
  • Ludowieg, Andres Regal, et al. (author)
  • Using Machine Learning to Predict Freight Vehicles' Demand for Loading Zones in Urban Environments
  • 2023
  • In: Transportation Research Record. - : SAGE Publications. - 0361-1981 .- 2169-4052. ; 2677:1, s. 829-842
  • Journal article (peer-reviewed)abstract
    • This paper studies demand for public loading zones in urban environments and seeks to develop a machine learning algorithm to predict their demand. Understanding and predicting demand for public loading zones can: (i) support better management of the loading zones and (ii) provide better pre-advice so that transport operators can plan their routes in an optimal way. The methods used are linear regression analysis and neural networks. Six months of parking data from the city of Vic in Spain are used to calibrate and test the models, where the parking data is transformed into a time-series format with forecasting targets. For each loading zone, a different model is calibrated to test which model has the best performance for the loading zone's particular demand pattern. To evaluate each model's performance, both root mean square error and mean absolute error are computed. The results show that, for different loading zone demand patterns, different models are better suited. As the prediction horizon increases, predicting further into the future, the neural network approaches start to give better predictions than linear models.
  •  
44.
  • Johansson, Simon, 1994, et al. (author)
  • Using Active Learning to Develop Machine Learning Models for Reaction Yield Prediction
  • 2022
  • In: Molecular Informatics. - : Wiley. - 1868-1743 .- 1868-1751. ; 41:12
  • Journal article (peer-reviewed)abstract
    • Computer aided synthesis planning, suggesting synthetic routes for molecules of interest, is a rapidly growing field. The machine learning methods used are often dependent on access to large datasets for training, but finite experimental budgets limit how much data can be obtained from experiments. This suggests the use of schemes for data collection such as active learning, which identifies the data points of highest impact for model accuracy, and which has been used in recent studies with success. However, little has been done to explore the robustness of the methods predicting reaction yield when used together with active learning to reduce the amount of experimental data needed for training. This study aims to investigate the influence of machine learning algorithms and the number of initial data points on reaction yield prediction for two public high-throughput experimentation datasets. Our results show that active learning based on output margin reached a pre-defined AUROC faster than random sampling on both datasets. Analysis of feature importance of the trained machine learning models suggests active learning had a larger influence on the model accuracy when only a few features were important for the model prediction.
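The abstract's "active learning based on output margin" refers to a standard query strategy; the paper does not spell out its implementation here, so the following is a generic margin-sampling sketch with illustrative probability values (`output_margin`, `select_batch`, and the pool are assumptions for illustration):

```python
def output_margin(probs):
    """Difference between the two highest predicted class probabilities.
    A small margin means the model is nearly undecided between classes."""
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]

def select_batch(unlabeled_probs, batch_size):
    """Return indices of the batch_size pool points with the smallest margin,
    i.e. the candidates whose labels should most improve the model."""
    ranked = sorted(range(len(unlabeled_probs)),
                    key=lambda i: output_margin(unlabeled_probs[i]))
    return ranked[:batch_size]

# Hypothetical predicted probabilities over three yield classes
pool = [
    [0.90, 0.05, 0.05],  # confident prediction -> large margin
    [0.45, 0.40, 0.15],  # nearly undecided -> small margin
    [0.50, 0.30, 0.20],
]
print(select_batch(pool, 2))  # → [1, 2]
```

Compared with random sampling, this rule concentrates the experimental budget on reactions the current model finds ambiguous, which is the mechanism behind the faster AUROC convergence the study reports.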
  •  
45.
  • 2019
  • Journal article (peer-reviewed)
  •  
46.
  • Tedersoo, Leho, et al. (author)
  • Standardizing metadata and taxonomic identification in metabarcoding studies
  • 2015
  • In: GigaScience. - : Oxford University Press (OUP). - 2047-217X .- 2047-217X. ; 4
  • Journal article (peer-reviewed)abstract
    • High-throughput sequencing-based metabarcoding studies produce vast amounts of ecological data, but a lack of consensus on standardization of metadata and how to refer to the species recovered severely hampers reanalysis and comparisons among studies. Here we propose an automated workflow covering data submission, compression, storage and public access to allow easy data retrieval and inter-study communication. Such standardized and readily accessible datasets facilitate data management, taxonomic comparisons and compilation of global metastudies.
  •  
47.
  • Kerren, Andreas, 1971-, et al. (author)
  • Network Visualization for Integrative Bioinformatics
  • 2014
  • In: Approaches in Integrative Bioinformatics. - Berlin Heidelberg : Springer. - 9783642412806 - 9783642412813 ; , s. 173-202
  • Book chapter (peer-reviewed)abstract
    • Approaches to investigating biological processes have attracted strong interest in the past few years and are the focus of several research areas, such as systems biology. Biological networks, as representations of such processes, are crucial for an extensive understanding of living beings. Because of these networks' size and complexity, their growth and continuous change, and their on-demand compilation from databases, researchers frequently require novel network visualization, interaction and exploration techniques. In this chapter, we first provide background information that is needed for the interactive visual analysis of various biological networks. Fields such as (information) visualization, visual analytics and automatic layout of networks are highlighted and illustrated by a number of examples. Then, the state of the art in network visualization for the life sciences is presented together with a discussion of standards for the graphical representation of cellular networks and biological processes.
  •  
48.
  • Kerren, Andreas, 1971-, et al. (author)
  • Why Integrate InfoVis and SciVis? : An Example from Systems Biology
  • 2014
  • In: IEEE Computer Graphics and Applications. - : IEEE. - 0272-1716 .- 1558-1756. ; 34:6, s. 69-73
  • Journal article (other academic/artistic)abstract
    • The more-or-less artificial barrier between information visualization and scientific visualization hinders knowledge discovery. Having an integrated view of many aspects of the target data, including a seamlessly interwoven visual display of structural abstract data and 3D spatial information, could lead to new discoveries, insights, and scientific questions. Such a view also could reduce the user’s cognitive load—that is, reduce the effort the user expends when comparing views.
  •  
49.
  • Lampa, Samuel, 1983- (author)
  • Reproducible Data Analysis in Drug Discovery with Scientific Workflows and the Semantic Web
  • 2018
  • Doctoral thesis (other academic/artistic)abstract
    • The pharmaceutical industry is facing a research and development productivity crisis. At the same time, we have access to more biological data than ever, thanks to recent advancements in high-throughput experimental methods. One suggested explanation for this apparent paradox is that a crisis in reproducibility has also affected the reliability of the datasets providing the basis for drug development. Advanced computing infrastructures can to some extent aid in this situation, but they also come with their own challenges, including increased technical debt and opaqueness from the many layers of technology required to perform computations and manage data. In this thesis, a number of approaches and methods for dealing with data and computations in early drug discovery in a reproducible way are developed. This has been done while striving for a high level of simplicity in their implementations, to improve the understandability of the research done using them. Based on identified problems with existing tools, two workflow tools have been developed with the aim of making the writing of complex workflows, particularly in predictive modelling, more agile and flexible. One of the tools is based on the Luigi workflow framework, while the other is written from scratch in the Go language. We have applied these tools to predictive modelling problems in early drug discovery to create reproducible workflows for building predictive models, including for prediction of off-target binding in drug discovery. We have also developed a set of practical tools for working with linked data in a collaborative way, and for publishing large-scale datasets in a semantic, machine-readable format on the web. These tools were applied to demonstrator use cases and used for publishing large-scale chemical data. It is our hope that the developed tools and approaches will contribute towards practical, reproducible and understandable handling of data and computations in early drug discovery.
  •  
50.
  • Yu, Tao, 1986, et al. (author)
  • Big data in yeast systems biology
  • 2019
  • In: FEMS Yeast Research. - : Oxford University Press (OUP). - 1567-1356 .- 1567-1364. ; 19:7
  • Research review (peer-reviewed)abstract
    • Systems biology uses computational and mathematical modeling to study complex interactions in a biological system. The yeast Saccharomyces cerevisiae, which has served as both an important model organism and cell factory, has pioneered both the early development of such models and modeling concepts, and the more recent integration of multi-omics big data in these models to elucidate fundamental principles of biology. Here, we review the advancement of big data technologies to gain biological insight in three aspects of yeast systems biology: gene expression dynamics, cellular metabolism and the regulation network between gene expression and metabolism. The role of big data and complementary modeling approaches, including the expansion of genome-scale metabolic models and machine learning methodologies, are discussed as key drivers in the rapid advancement of yeast systems biology.
  •  
Type of publication
journal article (1324)
conference paper (281)
doctoral thesis (166)
other publication (100)
research review (68)
book chapter (66)
reports (29)
licentiate thesis (24)
review (6)
editorial collection (4)
book (2)
editorial proceedings (2)
artistic work (1)
patent (1)
Type of content
peer-reviewed (1631)
other academic/artistic (430)
pop. science, debate, etc. (8)
Author/Editor
Nielsen, Jens B, 196 ... (55)
Schliep, Alexander, ... (52)
Lansner, Anders (42)
Bongcam Rudloff, Eri ... (40)
Kristiansson, Erik, ... (33)
Nilsson, R. Henrik, ... (33)
Fransén, Erik, 1962- (29)
Käll, Lukas, 1969- (28)
Lagergren, Jens (25)
Kerkhoven, Eduard, 1 ... (24)
Elofsson, Arne (22)
Spjuth, Ola, 1977- (21)
Komorowski, Jan (21)
Lindeberg, Tony, 196 ... (20)
Bengtsson-Palme, Joh ... (19)
Hellgren Kotaleski, ... (18)
Abarenkov, Kessy (17)
Norinder, Ulf, 1956- (17)
Sonnhammer, Erik L L (16)
Ekeberg, Örjan (16)
Arvestad, Lars (16)
Brueffer, Christian (15)
Jirstrand, Mats, 196 ... (15)
Käll, Lukas (15)
Nelander, Sven (14)
Wallner, Björn (14)
Wählby, Carolina, pr ... (14)
Saenz Mendez, Patric ... (14)
Johnson, David (14)
Tedersoo, Leho (13)
King, Ross, 1962 (13)
Uhlén, Mathias (13)
Hellander, Andreas (13)
Larsson, D. G. Joaki ... (13)
Spjuth, Ola (13)
Sahlin, Kristoffer (12)
Costa, Ivan G (12)
Mardinoglu, Adil, 19 ... (11)
Zelezniak, Aleksej, ... (11)
Spjuth, Ola, Profess ... (11)
Holmgren, Sverker (11)
Chen, Yu, 1990 (11)
Gustafsson, Johan, 1 ... (11)
Kõljalg, Urmas (10)
Orešič, Matej, 1967- (10)
Cvijovic, Marija, 19 ... (10)
Brunius, Carl, 1974 (10)
Lansner, Anders, Pro ... (10)
Li, Gang, 1991 (10)
Lindeberg, Tony, Pro ... (10)
University
Chalmers University of Technology (543)
Royal Institute of Technology (441)
Uppsala University (420)
University of Gothenburg (298)
Stockholm University (219)
Karolinska Institutet (163)
Lund University (159)
Swedish University of Agricultural Sciences (136)
Linköping University (129)
Umeå University (77)
Örebro University (51)
University of Skövde (36)
Mälardalen University (16)
RISE (16)
Karlstad University (16)
Halmstad University (15)
Linnaeus University (15)
Jönköping University (9)
Blekinge Institute of Technology (9)
Högskolan Dalarna (8)
Swedish Museum of Natural History (8)
University of Borås (7)
Malmö University (5)
Mid Sweden University (3)
Luleå University of Technology (2)
University West (2)
University of Gävle (1)
Stockholm School of Economics (1)
Language
English (2068)
Swedish (5)
French (1)
Research subject (UKÄ/SCB)
Natural sciences (2074)
Medical and Health Sciences (444)
Engineering and Technology (271)
Agricultural Sciences (62)
Social Sciences (42)
Humanities (24)