SwePub - search: hsv:(NATURVETENSKAP) hsv:(Data...

No.	Reference	Cover image	Find
1.	Buckland, Philip I., 1973-, et al. (author) SEAD - The Strategic Environmental Archaeology Database : Progress Report Spring 2014 2014 Report (other academic/artistic) Abstract: This report provides an overview of the progress and results of the VR:KFI infrastructure projects 2007-7494 and (825-)2010-5976. It should be considered a status report in an ongoing, long-term research infrastructure development project.
2.	Liu, Yuanhua, 1971, et al. (author) Considering the importance of user profiles in interface design 2009 In: User Interfaces. ; , pp. 23- Book chapter (other academic/artistic) Abstract: User profile is a popular term widely employed during product design processes by industrial companies. Such a profile is normally intended to represent real users of a product. The ultimate purpose of a user profile is to help designers recognize or learn about the real user by presenting them with a description of a real user’s attributes, for instance the user’s gender, age, educational level, attitude, technical needs and skill level. The aim of this chapter is to provide information on the current knowledge and research about user profile issues, as well as to emphasize the importance of considering these issues in interface design. In this chapter, we mainly focus on how users’ differences in expertise affect their performance or activity in various interaction contexts. Considering the complex interaction situations in practice, novice and expert users’ interactions with medical user interfaces of different technical complexity will be analyzed as examples: one focuses on novice and expert users’ differences when interacting with simple medical interfaces, and the other focuses on differences when interacting with complex medical interfaces. Four issues will be analyzed and discussed: (1) how novice and expert users differ in terms of performance during the interaction; (2) how novice and expert users differ in terms of cognitive mental models during the interaction; (3) how novice and expert users should be defined in practice; and (4) what the main differences between novice and expert users imply for interface design.
Besides describing the effect of users’ differences in expertise during the interface design process, we will also pinpoint some potential problems for research on interface design, as well as some future challenges that academic researchers and industrial engineers should face in practice.
3.	Al Sabbagh, Khaled, 1987, et al. (author) Improving Data Quality for Regression Test Selection by Reducing Annotation Noise 2020 In: Proceedings - 46th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2020. ; , pp. 191-194 Conference paper (peer-reviewed) Abstract: Big data and machine learning models have been increasingly used to support software engineering processes and practices. One example is the use of machine learning models to improve test case selection in continuous integration. However, one of the challenges in building such models is the identification and reduction of the noise that often comes with large datasets. In this paper, we present a noise reduction approach that deals with the problem of contradictory training entries. We empirically evaluate the effectiveness of the approach in the context of selective regression testing. For this purpose, we use a curated training set as input to a tree-based machine learning ensemble and compare the classification precision, recall, and f-score against a non-curated set. Our study shows that using the noise reduction approach on the training instances gives better results in prediction with an improvement of 37% on precision, 70% on recall, and 59% on f-score.
4.	Fu, Keren, et al. (author) Deepside: A general deep framework for salient object detection 2019 In: Neurocomputing. - : Elsevier BV. - 0925-2312 .- 1872-8286. ; 356, pp. 69-82 Journal article (peer-reviewed) Abstract: Deep learning-based salient object detection techniques have shown impressive results compared to conventional saliency detection by handcrafted features. Integrating hierarchical features of Convolutional Neural Networks (CNN) to achieve fine-grained saliency detection is a current trend, and various deep architectures have been proposed by researchers, including “skip-layer” architecture, “top-down” architecture, “short-connection” architecture and so on. While these architectures have achieved progressive improvement in detection accuracy, the underlying distinctions and connections between these schemes remain unclear. In this paper, we review and draw underlying connections between these architectures, and show that they can in fact be unified into a general framework, which simply has side structures with different depths. Based on the idea of designing deeper side structures for better detection accuracy, we propose a unified framework called Deepside that can be deeply supervised to incorporate hierarchical CNN features. Additionally, to fuse multiple side outputs from the network, we propose a novel fusion technique based on segmentation-based pooling, which serves as a built-in component in the CNN architecture and guarantees more accurate boundary details of detected salient objects. The effectiveness of the proposed Deepside scheme against state-of-the-art models is validated on 8 benchmark datasets.
5.	Gerken, Jan, 1991, et al. (author) Equivariance versus augmentation for spherical images 2022 In: Proceedings of Machine Learning Research. ; , pp. 7404-7421 Conference paper (peer-reviewed) Abstract: We analyze the role of rotational equivariance in convolutional neural networks (CNNs) applied to spherical images. We compare the performance of the group equivariant networks known as S2CNNs and standard non-equivariant CNNs trained with an increasing amount of data augmentation. The chosen architectures can be considered baseline references for the respective design paradigms. Our models are trained and evaluated on single or multiple items from the MNIST or FashionMNIST dataset projected onto the sphere. For the task of image classification, which is inherently rotationally invariant, we find that by considerably increasing the amount of data augmentation and the size of the networks, it is possible for the standard CNNs to reach at least the same performance as the equivariant network. In contrast, for the inherently equivariant task of semantic segmentation, the non-equivariant networks are consistently outperformed by the equivariant networks with significantly fewer parameters. We also analyze and compare the inference latency and training times of the different networks, enabling detailed tradeoff considerations between equivariant architectures and data augmentation for practical problems.
6.	Isaksson, Martin, et al. (author) Adaptive Expert Models for Federated Learning 2023 In: Lecture Notes in Computer Science, Volume 13448, Pages 1-16, 2023. - Cham : Springer Science and Business Media Deutschland GmbH. - 9783031289958 ; 13448 LNAI, pp. 1-16 Conference paper (peer-reviewed) Abstract: Federated Learning (FL) is a promising framework for distributed learning when data is private and sensitive. However, the state-of-the-art solutions in this framework are not optimal when data is heterogeneous and non-IID. We propose a practical and robust approach to personalization in FL that adjusts to heterogeneous and non-IID data by balancing exploration and exploitation of several global models. To achieve our aim of personalization, we use a Mixture of Experts (MoE) that learns to group clients that are similar to each other, while using the global models more efficiently. We show that our approach achieves an accuracy up to 29.78% better than the state-of-the-art and up to 4.38% better compared to a local model in a pathological non-IID setting, even though we tune our approach in the IID setting. © 2023, The Author(s)
7.	Lindén, Joakim, et al. (author) Evaluating the Robustness of ML Models to Out-of-Distribution Data Through Similarity Analysis 2023 In: Commun. Comput. Info. Sci.. - : Springer Science and Business Media Deutschland GmbH. - 9783031429408 ; , pp. 348-359 Conference paper (peer-reviewed) Abstract: In Machine Learning systems, several factors impact the performance of a trained model. The most important ones include model architecture, the amount of training time, and the dataset size and diversity. We present a method for analyzing datasets from a use-case scenario perspective, detecting and quantifying out-of-distribution (OOD) data at the dataset level. Our main contribution is the novel use of similarity metrics for the evaluation of the robustness of a model by introducing relative Fréchet Inception Distance (FID) and relative Kernel Inception Distance (KID) measures. These measures are relative to a baseline in-distribution dataset and are used to estimate how the model will perform on OOD data (i.e. estimate the model accuracy drop). We find a correlation between our proposed relative FID/relative KID measure and the drop in Average Precision (AP) accuracy on unseen data.
8.	Lidstrom, D, et al. (author) Agent based match racing simulations : Starting practice 2022 In: SNAME 24th Chesapeake Sailing Yacht Symposium, CSYS 2022. - : Society of Naval Architects and Marine Engineers. Conference paper (peer-reviewed) Abstract: Match racing starts in sailing are strategically complex and of great importance for the outcome of a race. With the return of the America's Cup to upwind starts and the World Match Racing Tour attracting young and development sailors, the tactical skills necessary to master the starts could be trained and learned by means of computer simulations to assess a large range of approaches to the starting box. This project used game theory to model the start of a match race, intending to develop and study strategies using Monte-Carlo tree search to estimate the utility of a player's potential moves throughout a race. Strategies that used the estimated utility in different ways were defined and tested against each other by means of simulation and with expert advice on match racing start strategy from a sailor's perspective. The results show that the strategies that put greater emphasis on what the opponent might do perform better than those that do not. It is concluded that Monte-Carlo tree search can provide a basis for decision making in match races and that it has potential for further use.
9.	Strannegård, Claes, 1962, et al. (author) Ecosystem Models Based on Artificial Intelligence 2022 In: 34th Workshop of the Swedish Artificial Intelligence Society, SAIS 2022. - : IEEE. Conference paper (peer-reviewed) Abstract: Ecosystem models can be used for understanding general phenomena of evolution, ecology, and ethology. They can also be used for analyzing and predicting the ecological consequences of human activities on specific ecosystems, e.g., the effects of agriculture, forestry, construction, hunting, and fishing. We argue that powerful ecosystem models need to include reasonable models of the physical environment and of animal behavior. We also argue that several well-known ecosystem models are unsatisfactory in this regard. Then we present the open-source ecosystem simulator Ecotwin, which is built on top of the game engine Unity. To model a specific ecosystem in Ecotwin, we first generate a 3D Unity model of the physical environment, based on topographic or bathymetric data. Then we insert digital 3D models of the organisms of interest into the environment model. Each organism is equipped with a genome and capable of sexual or asexual reproduction. An organism dies if it runs out of some vital resource or reaches its maximum age. The animal models are equipped with behavioral models that include sensors, actions, reward signals, and mechanisms of learning and decision-making. Finally, we illustrate how Ecotwin works by building and running one terrestrial and one marine ecosystem model.
10.	Boulund, Fredrik, et al. (author) Computational and Statistical Considerations in the Analysis of Metagenomic Data 2018 In: Metagenomics: Perspectives, Methods, and Applications. - 9780081022689 ; , pp. 81-102 Book chapter (other academic/artistic) Abstract: In shotgun metagenomics, microbial communities are studied by random DNA fragments sequenced directly from environmental and clinical samples. The resulting data is massive, potentially consisting of billions of sequence reads describing millions of microbial genes. The data interpretation is therefore nontrivial and dependent on dedicated computational and statistical methods. In this chapter we discuss the many challenges associated with the analysis of shotgun metagenomic data. First, we address computational issues related to the quantification of genes in metagenomes. We describe algorithms for efficient sequence comparisons, recommended practices for setting up data workflows and modern high-performance computer resources that can be used to perform the analysis. Next, we outline the statistical aspects, including removal of systematic errors and how to identify differences between microbial communities from different experimental conditions. We conclude by underlining the increasing importance of efficient and reliable computational and statistical solutions in the analysis of large metagenomic datasets.
11.	Wiqvist, Samuel, et al. (author) Partially Exchangeable Networks and architectures for learning summary statistics in Approximate Bayesian Computation 2019 In: Proceedings of the 36th International Conference on Machine Learning. - : PMLR. ; 2019-June, pp. 11795-11804 Conference paper (peer-reviewed) Abstract: We present a novel family of deep neural architectures, named partially exchangeable networks (PENs), that leverage probabilistic symmetries. By design, PENs are invariant to block-switch transformations, which characterize the partial exchangeability properties of conditionally Markovian processes. Moreover, we show that any block-switch invariant function has a PEN-like representation. The DeepSets architecture is a special case of PEN and we can therefore also target fully exchangeable data. We employ PENs to learn summary statistics in approximate Bayesian computation (ABC). When comparing PENs to previous deep learning methods for learning summary statistics, our results are highly competitive, both considering time series and static models. Indeed, PENs provide more reliable posterior samples even when using less training data.
12.	Robinson, Jonathan, 1986, et al. (author) An atlas of human metabolism 2020 In: Science Signaling. - : American Association for the Advancement of Science (AAAS). - 1945-0877 .- 1937-9145. ; 13:624 Journal article (peer-reviewed) Abstract: Genome-scale metabolic models (GEMs) are valuable tools to study metabolism and provide a scaffold for the integrative analysis of omics data. Researchers have developed increasingly comprehensive human GEMs, but the disconnect among different model sources and versions impedes further progress. We therefore integrated and extensively curated the most recent human metabolic models to construct a consensus GEM, Human1. We demonstrated the versatility of Human1 through the generation and analysis of cell- and tissue-specific models using transcriptomic, proteomic, and kinetic data. We also present an accompanying web portal, Metabolic Atlas (https://www.metabolicatlas.org/), which facilitates further exploration and visualization of Human1 content. Human1 was created using a version-controlled, open-source model development framework to enable community-driven curation and refinement. This framework allows Human1 to be an evolving shared resource for future studies of human health and disease.
13.	Abarenkov, Kessy, et al. (author) Protax-fungi: A web-based tool for probabilistic taxonomic placement of fungal internal transcribed spacer sequences 2018 In: New Phytologist. - : Wiley. - 0028-646X .- 1469-8137. ; 220:2, pp. 517-525 Journal article (peer-reviewed) Abstract: © 2018 New Phytologist Trust. Incompleteness of reference sequence databases and unresolved taxonomic relationships complicate taxonomic placement of fungal sequences. We developed Protax-fungi, a general tool for taxonomic placement of fungal internal transcribed spacer (ITS) sequences, and implemented it into the PlutoF platform of the UNITE database for molecular identification of fungi. With empirical data on root- and wood-associated fungi, Protax-fungi reliably identified (with at least 90% identification probability) the majority of sequences to the order level but only around one-fifth of them to the species level, reflecting the current limited coverage of the databases. Protax-fungi outperformed the Sintax and Rdb classifiers in terms of increased accuracy and decreased calibration error when applied to data on mock communities representing species groups with poor sequence database coverage. We applied Protax-fungi to examine the internal consistencies of the Index Fungorum and UNITE databases. This revealed inconsistencies in the taxonomy database as well as mislabelling and sequence quality problems in the reference database. The corresponding improvements were implemented in both databases. Protax-fungi provides a robust tool for performing statistically reliable identifications of fungi in spite of the incompleteness of extant reference sequence databases and unresolved taxonomic relationships.
14.	Johansson, Simon, 1994, et al. (author) Using Active Learning to Develop Machine Learning Models for Reaction Yield Prediction 2022 In: Molecular Informatics. - : Wiley. - 1868-1743 .- 1868-1751. ; 41:12 Journal article (peer-reviewed) Abstract: Computer aided synthesis planning, suggesting synthetic routes for molecules of interest, is a rapidly growing field. The machine learning methods used are often dependent on access to large datasets for training, but finite experimental budgets limit how much data can be obtained from experiments. This suggests the use of schemes for data collection such as active learning, which identifies the data points of highest impact for model accuracy, and which has been used in recent studies with success. However, little has been done to explore the robustness of the methods predicting reaction yield when used together with active learning to reduce the amount of experimental data needed for training. This study aims to investigate the influence of machine learning algorithms and the number of initial data points on reaction yield prediction for two public high-throughput experimentation datasets. Our results show that active learning based on output margin reached a pre-defined AUROC faster than random sampling on both datasets. Analysis of feature importance of the trained machine learning models suggests active learning had a larger influence on the model accuracy when only a few features were important for the model prediction.
15.	2019 Journal article (peer-reviewed)
16.	de Dios, Eddie, et al. (author) Introduction to Deep Learning in Clinical Neuroscience 2022 In: Acta Neurochirurgica, Supplement. - Cham : Springer International Publishing. - 2197-8395 .- 0065-1419. ; 134, pp. 79-89 Book chapter (other academic/artistic) Abstract: The use of deep learning (DL) is rapidly increasing in clinical neuroscience. The term denotes models with multiple sequential layers of learning algorithms, architecturally similar to neural networks of the brain. We provide examples of DL in analyzing MRI data and discuss potential applications and methodological caveats. Important aspects are data pre-processing, volumetric segmentation, and specific task-performing DL methods, such as CNNs and AEs. Additionally, GAN-expansion and domain mapping are useful DL techniques for generating artificial data and combining several smaller datasets. We present results of DL-based segmentation and accuracy in predicting glioma subtypes based on MRI features. Dice scores range from 0.77 to 0.89. In mixed glioma cohorts, IDH mutation can be predicted with a sensitivity of 0.98 and specificity of 0.97. Results in test cohorts have shown improvements of 5–7% in accuracy, following GAN-expansion of data and domain mapping of smaller datasets. The provided DL examples are promising, although not yet in clinical practice. DL has demonstrated usefulness in data augmentation and for overcoming data variability. DL methods should be further studied, developed, and validated for broader clinical use. Ultimately, DL models can serve as effective decision support systems, and are especially well-suited for time-consuming, detail-focused, and data-ample tasks.
17.	Fredriksson, Teodor, 1992, et al. (author) Machine learning models for automatic labeling: A systematic literature review 2020 In: ICSOFT 2020 - Proceedings of the 15th International Conference on Software Technologies. - : SCITEPRESS - Science and Technology Publications. ; , pp. 552-566 Conference paper (peer-reviewed) Abstract: Automatic labeling is a type of classification problem. Classification has been studied with the help of statistical methods for a long time. With the explosion of new, better central processing units (CPUs) and graphical processing units (GPUs), the interest in machine learning has grown exponentially, and we can use both statistical learning algorithms and deep neural networks (DNNs) to solve classification tasks. Classification is a supervised machine learning problem and there exists a large amount of methodology for performing such tasks. However, it is very rare in industrial applications that data is fully labeled, which is why we need good methodology to obtain error-free labels. The purpose of this paper is to examine the current literature on how to perform labeling using ML; we compare these models in terms of popularity and the data types they are used on. We performed a systematic literature review of empirical studies on machine learning for labeling. We identified 43 primary studies relevant to our search. From this we were able to determine the most common machine learning models for labeling. Lack of labeled instances is a major problem for industry, as supervised learning is the most widely used approach. Obtaining labels is costly in terms of labor and financial costs. Based on our findings in this review we present alternate ways of labeling data for use in supervised learning tasks.
18.	Furia, Carlo A, 1979, et al. (author) Applying Bayesian Analysis Guidelines to Empirical Software Engineering Data: The Case of Programming Languages and Code Quality 2022 In: ACM Transactions on Software Engineering and Methodology. - : Association for Computing Machinery (ACM). - 1049-331X .- 1557-7392. ; 31:3 Journal article (peer-reviewed) Abstract: Statistical analysis is the tool of choice to turn data into information and then information into empirical knowledge. However, the process that goes from data to knowledge is long, uncertain, and riddled with pitfalls. To be valid, it should be supported by detailed, rigorous guidelines that help ferret out issues with the data or model and lead to qualified results that strike a reasonable balance between generality and practical relevance. Such guidelines are being developed by statisticians to support the latest techniques for Bayesian data analysis. In this article, we frame these guidelines in a way that is suited to empirical research in software engineering. To demonstrate the guidelines in practice, we apply them to reanalyze a GitHub dataset about code quality in different programming languages. The dataset's original analysis [Ray et al. 55] and a critical reanalysis [Berger et al. 6] have attracted considerable attention, in no small part because they target a topic (the impact of different programming languages) on which strong opinions abound. The goals of our reanalysis are largely orthogonal to this previous work, as we are concerned with demonstrating, on data in an interesting domain, how to build a principled Bayesian data analysis and to showcase its benefits.
In the process, we will also shed light on some critical aspects of the analyzed data and of the relationship between programming languages and code quality, such as the impact of project-specific characteristics other than the programming language used. The high-level conclusions of our exercise will be that Bayesian statistical techniques can be applied to analyze software engineering data in a way that is principled, flexible, and leads to convincing results that inform the state-of-the-art while highlighting the boundaries of its validity. The guidelines can support building solid statistical analyses and connecting their results. Thus, they can help buttress continued progress in empirical software engineering research.
19.	Hamon, Thierry, et al. (author) Combining Compositionality and Pagerank for the Identification of Semantic Relations between Biomedical Words 2012 In: BioNLP. - 9781937284206 - 1937284204 ; , pp. 109-117 Conference paper (peer-reviewed) Abstract: The acquisition of semantic resources and relations is an important task for several applications, such as query expansion, information retrieval and extraction, and machine translation. However, their validity should also be computed and indicated, especially for automatic systems and applications. We exploit compositionality-based methods for the acquisition of synonymy relations and of indicators of these synonyms. We then apply a PageRank-derived algorithm to the obtained semantic graph in order to filter the acquired synonyms. Evaluation performed with two independent experts indicates that the quality of synonyms is systematically improved by 10 to 15% after their filtering.
20.	Lindgren, Erik, 1980, et al. (author) Analysis of industrial X-ray computed tomography data with deep neural networks 2021 In: Proceedings of SPIE - The International Society for Optical Engineering. - : SPIE. - 0277-786X .- 1996-756X. ; 11840 Conference paper (peer-reviewed) Abstract: X-ray computed tomography (XCT) is increasingly utilized industrially in material and process development as well as in non-destructive quality control; XCT is important to many emerging manufacturing technologies, for example metal additive manufacturing. These trends lead to an increased need for safe automatic or semi-automatic data interpretation, considered an open research question for many critical high-value industrial products such as those within the aerospace industry. By safe, we mean that the interpretation is not allowed to fail unnoticed or unexpectedly; specifically, the algorithms must react sensibly to inputs dissimilar to the training data, so-called out-of-distribution (OOD) inputs. In this work we explore data interpretation with deep neural networks to address: robust safe data interpretation which includes a confidence estimate with respect to OOD data, an OOD detector; and generation of realistic synthetic material flaw indications for the material science and nondestructive evaluation community. We have focused on industrial XCT related challenges, addressing difficulties with spatially correlated X-ray quantum noise. Results are reported on training auto-encoders (AE) and generative adversarial networks (GAN) on a publicly available XCT dataset of additively manufactured metal. We demonstrate that adding modeled X-ray noise during training reduces artefacts in the generated imperfection indications and improves the OOD detector performance. In addition, we show that the OOD detector can detect real and synthetic OOD data and still model the accepted in-distribution data down to the X-ray noise levels.
21.	Martinsson, John, et al. (author) Automatic blood glucose prediction with confidence using recurrent neural networks 2018 In: CEUR Workshop Proceedings. - : CEUR. ; 2148, pp. 64-68 Conference paper (peer-reviewed) Abstract: Low-cost sensors continuously measuring blood glucose levels in intervals of a few minutes and mobile platforms combined with machine-learning (ML) solutions enable personalized precision health and disease management. ML solutions must be adapted to different sensor technologies, analysis tasks and individuals. This raises the issue of scale for creating such adapted ML solutions. We present an approach for predicting blood glucose levels for diabetics up to one hour into the future. The approach is based on recurrent neural networks trained in an end-to-end fashion, requiring nothing but the glucose level history for the patient. The model outputs the prediction along with an estimate of its certainty, helping users to interpret the predicted levels. The approach needs no feature engineering or data pre-processing, and is computationally inexpensive.
22.	Mashad Nemati, Hassan, 1982-, et al. (author) Bayesian Network Representation of Meaningful Patterns in Electricity Distribution Grids 2016 In: 2016 IEEE International Energy Conference (ENERGYCON). - : IEEE. - 9781467384636 Conference paper (peer-reviewed) Abstract: The diversity of components in electricity distribution grids makes it impossible, or at least very expensive, to deploy monitoring and fault diagnostics to every individual element. Therefore, power distribution companies are looking for cheap and reliable approaches that can help them estimate the condition of their assets and predict when and where faults may occur. In this paper we propose a simplified representation of failure patterns within a historical fault database, which facilitates visualization of association rules using Bayesian Networks. Our approach is based on exploring the failure history and detecting correlations between different features available in those records. We show that a small subset of the most interesting rules is enough to obtain a good and sufficiently accurate approximation of the original dataset. A Bayesian Network created from those rules can serve as an easy-to-understand visualization of the most relevant failure patterns. In addition, by varying the threshold values of support and confidence that we consider interesting, we are able to control the tradeoff between accuracy of the model and its complexity in an intuitive way. © 2016 IEEE
23.	Munappy, Aiswarya Raj, 1990, et al. (author) On the Trade-off Between Robustness and Complexity in Data Pipelines 2021 In: Quality of Information and Communications Technology. - Cham : Springer. - 9783030853464 - 9783030853471 ; 1439, pp. 401-415 Conference paper (peer-reviewed) Abstract: Data pipelines play an important role throughout the data management process, whether they are used for data analytics or machine learning. Data-driven organizations can make use of data pipelines for producing good quality data applications. Moreover, data pipelines ensure end-to-end velocity by automating the processes involved in extracting, transforming, combining, validating, and loading data for further analysis and visualization. However, the robustness of data pipelines is equally important since unhealthy data pipelines can add more noise to the input data. This paper identifies the essential elements for a robust data pipeline and analyses the trade-off between data pipeline robustness and complexity.
24.	Ranjbar, Arian, 1992, et al. (author) Scene Novelty Prediction from Unsupervised Discriminative Feature Learning 2020 In: 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC). Conference paper (peer-reviewed) Abstract: Deep learning approaches are widely explored in safety-critical autonomous driving systems on various tasks. Network models, trained on big data, map input to probable prediction results. However, it is unclear how to get a measure of confidence in this prediction at test time. Our approach to gain this additional information is to estimate how similar test data is to the training data that the model was trained on. We map training instances onto a feature space that is the most discriminative among them. We then model the entire training set as a Gaussian distribution in that feature space. The novelty of the test data is characterized by its low probability of being in that distribution, or equivalently a large Mahalanobis distance in the feature space. Our distance metric in the discriminative feature space achieves a better novelty prediction performance than the state-of-the-art methods on most classes in CIFAR-10 and ImageNet. Using semantic segmentation as a proxy task often needed for autonomous driving, we show that our unsupervised novelty prediction correlates with the performance of a segmentation network trained on full pixel-wise annotations. These experimental results demonstrate potential applications of our method in identifying scene familiarity and quantifying the confidence in autonomous driving actions.
25.	Shavalieva, Gulnara, 1987, et al. (author) Knowledge mining from scientific literature for acute aquatic toxicity: classification for hybrid predictive modelling 2022 In: Computer Aided Chemical Engineering. - 1570-7946. ; 51, pp. 1465-1470 Book chapter (other academic/artistic) Abstract: This work proposes a systematic method consisting of state-of-the-art text processing approaches and human-machine interaction for the extraction of useful sentences and data in tabular, graphical, and numerical form, containing information particularly relevant for hybrid modelling. It is applied to the domain of acute aquatic toxicity of chemicals, which is particularly relevant for the safety, health, and environmental hazard assessment of chemicals. Nearly 400 papers from 2000-2021 were identified and processed with the proposed method. The results indicate that the vast amount of knowledge can be processed orders of magnitude faster than with conventional methods, without loss of detail and interpretation depth. The information is in a form that can be useful in hybrid modelling with respect to model and predictor selection, prioritization, and constraints, addressing data gaps, and validating and interpreting model performance.

Result list for search "hsv:(NATURVETENSKAP) hsv:(Data och informationsvetenskap) hsv:(Bioinformatik) "