SwePub

Result list for search "hsv:(NATURVETENSKAP) hsv:(Data och informationsvetenskap) srt2:(2020-2024)"

  • Results 1-10 of 16480
1.
  • Norlund, Tobias, 1991, et al. (authors)
  • Transferring Knowledge from Vision to Language: How to Achieve it and how to Measure it?
  • 2021
  • In: Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, pp. 149-162, Punta Cana, Dominican Republic. - : Association for Computational Linguistics.
  • Conference paper (peer-reviewed). Abstract:
    • Large language models are known to suffer from the hallucination problem in that they are prone to output statements that are false or inconsistent, indicating a lack of knowledge. A proposed solution to this is to provide the model with additional data modalities that complement the knowledge obtained through text. We investigate the use of visual data to complement the knowledge of large language models by proposing a method for evaluating visual knowledge transfer to text for uni- or multimodal language models. The method is based on two steps: 1) a novel task querying for knowledge of memory colors, i.e. typical colors of well-known objects, and 2) filtering of model training data to clearly separate knowledge contributions. Additionally, we introduce a model architecture that involves a visual imagination step and evaluate it with our proposed method. We find that our method can successfully be used to measure visual knowledge transfer capabilities in models and that our novel model architecture shows promising results for leveraging multimodal knowledge in a unimodal setting.
2.
  •  
3.
  • Sweidan, Dirar, et al. (authors)
  • Predicting Customer Churn in Retailing
  • 2022
  • In: Proceedings 21st IEEE International Conference on Machine Learning and Applications ICMLA 2022. - : IEEE. - 9781665462839 - 9781665462846 ; , pp. 635-640
  • Conference paper (peer-reviewed). Abstract:
    • Customer churn is one of the most challenging problems for digital retailers. With significantly higher costs for acquiring new customers than retaining existing ones, knowledge about which customers are likely to churn becomes essential. This paper reports a case study where a data-driven approach to churn prediction is used for predicting churners and gaining insights about the problem domain. The real-world data set used contains approximately 200 000 customers, describing each customer using more than 50 features. In the pre-processing, exploration, modeling and analysis, attributes related to recency, frequency, and monetary concepts are identified and utilized. In addition, correlations and feature importance are used to discover and understand churn indicators. One important finding is that the churn rate highly depends on the number of previous purchases. In the segment consisting of customers with only one previous purchase, more than 75% will churn, i.e., not make another purchase in the coming year. For customers with at least four previous purchases, the corresponding churn rate is around 25%. Further analysis shows that churning customers in general, and as expected, make smaller purchases and visit the online store less often. In the experimentation, three modeling techniques are evaluated, and the results show that, in particular, Gradient Boosting models can predict churners with relatively high accuracy while obtaining a good balance between precision and recall. 
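The recency, frequency, and monetary attributes mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not describe how the authors computed these attributes, so the function name `rfm_features`, the input layout (customer id, purchase date, amount), and the output keys are all assumptions made for illustration only.

```python
from datetime import date


def rfm_features(purchases, today):
    """Aggregate per-customer recency, frequency, and monetary values.

    purchases: list of (customer_id, purchase_date, amount) tuples.
    today: reference date used to measure recency.
    Hypothetical helper, not the pipeline used in the paper.
    """
    features = {}
    for customer_id, purchase_date, amount in purchases:
        entry = features.setdefault(
            customer_id,
            {"recency_days": None, "frequency": 0, "monetary": 0.0},
        )
        # Recency: days since the most recent purchase.
        days_ago = (today - purchase_date).days
        if entry["recency_days"] is None or days_ago < entry["recency_days"]:
            entry["recency_days"] = days_ago
        # Frequency: number of purchases; monetary: total amount spent.
        entry["frequency"] += 1
        entry["monetary"] += amount
    return features
```

Features of this kind could then be passed to a classifier such as a gradient boosting model; which exact features and model settings the study used is not stated in the abstract.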
4.
  • Al Sabbagh, Khaled, 1987, et al. (authors)
  • Improving Data Quality for Regression Test Selection by Reducing Annotation Noise
  • 2020
  • In: Proceedings - 46th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2020. ; , pp. 191-194
  • Conference paper (peer-reviewed). Abstract:
    • Big data and machine learning models have been increasingly used to support software engineering processes and practices. One example is the use of machine learning models to improve test case selection in continuous integration. However, one of the challenges in building such models is the identification and reduction of noise that often comes in large data. In this paper, we present a noise reduction approach that deals with the problem of contradictory training entries. We empirically evaluate the effectiveness of the approach in the context of selective regression testing. For this purpose, we use a curated training set as input to a tree-based machine learning ensemble and compare the classification precision, recall, and f-score against a non-curated set. Our study shows that using the noise reduction approach on the training instances gives better results in prediction with an improvement of 37% on precision, 70% on recall, and 59% on f-score.
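One way to read "contradictory training entries" in the abstract above is as identical feature vectors that carry different labels. The sketch below drops such entries. This interpretation, the function name `remove_contradictory_entries`, and the rule of discarding every conflicting entry are assumptions; the abstract does not spell out the authors' actual procedure.

```python
from collections import Counter, defaultdict


def remove_contradictory_entries(rows):
    """Drop every feature vector that appears with more than one label.

    rows: list of (features, label) pairs, where features is hashable
    (for example a tuple of token counts). Illustrative rule only,
    not the procedure from the paper.
    """
    # Count which labels were seen for each distinct feature vector.
    labels_by_features = defaultdict(Counter)
    for features, label in rows:
        labels_by_features[features][label] += 1

    # Keep only entries whose feature vector maps to a single label.
    return [
        (features, label)
        for features, label in rows
        if len(labels_by_features[features]) == 1
    ]
```

An alternative design would relabel conflicting entries by majority vote instead of dropping them; which option suits a given data set depends on how much training data can be sacrificed.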
5.
  • Fredriksson, Teodor, 1992, et al. (authors)
  • Machine learning models for automatic labeling: A systematic literature review
  • 2020
  • In: ICSOFT 2020 - Proceedings of the 15th International Conference on Software Technologies. - : SCITEPRESS - Science and Technology Publications. ; , pp. 552-566
  • Conference paper (peer-reviewed). Abstract:
    • Automatic labeling is a type of classification problem. Classification has been studied with the help of statistical methods for a long time. With the explosion of new and better central processing units (CPUs) and graphics processing units (GPUs), the interest in machine learning has grown exponentially, and we can use both statistical learning algorithms and deep neural networks (DNNs) to solve classification tasks. Classification is a supervised machine learning problem, and there exists a large body of methodology for performing such tasks. However, it is very rare in industrial applications that data is fully labeled, which is why we need good methodology to obtain error-free labels. The purpose of this paper is to examine the current literature on how to perform labeling using ML; we compare these models in terms of popularity and the data types they are used on. We performed a systematic literature review of empirical studies of machine learning for labeling. We identified 43 primary studies relevant to our search. From these, we were able to determine the most common machine learning models for labeling. Lack of labeled instances is a major problem for industry, as supervised learning is the most widely used. Obtaining labels is costly in terms of labor and financial costs. Based on our findings in this review, we present alternative ways of labeling data for use in supervised learning tasks.
6.
  • Somanath, Sanjay, 1994, et al. (authors)
  • Towards Urban Digital Twins: A Workflow for Procedural Visualization Using Geospatial Data
  • 2024
  • In: Remote Sensing. - 2072-4292. ; 16:11
  • Journal article (peer-reviewed). Abstract:
    • A key feature for urban digital twins (DTs) is an automatically generated detailed 3D representation of the built and unbuilt environment from aerial imagery, footprints, LiDAR, or a fusion of these. Such 3D models have applications in architecture, civil engineering, urban planning, construction, real estate, Geographical Information Systems (GIS), and many other areas. While the visualization of large-scale data in conjunction with the generated 3D models is often a recurring and resource-intensive task, an automated workflow is complex, requiring many steps to achieve a high-quality visualization. Methods for building reconstruction approaches have come a long way, from previously manual approaches to semi-automatic or automatic approaches. This paper aims to complement existing methods of 3D building generation. First, we present a literature review covering different options for procedural context generation and visualization methods, focusing on workflows and data pipelines. Next, we present a semi-automated workflow that extends the building reconstruction pipeline to include procedural context generation using Python and Unreal Engine. Finally, we propose a workflow for integrating various types of large-scale urban analysis data for visualization. We conclude with a series of challenges faced in achieving such pipelines and the limitations of the current approach. However, the steps for a complete, end-to-end solution involve further developing robust systems for building detection, rooftop recognition, and geometry generation and importing and visualizing data in the same 3D environment, highlighting a need for further research and development in this field.
7.
  • Bergström, Gustav, et al. (authors)
  • Evaluating the layout quality of UML class diagrams using machine learning
  • 2022
  • In: Journal of Systems and Software. - : Elsevier BV. - 0164-1212. ; 192
  • Journal article (peer-reviewed). Abstract:
    • UML is the de facto standard notation for graphically representing software. UML diagrams are used in the analysis, construction, and maintenance of software systems. Mostly, UML diagrams capture an abstract view of a (piece of a) software system. A key purpose of UML diagrams is to share knowledge about the system among developers. The quality of the layout of UML diagrams plays a crucial role in their comprehension. In this paper, we present an automated method for evaluating the layout quality of UML class diagrams. We use machine learning based on features extracted from the class diagram images using image processing. Such an automated evaluator has several uses: (1) From an industrial perspective, this tool could be used for automated quality assurance for class diagrams (e.g., as part of a quality monitor integrated into a DevOps toolchain). For example, automated feedback can be generated once a UML diagram is checked into the project repository. (2) In an educational setting, the evaluator can grade the layout aspect of student assignments in courses on software modeling, analysis, and design. (3) In the field of algorithm design for graph layouts, our evaluator can assess the layouts generated by such algorithms. In this way, this evaluator opens up the road for using machine learning to learn good layout algorithms. Approach: We use machine learning techniques to build (linear) regression models based on features extracted from the class diagram images using image processing. As ground truth, we use a dataset of 600+ UML class diagrams for which experts manually labeled the quality of the layout. Contributions: This paper makes the following contributions: (1) We show the feasibility of the automatic evaluation of the layout quality of UML class diagrams. (2) We analyze which features of UML class diagrams are most strongly related to the quality of their layout. (3) We evaluate the performance of our layout evaluator. (4) We offer a dataset of labeled UML class diagrams. In this dataset, we supply for every diagram the following information: (a) a manually established ground truth of the quality of the layout, (b) an automatically established value for the layout quality of the diagram (produced by our classifier), and (c) the values of key features of the layout of the diagram (obtained by image processing). This dataset can be used for replication of our study and by others to build on and improve this work. Editor's note: Open Science material was validated by the Journal of Systems and Software Open Science Board.
8.
  • Brunetta, Carlo, 1992 (author)
  • Cryptographic Tools for Privacy Preservation
  • 2021
  • Doctoral thesis (other academic/artistic). Abstract:
    • Data permeates every aspect of our daily life and it is the backbone of our digitalized society. Smartphones, smartwatches and many more smart devices measure, collect, modify and share data in what is known as the Internet of Things. Often, these devices do not have enough computation power or storage space and thus outsource some aspects of the data management to the Cloud. Outsourcing computation/storage to a third party poses natural questions regarding the security and privacy of the shared sensitive data. Intuitively, Cryptography is a toolset of primitives/protocols whose security properties are formally proven, while Privacy typically captures additional social/legislative requirements that relate more to the concept of “trust” between people, “how” data is used and/or “who” has access to data. This thesis separates the concepts by introducing an abstract model that classifies data leaks into different types of breaches. Each class represents a specific requirement/goal related to cryptography, e.g. confidentiality or integrity, or related to privacy, e.g. liability, sensitive data management and more. The thesis contains cryptographic tools designed to provide privacy guarantees for different application scenarios. In more detail, the thesis: (a) defines new encryption schemes that provide formal privacy guarantees such as theoretical privacy definitions like Differential Privacy (DP), or concrete privacy-oriented applications covered by existing regulations such as the European General Data Protection Regulation (GDPR); (b) proposes new tools and procedures for providing verifiable-computation guarantees in concrete scenarios for post-quantum cryptography or generalisation of signature schemes; (c) proposes a methodology for utilising Machine Learning (ML) for analysing the effective security and privacy of a crypto-tool and, dually, proposes a secure primitive that allows computing a specific ML algorithm in a privacy-preserving way; (d) provides an alternative protocol for secure communication between two parties, based on the idea of communicating in a periodically timed fashion.
9.
  • Dodig-Crnkovic, Gordana, 1955 (author)
  • Cognitive Architectures Based on Natural Info-Computation
  • 2022
  • In: Studies in Applied Philosophy, Epistemology and Rational Ethics. - Cham : Springer. - 2192-6255 .- 2192-6263. ; , pp. 3-13
  • Book chapter (peer-reviewed). Abstract:
    • At the time when the first models of cognitive architectures were proposed, some forty years ago, understanding of cognition, embodiment and evolution was substantially different from today’s. So was the state of the art of information physics, information chemistry, bioinformatics, neuroinformatics, computational neuroscience, complexity theory, self-organization, theory of evolution, as well as the basic concepts of information and computation. Novel developments support a constructive interdisciplinary framework for cognitive architectures based on natural morphological computing, where interactions between constituents at different levels of organization of matter-energy, and their corresponding time-dependent dynamics, lead to complexification of agency and increased cognitive capacities of living organisms that unfold through evolution. The proposed info-computational framework for naturalizing cognition considers present updates (generalizations) of the concepts of information, computation, cognition, and evolution in order to attain an alignment with the current state of the art in the corresponding research fields. Some important open questions are suggested for future research, with implications for further development of cognitive and intelligent technologies.
10.
  • Laaber, C., et al. (authors)
  • Applying test case prioritization to software microbenchmarks
  • 2021
  • In: Empirical Software Engineering. - : Springer Science and Business Media LLC. - 1382-3256 .- 1573-7616. ; 26:6
  • Journal article (peer-reviewed). Abstract:
    • Regression testing comprises techniques which are applied during software evolution to uncover faults effectively and efficiently. While regression testing is widely studied for functional tests, performance regression testing, e.g., with software microbenchmarks, is hardly investigated. Applying test case prioritization (TCP), a regression testing technique, to software microbenchmarks may help capture large performance regressions sooner upon new versions. This may especially be beneficial for microbenchmark suites, because they take considerably longer to execute than unit test suites. However, it is unclear whether traditional unit testing TCP techniques work equally well for software microbenchmarks. In this paper, we empirically study coverage-based TCP techniques, employing total and additional greedy strategies, applied to software microbenchmarks along multiple parameterization dimensions, leading to 54 unique technique instantiations. We find that TCP techniques have a mean APFD-P (average percentage of fault-detection on performance) effectiveness between 0.54 and 0.71 and are able to capture the three largest performance changes after executing 29% to 66% of the whole microbenchmark suite. Our efficiency analysis reveals that the runtime overhead of TCP varies considerably depending on the exact parameterization. The most effective technique has an overhead of 11% of the total microbenchmark suite execution time, making TCP a viable option for performance regression testing. The results demonstrate that the total strategy is superior to the additional strategy. Finally, dynamic-coverage techniques should be favored over static-coverage techniques due to their acceptable analysis overhead; however, in settings where the time for prioritization is limited, static-coverage techniques provide an attractive alternative.
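The "total" and "additional" greedy strategies named in the abstract above are standard coverage-based prioritization schemes, and a minimal sketch may help show how they differ. The function names, the dictionary-of-sets input, the alphabetical tie-breaking, and the reset rule in the additional strategy are assumptions for illustration; the study's own tooling and parameterization are not described in the abstract.

```python
def total_strategy(coverage):
    """Order benchmarks by how many code units each one covers in total.

    coverage: dict mapping benchmark name -> set of covered units.
    Ties are broken alphabetically (an assumption of this sketch).
    """
    return sorted(coverage, key=lambda name: (-len(coverage[name]), name))


def additional_strategy(coverage):
    """Order benchmarks by how many not-yet-covered units each one adds.

    When no remaining benchmark adds new coverage, the covered set is
    reset and ranking continues on the remaining benchmarks.
    """
    remaining = {name: set(units) for name, units in coverage.items()}
    covered = set()
    order = []
    while remaining:
        # Pick the benchmark with the largest gain over current coverage.
        best = min(remaining, key=lambda n: (-len(remaining[n] - covered), n))
        gain = remaining[best] - covered
        if not gain and covered:
            covered = set()  # nothing new to cover: reset and re-rank
            continue
        order.append(best)
        covered |= remaining.pop(best)
    return order
```

For example, with `{"a": {1, 2, 3}, "b": {1, 2}, "c": {4}}` the total strategy yields `a, b, c`, whereas the additional strategy yields `a, c, b`, because `c` contributes a unit that `a` has not already covered.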
Type of publication
conference paper (7468)
journal article (6471)
book chapter (638)
doctoral thesis (634)
research review (327)
licentiate thesis (257)
report (236)
proceedings (editorship) (155)
other publication (152)
collection (editorship) (62)
book (59)
artistic work (34)
patent (5)
review (5)
Type of content
peer-reviewed (14079)
other academic/artistic (2300)
popular science, debate, etc. (81)
Author/editor
Bosch, Jan, 1967 (110)
Liwicki, Marcus (99)
Sandkuhl, Kurt, 1963 ... (99)
Vyatkin, Valeriy (91)
Andersson, Karl, 197 ... (87)
Torra, Vicenç (81)
Staron, Miroslaw, 19 ... (68)
Mendez, Daniel (62)
Taheri, Javid (59)
Khan, Fahad (54)
Nikolakopoulos, Geor ... (54)
Dobnik, Simon, 1977 (54)
Lv, Zhihan, Dr. 1984 ... (54)
Cajander, Åsa, Profe ... (53)
Markidis, Stefano (50)
Knauss, Eric, 1977 (50)
Dignum, Frank (48)
Hossain, Mohammad Sh ... (48)
Tiwari, Prayag, 1991 ... (48)
Magnusson, Johan, 19 ... (47)
Främling, Kary, 1965 ... (46)
Kerren, Andreas, Dr. ... (46)
Gionis, Aristides (45)
Matviienko, Andrii (45)
Horkoff, Jennifer, 1 ... (44)
Calvanese, Diego (44)
Khan, Salman (44)
Weyns, Danny (43)
Feldt, Robert, 1972 (43)
Kampik, Timotheus, 1 ... (43)
Monperrus, Martin (43)
Kragic, Danica, 1971 ... (42)
Leite, Iolanda (42)
Berger, Thorsten, 19 ... (41)
Borg, Markus (41)
Nowaczyk, Sławomir, ... (40)
Woźniak, Paweł W., 1 ... (40)
Borin, Lars, 1957 (40)
Skantze, Gabriel, 19 ... (40)
Runeson, Per (40)
Nielsen, Jens B, 196 ... (39)
Bernardy, Jean-Phili ... (39)
Natalino Da Silva, C ... (39)
Gorschek, Tony, 1972 ... (39)
Kävrestad, Joakim, 1 ... (39)
Elmroth, Erik (39)
Wymeersch, Henk, 197 ... (38)
Hansen, Preben (38)
Hossain, Mohammad Sh ... (37)
Monti, Paolo, 1973- (37)
Higher education institution
Chalmers tekniska högskola (2839)
Kungliga Tekniska Högskolan (2588)
Göteborgs universitet (1489)
Uppsala universitet (1412)
Linköpings universitet (1213)
Stockholms universitet (1116)
Umeå universitet (1096)
Luleå tekniska universitet (903)
Lunds universitet (901)
Blekinge Tekniska Högskola (628)
Mälardalens universitet (606)
Linnéuniversitetet (524)
RISE (495)
Högskolan i Skövde (458)
Karlstads universitet (417)
Jönköping University (414)
Örebro universitet (399)
Malmö universitet (396)
Högskolan i Halmstad (320)
Mittuniversitetet (288)
Karolinska Institutet (161)
Högskolan Dalarna (117)
Högskolan i Gävle (100)
Sveriges Lantbruksuniversitet (94)
Södertörns högskola (76)
Högskolan i Borås (71)
Högskolan Väst (47)
Högskolan Kristianstad (40)
VTI - Statens väg- och transportforskningsinstitut (32)
Handelshögskolan i Stockholm (31)
Försvarshögskolan (24)
Kungl. Musikhögskolan (19)
IVL Svenska Miljöinstitutet (9)
Institutet för språk och folkminnen (8)
Naturhistoriska riksmuseet (5)
Stockholms konstnärliga högskola (3)
Konstfack (2)
Gymnastik- och idrottshögskolan (2)
Sophiahemmet Högskola (2)
Röda Korsets Högskola (1)
Enskilda Högskolan Stockholm (1)
Language
English (16211)
Swedish (229)
German (11)
Portuguese (11)
Norwegian (2)
Undefined language (2)
Estonian (2)
Japanese (2)
Mongolian (2)
French (1)
Danish (1)
Russian (1)
Spanish (1)
Finnish (1)
Hungarian (1)
Modern Greek (1)
Turkish (1)
Research subject (UKÄ/SCB)
Natural sciences (16468)
Engineering and technology (3635)
Social sciences (1440)
Humanities (762)
Medical and health sciences (622)
Agricultural sciences (68)