SwePub
Sök i SwePub databas

  Utökad sökning

Träfflista för sökning "WFRF:(Pickerill M) "

Sökning: WFRF:(Pickerill M)

  • Resultat 1-3 av 3
Sortera/gruppera träfflistan
   
NumreringReferensOmslagsbildHitta
1.
  •  
2.
  •  
3.
  • Pickerill, P., et al. (författare)
  • PHANTOM: Curating GitHub for engineered software projects using time-series clustering
  • 2020
  • Ingår i: Empirical Software Engineering. - : Springer Science and Business Media LLC. - 1382-3256 .- 1573-7616. ; 25:4, s. 2897-2929
  • Tidskriftsartikel (refereegranskat)abstract
    • Context Within the field of Mining Software Repositories, there are numerous methods employed to filter datasets in order to avoid analysing low-quality projects. Unfortunately, the existing filtering methods have not kept up with the growth of existing data sources, such as GitHub, and researchers often rely on quick and dirty techniques to curate datasets. Objective The objective of this study is to develop a method capable of filtering large quantities of software projects in a resource-efficient way. Method This study follows the Design Science Research (DSR) methodology. The proposed method, PHANTOM, extracts five measures from Git logs. Each measure is transformed into a time-series, which is represented as a feature vector for clustering using the k-means algorithm. Results Using the ground truth from a previous study, PHANTOM was shown to be able to rediscover the ground truth on the training dataset, and was able to identify "engineered" projects with up to 0.87 Precision and 0.94 Recall on the validation dataset. PHANTOM downloaded and processed the metadata of 1,786,601 GitHub repositories in 21.5 days using a single personal computer, which is over 33% faster than the previous study which used a computer cluster of 200 nodes. The possibility of applying the method outside of the open-source community was investigated by curating 100 repositories owned by two companies. Conclusions It is possible to use an unsupervised approach to identify engineered projects. PHANTOM was shown to be competitive compared to the existing supervised approaches while reducing the hardware requirements by two orders of magnitude.
  •  
Skapa referenser, mejla, bekava och länka
  • Resultat 1-3 av 3

Kungliga biblioteket hanterar dina personuppgifter i enlighet med EU:s dataskyddsförordning (2018), GDPR. Läs mer om hur det funkar här.
Så här hanterar KB dina uppgifter vid användning av denna tjänst.

 
pil uppåt Stäng

Kopiera och spara länken för att återkomma till aktuell vy