SwePub

id:"swepub:oai:prod.swepub.kib.ki.se:146514526"
 


Can synthetic data be a proxy for real clinical trial data? A validation study

Azizi, Z (author)
Zheng, CY (author)
Mosquera, L (author)
Pilote, L (author)
El Emam, K (author)
2021-04-16
2021
English.
In: BMJ open. - : BMJ. - 2044-6055. ; 11:4, p. e043497-
  • Journal article (peer-reviewed)
Abstract
  • There are increasing requirements to make research data, especially clinical trial data, more broadly available for secondary analyses. However, data availability remains a challenge due to complex privacy requirements. This challenge can potentially be addressed using synthetic data.

    Setting: Replication of a published stage III colon cancer trial secondary analysis using synthetic data generated by a machine learning method.

    Participants: There were 1543 patients in the control arm that were included in our analysis.

    Primary and secondary outcome measures: Analyses from a study published on the real dataset were replicated on synthetic data to investigate the relationship between bowel obstruction and event-free survival. Information theoretic metrics were used to compare the univariate distributions between real and synthetic data. Percentage CI overlap was used to assess the similarity in the size of the bivariate relationships, and similarly for the multivariate Cox models derived from the two datasets.

    Results: Analysis results were similar between the real and synthetic datasets. The univariate distributions were within 1% of difference on an information theoretic metric. All of the bivariate relationships had CI overlap on the tau statistic above 50%. The main conclusion from the published study, that lack of bowel obstruction has a strong impact on survival, was replicated directionally and the HR CI overlap between the real and synthetic data was 61% for overall survival (real data: HR 1.56, 95% CI 1.11 to 2.2; synthetic data: HR 2.03, 95% CI 1.44 to 2.87) and 86% for disease-free survival (real data: HR 1.51, 95% CI 1.18 to 1.95; synthetic data: HR 1.63, 95% CI 1.26 to 2.1).

    Conclusions: The high concordance between the analytical results and conclusions from synthetic and real data suggests that synthetic data can be used as a reasonable proxy for real clinical trial datasets.

    Trial registration number: NCT00079274.
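
The abstract reports percentage CI overlap without giving the formula. The sketch below assumes one commonly used definition: the length of the interval shared by the two CIs, divided by each CI's own width, then averaged. The function name `ci_overlap`, the choice to compute on the raw HR scale, and the clamping of disjoint intervals to zero are all assumptions made for illustration, not details taken from the record. Under these assumptions the intervals quoted in the abstract give the reported 61% and 86%.

```python
def ci_overlap(real_ci, synth_ci):
    """Average fractional overlap of two confidence intervals.

    Assumed definition (not stated in the abstract): the shared length
    is divided by each interval's width and the two ratios are averaged.
    Disjoint intervals are clamped to 0.0, which is also an assumption.
    """
    lo_r, hi_r = real_ci
    lo_s, hi_s = synth_ci
    shared = min(hi_r, hi_s) - max(lo_r, lo_s)
    if shared <= 0:
        return 0.0
    return 0.5 * (shared / (hi_r - lo_r) + shared / (hi_s - lo_s))


# 95% CIs for the hazard ratios quoted in the abstract (real, synthetic).
os_overlap = ci_overlap((1.11, 2.2), (1.44, 2.87))    # overall survival
dfs_overlap = ci_overlap((1.18, 1.95), (1.26, 2.1))   # disease-free survival

print(f"{os_overlap:.0%}, {dfs_overlap:.0%}")  # prints "61%, 86%"
```

Averaging the two ratios keeps the measure symmetric, so a very wide synthetic interval that merely contains the real one does not score 100%.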

Publication and content type

ref (subject category)
art (subject category)

Institution
Karolinska Institutet
