SwePub
onr:"swepub:oai:DiVA.org:uu-169968"
 


A strand specific high resolution normalization method for chip-sequencing data employing multiple experimental control measurements

Enroth, Stefan (author)
Uppsala University, Genomics, Centre for Bioinformatics, Science for Life Laboratory, SciLifeLab
Andersson, Claes R. (author)
Uppsala University, Department of Medical Sciences
Andersson, Robin (author)
Uppsala University, Centre for Bioinformatics, Science for Life Laboratory, SciLifeLab
Wadelius, Claes (author)
Uppsala University, Medical Genetics, Science for Life Laboratory, SciLifeLab
Gustafsson, Mats G. (author)
Uppsala University, Department of Medical Sciences, Computational and Systems Biology, Science for Life Laboratory, SciLifeLab
Komorowski, Jan (author)
Uppsala University, Computational and Systems Biology, Science for Life Laboratory, SciLifeLab
2012-01-16
2012
English.
In: Algorithms for Molecular Biology. - : Springer Science and Business Media LLC. - 1748-7188. ; 7, p. 2-
  • Journal article (peer-reviewed)
Abstract
  • Background: High-throughput sequencing is becoming the standard tool for investigating protein-DNA interactions or epigenetic modifications. However, the data generated will always contain noise due to, e.g., repetitive regions or non-specific antibody interactions. The noise appears as a background distribution of reads that must be taken into account in the downstream analysis, for example when detecting enriched regions (peak-calling). Several reported peak-callers can take experimental measurements of the background tag distribution into account when analysing a data set. Unfortunately, the background is only used to adjust peak calling and not as a preprocessing step that aims at discerning the signal from the background noise. A normalization procedure that extracts the signal of interest would be of universal use when investigating genomic patterns.
  • Results: We formulated such a normalization method based on linear regression and made a proof-of-concept implementation in R and C++. It was tested on simulated data as well as on publicly available ChIP-seq data on binding sites for two transcription factors, MAX and FOXA1, and two control samples, Input and IgG. We applied three different peak-callers to (i) raw (un-normalized) data using statistical background models, (ii) raw data with control samples as background, and (iii) normalized data without additional control samples as background. The fraction of called regions containing the expected transcription factor binding motif was largest for the normalized data, and evaluation with qPCR data for FOXA1 suggested higher sensitivity and specificity using normalized data over raw data with experimental background.
  • Conclusions: The proposed method can handle several control samples, allowing for correction of multiple sources of bias simultaneously. Our evaluation on both synthetic and experimental data suggests that the method is successful in removing background noise.
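
The general idea of regressing a ChIP signal on several control tracks can be illustrated with a small sketch. This is not the authors' R/C++ implementation, and the record does not give the details of their model; the code below only shows generic ordinary least squares, where the residual left after fitting an intercept plus each control track is treated as the "normalized" signal. The function names `solve_linear_system` and `normalize_by_controls`, and the use of per-bin read counts as input, are assumptions made for illustration.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: control tracks may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def normalize_by_controls(
    signal: Sequence[float], controls: Sequence[Sequence[float]]
) -> List[float]:
    """Return the residuals of `signal` after an ordinary least-squares fit
    on an intercept plus every control track (hypothetical helper)."""
    n = len(signal)
    # Design-matrix columns: a constant term, then one column per control.
    columns = [[1.0] * n] + [[float(v) for v in c] for c in controls]
    k = len(columns)
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(ci[t] * cj[t] for t in range(n)) for cj in columns] for ci in columns]
    xty = [sum(ci[t] * signal[t] for t in range(n)) for ci in columns]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(beta[j] * columns[j][t] for j in range(k)) for t in range(n)]
    return [signal[t] - fitted[t] for t in range(n)]


# Toy per-bin read counts for two control tracks (made-up numbers).
input_track = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
igg_track = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
# A signal that is pure background plus one extra enrichment in bin 3.
chip_track = [0.5 + 2.0 * a + 1.0 * b for a, b in zip(input_track, igg_track)]
chip_track[3] += 10.0
normalized = normalize_by_controls(chip_track, [input_track, igg_track])
```

In this toy setting the residuals are, by construction, uncorrelated with both control tracks, which is one simple way to read "correcting for multiple sources of bias simultaneously". A strand-specific variant would presumably apply the same fit to each strand separately, but that is an assumption and not something this record specifies.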

Subject headings

NATURAL SCIENCES -- Biological Sciences -- Biochemistry and Molecular Biology (hsv//eng)

Publication and content type

ref (subject category)
art (subject category)
