December 7 @ 10:30 AM - 11:30 AM - LWSN B155
Date: Friday, December 7, 2012
Time: 10:30 AM
Place: LWSN B155
Speaker: Sunduz Keles, Department of Statistics, University of Wisconsin-Madison
"Statistical and computational aspects of ChIP-seq data analysis: From design to biological discovery"
Chromatin immunoprecipitation followed by high throughput sequencing (ChIP-Seq) has been successfully used for genome-wide profiling of transcription factor binding sites, histone modifications, and nucleosome occupancy in many model organisms and humans. Because the compact genomes of prokaryotes harbor many binding sites separated by only a few base pairs, applications of ChIP-Seq in this domain have not reached their full potential. Applications in prokaryotic genomes are further hampered by the fact that well-studied data analysis methods for ChIP-Seq do not provide the resolution required for deciphering the locations of nearby binding events. We generated single-end tag (SET) and paired-end tag (PET) ChIP-Seq data for the sigma70 factor in Escherichia coli (E. coli). Direct comparison of these datasets revealed that although the PET assay enables higher resolution identification of binding events, standard ChIP-Seq analysis methods are not equipped to utilize PET-specific features of the data. To address this problem, we developed dPeak as a high resolution binding site identification (deconvolution) algorithm. dPeak implements a probabilistic model that accurately describes the ChIP-Seq data generation process for both the SET and PET assays. For SET data, dPeak outperforms or performs comparably to the state-of-the-art high-resolution ChIP-Seq peak deconvolution algorithms such as PICS, GPS, and GEM. When coupled with PET data, dPeak significantly outperforms SET-based analysis with any of the current state-of-the-art methods. Experimental validations of a subset of dPeak predictions from sigma70 PET ChIP-Seq data indicate that dPeak can estimate locations of binding events with resolution as high as 2 to 21 bp. Theoretical calculations based on our probabilistic model indicate when and how the PET assay gains an advantage over the SET assay. Results from our model and experimental study have implications for ChIP-exo experiments, differential occupancy analysis with ChIP-Seq, and ChIP-Seq based cis-regulatory module construction.