HyCCAPP: Hybridization Capture of Chromatin-Associated Proteins for Proteomics
Overview. HyCCAPP (Hybridization Capture of Chromatin-Associated Proteins for Proteomics) is a strategy we have recently developed to identify the proteins that are associated with specific regions of the genome (REF). In HyCCAPP, cells are formaldehyde-crosslinked to stabilize DNA:protein interactions, followed by cell lysis, chromatin fragmentation, and sequence-specific DNA hybridization to capture and purify desired target regions of the genome. The associated proteins are then identified by mass spectrometry. The technology was developed and shown to be powerful and effective for both multicopy and single copy loci in yeast. We are presently working to extend its capabilities in several new directions:
- Apply HyCCAPP to determine the changes that occur in proteins bound to relevant yeast loci in response to stress (graduate student Sean Dai – a collaboration with Professor Audrey Gasch)
- Apply HyCCAPP to the analysis of multicopy loci in mammalian cells (graduate student Katie Buxton)
- Evaluate the effect of cell cycle synchronization on HyCCAPP capture efficiency in yeast (postdoctoral fellow Michele Spiniello)
- Use the same basic approach to pull down specific RNAs and identify the associated RNA-binding proteins. This is being pursued on both HIV RNAs (graduate student Rachel Knoener – a collaboration with graduate student Jordan Becker and Professor Nathan Sherer) and long noncoding RNAs (a collaboration with Professor David Jarrard).
- Seek to increase the chromatin capture efficiency in order to decrease the cell number requirements and thereby enable the analysis of single copy loci in mammalian cells.
- Develop increased specificity of hybridization capture using “shielded covalent probes” (a collaboration with Professors Jeff Vieregg and Niles Pierce).
DNA−protein interactions are fundamental to the control of genome expression and play critical roles in mediating DNA replication1, chromatin organization/segregation2, and the transcription of genes and noncoding RNAs3, 4. Some proteins associate with DNA in a site- and sequence-specific manner, such as transcription factors that recognize specific cis regulatory elements to modulate the transcription of nearby genes. Other proteins, such as histones and cohesins, bind much of the genome, often with periodic binding patterns5, 6. Numerous technologies exist to study these interactions, including DNase footprinting7, formaldehyde-assisted isolation of regulatory elements (FAIRE)8, and chromatin immunoprecipitation (ChIP)9. DNase footprinting and FAIRE provide information on protein occupancy by revealing sites of the genome protected by or depleted of bound proteins7, 8. ChIP-seq is used extensively to examine sites across the genome that are bound by DNA-associated proteins10, 11. However, a major limitation of ChIP is that it is protein-centric, in that a candidate protein must be chosen as a potential DNA binder. Therefore, while ChIP-seq reveals the specific binding patterns of a given protein within the genome, it does not provide information about other proteins bound to those DNA regions. In the absence of a means to discover the proteins bound to specific chromosomal loci in living cells, much of how the genome is replicated, protected, and expressed will remain obscure.
To address this problem, we have developed the HyCCAPP technology, a corollary to ChIP that is DNA-centric; that is, rather than isolating DNA−protein complexes through capture of the protein, we capture a DNA locus of interest by sequence-specific DNA hybridization, and identify the proteins that are interacting with it by mass spectrometry10. Importantly, this technology can not only identify already known DNA interactors, such as those studied using ChIP, but also reveal new and previously unsuspected proteins. Conceptually, HyCCAPP is very similar to ChIP analysis, differing only in the nature of the affinity capture step (DNA hybridization as opposed to antibody binding). In practice, however, it is much more difficult. First, in contrast to ChIP, where captured DNA is amplified by the polymerase chain reaction (PCR) for subsequent analysis, no amplification is possible for captured protein, and thus much less material is available for mass spectrometric (MS) analysis. Second, the expression levels of different proteins within cells vary by many orders of magnitude, and in some cases important proteins are present at exceedingly low levels. This dramatically complicates their separation and MS analysis. Despite these difficulties, there have been two previous reports of DNA-centric capture approaches. Déjardin and Kingston described the “Proteomics of Isolated Chromatin segments” (PICh) strategy to identify proteins bound to multicopy human or Drosophila telomeric regions using locked nucleic acid (LNA) probes11,12, and Byrum et al. developed an approach referred to as “Chromatin Affinity Purification with Mass Spectrometry” (ChAP-MS), in which a transcription factor (LexA) binding site is engineered into the yeast genome at one specific locus and affinity captured via binding of a LexA−Protein A conjugate13. They employed this methodology to identify proteins associated with the yeast GAL1 gene promoter13.
While constituting an important early proof-of-principle, the Kingston approach has thus far been limited to the capture of very high copy number short repetitive sequences11, 12, and the ChAP-MS approach requires either engineering of the target genome to introduce a protein binding site13 or introduction of a transcription activator-like (TAL) fusion protein14.
HyCCAPP seeks to address these limitations, combining (i) sequence-specific hybridization capture of DNA fragments of interest directly from a cleared cell lysate, (ii) state-of-the-art mass spectrometric analysis, and (iii) a bioinformatics analysis pipeline to statistically differentiate between real and background signal. We used HyCCAPP to produce locus-specific protein lists for four genomic regions in S. cerevisiae: the 25S and 5S regions within the rDNA locus (∼150−200 copies/cell), the X element adjacent to the telomeres (∼35 copies/cell), and the GAL1-10 promoter (single copy/cell)10. These locus-specific protein lists included many, although not all, previously identified protein interactors, as well as numerous previously unknown interactors that provide new insights into function. Validation of the previously unknown binding proteins by chromatin immunoprecipitation (ChIP) of TAP-tagged proteins confirmed the findings.
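Because captured protein cannot be amplified, the copy number of the target locus sets an upper bound on the material available for MS. The short Python sketch below illustrates that arithmetic using the approximate copy numbers quoted above. The cell count and the capture efficiency are hypothetical placeholders chosen only for illustration, not measured values, and the function name is ours.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def captured_femtomoles(n_cells, copies_per_cell, capture_efficiency):
    """Upper-bound estimate of captured locus copies, in femtomoles.

    Assumes every cell contributes `copies_per_cell` target copies and a
    fixed fraction `capture_efficiency` (0 to 1) of them is recovered.
    """
    if not 0.0 <= capture_efficiency <= 1.0:
        raise ValueError("capture_efficiency must be between 0 and 1")
    molecules = n_cells * copies_per_cell * capture_efficiency
    return molecules / AVOGADRO * 1e15


if __name__ == "__main__":
    # Copy numbers per cell taken from the text (lower end of the rDNA range).
    loci = {"rDNA (25S/5S)": 150, "X element": 35, "GAL1-10 promoter": 1}
    # Hypothetical inputs, for illustration only.
    n_cells = 1e10
    efficiency = 0.01
    for name, copies in loci.items():
        fmol = captured_femtomoles(n_cells, copies, efficiency)
        print(f"{name}: {fmol:.3g} fmol of captured locus")
```

Under any fixed cell count and efficiency, the single copy locus yields roughly 150-fold less captured material than the rDNA locus, which is why capture efficiency and cell number dominate the feasibility of single copy analysis.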
While we are gratified by the success of HyCCAPP, which already constitutes an important new tool in the arsenal of the functional genomicist, in its current state of development it remains limited compared to its potential. At present, HyCCAPP is able to reveal proteins associated with single copy regions in yeast genomes; the primary goal of this proposal is to extend this capability to the robust multiplexed analysis of single copy regions in mammalian genomes. The significance of such a technology and its potential impact on the biological research community is tremendous, akin to a new type of microscope, able to peer into the genome and reveal the central molecular players responsible for regulation and control. Without detailed knowledge of the proteins interacting across the genome, humankind will never be able to understand the mechanisms responsible for the critical regulatory processes that are central to the development of complex multicellular organisms such as ourselves.
The HyCCAPP Process: From Cells to Identified Locus-Specific Binding Proteins.
HyCCAPP consists of the following steps (see Figure 1):
- Crosslinking and lysis: cells are crosslinked with formaldehyde and lysed by mechanical disruption.
- Hybridization capture: chromatin is purified and fragmented by sonication, target chromatin fragments are captured by hybridization to biotinylated or desthiobiotinylated complementary oligonucleotide probes, and the protein:DNA complexes are isolated using magnetic streptavidin beads.
- Elution: captured proteins are eluted and background is removed using biotin/desthiobiotin exchange or hybridization-based strand displacement.
- Protein analysis: proteins are identified and quantified by mass spectrometry, and statistical data analysis distinguishes specific protein binders from residual background.
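To make the final step concrete, the sketch below shows one simple way to separate candidate binders from background: a log2 ratio of normalized spectral counts between a target capture and a control capture. This is a deliberately simplified stand-in and not the statistical pipeline used for HyCCAPP. The control sample, the pseudocount, the threshold, and the protein names are all hypothetical assumptions.

```python
import math


def log2_enrichment(capture_counts, control_counts, pseudocount=1.0):
    """Return {protein: log2(capture fraction / control fraction)}.

    Counts are spectral counts per protein. A pseudocount is added to every
    protein in both samples so proteins absent from one sample stay finite.
    """
    proteins = set(capture_counts) | set(control_counts)
    cap_total = sum(capture_counts.values()) + pseudocount * len(proteins)
    ctl_total = sum(control_counts.values()) + pseudocount * len(proteins)
    scores = {}
    for protein in proteins:
        cap = (capture_counts.get(protein, 0) + pseudocount) / cap_total
        ctl = (control_counts.get(protein, 0) + pseudocount) / ctl_total
        scores[protein] = math.log2(cap / ctl)
    return scores


def candidate_binders(scores, min_log2=1.0):
    """Proteins whose enrichment meets the (hypothetical) threshold."""
    return sorted(p for p, s in scores.items() if s >= min_log2)


if __name__ == "__main__":
    # Made-up spectral counts for three placeholder proteins.
    capture = {"protein_A": 30, "protein_B": 10, "protein_C": 0}
    control = {"protein_A": 3, "protein_B": 10, "protein_C": 7}
    scores = log2_enrichment(capture, control)
    print(candidate_binders(scores))
```

A real analysis would use replicate experiments and a formal statistical test rather than a fixed fold-change cutoff; the ratio here only illustrates the idea of scoring each protein against a background measurement.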
1. Sun, J.Y. and Kong, D.C., "DNA replication origins, ORC/DNA interaction, and assembly of pre-replication complex in eukaryotes." Acta Biochim Biophys Sin, 2010, 42(7), 433-439.
2. Goshima, G.; Saitoh, S.; and Yanagida, M., "Proper metaphase spindle length is determined by centromere proteins Mis12 and Mis6 required for faithful chromosome segregation." Genes Dev, 1999, 13(13), 1664-1677.
3. Martens, J.A.; Laprade, L.; and Winston, F., "Intergenic transcription is required to repress the Saccharomyces cerevisiae SER3 gene." Nature, 2004, 429(6991), 571-574.
4. Xu, Z.Y.; Wei, W.; Gagneur, J.; Perocchi, F.; Clauder-Munster, S.; Camblong, J.; Guffanti, E.; Stutz, F.; Huber, W.; and Steinmetz, L.M., "Bidirectional promoters generate pervasive transcription in yeast." Nature, 2009, 457(7232), 1033-U7.
5. Glynn, E.F.; Megee, P.C.; Yu, H.G.; Mistrot, C.; Unal, E.; Koshland, D.E.; DeRisi, J.L.; and Gerton, J.L., "Genome-wide mapping of the cohesin complex in the yeast Saccharomyces cerevisiae." PLoS Biol, 2004, 2(9), 1325-1339.
6. Arya, G.; Maitra, A.; and Grigoryev, S.A., "A structural perspective on the where, how, why, and what of nucleosome positioning." J Biomol Struct Dyn, 2010, 27(6), 803-820.
7. Galas, D.J. and Schmitz, A., "DNAse footprinting - simple method for detection of protein-DNA binding specificity." Nucleic Acids Research, 1978, 5(9), 3157-3170.
8. Simon, J.M.; Giresi, P.G.; Davis, I.J.; and Lieb, J.D., "Using formaldehyde-assisted isolation of regulatory elements (FAIRE) to isolate active regulatory DNA." Nature Protocols, 2012, 7(2), 256-267.
9. Solomon, M.J.; Larsen, P.L.; and Varshavsky, A., "Mapping protein DNA interactions in vivo with formaldehyde - evidence that histone-H4 is retained on a highly transcribed gene." Cell, 1988, 53(6), 937-947.
10. Kennedy-Darling, J.; Guillen-Ahlers, H.; Shortreed, M.R.; Scalf, M.; Frey, B.L.; Kendziorski, C.; Olivier, M.; Gasch, A.P.; and Smith, L.M., "Discovery of Chromatin-Associated Proteins via Sequence-Specific Capture and Mass Spectrometric Protein Identification in Saccharomyces cerevisiae." Journal of Proteome Research, 2014. PMC4123949.
11. Dejardin, J. and Kingston, R.E., "Purification of proteins associated with specific genomic loci." Cell, 2009, 136(1), 175-186. PMC3395431.
12. Antao, J.M.; Mason, J.M.; Dejardin, J.; and Kingston, R.E., "Protein landscape at Drosophila melanogaster telomere-associated sequence repeats." Mol Cell Biol, 2012, 32(12), 2170-2182.
13. Byrum, S.D.; Raman, A.; Taverna, S.D.; and Tackett, A.J., "ChAP-MS: A method for identification of proteins and histone posttranslational modifications at a single genomic locus." Cell Reports, 2012, 2(1), 198-205.
14. Byrum, S.D.; Taverna, S.D.; and Tackett, A.J., "Purification of a specific native genomic locus for proteomic analysis." Nucleic Acids Research, 2013, 41(20), 6.