Open chromatin and ATAC-seq

Nathan Sheffield, PhD
www.databio.org/slides

Outline

Regulatory DNA
Basic analysis

20%
60%
20%

Chromatin accessibility protocols
◁ Questions ▷

What is regulatory DNA?


Regulatory DNA is a decision-maker

Challenges to studying regulatory DNA

  • Variation: age, cell-type, environment, disease
  • Amount: 1-2% of the genome is protein-coding vs. an estimated 8-20% regulatory
  • Target: what gene does it affect?
  • Function: is it a promoter, silencer, insulator, enhancer?
  • Rigidity: genetic code vs TF motifs

Figure panels: genetic code; transcription factor motif

We can computationally identify genes and even predict function. Regulatory DNA is more difficult.
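
To make the "rigidity" contrast concrete, here is a minimal sketch in Python. The codon entries come from the standard genetic code, but the motif matrix `TOY_MOTIF`, the uniform background, and the function names are hypothetical values chosen for illustration, not data from any real transcription factor.

```python
import math

# A few entries of the standard genetic code: each codon maps to exactly
# one amino acid (or stop, written "*"). The mapping is fixed.
CODON_TABLE = {"ATG": "M", "TGG": "W", "GCT": "A", "TAA": "*"}

# Hypothetical 4-position motif (made-up probabilities): each position only
# prefers certain bases, so many different sequences can match to a degree.
TOY_MOTIF = [
    {"A": 0.70, "C": 0.10, "G": 0.10, "T": 0.10},
    {"A": 0.10, "C": 0.10, "G": 0.70, "T": 0.10},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},  # uninformative position
    {"A": 0.10, "C": 0.70, "G": 0.10, "T": 0.10},
]
BACKGROUND = 0.25  # assumed uniform base frequency


def translate(codon: str) -> str:
    """Deterministic lookup: a codon either is in the table or it is not."""
    return CODON_TABLE[codon]


def motif_score(site: str) -> float:
    """Log-odds score of a site against the toy motif (higher = better match)."""
    if len(site) != len(TOY_MOTIF):
        raise ValueError("site length must equal motif length")
    return sum(
        math.log2(column[base] / BACKGROUND)
        for column, base in zip(TOY_MOTIF, site)
    )


if __name__ == "__main__":
    print(translate("ATG"))
    for site in ("AGAC", "AGTC", "CGAC", "CTAT"):
        print(site, round(motif_score(site), 2))
```

The codon lookup gives a single answer, whereas the motif score is graded: sites that differ at a weakly constrained position score the same, and partial matches still score above background. This is one reason regulatory elements are harder to predict from sequence alone.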

Chromatin accessibility

Chromatin accessibility is the degree to which nuclear macromolecules are able to physically contact chromatinized DNA...
[It] is determined by the occupancy and topological organization of nucleosomes as well as other chromatin-binding factors that occlude access to DNA.
Klemm et al. 2019

How can we identify regulatory DNA?

Figure sources: https://en.wikipedia.org/wiki/Chromatin; Alberts 2002

  • ChIP: Chromatin immunoprecipitation
  • DNase: classic 'gold standard' to identify open chromatin
  • ATAC: Assay for transposase-accessible chromatin
  • FAIRE: Formaldehyde-assisted isolation of regulatory elements
Trends

ChIP-seq

DNase-seq: Biology

ATAC-seq: Experiment (Buenrostro et al. 2013)

Transposase Tn5 protein (Reznikoff 2008)

Chromatin and transcription factors (Thurman et al. 2012)

Chromatin accessibility biology summary

  • Open chromatin usually coincides with active regulatory DNA
  • ... but accessibility alone does not provide the exact annotation or identify what is bound
  • ChIP-seq's advantage and its disadvantage both lie in its target: it is specific to one factor, but it requires an antibody and gives a more diffuse signal
  • ATAC is pronounced 'attack'

Basic data analysis steps

  • 1. Trim adapters (fastq -> fastq)
  • 2. Align reads (fastq -> bam)
  • 3. Shift reads (bam -> bam)
  • 4. Call peaks (bam -> bed)
ATAC-seq: 9 base duplication (Reznikoff 2008)
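
The 9-base duplication is the usual motivation for step 3 (shift reads). A minimal sketch in Python, assuming the commonly used convention from Buenrostro et al. 2013 of offsetting plus-strand reads by +4 bp and minus-strand reads by -5 bp. The `Read` type, the 0-based half-open coordinates, and the choice to move only the 5' end of each read are assumptions made for illustration. Some tools shift the whole read instead.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical aligned read in BED-style coordinates."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


PLUS_OFFSET = 4   # bp added to the 5' end of plus-strand reads
MINUS_OFFSET = 5  # bp subtracted from the 5' end of minus-strand reads


def shift_read(read: Read) -> Read:
    """Move the read's 5' end toward the center of the Tn5 insertion site."""
    if read.strand == "+":
        # The 5' end of a plus-strand read is its start coordinate.
        return read._replace(start=read.start + PLUS_OFFSET)
    if read.strand == "-":
        # The 5' end of a minus-strand read is its end coordinate.
        return read._replace(end=read.end - MINUS_OFFSET)
    raise ValueError(f"unknown strand: {read.strand!r}")


if __name__ == "__main__":
    print(shift_read(Read("chr1", 100, 150, "+")))
    print(shift_read(Read("chr1", 100, 150, "-")))
```

After shifting, the 5' ends of reads from both strands point at approximately the same position, the center of the transposase binding event, which makes the signal sharper for peak calling and footprinting.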

Tn5 molecular biology


Thank You


nsheff · databio.org · nsheffield@virginia.edu