Research overview

by Nathan Sheffield

Slides at http://databio.org/research_overview/

Every cell in your body has the same genome*

Yet your cells are diverse...

How can a single set of instructions
give rise to such a complex multicellular being?

The answer lies in the process of differentiation,
wherein cells assume individual functions and roles

By packaging DNA differently,
each cell uses only a subset of instructions
Rosa and Shaw 2013. Biology

The epigenome


If we can measure how DNA is packaged,
we can understand what a cell is doing

DNA sequencing technology is also useful
for measuring quantitative cellular signals





But huge, diverse datasets lead to computational challenges.
My research develops and applies computational methods
to organize, analyze, and understand large epigenomic data.

Biological questions

  • How are genes regulated?
  • What differences in DNA use distinguish two cell types?
  • What changes in DNA use make a cancer cell proliferate?
  • How does DNA use change in a dynamic process, like cell division, differentiation, or response to environment?

Computational approaches

Regulatory elements database

Use clustering algorithms to annotate regulatory DNA by cell-type specificity. We used this to identify cell-type patterns of regulatory elements specific to cancer cells.
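To make the clustering idea concrete, here is a minimal sketch, not the method used to build the database. It assumes each regulatory element is described by a vector of signal values, one per cell type, and groups elements with plain k-means (Lloyd's algorithm). The function name `kmeans`, the toy signal values, and the choice of k-means itself are all illustrative assumptions.

```python
def kmeans(points, centroids, n_iter=10):
    """Plain Lloyd's k-means with caller-supplied starting centroids.

    points:    list of equal-length numeric vectors (one per regulatory element)
    centroids: list of starting vectors (one per cluster)
    Returns (labels, centroids) after n_iter assign/update rounds.
    """
    labels = []
    for _ in range(n_iter):
        # Assignment step: each element joins its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((x - y) ** 2 for x, y in zip(p, centroids[k])),
            )
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(old)
        centroids = updated
    return labels, centroids


# Toy data (hypothetical): rows are regulatory elements,
# columns are signal in three cell types (blood, liver, tumor).
signal = [
    [0.9, 0.1, 0.1],
    [0.8, 0.0, 0.2],
    [0.1, 0.9, 0.0],
    [0.0, 0.8, 0.1],
    [0.1, 0.1, 0.9],
]
labels, centers = kmeans(signal, centroids=[signal[0], signal[2], signal[4]])
```

Each resulting cluster can then be read as a cell-type specificity pattern: in the toy data, the last element forms a cluster of its own with high signal only in the tumor column.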

LOLA (Locus Overlap Analysis)

A method to test new genomic regions for enriched overlap with known regulatory DNA, giving context to newly generated data.
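The overlap-enrichment idea can be sketched in a few lines. This is an illustration under stated assumptions, not LOLA's actual implementation: regions are `(chrom, start, end)` tuples with half-open coordinates, the query regions are a subset of a background "universe", and enrichment is scored with a one-sided Fisher's exact test. All function names and the toy regions are hypothetical.

```python
from math import comb


def overlaps(region, region_set):
    """True if region (chrom, start, end) overlaps any interval in region_set."""
    chrom, start, end = region
    return any(c == chrom and s < end and start < e for c, s, e in region_set)


def fisher_enrichment_pvalue(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null of no association.
    """
    n = a + b + c + d
    hits = a + b      # universe regions that overlap the reference set
    drawn = a + c     # regions in the query
    upper = min(hits, drawn)
    tail = sum(
        comb(hits, k) * comb(n - hits, drawn - k) for k in range(a, upper + 1)
    )
    return tail / comb(n, drawn)


def overlap_enrichment(query, universe, reference):
    """Contingency table and p-value for query-vs-reference overlap.

    Assumes every query region is also present in the universe.
    """
    a = sum(overlaps(r, reference) for r in query)            # query, overlapping
    b = sum(overlaps(r, reference) for r in universe) - a     # non-query, overlapping
    c = len(query) - a                                        # query, not overlapping
    d = len(universe) - a - b - c                             # non-query, not overlapping
    return (a, b, c, d), fisher_enrichment_pvalue(a, b, c, d)


# Toy example (hypothetical coordinates): ten background regions,
# three of which form the query.
universe = [("chr1", i * 1000, i * 1000 + 200) for i in range(10)]
query = universe[:3]
reference = [("chr1", 50, 150), ("chr1", 1100, 1300),
             ("chr1", 2000, 2100), ("chr1", 7000, 7100)]
table, p_value = overlap_enrichment(query, universe, reference)
```

In practice the same test would be repeated against every reference region set in a database, and the sets ranked by their p-values or odds ratios.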

MIRA
(Methylation-based Inference of Regulatory Activity)

Predict DNA binding from DNA methylation data. We used it to predict cancer progression in Ewing sarcoma tumor samples.
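One way to picture this kind of inference is shown below as a rough sketch, not MIRA's actual scoring code. It assumes that bound sites tend to show lower methylation at their center than in the flanks, averages methylation in bins around a set of sites, and summarizes the dip as a log ratio of flank to center. The function names, bin layout, and toy methylation values are all hypothetical.

```python
from math import log


def aggregate_profile(sites, methylation, flank=500, n_bins=5):
    """Average methylation in n_bins equal-width bins spanning each site +/- flank.

    sites:       list of (chrom, center) positions
    methylation: dict mapping (chrom, position) -> methylation level in [0, 1]
    Bins with no measurements are returned as None.
    """
    width = 2 * flank / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for chrom, center in sites:
        for (c, pos), level in methylation.items():
            offset = pos - center
            if c == chrom and -flank <= offset < flank:
                bin_index = int((offset + flank) // width)
                sums[bin_index] += level
                counts[bin_index] += 1
    return [s / n if n else None for s, n in zip(sums, counts)]


def dip_score(profile):
    """log(mean of the two outermost bins / central bin); larger means a deeper dip."""
    center = profile[len(profile) // 2]
    left, right = profile[0], profile[-1]
    if center is None or left is None or right is None or center <= 0:
        raise ValueError("profile needs positive center and non-missing edge bins")
    return log(((left + right) / 2) / center)


# Toy example (hypothetical values): one site with low methylation at its center.
sites = [("chr1", 1000)]
methylation = {("chr1", 600): 0.8, ("chr1", 1000): 0.2, ("chr1", 1400): 0.8}
profile = aggregate_profile(sites, methylation)
score = dip_score(profile)
```

A score near zero would indicate a flat profile, while a clearly positive score would indicate the kind of central dip this sketch treats as a sign of binding.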

Looper and pypiper

A standardized way of annotating large datasets to make sharing and re-analyzing data easier.

Thanks for listening!
