I use computation to ask and answer biological questions. My biological interest is gene regulation: How does DNA encode regulatory networks that enable cellular differentiation? Gene regulatory systems are finely tuned, and their breakdown can lead to diseases like cancer. To better understand normal and diseased gene regulation, I collect high-throughput genome-scale data in single cells and cell populations, and then use supercomputing, machine learning, and software engineering to answer questions about biological systems. I’ve worked mostly on Ewing sarcoma, a pediatric cancer.

This research is inherently interdisciplinary, approaching questions in biology and medicine with tools from computer science and statistics. As a result, I have affiliations with various centers, institutes, and departments.


  • R/Bioconductor tools for large-scale bioinformatic analysis
  • Integrating large genome-scale datasets using high performance computing
  • Applied machine learning and data mining
  • Single-cell analysis


  • Gene regulation, chromatin, epigenetics, and transcriptomics in dynamic systems
  • Cancer epigenomics, particularly in Ewing sarcoma and pediatric cancers
  • Stem cell developmental systems, particularly mesenchymal stem cells

Read more by perusing my publications: list of publications, Google Scholar, PubMed, or NCBI.


Though my work is computational, my projects generally have biological motivation. Here are some areas I’m currently investigating:

Gene regulation and chromatin structure

I am interested in how cells fold their DNA to enable complex regulatory patterns. Humans are made up of many different cell types. Though these cell types share a single genome, they have very different phenotypes and functions, working together to enable multicellular life. The basis for these differences is regulatory DNA, which governs when and where different genes are expressed. I analyze data from high-throughput ChIP-seq, DNase-seq, and ATAC-seq experiments to understand how cells establish these regulatory patterns during development. I am also interested in the evolutionary background of regulatory differences.

Selected relevant publications:
  • 2017 Nat. Med DNA methylation heterogeneity defines a disease spectrum in Ewing sarcoma.
  • 2016 Bioinformatics LOLA: enrichment analysis for genomic region sets and regulatory elements in R and Bioconductor.
  • 2015 Cell Reports Differential DNA Methylation Analysis without a Reference Genome.
  • 2015 Nat. Methods ChIPmentation: fast, robust, low-input ChIP-seq for histones and transcription factors.
  • 2015 Cell Reports Epigenome mapping reveals distinct modes of gene regulation and widespread enhancer reprogramming by the oncogenic fusion protein EWS-FLI1.
  • 2013 Genome Res Patterns of regulatory activity across diverse human cell-types predict tissue identity, transcription factor binding, and long-range interactions.
  • 2012 Genome Biol Chromatin accessibility reveals insights into androgen receptor activation and transcriptional specificity.
  • 2012 Nature An integrated encyclopedia of DNA elements in the human genome.
  • 2012 Genome Res Predicting cell-type-specific gene expression from regions of open chromatin.
  • 2012 Nature The accessible chromatin landscape of the human genome.
  • 2012 PLoS Genet Extensive Evolutionary Changes in Regulatory Element Activity during Human Origins Are Associated with Altered Gene Expression and Positive Selection.
  • 2012 Genes Identifying and characterizing regulatory sequences in the human genome with chromatin accessibility assays.
  • 2011 Genome Res Open chromatin defined by DNaseI and FAIRE identifies regulatory elements that shape cell-type identity.

Computational cancer epigenomics

I am interested in understanding how cancers commandeer the normal regulatory machinery to create disease. I use Ewing sarcoma, a pediatric tumor, as a model system because it is almost always driven by a single, well-characterized mutagenic event: a chromosomal translocation leading to the fusion protein EWS-FLI1. To explore how this fusion protein re-wires cells to proliferate uncontrollably, I am examining genome-wide epigenetic profiles of Ewing sarcoma. These questions raise the computational problems inherent in integrating large amounts of data from different individuals, cancers, and data types. This project is a collaboration with Eleni Tomazou and Heinrich Kovar at St. Anna Children's Cancer Research Institute in Vienna.

Selected relevant publications:
  • 2017 Nat. Med DNA methylation heterogeneity defines a disease spectrum in Ewing sarcoma.
  • 2016 Oncotarget The second European interdisciplinary Ewing sarcoma research summit - A joint effort to deconstructing the multiple layers of a complex disease.
  • 2015 Cell Reports Epigenome mapping reveals distinct modes of gene regulation and widespread enhancer reprogramming by the oncogenic fusion protein EWS-FLI1.

Single-cell sequencing analysis

In the past, we have only been able to sequence populations of cells, leaving important cell-to-cell differences unexplored. New microfluidics and sequencing technologies are making it possible to ask questions about single cells. Using these technologies, I am interested in fundamental questions about how cells differentiate and respond to their environments at the single-cell level.

Selected relevant publications:
  • 2017 Genome Biol Single-cell epigenomic variability reveals functional cancer heterogeneity.
  • 2016 Trends Biotechnol Multi-Omics of Single Cells: Strategies and Applications.
  • 2015 Cell Reports Single-cell DNA methylome sequencing and bioinformatic inference of epigenomic cell-state dynamics.

Literature threads

Here are lists of papers on relevant topics that interest me: