Recent Nature Genetics paper Q & A

September 19, 2016

We are excited about our recent paper, “Modeling disease risk through analysis of physical interactions between genetic variants within chromatin regulatory circuitry”. In this paper, we provide evidence that physical interactions between SNPs influence clinical risk of disease, and we present a model in which gene expression and clinical risk are dictated by multiple variants in a given locus. As we began sharing these findings with collaborators and presenting the results to the field, we discovered that certain terms and concepts used in our manuscript can mean different things to scientists with different backgrounds. For example, terms like “interaction” and “epistasis” have different meanings to biologists and statisticians (for more on this, see the 2005 paper by Jason Moore and Scott Williams, “Traversing the conceptual divide between biological and statistical epistasis: systems biology and a more modern synthesis”, BioEssays, 27: 637-646).

While we use this terminology with care in our manuscript, we thought it might be helpful to provide a brief Q & A addressing the common questions we have received when presenting these results. As with all papers from our lab, we strive to be open and transparent about our findings to facilitate critical assessment of the work.

1. How do we (the authors) define the term “interactions”?

This study uses the term “interactions” to refer to physical contacts between DNA variants mediated through higher-order chromatin folding. We use the term “physical interactions” to distinguish these from statistical interactions, which are not the focus of this study.

2. If the goal was to identify interactions among SNPs, why did we develop a new approach instead of using a more traditional approach to test for epistasis or additive effects?

Previous functional studies of enhancer-gene regulation have demonstrated that multiple individual enhancers can engage a target promoter. How these multiple elements combine to control gene expression remains an open question. Indeed, functional studies have revealed examples in which multiple enhancers act additively [1, 2], as well as examples in which the effect of an individual enhancer depends on the genotype and function of other enhancers that control the same gene [3-7].

Our approach was designed to be agnostic to the type of interaction, be it additive, epistatic, synergistic, or an as yet uncharacterized modality. Specifically, we control for the effect of one variable (i.e. the GWAS allele) to test for the effect of the other (i.e. the outside variant). The impact of the outside variant is measured separately for each GWAS genotype (i.e. non-risk/non-risk, non-risk/risk and risk/risk). Given that the types of interactions are not mutually exclusive, an advantage of this approach is that it captures multiple interaction scenarios. This two-tiered stratification approach is designed to identify variants that explain additional variation in gene expression beyond the effect of SNPs in linkage disequilibrium with the GWAS allele. It thus directly assesses our hypothesis that chromatin regulatory circuitry, which enables physical interactions between SNPs, is a critical determinant of gene expression and disease risk.
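To make the stratification idea concrete, here is a minimal sketch in Python. It is an illustration only, not the analysis code used in the paper: the function names, the 0/1/2 dosage coding, and the toy data are all assumptions, and it reports only a per-stratum effect size (a least-squares slope) without the significance testing a real analysis would require.

```python
from statistics import mean

def slope(x, y):
    """Least-squares slope of y on x; None if x does not vary."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        return None
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def stratified_outside_effect(gwas_dosage, outside_dosage, expression):
    """Hypothetical helper: estimate the outside-variant effect on
    expression separately within each GWAS genotype group."""
    strata = {}
    for g, o, e in zip(gwas_dosage, outside_dosage, expression):
        xs, ys = strata.setdefault(g, ([], []))
        xs.append(o)
        ys.append(e)
    # One slope per GWAS genotype (0, 1 or 2 copies of the risk allele).
    return {g: slope(xs, ys) for g, (xs, ys) in sorted(strata.items())}

# Toy data (invented for illustration): risk-allele counts per individual.
gwas = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
outside = [0, 1, 2, 1, 0, 1, 2, 0, 0, 0, 1, 2]
expression = [10, 10, 10, 10, 8, 9, 10, 8, 6, 6, 8, 10]

effects = stratified_outside_effect(gwas, outside, expression)
for genotype, effect in effects.items():
    print(f"GWAS genotype {genotype}: outside-variant slope = {effect}")
```

In this toy data set the outside variant has no effect among non-risk homozygotes and a progressively larger effect in the other two groups. Because each stratum is examined on its own, the same procedure would also pick up a purely additive outside variant (equal slopes in all three groups), which is the sense in which the approach is agnostic to the type of interaction.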

3. Are outside variants indicative of epistasis?

Our approach is not designed to distinguish between modes of interaction and does not directly assess epistasis.

4. How can we be sure that the observed effects are not due to a variant that is in low LD with both the GWAS SNPs and the outside variants, i.e. a “third variant”?

Recently, an alternative hypothesis was presented for reported epistatic interactions [8-10]. SNPs identified as interacting, and in statistical epistasis with one another, were subsequently shown to also be in low LD with a single “third SNP”. These “third SNPs” were shown to be an alternative explanation for the observed epistatic interactions. Similarly, we considered an alternative explanation for our results: that a single SNP partially linked to both the GWAS and outside variant may be responsible for the observed effects. While this possibility cannot be ruled out definitively, we investigated the likelihood that a single third variant could account for the majority of our results. We assessed whether a single variant could account for the effects of both the GWAS allele and the outside variant on both gene expression and clinical risk (see Online Methods). This assessment is, however, limited to variants that can be accurately detected: gene expression analyses were limited by distance to gene target and genotype frequency, and clinical risk assessments were limited to variants that could be accurately imputed. Thus, variants that are rare, insufficiently tagged by GWAS SNP arrays, or located far distal to gene targets could not be assessed.
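As a rough illustration of what “partially linked to both” means, the sketch below computes a simple LD proxy (the squared Pearson correlation between genotype dosages) and flags variants that exceed a threshold against both the GWAS and outside variant. This is not the procedure from our Online Methods; the function names, the `min_r2` cutoff of 0.1, and the toy genotypes are all assumptions chosen for illustration.

```python
from statistics import mean

def r_squared(a, b):
    """Squared Pearson correlation between two genotype dosage vectors."""
    ma, mb = mean(a), mean(b)
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    if saa == 0 or sbb == 0:
        return 0.0  # a monomorphic variant carries no LD information
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return sab * sab / (saa * sbb)

def candidate_third_variants(gwas, outside, others, min_r2=0.1):
    """Hypothetical helper: names of variants partially linked to BOTH
    the GWAS variant and the outside variant (min_r2 is an assumed cutoff)."""
    return [
        name
        for name, dosage in others.items()
        if r_squared(dosage, gwas) >= min_r2
        and r_squared(dosage, outside) >= min_r2
    ]

# Toy genotypes (invented): allele counts for eight individuals.
gwas = [0, 0, 1, 1, 2, 2, 1, 1]
outside = [0, 2, 1, 1, 0, 2, 1, 1]
others = {
    "snp_a": [0, 1, 1, 1, 1, 2, 1, 1],  # partially linked to both
    "snp_b": [0, 0, 1, 1, 2, 2, 1, 1],  # linked to the GWAS variant only
    "snp_c": [1, 1, 0, 2, 1, 1, 0, 2],  # linked to neither
}

candidates = candidate_third_variants(gwas, outside, others)
print(candidates)
```

Any variant flagged this way would still need to be tested for whether it actually accounts for the observed effects on expression and clinical risk, and variants that are not genotyped or imputed never enter `others` at all, which mirrors the limitation described above.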

5. If the outside variants are so important, why weren’t they identified through the standard GWAS?

Standard GWAS and eQTL studies evaluate the effects of individual SNPs. Our results are consistent with a model in which multiple SNPs can better explain gene expression levels than a single SNP alone. In this model, a single SNP may be insufficient to distinguish differences in gene expression levels, and likewise insufficient to be associated with disease risk.

Other recent studies have reported similar findings, in which gene expression levels are best explained by a model that considers the contributions of multiple variants [11, 12].

6. The observed effects must be rare. How often is this seen?

Surprisingly, these effects were not rare. Of the GWAS loci for which we could detect an active gene regulatory circuit (i.e. the GWAS-linked variants fall within putative regulatory elements, see Online Methods), 25% were associated with an outside variant that altered gene expression. Of these, 73% significantly altered clinical risk as well (see Supplemental Table 1 and Fig. 4d). These results are consistent with a model in which multiple SNPs frequently explain differences in target transcript levels better than single SNPs alone.

1. Lam, D.D., et al., Partially redundant enhancers cooperatively maintain Mammalian pomc expression above a critical functional threshold. PLoS Genet, 2015. 11(2): p. e1004935.
2. Bothma, J.P., et al., Enhancer additivity and non-additivity are determined by enhancer strength in the Drosophila embryo. Elife, 2015. 4.
3. Wiersma, E.J., et al., Role of the intronic elements in the endogenous immunoglobulin heavy chain locus. Either the matrix attachment regions or the core enhancer is sufficient to maintain expression. J Biol Chem, 1999. 274(8): p. 4858-62.
4. Perry, M.W., A.N. Boettiger, and M. Levine, Multiple enhancers ensure precision of gap gene-expression patterns in the Drosophila embryo. Proc Natl Acad Sci U S A, 2011. 108(33): p. 13570-5.
5. Jeong, Y., et al., A functional screen for sonic hedgehog regulatory elements across a 1 Mb interval identifies long-range ventral forebrain enhancers. Development, 2006. 133(4): p. 761-72.
6. Montavon, T., et al., A regulatory archipelago controls Hox genes transcription in digits. Cell, 2011. 147(5): p. 1132-45.
7. Perry, M.W., et al., Shadow enhancers foster robustness of Drosophila gastrulation. Curr Biol, 2010. 20(17): p. 1562-7.
8. Hemani, G., et al., Detection and replication of epistasis influencing transcription in humans. Nature, 2014. 508(7495): p. 249-53.
9. Wood, A.R., et al., Allelic heterogeneity and more detailed analyses of known loci explain additional phenotypic variation and reveal complex patterns of association. Hum Mol Genet, 2011. 20(20): p. 4082-92.
10. Wood, A.R., et al., Another explanation for apparent epistasis. Nature, 2014. 514(E3-E4).
11. Gusev, A., et al., Integrative approaches for large-scale transcriptome-wide association studies. Nat Genet, 2016. 48(3): p. 245-52.
12. Gusev, A., et al., Quantifying missing heritability at known GWAS loci. PLoS Genet, 2013. 9(12): p. e1003993.