Conveying Genotype-Phenotype Effect Size to a General Audience

By Castedo Ellerman

The Challenge

Purchasers of direct-to-consumer DNA tests are quick to assume that genotyped DNA determines genetic traits. Some consumers frame traits in a binary "nature vs nurture" framework. Others appreciate that most traits are influenced by a mix of "nature and nurture". A few might be familiar with statistical measures of effect size such as 'fraction of variance explained'. But even among consumers familiar with statistics, there is a tendency to assume that scientific findings on genotype-phenotype effect sizes are much stronger than they really are. Perhaps this bias arises from knowing genes only by a personally understood trait (e.g. "gene for being athletic").

Impression of Scale

A simple way to give an impression of whether effect sizes are large or small is to use word labels. Gene Heritage uses the labels "Highly/Fairly/Barely Predictable" to give an impression of a large, medium or small effect size. An earlier post ("How real are DNA test results?") discusses why a consumer should probably evaluate DNA test results as predictions. More detail can be communicated with numerical values and visualizations. Beyond adding detail, visualizations can also give readers a rough visual impression of the scale of a genotype-phenotype effect size.

Visualization

Notable research has been done in the field of presenting risk information to patients. The website iconarray.com has a great summary of existing research. Additional research focusing on genomic risk includes papers [1] and [2]. Research indicates that presenting risk information with pictographs (also called icon arrays or crowd figure pictograms) in addition to numbers helps a wider range of readers better understand levels of risk.

Risk and effect size are both forms of uncertainty, so using pictographs to convey effect size may yield similar improvements in comprehension. Here is an example of a pictograph used by Gene Heritage to convey the effect size of genetic sex on height:

100 Females    Height            100 Males
4 of 100       Above 177cm       46 of 100
16 of 100      177cm to 170cm    34 of 100
34 of 100      170cm to 163cm    16 of 100
46 of 100      Below 163cm       4 of 100

Quantification

For binary genotypes and binary phenotypes, metrics such as sensitivity, specificity, and odds ratio are widely used measures of effect size. For quantitative linear phenotypes, fraction of variance explained, such as heritability, is the most common measure of genotype-phenotype effect size.
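As a rough illustration of the binary metrics, the height pictograph can be collapsed into a 2x2 table. The sketch below assumes "male" stands in for the binary genotype and "taller than 170cm" for the binary phenotype; that split and the function name `binary_effect_sizes` are choices made here for illustration, not something Gene Heritage reports.

```python
def binary_effect_sizes(tp, fp, fn, tn):
    """Return (sensitivity, specificity, odds ratio) for a 2x2 table.

    tp: genotype present, phenotype present
    fp: genotype present, phenotype absent
    fn: genotype absent,  phenotype present
    tn: genotype absent,  phenotype absent

    The odds ratio is undefined when fp or fn is zero; this sketch does
    not handle that case.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    odds_ratio = (tp * tn) / (fp * fn)
    return sensitivity, specificity, odds_ratio


# Assumed collapse of the pictograph at 170cm:
#   males above 170cm:   46 + 34 = 80, males below:   16 + 4  = 20
#   females above 170cm: 4 + 16  = 20, females below: 34 + 46 = 80
sensitivity, specificity, odds_ratio = binary_effect_sizes(tp=80, fp=20, fn=20, tn=80)
print(sensitivity, specificity, odds_ratio)  # 0.8 0.8 16.0
```

Under this assumed split, sensitivity and specificity are both 0.8 and the odds ratio is 16. A different height threshold would give different values, which is one reason a measure that uses all the bins can be attractive.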

The previous post ("Predictability Score Definition") details the use of Information Theory to measure the fraction of genetic and phenotype information. This has the advantage of naturally handling non-linear categorical phenotypes (such as hair color). It also has the nice property of generalizing to binary and linear phenotypes. The result is a unified quantitative measurement that can be used across most kinds of phenotype categorizations.

Like measures of "variance explained", measuring fraction of information makes possible summary language such as "information explained" or "information gained".
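To make "fraction of information" concrete, here is a minimal sketch that assumes the fraction is the mutual information between genotype and phenotype divided by the entropy of the phenotype. That normalization is an assumption made for illustration; the actual definition is the one given in the "Predictability Score Definition" post and may differ. The counts are taken from the height pictograph, and the function names are hypothetical.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


def fraction_of_information(joint_counts):
    """Mutual information I(G;P) divided by phenotype entropy H(P).

    joint_counts[g][p] is the number of people with genotype g and
    phenotype category p.

    ASSUMPTION: dividing by H(P) is one plausible normalization, chosen
    here for illustration only. The phenotype must take at least two
    values with nonzero probability, otherwise H(P) is zero.
    """
    total = sum(sum(row) for row in joint_counts)
    joint = [[count / total for count in row] for row in joint_counts]
    p_genotype = [sum(row) for row in joint]
    p_phenotype = [sum(col) for col in zip(*joint)]
    h_joint = entropy([p for row in joint for p in row])
    # I(G;P) = H(G) + H(P) - H(G,P)
    mutual_information = entropy(p_genotype) + entropy(p_phenotype) - h_joint
    return mutual_information / entropy(p_phenotype)


# Rows are genetic sex, columns are the four height bins of the pictograph,
# ordered from "Above 177cm" down to "Below 163cm".
height_counts = [
    [4, 16, 34, 46],   # 100 females
    [46, 34, 16, 4],   # 100 males
]
fraction = fraction_of_information(height_counts)
print(f"fraction of height information: {fraction:.2f}")
```

With these counts the phenotype carries 2 bits (four equally likely bins overall), and under the assumed normalization genetic sex accounts for roughly 0.17 of that. The same function accepts any number of genotype rows and phenotype columns, which reflects the point above about handling binary, linear (binned) and categorical phenotypes with one measure.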

Motivations

The motivations for using information theory in the numerical measurement of effect size are to have a measurement that

  1. corresponds to the pictograph shown to genetic report readers (similar pictographs have similar numerical measurements)
  2. can be applied to binary, linear or categorical phenotypes
  3. ranges between zero and a finite upper bound (such as one or ten)
  4. can be summarized in more commonly known concepts like "information" rather than "variance"

A secondary benefit might be that "fraction of information" makes sense to consumers who are aware that the raw DNA data (from microarrays) provided by the popular direct-to-consumer services is limited.

References

  1. Lautenbach DM, Christensen KD, Sparks JA, Green RC. Communicating genetic risk information for common disorders in the era of genomic medicine. Annu. Rev. Genomics Hum. Genet. 2013;14:491–513.
  2. Smit AK, Keogh LA, Hersch J, et al. Public preferences for communicating personal genomic risk information: a focus group study. Health Expectations. 2016;19(6):1203–1214.