CODIS Loci Ready for Disease Prediction, Vermont Court Says

A trial court in Vermont has gone where no court has gone before. In State v. Abernathy [1], Chittenden Superior Court Judge Alison Sheppard Arms found that because “[s]ix CODIS loci … have associations with an increased risk of disease or have functional properties,” the custodians of law enforcement DNA databases can make “probabilistic predictions of disease.” According to the judge, modern research has established that “some of the CODIS loci have associations with identifiable serious medical conditions,” making the scientific evidence “sufficient to overcome the previously held belief[s]” about the innocuous nature of the CODIS loci.

Emphasizing this finding that richly information-laden STR profiles reside in identification databases, the court proceeded to strike down “Vermont’s new pre-conviction DNA testing requirement … that requires submission of a DNA sample from a ‘person for whom the court has determined at arraignment there is probable cause that the person has committed a felony … .'” In an atypical opinion, the court applied a “special needs” balancing test, placed the burden of proof on the state, and held that this law violates the state constitution.

A major theme in Judge Arms’ discussion of human genetics is that there has been a revolution in our understanding of what used to be called “junk DNA.” Even though the CODIS loci originally were described as “junk” in “good faith,” that understanding was wrong–we now know that even DNA that does not code for proteins is biologically important.1 Other judges, advocacy groups, and at least one law professor have jumped from the discovery that the triplet code for proteins is not the sole message inscribed in DNA to the conclusion that all the CODIS loci may well convey significant information about disease states or propensities.

There are a couple of problems with this reasoning. All that we actually know is that some non-protein-coding DNA regulates gene expression. Scientists do not believe that all non-protein-coding sequences are regulatory. In particular, whether noncoding, nontranscribed, and largely nonconserved sequences are part of a regulatory system (even if their presence might have some function) is far from established.2 The opinion cites an essay I wrote making this point [3] but then ignores its content. It quotes the legal treatise, Modern Scientific Evidence, for the view that “while it is generally agreed that no single loci [sic] contains a gene that definitively determines any discernible characteristic of significance, there are nonetheless indications that they may play a role in some sensitive matters, and continued debates about their importance.” Before Abernathy, it appeared that the “continued debates” ended five years ago with agreement on what already was known — that even if the loci do not play a functional role, they might, like certain fingerprint patterns or blood types, have some statistical associations with diseases.3

Venturing beyond inconclusive generalities like these, Abernathy refers to the biomedical literature on five loci and to a testifying expert’s characterization of the literature (with no specific references) on another locus. The opinion does not give the magnitude of any putative association, let alone any measure of predictive utility.4 It uses the following phrases: “a fairly large effect size,” “a modest association,” “not the most strongly associated,” “small but … not zero,”5 and “cannot find that this marker has no association.” It does not provide measures of the uncertainty in these estimates. Finally, the opinion does not discuss the extent to which the studies said to prove the associations have been replicated.6

Of course, few judges could confidently review the flood of studies on human genetics. Unlike some previous opinions and law review articles, however, this opinion does not rely entirely or largely on newspaper headlines and stories about “junk DNA.” Here, the iconoclastic findings came after an evidentiary hearing. But, as has happened before with DNA evidence [8], the evidentiary hearing was one-sided. The defendants presented the testimony of Professor Gregory Wray of Duke University, a specialist in genetics and evolutionary biology, and the state did not present an expert in medical genetics or genomics to counter his testimony. Although Professor Wray reviewed the biomedical literature before he testified, the defense submitted no written report, and the state rather than the defense introduced the papers cited in the opinion as exhibits. From my scan of the testimony, it seems that Dr. Wray never was asked a series of critical questions:

  1. Is it generally accepted that the associations he pointed to apply to the population of individuals whose DNA is placed in law enforcement databanks?
  2. Assuming that they do apply to that population, what are the positive and negative predictive values of any inference about disease status or propensity derived from these particular CODIS alleles?
  3. How would the predictive or diagnostic disease-related information in a state DNA database compare to that of (a) color photographs, (b) fingerprints, (c) blood types used in conventional serology, and (d) the HLA-A and HLA-B haplotypes that used to be a mainstay of parentage testing?
  4. Are the CODIS genotypes likely to be substantially more predictive in the future?

Until these questions are answered, there is reason to ask whether the trial court’s findings fairly represent the status quo or instead are grim predictions of what could come to pass.
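
To see why question 2 matters, and why note 4 cautions that a large relative risk need not make a useful test, consider a purely hypothetical calculation. The Python sketch below is an illustration only. It is not drawn from the Abernathy record or from the studies the opinion cites. The prevalence, carrier frequency, and relative risk are invented numbers, and the function name predictive_values is made up for this example. It treats "carries the allele" as a yes-or-no predictor and assumes a single relative risk applies uniformly across the population.

```python
def predictive_values(prevalence, carrier_freq, relative_risk):
    """Return (ppv, npv) when 'carries the allele' is used to predict disease.

    prevalence    -- P(disease) in the whole population (hypothetical input)
    carrier_freq  -- P(carrier) in the whole population (hypothetical input)
    relative_risk -- P(disease | carrier) / P(disease | non-carrier)
    """
    if not (0 < prevalence < 1 and 0 < carrier_freq < 1 and relative_risk > 0):
        raise ValueError("prevalence and carrier_freq must be in (0, 1); "
                         "relative_risk must be positive")

    # Law of total probability:
    #   prevalence = carrier_freq * RR * r0 + (1 - carrier_freq) * r0
    # where r0 is the disease risk among non-carriers.
    risk_noncarrier = prevalence / (carrier_freq * relative_risk
                                    + (1 - carrier_freq))
    risk_carrier = relative_risk * risk_noncarrier
    if risk_carrier > 1:
        raise ValueError("inputs are mutually inconsistent")

    ppv = risk_carrier           # P(disease | carrier)
    npv = 1 - risk_noncarrier    # P(no disease | non-carrier)
    return ppv, npv


# Invented numbers: a 1% condition, an allele carried by 10% of people.
for rr in (1.0, 3.0):
    ppv, npv = predictive_values(prevalence=0.01, carrier_freq=0.10,
                                 relative_risk=rr)
    print(f"relative risk {rr}: PPV = {ppv:.4f}, NPV = {npv:.4f}")
```

With these made-up inputs, a relative risk of 3 lifts a carrier's chance of disease from the 1 percent baseline to only about 2.5 percent, so the overwhelming majority of people "flagged" by the allele would never develop the condition. A relative risk of 1 simply returns the baseline prevalence, which is the sense in which 1, rather than 0, marks the absence of an association.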


1. For a short audio clip reporting on the revolutionary discoveries, see Joe Palca, Don’t Throw It Out: ‘Junk DNA’ Essential In Evolution, All Things Considered, Aug. 19, 2011 (with a sound bite from Professor Gregory Wray, among other interviewees).

2. According to Judge Arms, “[t]he term ‘junk DNA’ was coined in the early 1980s.” In fact, the phrase normally is attributed to Susumu Ohno, who used it in the title of a 1972 paper [2]. Ohno did not reason that “we don’t know what noncoding DNA does, therefore, it is useless junk.” Indeed, he proposed that the duplication and inactivation of genes produce non-protein-coding DNA (now designated pseudogenes) that might have a function. A video introducing Ohno and reading an excerpt from the paper about the role of the noncoding sequences as “spacers” with evolutionary importance is available online. Since 1972, other possible functions for noncoding DNA have been proposed. Some functions imply that the sequences should be conserved as one species evolves into another. Others, such as Ohno’s suggestion that noncoding sequences act as buffers between genes, do not.

3. See [5, p. 228] (referring to “a brief debate in the legal literature” necessitated by “a misunderstanding by Simon Cole over some of the things I [John Butler] had written in a review article on STR markers” and emphasizing that “STR markers used for human identity testing do not predict disease.”). One source of confusion, which also infects the Abernathy opinion, is the thought that a statistical association between a locus and a disease detected in a family study in, say, Northern India, establishes that the same association exists throughout the population in the United States.

4. Even a strong association (large relative risk) would not make for a useful predictive test if the prevalence of the condition is very small. See [3].

5. The sentence “[t]he relative risk of developing schizophrenia associated with this marker is small but it is not zero” is technically flawed. A relative risk of 1, not 0, would express the absence of any association.

6. Replication is always important, and the problem of false positives is especially acute with genome-wide association studies. See, e.g., [6, 7].


1. State v. Abernathy, No. 3599-9-11 (Vt. Super. Ct. June 1, 2012).

2. S. Ohno, So Much “Junk” DNA in our Genome, 23 Brookhaven Symp. Biol. 366 (1972) (also published in Evolution of Genetic Systems 366 (H.H. Smith ed. 1972)).

3. David H. Kaye, Please, Let’s Bury the Junk: The CODIS Loci and the Revelation of Private Information, 102 Nw. U. L. Rev. Colloquy 70 (2007).

4. David H. Kaye, Mopping Up After Coming Clean About “Junk DNA”, Nov. 23, 2007, available at

5. John M. Butler, Advanced Topics in Forensic DNA Typing: Methodology (2012).

6. D.J. Hunter & P. Kraft, Drinking from the Fire Hose–Statistical Issues in Genomewide Association Studies, 357 N. Engl. J. Med. 436 (2007).

7. Thomas A. Pearson & Teri A. Manolio, How to Interpret a Genome-wide Association Study, 299 J. Am. Med. Ass’n 1335 (2008).

8. David H. Kaye, The Double Helix and the Law of Evidence (2010).

Cross-posted to Forensic Science, Statistics, and the Law.