Tag Archives: STRs

Scientists’ Brief on CODIS Loci

On November 9, 2012, the Supreme Court voted to review a case posing the following question: “Does the Fourth Amendment allow the States to collect and analyze DNA from people arrested and charged with serious crimes?” In Maryland v. King, the state’s supreme court concluded that the protection against unreasonable searches and seizures forbids the state from collecting DNA from an individual whose true identity can be established with ordinary fingerprints. On December 28, 2012, the Supreme Court received a Brief of Genetics, Genomics and Forensic Science Researchers as Amici Curiae. Below are ten questions and answers about the brief.

Who contributed to the brief?

I did, and Hank Greely was an additional author. The scientists who participated in the writing are all active and distinguished researchers at medical schools (including Harvard, Yale, and Johns Hopkins) or universities (including Duke, Penn State, and King’s College London). They include a former president of the American Society of Human Genetics, a past president of the American Board of Medical Genetics, Fellows of the American Association for the Advancement of Science, and members of the Institute of Medicine and the American Academy of Arts and Sciences.

Why did these law professors, medical and statistical geneticists, and molecular biologists submit an amicus brief?

The brief is intended “to inform the Court of the possible medical and social significance of the DNA data stored in law enforcement databases.” (P. 1). Advocacy groups, legal scholars, and some judges have asserted that the small number of features used in law enforcement DNA databases are predictive of health status (or soon will be). The brief attempts to clarify this issue.

Which side does the brief support?

The brief was submitted in support of neither side. It describes the nature of genetic information, the features of the genome used in law enforcement DNA databases, how those features are used in medical research, and whether they currently permit police, employers, or insurers to discern significant facts about a person’s present or future health status.

What conclusions does it reach?

Amici conclude that “[u]nlike medical genetic tests, law enforcement identification profiles have no known value for medical diagnosis or prediction of future health.” (P. 2).

That’s today. What about the future?

Amici caution that “no one can say with certainty what the future will bring, and it is possible that specific loci will be found to affect the operation of certain genes or to display correlations to disease states.” (P. 2). Nevertheless, they suggest that “it is unlikely that the identification profiles will turn into powerful medical diagnostic or predictive tools that can be used to infer disease states or predispositions by examining forensic database records.” (P. 2).

Does this mean that the “CODIS loci,” as the identifying features are called, have no medical significance?

Absolutely not. The DNA sequences have been used in medical research for some 20 years to hunt for disease-causing gene mutations. They have been studied for associations with diseases and traits such as longevity. The question the brief addresses is what kind of information can be gleaned from inspecting a database record.

Doesn’t the highly publicized ENCODE Project prove that there is no such thing as “junk DNA”?

The brief contends that debate over the fraction of the genome that is, in an evolutionary sense, ‘junk’ … is orthogonal to the matter before the Court. (P. 26). A section of the brief explains that the data sets and papers recently released from the international Encyclopedia of DNA Elements Project are important to further research into gene regulation and other matters, but they do not indicate that all DNA sequences are critical to health or other important traits. What “[t]he ENCODE papers show [is] that 80% of the genome displays signs of certain types of biochemical activity–even though the activity may be insignificant, pointless, or unnecessary.” (P. 32).

Well, how about other uses? Don’t the CODIS loci tell scientists a lot about a person’s ancestry and race?

Not really. The CODIS loci can reveal something about bio-geographic ancestry, but anthropologists and population geneticists use far more probative ancestry-informative and lineage markers to study genetic histories. That “race” is not a biological category is now well known. As for socially perceived race, “[a] CODIS profile could be used to calculate probabilities that someone would be described as Caucasian, African-American, or Hispanic, but categorical inferences would not be very accurate, and attempts to predict the census-type race of a person from a CODIS profile would seem pointless considering that apparent race already would be known.” (P. 36).

So the brief shows that there is absolutely no important information that can be deduced from a CODIS profile?

No, amici do not say that either. The brief explains that “[b]ecause children inherit all their DNA from their biological parents, the CODIS loci can be powerful tools for determining whether two people could be genetically related as parent and child. … [T]he most powerful genetic information other than identity that the CODIS profiles contain [would be] that two people are not parent and child” or “that two people were identical twins.” (Pp. 33-34).
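The parent-child point can be made concrete with a small sketch. It assumes a simplified rule: a parent and child must share at least one allele at every locus, so a locus with no shared allele counts against the relationship. The profiles below are invented for illustration (only the locus names are real CODIS loci), the function names are my own, and the sketch ignores mutation and the statistical analysis that real kinship testing requires.

```python
# Hypothetical profiles: locus name -> the two alleles (repeat counts) observed.
# All allele values are made up for illustration; they are not real data.
alleged_parent = {"D8S1179": (12, 14), "TH01": (6, 9.3), "FGA": (21, 24)}
child = {"D8S1179": (14, 15), "TH01": (7, 9.3), "FGA": (21, 21)}
unrelated = {"D8S1179": (10, 11), "TH01": (6, 8), "FGA": (22, 25)}


def excluded_loci(profile_a, profile_b):
    """Return the loci (typed in both profiles) at which no allele is shared."""
    common = profile_a.keys() & profile_b.keys()
    return sorted(
        locus
        for locus in common
        if not set(profile_a[locus]) & set(profile_b[locus])
    )


def could_be_parent_child(profile_a, profile_b):
    """Simplified rule: compatible only if every common locus shares an allele."""
    return not excluded_loci(profile_a, profile_b)


print(could_be_parent_child(alleged_parent, child))
print(excluded_loci(alleged_parent, unrelated))
```

Under this toy rule, the first pair is compatible at all three loci, while the second pair fails at two of them. Nothing in the comparison says anything about health; it is a matching exercise on repeat counts.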

Where can I find the brief?

Here’s a pdf file. It should appear, along with other briefs, “soon” in the American Bar Association’s Preview of Supreme Court cases.

Cross-posted to Forensic Science, Statistics, and the Law.

Replicating Samples to Invade Privacy?

It is tough for lawyers to get science right. I say this not to denigrate lawyers–I am one myself–but to stress the importance of taking the time and effort to communicate the scientific facts clearly so that the value judgments are persuasive. An article attacking the constitutionality of an Arkansas law on DNA sampling from arrestees illustrates this point. In “Step Out of the Car: License, Registration, and DNA Please,” Associate Professor Brian Gallini of the University of Arkansas School of Law gives an account that makes it appear that forensic DNA profiling reveals “the totality of a person’s genetic makeup” in order to arrive at an identification profile. At least, that is how the following exposition of DNA profiling for identification could be read:

[E]ven the layperson knows that taking a DNA sample requires an intrusion into the body, which thereafter reveals the totality of a person’s genetic makeup. … Although courts have characterized DNA swabs as only “minimally intrusive,” they do so without recognizing … the intrusion upon the arrestee’s interest in keeping the information revealed by a DNA sample private. From a buccal swab, the state obtains an analyzable sample of an arrestee’s DNA. That, in turn, allows the state to perform a polymerase chain reaction procedure (PCR), which involves replicating the DNA sample. This replication then allows the tester to look at “short tandem repeats” (STR). At this stage, the STRs reveal specific areas of DNA known as “loci.” In total, the tester is looking to isolate thirteen different loci in order to identify an individual’s exact genetic makeup. Once complete, the sample potentially “provides the instructions for all human characteristics, from eye color to height to blood type.”

What is wrong with this picture? Let me count the ways:

  1. PCR does not replicate the DNA sample. Human cells can replicate the full nuclear genome, but PCR can only replicate short stretches of DNA from targeted locations–the loci.
  2. Replication itself does not allow the tester to look at STRs. Visualization or ascertainment comes later.
  3. STRs do not “reveal specific areas of DNA known as ‘loci.’” An STR is a certain type of DNA sequence that occurs at, well, an STR locus. PCR primers used in forensic identification amplify only the sequences at these loci. The rest of the genome remains terra incognita.
  4. The tester is not seeking “to identify an individual’s exact genetic makeup.” Rather, the laboratory is seeking to ascertain a small number of variations that are not in genes (or not in the exons of genes).
  5. The physical sample was complete before it was typed. “Once complete,” the tiny profile cannot possibly “provide[] the instructions for all human characteristics, from eye color to height to blood type.” The STR typing never gives any instructions for phenotypes.
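To show how little an STR genotype contains, here is a minimal sketch that reduces two amplified fragments to a pair of repeat counts. The motif, the flanking sequences, and the function name are all hypothetical. Laboratories typically infer the repeat count from the length of the amplified fragment rather than by reading the sequence, so the string search is only a stand-in for that step.

```python
import re


def longest_repeat_run(fragment, motif):
    """Largest number of back-to-back copies of `motif` found in `fragment`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", fragment)
    return max((len(run) // len(motif) for run in runs), default=0)


# Made-up fragments: hypothetical flanking sequence around a run of "GATA" repeats.
motif = "GATA"
allele_a = "CCTG" + motif * 7 + "TCAG"
allele_b = "CCTG" + motif * 10 + "TCAG"

# The genotype recorded for this one locus is just a pair of numbers.
genotype = tuple(sorted(longest_repeat_run(f, motif) for f in (allele_a, allele_b)))
print(genotype)
```

The record for the locus is the pair of counts and nothing more. Repeating this at thirteen loci yields twenty-six numbers, which is the whole identification profile.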

Do these corrections mean that samples could not be used to gain information about human phenotypes such as eye color? Of course not. Eye color is a phenotype that can be deduced (in some instances) from genotyping. But such genotyping is not STR profiling.

And how much would it invade your privacy if a laboratory technician were to figure out your eye color in this roundabout way–instead of looking you in the eye? But that’s another story, and I have argued elsewhere against indefinite sample retention.


Brian Gallini, Step Out of the Car: License, Registration, and DNA Please, 62 Ark. L. Rev. 475 (2009)

Lost in the Junk

Today, New York Times reporter Gina Kolata reported that “Reanimated ‘Junk’ DNA Is Found to Cause Disease” [1]. The research paper in question [2] does not refer (in so many words) to “junk DNA,” for this term no longer is in vogue among geneticists. But the paper does identify the role of a pseudogene within a macrosatellite in causing a common form of muscular dystrophy (known as FSHD). (See also [3].)

I mention the article because of the legal literature and courtroom testimony on “junk DNA.” Some lawyers, judges, sociologists, and advocacy groups maintain that because parts of what once was called “junk DNA” are functional, it follows that the nonfunctional DNA sequences used for criminal identification databases could well be used to diagnose or predict disease status. Will this new discovery soon be cited as supporting this possibility? I hope not, because the discovery has little bearing on the privacy implications of forensic STR profiling.

Elsewhere, I have shown that the concern over medical utility of the forensic STRs ignores biologically important distinctions among different types of noncoding DNA [4, 5]. The fear is based on the following sort of argument:

  1. Some noncoding DNA has diagnostic or predictive utility.
  2. The STRs used for identification are noncoding.
  3. Therefore, the STRs used for identification have diagnostic or predictive utility.

Of course, one could just as well argue that

  1. Some mammals are vampire bats.
  2. Humans are mammals.
  3. Therefore, humans are vampire bats.
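The flaw in this form of argument can be checked mechanically with sets. The sketch below uses a toy universe of my own invention; the only point is that both premises can be true while the conclusion is false.

```python
def evaluate(category, special_members, subgroup):
    """Evaluate the two premises and the conclusion of the argument form."""
    return {
        "premise 1: some of the category is special": bool(category & special_members),
        "premise 2: the subgroup is in the category": subgroup <= category,
        "conclusion: the subgroup is special": subgroup <= special_members,
    }


# Hypothetical toy universe for the mammal version of the argument.
mammals = {"vampire bat", "human", "dolphin"}
vampire_bats = {"vampire bat"}
humans = {"human"}

result = evaluate(mammals, vampire_bats, humans)
for statement, truth in result.items():
    print(f"{statement}: {truth}")
```

Substituting "noncoding DNA" for the category, "sequences with diagnostic utility" for the special members, and "forensic STRs" for the subgroup gives the first argument. The premises alone settle nothing about the subgroup.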

It takes a more discerning analysis of the larger category (noncoding DNA or mammals) to make a meaningful risk assessment. Let’s see how this plays out here.

That a pseudogene is involved in the development of muscular dystrophy is an exciting discovery with immediate implications for research into other diseases. Apparently, the pseudogene is not entirely inactive. Indeed, it is an open reading frame that always is transcribed into RNA, but ordinarily the transcripts are not stable. FSHD patients, however,

carry specific single nucleotide polymorphisms (SNPs) in the chromosomal region distal to the last D4Z4 repeat. This FSHD-predisposing configuration creates a canonical polyadenylation signal for transcripts derived from DUX4, a double homeobox gene of unknown function that straddles the last repeat unit and the adjacent sequence. … DUX4 transcripts are efficiently polyadenylated and are more stable when expressed from permissive chromosomes [those with the right SNPs]. These findings suggest that FSHD arises through a toxic gain of function attributable to the stabilized distal DUX4 transcript. [2]

Let’s translate this into less technical language. Within a cell, the information in a gene gets “transcribed” into another molecule (RNA) and then translated into proteins. The “wrong” proteins can cause muscular dystrophy. A remnant of a gene on the fourth largest chromosome still contains instructions for a cell to make a protein, and these still get transcribed into RNAs. But normally the RNAs do not get translated into proteins because they fall apart fairly quickly. However, certain mutations elsewhere on the same chromosome cause the RNAs to become stable. Those stabilized RNAs are translated into the proteins that result in the disease. In short, the “junk DNA” is a gene that still produces transcripts, but the transcripts are not functional in most people.

In contrast, the microsatellites (STRs) used in forensics do not produce transcripts. Therefore, they cannot have the same effects as the DUX4 pseudogenes that do. The colorful writing about reanimating “junk” DNA notwithstanding, the recent findings about the D4Z4 macrosatellite cannot support the argument that forensic STR profiles are a real threat to medical privacy.

This does not mean that it is impossible for the STRs to have some role in bodily functioning and health. The human geneticists I have consulted do not see how this could be, but other geneticists have warned me not to rule it out. Fifty years from now, they say, we’ll know more than we do now. Of course, even if the STRs do something in some situations, this does not mean that the features of the STRs used for identification will have any medical significance.


1. Gina Kolata, Reanimated ‘Junk’ DNA Is Found to Cause Disease, N.Y. Times, Aug. 19, 2010

2. Richard J. L. F. Lemmers, Patrick J. van der Vliet, Rinse Klooster, Sabrina Sacconi, Pilar Camano, Johannes G. Dauwerse, Lauren Snider, Kirsten R. Straasheijm, Gert Jan van Ommen, George W. Padberg, Daniel G. Miller, Stephen J. Tapscott, Rabi Tawil, Rune R. Frants, and Silvere M. van der Maarel, A Unifying Genetic Model for Facioscapulohumeral Muscular Dystrophy, Science, DOI: 10.1126/science.1189044, Aug. 19, 2010

3. National Institute of Neurological Disorders and Stroke (NINDS), Discovery Opens Door to Therapeutic Development for FSH Muscular Dystrophy, NIH News, Aug. 19, 2010, http://www.nih.gov/news/health/aug2010/ninds-19.htm

4. D.H. Kaye, Science Fiction and Shed DNA, 101 Nw. U. L. Rev. Colloquy 62 (2006), http://www.law.northwestern.edu/lawreview/colloquy/2006/7/

5. D.H. Kaye, Please, Let’s Bury the Junk: The CODIS Loci and the Revelation of Private Information, 102 Nw. U. L. Rev. Colloquy 70 (2007), http://www.law.northwestern.edu/lawreview/colloquy/2007/25/