
“A DNA Fingerprint Is Nothing More than a Long List of Numbers”

In Our DNA Is Our Blueprint, I suggested that the Supreme Court would be wise to avoid describing DNA as the equivalent of a building’s blueprint or a medical history. The Court also should be wary of efforts to dismiss a DNA identification profile as a mere “string of numbers” with no further implications. Brief for the United States Amicus Curiae Supporting Petitioner, Maryland v. King, No. 12-207, Jan. 2, 2013, at 2. According to the government:

The analysis of the genetic material … reveals nothing private about the arrestee at all. … A DNA fingerprint is nothing more than a long list of numbers … they do not encode protein sequences — that is, they do not “code” for physical traits, propensities, or susceptibilities. … A DNA fingerprint therefore yields no private information at all. … In short, the number string does not give rise to any inference about the personal information or characteristics of the person to whom it uniquely belongs. Obtaining those numbers therefore does not meaningfully invade an arrestee’s privacy.

Id. at 19-22.

Here, I demonstrate that inferences from a CODIS profile to facts that individuals would reasonably regard as private are not inherently impossible. Whether this fact weighs heavily in favor of the individual is, of course, a further question. Parts I and II probe the government’s contention that obtaining “the number string” “does not meaningfully invade an arrestee’s privacy” because there is “no private information at all.” Part III offers a few thoughts on what follows if the government is wrong about the information content of the DNA sequence data.

I. Strings of Numbers

A social security number is a string of digits, but until 2011, the string contained an area number (the first three digits) based on the state in or from which the application for a number was made. For example, a pre-2011 number starting with 520 refers to an application from within Wyoming. Social Security Administration, Social Security Number Allocations, Jan. 2, 2013. This information might not seem intensely private, but the information content is not zero.

Today’s social security numbers are a random nine-digit string (with some possibilities excluded). Social Security Administration, Social Security Number Randomization, Nov. 29, 2012. CODIS profiles are different in at least three respects. First, parts of them could be close to (and therefore correlated with) disease-causing loci. However, correlations that permit accurate inferences about health status from the profile itself are not known. See Scientists Brief on CODIS Loci, Jan. 3, 2013.

Second, in the future some parts of the profiles might be shown to play a causal role in gene regulation, affecting the quantity of a protein produced in a cell. Id. Indeed, one CODIS locus has been shown to participate in a regulatory system, but this does not mean that it is like a medical record. Brief of Genetics, Genomics and Forensic Science Researchers as Amici Curiae, Maryland v. King, No. 12-207, Dec. 28, 2012. The government’s brief suggests that if this situation were to change, the Fourth Amendment balancing would need to be re-examined. That also would be the case if correlations with more predictive power were to be discovered.

Finally, unlike numbers assigned by the Social Security Administration, a new CODIS profile does not emerge from a random-number generator every time a child is born. Rather, the child’s profile is a mixture of pre-existing numbers. A child inherits a random half of the father’s numbers and a random half of the mother’s numbers. This aspect of sexual reproduction has immediate implications for privacy.
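The inheritance rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the locus names and the numbers are invented, each profile is reduced to three loci, and mutation is ignored.

```python
import random

def inherit(mother, father, rng):
    """Build a child's profile: at each locus, one number from each parent.

    Simplifying assumption: no mutation, so every number the child
    carries is copied unchanged from a parent.
    """
    child = {}
    for locus in mother:
        from_mother = rng.choice(mother[locus])  # random half of her numbers
        from_father = rng.choice(father[locus])  # random half of his numbers
        child[locus] = tuple(sorted((from_mother, from_father)))
    return child

# Hypothetical profiles: invented locus names and invented numbers.
mother = {"LOCUS_A": (12, 14), "LOCUS_B": (8, 11), "LOCUS_C": (29, 30)}
father = {"LOCUS_A": (13, 15), "LOCUS_B": (9, 11), "LOCUS_C": (28, 31)}

child = inherit(mother, father, random.Random(0))
print(child)
```

Whatever the random draw, no number appears in the child's profile that was not already present in one of the parents' profiles.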

II. Inherited Numbers Carry Some Information

The fact that seemingly empty numbers are inherited via sexual reproduction complicates the privacy analysis in several ways. It means that siblings will have numbers that, on average, are closer to one another's than to those of unrelated persons, and that a parent and child will have at least one of the two numbers at every locus in common (barring a rare mutation). Consequently, a curious database administrator could compare the profiles of pairs of arrested individuals to draw inferences about possible genetic relationships. Most inferences of specific relationships would be wrong; for example, many nonsiblings would show more similarities in their profiles than true siblings would. Nevertheless, many pairs could securely be said not to be parent and child.
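To make the parent-and-child comparison concrete, here is a minimal Python sketch of the kind of check a curious administrator could run. The function names, locus names, and numbers are all hypothetical, and the sketch ignores mutation and typing error, which real parentage testing must take into account.

```python
def inconsistent_loci(profile_a, profile_b):
    """List the loci at which two profiles have no number in common."""
    shared_loci = sorted(set(profile_a) & set(profile_b))
    return [
        locus
        for locus in shared_loci
        if not set(profile_a[locus]) & set(profile_b[locus])
    ]

def could_be_parent_and_child(profile_a, profile_b):
    """True if the pair shares at least one number at every locus.

    Simplifying assumption: no mutation and no typing error, so a
    single locus with nothing in common rules the relationship out.
    """
    return not inconsistent_loci(profile_a, profile_b)

# Hypothetical three-locus profiles with invented numbers.
person_x = {"LOCUS_A": (12, 13), "LOCUS_B": (9, 11), "LOCUS_C": (28, 30)}
person_y = {"LOCUS_A": (12, 14), "LOCUS_B": (8, 11), "LOCUS_C": (29, 30)}
person_z = {"LOCUS_A": (15, 16), "LOCUS_B": (9, 10), "LOCUS_C": (31, 32)}

print(could_be_parent_and_child(person_x, person_y))  # True
print(inconsistent_loci(person_x, person_z))  # ['LOCUS_A', 'LOCUS_C']
```

A result of True shows only that the relationship is not excluded. Unrelated people can share a number at every locus by chance, which is why an inference that a specific relationship exists is far less secure than an exclusion.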

Usually, these possible inferences would be unimportant. Most people are not my parents. But suppose a candidate for sheriff were a strong challenger in an upcoming election, and she, her husband, and her adult child (born during the course of the marriage) were arrested. The CODIS profiles could be used for parentage testing. A finding of nonpaternity would be proof of the candidate’s marital infidelity. The fact that “[a] DNA fingerprint is nothing more than a long list of numbers” does not mean the “fingerprint” is devoid of socially significant information.

III. Does It Matter?

The government’s brief points out that the CODIS system makes such abuses difficult to accomplish at the level of the national database (NDIS). See Brief for United States, at 19-20:

DNA identification profiles stored by CODIS — as Maryland law contemplates — have no identifying information associated with them. CODIS contains the number-string itself and information about the laboratory that generated it; only in the event of a “hit” in the database can the record ultimately be traced back to a particular arrestee. See CODIS and NDIS Fact Sheet.

But the state or local laboratory that prepared the profile does not need to trace it “back to a particular arrestee.” Someone there already knows to whom the profile belongs. Moreover, the hypothetical does not involve NDIS. It involves a corrupt sheriff intent on learning the CODIS profiles of known individuals from samples taken by his officers.

A more convincing response is that unusual, unauthorized, and unlikely privacy abuses are not weighty enough to overcome strong government interests in collecting biological material. Maryland retains the original DNA samples. It could test them for a large number of genetic conditions. Urine samples in drug testing programs could be examined for disease-related information. As the government points out, these possibilities do not render the collection and statutorily limited analysis and use of the material unconstitutional. Id. at 23-24.

The government goes too far, however, when it suggests that the risk of abuse is “irrelevant.” Id. at 24. The Court should not blind itself to the possibility of abuses of power, of bad faith, and of temptations to cut corners. But neither should it mistake the possible for the probable. Unless the possibilities for abuse are substantial, they should not invalidate a program that truly serves strong state interests.

Cross-posted to Double Helix Law

CODIS Loci Not Ready for Disease Prediction After All?

Last month, I noted the findings of a superior court in Vermont that “some of the CODIS loci have associations with identifiable serious medical conditions,” making the scientific evidence “sufficient to overcome the previously held belief[s]” about the innocuous nature of the CODIS loci [1]. The judge based her conclusion in State v. Abernathy [2] that the CODIS loci now permit “probabilistic predictions of disease” on the unpublished views of biologist Greg Wray, who oversees the Center for Evolutionary Genomics and the DNA Sequencing Core Facility, within Duke University’s Institute for Genome Sciences and Policy.

A technical report accepted for publication in the Journal of Forensic Sciences seems to dispute these claims. Sara Katsanis, a staff researcher at the same Duke Institute for Genome Sciences and Policy, and Jennifer Wagner, a postdoc at the University of Pennsylvania’s Center for the Integration of Genetic Healthcare Technologies, searched the biomedical literature and genomic databases for associations with phenotypes, not only at the 13 loci currently used in offender databases but also at loci that may soon be added to the system. They came up with “no evidence” that any particular CODIS single-locus STR genotypes “are indicative of phenotype.”


1. CODIS Loci Ready for Disease Prediction, Vermont Court Says, June 15, 2012.
2. State v. Abernathy, No. 3599-9-11 (Vt. Super. Ct. June 1, 2012).

Cross-posted to Forensic Science, Statistics, and Law Blog