How strong are the justifications for retaining DNA samples after the identifying profiles are entered into a law enforcement database? According to the majority of the Ninth Circuit panel in Kriesel III, the “primary justification” is that “match confirmation” ensures “the continued accuracy and integrity of the CODIS system.” That is … [continued on the FSSL blog].
On November 9, 2012, the Supreme Court voted to review a case posing the following question: “Does the Fourth Amendment allow the States to collect and analyze DNA from people arrested and charged with serious crimes?” In Maryland v. King, the state’s supreme court concluded that the protection against unreasonable searches and seizures forbids the state from collecting DNA from an individual whose true identity can be established with ordinary fingerprints. On December 28, 2012, the Supreme Court received a Brief of Genetics, Genomics and Forensic Science Researchers as Amici Curiae. Below are ten questions and answers about the brief.
Who contributed to the brief?
I did, and Hank Greely was an additional author. The scientists who participated in the writing are all active and distinguished researchers at medical schools (including Harvard, Yale, and Johns Hopkins) or universities (including Duke, Penn State, and King’s College London). They include a former president of the American Society of Human Genetics, a past president of the American Board of Medical Genetics, Fellows of the American Association for the Advancement of Science, and members of the Institute of Medicine and the American Academy of Arts and Sciences.
Why did these law professors, medical and statistical geneticists, and molecular biologists submit an amicus brief?
The brief is intended “to inform the Court of the possible medical and social significance of the DNA data stored in law enforcement databases.” (P. 1). Advocacy groups, legal scholars, and some judges have asserted that the small number of features used in law enforcement DNA databases are predictive of health status (or soon will be). The brief attempts to clarify this issue.
Which side does the brief support?
The brief was submitted in support of neither side. It describes the nature of genetic information, the features of the genome used in law enforcement DNA databases, how those features are used in medical research, and whether they currently permit police, employers, or insurers to discern significant facts about a person’s present or future health status.
What conclusions does it reach?
Amici conclude that “[u]nlike medical genetic tests, law enforcement identification profiles have no known value for medical diagnosis or prediction of future health.” (P. 2).
That’s today. What about the future?
Amici caution that “no one can say with certainty what the future will bring, and it is possible that specific loci will be found to affect the operation of certain genes or to display correlations to disease states.” (P. 2). Nevertheless, they suggest that “it is unlikely that the identification profiles will turn into powerful medical diagnostic or predictive tools that can be used to infer disease states or predispositions by examining forensic database records.” (P. 2).
Does this mean that the “CODIS loci,” as the identifying features are called, have no medical significance?
Absolutely not. The DNA sequences have been used in medical research for some 20 years to hunt for disease-causing gene mutations. They have been studied for associations with diseases and traits such as longevity. The question the brief addresses is what kind of information can be gleaned from inspecting a database record.
Doesn’t the highly publicized ENCODE Project prove that there is no such thing as “junk DNA”?
The brief contends that “debate over the fraction of the genome that is, in an evolutionary sense, ‘junk’ … is orthogonal to the matter before the Court.” (P. 26). A section of the brief explains that the data sets and papers recently released from the international Encyclopedia of DNA Elements Project are important to further research into gene regulation and other matters, but they do not indicate that all DNA sequences are critical to health or other important traits. What “[t]he ENCODE papers show [is] that 80% of the genome displays signs of certain types of biochemical activity–even though the activity may be insignificant, pointless, or unnecessary.” (P. 32).
Well, how about other uses? Don’t the CODIS loci tell scientists a lot about a person’s ancestry and race?
Not really. The CODIS loci can reveal something about bio-geographic ancestry, but anthropologists and population geneticists use far more probative ancestry-informative and lineage markers to study genetic histories. That “race” is not a biological category is now well known. As for socially perceived race, “[a] CODIS profile could be used to calculate probabilities that someone would be described as Caucasian, African-American, or Hispanic, but categorical inferences would not be very accurate, and attempts to predict the census-type race of a person from a CODIS profile would seem pointless considering that apparent race already would be known.” (P. 36).
So the brief shows that there is absolutely no important information that can be deduced from a CODIS profile?
No, amici do not say that either. The brief explains that “[b]ecause children inherit all their DNA from their biological parents, the CODIS loci can be powerful tools for determining whether two people could be genetically related as parent and child. … [T]he most powerful genetic information other than identity that the CODIS profiles contain [would be] that two people are not parent and child” or “that two people were identical twins.” (Pp. 33-34).
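The parent-child point can be made concrete with a small sketch. Under the usual simplifying assumptions (no mutations, no typing errors), a child must share at least one allele with a biological parent at every locus, so a single locus with no shared allele points toward exclusion. The locus names below are real CODIS loci, but the profiles and function names are hypothetical and chosen only for illustration; actual kinship analysis relies on likelihood ratios and allows for mutation.

```python
from typing import Dict, List, Tuple

# A profile maps a locus name to the pair of alleles (repeat counts) observed there.
Profile = Dict[str, Tuple[float, float]]


def excluded_loci(a: Profile, b: Profile) -> List[str]:
    """Return the loci (typed in both profiles) at which no allele is shared."""
    common = sorted(set(a) & set(b))
    return [locus for locus in common if not set(a[locus]) & set(b[locus])]


def could_be_parent_child(a: Profile, b: Profile) -> bool:
    """True if the profiles share at least one allele at every common locus.

    Simplification: ignores mutation and typing error, so this is a
    screening rule for illustration, not a forensic standard.
    """
    return not excluded_loci(a, b)


# Hypothetical three-locus profiles (a real CODIS profile has more loci).
adult = {"D3S1358": (15, 17), "vWA": (16, 18), "FGA": (21, 24)}
child = {"D3S1358": (15, 16), "vWA": (18, 19), "FGA": (22, 24)}
stranger = {"D3S1358": (14, 16), "vWA": (17, 19), "FGA": (20, 22)}

print(could_be_parent_child(adult, child))     # shares an allele at every locus
print(could_be_parent_child(adult, stranger))  # no shared allele at some loci
print(excluded_loci(adult, stranger))
```

Note the asymmetry the brief describes: sharing an allele at every locus is merely consistent with parentage (unrelated people can share alleles by chance), whereas a failure to share is strong evidence that two people are not parent and child.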
Where can I find the brief?
Cross-posted to Forensic Science, Statistics, and the Law.
A few reporters have inquired about the Texas tale concerning “hundreds of dried blood samples [shipped] to the federal government to help build a vast DNA database–a forensics tool designed to identify missing persons and crack cold cases.” Earlier postings (on March 4 and 15) explained that contrary to the impression created in a series of Texas Tribune stories, the Guthrie cards were not destined for a database like the FBI’s National DNA Index System (NDIS) that matches DNA recovered from crime scenes against stored DNA profiles from convicted offenders or arrestees.
This was not to say that, as a matter of public policy and research ethics, the state Department of Health Services should have released its Guthrie cards for any research without first securing parental consent. That is an issue about which reasonable minds can and do differ. In judging the propriety of these decisions, however, it is vital to understand the true privacy risks that the dissemination of the samples entailed. Indeed, inasmuch as plaintiffs are demanding that the Armed Forces DNA Identification Laboratory (AFDIL) return the cards (which would lead to their incineration under Texas’s current policy), the question is crucial to the continuing tussle in Texas. If there is a substantial privacy risk, the demand obviously has more merit than if there is no such risk. Furthermore, because plaintiffs also want the researchers to extirpate the DNA sequence data in AFDIL’s anonymized research database, the issue is of national concern. The research database, although small, is important to a fair presentation of mtDNA matches in all criminal cases and to the correct interpretation of these findings during an investigation. This posting therefore provides an assessment of some of the risks that children conceivably could face from the presence of their mtDNA sequences in the research database and from their Guthrie cards in the laboratory’s files.
I. Use of the AFDIL Research Database
The population-genetics database for mitotypes poses a rather remote threat to the Texas newborns. The FBI explained how such databases work in 1999:
The FBI Laboratory, the Armed Forces DNA Identification Laboratory, and other laboratories have collaborated to compile a mtDNA population database . … The database is referred to as the SWGDAM (Scientific Working Group on DNA Analysis Methods) database. It contains sequences from four main racial groups: Caucasians, Africans, Hispanics, and Asians. Most of these samples have been obtained from paternity-testing laboratories, blood banks, or academic groups studying ethnic populations. The database currently contains 2,426 mtDNA sequences from unrelated individuals. However, the database is updated frequently and is constantly growing. …
When a sequence from a questioned sample [one found at a crime-scene] and a known [suspect’s] sample is the same, the SWGDAM database is searched for this sequence. … The FBI Laboratory lists the number of observations of a sequence in each racial subgroup of the database in a report of a mtDNA examination. For example, a sequence might be seen five times in the database samples of Caucasian descent and one time in the database samples of Hispanic descent yet not appear in the remaining database subgroups. …
[¶] Most of the sequences in the forensic mtDNA database occur a single time (approximately 60 percent), and the total number of mtDNA sequences in the entire human population is not known. Reliable frequency estimates for most mtDNA sequences are therefore not possible [because] small databases are not effective tools for estimating frequencies of rare events.
[¶] However, statistical methods exist for calculating an upper-bound estimate of the frequency of mtDNA types with zero occurrences or very few occurrences in a database of limited size. This upper-bound estimate describes the highest frequency expected for a particular mtDNA sequence using the database. … As the database grows in size, the frequency estimates for individual mtDNA profiles will become more and more refined and eventually lead to reliable population frequency estimates.
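To give a sense of what such an upper-bound estimate looks like, here is a minimal sketch of one commonly used approach. It is not necessarily the exact calculation the FBI Laboratory performs. For a mitotype never seen in a database of n sequences, the one-sided bound is 1 - alpha^(1/n); for a mitotype seen x times, the sketch uses the normal approximation p + z * sqrt(p(1 - p)/n) with p = x/n. The function names, the 95% confidence level, and the example count of five observations are assumptions made for illustration; only the database size of 2,426 comes from the quoted passage.

```python
import math


def upper_bound_unobserved(n: int, alpha: float = 0.05) -> float:
    """Upper confidence bound on the frequency of a type seen 0 times in n samples.

    If the true frequency were any higher, the chance of seeing the type
    zero times in n independent draws would fall below alpha.
    """
    return 1.0 - alpha ** (1.0 / n)


def upper_bound_observed(x: int, n: int, z: float = 1.96) -> float:
    """Normal-approximation upper bound for a type seen x times in n samples."""
    p = x / n
    return p + z * math.sqrt(p * (1.0 - p) / n)


DATABASE_SIZE = 2426  # size of the SWGDAM database in the 1999 description

# A mitotype that never appears in the database.
zero_bound = upper_bound_unobserved(DATABASE_SIZE)
print(f"never observed: at most about 1 in {1 / zero_bound:.0f} people")

# A mitotype seen five times (hypothetical count, pooled across subgroups).
five_bound = upper_bound_observed(5, DATABASE_SIZE)
print(f"observed 5 times: upper bound {five_bound:.4f}")
```

The sketch also shows why database size matters: with only a few thousand sequences, even a never-observed mitotype cannot be said to be rarer than roughly one in several hundred people, and the bound tightens only as n grows.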
When used to estimate the frequency of a mitotype in a crime-scene sample that matches a defendant in a criminal case, the population-genetics database poses no risk that a Texas baby will be accused of a crime–correctly or otherwise. But one can imagine a different scenario: Suppose that Inspector Javert is pursuing the perpetrator of a horrific crime in Texas. The usual suspects have excellent alibis. A search of the Texas DNA database of convicted offenders draws a blank. A search of NDIS also comes up negative. Javert, who never gives up, takes the mtDNA sequence from the crime scene and compares it to the sequences in the anonymized population-genetics database. Voila! He finds a match.
Javert demands that the custodians of the population reference database tell him whether it came from the 800 or so Texas samples. It does. Now he could go to the state health department to obtain the names of the 800 suspects (if such records still exist). Or better, if the cards supplied to AFDIL retained their original numbers and if the health department retained the numbers linked to personally identifying information, then Javert could find his way to one family, and he could investigate whether anyone in that maternal lineage could be the culprit. There might be other families with the same mitotype, but Javert at least would have found a lead.
Although the Javert scenario is fictional, it is not impossible. Even anonymized population reference mtDNA databases could lead the police to a family in some situations. But it’s a stretch.
II. Genetic Discrimination
Every tissue repository contains human biological material that could be tested for genetic markers or predictors of various diseases. Most states have laws to prevent insurance companies and employers from conducting such genetic tests or using their results, and the recent federal Genetic Information Nondiscrimination Act (GINA) provides comprehensive national protection as well. Considering that the AFDIL research samples came with no names attached to them, the risk that the children will face “genetic discrimination” if the cards are retained seems very small indeed.
III. Leakage into NDIS
The National DNA Index System used to find “cold hits” to convicted offenders in criminal investigations contains over 7,000,000 STR profiles. It is possible to extract STR profiles from the Texas cards. But adding these 800 or so STR profiles to the database would violate the federal law establishing the Combined DNA Index System (CODIS). Furthermore, lacking identifying information, the database administrators would not find them terribly useful. A hit from an unsolved crime to one of these profiles would mean that one of 800 Texas babies has grown up to deposit DNA at a crime scene. Our Inspector Javert then could find his way to a single suspect–if the cards supplied to AFDIL retained their original numbers, if the illegal NDIS record kept track of this number, and if the health department retained the numbers linked to personally identifying information. Still, the full scenario–that AFDIL researchers would supply the cards to the FBI, that the FBI would analyze the STRs and illegally add them to the operational database, and that a hit in the database would lead to a suspect–is strained.
IV. Leakage into the National Missing Persons Database
The FBI maintains a National Missing Persons database, also known as CODIS(mp).  When a child is missing, a family member with the same mitotype (anyone in the same maternal lineage) can supply a DNA sample for mtDNA sequencing. The mitotype will be kept in the missing persons database to be checked against mtDNA extracted from unidentified human remains that come to the attention of the police. A hit between the mtDNA from the remains and the family member’s DNA serves to identify the remains as the reported missing person’s. It would make little sense to include several hundred de-identified samples from Texas newborns in this database.
I would not claim that the retention of the de-identified Texas samples or the presence of the anonymized mtDNA sequences in the population-genetics research database poses absolutely no risk of someday implicating today’s newborns in a criminal investigation. But the pertinent scenarios seem farfetched. The real population genetics database bears little resemblance to “a vast DNA database” for finding missing persons and solving criminal cases. The risk it poses to the Texas newborns and their families is minimal.
1. Emily Ramshaw, DNA Deception, Texas Tribune, Feb. 22, 2010, http://www.texastribune.org/stories/2010/feb/22/dna-deception/ (last viewed March 2, 2010).
2. Alice R. Isenberg & Jodi M. Moore, Mitochondrial DNA Analysis at the FBI Laboratory, For. Sci. Commun., Vol. 1, No. 2 (July 1999), available at http://www.pocketexpert.net/files/U.pdf (last viewed April 10, 2010).
3. Nancy Ritter, Missing Persons and Unidentified Remains: The Nation’s Silent Mass Disaster, NIJ Journal, No. 257 (2007), available at http://www.ojp.usdoj.gov/nij/journals/256/missing-persons.html (last viewed April 10, 2010).
© 2010 David H. Kaye
The faculty Senate weighed in on the University of Akron’s new DNA-profiling policy described in the October 31 posting, “Foolishness in Akron Raises a Serious Question about GINA.” Its resolution of November 5th characterizes the DNA sampling requirement as “of doubtful legality,” overbroad, and counterproductive. Not much to argue with here (although an adequate legal analysis of the policy’s legality under GINA is not trivial).
One of the Senate’s arguments, however, seems hyperbolic. The resolution states that the Board of Trustees’ new regulation “poses a serious threat to the personal privacy of University employees, not least because of the likelihood that DNA records submitted to the Federal Bureau of Investigation will remain in its database.” But this outcome is likely only if the FBI has the statutory authority to include in CODIS DNA identification profiles of not only those individuals who have been arrested or convicted of certain crimes, but of everyone applying for a job with the university.
As originally enacted, the DNA Identification Act of 1994 authorized the FBI to “establish an index of–(1) DNA identification records of persons convicted of crimes; (2) analyses of DNA samples recovered from crime scenes; and (3) analyses of DNA samples recovered from unidentified human remains.” 42 U.S.C. § 14132(a). In 1999, the Act was amended to include “(4) analyses of DNA samples voluntarily contributed from relatives of missing persons.” Plainly, these provisions do not authorize the inclusion of job applicants in the Combined DNA Index System.
In 2004, § 14132(a)(1) was broadened to encompass “DNA identification records of–(A) persons convicted of crimes; (B) persons who have been charged in an indictment or information with a crime; and (C) other persons whose DNA samples are collected under applicable legal authorities, provided that DNA profiles from arrestees who have not been charged in an indictment or information with a crime, and DNA samples that are voluntarily submitted solely for elimination purposes shall not be included in the National DNA Index System.” Finally, in 2006, the categories became “(A) persons convicted of crimes; (B) persons who have been charged in an indictment or information with a crime; and (C) other persons whose DNA samples are collected under applicable legal authorities, provided that DNA samples that are voluntarily submitted solely for elimination purposes shall not be included in the National DNA Index System.”
This progression reflects a desire to share the fruits of state laws that require mere arrestees to provide DNA samples. Federal law currently permits the FBI to include in the set of records for CODIS searches the profiles of individuals arrested for violations of various state and federal criminal statutes. The interesting legal question is whether the recent expansion of § 14132(a)(1)(C) goes beyond arrestees. If a state such as Ohio were to change its DNA database statute to permit the profiles of applicants for government jobs to be added to its state database, would the FBI be allowed to include this group in its CODIS searches for other states? If the hypothetical Ohio law were an “applicable legal authority,” then this result would seem to follow from the text of the amended Act. Yet, I doubt that the broadening of the 1994 DNA Identification Act was meant to go beyond the incorporation of arrestee profiles in CODIS searches. Perhaps the legislative history of the 2004 and 2006 amendments would shed some light on this question. Comments are welcome.
Declan McCullagh, University Backs Away From New-Hire DNA Testing, CBS News Blogs: Taking Liberties, Nov. 6, 2009, http://www.cbsnews.com/blogs/2009/11/06/taking_liberties/entry5545118.shtml, last accessed Nov. 9, 2009