The Feds’ “Vast DNA Database”

A few reporters have inquired about the Texas tale concerning “hundreds of dried blood samples [shipped] to the federal government to help build a vast DNA database–a forensics tool designed to identify missing persons and crack cold cases.” [1] Earlier postings (on March 4 and 15) explained that, contrary to the impression created in a series of Texas Tribune stories, the Guthrie cards were not destined for a database like the FBI’s National DNA Index System (NDIS), which matches DNA recovered from crime scenes against stored DNA profiles from convicted offenders or arrestees.

This was not to say that, as a matter of public policy and research ethics, the state Department of Health Services should have released its Guthrie cards for any research without first securing parental consent. That is an issue about which reasonable minds can and do differ. In judging the propriety of these decisions, however, it is vital to understand the true privacy risks that the dissemination of the samples entailed. Indeed, inasmuch as plaintiffs are demanding that the Armed Forces DNA Identification Laboratory (AFDIL) return the cards (which would lead to their incineration under Texas’s current policy), the question is crucial to the continuing tussle in Texas. If there is a substantial privacy risk, the demand obviously has more merit than if there is no such risk. Furthermore, because plaintiffs also want the researchers to extirpate the DNA sequence data in their anonymized research database, the issue is of national concern. The research database, although small, is important to a fair presentation of mtDNA matches in all criminal cases and to the correct interpretation of these findings during an investigation. This posting therefore provides an assessment of some of the risks that children conceivably could face from the presence of their mtDNA sequences in the research database and from their Guthrie cards in the laboratory’s files.

I. Use of the AFDIL Research Database

The population-genetics database for mitotypes poses a rather remote threat to the Texas newborns. In 1999, the FBI explained how such databases work:

The FBI Laboratory, the Armed Forces DNA Identification Laboratory, and other laboratories have collaborated to compile a mtDNA population database. … The database is referred to as the SWGDAM (Scientific Working Group on DNA Analysis Methods) database. It contains sequences from four main racial groups: Caucasians, Africans, Hispanics, and Asians. Most of these samples have been obtained from paternity-testing laboratories, blood banks, or academic groups studying ethnic populations. The database currently contains 2,426 mtDNA sequences from unrelated individuals. However, the database is updated frequently and is constantly growing. …

When a sequence from a questioned sample [one found at a crime-scene] and a known [suspect’s] sample is the same, the SWGDAM database is searched for this sequence. … The FBI Laboratory lists the number of observations of a sequence in each racial subgroup of the database in a report of a mtDNA examination. For example, a sequence might be seen five times in the database samples of Caucasian descent and one time in the database samples of Hispanic descent yet not appear in the remaining database subgroups. …

[¶] Most of the sequences in the forensic mtDNA database occur a single time (approximately 60 percent), and the total number of mtDNA sequences in the entire human population is not known. Reliable frequency estimates for most mtDNA sequences are therefore not possible [because] small databases are not effective tools for estimating frequencies of rare events.

[¶] However, statistical methods exist for calculating an upper-bound estimate of the frequency of mtDNA types with zero occurrences or very few occurrences in a database of limited size. This upper-bound estimate describes the highest frequency expected for a particular mtDNA sequence using the database. … As the database grows in size, the frequency estimates for individual mtDNA profiles will become more and more refined and eventually lead to reliable population frequency estimates. [2]
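To make the idea of an “upper-bound estimate” concrete, here is a minimal sketch of two textbook calculations. It is an illustration under stated assumptions, not necessarily the procedure the FBI or AFDIL applies. The function names are hypothetical. The 95 percent confidence level and the multiplier 1.96 are conventional choices of mine, and the database size is the 2,426 figure from the quotation above.

```python
import math

# Database size taken from the 1999 FBI description quoted above.
DATABASE_SIZE = 2426


def upper_bound_zero_count(n, confidence=0.95):
    """Upper confidence limit on the frequency of a type seen 0 times in n samples.

    Textbook result: the largest frequency p for which seeing no copies
    in n independent draws still has probability of at least 1 - confidence,
    i.e. (1 - p)**n = 1 - confidence. The confidence level is an assumption.
    """
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n)


def upper_bound_normal(x, n, z=1.96):
    """Normal-approximation upper limit for a type seen x > 0 times in n samples.

    z = 1.96 is the conventional multiplier for a two-sided 95 percent
    interval (an assumption here). The approximation is rough when x is small.
    """
    p = x / n
    return p + z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    never_seen = upper_bound_zero_count(DATABASE_SIZE)
    seen_five_times = upper_bound_normal(5, DATABASE_SIZE)
    print(f"Type never observed: frequency at most about {never_seen:.4%}")
    print(f"Type observed 5 times: frequency at most about {seen_five_times:.4%}")
```

With 2,426 sequences, a mitotype that never appears in the database gets an upper bound of roughly 0.12 percent under these assumptions. Because the bound shrinks as the database grows, each additional reference sample sharpens the estimate, which is the sense in which the quoted passage says the estimates become “more and more refined.”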

When used to estimate the frequency of a mitotype in a crime-scene sample that matches a defendant in a criminal case, the population-genetics database poses no risk that a Texas baby will be accused of a crime–correctly or otherwise. But one can imagine a different scenario: Suppose that Inspector Javert is pursuing the perpetrator of a horrific crime in Texas. The usual suspects have excellent alibis. A search of the Texas DNA database of convicted offenders draws a blank. A search of NDIS also comes up negative. Javert, who never gives up, takes the mtDNA sequence from the crime scene and compares it to the sequences in the anonymized population-genetics database. Voila! He finds a match.

Javert demands that the custodians of the population reference database tell him whether the matching sequence came from the 800 or so Texas samples. It did. Now he could go to the state health department to obtain the names of the 800 suspects (if such records still exist). Or better, if the cards supplied to AFDIL retained their original numbers and if the health department retained the numbers linked to personally identifying information, then Javert could find his way to one family, and he could investigate whether anyone in that maternal lineage could be the culprit. There might be other families with the same mitotype, but Javert at least would have found a lead.

Although the Javert scenario is fictional, it is not impossible. Even anonymized population reference mtDNA databases could lead the police to a family in some situations. But it’s a stretch.

II. Genetic Discrimination

Every tissue repository contains human biological material that could be tested for genetic markers or predictors of various diseases. Most states have laws to prevent insurance companies and employers from conducting such genetic tests or using their results, and the recent federal Genetic Information Nondiscrimination Act (GINA) provides comprehensive national protection as well. Considering that the AFDIL research samples came with no names attached to them, the risk that the children will face “genetic discrimination” if the cards are retained seems very small indeed.

III. Leakage into NDIS

The National DNA Index System used to find “cold hits” to convicted offenders in criminal investigations contains over 7,000,000 STR profiles. It is possible to extract STR profiles from the Texas cards. But adding these 800 or so STR profiles to the database would violate the federal law establishing the Combined DNA Index System (CODIS). Furthermore, lacking identifying information, the database administrators would not find them terribly useful. A hit from an unsolved crime to one of these profiles would mean that one of 800 Texas babies has grown up to deposit DNA at a crime scene. Our Inspector Javert then could find his way to a single suspect–if the cards supplied to AFDIL retained their original numbers, if the illegal NDIS record kept track of this number, and if the health department retained the numbers linked to personally identifying information. Still, the full scenario–that AFDIL researchers would supply the cards to the FBI, that the FBI would analyze the STRs and illegally add them to the operational database, and that a hit in the database would lead to a suspect–is strained.

IV. Leakage into the National Missing Persons Database

The FBI maintains a National Missing Persons database, also known as CODIS(mp). [3] When a child is missing, a family member with the same mitotype (anyone in the same maternal lineage) can supply a DNA sample for mtDNA sequencing. The mitotype will be kept in the missing persons database to be checked against mtDNA extracted from unidentified human remains that come to the attention of the police. A hit between the mtDNA from the remains and the family member’s DNA serves to identify the remains as the reported missing person’s. It would make little sense to include several hundred de-identified samples from Texas newborns in this database.

Conclusion

I would not claim that the retention of the de-identified Texas samples or the presence of the anonymized mtDNA sequences in the population-genetics research database poses absolutely no risk of someday implicating today’s newborns in a criminal investigation. But the pertinent scenarios seem farfetched. The real population genetics database bears little resemblance to “a vast DNA database” for finding missing persons and solving criminal cases. The risk it poses to the Texas newborns and their families is minimal.

References

1. Emily Ramshaw, DNA Deception, Texas Tribune, Feb. 22, 2010, http://www.texastribune.org/stories/2010/feb/22/dna-deception/ (last viewed March 2, 2010).

2. Alice R. Isenberg & Jodi M. Moore, Mitochondrial DNA Analysis at the FBI Laboratory, For. Sci. Commun., Vol. 1, No. 2, July 1999, available at http://www.pocketexpert.net/files/U.pdf (last viewed April 10, 2010).

3. Nancy Ritter, Missing Persons and Unidentified Remains: The Nation’s Silent Mass Disaster, NIJ Journal, No. 256 (2007), available at http://www.ojp.usdoj.gov/nij/journals/256/missing-persons.html (last viewed April 10, 2010).

© 2010 David H. Kaye

Up in Smoke: 5 Million Neonatal Blood Samples Incinerated

An effort to avoid the pointless destruction of the millions of Guthrie cards maintained by the Texas Department of Health Services has come to naught. Plaintiffs who sued the department as well as privacy advocates initially were open to the idea of preserving all the cards with neonatal bloodspots for future research while seeking consent for their storage from millions of parents. [1]

However, this was not to be. After the state quickly settled the dubious lawsuit, an enterprising but inadequately informed journalist published Internet stories alleging that the department had turned “over hundreds of dried blood samples to the federal government to help build a vast DNA database–a forensics tool designed to identify missing persons and crack cold cases” and that the samples “were forwarded along to the federal government to create a vast DNA database, one that could help crack cold cases and identify missing persons.” [2] In her latest installment of this tall tale, she continues to write that the samples will “help identify missing persons and crack cold cases.” [1]

The suggestion that the U.S. military is using the samples to build a database to “crack cold cases” or to identify “missing persons” in this country is preposterous. To summarize a previous posting [2]: First, the research project is limited to mitochondrial DNA, which rarely is used in forensic investigations because it is not capable of providing specific identification. Second, AFDIL does not maintain any databases of DNA profiles to crack cold cases. Third, even if AFDIL were authorized to maintain a database of civilian DNA profiles for criminal investigations, a collection of nameless mtDNA sequences from de-identified samples would be pretty useless. Finally, the true purpose of the research is clear from the “Federal MtDNA Paper” posted on the Tribune’s website. The AFDIL paper explains that the research database, which cannot be used to identify individuals, simply allows geneticists to put estimates of random-match probabilities for mtDNA on a sounder footing. These estimates are necessary to understand the probative value of an mtDNA match in any criminal investigation or trial. They have nothing in particular to do with cold hits or missing persons.

In sum, the research database has virtually no meaningful privacy implications. Some parents might not want their children’s blood samples used to improve the criminal justice system, but that alone is not much of a reason to destroy what the article calls a medical “treasure trove.” The children’s DNA is not going into any military or law-enforcement database for tracking down missing persons or cracking cold cases.

Yet, this fear apparently was the monkey wrench that jammed the effort to preserve the samples while seeking consent. Here are some excerpts from the latest news as described by the same journalist:

[T]he Department of State Health Services . . . agreed in December to destroy the blood spots, after a civil rights attorney and several Texas parents sued the state for storing them for research purposes without permission. But after the court settlement was signed, privacy advocates lobbied the agency for an alternate solution: a research database that would keep the blood spots intact while seeking electronic consent from parents. They got the go-ahead from some key lawmakers and from the lawsuit’s plaintiffs, who pledged to void the settlement, but not from DSHS.

When The Texas Tribune discovered last month that state health officials had turned hundreds of baby blood spots over to a federal Armed Forces lab between 2003 and 2007 to build a mitochondrial DNA database . . . any chance for saving the blood spots fizzled out. All 5 million blood spots were sent to a Houston-area incinerator last week.

“If there was any way the blood spots were going to be saved, the whole thing fell apart at that point,” said state Sen. Bob Deuell, R-Greenville, . . . “When this came out about these specimens going to the military, I said, ‘We’ve lost this one.’”

. . . State health officials say Austin-based national patient privacy advocate Deborah Peel and Deuell, a physician, approached DSHS Commissioner David Lakey early this year about using electronic consents to save the 5 million existing blood spots from destruction. The agency reviewed the idea but never pursued it. . . .

Critics say . . . DSHS conveniently settled the lawsuit before the trial went to the discovery phase, meaning the documents on the federal DNA study were never disclosed to the plaintiffs. (The Tribune obtained the documents on the federal project — designed to build a forensics tool to help identify missing persons and crack cold cases — through Texas open-records laws.) “Unfortunately, that of course confirmed the plaintiffs’ worst fears,” said Peel, founder of the nonprofit advocacy group Patient Privacy Rights.

Peel said the state’s decision not to seek a non-destructive solution is a shame. . . . “We were going to … reach out to those 5 million families and let them know they had an alternative to having their blood spots destroyed,” Peel said. . . .

Deuell said the impression he got from state health officials was that they feared they would be subject to litigation from other parents if they negotiated with the plaintiffs not to destroy the blood spots. . . . “They said, ‘The plaintiffs are just three people out of 5 million. Who’s to say somebody else wouldn’t come back and file a new suit?’”

Harrington [plaintiffs’ attorney] said that worry is “utter nonsense.” He said both sides could have gone back to the judge to have a new settlement drafted — one that would’ve protected the agency. “What’s the harm in that?” Harrington asked. “We would have supplemented or amended the settlement. It would have been totally possible.”

But once news broke that some of the blood spots had been turned over to the federal lab — and that the state had no intention of destroying those samples — the plaintiffs’ offer was off the table. Instead, they have demanded that the state get the blood spots back from the federal government, or they’ll file another lawsuit. . . . [1]

Maybe another lawsuit would be a good thing. With competent lawyering and journalism, the people of Texas finally might realize that none of their children’s DNA has found its way into any DNA database for identifying anyone.

References

1. Emily Ramshaw, DNA Destruction, Tex. Tribune, http://www.texastribune.org/stories/2010/mar/09/blood-drive/

2. David H. Kaye, A Texas Tall Tale of “DNA Deception,” Double Helix Law, Mar. 4, 2010 https://sites.psu.edu/dhlaw/2010/03/04/a-texas-tall-tale-of-dna-deception/