Monthly Archives: September 2012

Dear Judges: A Letter from the Electronic Frontier Foundation to the Ninth Circuit

On the eve of the en banc oral argument in Haskell v. Harris, the Electronic Frontier Foundation (EFF) filed a letter asking “the Court to consider the ENCODE project findings in determining the outcome of this case.” It seems hard to oppose the idea that the court should consider relevant scientific research, but without input from the scientific community, will the judges do better than they have in the past as “amateur scientists” (to use the skeptical phrase of Chief Justice Rehnquist in Daubert v. Merrell Dow Pharmaceuticals, Inc.)?

Deciphering the ENCODE papers’ descriptions of the data is no easy task, and EFF’s lawyers do not seem to be up to it. Their letter asserts that the project “has determined that more than 80% of DNA once thought to be no more than ‘junk’ has at least one biochemical function, controlling how our cells, tissue and organs behave.” This is not a fair characterization of the findings. Which geneticist ever claimed that all noncoding DNA plays no role in how cells behave? The issue always has been how much junk, how much func — and what “functions”?

What does EFF mean by “controlling”? Making organs function? Stimulating tissue growth? Turning normal cells into cancerous ones? Making us tall or short, fat or skinny, gay or straight? None of those things are mentioned in the Nature cover story cited in the letter. Instead, the EFF relies on New York Times reporter Gina Kolata’s misleading news article for the letter’s claim that “The ENCODE project has determined that ‘junk’ DNA plays a critical role in determining a person’s susceptibility to disease and physical traits like height.”

My earlier postings described the limited meaning of the phrase “biochemical function” in the cited paper. I’d love to see a citation to a page of an ENCODE paper that asserts that fully 80% of the noncoding DNA is determining “susceptibility to disease and physical traits like height.” And if I were a judge, I would demand an explanation of why “physical traits like height” are, in the words of the EFF letter, “sensitive and private.”

After the judges consider the ENCODE papers (by having their law clerks read them?), will they be better informed about the actual privacy implications of the CODIS loci than they were before this excursion into the realm of bioinformatics? I would not bet on it, but maybe I am growing cynical.

On the “Clear” Outcome Under “Established” Law

Today’s New York Times included an editorial (California and the Fourth Amendment) on Haskell v. Harris, the challenge to the California Proposition requiring DNA sampling on arrest. En banc oral argument takes place today. The following is a letter I sent to the Times editor. I expect somewhere between 0 and 50 percent of it to be published there (point estimate = 0):

Dear Editor,

Your editorial (September 19) asserts that the constitutionality of taking DNA on arrest “should be clear” given “established rights against unreasonable search and seizure.” Yet, over vigorous dissents, federal courts of appeals have ruled otherwise–twice in panels of the Ninth Circuit and once (en banc) in the Third Circuit.

Whether acquiring purely biometric data from arrestees necessitates a warrant is doubtful, and whether acquiring DNA data is “unreasonable” is a close question. The physical invasion of personal security is minor when the individual is already in custody and the sampling is only marginally more intrusive than fingerprinting. The medical information content of the identification profile is (given current knowledge) only slightly more significant than that of a fingerprint. Very few false convictions arising from DNA database searches have been documented. (One in Australia has been reported.)

Contrary to the suggestion in the editorial, what divided the judges in the Ninth Circuit was not whether “the law’s real purpose was investigation.” No one doubted that. The dissenting judge believed that the Supreme Court already had decided that “fingerprints may not be taken from an arrestee solely for an investigative purpose, absent a warrant or reasonable suspicion that the fingerprints would help solve the crime for which he was taken into custody.” What the Court actually held was “that transportation to and investigative detention at the station house without probable cause or judicial authorization together violate the Fourth Amendment.” The dissenting judge also worried, among other things, that “it is possible that … at some future time,” an identification profile might permit strong inferences about the diseases an arrestee has or might develop.

I do not claim that arrestee DNA sampling clearly is constitutional. There are a number of valid concerns about indefinite sample retention and other matters. Neither do I maintain that its benefits (which are not well quantified) plainly outweigh its costs and its impact on legitimate interests in personal privacy and security. But assertions that the balance is “clear” and that the “established” law dictates the result oversimplify a delicate constitutional question.

Trashing Junk DNA: Alice in Genomeland

Earlier today, I introduced the concepts and terms required to ascertain whether the estimated proportion of the genome that encodes the structure of proteins or regulates gene expression has jumped from 5 or 10% to 80%. I now focus on the possible meanings of “functional” to see whether the ENCODE papers state or imply any such seismic change. It appears that they do not.

“Functional” is an adjective, and Alice learned from Humpty Dumpty that adjectives are malleable:

“When I use a word,” Humpty Dumpty said, in rather a scornful tone, “it means just what I choose it to mean–neither more nor less.”
“The question is,” said Alice, “whether you can make words mean so many different things.”
“The question is,” said Humpty Dumpty, “which is to be master–that’s all.”
Alice was too much puzzled to say anything, so after a minute Humpty Dumpty began again. “They’ve a temper, some of them–particularly verbs, they’re the proudest–adjectives you can do anything with, but not verbs–however, I can manage the whole lot! Impenetrability! That’s what I say!”

Like Humpty, who was redefining the word “glory,” the ENCODE authors recognized that “functional” can have many meanings. As Ewan Birney later explained:

Like many English language words, “functional” is a very useful but context-dependent word. Does a “functional element” in the genome mean something that changes a biochemical property of the cell (i.e., if the sequence was not here, the biochemistry would be different) or is it something that changes a phenotypically observable trait that affects the whole organism?1/

Still other possibilities exist. For example, the first paper to use the adjective “junk” for noncoding DNA noted that even debris accumulated in the course of evolution or introduced from viral infections could have a function simply by creating spaces between genes.2/ The pieces of dead wood that are joined together to form the hull of a row boat have a function–they exclude the water from the vessel to keep it afloat. This does not mean that the detailed structure of the planks–the precise width of each plank or the number of ridges on its surface–affects its functionality. And, just as something can be inactive and functional, so too something can be alive with activity and yet be nonfunctional.

ENCODE uses biochemical activity–the notion that “the biochemistry would be different”–as a synonym for functional. Here is the definition of “functional” in the top-level paper:

Operationally, we define a functional element as a discrete genome segment that encodes a defined product (for example, protein or non-coding RNA) or displays a reproducible biochemical signature (for example, protein binding, or a specific chromatin structure).3/

This definition may be useful for the purpose of describing the size of ENCODE’s catalog of elements for later study, but it contrasts sharply with the notion of functional as affecting a nontrivial phenotype. The ENCODE papers show that 80% of the genome displays signs of certain types of biochemical activity–even though the activity may be insignificant, pointless, or unnecessary. This 80% includes all of the introns, for they are active in the production of pre-mRNA transcripts. But this hardly means that they are regulatory or otherwise functional.4/ Indeed, if one carries the ENCODE definition to its logical extreme, 100% of the genome is functional–for all of it participates in at least one biochemical process–DNA replication.

That the ENCODE project would not adopt the most extreme biochemical definition is understandable–that definition would be useless. But the ENCODE definition is still grossly overinclusive from the standpoint of evolutionary biology. From that perspective, most estimates of the proportion of “functional” DNA are well under 80%. Various biologists or related specialists have provided varying guesstimates:

  • Under 50%: “About 1% … is coding. Something like 1-4% is currently expected to be regulatory noncoding DNA … . About 40-50% of it is derived from transposable elements, and thus affirmatively already annotated as “junk” in the colloquial sense that transposons have their own purpose (and their own biochemical functions and replicative mechanisms), like the spam in your email. And there’s some overlap: some mobile-element DNA has been co-opted as coding or regulatory DNA, for example. [¶] … Transposon-derived sequence decays rapidly, by mutation, so it’s certain that there’s some fraction of transposon-derived sequence we just aren’t recognizing with current computational methods, so the 40-50% number must be an underestimate. So most reasonable people (ok, I) would say at this point that the human genome is mostly junk (“mostly” as in, somewhere north of 50%).”5/
  • 40%: “ENCODE biologist John Stamatoyannopoulos … said … that some of the activity measured in their tests does involve human genes and contributes something to our human physiology. He did admit that the press conference mislead people by claiming that 80% of our genome was essential and useful. He puts that number at 40%.”6/
  • 20%: “[U]sing very strict, classical definitions of “functional” [to refer only to] places where we are very confident that there is a specific DNA:protein contact, such as a transcription factor binding site to the actual bases–we see a cumulative occupation of 8% of the genome. With the exons (which most people would always classify as “functional” by intuition) that number goes up to 9%. … [¶] In addition, in this phase of ENCODE we did [not] sample … completely in terms of cell types or transcription factors. [W]e’ve seen [at most] around 50% of the elements. … A conservative estimate of our expected coverage of exons + specific DNA:protein contacts gives us 18%, easily further justified (given our [limited] sampling) to 20%.”7/
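
The arithmetic behind the last of these figures can be retraced in a few lines. The sketch below uses only the numbers in the Birney quotation (8% for specific DNA:protein contacts, 9% once exons are added, and roughly 50% of the elements sampled so far). The variable names and the simple proportional scaling are assumptions made for illustration; they are not ENCODE's actual method.

```python
# Figures taken from the Birney quotation above (percent of the genome).
contacts_pct = 8.0             # specific DNA:protein contacts
contacts_plus_exons_pct = 9.0  # the same, with exons added

# Assumption: "around 50% of the elements" seen so far means the observed
# coverage can be scaled up proportionally to cover the unsampled remainder.
fraction_sampled = 0.5


def extrapolate(observed_pct, sampled):
    """Scale an observed percentage up to full sampling (naive proportional model)."""
    return observed_pct / sampled


estimated_pct = extrapolate(contacts_plus_exons_pct, fraction_sampled)
gap_to_headline = 80.0 - estimated_pct

print(f"exons add about {contacts_plus_exons_pct - contacts_pct:.0f} percentage point")
print(f"extrapolated coverage: {estimated_pct:.0f}%")
print(f"gap to the headline 80%: {gap_to_headline:.0f} percentage points")
```

On this naive model the extrapolation lands on the 18% that Birney reports before rounding up to 20%, which leaves most of the headline 80% attributable to the broader "biochemical signature" definition.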

So why did the ENCODErs opt for the broadest arguable definition of “functional”? Birney’s answer is that it describes a quantity that the project could measure; that the larger number underscores that a lot is happening in the genome; that it would have confused readers to receive a range of numbers; and that the smaller number would not have counted the efforts of all the researchers.

Whether these are very satisfactory reasons for trumpeting a widely misunderstood number is a matter that biologists can debate. All I can say is that (1) I have been unable to extract a clear number–whatever one should make of it–for a percentage of the genome that constitutes the regulatory elements–the promoters, enhancers, silencers, ncRNA “genes,” and so on; (2) this number is almost surely less than the 80% figure that, at first glance, one might have thought ENCODE was reporting; and (3) “functional element” as defined by the ENCODE Project is not a term that has clear or direct implications for claims of the law enforcement community that the loci used in forensic identification are not coding and therefore not informative.

Of course, none of this means that the description of the information content of the CODIS STRs traditionally presented by law enforcement authorities is correct. It simply means that even after this phase of ENCODE, there are still a huge number of base pairs that might or might not be regulatory or influence regulation and, hence, gene expression. The CODIS STRs might or might not be among them. Published reports suggest that they are not,8/ but the logic that just because a DNA sequence is noncoding (and nonregulatory), it conveys zero information about phenotype is flawed. It overlooks the possibility of a correlation between the nonfunctional sequence and a phenotype (because the sequence sits next to an exon or a regulatory sequence).9/ Again, however, the published literature reviewing the CODIS STRs does not reveal any population-wide correlations that permit valid and strong inferences about disease status or propensity or other socially significant phenotypes.10/

Will this situation change? A thoughtful answer would take up a lot of space.11/ For now, I’ll just repeat the aphorism attributed to Yogi Berra, Niels Bohr, and Storm P: “It’s hard to make predictions, especially about the future.”


1. Ewan Birney, ENCODE: My Own Thoughts, Ewan’s Blog: Bioinformatician at Large, Sept. 5, 2012,

2. David E. Comings, The Structure and Function of Chromatin, in 3 Advances in Human Genetics 237, 316 (H. Harris & K. Hirschhorn eds. 1972) (“Large spaces between genes may be a contributing factor to the observation that most recombination in eukaryotes is inter- rather than intragenic. Furthermore, if recombination tended to be sloppy with most mutational errors occurring in the process, it would [be] an obvious advantage to have it occur in intergenic junk.”). For more discussion of this paper, see T. Ryan Gregory, ENCODE (2012) vs. Comings (1972), Sept. 7, 2012,

3. Ian Dunham et al., An Integrated Encyclopedia of DNA Elements in the Human Genome, 489 Nature 57 (2012).

4. These regions do contain some RNA-coding sequences, and those small parts could be doing something interesting (producing RNAs that are regulatory or that defend against infection by viral DNA, for example), but this kind of activity does not exist in the bulk of the introns that are, under the ENCODE definition, 100% functional.

5. Sean Eddy, ENCODE Says What?, Sept. 8, 2012. He adds that:

[A]s far as questions of “junk DNA” are concerned, ENCODE’s definition isn’t relevant at all. The “junk DNA” question is about how much DNA has essentially no direct impact on the organism’s phenotype–roughly, what DNA could I remove (if I had the technology) and still get the same organism. Are transposable elements transcribed as RNA? Do they bind to DNA-binding proteins? Is their chromatin marked? Yes, yes, and yes, of course they are–because at least at one point in their history, transposons are “alive” for themselves (they have genes, they replicate), and even when they die, they’ve still landed in and around genes that are transcribed and regulated, and the transcription system runs right through them.

6. Faye Flam, Skeptical Takes on Elevation of Junk DNA and Other Claims from ENCODE Project, Sept. 12, 2012. Stamatoyannopoulos added that:

What the ENCODE papers … have to say about transposons is incredibly interesting. Essentially, large numbers of these elements come alive in an incredibly cell-specific fashion, and this activity is closely synchronized with cohorts of nearby regulatory DNA regions that are not in transposons, and with the activity of the genes that those regulatory elements control. All of which points squarely to the conclusion that such transposons have been co-opted for the regulation of human genes — that they have become regulatory DNA. This is the rule, not the exception.

7. Ewan Birney, ENCODE: My Own Thoughts, Ewan’s Blog: Bioinformatician at Large, Sept. 5, 2012,

8. E.g., Sara H. Katsanis & Jennifer K. Wagner, Characterization of the Standard and Recommended CODIS Markers, J. Forensic Sci. (2012).

9. E.g., David H. Kaye, Two Fallacies About DNA Databanks for Law Enforcement, 67 Brook. L. Rev. 179 (2001).

10. E.g., Sara H. Katsanis & Jennifer K. Wagner, Characterization of the Standard and Recommended CODIS Markers, J. Forensic Sci. (2012).

11. For my earlier, and possibly dated, effort to evaluate the likelihood that the CODIS loci someday will prove to be powerfully predictive or diagnostic, see David H. Kaye, Please, Let’s Bury the Junk: The CODIS Loci and the Revelation of Private Information, 102 Nw. U. L. Rev. Colloquy 70 (2007), and Mopping Up After Coming Clean About “Junk DNA”, Nov. 23, 2007.

Trashing Junk DNA: The Notorious 80%

Last week I noted some of the hyperbolic headlines accompanying the coordinated publication of a large number of datasets from the ENCODE Project. The abstract of the top-level paper begins as follows:

The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions.1/

Hoping to decipher these sentences, I have been reading about gene regulation. This modest effort stems from more than academic curiosity. If the popular and even some of the scientific press is to be believed, ENCODE has exorcized “junk DNA” from the body of scientific knowledge.2/ The bright light suddenly shining on the “dark matter” of the genome (to introduce another sloppy metaphor)3/ raises a giant question mark for the criminal justice system. Law enforcement authorities have always insisted that the snippets of DNA used to generate DNA identification profiles are just nonfunctional “junk.”4/ Now, according to New York Times science correspondent Gina Kolata,

As scientists delved into the “junk” – parts of the DNA that are not actual genes containing instructions for proteins – they discovered a complex system that controls genes. At least 80 percent of this DNA is active and needed. … [¶] … The thought before the start of the project, said Thomas Gingeras, an Encode researcher from Cold Spring Harbor Laboratory, was that only 5 to 10 percent of the DNA in a human being was actually being used.5/

This juxtaposition of percentages suggests that the scientific community has shifted from the view that “only 5 to 10 percent” of the genome is functional (“needed” for the organism to function normally) to a sudden realization that 80% falls into this category.

But the more I read, the clearer it became that this description of a sudden phase transition in science is wildly inaccurate. Johns Hopkins biostatistician Steve Salzberg, in a provocative Simply Statistics podcast interview, describes the 80% figure touted in the ENCODE paper as irresponsible.6/ University of Toronto biochemist Laurence Moran saw it as a repeat of a similar, problematic performance five years ago, at the conclusion of the pilot phase of ENCODE.7/ Responding to criticism, ENCODE Project leader Ewan Birney explained the new knowledge this way:

After all, 60% of the genome with the new detailed manually reviewed (GenCode) annotation is either exonic or intronic, and a number of our assays (such as PolyA- RNA, and H3K36me3/H3K79me2) are expected to mark all active transcription. So seeing an additional 20% over this expected 60% is not so surprising.8/

“Not so surprising”? A whopping 60%–not a minor 5 or 10%–was already estimated to be “active”? What is going on here?

The answer lies in the definition of some key terms (like exons, introns, and transcription) and requires a rudimentary understanding of the fundamentals of gene expression and its regulation in human beings. This posting presents the essential terminology and concepts. A sequel will apply them to explain what ENCODE’s “assign[ing] biochemical functions for 80% of the genome” means. Anyone who knows what RNA transcripts and transcription factors do can skip this first part (or can read it to let me know of my inaccuracies).

To avoid suspense, I shall lay out my conclusions here and now: (1) if ENCODE gives a clear number for a percentage of the genome that regulates genes–the promoters, enhancers, silencers, ncRNA “genes,” and so on–I have yet to find it; (2) this number is almost surely less than the 80% figure reported for functionality; and (3) “functional element” as defined by the ENCODE Project is not a term that has clear or direct implications for claims of the law enforcement community that the loci used in forensic identification are not coding and therefore not informative. Those claims of zero information are somewhat exaggerated, but that is another story. For now, I merely describe some basics of gene expression and regulation.

Genes make proteins. But how? There are three big steps (with many activities within each step): transcription; post-transcription modification and transportation; and translation. All involve RNA, a single-stranded molecule related to DNA, and proteins. The basic picture is

  • Transcription to precursor messenger RNA: DNA + proteins -> pre-mRNA (in nucleus)
  • Post-transcriptional modification and transportation: pre-mRNA + proteins and RNAs -> mature mRNA (in cytoplasm)
  • Translation to protein: mRNA + tRNA and proteins -> expressed protein (in cytoplasm)

In the first big step, the base pairs of the gene are transcribed jot-for-jot into an RNA molecule (precursor messenger RNA, or pre-mRNA). In the second major step, the transcript is modified at its ends and edited to remove parts that do not code for the protein that will be made (splicing), and the mature messenger RNA (mRNA) is moved outside the nucleus. In the third phase, another type of RNA (transfer RNA, or tRNA) delivers individual amino acids, which are stitched together in the order dictated by the mRNA transcript to form a protein, thereby translating the DNA sequence mirrored in the mRNA into the amino-acid order of the protein. Translation occurs on a kind of microscopic workbench (a ribosome) made largely of yet another RNA (ribosomal RNA, or rRNA).
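
For readers who think more easily in code than in biochemistry, the three steps can be mimicked with a toy example. Everything in this sketch is a simplification assumed for illustration: the sequence is invented, the intron is marked by lowercase letters (real splicing recognizes sequence signals, not letter case), the DNA is written as the strand that matches the mRNA, and the codon table is a four-entry subset of the standard genetic code.

```python
# Toy gene: two exons (uppercase) interrupted by one intron (lowercase).
# The sequence is made up for illustration; real genes are far longer.
gene_dna = "ATGGCC" + "gtaagtttcag" + "AAATGA"

# Small subset of the standard genetic code, enough for this example.
CODON_TABLE = {"AUG": "Met", "GCC": "Ala", "AAA": "Lys", "UGA": "STOP"}


def transcribe(dna):
    """Step 1: copy the DNA into pre-mRNA (T becomes U), intron included."""
    return dna.replace("T", "U").replace("t", "u")


def splice(pre_mrna):
    """Step 2: drop the intron (lowercase here) to leave mature mRNA."""
    return "".join(base for base in pre_mrna if base.isupper())


def translate(mrna):
    """Step 3: read codons three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


pre_mrna = transcribe(gene_dna)
mature_mrna = splice(pre_mrna)
protein = translate(mature_mrna)

print("pre-mRNA:   ", pre_mrna)
print("mature mRNA:", mature_mrna)
print("protein:    ", "-".join(protein))
```

The point to notice is that the intron is faithfully transcribed in step 1 and then discarded in step 2, so it never influences the amino-acid sequence produced in step 3.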

For all this to happen, the DNA, which lies tightly coiled in the chromosomes (in a protein-DNA matrix known as chromatin), must open up for transcription to occur. Thus, changes in the chromatin regulate transcription, and these changes can be brought about in a number of ways. Transcription factors (specialized proteins) bind to the DNA. The bound transcription factors then recruit an enzyme (RNA polymerase) that produces RNA. This occurs within a region of DNA, known as a promoter, near the start of the protein-coding DNA (the structural gene). The level of transcription is influenced by activator or repressor proteins that bind to still other small regions (enhancers and silencers, respectively) that also lie outside the structural gene. In short, chemical interactions that open or close the chromatin that houses the DNA and transcription factors regulate the first step in the DNA-to-protein process.

In the past decade, other mechanisms of regulation or control of gene expression have been discovered. Many DNA sequences are not transcribed into messenger RNA, but they are transcribed into a variety of other RNAs. These non-protein-coding DNA sequences can be thought of as genes for RNA. Courting confusion, they usually are called “noncoding” (ncDNA)–because they do not code for protein–but they certainly code for RNAs that are crucial to translation–rRNA and tRNA–and for other RNAs that affect transcription, translation, and DNA replication. So it turns out that the genome is abuzz with transcription-to-RNA activity and other events that feed into the expression of the (protein-)coding DNA.

Yet, this hardly means that every biochemical event along the DNA is functionally important. Some, perhaps many, non-mRNA transcripts are just “noise.” They may float around for a while, but they may not do anything except wither away. In addition, large segments of the DNA transcribed in the course of making mRNA appear in the initial transcript (the pre-mRNA) but never make it into mature mRNA. These unused parts of the pre-mRNA transcripts correspond to long stretches of DNA, known as introns, that interrupt the smaller coding parts–the exons–that are translated into proteins. The initially transcribed intronic parts are removed from the pre-mRNA in a process called RNA splicing. Most of the RNA from introns probably just dissipates.9/
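
A back-of-the-envelope calculation shows how far apart "transcribed" and "ends up in mRNA" can be. The gene layout below is hypothetical; the segment lengths are invented for illustration and are not taken from the ENCODE papers.

```python
# Hypothetical gene layout (lengths in base pairs); the numbers are invented
# for illustration and are not taken from the ENCODE papers.
segments = [
    ("exon", 150),
    ("intron", 3000),
    ("exon", 120),
    ("intron", 5000),
    ("exon", 200),
]

gene_length = sum(length for _, length in segments)
exon_length = sum(length for kind, length in segments if kind == "exon")

# Every base of the gene appears in the pre-mRNA, so all of it shows
# transcription activity ...
transcribed_fraction = gene_length / gene_length
# ... but only the exons survive splicing into mature mRNA.
retained_fraction = exon_length / gene_length

print(f"transcribed into pre-mRNA: {transcribed_fraction:.0%}")
print(f"retained in mature mRNA:   {retained_fraction:.1%}")
```

Under a definition that counts any transcribed base as "active," this made-up gene is entirely active, even though only a small fraction of it ever reaches the mature transcript.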

All these terms are a mouthful, but armed with this basic understanding of genes, RNA, and proteins, we can see why the 80% figure does not mean what one might think. We shall also see that the estimated proportion of the genome that encodes the structure of proteins or regulates gene expression has not jumped from 5 or 10% to 80%.


1. Ian Dunham et al., An Integrated Encyclopedia of DNA Elements in the Human Genome, 489 Nature 57 (2012).

2. E.g., Elizabeth Pennisi, ENCODE Project Writes Eulogy for Junk DNA, 337 Science 1159 (2012).

3. E.g., Gina Kolata, Bits of Mystery DNA, Far From ‘Junk,’ Play Crucial Role, N.Y. Times, Sept. 5, 2012. In one respect, the “dark matter” metaphor misrepresents dark matter. The presence of dark matter is inferred from its gravitational effects on visible matter. The presence of noncoding DNA is known from experiments that detect and characterize it just as they do coding DNA. Perhaps the metaphor means that the sequence of “dark matter” DNA cannot be deduced from the structure of a protein made in a cell. This, however, is like saying that dark matter is matter that cannot be seen with the naked eye. And that is not what astronomers mean by dark matter.

4. E.g., House Committee on the Judiciary, Report on the DNA Analysis Backlog Elimination Act of 2000, 106th Cong., 2d Sess., H.R. Rep. No. 106-900(1), at 27 (“the genetic markers used for forensic DNA testing … show only the configuration of DNA at selected ‘junk sites’ which do not control or influence the expression of any trait.”); New York State Law Enforcement Council, Legislative Priorities 2012: DNA at Arrest, at 5, (“The pieces of DNA that are analyzed for the databank were specifically chosen because they are ‘junk DNA.’”).

5. Kolata, supra note 3.

6. Interview by Roger Peng with Steven Salzberg, podcast on Simply Statistics, Sept. 7, 2012, (“Why do they feel a need to say that 80% of the genome is functional? … They know it’s not true. They shouldn’t say it. … You don’t distort the science to get into the headlines.”).

7. Laurence A. Moran, The ENCODE Data Dump and the Responsibility of Scientists, Sept. 6, 2012, (“This is, unfortunately, another case of a scientist acting irresponsibly by distorting the importance and the significance of the data.”).

8. Ewan Birney, ENCODE: My Own Thoughts, Sept. 5, 2012.

9. Post-splicing processing of a small fraction of the RNA from introns can produce noncoding RNAs that may regulate protein expression. L. Fedorova & A. Fedorov, Puzzles of the Human Genome: Why Do We Need Our Introns?, 6 Current Genomics 589, 592 (2005).

I am grateful to Eileen Kane for explaining some of the molecular biology to me. This entry is cross-posted to the Forensic Science, Statistics, and the Law Blog.

Trashing Junk DNA

You have seen this week’s headlines:

  • Bits of Mystery DNA, Far From ‘Junk,’ Play Crucial Role (New York Times)
  • ‘Junk DNA’ Concept Debunked by New Analysis of Human Genome (Washington Post)
  • ‘Junk DNA’ Debunked (Wall Street Journal)
  • Breakthrough Study Overturns Theory of ‘Junk DNA’ in Genome (Guardian)

Or maybe you heard MSNBC report that the data from ENCODE “shows us living beyond our genes” (whatever that means), or listened to CBC intone that “‘Junk DNA’ has a purpose” (sounds divine), or saw the Independent’s mishugina announcement that “Scientists Debunk ‘Junk DNA’ Theory to Reveal Vast Majority of Human Genes Perform a Vital Function!” As if we did not know that genes were functional and important?

The level of hype here is phenomenal. (Some useful clarification can be found at the Nature News blog). In the next few days, I hope to post some quick thoughts on what the ENCODE figures (like 80%) being bandied about for the “functional” or “biologically active” fraction of the human genome mean for the loci used in forensic DNA identification.

Cross-posted to Forensic Science, Statistics, and the Law
(If any readers have insights to share, send me an email at kaye at, and I’ll try to use them. I am still educating myself about some of the details of gene regulation and can use any help I can get.)

The Constitutionality of DNA Collection Before Conviction: An Updated Scorecard

Note: This scorecard has been superseded. Please check for later editions.

Fifteen years ago, Louisiana adopted a law mandating that “[a] person who is arrested for a felony sex offense or other specified offense . . . shall have a DNA sample drawn or taken at the same time he is fingerprinted pursuant to the booking procedure.” Today, the movement to acquire DNA from individuals not convicted of a crime and to check it against state and national databases of DNA profiles from unsolved crimes is snowballing. As of early 2012, 26 states and the federal government had laws providing for DNA sampling before any conviction is obtained. Most other countries with DNA databases also collect samples on arrest.

The DNA-on-arrest laws in the U.S. had a placid childhood, with surprisingly few challenges to their constitutionality. This period of calm is over. Conflicting opinions are emerging on the reasonableness of these searches under the Fourth Amendment. Within the next few years, it seems likely that, as Kansas State Representative Pat Colloton (R), who authored the bill that initiated her state’s DNA sampling program, predicted, “this issue will go to the United States Supreme Court.” (Gramlich 2006). In fact, if U.S. Supreme Court Chief Justice Roberts has his way, the Court will take up the issue in its 2012-2013 Term.

This posting presents a scoreboard on the litigation and scholarly commentary to date. If any players or contests have been omitted, I hope that readers will correct those omissions by leaving a comment. The law review articles listed in the table do not include ones on the constitutionality of convicted-offender databases. Authors who have contended that these databases are unconstitutional would reach the same conclusion for a database that includes arrestees, but the lower courts have resoundingly rejected their analyses. Therefore, little would be gained by keeping track of the many articles on convicted-offender databases.

The tables make the point that as yet there is no consensus on the constitutionality of taking DNA samples during a custodial arrest with the intention of running database searches (in the absence of a warrant and probable cause to believe that the search will produce a hit in the database).

Table 1. Case law (as of August 17, 2012)

Appellate: State Supreme Courts (1.5-1.5)

  • Mario W. v. Kaipio, Commissioner, No. CV-11-0344-PR (Ariz. June 27, 2012) (state arrestee law for juveniles constitutional insofar as it allows sampling as a booking procedure, but pre-conviction analysis of the sample is unconstitutional under a totality-of-the-circumstances standard and an analogy to searching containers)
  • King v. State, 42 A.3d 549 (Md. 2012) (state arrestee law unconstitutional “as applied” under “totality of the circumstances” balancing test), pet. for cert. filed, Aug. 14, 2012
  • Anderson v. Commonwealth, 650 S.E.2d 702 (Va. 2007) (state arrestee law upheld under unspecified balancing test and analogy to fingerprinting as a booking procedure)
  • Related case: State v. Franklin, 76 So.3d 423 (La. 2011) (no search warrant was required to take a DNA sample from a murder defendant for use in the murder investigation because he had to submit a sample “as a routine incident of booking” anyway)

Appellate: State Intermediate Courts (opinions not reviewed by higher courts) (0-2)

  • People v. Buza, 129 Cal.Rptr.3d 753 (Cal. Ct. App. 2011) (unconstitutional under balancing tests), rev. granted, 262 P.3d 854 (Cal. 2011)
  • In re Welfare of C.T.L., 722 N.W.2d 484 (Minn. Ct. App. 2006) (state arrestee law struck down as per se unreasonable without probable cause and a warrant)

Appellate: Federal Courts (2-0)

  • United States v. Mitchell, 652 F.3d 387 (3d Cir. 2011) (en banc) (federal arrestee law upheld under “totality of circumstances” balancing test)
  • Haskell v. Harris, 669 F.3d 1049 (9th Cir. 2012) (state arrestee law upheld under “totality of circumstances” balancing test), reh’g en banc granted, 2012 WL 3038593 (July 25, 2012)
  • United States v. Pool, 621 F.3d 1213 (9th Cir. 2010) (federal arrestee law upheld under “totality of circumstances” balancing test), vacated as moot, 659 F.3d 761 (9th Cir. 2011) (en banc)
  • Related case: Friedman v. Boucher, 580 F.3d 847 (9th Cir. 2009) (an arrest does not justify DNA sampling without an applicable statute)

Trial Courts: Federal (not reviewed by higher courts) (1-1)

  • United States v. Thomas, No. 10-CR-6172 CJS, 2011 WL 1627321 (W.D.N.Y. Apr. 27, 2011) (federal arrestee law upheld under “special needs” balancing test), dismissed, No. 11-1742 (2d Cir. Sept. 20, 2011), ECF No. 43.
  • Amended Order Denying the Government’s Motion to Compel DNA Samples, United States v. Frank, No. CR-092075-EFS-1 (E.D. Wash. Mar. 10, 2010) (applying totality balancing to a limited list of interests to find compulsory collection before conviction unreasonable)
  • Related case: United States v. Purdy, No. 8:05CR204, 2005 WL 3465721 (D. Neb. 2005) (forcibly taking a buccal swab from an arrestee violates Fourth Amendment in the absence of a statute providing for a uniform and limited system of sampling)

Trial Courts: Federal (reviewed by higher courts) (2-1)

  • United States v. Mitchell, 681 F.Supp.2d 597 (W.D.Pa. 2009) (federal law held unenforceable), rev’d, 652 F.3d 387 (3d Cir. 2011) (en banc)
  • United States v. Pool, 645 F.Supp.2d 903 (E.D.Cal. 2009) (federal arrestee law upheld under “totality of circumstances” balancing test), aff’d, 621 F.3d 1213 (9th Cir. 2010), affirming opinion vacated as moot, 659 F.3d 761 (9th Cir. 2011) (en banc)
  • Haskell v. Brown, 677 F.Supp.2d 1187 (N.D. Cal. 2009) (denying a preliminary injunction against the enforcement of California’s arrestee sampling law in large part because the balance of interests establishes that the requirement is reasonable), aff’d sub nom. Haskell v. Harris, 669 F.3d 1049 (9th Cir. 2012)

Table 2. Law Review Articles and Notes (as of August 17, 2012)


  • D.H. Kaye, The Constitutionality of DNA Sampling on Arrest, 10 Cornell J.L. & Pub. Pol’y 455-508 (2001) (a statute with sufficient protections of private, nonidentifying information is constitutional under the special needs exception)
  • Tracey Maclin, Is Obtaining an Arrestee’s DNA a Valid Special Needs Search Under the Fourth Amendment? What Should (and Will) the Supreme Court Do?, 34 J.L. Med. & Ethics 165, 178-82 (2006) (predicting that the Supreme Court will uphold taking DNA from arrestees under a balancing test but that it should reject the practice as per se unreasonable)
  • D. H. Kaye, Who Needs Special Needs? On the Constitutionality of Collecting DNA and Other Biometric Data from Arrestees, 34 J.L. Med. & Ethics 188 (2006) (proposing a “biometric information exception” to the warrant requirement)
  • Brian Gallini, Step Out of the Car: License, Registration, and DNA Please, 62 Ark. L. Rev. 475 (2009) (Arkansas law unconstitutional because it does not require a judicial finding of probable cause for the arrest, contains inadequate safeguards to protect the samples and records, and does not fall within an established exception to the warrant requirement)
  • Kevin Lapp & Joy Radice, A Better Balancing: Reconsidering Pre-Conviction DNA Extraction from Federal Arrestees, 90 N.C. L. Rev. Addendum 157 (2012) (pre-conviction DNA extraction should be permitted only after a neutral third-party finding of probable cause and DNA samples should be destroyed)
  • —, Drawing Lines: Unrelated Probable Cause as a Prerequisite to Early DNA Collection, 91 N.C. L. Rev. Addendum No. 1 (forthcoming 2012)
  • David H. Kaye, A Fourth Amendment Theory for Arrestee DNA and Other Biometric Databases, 15 U. Pa. J. Const. L. No. 4 (forthcoming 2013)
  • Related article: Robert Molko, The Perils of Suspicionless DNA Extraction of Arrestees Under California Proposition 69: Liability of the California Prosecutor for Fourth Amendment Violation? The Uncertainty Continues in 2010, 37 W. St. U. L. Rev. 183 (2010) (reaching no conclusions)


  • Martha L. Lawson, Note, Personal Does Not Always Equal “Private”: The Constitutionality of Requiring DNA Samples from Convicted Felons and Arrestees, 9 Wm. & Mary Bill Rts. J. 645 (2001) (the government’s interest in mandatory testing of all those arrested outweighs individuals’ privacy interests)
  • Renée A. Germaine, Comment, “You Have the Right to Remain Silent. . . You Have No Right to Your DNA” Louisiana’s DNA Detection of Sexual and Violent Offender’s Act: An Impermissible Infringement on Fourth Amendment Search and Seizure, 22 J. Marshall J. Computer & Info. L. 759 (2004) (unconstitutional under balancing test other than special needs)
  • Robert Berlet, Comment, A Step Too Far: Due Process and DNA Collection in California after Proposition 69, 40 U.C. Davis L. Rev. 1481 (2007) (with certain modifications, arrestee DNA sampling as provided for under California law would be constitutional)
  • John D. Biancamano, Note, Arresting DNA: The Evolving Nature of DNA Collection Statutes and Their Fourth Amendment Justifications, 70 Ohio St. L.J. 619 (2009) (unconstitutional under special needs and totality of the circumstances tests)
  • Corey Preston, Note, Faulty Foundations: How the False Analogy to Routine Fingerprinting Undermines the Argument for Arrestee DNA Sampling, 19 Wm. & Mary Bill Rts. J. 475 (2010)
  • Ashley Eiler, Note, Arrested Development: Reforming the Federal All-Arrestee DNA Collection Statute to Comply with the Fourth Amendment, 79 Geo. Wash. L. Rev. 1201, 1220 (2011)
  • Lauren N. Hobson, Note, North Carolina’s Arrested Development: Fourth Amendment Problems in the DNA Database Act of 2010, 89 N.C. L. Rev. 1309 (2011) (unconstitutional because no existing exception to the Warrant Clause applies)
  • Kimberly A. Polanco, Note, Constitutional Law-The Fourth Amendment Challenge to DNA Sampling of Arrestees Pursuant to the Justice for All Act of 2004: A Proposed Modification to the Traditional Fourth Amendment Test of Reasonableness, 27 U. Ark. Little Rock L. Rev. 483 (2005) (constitutional under a balancing test)
  • Related note: Jacqueline K. S. Lew, Note, The Next Step in DNA Databank Expansion? The Constitutionality of DNA Sampling of Former Arrestees, 57 Hastings L.J. 199 (2005) (unconstitutional as applied to “former arrestees”)


John Gramlich, States Collecting DNA from Arrestees, July 27, 2006, accessed Nov. 28, 2009

Martin Kaste, Wash. Lawmakers Fight for DNA Sampling at Arrest, All Things Considered, Feb. 28, 2012, accessed Aug. 17, 2012

15 La. Rev. Stat. § 609(A)(1) (“A person who is arrested for a felony sex offense or other specified offense, including an attempt, conspiracy, criminal solicitation, or accessory after the fact of such offenses on or after September 1, 1999, shall have a DNA sample drawn or taken at the same time he is fingerprinted pursuant to the booking procedure.”), derived from Act No. 737, approved July 9, 1997, and amended in 2003 (adding the phrase “including an attempt, conspiracy, criminal solicitation, or accessory after the fact of such offenses”)