Do. Not. Assume.

The EU Prüm Treaty is on its way to implementation.  I can sense the excitement mounting.  Ah, the Prüm Treaty, you cry!  At last, at long last, as the tears of joy roll gently down your cheeks.  Of course there may be the odd one or two individuals who are not quite certain what the Prüm Treaty is (not an unreasonable position to find oneself in).  The Prüm Treaty will create a super-network of European DNA databases, allowing the police of EU nations to trawl through millions of DNA records.

During the last session of the Scottish Parliament there was quite a bit of enthusiasm among a few Labour MSPs for increasing the number of individuals whose data was held on the police DNA database.  It appeared to be their view that the larger the DNA database, the more effective it was.  Thus one might conclude that the Prüm Treaty will bring joy to their hearts.  It is fair to say that a larger database allows a greater probability of making a match.  However, the flip side of that coin is the greater probability of a false positive (incorrectly identifying an individual as a match).  Consider the statistics.  A match probability of one in ten thousand means that there is a one in ten thousand chance that any given innocent person will match purely by chance.  So if you check the DNA of ten people, a match probability of one in ten thousand appears pretty good.  However, if the database contains ten thousand profiles you would reasonably expect about one false positive on average (we are dealing in probabilities, so you have to allow for random variation).  If the database contains two million profiles you can expect around two hundred false matches.  The number of loci (positions within the DNA) sampled significantly influences the risk of a false match.
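The arithmetic above can be sketched in a few lines of Python.  This is only an illustration under simplifying assumptions that are mine, not the treaty's: every profile on the database belongs to an unrelated, innocent person, and each comparison is independent.  The function names are made up for the example.

```python
def expected_false_matches(database_size: int, match_probability: float) -> float:
    """Average number of chance matches when one crime-scene sample is
    compared against every profile on the database (assumes independence)."""
    return database_size * match_probability


def prob_at_least_one_false_match(database_size: int, match_probability: float) -> float:
    """Chance that at least one innocent profile matches by coincidence."""
    return 1.0 - (1.0 - match_probability) ** database_size


if __name__ == "__main__":
    one_in_ten_thousand = 1 / 10_000
    for size in (10, 10_000, 2_000_000):
        expected = expected_false_matches(size, one_in_ten_thousand)
        at_least_one = prob_at_least_one_false_match(size, one_in_ten_thousand)
        print(f"{size:>9} profiles: expect {expected:g} false matches, "
              f"P(at least one) = {at_least_one:.4f}")
```

The point is that the same one in ten thousand figure that looks reassuring against ten people becomes close to a guarantee of coincidental matches once the trawl covers millions.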

There is considerable debate over how many loci should be sampled; it varies between 10 and 15.  Both the number and, critically, the actual location of the loci used vary between EU states.  When comparing across states the sample may be reduced to only six loci (a number which no state considers reliable).  The new treaty increases the risk of a false match not only by considerably expanding the data volume, but also because it may result in an unacceptably low number of loci being compared.  However, false positives are not the only reason why an individual may be incorrectly identified.  DNA material is extremely tricky to handle and inadvertent contamination is relatively easy (when the data is collected, during testing, even in the manufacturing of laboratory equipment).
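To see why the number of loci matters so much, here is a rough sketch.  It assumes, purely for illustration, that each locus matches by chance one time in ten and that loci are independent; the real figure differs from locus to locus and population to population, so the per-locus value below is hypothetical and not taken from any forensic standard.

```python
def combined_match_probability(per_locus_probability: float, loci: int) -> float:
    """Chance that two unrelated profiles agree at every compared locus,
    assuming the loci are independent of one another."""
    return per_locus_probability ** loci


if __name__ == "__main__":
    PER_LOCUS = 0.1            # hypothetical illustrative value
    DATABASE_SIZE = 2_000_000  # same database size used in the text
    for loci in (6, 10, 15):
        p = combined_match_probability(PER_LOCUS, loci)
        print(f"{loci:>2} loci: match probability {p:.1e}, "
              f"expected false matches in database {DATABASE_SIZE * p:.1e}")
```

Under these assumptions every locus dropped multiplies the chance of a coincidental match tenfold, which is why falling back to six loci for cross-border comparisons is a concern.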

Peter Hamkin was a barman in Merseyside when, in 2003, he was arrested and accused of murdering a woman in Italy the year before.  The Italian police had requested that the English police search their DNA database and were informed that he was a perfect match.  After 20 days a second DNA test concluded that the ‘perfect match’ was somewhat less than perfect and that he could be ruled out (he was released without being charged).

DNA is fallible.

There is a not unnatural concern among various groups (scientists, civil libertarians etc.) that there will soon be a lot more Peter Hamkins.  It could be argued that that does not matter; he was released, no great harm done.  However, while the police are incorrectly focussing on an innocent individual, the guilty have more time to go to ground.  There is also the impact on that individual; mud sticks – especially for particularly nefarious crimes.  Corrupt police might decide that this suspect will do (especially if they are under serious pressure to solve the case).  There is finally the alarming possibility of DNA material being manufactured in order to provide the appearance of guilt (Frumkin, D. et al. (2010) Forensic Science International: Genetics, 4(2), 95-103).

The mud sticks scenario is likely to be exacerbated by a common statistical misunderstanding.  How often have you heard or read a news report stating that the DNA match is one in a million and then going on to suggest that, with odds like that, the individual must be guilty?  The problem with this type of report is that it conflates two probabilities: the probability of the DNA match being a false positive, and the probability of the individual’s innocence.  Think of it this way.  The probability of a chance DNA match has been established at one in a million.  However, the individual can clearly establish that not only was he on a different continent at the time of the crime, but that he was speaking to a packed hall of several hundred people.  Given all the information, one would be obliged to conclude that the chance of his being innocent is not one in a million; the probability of his guilt is in fact zero.  This is why Scots law likes corroborating evidence, and why DNA matches and fingerprints should not, of themselves, be proof of guilt.  (As a small aside, the actual science underpinning fingerprint evidence is very poor indeed.)
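For those who like to see the sums, the distinction can be sketched with Bayes' theorem.  The figures below are assumptions of mine for illustration only: a pool of 60 million people who could in principle have left the sample, a true culprit who is certain to match, and a one in a million chance of a coincidental match.

```python
def probability_of_guilt(prior: float, match_probability: float) -> float:
    """Posterior probability that the matched person is the source of the
    sample, given a match.  Assumes the true source always matches."""
    evidence = prior + (1.0 - prior) * match_probability
    if evidence == 0.0:
        raise ValueError("a match is impossible under these inputs")
    return prior / evidence


if __name__ == "__main__":
    ONE_IN_A_MILLION = 1e-6
    POOL = 60_000_000  # hypothetical number of possible culprits

    # No other evidence: the matched person is just one of the pool.
    print(probability_of_guilt(1 / POOL, ONE_IN_A_MILLION))

    # Cast-iron alibi: the prior is zero, so the match changes nothing.
    print(probability_of_guilt(0.0, ONE_IN_A_MILLION))

    # Strong independent evidence already pointed at this person.
    print(probability_of_guilt(0.5, ONE_IN_A_MILLION))
```

With nothing but the match, the probability of guilt in this sketch is under two per cent, not 999,999 in a million.  It is the other evidence, the corroboration, that moves the figure in either direction.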

How many cases have actually been solved by the DNA database, as against the suspect being identified through traditional means and the DNA being used as corroborating evidence?  There are relatively few cases where the database has actually caught a suspect.  The first DNA trawl was in the UK in 1987, and the Leicestershire police did identify the individual.  However, the guilty party was caught not because he was on the database, but because he was overheard trying to convince somebody else to provide a sample.  For most crimes which have been ‘solved’ by DNA evidence the suspect has been identified by standard police methods and then DNA has been used to corroborate guilt.  A Home Affairs Committee report in 2010 concluded “It is currently impossible to say with certainty how many crimes are detected, let alone how many result in convictions, due at least in part to the matching of crime scene DNA to a personal profile already on the database, but it appears that it may be as little as 0.3%” (Home Affairs Committee – Eighth Report The National DNA Database).

There is of course a range of civil liberties issues connected to DNA databases.  In brief, some of the issues are:

  • Erroneous convictions may result from an over-reliance on DNA databases.  Mistakes happen: partial samples, wrongly labelled samples, contaminated samples, false positive identifications etc.
  • Fingerprints tell us nothing about an individual’s disabilities, potential health problems, parentage.  DNA provides extensive private information on an individual.  When such information is held there is always a risk that it will be abused.  Today it may be the case that only ‘junk’ DNA is held, but DNA carrying information would be far more ‘useful’, and pressure is likely to grow to hold more informative datasets.
  • Will an individual who refuses to provide a sample be identified?  Perfectly innocent individuals might find themselves vilified for a quite reasonable refusal to provide a sample, “if he has nothing to hide he has nothing to fear”.
  • Will individuals from particular groups be more likely to find themselves targeted for inclusion within the database?
  • Phenotypic profiling can be used to identify racial or other characteristics. Let us imagine that the DNA from a crime scene is identified as being of a male from the Middle East; now imagine you are one of only six Arabs living on the local housing scheme.
  • Related to the above is the possibility that the crime sample reveals that the suspect suffers from a particular genetic abnormality.  Will the police be able to check hospital records for those being treated for this abnormality?  The end result may be the inadvertent publication of an individual’s most private details.
  • The number of innocent people on the database tends to expand constantly as more and more individuals are added: DNA dragnets, examining familial lines, innocent individuals who ‘look like a suspect’ (perhaps over six foot and black).
  • Being identified by a DNA ‘trawl’ twenty years after the event may make it very difficult for the suspect to prove their innocence.  As a DNA trawl can identify anybody on the database, without any prior suspicion, it is inevitable that some innocent individuals will be so identified (as Peter Hamkin was).  How would you prove that you were not on such and such a housing estate on 19 March 1999?
  • ‘Offline’ (not legally constituted) databases can be built.  As the collection of DNA expands the temptation to maintain unofficial databases increases.  Nobody likes to throw data away.  The temptation to hang on to data that should be discarded is likely to increase as time allows us to become accustomed to DNA records being kept.  To date offline databases have been discovered in both Louisiana and New York (Simoncelli & Krimsky (2007) A New Era of DNA Collections: At What Cost to Civil Liberties? American Constitution Society for Law and Policy).
  • The state may at some future time allow a wider use of the database.  The end result would be a) an increase in the likelihood of the data leaking, and b) the inclusion of an individual in a study/project/commercial enterprise regardless of their willingness to participate.

I am not arguing that we should not keep DNA records, or that DNA ‘fingerprinting’ cannot provide a significant tool in the battle against crime.  However, DNA databases are not a miraculous solution to all unsolved crimes, nor are they built and extended without considerable risk to personal civil liberties.  As with most risks to civil liberties this one is easily ignored, until you are identified.