The Appliance of Science


2:00 pm - December 10th 2008

by Unity    



Putting aside an apparent misunderstanding with Douglas as to the purpose of my recent piece on the DWP’s decision to make widespread use of ‘lie detectors’ on benefit claimants, what was also questioned was the value and utility of taking the time to explain some of the science behind these systems.

It took a few days and a quick exchange of comments under a parallel article at the Ministry for Alex Harrowell to come up with the goods by tracking down the patent for this system, which supplies the basic technical information required to make a few definitive statements about its viability as a device for ‘sniffing out’ benefit cheats.

So, let’s get straight to the $64,000 question – does it actually work?

Over the telephone?

No – and I would be very doubtful of it being much more successful in a face-to-face interview situation either.

Sticking with the telephony side of things for the moment, there are three very basic problems with this system which render it next to useless for its claimed purpose.

1. In order to carry out any kind of analysis, the system needs a reference sample of the subject’s voice taken while the subject is at rest and in an ‘emotionally neutral’ state. In practice, this means that the system will try to grab half a second’s worth of speech from a call based, one would expect, on what would be considered a neutral element of the conversation. The very first thing an operator will do, if you call any kind of call centre, is take your personal details such as your name and address, and it’s likely to be this portion of the call from which the system will attempt to extract a reference sample.

But the mere fact that these questions are, in themselves, neutral in no sense guarantees that the caller is in an emotionally neutral state.

On a cold call, neither the operator nor the system has any way of knowing the emotional state of the caller, unless the caller makes it perfectly obvious by yelling at them down the phone. In a face-to-face interview, the interviewer has some control over the environment and can make a concerted effort to create a neutral atmosphere in order to get the reference sample the system needs; over the phone, there’s no way to be sure that any analysis carried out is based on a valid reference point.

2. The primary voice stress analysis element of the system, which the patent confirms is based on the Lippold Tremor, will not work on telephone calls AT ALL.

This is a matter of basic physics/electronics.

All telephone systems, analogue and digital, use low and high band filters to limit the bandwidth of the voice signal for transmission (which is why the voice of the person on the other end of the phone often sounds a bit ‘tinny’). The conventional telephone voice band runs from roughly 300Hz to 3.4kHz, so the standard low band filter used on land-line and mobile phone systems removes the frequency range (8-12Hz) that the system needs to examine in order to analyse a piece of speech for evidence of stress.

The system cannot reliably assess whether a telephone caller is stressed because the phone system fails to provide it with anything to analyse.

3. Much of the additional ‘testing’ that the system does in order to, purportedly, identify whether someone may be lying appears to be based on scanning for short pauses and hesitations in speech, which the patent supposes may indicate that the caller is having to think about their answers rather than respond entirely spontaneously. In themselves these aren’t reliable indicators, but even if they were, there are a thousand and one things that can affect a telephone signal as it passes from A to B and introduce artefacts capable of confounding this type of voice analysis: everything from variations in line resistance and electrical ‘noise’ to drop outs, digitisation errors and signal adjustments arising out of error correction routines, or simply the wrong kind of background noise at the wrong time or moving the phone towards or away from the mouth at a particular moment.
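To make that last point concrete, here is a minimal sketch of the kind of pause detection described in point 3. It is not the patent's method: the `detect_pauses` function, the energy threshold, the minimum run length and the frame values are all hypothetical, chosen only to show that a detector working from signal energy has no way of telling a hesitation from a line drop out.

```python
def detect_pauses(frame_energy, threshold=0.1, min_frames=3):
    """Return (start, end) frame index pairs for runs of quiet frames.

    Hypothetical detector: a 'pause' is any run of at least min_frames
    consecutive frames whose energy falls below threshold.
    """
    pauses = []
    run_start = None
    for i, energy in enumerate(frame_energy):
        if energy < threshold:
            if run_start is None:
                run_start = i          # a quiet run begins here
        else:
            if run_start is not None and i - run_start >= min_frames:
                pauses.append((run_start, i))
            run_start = None           # speech resumed, reset the run
    # Handle a quiet run that lasts until the end of the signal.
    if run_start is not None and len(frame_energy) - run_start >= min_frames:
        pauses.append((run_start, len(frame_energy)))
    return pauses


# Made-up per-frame energies for a caller speaking fluently.
fluent_speech = [0.8, 0.9, 0.7, 0.85, 0.9, 0.75, 0.8, 0.9, 0.7, 0.8]

# The same speech, but the line drops out for four frames mid-sentence.
line_dropout = list(fluent_speech)
line_dropout[4:8] = [0.0] * 4

pauses_clean = detect_pauses(fluent_speech)
pauses_dropout = detect_pauses(line_dropout)
print("fluent speech:", pauses_clean)
print("with drop out:", pauses_dropout)
```

The caller said exactly the same thing in both cases; the only difference is what the line did to the signal, yet the second run reports a ‘hesitation’.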

Many of these problems can be controlled for and largely eliminated if a subject is interviewed face to face in a controlled environment, but before you do that you’ve got to identify the suspect, oops, subject first and this system is of little or no use whatsoever for that purpose.
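The filtering argument in point 2 can also be sketched in a few lines. This assumes, as the argument above does, that the tremor would have to show up as energy in the 8-12Hz range, and it stands in for the phone network with a simple first-order high-pass filter at an assumed 300Hz cutoff and an 8kHz sampling rate. Real telephone filters are a good deal steeper, so if anything this flatters the system. The function names are my own.

```python
import cmath
import math

SAMPLE_RATE = 8000   # Hz, assumed narrowband telephony sampling rate
CUTOFF_HZ = 300.0    # Hz, assumed lower edge of the telephone voice band


def tone(freq_hz, seconds=1.0, sample_rate=SAMPLE_RATE):
    """Generate a unit-amplitude sine wave."""
    count = int(seconds * sample_rate)
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(count)]


def high_pass(samples, cutoff_hz=CUTOFF_HZ, sample_rate=SAMPLE_RATE):
    """First-order (RC-style) high-pass filter, a gentle stand-in for the line."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(alpha * (out[-1] + samples[n] - samples[n - 1]))
    return out


def amplitude_at(samples, freq_hz, sample_rate=SAMPLE_RATE):
    """Estimate the amplitude of one frequency component by correlation."""
    total = sum(x * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate)
                for n, x in enumerate(samples))
    return 2.0 * abs(total) / len(samples)


# A 10Hz 'tremor' component plus a 1kHz component standing in for speech.
line_input = [t + s for t, s in zip(tone(10.0), tone(1000.0))]
line_output = high_pass(line_input)

tremor_gain = amplitude_at(line_output, 10.0) / amplitude_at(line_input, 10.0)
speech_gain = amplitude_at(line_output, 1000.0) / amplitude_at(line_input, 1000.0)
print(f"10Hz component surviving:   {tremor_gain:.1%}")
print(f"1kHz component surviving: {speech_gain:.1%}")
```

Even with this deliberately gentle filter, only a few percent of the 10Hz component gets through while most of the 1kHz component survives.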

Beyond the telephony issue, the information in the patent makes it possible to assess, on the basis of what it tells us about the different types of analysis it claims to carry out, just exactly which situations the system cannot cope with at all.

So, for starters, the system cannot make any kind of assessment if the subject is speech-impaired or even has a relatively common speech impediment, such as a stammer.

Likewise, the presence of any kind of cognitive impairment, whether permanent or temporary, will cause major problems for the system and the range of possible confounding factors is huge, covering a broad range of neurological disorders and brain injuries, learning disabilities and a host of psychological conditions and disorders all the way to alcohol and drug use. Based on the information in the patent I think it’s fair to say that the system could easily be thrown out of kilter if a subject does nothing other than drink a good strong cup of coffee or a can of Red Bull a few minutes before calling the local council.

An individual’s command of the English language will be a significant factor in those elements of the analysis which rely on looking at speech patterns (i.e. pauses and hesitations). There is no way of distinguishing, from an analysis of speech patterns, between someone who gives a hesitant answer because they’re thinking up a lie on the spot and someone who pauses and hesitates because English is not their first language and they’re mentally searching for the right word. Indeed, this problem is not necessarily confined to those for whom English is a second language as, depending on how the system is configured, anyone whose command of spoken English is less than ‘average’ could find themselves flagged as high risk by the system because their limitations cause them to struggle to ‘get their words out’.

And, in some cases, a subject’s otherwise entirely normal personality traits and characteristics may come into play to confound the system – amongst the claims incorporated into the patent are two which point to the possibility that, for some people, being simply who they are could create a problem for them.

Embarrassment Level: Is your subject feeling comfortable, or does he feel some level of embarrassment regarding what he or she is saying?

Deep Emotions: What long-standing emotions does your subject experience? Is he or she “excited” or “uncertain” in general?

Some people just find the whole business of answering personal questions embarrassing or uncomfortable regardless of the circumstances in which such questions are put to them, in which case the use of a system such as this might well lead councils to investigate or penalise individuals for the ‘crime’ of being very shy or easily embarrassed or a bit excitable. Although it’s claimed that the reference sample will allow the system to adjust for such traits, the validity of any such adjustment is entirely contingent on getting a valid reference sample to start with, for which there is no absolute guarantee.

In writing the first article, I suspected that the system was likely to be a complete crock, but I couldn’t be sure simply because there was insufficient information available from which to reach any definite conclusions.

One blog post and a bit of digging around by other bloggers later, we now have enough information to state, with a considerable degree of confidence, that these systems are:

1. The next best thing to useless when used to ‘evaluate’ cold callers contacting councils by telephone, and

2. So likely to generate false and faulty results on a wide range of potential subjects that there’s no scientific or ethical basis one can think of that would justify their use – and the ‘evidence’ they generate is inadmissible in court anyway so it’s not clear what kind of legal standing they might have either.

And we also have a pretty solid list of confounding factors the presence of any of which in an individual should rule out the use of this system, a list which includes:

Any speech impairment

Any cognitive impairment, including neurological disorders, brain injury and strokes, particularly one which impacts on memory or language processing.

Any psychological disorder or mental illness

Use of any kind of drug which affects an individual’s cognitive performance, which can include prescription and non-prescription drugs, alcohol and illegal narcotics and may even include perfectly common substances like nicotine and caffeine.

Any learning disability or difficulty that affects cognitive performance or an individual’s language skills, even a very mild one which leaves an individual with slightly less than average intelligence.

English as a second language.

Certain normal personality traits.

We can also infer from this that the age of a subject is a significant factor, necessitating the recalibration of the system if used to assess a subject who is either approaching or has already passed retirement age – and yes, the patent does indicate that the ‘sensitivity’ of the system can be adjusted by the end user to ‘improve’ its accuracy or to try to control for and limit the possibility of it throwing out false positives…

…which is, of course, something else that could go wrong as a poorly calibrated system will only serve to increase the number of erroneous results.

Harrow Council, apparently, spent £63,000 on piloting this system on their victims, oops, council tax and housing benefit claimants, which is a fair old sum of money and maybe enough to put some councils (and councillors) off buying into this system.

And if that’s you then, based on the results of Harrow’s pilot, I’d suggest you consider a much more cost effective alternative; one that is so straightforward to use that you can have your own local benefit claimant intimidation system up and running as quickly as you can run a press release past your local papers and all for an absolutely minimal initial outlay…

…just try a few of these.

For the princely sum of £1.40 for 5 (inc VAT) you’ll get an almost identical ‘hit rate’ to Harrow Council’s for a fraction of the cost, as long as you pick only the one number as an indicator of a possible benefit cheat and stick to it, and with between 8 and 10 different colours available from most good suppliers you’re sure to find something to suit your council’s corporate image.
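For anyone who wants to check the performance of the budget alternative before committing to a colour, here is a rough simulation. It assumes an eight-sided die (a D8) and a purely hypothetical fraud rate of 1%; neither that figure nor the population size comes from Harrow’s pilot.

```python
import random

random.seed(2008)  # fixed seed so the run is repeatable

POPULATION = 100_000
ASSUMED_FRAUD_RATE = 0.01  # hypothetical, not taken from Harrow's pilot
CHOSEN_NUMBER = 3          # the one number the council sticks to

honest = cheats = flagged_honest = flagged_cheats = 0
for _ in range(POPULATION):
    is_cheat = random.random() < ASSUMED_FRAUD_RATE
    flagged = random.randint(1, 8) == CHOSEN_NUMBER  # roll the D8
    if is_cheat:
        cheats += 1
        flagged_cheats += flagged
    else:
        honest += 1
        flagged_honest += flagged

honest_flag_rate = flagged_honest / honest
cheat_flag_rate = flagged_cheats / cheats
share_of_flagged_who_cheat = flagged_cheats / (flagged_cheats + flagged_honest)

print(f"honest claimants flagged:   {honest_flag_rate:.1%}")
print(f"cheating claimants flagged: {cheat_flag_rate:.1%}")
print(f"flagged who are cheats:     {share_of_flagged_who_cheat:.1%}")
```

The die flags about one claimant in eight whether they are cheating or not, so the people it ‘catches’ are no more likely to be cheats than anyone else in the queue.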

Now, who do I invoice to get my consultancy fee?



About the author
'Unity' is a regular contributor to Liberal Conspiracy. He also blogs at Ministry of Truth.


Story Filed Under: Blog



Reader comments


Quite so Unity.. quite so

2. Mike Killingworth

Is there any evidence of public outrage at a local authority failing to be “tough enough” on benefit claimants?

In cost-benefit terms, if the fraud rate is established (let’s say it’s 1%) would it cost a local authority £63,000 to promote a private Bill empowering it to reject 1% of claims at random? After all, we all hate claimants, so there couldn’t possibly be an ethical objection.

3. douglas clark

Hmm…

Unity. This post at least made sense.

It acknowledges the simple fact, that lie detectors are shite.

Could I suggest to you that the challenges to lie detectors have a good and honourable history? Could I suggest to you that most effects are in fact interpretable in a number of ways, and that there is little or no peer reviewed support for this woo woo?

You’d seem to accept that, now.

As you are much more connected than I am, could you please tell your political masters that this is, indeed, the piece of shite you and I say it is?

It is, by the way, a form of psychological torture. As it is not evidence based it is an attempt to play God on the part of the inquisitor. With confessional results that have nothing to do with reality.

Such is stupid.

Douglas:

If you’re serious about challenging this then the ‘it’s just woo’ argument doesn’t fly – we have to put the evidence base behind any challenge and that evidence base has to acknowledge that there are at least some valid scientific foundations.

The Lippold tremor is real, and I’ve read several of Lippold’s original papers, the earliest of which I’ve been able to track down dates to 1957. Likewise, the underlying technical premise of the system is real in the sense that it ‘works’ using the same base technical methods as any other voice recognition system – and it’s that which ‘elevates’ this system from being mere woo to being a misapplication of science.

Unless you acknowledge that element then you lose the argument straight away. You can’t just dismiss that component of the debate, you have to explore and expose its limitations, giving the context that the company behind this system is trying to obscure in order to sell it to the suckers in government.

You need to realise that, so far as the DWP is concerned, this system works simply because it delivers an outcome they want – a reduction in the number of benefit claimants – anything else is gravy.

Whether it does that in a fair manner is, to a significant extent, immaterial in their eyes and, joking aside, if some of these bastards really thought they could get away with rolling a D8 and investigating people at random then they would – it’s only the likelihood that such an arbitrary approach would land them with a shedload of bad press and an awkward judicial review under HRA that would hold them back from such a move.

So, we need a solid counter argument and of the ones open to us, by far the strongest isn’t the ‘it doesn’t work – full stop’ argument, it’s actually the extensive list of confounding factors that would compromise the system even if it did work because many, if not most, of these cut across statutory duties to promote equality and are, arguably, discriminatory –

- and in the context of this debate (and knowing how the government think) there’s the focal point of the attack.

If the system can be shown to be more likely to generate false results when fed a subject who has a speech or cognitive impairment or has English as a second language due to its technical limitations – and I think it can – then you have a legal basis to challenge its use.

We have to use the system we have to get after this, and unless the Commons Public Administration committee takes an interest in the question of whether the government is spending money on a bit of technology that doesn’t work, then the best route through this is likely to be to push for the Commission for Equality and Human Rights to go after this as a discrimination issue or find a suitable plaintiff and a human rights lawyer who’s prepared to mount a HRA challenge pro bono.

5. douglas clark

Unity,

For goodness sake!

From ‘Assessing the Validity of Voice Stress Analysis in a Jail Setting’ , page 89 – conclusions:

These findings add to the growing literature on tests of voice stress analysis theory and devices. Even though early tests of the “theory” suggested that stress is related to measurable changes in voice patterns (Cestaro 1996, Smith 1997, Hansen 1996, Hansen and Zhou 1996, Haddad, Ratley, Walter and Smith 2002) it is not clear that VSA devices are able to distinguish stress from efforts to deceive (Haddad et al 2002) We are unable to find any peer reviewed and published studies that showed significant support for the effectiveness of VSA software to detect deception. All previously published research conducted in a lab setting has failed to find for VSA theory or technology (Brenner, Branscombe, and Schwartz 1979; Cestaro and Dollins 1996; Hollien, Geison and Hicks 1987, Horvath 1978, 1979; Janniro and Cestaro 1996; Lynch & Henry 1979; O’Hair, Cody, Wang and Chao 1990; Suzuki, Watanabe, Takeno, Kosugi, and Kosuya 1973; Timm 1983; Waln and Downey 1987) Some researchers have tried to test VSA products in “the field” but with limited success (Palmatier, n.d.: 1999, 2000). Our research therefore complements previous research by failing to find support for the VSA products in a real world setting.

I apologise to Messrs Damphousse, Pointon, Upchurch and Moore if there are any typing errors in the above quote. However it is only available in pdf, which is another governmental nonsense.

So, Unity, what have you and your chums added to that debate?

All that Lippold proved was that this subsonic component of voice was a fight or flight response.

Which applies equally to the innocent as to the guilty.

I stand by what I said. This is woo woo science.

What might be useful is getting this ridiculous idea confronted, right now, in the judicial system. As both you and Mike have pointed out, this is a lottery. The fact that politicians take it seriously suggests that it is not only Washington that is crawling with vested interests.

6. douglas clark

Unity,

Whether it does that in a fair manner is, to a significant extent, immaterial in their eyes and, joking aside, if some of these bastards really thought they could get away with rolling a D8 and investigating people at random then they would – it’s only the likelihood that such an arbitrary approach would land them with a shedload of bad press and an awkward judicial review under HRA that would hold them back from such a move.

OK.

I am quite willing to argue beside you on this anti-science, anti-authoritarian platform.

It is just a bit disheartening that we have to do it, don’t you think?

No, Douglas.

1. What it is, is a piece of moderately interesting science, albeit one of very limited application – and actually an offshoot of some of the very early efforts to try to understand conditions like Parkinson’s Disease – one which has been grossly misapplied to a purpose founded on a false premise.

Woo is stuff like homoeopathy and astrology which has no scientific foundations whatsoever and science is science and can either be correctly or incorrectly applied, the latter being the situation here.

2. This is NOT ultimately a scientific debate because the decision being challenged here is not a scientific decision but a political one.

If it was simply a matter of science then there’s no way anyone would consider using these systems, but as I noted in my previous comment, what the government is interested in here is not the scientific validity of the system, other than in the limited sense it provides a modest level of cover for the decision to use the system, but a very simple bottom line – how much money does the system save by propelling people off benefits.

That alone dictates that a straightforward challenge to the scientific validity of the system is extremely unlikely to work, unless you can convince someone like the Public Accounts Committee that the lack of scientific evidence makes the system a waste of money, which is a tough ask as long as it appears that it’s saving the public purse more than it costs to use it.

One of the political dimensions to this issue which hasn’t been raised yet is that, on the last published figures, the government spent £154 million on anti-fraud measures and investigations related to benefit claims but only saved £106 million as a result of successful investigations. Currently, the government is spending roughly 45% more on combating benefit fraud than it’s saving by detecting it, while using this system in Harrow, on paper, saved between three and five times as much in withdrawn claims as it cost to put in the system.
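The arithmetic behind those figures is easy enough to check. The sketch below uses the £154 million and £106 million quoted above, and assumes that the £63,000 Harrow spent on its pilot is the ‘cost to put in the system’ against which the three to five times saving is measured; that link is my assumption, and the variable names are just for illustration.

```python
# National figures quoted above, in millions of pounds.
ANTI_FRAUD_SPEND_M = 154
FRAUD_SAVINGS_M = 106

# How much more is spent on anti-fraud work than is recovered, as a percentage.
overspend_pct = (ANTI_FRAUD_SPEND_M - FRAUD_SAVINGS_M) / FRAUD_SAVINGS_M * 100

# Harrow pilot: assumed cost of £63,000, with claimed savings of 3x to 5x.
HARROW_PILOT_COST = 63_000
harrow_savings_range = (3 * HARROW_PILOT_COST, 5 * HARROW_PILOT_COST)

print(f"National overspend: {overspend_pct:.1f}% more spent than saved")
print(f"Harrow, on paper: £{harrow_savings_range[0]:,} to "
      f"£{harrow_savings_range[1]:,} in withdrawn claims")
```

On those assumptions the national programme loses money while the Harrow pilot appears to return a six-figure sum, which is exactly why the government likes the look of it.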

As far as the government is concerned, the system works, because it appears to save them considerably more money than it costs to use the system – and beyond that they couldn’t give a toss about the technical arguments…

…unless you can pin them on a discrimination charge.

You have to have valid legal grounds to mount a judicial review and, sadly, the mere fact that a piece of technology fails to do what it’s supposed to is not enough to get a case into court, unless you can persuade the government to sue the developer for selling it a pup.

In theory, an individual facing a prosecution for benefit fraud after being ‘fingered’ by this system could try to challenge the legal validity of the prosecution based on the fact that the trigger for the investigation derives from ‘evidence’ that is inadmissible in court. But you’re hardly going to be batting on a friendly wicket by putting up an alleged benefit fraudster at the spearhead of efforts to invalidate the use of the system, which still leaves you to find a valid legal basis for challenging its use – which is where that list of confounding factors, and the fact that several of them make the use of the system highly questionable under things like DDA and, potentially, the Race Relations Act, comes into play.

If that system cannot process voice data provided by a disabled subject, or by a subject for whom English is a second language, on exactly the same basis that it would any other subject, then you have a discrimination claim that, if successful, would force the government to either pull out of using the system or find some means of screening claimants and discarding those whose disability or ethnic background is a confounding factor.

The first outcome is a straight win, the second would massively increase the administrative overheads and costs that using the system entails and almost certainly kill it as a viable proposition – either way you skin the cat.

That’s what’s emerging from this debate, a valid legal basis from which to challenge the government’s decision to roll out this system.

8. douglas clark

Unity,

Don’t agree.

Woo is the misapplication of science as much as it is the non application of science. Or, if you prefer, coming to a conclusion based on dodgy, sciency, evidence. Ben Goldacre is very good on this. As are his nemesis, the drug and cosmetic companies’ PR wings. Ben is a good guy, IMVHO, whereas, well, you fill in the details….

Here is an example of a Local Authority trying to sound, err, authoritative, and getting taken to bits:

http://tinyurl.com/5htsn9

Which is exactly where we are with VSA. It is clearly wrong to link stress and lying. They are not the same things. And it is clearly wrong to deny the gross failures of the technique or the apparatus to actually measure anything beyond random chance.

Anyway, if what you say is correct, then we are being comprehensively lied to. It seems to me that we need a double blind test of this product in exactly the same way as we would test a clinical drug. So, why aren’t we?

I find the idea of lie-detecting customers utterly repellant, even if the system worked – and I’m in the kind of business where I might be forced to administer it.

If you want to fuck up your reference sample try wanking when you phone. If you can time your vinegar strokes to coincide with you answering your name you’ll be fine.

If it makes things more easier I’ll just point out that all call centre staff are young, horny and up for it.


Reactions: Twitter, blogs
  1. The Charlatan and the DWP | Ministry of Truth

    [...] A couple of months ago, myself and Alex Harrowell did a bit of digging into the background of the company whose ‘voice risk analysis’ technology is being introduced by the DWP as a means of screening benefits claimants for the possibility that they may be committing fraud. (see here and here) [...]




