An old-fashioned approach to evidence? Guilty as charged.

I’m old-fashioned, and I’ve just been upbraided for it. An article by Brian Monteith in the Scotsman made a number of claims which I thought rather far-fetched, so I looked at some other evidence. Monteith had written, at some length, that “the Euro currency project has been an economic catastrophe”, that since 1994 the growth of the US economy had far outstripped that of the Eurozone, and that if only the UK had not been within the EU we would all have been much richer. I checked some basic figures with the World Bank’s data and wrote this:

The idea that the Euro has been an ‘economic catastrophe’ is wishful thinking. Mr Monteith chose to start the clock in 1994. On the World Bank’s figures  income per capita in the Eurozone started in 1994 at $19516 and by 2017 had reached $43834, an increase of 125%. Income per capita in the USA started at $27350 in 1994 and finished at $60200 in 2017, an increase of 122%. It’s not a huge difference, but growth over time in the Eurozone more than kept pace with growth in the USA.

Growth in the UK, by contrast, was only 110% over the same period. If only our economic performance had been as good as the Eurozone’s.

This, I now know, was totally misguided, because it attracted this as a response:

You’re living in the past !….”Paul Spicker”
Any fool can quote PAST statistics !
Nothing to do with future prospects !

So there we are.  In the course of the last few years on the blog, I’ve tried to back up everything I say. The mistake I’ve been making all this time is to take statistics and evidence from the past, when they should have come from the future instead.  What I should have used is the crystal ball – I’m working on it.

Perhaps I should add that “Paul Spicker”, given inverted commas in the rebuke, is not an invented personality. I obviously lack the imagination that I need to contribute to social media.
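
For anyone who wants to check the past statistics quoted above, the percentage increases can be recomputed from the dollar figures as printed. This is only a sketch of the arithmetic; the function name is my own, and it uses no data beyond the four numbers in the letter. On those numbers the Eurozone comes out at about 125% and the USA at about 120%, a little below the 122% quoted, so one of the US figures may have been rounded or drawn from a slightly different series. The comparison is not affected.

```python
def percent_increase(start, end):
    """Percentage growth from a starting value to an end value."""
    return (end - start) / start * 100


# Income per capita in US dollars, 1994 and 2017, as quoted in the letter.
eurozone = percent_increase(19516, 43834)
usa = percent_increase(27350, 60200)

print(f"Eurozone: {eurozone:.1f}%")  # Eurozone: 124.6%
print(f"USA: {usa:.1f}%")            # USA: 120.1%
```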

Measuring hunger

In the process of catching up on stuff from the World Bank, I came across another ill-conceived paper, published in January:  The challenge of measuring hunger.  Different methods of measurement, the authors complain, yield radically different results:  “In our survey experiment, we calculate hunger to range between 19 and 68 percent”.  Perhaps they might have worked out from this that there is something fundamentally wrong with the approach they’re taking.  It’s more than fifty years since the indicators movement first argued that we had to stop thinking about social issues in terms of single, accurate, precise measures.  What we need are indicators, pointers or signposts – multiple sources of evidence where we look for direction, reinforcement and corroboration, rather than authoritative answers in tablets of stone.  Anything else is doomed to failure.

A little more on why we can't trust the statistics in published articles

I’ve referred earlier this week to the work of Ioannidis, who argues that most published medical statistics are wrong. The British Medical Journal regularly uses its Xmas issue to publish some disconcerting, off-beat papers.  In a previous issue, they produced the findings of a randomised control trial which showed an apparently impossible result: praying for people whose outcomes were already decided several years ago seemed to work.  The message:  don’t trust randomised control trials, because they’re randomised.  This year, an article, “Like a virgin”, identifies 45 women in a sample of nearly 5,500 who claim to have had a virgin birth. The message: don’t believe everything people tell you in surveys.    If only medical journals applied the same rigour to some of their ‘serious’ results.

11 more genes for Alzheimer's? Hardly

The reports of another supposed breakthrough in genetic research are, like so many before them, rather exaggerated. Last week, a New Scientist editorial commented that neuroscience

“is plagued by false positives and other problems. … Scientists are under immense pressure to make discoveries, so negative findings often go unreported, experiments are rarely replicated and data is often ‘tortured until it confesses’. … Genetics went through a similar ‘crisis’ about a decade ago and has since matured into one of the most reliable sciences of all.”

Yesterday the newspapers were stuffed with reports from that most reliable and mature of sciences, concerning the discovery of 11 genes newly implicated in the causation of Alzheimer’s. This is from the Independent:

The role of the immune system in defending the brain against Alzheimer’s disease has been revealed in a study identifying 11 new genes that could help to trigger the most common form of senile dementia.

There’s more than enough there to be able to tell that the report is confused. In the first place, Alzheimer’s disease is not a single disease entity; it’s a syndrome. The term is used as a residual category for any form of dementia where there isn’t as yet a clear understanding of the process. Over the years, the size of that residuum has gradually been reduced as various specific disease entities have been identified – Pick’s, Huntington’s, Parkinsonian dementia, Lewy body, CJD and so on. The process of refinement still has a long way to go. Second, there is no evidence that Alzheimer’s is genetically determined or ‘triggered’ by particular genes. The study does not actually claim to show that the immune system defends against Alzheimer’s. All it does is to identify a group of SNPs or snips (single nucleotide polymorphisms to their friends) associated with the immune system which show some association with the diagnosis of dementia. That’s an interesting finding, because it suggests that it may be worthwhile to examine immune systems to see what connections emerge. It’s not the same thing as showing that genes cause Alzheimer’s.

However, it’s not possible to exonerate the authors of the paper altogether of blame for the misrepresentation.  The title of the article, published in Nature Genetics, is:  “Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease”.  This does assume that the associations show ‘susceptibility loci’, and it emphasises that it’s a big study, which implies that it has greater authority as a result.   The conclusion suggests that what needs investigating is the potential association with the risk of Alzheimer’s.

There are three common errors here: the paper commits some of the cardinal sins of statistics.

  • Confusing association with causation.  An association doesn’t in itself tell us what the influence of genes is or what the direction of causation is.  It follows that association with certain genes doesn’t reveal susceptibility.
  • Confusing significance with risk factors.  A relationship can be highly statistically significant although its effects are very limited.   (On a graph, it’s the slope of the regression line that really matters rather than the closeness of fit of the observations).   It’s possible that some small part of the response is attributable to the associated factor, and in medical terms that’s potentially important – it could relate to a particular condition – but that’s not equivalent to a risk factor, and in any case the work done doesn’t identify that.
  • Fishing, or data mining.  In any very large body of data, there will be some unusual associations – it’s in the nature of the exercise.  It doesn’t follow that those associations can be invested with meaning.  This study fishes for the data in a massive pool – over 17,000 people with Alzheimer’s, over 37,000 controls and more than 7 million SNPs.  Then in stage 2 there were 8572 people with dementia, 11,312 controls and 11,632 SNPs.  The significance levels were strict (p < 5 × 10^-8), but the sheer size of the data sample makes the statistics more problematic, not less so.  The method can’t do more than suggest that some patterns merit further investigation.
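
The second point, that a relationship can be highly significant while its effect is small, can be illustrated with a rough calculation. This is a sketch under stated assumptions, not a reconstruction of the paper’s method: the carrier frequencies (32% of cases, 30% of controls) are invented for illustration, the sample sizes are the rounded figures mentioned above, and the test is an ordinary two-proportion z-test rather than anything the authors used.

```python
import math


def two_proportion_test(p_cases, n_cases, p_controls, n_controls):
    """Return (z, two-sided p-value) for a difference between two proportions."""
    pooled = (p_cases * n_cases + p_controls * n_controls) / (n_cases + n_controls)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    z = (p_cases - p_controls) / se
    # Two-sided tail probability of a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def odds_ratio(p_cases, p_controls):
    """Size of the effect: odds of carrying the variant, cases against controls."""
    return (p_cases / (1 - p_cases)) / (p_controls / (1 - p_controls))


# Hypothetical frequencies, chosen only to illustrate the point.
P_CASES, P_CONTROLS = 0.32, 0.30

effect = odds_ratio(P_CASES, P_CONTROLS)

# Rounded sample sizes of the order described in the study.
z_large, p_large = two_proportion_test(P_CASES, 17_000, P_CONTROLS, 37_000)

# The same frequencies in a sample one hundredth of the size.
z_small, p_small = two_proportion_test(P_CASES, 170, P_CONTROLS, 370)

print(f"odds ratio: {effect:.2f}")
print(f"large sample: z = {z_large:.2f}, p = {p_large:.2g}")
print(f"small sample: z = {z_small:.2f}, p = {p_small:.2g}")
```

The odds ratio is the same in both cases, only a little above 1. What changes is the p-value, which is driven by the size of the sample and not by the size of the effect.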

Confusion about poverty

An editorial in Friday’s Scotsman complains:

“People are classified as being poor if their income is less than 60 per cent of the UK median. Given this is a relative, as opposed to absolute, measure, then we can say with mathematical certainty that the poor will always be with us.”

I gave some examples of similar muddles in a paper I wrote last year (Why refer to poverty as a proportion of median income?, Journal of Poverty and Social Justice, Volume 20, Number 2, June 2012, pp. 163-175). The researchers who introduced the measure explained that the test “does not mean that there will always be poverty when there is inequality: only if the inequality implies an economic distance beyond the critical level.” However, people don’t understand averages or distributions – and journalists usually get where they are by studying words rather than numbers.

There are problems with the use of 60% of the median, but the supposition that it invents poverty isn’t one of them. The main problems are that it compares poor people with incomes that are not much better, that it assumes it’s always impossible for more than half the population to be poor, and that it’s not well understood. The main defence is that it works, more or less, for Europe and for the OECD countries. 60% of the median is primarily a test of very low income, and in countries where income distributions are more equal, poverty is much lower.
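
A small worked example may help to show why the “mathematical certainty” is nothing of the kind. The two income lists below are invented for illustration, and the function name is mine. Both have the same median, but in the first nobody falls below 60% of it.

```python
from statistics import median


def poverty_rate(incomes, fraction=0.6):
    """Share of incomes below a given fraction of the median income."""
    threshold = fraction * median(incomes)
    return sum(1 for income in incomes if income < threshold) / len(incomes)


# Hypothetical distributions with the same median (100).
compressed = [70, 80, 90, 100, 110, 120, 130]  # unequal, but nobody is below 60
stretched = [10, 20, 30, 100, 110, 120, 130]   # three incomes are below 60

print(poverty_rate(compressed))  # 0.0
print(poverty_rate(stretched))
```

The first distribution is still unequal, but it has no poverty on this test. In the second, three people in seven are poor. The measure can never show half the population or more as poor, because fewer than half can be below the median to begin with.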

The Benefit Expenditure Tables

A new version of the Benefit Expenditure Tables has been released, including information about Child Benefit and Tax Credits, which for the last few years have been treated as a matter for HMRC. The nominal cost of all benefit expenditure for 2012/13 was £201.9 billion, of which £107.7 billion (53%) went on pensioners.

There has been a couple of weeks’ delay while the presentation of the figures was being rejigged after the Budget, but it doesn’t look as if it has been done too carefully. For example, the series for Child Benefit in table 1 stops in 2002/03 and hasn’t been resumed despite the inclusion of all the data needed for it in a different table. And there is no obvious reason why the nominal outturns on costs for older people in the “Summary Table: GB Benefits and Tax Credits” should be lower than the cost for DWP benefits alone in the “Benefit Summary Table”.

Leaving the Work Programme

It has been announced that one in ten people referred to the Work Programme, 73,260 up to last April, have been subject to sanctions for failing to avail themselves of the opportunities. Or it has not been announced, depending on your point of view, despite the very specific figures and the ministerial comment: the Telegraph explains that this is what “the Department for Work and Pensions (DWP) is expected to confirm next week when it publishes the first official statistics on the overall success of the programme.” If this is an official announcement, it would be another clear breach of the UK Statistics Authority’s Code of Practice.

We know what the Minister Mark Hoban thinks of the figures; he thinks it shows that people are scrounging. “Sadly some people are clearly very determined to avoid having to get job at all.” There are other possible explanations. It might be, for example, that people think they are better able to find work if they’re not on the programme. It might be that the tens of thousands of people who have been forced to claim JSA instead of incapacity benefits are too sick to work, and now they are being cut off benefits altogether. It might be that people are being sanctioned for not replying to letters. It might be that some have found work – because, despite the propaganda, that’s what most unemployed people do. It might be that people who are being cut off from benefit are being forced into crime or prostitution instead – it’s happened before. We just don’t know, which is why we need the detailed evidence and statistics.

Official statistics and the 'neighbours from hell'

I have written today to the UK Statistics Authority to raise some questions about the government’s figures on “troubled families”. In December the Prime Minister explained:

Today, I want to talk about troubled families. Let me be clear what I mean by this phrase. Officialdom might call them ‘families with multiple disadvantages’. Some in the press might call them ‘neighbours from hell’. … We’ve always known that these families cost an extraordinary amount of money, but now we’ve come up with the actual figures. Last year the state spent an estimated £9 billion on just 120,000 families – that is around £75,000 per family.

The same figures have been repeated in a series of government statements, including material from the Department for Communities and Local Government, the Home Office and the DWP.

The UK Statistics Authority exists to guarantee the integrity of official statistics in the UK. They have established a range of criteria for integrity, transparency and quality, but among other requirements they state that departments should

  • “Ensure that official statistics are produced according to scientific principles”
  • “Publish details of the methods adopted, including explanations of why particular choices were made.”
  • “Issue statistical reports separately from any other statement or comment about the figures and ensure that no statement or comment – based on prior knowledge – is issued to the press or published ahead of the publication of the statistics.”

That is not what’s happened here. “We’ve come up with the actual figures”, the PM’s statement says, and policy has been rolled out from that starting point. Some explanation of where the figure of 120,000 families comes from appeared in a note from the Department for Education, though it was not publicized; there have been trenchant criticisms from Jonathan Portes and Ruth Levitas, on the basis that there is no connection between the indicators used to identify troubled families and the problems of crime and anti-social behaviour. The basis of the costings is still not publicly available. I’ve asked the Statistics Authority to consider whether there has been a breach of their Code of Practice.

The impact of Work Experience

In February, I wrote to the UK Statistics Authority to express concern about some uncheckable claims being made about the benefits of work experience. The Minister for Employment, Chris Grayling MP, had published an open letter to Polly Toynbee on Politics Home, claiming that “a significant number of placements turn into jobs, with the employer getting to like the young person and keeping them on. … so far around half those doing placements have come off benefits very quickly afterwards.” In the Times on 24th February, he also claimed that “half those young people stop claiming benefits after taking part.” (p.32) This was referred to in BBC’s Question Time on 23rd February as evidence that the scheme was working well. The only evidence, however, was based on a first cohort of 1300 people on placement from January 2011 to March 2011, when by the time of the statement the scheme had been extended to more than 34,000 people.

The DWP has now published more data, this time covering 3490 people in the scheme from January to May 2011. It shows an increase in employment, by comparison with a group of non-participants, from 27% to 35%. There are two main reservations to make about the figure: that it still relates only to an early cohort, who may (or may not) have been easier to place than later cohorts, and that there is no explanation of what being “in employment” might mean in terms of hours or duration (the only test seems to be that the employer has sent a return to HMRC). It is also a lot less than the 50% originally claimed.


The press reports, again, that patients are being denied life-enhancing drugs to save money. In this case, the issue stems partly from the draft guidance prepared by NICE on Abiraterone, and partly from the impression in Scotland that the drug in question may be partly responsible for the unexpectedly long survival of a convicted murderer.

NICE gets a terrible press, but the work they do is exemplary. The consideration given by the committee is, as ever, consistently careful, thorough and balanced. Their brief was to review  

  • Overall survival
  • Progression-free survival
  • Response rate
  • Prostate specific antigen (PSA) response
  • Adverse effects of treatment, and
  • Health-related quality of life.

There is a case for Abiraterone. It does extend survival by about four months – roughly a third more than without the drug – and it seems to have fewer side effects than the existing drugs. However, the benefits are still limited, and the drug is hugely expensive.

This specific example seems to fall into a category discussed in a debate in the British Medical Journal in 2009 (31st January). Adrian Towse, the director of the Office of Health Economics, argued that the public were generally willing to support payments that were double what NICE was allowing for. The NICE thresholds were typically a cost of £20-30,000 for each QALY (a quality-adjusted life year, or a year of valued life), a figure that has been raised for end of life treatments; the public would support £30-70,000. Against that, James Raftery argued that the thresholds should be lower, because they force health trusts to take resources away from other, more effective treatments. The cost of Abiraterone falls in the region of £53,800 to £63,200 for each QALY.

There is beyond that a common problem: the evidence in this case is almost entirely supplied by the drug’s manufacturer. Manufacturers have only a limited window during which they can market a drug before patents expire; spending time to run all the tests, and in particular to identify the groups best able to benefit, is not always consistent with their financial interests. It is not clear whether Abiraterone does extend survival more than all the alternatives, because the manufacturer has not yet made all the necessary comparisons. If the gaps could be closed, the case for approving the drug would be stronger.
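
The cost comparison in the debate can be set out very simply. The sketch below uses only the ranges quoted above; the function and its labels are my own. It takes the standard NICE range of £20-30,000, because the higher figure for end of life treatments is not given here.

```python
def position(cost_range, threshold_range):
    """Say where a cost-per-QALY range sits relative to a threshold range."""
    low, high = cost_range
    t_low, t_high = threshold_range
    if low > t_high:
        return "above"
    if high < t_low:
        return "below"
    if low >= t_low and high <= t_high:
        return "within"
    return "overlapping"


# Pounds per QALY, as quoted in the text.
abiraterone = (53_800, 63_200)
nice_threshold = (20_000, 30_000)    # standard range, before any end-of-life uplift
public_support = (30_000, 70_000)    # the range Towse argued the public would accept

print(position(abiraterone, nice_threshold))  # above
print(position(abiraterone, public_support))  # within
```

On these figures the drug costs well over the standard NICE range, but it falls inside the range that the public were said to support.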