Mistake upon mistake
In recent months, we've had a few slip-ups by the official statistical system in India:
- Yesterday's IIP release contained a mistake. Mint says: "On Monday, the government was guilty of a similar error in its factory output data. Till it corrected the number pertaining to capital goods output, analysts were left scrambling for explanations as to how this had grown 25.5% while overall factory growth had shrunk 5.1%. (The answer: it hadn’t, and had actually shrunk by 25.5%)."
- On 9 December, we discovered there were important mistakes in the exports data.
- In December 2010, RBI modified the numbers that it releases about its trading on the currency market.
- In September 2010, there was a mistake in the quarterly GDP data released by CSO.
What is going wrong?
These examples are part of a larger theme: the problems of the official statistical system. The Indian statistical system is afflicted by three levels of problems:
- The first level is conceptual problems and analytical errors. As an example, the weights of the WPI basket are wrong; the estimation methods used in the IIP are likely to be wrong, etc. Quarterly GDP measurement does not have a demand side (which requires a quarterly household survey, which the government does not know how to do).
- The second level is the lack of rugged IT systems. The production of statistics requires high-quality enterprise IT systems. The government does not have the ability or incentive to roll these out. As an example, the September 2010 mistake in quarterly GDP data seems to have come about because quarterly GDP data is produced in a spreadsheet. As with all usage of spreadsheets, this is highly error-prone. The hallmark of a reliably executed process is the absence of spreadsheets.
- The third level is the problem of truant front-line staff. In a country which is not able to get civil servants to show up at school to teach, it is not surprising that front-line staff of statistical agencies cannot be trusted to go out into the field and fill out survey forms. More generally, the statistical system is a set of public goods produced by civil servants, who are unresponsive to the needs and the unhappiness of users, whether about flaws in what is done or about gaps in what is not done.
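To make the second point concrete, here is a minimal sketch of the kind of automated plausibility check that a production system could run before a release goes out. The function name, the rule and the 10 percentage point threshold are my own assumptions for illustration, not a description of what any agency does. The only numbers used are the ones from the IIP episode above.

```python
# Hypothetical pre-release check: flag components that move strongly
# against the headline number, so that a human reviews them before release.
# The rule and the threshold are illustrative assumptions.

def flag_suspect_components(headline_growth, component_growth, min_magnitude=10.0):
    """Return names of components whose growth has the opposite sign to the
    headline and is at least `min_magnitude` percentage points in size."""
    suspects = []
    for name, growth in component_growth.items():
        opposite_sign = growth * headline_growth < 0
        if opposite_sign and abs(growth) >= min_magnitude:
            suspects.append(name)
    return suspects


# Overall factory output shrank 5.1%; capital goods was first shown as +25.5%.
as_first_published = flag_suspect_components(-5.1, {"capital goods": 25.5})

# The corrected figure for capital goods was -25.5%.
as_corrected = flag_suspect_components(-5.1, {"capital goods": -25.5})

print(as_first_published)
print(as_corrected)
```

A flag here does not mean that the number is wrong; a component can genuinely move against the aggregate. It only means that someone should look at the number before it is published, which is precisely the step that a spreadsheet-driven process tends to skip.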
How to make progress?
Government officials in this field have pinned a lot of hope on the implementation of the report of the statistical commission (headed by C. Rangarajan, 2001). I am personally not optimistic about this. The report seems to emphasise an incremental agenda of building the statistical system, one that favours the interests of the incumbents. In any case, it has been a decade since 2001, and it's important to ask fresh questions about what is going wrong and why.
What is required is a ground-up rethink of the statistical system, from first principles, so as to address the three difficulties above. As an example, most of the civil servants processing data in a labour-intensive manner are not required if a good quality enterprise IT system is put into place (and it is hence not surprising that the incumbents are unenthusiastic about business process transformation). The revolution of computers and telecommunications needs to be brought into this field, just as it has been in so many others. This does not require large sums of money; it requires superior public administration.
What should users of data do?
Turning to the users of official statistics, most economists attach enormous prestige to phrases like GDP, IIP, CPI, etc. But in India, we cannot unthinkingly use some numbers just because they come with the label `GDP' from some government agency. We always have to ask skeptical, first-principles questions about how the data is generated. All too often, the standard Indian government data is useless.
Global financial firms that now operate in India have brought a certain cookie-cutter mentality. They produce a major report on each release of quarterly GDP for every country that they cover. Hence, once they started analyst coverage of India, they started writing a report about quarterly GDP. Such a mechanical approach is a waste of resources. The quarterly GDP data is mostly uninformative.
In the class of government data that I know of, I feel the CPI is reasonably okay. The WPI is a fairly useful database about prices but useless as a price index. The quarterly GDP data, IIP, NSSO, ASI are untrustworthy.
Decision makers in government and in the private sector need to struggle with these issues, carefully thinking about what statistics are allowed to influence their decision processes.
Academic users of data need to be much more careful about avoiding garbage-in-garbage-out (GIGO). With a large number of academic papers that work with Indian data, I stop reading the paper after I have read the data description; I know the data is rubbish, so the paper will not change my mind, so I should not bother reading it. A good referee blocks papers which are GIGO. But even if the referee in a faraway place thinks that quarterly GDP in India is well measured, the researcher should ponder whether there are better uses of his time: are there projects which can be more meaningful and genuinely answer important questions, over and above merely getting past a referee?
Finding out more
For more on this subject, you might like to look at the label `statistical system' on this blog.