
Tuesday, August 30, 2016

The measurement of Indian manufacturing GDP: problems and some solutions

by Amey Sapre and Pramod Sinha.

Since the release of the 2011-12 series, the reliability of Indian GDP data has been the subject of intense debate. In the case of manufacturing GDP, there were large upward revisions in growth rates, from 1.1% to 6.2% in 2012-13 and from –0.7% to 5.29% in 2013-14, which were inconsistent with trusted private databases about growth in manufacturing. The introduction of the new MCA-21 dataset has also raised questions, as the dataset has not been released, making it impossible for independent researchers to cross-check the estimates.

GDP estimation is a remarkably complex process. It is built on several sub-processes, datasets, and methodologies at the sub-sector level. At every base year revision, we see changes in sources and methods of computation that aim to yield improved measurement of macro aggregates. Studying these measurement issues yields valuable insights for our interpretation of the resulting data, and thus for our reading of macroeconomic conditions.

In a recent paper, we address three questions about Indian manufacturing GDP estimation:

  1. Are we correctly measuring output and intermediate consumption in the formula for Gross Value Added (GVA)?
  2. How sound is the technique of imputing missing data based on blowing-up using Paid Up Capital (PUC)?
  3. When the new MCA-21 dataset is used, are manufacturing firms being correctly identified?

    Questions about measuring output and intermediate consumption in the formula for Gross Value Added (GVA)

    There are many concerns about how GVA has been estimated using firm data. In the paper, we recreate the process of GVA estimation. We use the Goldar Committee report in letter and spirit, and use the production side approach to recreate the GVA for a set of firms that file in MCA-21. For this, we take the XBRL formatted data from MCA-21 and identify the data fields used to compute GVA.

    We also map the XBRL fields to the corresponding fields in CMIE Prowess and estimate the GVA. A detailed mapping can be found here. These two strategies give us a unique vantage point from which to evaluate discrepancies in GVA estimation.

    Conceptually, the use of the MCA-21 dataset involves a shift from the erstwhile Establishment to the new Enterprise approach of value addition. The establishment approach captured production based data from factories registered under the Factories Act. The enterprise approach captures financial data of firms, and goes beyond just manufacturing by capturing value addition from post-manufacturing, ancillary or related activities such as marketing, and operations of branch/head offices. How does this impact upon value addition? There are two parts to this answer.

    The first is the extent to which measures of output change.

    Under the establishment approach, “Sales” was a measure of output. In the current enterprise approach formula, several disaggregated components of revenue that include revenues from products, services, operating revenues, revenue from financial services, rental income, incomes from brokerage and commission and other non-operating incomes are part of output. In the Goldar Committee report, there is a limited discussion on the inclusion or exclusion of several revenue fields in GVA computation. Also, the data labels and tags of the XBRL fields are broadly based on items in Schedule-III of the Companies Act. The lack of proper definitions of the fields makes the identification process cumbersome and prone to errors. It is evident from the output composition that value addition is not solely accruing from manufacturing activities, but also from several related activities. This leads to inflated GVA levels as the component of output is now similar to the total income of the company, and not industrial sales.

    In the paper, we show a comparison with the previous sales based method and argue that changes in output composition alone can lead to increased levels of GVA. This will eventually push the growth rates upwards.

    Year       Based on sales   Based on disaggregated revenue   Difference
    2011-12    701896.6         767311.4                         65414.8
    2012-13    742237.2         819228.5                         76991.3
    2013-14    780371.1         872178.1                         91807.0
    Comparison of GVA based on old and new method (Figures in Rs. Crore)

    We study the firms in the CMIE Prowess database. Using the traditional sales-based measure, our manufacturing GVA estimate is Rs.780,371.1 crore for 2013-14. Using disaggregated revenue instead, the estimate is Rs.872,178.1 crore. Including revenue from non-manufacturing activities thus appears to over-estimate manufacturing GVA by Rs.91,807 crore.
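
    The arithmetic behind this comparison can be checked directly from the table. The sketch below uses only the six reported GVA figures; the function and variable names are illustrative and do not come from the paper.

```python
# GVA estimates from the comparison table above (Rs. crore).
# Each entry: year -> (sales-based GVA, disaggregated-revenue GVA)
GVA_BY_YEAR = {
    "2011-12": (701896.6, 767311.4),
    "2012-13": (742237.2, 819228.5),
    "2013-14": (780371.1, 872178.1),
}

def level_gap(gva_by_year):
    """Gap between the two measures, in Rs. crore and as a
    percentage of the sales-based figure."""
    gaps = {}
    for year, (sales_based, disaggregated) in gva_by_year.items():
        diff = disaggregated - sales_based
        gaps[year] = (round(diff, 1), 100 * diff / sales_based)
    return gaps

def growth_rates(gva_by_year):
    """Year-on-year growth (%) of each measure, in table order."""
    years = list(gva_by_year)
    rates = {}
    for prev, curr in zip(years, years[1:]):
        s0, d0 = gva_by_year[prev]
        s1, d1 = gva_by_year[curr]
        rates[curr] = (100 * (s1 / s0 - 1), 100 * (d1 / d0 - 1))
    return rates

for year, (diff, pct) in level_gap(GVA_BY_YEAR).items():
    print(f"{year}: gap = {diff} crore ({pct:.1f}% of sales-based GVA)")
for year, (g_sales, g_disagg) in growth_rates(GVA_BY_YEAR).items():
    print(f"{year}: growth {g_sales:.1f}% (sales) vs {g_disagg:.1f}% (disaggregated)")
```

    On these figures the gap widens every year, and the disaggregated-revenue series grows faster than the sales-based series in both 2012-13 and 2013-14.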

    What is missing in the GVA formula is a clear rationale of including revenues from different non-manufacturing activities. If such activities form a part of the enterprise level activities, it also requires a clear segregation of costs, identifiable data fields, and a consistent treatment in the formula to identify value addition from core manufacturing and other activities.

    The second issue is changes in measures of intermediate consumption.

    Identifying components of intermediate consumption at the enterprise level is equally difficult. Conventionally, subtracting the cost items (related to production) from output provides a measure of value addition entirely from manufacturing activities. However, with large and diversified enterprises, identifying cost items from financial data fields can pose significant challenges. A close scrutiny of the XBRL fields shows the omission of important cost components, such as Power & Fuel expenses and advertisement and marketing related expenses. These are sizeable components, and their omission can understate costs, thereby overestimating GVA. Thus, two possible sources of distortion in GVA are an increase in output due to the addition of several revenue items, and omissions in the components of costs.
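
    Both channels can be seen in a small worked example. The sketch below computes GVA as output minus intermediate consumption for a single hypothetical firm. All figures and field names are made up for illustration, and are not actual XBRL tags or the official formula.

```python
# Hypothetical firm-level figures (Rs. crore). Field names are
# illustrative and are not actual XBRL tags.
revenue = {
    "sales_of_products": 1000.0,
    "services": 50.0,
    "rental_income": 20.0,
    "brokerage_and_commission": 10.0,
}
costs = {
    "raw_materials": 600.0,
    "power_and_fuel": 80.0,
    "advertising_and_marketing": 40.0,
    "other_production_costs": 30.0,
}

def gva(output_items, cost_items):
    """GVA = output - intermediate consumption."""
    return sum(output_items.values()) - sum(cost_items.values())

def drop(items, omitted):
    """Return a copy of items without the omitted keys."""
    return {k: v for k, v in items.items() if k not in omitted}

sales_only = {"sales_of_products": revenue["sales_of_products"]}
omitted = {"power_and_fuel", "advertising_and_marketing"}

# Sales as output, all cost items subtracted.
gva_sales_based = gva(sales_only, costs)
# All revenue items as output, all cost items subtracted.
gva_wider_output = gva(revenue, costs)
# All revenue items as output, with two cost items left out.
gva_wider_and_omitted = gva(revenue, drop(costs, omitted))

print(gva_sales_based, gva_wider_output, gva_wider_and_omitted)
```

    In this made-up case, widening the definition of output and dropping two cost items each raise measured GVA, and the two effects add up.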

    Questions on the blow-up methodology

    Missing data imputation is done, in Indian GDP estimation, by assuming that GVA is proportional to Paid Up Capital (PUC). In the paper, we replicate the blow-up process by constructing an available and active set of companies based on random samples that give different Paid-up Capital coverage. The details of the procedure have not been clearly documented in official publications. Several variants of the method are possible, such as blow-up for each range of Paid-up Capital, by industry group, or by ownership type of company.

    PUC-based blow-up assumes that PUC and GVA have a deterministic and linear relation. This is at best a weak assumption, as one cannot draw sufficient inference about a company’s manufacturing activities by looking at its Paid-up Capital value. In the paper, we show that the size distributions of PUC and GVA have no systematic relation, and thus PUC is not an appropriate basis for scaling up GVA. Moreover, the GVA contribution of a firm can be negative, but the PUC-based blow-up always contributes positively, so it gives a distorted picture.

    Our analysis of the blow-up procedure reveals several shortcomings. First, the blow-up factor is sensitive to Paid-up Capital (PUC) coverage and can show a considerable increase as the number of non-reporting companies increases. Second, the variation in blown-up values is unpredictable as there is no systematic trend for different values of the PUC factor. This leads to an unknown degree of error as the addition due to blow-up can be significantly large as compared to the actual contribution of unavailable companies.
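
    A minimal sketch of one possible variant of the blow-up follows. Since the official procedure is not clearly documented, the formula here is an assumption: the blow-up factor is taken to be the PUC of all active companies divided by the PUC of the companies whose data are available, applied to the GVA of the available companies. The three companies and their figures are hypothetical.

```python
# Hypothetical companies: (name, paid-up capital, GVA, data available?)
companies = [
    ("A", 100.0, 40.0, True),
    ("B", 50.0, 10.0, True),
    ("C", 50.0, -20.0, False),  # loss-making company with no filing
]

def puc_blow_up(companies):
    """Assumed variant: scale available GVA by total PUC / available PUC."""
    puc_all = sum(puc for _, puc, _, _ in companies)
    puc_available = sum(puc for _, puc, _, ok in companies if ok)
    gva_available = sum(gva for _, _, gva, ok in companies if ok)
    factor = puc_all / puc_available
    return factor, factor * gva_available, gva_available

factor, blown_up_gva, gva_available = puc_blow_up(companies)

# What the blow-up implicitly adds for the unavailable companies,
# compared with what they actually contributed.
implied_addition = blown_up_gva - gva_available
actual_addition = sum(gva for _, _, gva, ok in companies if not ok)

print(f"blow-up factor: {factor:.3f}")
print(f"implied addition: {implied_addition:.2f}, actual: {actual_addition:.2f}")
```

    In this illustration the unavailable company has negative GVA, yet the blow-up adds a positive amount for it. This is the distortion described above.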

    On this problem, we are also able to offer a solution. In the paper, we show that using industry level growth rates of GVA to scale up the previous year’s GVA of unavailable companies is a feasible and superior method. We use a sample to first classify each missing company into its industry. Then, based on the growth rate of GVA for that industry, we scale up the last available GVA of the unavailable company. Using industry level growth rates of GVA has an advantage over the PUC based blow-up, as it uses the previous year’s GVA of the company itself instead of scaling up the GVA of available companies. Industry growth rates capture the economic conditions faced by firms in the industry and also provide a sufficient clue about the state of the business environment. Computationally, on average, the method gives a lower margin of error, less variability, and a better representation of firms’ conditions, and it provides a close approximation to the actual GVA contribution of the firm. The CSO can potentially shift to using this method.
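
    The idea can be sketched as follows, under stated assumptions: the industry growth rate is computed from the aggregate GVA of available companies in that industry, and each unavailable company's last reported GVA is scaled by that rate. The industries, companies, and figures are hypothetical, and the paper's exact computation may differ.

```python
from collections import defaultdict

# Hypothetical available companies: (industry, GVA last year, GVA this year)
available = [
    ("textiles", 100.0, 110.0),
    ("textiles", 50.0, 55.0),
    ("chemicals", 200.0, 190.0),
]

# Hypothetical unavailable companies: (name, industry, last available GVA)
unavailable = [
    ("X", "textiles", -20.0),
    ("Y", "chemicals", 40.0),
]

def industry_growth(available):
    """Growth rate of aggregate GVA per industry (assumed definition)."""
    prev, curr = defaultdict(float), defaultdict(float)
    for industry, gva_prev, gva_curr in available:
        prev[industry] += gva_prev
        curr[industry] += gva_curr
    return {ind: curr[ind] / prev[ind] - 1 for ind in prev}

def impute(unavailable, growth):
    """Scale each company's last available GVA by its industry growth."""
    return {
        name: last_gva * (1 + growth[industry])
        for name, industry, last_gva in unavailable
    }

growth = industry_growth(available)
estimates = impute(unavailable, growth)
print(growth)
print(estimates)
```

    Unlike the PUC-based blow-up, this keeps the sign and scale of each company's own past GVA, so a loss-making company is still imputed a negative contribution.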

    Are manufacturing companies being correctly identified?

    The Goldar Committee report mentions using the ITC-HS product codes for identification of manufacturing companies. In the absence of such codes, the Company Identification Number (CIN), which contains the NIC code, can be used to identify the nature of business activity of the company. The problems with using these two options are known. What is unknown is the extent of misclassification of companies and the resulting error in the GVA estimate.

    The reliance on ITC-HS code has several problems. Only 59% of the 30,006 companies filing in XBRL across all industries had reported the ITC-HS for products and NPCSS for services. However, even having the codes does not solve the problem. The codes only identify a product and do not distinguish between its trading and manufacturing. Thus, using such codes does not provide an assurance that value addition is being correctly captured for manufacturing products.

    The problem is compounded in cases where the codes are unavailable. At present, a company’s CIN and the details on its website are used to identify its business activity. This yields misclassification. A company’s 21-digit CIN does not change once it has been created at the time of registration. Over time, a company may change the nature of its business activity or diversify into another sector. This change of business activity is not reflected in the CIN of the company. Using the CIN can be misleading, since the top revenue generating activity of the company might be different from the one recorded in its CIN.

    The NIC classification also changes from time to time. This adds to the complexity of identification in two ways: first, changes in business activities of companies are independent of changes in NIC codes, and second, a particular NIC code may not reflect the same business activity over time.

    In the paper, we analyse this problem by studying two groups: (i) companies that operate as non-manufacturing entities but have their NIC codes registered in a manufacturing activity, and (ii) companies that are in manufacturing but have their NIC code registered in some other economic activity. We show that there are a large number of companies in both categories, and that they can create a significant distortion in the GVA estimate. We argue that any classification method for companies based on either ITC-HS or CIN code, or on hand mapping based on clues gathered from the names of the companies or details from their websites, is likely to be incorrect. Classifying firms correctly requires careful hand-analysis of each firm.
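
    The two groups can be illustrated with a short sketch. It rests on two assumptions that are not taken from the paper. First, the 21-character CIN is assumed to follow the layout of listing status (1 character), industry code (5 digits), state (2), year of incorporation (4), ownership type (3) and registration number (6). Second, manufacturing is assumed to correspond to NIC divisions 10 to 33, as in NIC-2008. The CINs and the hand-checked activity flags below are hypothetical.

```python
def nic_code_from_cin(cin):
    """Return the five-digit industry code embedded in a 21-character CIN.

    Assumed layout: 1 char listing status, 5 digits industry code,
    2 chars state, 4 digits year, 3 chars ownership, 6 digits number.
    """
    if len(cin) != 21 or not cin[1:6].isdigit():
        raise ValueError(f"unexpected CIN format: {cin!r}")
    return cin[1:6]

# Assumption: manufacturing corresponds to NIC divisions 10-33 (NIC-2008).
# Older registrations may follow an earlier NIC vintage with other ranges.
MANUFACTURING_DIVISIONS = range(10, 34)

def registered_as_manufacturing(cin):
    """True if the first two digits of the industry code fall in the range."""
    return int(nic_code_from_cin(cin)[:2]) in MANUFACTURING_DIVISIONS

def misclassified(companies):
    """Split mismatches into groups (i) and (ii) discussed above."""
    group_i, group_ii = [], []
    for cin, actually_manufacturing in companies:
        registered = registered_as_manufacturing(cin)
        if registered and not actually_manufacturing:
            group_i.append(cin)   # registered in manufacturing, operates elsewhere
        elif actually_manufacturing and not registered:
            group_ii.append(cin)  # operates in manufacturing, registered elsewhere
    return group_i, group_ii

# Hypothetical CINs; the second field is a hand-checked flag for whether
# the company's main revenue currently comes from manufacturing.
sample = [
    ("U24110MH2005PTC000001", True),
    ("U24110GJ1999PLC000002", False),
    ("U51909DL1998PLC000003", True),
]
group_i, group_ii = misclassified(sample)
print("group (i):", group_i)
print("group (ii):", group_ii)
```

    The hand-checked flag is the expensive input here. The code only shows why a registration-time code cannot substitute for it.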


    Sound computation of GDP is essential to decision making by the government and by the private sector. With imperfect observation of GDP, on many questions, we are flying blind. There has been a great amount of criticism of the Indian GDP data in recent years, as the high growth rates seen in the official data are inconsistent with trusted private databases. Our paper contributes three new blocks of knowledge to one component of the problem, i.e. measurement of manufacturing GDP.

    Amey Sapre is an Economics Ph.D. student at IIT, Kanpur and Pramod Sinha is a researcher at NIPFP.
