The Leap Blog: June 2016

Wednesday, June 29, 2016

Google banning advertisements of payday loans: Is this vigilante justice?

by Ajay Shah.

Foundations

The State must have a monopoly on violence. In democracies, the coercive power of the State is enveloped in the rule of law. There is separation of powers: Parliament writes criminal law, the Police enforces this law, and a judge awards the sentence. Laws are legitimate either when they are written by Parliament (where legislators have won elections), or when narrow authority for drafting subordinate legislation is given to officials along with a sound regulation-making process. The accused knows the law, is given a hearing, and must be proven guilty beyond all reasonable doubt. The order must be written through a quasi-judicial procedure. It cannot merely hand down punishment; it must be a reasoned order. The accused must have the ability to appeal the order.

Most States are flawed creatures, and many of these things do not work correctly at present. As an example, these foundations of liberal democracy are found in the Indian Financial Code but not in the existing financial law and financial agencies. But the previous paragraph gives us a compact sense of the machinery of sound liberal democracies. The problem faced in constructing this civilised behaviour is politicians and officials who desire unaccountable power [example].

Vigilante justice

There are other ways in which we can go astray. One of them is to slip into vigilante justice, where coercion is imposed by ordinary citizens. A mob that beats up a person accused of a crime is a throwback to the Middle Ages. It is not rule of law.

We have to be vigilant in detecting and blocking vigilantism. As an example, consider the RBI concept of 'Wilful Defaulters'. Under this framework, private persons are supposed to identify 'wilful defaulters', and once this is done, the coercive power of the State is used to force all private persons to punish the chosen one. However, private persons cannot run a rule of law process to identify wilful defaulters in a fair manner. This regulation puts the coercive power of the State in the hands of private persons; it is tantamount to State-sanctioned vigilantism. It is not rule of law.

Google and payday lenders

From this perspective, we should worry about Google blocking advertisements of payday lenders. This falls in the context of Google blocking ads by many kinds of sellers.

Google would say: But we are not the State; we're just your friendly local restaurant that decided to stop selling sugar water. It is the legitimate right of a firm to do business with those that it likes. E.g. an ordinary firm can decide that it does not like to do business with (say) Christians. The reason for concern is that things are different with a dominant player like Google. If Google decides to block ads by person X, that matters disproportionately, as Google has something like 70% market share in digital advertising in the US and very large market shares in most countries of the world.

Checks and balances of the State are missing. Because Google is so important in shaping the way people access Internet content, this action by Google is uncomfortably akin to State action which prohibits advertisements of payday lenders. Action by Google, who is a corporation and is not the State, is faulty in that Google does not work by the machinery described in the first paragraph:

Preventing a private person (a payday lender) from showing me advertisements is coercion. This should be the monopoly of the State.
Google chooses what industries are harmful for consumers. This 'legislative' power is illegitimate as it is not grounded in Parliamentary law.
The persons who are adversely affected have no recourse to the due process of law.

Are you sure? Some people believe that the end justifies the means; they are convinced payday lending is bad, and don't care how it is obstructed. But who can know these things for certain? As an example, many people believe that micro-finance lending in India suffers from problems similar to those of payday lending in the US. However, careful research on this question has shown that this preconception is wrong. The realities of these complex questions generally go beyond media viewpoints. What if payday lending is actually good for the people who buy it? We are protected from mistakes by the deliberative and public legislative process, where diverse viewpoints are debated in public. Google is a private person and is not required to use such a legislative process. This makes their do-gooding dangerous.

A slippery slope. Today it is payday lending. What comes next? Humans follow ads shown by Google in all sorts of self-destructive ways. Humans use Google search to find ways to inflict pain and harm upon other humans. Google does not kill people, people kill people.

A more appropriate stance. In other contexts, Google has been more careful. Examples include child porn and sex determination ads, where the decision to coerce is grounded in the State, and Google is just taking instructions. Their behaviour on payday lending is out of line when compared with their own restraint in these other situations. Google appears to now be doing a lot of censorship, which raises important questions such as this one.

If payday lending is bad for its customers, how should it be tackled?

If payday lending has problems, the solution to this lies in financial regulation. This is the business of the State, and not a do-gooding IT company. The machinery of consumer protection in the Indian Financial Code is the mechanism through which the State should exercise coercive power and diminish the damage that payday lending can potentially do. This must be a deliberate and careful process, with checks and balances.

I thank Naman Pugalia and Renuka Sane for useful discussions.

Monday, June 27, 2016

Interesting readings

Great institutions, not great men by Ajay Shah in The Business Standard, 27 June.

Across the aisle- Economic reforms: Act I, Scene I by P Chidambaram in The Indian Express, 26 June.

A Brexit conspiracy theory nails the no-win situation Boris Johnson now finds himself in by Indrani Sen in Quartz, 25 June.

As stray elephant dies soon after capture, lessons from the Anamalais in containing animal conflict by Anand Kumar in Scroll, 23 June.

India sets new record in space mission; PSLV C34 successfully injects 20 satellites into orbit by U Tejonmayam in The Times of India, 22 June.

In Berlin, Unraveling a Family Mystery by Ralph Blumenthal in The New York Times, 22 June.

How to smear your enemies and silence your critics, Chinese Communist Party style by Ilaria Maria Sala and Heather Timmons in Quartz, 22 June.

Prof who criticised Modi, Irani jailed over Lord Rama comments by Cynthia Stephen in The Hindustan Times, 21 June.

Why Uber Keeps Raising Billions by Andrew Ross Sorkin in The New York Times, 20 June.

Focus on how to man institutions by Somasekhar Sundaresan in The Business Standard, 20 June.

How Raghuram Rajan won over the union workers at RBI by Joel Rebello in The Economic Times, 20 June.

How American Politics Went Insane by Jonathan Rauch in The Atlantic, July-August 2016.

Thursday, June 23, 2016

India needs drones

by Shefali Malhotra and Shubho Roy.

The two vital raw materials that went into the Indian software miracle were access to computer hardware and access to data communications. The first became possible when customs tariffs were removed, and the second became possible by opening up to private and foreign telecom companies. When thinking about another new industry, drones, it's useful to imagine: What would have happened to the Indian software industry if the coercive power of the State was deployed to ban computer usage by civilians? Registering to fly a drone costs $4,000 in Nigeria and $5 in the US. So far, India has banned all civilian use of drones, i.e. the cost of registration to fly a drone in India is much higher than that in Nigeria.

Drones have a variety of civil/commercial applications. In areas like crop insurance, soil mapping, disaster, conservation, traffic management, crowd management, photography and filming, drones may be a game changer. All these applications are hobbled by the ban.

The DGCA has come up with draft regulations which are designed to allow civilians to use drones. These draft regulations are not accompanied by an analysis of the costs of complying with the regulations. Moreover, these regulations do not seem to consider the needs of a nascent industry. Consequently, drone applications will remain extremely expensive in India. Capabilities in technology flow from a vibrant user community which demands increased sophistication; as long as India does not avidly use drones, we will not become designers and makers of drones. India's expertise in software and technology gives India an edge in this important emerging area. However, if the regulatory regime is hostile to the development of technology, India will soon fall behind.

One example of an application of drones: Crop insurance

Insurance depends on verifying two facts. Did the insured event actually occur? And how much was the damage (in monetary terms) to the insured? Today, when a Haryanvi farmer's crop gets ruined by hail, there are two problems for the insurance company and the farmer. First, did the hail storm actually take place? India does not have accurate village/taluka level weather data. Second, how much of the crop was actually damaged by the hail storm and not removed by the farmer to inflate the insurance claim? Answering these questions in rural Haryana is not easy.

While these facts could be ascertained by sending persons called "claim verifiers/processors" to farms, it is very costly to send individuals to each insured farm on repeated visits to verify claims. As farm sizes are small, the transaction costs of settling insurance claims become very high. This in turn either makes insurance commercially unviable for insurers or pushes the premium too high for farmers to pay.

Drones can change this industry for the better and make crop insurance much cheaper for the insurance company and the farmer. Here is the arrangement that can be used.

When the farmer makes the initial purchase of insurance, an agent of the insurance company would map the latitude and longitude of the borders of the farm. The insurance company can charter high altitude drones to collect accurate weather data. Lower flying drones can take high resolution pictures of the farm right after an insured event (hail storm) takes place. These photographs can allow an insurance company to establish if the hail storm actually damaged the crops and also the extent of damage.

Drones will be substantially faster, cheaper and probably more accurate than human verifiers. Drones can also cover much larger areas in much less time than individual human claim verifiers. The high quality aerial images can be processed by computers to determine whether the damage was by hail (rather than being a false claim where the crop had been harvested) and even the extent of damage. The insurance company can process the information and transmit the insurance payout directly to the farmer's account, with no human intervention. In future disputes about insurance claims, these high quality images can form the best evidence to determine the truth.

This is not just a hypothetical illustration. It is coming about in India.

Some other application areas

Farmers in other countries are already using drones to identify soil conditions, health of crops, watering needs, etc. Some of these drones cost less than Rs.5000 [link].

Farmer in China spraying crops using a drone

India has one of the lowest police-to-citizen ratios. Drones can increase the effectiveness of the few policemen. Common policing work like crowd management, traffic control and security at large events can be helped by drones. In such areas drones are force multipliers, with which the Indian state can provide basic public goods like security to more citizens at lower cost. The Andhra Pradesh police has begun moving in this direction.

The need for regulation

Any proposal to regulate must be backed by a full articulation of the underlying market failure. In the case of drones, there are two dimensions. One element of the market failure is the possibility of negative externalities in the form of harm to innocent bystanders. The other element is the possibility that drones are new weapons for committing old (IPC) crimes.

Drones are aircraft without pilots and passengers. Therefore, regulations governing certification of safety for pilots and passengers are not applicable for drones. However, just like an aircraft, drones can fly over properties and persons without their consent. Badly made or badly flown drones crashing into people or property is a concern. This justifies basic safety/quality standards for drones, and some level of competence for the drone operator.

Drones can now enable a class of crimes which were previously hard to organise. Drones have fundamentally changed the nature of privacy in one's home. High walls and thick screens are no protection against snooping by a drone which could be operated by a media company, government agency or a personal enemy. Drones can also be used to carry out attacks by dispersing chemicals or mounting weapons. Drones can be used to spy on military establishments or carry out attacks on industrial/nuclear installations. While the easy answer for a lazy government is to ban drones, this is a very intrusive intervention. A better tradeoff in security would be to create checks and balances which permit society to gain from applications of drone technology while avoiding the problems. A natural point of departure is the registration system for cars.

India should develop the regulatory framework for drones now. Other countries are already doing this. Delaying the process will impede innovation in drones and derail development of the drone market. India will fall behind in the global drone market. One day, when India wakes up to civilian applications, we will then be a mere importer of drone technology as this knowledge will not have spread deeply in the country.

Approach to regulation

There are three approaches to regulating drones:

Banning them: Prohibit the civilian use of drones. This is where we are today.
Regulating them: Regulate civilian use of drones to minimise the harm to others and prevent the potential misuse of drones.
Regulating and encouraging them: Positive interventions by the government to facilitate innovation.

Regulating drones

This approach requires drone operators to comply with safety and security standards. At the same time, the cost of compliance should be borne in mind so as to not make investment in the drone industry unviable. Other jurisdictions are balancing these two competing interests through a multi-pronged approach.

Risk based regulation: The riskier the drone operation, the greater its propensity to cause harm to others. It follows that risky drone operations must have higher standards of compliance with safety and security requirements. For example, the US law creates a distinction between drone operations conducted for research or recreational purposes (in demarcated areas) and drone operations conducted for non-recreational/commercial purposes, which may fly over strangers who did not consent to drone over-flight. In the former case, the drone operator does not require US Federal Aviation Administration (FAA) approval, but must operate safely and in accordance with law. In the latter case, the drone operator requires specific authorisation from the FAA. The EU and UK categorise drone operations depending on the level of risk. For example, a drone operating over the open sea is less risky than a drone operating over spectators in a stadium. In the former case, a drone operation may not require any approval but may have operational limitations: for instance, the drone operator should maintain visual line of sight with the drone, and the drone operation should not be conducted above 400 feet. In the latter case, the drone operation may require multiple approvals, such as design and production approval, airworthiness approval, operational approval and proof of pilot competency.
No-fly zones: Certain areas, like nuclear installations and ammo-dumps, are sensitive. Drone accidents in such areas may cause widespread devastation. There are other sensitive areas where any breach of the security protocol may cause a national security threat, like the border of a country. Hence, there is a need for airspace restrictions to minimise the perceived harm in sensitive areas. For example, the US FAA prescribes fly and no-fly zones based on airspace-centric security requirements. These airspace restrictions are used to protect special security events, sensitive operations, high-risk areas, etc. As an example, Raisina Hill may be classified as a restricted airspace area.
Drones as weapons: Drones may be used for criminal activity, such as a terrorist attack. Developing some standards of compliance will help minimise the risk of such criminal use. For example, drone operators in the US are mandated to display the registration number of the drone, on the drone. This enables easy identification of the drone operator in the event of a criminal activity. Singapore criminalises carriage of prohibited items, like a weapon, on a drone and discharging anything, whether gaseous, liquid or solid, from a flying drone.
Privacy: Drones have been used to track unsuspecting individuals and trespass into private property or a restricted area. To prevent this, Singapore has criminalised taking photographs of a protected area (as declared by the Singapore Government) using drones. In the US, any government operated drone operation is required to comply with the provisions of the US Constitution, Federal law and other applicable regulations and policies on privacy, like the Privacy Act, 1974. The US FAA has also formulated guidelines to encourage private parties to advance privacy, transparency and accountability during commercial and non-commercial drone operations and prevent unintentional violation of the privacy of others. For example, a drone operator is encouraged to provide prior notice to individuals of the time frame and area where the drone is intentionally collecting data and develop a privacy policy for the collected data. The UK CAA has also framed similar guidelines.

Encouraging drones

Alongside these enforcement perspectives, there is a need for positive interventions by the government to facilitate drone innovation. This approach recognises that the drone industry is in a nascent stage. The quality and pace of innovation in drones will not only depend on the players involved, but also the regulatory framework within which the innovation is taking place. These interventions may not be in the form of fiscal incentives (the most commonly used in India) but more in the nature of creating an enabling environment for the private sector to innovate and operate.

This may require a change in laws that discourage the suppliers and users of the drone industry. For example, drones actively interact with other users of airspace and should operate without causing harm to these users. To ensure this, the US FAA carries out safety studies to support safe integration of drones in the national airspace system. It may also require some institutional changes to facilitate the development of the drone industry. For example, the US FAA allocates research and test sites within the US to allow drone testing and enable development of drone technology in a safe environment.

The UK Civil Aviation Authority (CAA) supports the research and development process in the drone industry by facilitating full and open consultation with the developers of drone technology so that the CAA can provide guidance on the applicable rules and regulations. The US FAA coordinates with other Federal Agencies and the international community to designate permanent areas in the Arctic where small drones can operate 24 hours for research and commercial purposes. The US FAA has recently entered into a partnership with the drone industry to explore next steps in drone operations beyond the scope of the applicable law.

Next steps

The DGCA draft guidelines are a step in the right direction. However, the guidelines leave much to be desired. India needs to move on to formulating a regulatory framework which regulates and encourages the drone industry. It has some natural advantage (expertise in software technology and IT) which may allow it to be a key player in the global drone market. However, if India squanders away the lead by not creating a conducive environment for drones, it will end up lagging behind other nations.

Minimising the regulatory burden

There is a need to regulate the drone industry to minimise the risk of harm that it may cause to third parties. On the other hand the cost of compliance should not be higher than the profits/benefits. High regulatory cost will discourage players (especially small firms) from entering the market and will nip the industry in the bud. The draft regulations (in some places) have very high costs of compliance, without any attendant benefit to the society. This is a result of the vague language used in the draft regulations.

An example of vague language increasing the cost of the compliance is the requirement of permission for low drone flights. Regulation 5.3 of the draft regulation states:

the operator shall obtain permission from local administration, the concerned ADC.

The guidelines are silent on what is 'local administration'. Is it the district magistrate, local police station, local court? No one knows. It is also not clear whether you need permission from "local authorities" and "the concerned ADC" or "the concerned ADC" is the "local authority". The abbreviated term ADC is not expanded or explained anywhere in the guidelines.

Such vagueness drives up the cost of technology adoption by small firms. These firms would have to run from pillar to post to get the above "permission". Since these local authorities will also not know whether they are the right "local authorities", and lack a guidance document based on which they can analyse applications, they will probably take inordinately long or refuse.

The draft guidelines are peppered with other technical terms, like "Temporary Segregated Areas (TSA)" and "Temporary Reserved Areas (TRA)", which are referred to but not defined. There is also no cross-reference in the guidelines allowing a reader to find what they mean and which areas they apply to. They may be terms of art for airlines, but such opacity hampers the large technology community who must tinker with drones.

Making regulations user-friendly

Till now, the airspace was used by a niche population, pilots. Hence, if airspace regulations were not easily available and were technical, it was not a problem. With the coming of drones, airspace will become accessible to a large section of the population from a 16 year old kid to hobbyists, researchers, companies large and small, government, etc. Airspace regulations must now become comprehensible and reader-friendly. For example, it is crucial for a drone operator to know areas where a drone can be used and areas where it cannot. The draft guidelines state that drone operations cannot be carried out in notified prohibited area, restricted area, danger area, TSA and TRA.

However, the draft guidelines do not provide much guidance on what constitutes these areas or even where one can find these areas. Although the regulations refer to the Aeronautical Information Publication (AIP) regarding details of these terms, the AIP is not readily accessible to the general public. In contrast, the US FAA has developed an app (B4UFLY) illustrating the fly and no-fly areas for ready reference of drone operators. Using this app, a 12 year old child can understand where to fly a drone.

Screen showing no fly zones in the US

Conclusion

Induction of drone technology into India is, at present, very costly. When authorities, processes and systems are unclear in a law, the potential cost of getting a drone permission can literally be infinite. There is no way to know which authority to apply to, and the authority itself does not know whether it has the power to grant an application. We need clearer regulations, and we need a regulatory framework to support the industry.

References

Subtitle B, Title III of the US FAA Modernisation and Reform Act, 2012.

EASA Proposal to Create Common Rules for Operating Drones in Europe (September 2015).

CAA CAP 722: Unmanned Aircraft System Operations in UK Airspace: Guidance (March 2015).

Singapore Unmanned Aircraft (Public Safety and Security) Act 2015 (No. 16 of 2015).

Johan Hauknes and Lennart Nordgren, Economic rationale of government involvement in innovation and the supply of innovation-related services, STEP Report Series R-08 (1999).

The authors are researchers at the National Institute for Public Finance and Policy. They thank Sumant Prashant, Bhargavi Zaveri and Pratik Datta for discussions.

Tuesday, June 21, 2016

Interesting readings

Home Free by Jennifer Gonnerman in The New Yorker, 20 June.

Social media and corruption: Evidence from a Russian blog by Ruben Enikolopov, Maria Petrova, Konstantin Sonin in The Vox, 20 June.

The best way to welfare by Abhijit V. Banerjee in The Indian Express, 18 June.

Maharashtra govt to Bombay HC: 'Issued tenders for decibel meters' by Express News Service in The Indian Express, 18 June. Such measurement is in all Android phones.

World without leaders: Bowing to every prejudice thrown up by the mob is the new norm by Kanti Bajpai in The Times of India Blogs, 18 June.

A cashless dream for kirana stores by Varad Pande and Manisha Pandita in The Mint, 17 June.

How to win the financial inclusion war by Varad Pande and Manisha Pandita in The Mint, 15 June.

Is Sebi overstating money laundering concerns? by Mobis Philipose in The Mint, 14 June.

Emami founders say hospital investment was a big mistake by Soumonty Kanungo in The Mint, 14 June.

Forced unbundling for greater competition by Ajay Shah in Business Standard, 13 June.

Cracking the payments bank puzzle by Varad Pande and Manisha Pandita in The Mint, 13 June.

SEBI and FMC: The story of a merger that took a dozen years by Shaji Vikraman in The Indian Express, 13 June.

Across the aisle: Gross domestic product or puzzle? by P Chidambaram in The Indian Express, 12 June.

Understanding the financial needs of the underserved customer by Varad Pande, Nirat Bhatnagar and Manisha Pandita in The Mint, 11 June.

JS-level posts vacant at Centre, few takers by Subhomoy Bhattacharjee in The Business Standard, 11 June.

The unnoticed agreement by Jaya Jaitly in The Indian Express, 11 June.

Why I Quit Twitter - and Left Behind 35,000 Followers by Jonathan Weisman in The New York Times, 10 June.

Integration between spot and futures markets and the way forward in commodity markets by Samir Shah in NIPFPMF YouTube Channel, 9 June.

India's liberalised yet restrictive visa policy by Natasha Agarwal in The ORF, 8 June.

Hopping in His Matchbox by Neal Ascherson in London Review of Books, 2 June.

The Management Myth by Matthew Stewart in The Atlantic, June 2006.

Economic History in South Asia: In Conversation with Professor Tirthankar Roy in The LSE Blogs,

How a tiny, secretive research shop exposed one of the world's biggest commodity traders by Heather Timmons in The Quartz, 30 May.

Anonymous Hackers Turned Stock Analysts Are Targeting US, Chinese Corporations by BeauHD in The Slashdot News, 26 May.

Why West Bengal continues to be a place for unauthorised money raising (This time is hardly different)? in The Mostly Economics Blog.

Addresses in India are chaos. I see two candidates that can help: Open Location Code and Mongolia is changing all its addresses to three-word phrases by Joonlan Wong in The Quartz.

Wednesday, June 15, 2016

Sophisticated clustered standard errors using recent R tools

by Dhananjay Ghei

Many blog articles have demonstrated clustered standard errors in R, either by writing a function or manually adjusting the degrees of freedom or both (example, example, example and example). These methods give close approximations to the standard Stata results, but they do not do the small sample correction as Stata does.

In recent months, elegant solutions have come about in R, which push the envelope on functionality, and yield substantial improvements in speed. I use the test dataset of Petersen which is the workhorse of this field.

The problem

In regression analysis, getting accurate standard errors is as crucial as obtaining unbiased and consistent estimates of the regression coefficients. Standard errors determine the measured precision of the coefficients and thereby affect hypothesis testing procedures.

The correct nature of standard errors depends on the underlying structure of the data. For our purposes, we consider cases where the error terms of the model are independent across groups but correlated within groups. For instance, studies with cross-sectional data on individuals with clustering on village/state/hospital level. Another example could be difference in difference regressions with clustering at a group level. Clustered standard errors allow for a general structure of the variance covariance matrix by allowing errors to be correlated within clusters but not across clusters. In such cases, obtaining standard errors without clustering can lead to misleadingly small standard errors, narrow confidence intervals and small p-values.
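
The claim that unclustered standard errors can be misleadingly small can be checked with a small simulation. The sketch below is only an illustration under assumed settings: the number of clusters, the cluster size, the data generating process and the helper name one_draw are all hypothetical choices, not part of Petersen's dataset. Both the regressor and the error contain a cluster-level component, and the conventional OLS standard error is compared with the actual spread of the slope estimates across repeated samples.

```r
# Illustrative simulation (assumed design): errors and regressor are
# correlated within clusters, independent across clusters.
set.seed(123)
G     <- 50    # number of clusters (assumption)
n_per <- 10    # observations per cluster (assumption)
reps  <- 300   # number of simulated samples (assumption)

# Hypothetical helper: simulate one sample, return the slope estimate
# and its conventional (unclustered) OLS standard error.
one_draw <- function() {
  x <- rep(rnorm(G), each = n_per) + rnorm(G * n_per)  # cluster effect + noise
  e <- rep(rnorm(G), each = n_per) + rnorm(G * n_per)  # cluster effect + noise
  y <- 1 + 0.5 * x + e
  fit <- lm(y ~ x)
  c(estimate = unname(coef(fit)["x"]),
    naive_se = unname(sqrt(diag(vcov(fit)))["x"]))
}

sims <- t(replicate(reps, one_draw()))

empirical_sd  <- sd(sims[, "estimate"])    # true sampling variability
mean_naive_se <- mean(sims[, "naive_se"])  # what unclustered OLS reports

cat("Empirical SD of slope estimates:", round(empirical_sd, 4), "\n")
cat("Average unclustered OLS SE:     ", round(mean_naive_se, 4), "\n")
```

In a design like this the unclustered standard error should come out well below the empirical standard deviation of the estimates, which is the sense in which ignoring clustering overstates precision.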

Clustered standard errors can be obtained in two steps. First, estimate the regression model without any clustering. Second, obtain clustered errors by using the residuals. Clustered standard errors can be estimated consistently provided the number of clusters goes to infinity. However, the variance covariance matrix is downward-biased when dealing with a finite number of clusters. One of the methods commonly used for correcting the bias is adjusting for the degrees of freedom in finite clusters.
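
The two steps can be sketched by hand in base R. This is a minimal illustration on simulated data, not the code used for the results in this article: the data, the variable names and the particular correction factor are assumptions. The factor used here, G/(G-1) * (N-1)/(N-K) with G clusters, N observations and K coefficients, is the degrees of freedom adjustment commonly documented for Stata's regress with clustered errors.

```r
# Minimal sketch: cluster-robust standard errors by hand, base R only.
# The simulated data and all names below are illustrative assumptions.
set.seed(42)
G     <- 50                                # number of clusters
n_per <- 10                                # observations per cluster
firm  <- rep(seq_len(G), each = n_per)     # cluster identifier
x <- rep(rnorm(G), each = n_per) + rnorm(G * n_per)
e <- rep(rnorm(G), each = n_per) + rnorm(G * n_per)
y <- 1 + 0.5 * x + e

# Step 1: estimate the model without any clustering
fit <- lm(y ~ x)
X <- model.matrix(fit)
u <- residuals(fit)
N <- nrow(X)
K <- ncol(X)

# Step 2: build the sandwich estimator from the residuals
bread  <- solve(crossprod(X))              # (X'X)^{-1}
scores <- rowsum(X * u, group = firm)      # per-cluster sums of x_i * u_i
meat   <- crossprod(scores)                # sum over clusters of s_g s_g'
V_raw  <- bread %*% meat %*% bread         # no finite-cluster correction

# Degrees of freedom adjustment for a finite number of clusters
adj   <- (G / (G - 1)) * ((N - 1) / (N - K))
V_adj <- adj * V_raw

se_raw <- sqrt(diag(V_raw))
se_adj <- sqrt(diag(V_adj))
print(rbind(uncorrected = se_raw, corrected = se_adj))
```

The corrected standard errors are larger than the uncorrected ones by the factor sqrt(adj), which shrinks towards 1 as the number of clusters grows.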

R and Stata codes

The code below shows how to compute clustered standard errors in R, using the plm and lmtest packages. Petersen's dataset can be loaded directly from the multiwayvcov package. Pooled OLS and fixed effect (FE) models are estimated using the plm package.

# Loading the required libraries
library(plm)
library(lmtest)
library(multiwayvcov)

# Loading Petersen's dataset
data(petersen)
# Pooled OLS model
pooled.ols <- plm(formula=y~x, data=petersen, model="pooling", index=c("firmid", "year")) 
# Fixed effects model
fe.firm <- plm(formula=y~x, data=petersen, model="within", index=c("firmid", "year"))

Clustered standard errors can be computed in R using the vcovHC() function from the plm package. vcovHC.plm() estimates the robust covariance matrix for panel data models. Its result can be passed as an argument to other functions such as coeftest(), waldtest() and other methods in the lmtest package. Clustering is controlled by the cluster argument, which allows clustering on either group or time. The type argument allows estimating standard errors by allowing for heteroskedasticity across groups. Recently, the plm package introduced the small sample correction as an option to the "type" argument of the vcovHC.plm() function. This is switched on by specifying type="sss".

# OLS with SE clustered by firm (Petersen's Table 3)
coeftest(pooled.ols, vcov=vcovHC(pooled.ols, type="sss", cluster="group"))  

# OLS with SE clustered by time (Petersen's Table 4)
coeftest(pooled.ols, vcov=vcovHC(pooled.ols, type="sss", cluster="time")) 


# FE regression with SE clustered by firm
coeftest(fe.firm, vcov=vcovHC(fe.firm, type="sss", cluster="group")) 

# FE regression with SE clustered by time
coeftest(fe.firm, vcov=vcovHC(fe.firm, type="sss", cluster="time"))

Stata makes it easy to cluster, by adding the cluster option at the end of any routine regression command (such as reg or xtreg). The code below shows how to cluster in OLS and fixed effect models:

webuse set http://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/
webuse test_data.dta, clear

* OLS with SE clustered by firm (Petersen's Table 3)
reg y x, vce(cluster firmid)
* OLS with SE clustered by time (Petersen's Table 4)
reg y x, vce(cluster year)

* Declaring dataset to be a panel
xtset firmid year
* FE regression with SE clustered by firm
xtreg y x, fe vce(cluster firmid)
* FE regression with SE clustered by time
xtreg y x, fe vce(cluster year) nonest

The table given below shows a comparison of the standard errors computed by R and Stata. The standard errors computed from R and Stata agree up to the fifth decimal place.

Model	SE (in R)	SE (in Stata)
OLS with SE clustered by firm	0.05059	0.05059
OLS with SE clustered by time	0.03338	0.03338
FE regression with SE clustered by firm	0.03014	0.03014
FE regression with SE clustered by time	0.02668	0.02668

Performance comparison

I ran benchmarks comparing the speed of Stata MP and R for each of these models on a quad-core processor. The results show that R is faster than Stata. To parallelise, I set the number of processors used by Stata MP to 4. An example of the benchmarking code in Stata is given below:

* Stata benchmarking program : Example
set processors 4
timer clear
timer on 1
bs, nodrop reps(1000) seed(1): reg y x
timer off 1
timer list

Parallelisation in R is done using standard R packages. An example of the benchmarking code in R is given below:

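# R benchmarking program : Example
library(doParallel)    # also attaches the parallel package
library(rbenchmark)
library(multiwayvcov)  # provides Petersen's dataset
data(petersen)
set.seed(1)
# Number of available cores
n.cores <- detectCores()
cl <- makeCluster(n.cores)
# Run the benchmark in a forked process and collect its result
ols.benchmark <- mcparallel(benchmark(lm(y~x, petersen), replications=1000))
mccollect(ols.benchmark)
stopCluster(cl)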

The table below compares R and Stata MP for each of these models. The average time is the ratio of elapsed time to the number of replications. Relative efficiency is the ratio of the average time taken by Stata MP to the average time taken by R, so values above 1 mean that R is faster. R is faster in all four models, by a factor of 1.33 to 3.61.

Model	Replications	Average time (R - 4 core)	Average time (Stata MP - 4 core)	Relative efficiency
OLS with SE clustered by firm	1000	0.0737	0.1635	2.22
OLS with SE clustered by time	1000	0.0557	0.0742	1.33
FE regression with SE clustered by firm	1000	0.0880	0.3176	3.61
FE regression with SE clustered by time	1000	0.0729	0.1118	1.53
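
The relative efficiency column can be recomputed from the two average-time columns. The short sketch below copies the averages from the table above; the data frame and column names are illustrative only.

```r
# Average times copied from the table above
timings <- data.frame(
  model = c("OLS, SE clustered by firm", "OLS, SE clustered by time",
            "FE, SE clustered by firm", "FE, SE clustered by time"),
  r     = c(0.0737, 0.0557, 0.0880, 0.0729),
  stata = c(0.1635, 0.0742, 0.3176, 0.1118)
)

# Relative efficiency = average time in Stata MP / average time in R
timings$rel.eff <- round(timings$stata / timings$r, 2)
print(timings)
```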

Multi-level clustering in R

Most Stata commands have no built-in option for two-way clustering (ivreg2 and xtivreg2 are exceptions). A few user-written programs available online do two-way clustering (see, for example, here and here). In R, this is easily handled using the vcovDC.plm() function, which is used in the same way as vcovHC.plm().

# OLS with SE clustered by firm and time (Petersen's Table 5)
coeftest(pooled.ols, vcov=vcovDC(pooled.ols, type="sss"))
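
To show what two-way clustering computes, here is a minimal base R sketch of the usual construction in the two-way clustering literature: the firm-clustered covariance plus the time-clustered covariance, minus the covariance clustered on the firm-year intersection. The helper cluster.vcov.by.hand() and the simulated panel are hypothetical, no small-sample correction is applied, and this is not a description of how vcovDC.plm() is implemented internally.

```r
# Cluster-robust covariance for an lm fit, without finite-sample correction
cluster.vcov.by.hand <- function(fit, cluster) {
  X <- model.matrix(fit)
  bread <- solve(crossprod(X))
  scores <- rowsum(X * residuals(fit), group = cluster)
  bread %*% crossprod(scores) %*% bread
}

# Simulated balanced panel with firm and year components (illustrative only)
set.seed(7)
n.firms <- 40
n.years <- 8
panel <- expand.grid(year = seq_len(n.years), firmid = seq_len(n.firms))
panel$x <- rnorm(n.firms)[panel$firmid] + rnorm(n.years)[panel$year] +
  rnorm(nrow(panel))
panel$y <- 1 + panel$x + rnorm(n.firms)[panel$firmid] +
  rnorm(n.years)[panel$year] + rnorm(nrow(panel))
fit <- lm(y ~ x, data = panel)

v.firm <- cluster.vcov.by.hand(fit, panel$firmid)
v.time <- cluster.vcov.by.hand(fit, panel$year)
# Intersection of the two groupings: here each firm-year cell is one observation
v.both <- cluster.vcov.by.hand(
  fit, interaction(panel$firmid, panel$year, drop = TRUE))

# Two-way clustered covariance matrix and standard errors
v.twoway <- v.firm + v.time - v.both
se.twoway <- sqrt(diag(v.twoway))
print(se.twoway)
```

With one observation per firm-year, the intersection term reduces to the ordinary heteroskedasticity-robust (White) covariance matrix.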

A more recent addition, the multiwayvcov package, is useful for clustering on multiple levels and for computing bootstrapped clustered standard errors. The package supports parallelisation, which makes it easier to work with large datasets. It exports two functions: cluster.vcov() computes clustered standard errors, whereas cluster.boot() calculates bootstrapped clustered standard errors. The code for replicating Petersen's results is available in the reference manual of the package. One limitation of cluster.vcov() is its inability to work with plm objects. This is because the package imports estfun() from the sandwich package, which is not compatible with plm objects.

R code

Here's the R code to reproduce the results.

Dhananjay Ghei is a researcher at the National Institute of Public Finance and Policy. He thanks Ajay Shah, Vimal Balasubramaniam and Apoorva Gupta for valuable discussions and feedback.

Monday, June 13, 2016

Problems of the Health Management Information System (HMIS): the experience of Haryana

by Smriti Sharma.

Last year during a "Beti Bachao, Beti Padhao" video conference, errors in the data became visible. The `Maternal Infant Death Review System' (MIDRS) of Haryana showed that the health staff in some districts of Haryana had been grossly under-reporting deaths of mothers and infants. As an example, for the trimester of April-June 2015, the number of infant deaths measured in the MIDRS was 3,307, but only 728 were reported in the Health Management Information System (HMIS). For maternal deaths, HMIS showed 21 deaths while MIDRS showed 145.

MIDRS is a surveillance-based system which was launched by the Haryana government in 2013 to keep tabs on such under-reporting. The system includes a mixture of routine passive data collection and active surveillance by specially recruited and trained field volunteers. Ironically, HMIS too was conceived as a mechanism to monitor the functioning of the National Health Mission (NHM).

Inaccurate data in HMIS raises concerns about the working of NHM. In this article, we take a close look at the HMIS in Haryana and understand the sources of difficulties.

HMIS: A management tool for National Health Mission

The Indian government launched the `National Rural Health Mission' in 2005. This was renamed as the `National Health Mission' (NHM). HMIS was intended as a management information system to oversee the working of NHM. NHM is a national mission that runs through the length and breadth of the country. There are approximately 1.8 lakh health facilities that make use of HMIS to capture data.

HMIS captures data about antenatal coverage, immunisation coverage, delivery services, family planning coverage indicators etc. Some states like Haryana used another system called DHIS for data collection at the State level. These systems remained in operation, but their data was uploaded into HMIS to achieve comprehensive information in HMIS.

Substantial public expenditures are taking place through NHM. For NHM to work effectively, HMIS must be sound. Hence, the reports about errors in HMIS are particularly alarming. If HMIS contains faulty information, there may be substantial failures in the working of NHM. This motivates HMIS as the object of study.

How does HMIS collect data?

Figure 1 shows how data flows into HMIS. The Indian public sector health system has multiple tiers, where the first point of contact between the community and the health system is the sub-centre where the most peripheral health services are provided. Here at the sub-centre, when a pregnant woman walks in, an auxiliary nurse & midwife (ANM) jots down her details into her register.

The ANMs at the sub-centre level do not have access to computers and have to record information in handmade registers. ANMs maintain multiple registers and carry all of them every month to the relevant Primary Health Centre, where the information from their registers is transferred onto the DHIS (in Haryana) by a Data Entry Operator.

Figure 1: Data flows in HMIS

Figure 1 also shows the different levels at which the data is aggregated. Data from the sub-centre, primary health centres and community health centres is aggregated at the block level. The data sent from the block level and sub-district hospital and district hospital is aggregated at the District Headquarters. The District Headquarters then sends the aggregated data to the State Headquarters which forwards it to the national level.

Issues of data quality

Numerous concerns have been raised about the quality of the data. Singh et al., 2014, found that many districts in Haryana were routinely over-reporting data. For example, in Palwal district:

ANC registrations were 47% higher than the total number of expected deliveries. Expected deliveries refer to the probable number of pregnancies which is calculated by multiplying the total population of the area by the birth rate.
Reported deliveries were 11% higher than the expected deliveries.
Measles vaccines administered were 16% higher than the number of reported live births.
Overall, only one of the 21 districts in Haryana did not report occurrences (e.g. immunisation rates, deliveries, children weighed) higher than the total population.
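
The over-reporting check above is simple arithmetic, and a short sketch makes it concrete. The population and birth rate below are hypothetical round numbers, not Palwal's actual figures, and the reported counts are constructed so as to reproduce the 47% and 11% excesses quoted above.

```r
# Hypothetical inputs for one district (not actual data)
population <- 1000000      # total population of the area
birth.rate <- 22 / 1000    # births per person per year

# Expected deliveries = total population x birth rate
expected.deliveries <- population * birth.rate

# Hypothetical reported counts, chosen to match the percentages in the text
reported <- c(anc.registrations = 32340, deliveries = 24420)

# Percentage by which each reported count exceeds the expected deliveries
excess.pct <- 100 * (reported - expected.deliveries) / expected.deliveries
print(round(excess.pct))

# Any indicator above the expected number signals possible over-reporting
over.reported <- reported > expected.deliveries
print(over.reported)
```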

In a recent paper, Sharma et al., 2016, found that the ANMs over-recorded the following two indicators the most:

3 or more Antenatal Care (ANC) visits by pregnant women
Provision of 100 or more Iron/Folic Acid (IFA) tablets

When the ANMs reported the data for monthly submission, the data they inflated the most pertained to:

IFA supplementation
Contraceptive device insertions
Administration of 2nd dose of Tetanus Toxoid (TT) to pregnant women

The authors find that data were over-reported because the health staff knew that these particular indicators were crucial to the success of the program. Numbers were inflated when the actual coverage of service delivery of a sub-centre was low; inflating the data helped to hide low performance.

Going by IPHS Guidelines, it is the responsibility of ANMs to register pregnant women and provide them with at least four antenatal check-ups. They are also responsible for administering IFA tablets. We may conjecture that by inflating these indicators, ANMs were making their performance look better than it was.

Reasons for bad data quality

Why is the quality of data so bad?

Lack of capacity

HMIS was launched in 2008, but as yet, computers and the internet have not reached down the entire chain. There are two chronic problems:

Lack of infrastructure: Data entry at the sub-centre level is by ANMs writing into physical registers. There are bound to be errors at this level because ANMs record data in handmade registers which are very badly designed. These registers sometimes do not have enough space available to write. Also, handmade registers do not necessarily capture all information that is necessary for the DHIS.
Over-burdened manpower: At the PHC level, the Data Entry Operator is responsible for entering data for DHIS. Alongside, she is responsible for fulfilling several other reporting requirements too. For example, there is another health information system called Mother and Child Tracking System (MCTS). This too has its parallel reporting requirements and the Data Entry Operator has to report data for MCTS too. Similarly, the Data Entry Operator has to undertake data entry of immunisation report, vaccine and logistics, release and logbook data.

Lack of accountability

We spoke with the personnel at all levels of HMIS in Haryana. We were told that data errors happen, and are verbally pointed out over telephone calls. Those numbers are then corrected and re-submitted. There is a casual, informal camaraderie between Data Entry Operators and the Monitoring and Evaluation Officer. They all seem to sympathise with each other and have a shared belief that they are over-burdened, which justifies human errors. This situation is not unique to Haryana. According to National Health System Resource Centre's assessment of HMIS in 23 states, 72% of the states give feedback to the districts on the HMIS data. Out of these, 61% give feedback to blocks. However, the feedback is given verbally. Only 38% of the states give feedback via emails or letters. The assessment also showed that the feedback is used to manipulate data and not to improve quality.

The way forward

What you measure is what you can manage. The fact that HMIS has poor measurement raises important concerns about NHM. It is hard to see how NHM can effectively generate bang for the buck when it is grounded in an inaccurate management information system.

While duplication of data reporting is inefficient and a source of discrepancies in data, there are multiple databases which have health-related data. These include the District-level Household Survey, the National Family Health Survey, the Annual Health Survey and the Mother and Child Tracking System (which follows each individual mother and child). These should be used on a regular basis to cross-check the information in HMIS, so as to uncover data problems sooner.
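
A routine cross-check of this kind can be very simple. The sketch below uses the April-June 2015 Haryana figures quoted at the start of this article, with MIDRS standing in as the comparison source; the 90% tolerance is a hypothetical threshold chosen only for illustration.

```r
# Counts reported in HMIS and in the comparison source (MIDRS), April-June 2015
hmis <- data.frame(indicator = c("infant deaths", "maternal deaths"),
                   hmis = c(728, 21))
midrs <- data.frame(indicator = c("infant deaths", "maternal deaths"),
                    midrs = c(3307, 145))

# Line the two sources up by indicator
check <- merge(hmis, midrs, by = "indicator")

# HMIS count as a percentage of the comparison count
check$hmis.share <- round(100 * check$hmis / check$midrs, 1)

# Flag indicators where HMIS captures less than a (hypothetical) 90% of the events
check$flag <- check$hmis.share < 90
print(check)
```

On these figures, HMIS captured roughly a fifth of the infant deaths and about a seventh of the maternal deaths recorded by MIDRS, so both rows are flagged.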

The movement of aggregated data betrays IT systems design that is many decades out of date. In a modern IT system that would be constructed today, only transactions would be stored (e.g. one death). All aggregation would be done on the fly when queries are required.
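
As a minimal sketch of this design, suppose each event is stored as one row and every report is computed from those rows when it is requested. The districts, facilities and dates below are entirely hypothetical.

```r
# Hypothetical transaction log: one row per recorded event
events <- data.frame(
  district = c("A", "A", "B", "B", "B"),
  facility = c("SC-1", "PHC-1", "SC-2", "SC-2", "CHC-1"),
  event    = c("infant death", "maternal death", "infant death",
               "infant death", "infant death"),
  date     = as.Date(c("2015-04-03", "2015-05-11", "2015-04-20",
                       "2015-06-02", "2015-06-15"))
)
events$n <- 1
events$month <- format(events$date, "%Y-%m")

# Aggregates are computed at query time, never stored or passed up a chain
by.district <- aggregate(n ~ district + event, data = events, FUN = sum)
by.month <- aggregate(n ~ month, data = events, FUN = sum)
print(by.district)
print(by.month)
```

Since every total is derived from the same underlying rows, district and state figures cannot drift apart, and any number can be traced back to the individual records behind it.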

Some early steps towards true business process engineering are easy to envisage. An online application called ANMOL, or ANM Online, allows ANMs to use tablets to enter and update the service records of beneficiaries on a real-time basis. Since the entire process is digital, the ANMs don't have to carry or maintain the registers and the entire process becomes paperless for them.

The problems of HMIS and NHM are primarily a question of incentives and public administration, and not about computer technology. For a counterpoint, in the 18th century, the recording of deaths in the US and in Europe was being done correctly. It does not require great technology to do these basic things.

Too often in India, there is a temptation to solve problems of public policy with computer technology. As argued in Shah, 2006, these projects must be built around two elements: a full-blown business process re-engineering, BPR, (i.e. not a superficial layer of computers on top of the old process), and the removal of discretion with front-line staff.

References

Singh, Gajinder Pal, Jordan Tuchman and Michael P. Rodriguez, Improving Data for Decision-making: Leveraging Data Quality Audits in Haryana, India, Abt Associates Inc., 2014.

Shah, Ajay, Improving governance using IT systems, pages 122-148 in `Documenting reforms: Case studies from India', edited by S. Narayan, Macmillan India, 2006.

Sharma A., Rana S. K., Prinja S., Kumar R., Quality of Health Management Information System for Maternal & Child Health Care in Haryana State, India, PLoS ONE, 2016.

The author is a researcher at the National Institute of Public Finance and Policy, New Delhi. I thank Jeff Hammer and Ajay Shah for useful comments.

Thursday, June 09, 2016

Interesting readings

An idea whose time has come by Somasekhar Sundaresan in The Business Standard, 6 June.

Why India needs a new FDI regime by Bhargavi Zaveri and Radhika Pandey in The Business Standard, 4 June.

We Indians should not fall for the same politicised traps of deifying our heroes by Paroma Roy Chowdhury in The Economic Times blogs, 4 June.

'Discrepancies' drive GDP growth by Manas Chakravarty in The Mint, 1 June.

No, Mr Modi, you are not yet transforming India by Mihir S. Sharma in The Business Standard, 30 May.

Prioritise banking reforms by Ajay Shah in The Business Standard, 29 May.

I wish the Prime Minister had said by P. Chidambaram in The Indian Express, 29 May.

RBI's Luddite licensing by Debashis Basu in The Business Standard, 29 May.

The fall guy by Ila Patnaik in The Indian Express, 28 May.

Narendra Modi and India's Dashed Economic Hopes by Aatish Taseer in The New York Times, 27 May.

Demographic dividend is under way with collapse in fertility by Sanjeev Sanyal in The Economic Times blogs, 26 May.

Why the new BIT may not work by Sumathi Chandrashekaran and Smriti Parsheera in The Hindu Business Line, 25 May.

People of no religion outnumber Christians in England and Wales - study by Harriet Sherwood in The Guardian, 23 May.

A good new book: Fraud, Manipulation and Insider Trading in the Indian Securities Markets by Sandeep Parekh in The Wolters Kluwer.