Sunday, June 21, 2015

Release of information in machine-readable format

by Ashish Aggarwal.

All data starts out in computers. All data is analysed using computers. However, all too often, materials are produced and placed on websites which are not readable by computers. This dramatically drives up the cost of using the data. There is much to gain from an insistence that the materials which appear on websites -- of financial firms and of regulators -- are machine readable

One example of a success story is mutual funds who provide data like NAV (Net Asset Value) of their schemes as an electronic feed. Third party websites are able to use this to provide annualised return on portfolios, or analysis and comparison of historical data. None of this would have been possible if mutual funds had a chaos of diverse presentation with different fund houses giving out data in different ways.

The above-mentioned electronic feed is an example of machine-readable data. "A computer file" does not constitute machine readable data. Machine readability is obtained where the data can be read and processed by a computer for further analysis and interpretation. Comma Separated Values (CSV) is one example of a machine-readable data format. Other examples include XML files.

The gains from machine readable data


The value of even minimal information, when made accessible in machine readable form, is remarkable. As an example, suppose a government releases adequate information for all consumer courts to be placed on google maps. Once this is done, consumers can start rating the courts. This can support policy analysis and improvement of the courts which are laggards. As more information is released, more sophisticated applications become possible. If case load data about consumer courts is made available, third parties could build software and systems through which one could get a fairly accurate estimate of the queue and expected hearing slot if a complaint were to be filed on any given day. If data on financial firms against whom complaints are filed is also loaded, one would know which firms are generating more complaints.

Consider a household survey run by a regulator. The regulator can release a PDF file with a report which analyses the survey evidence. This is useful and interesting. A big jump is obtained when the regulator releases the record level data. This would make possible novel analysis by third parties, of kinds that may have never been envisaged by the regulator.

A revolution is shaking the world of finance globally, the financial technology revolution. This is critically about opening up data access to new kinds of firms, while access is controlled by consumers. This is about shifting ownership of data from financial firms or governments to consumers, and giving consumers access to sophisticated analytical services which add value.

Developments internationally


These ideas are not unique to India; they are changing the way governments and regulations work worldwide. The U.S. Government’s Open Government Directive of 2009 is one early example of a government that created such an obligation. It said that to the extent practicable and subject to valid restrictions, agencies should publish information online in an open format that can be retrieved, downloaded, indexed, and searched by commonly used web search applications. An open format was defined as one that is platform independent, machine readable, and made available to the public without restrictions that would impede the re-use of that information.

Initially this led to resistance, inconsistent formats etc and required government to create capacity to make it happen. After that, the US government has set up data.gov, home to its open data initiative with tools, and resources to conduct research, develop web and mobile applications and design data visualisations. It followed this up in 2013 by making open and machine-readable the new default for government information.

Many countries have embarked on similar initiatives. As an example, see a paper tabled in 2012 in the UK Parliament about unleashing the potential of open data. In 2011, the Open Government Platform (OGP) was launched as an international platform for domestic reformers committed to making their governments more open, accountable, and responsive to citizens. Since then, OGP has grown from 8 countries to the 65 participating countries. India is not yet on that list.

Implications for the draft Indian Financial Code


The Indian Financial Code (IFC) has drafted strong reporting mechanisms so as to achieve accountability of financial sector regulatory institutions. This needs to be pushed further into the direction of the release of machine readable data. Good reporting can be used more effectively, if the data tables and charts can be read and analysed with minimum frictions through computer programs.

In Chapter 16, `Functioning of the financial agency', the first section `Minimum standard for publication of information' (S.74(2)) says:

All information published on the website or other repository of the Financial Agency must be in an easily accessible and text-searchable format.

The phrase `machine readable format' needs to be defined and used in the law. This would encourage innovative financial sector firms and third parties to provide analysis to consumers using tools like mobile based apps, thus helping consumers make better choices in a timely manner.


The author is a researcher at the National Institute for Public Finance and Policy, New Delhi.

1 comment:

  1. India has started an open data initiative a couple of years back: https://data.gov.in/. There is also a National Data Sharing and Accessibility Policy (NDSAP): https://data.gov.in/sites/default/files/NDSAP.pdf.

    ReplyDelete

Please note: Comments are moderated. Only civilised conversation is permitted on this blog. Criticism is perfectly okay; uncivilised language is not. We delete any comment which is spam, has personal attacks against anyone, or uses foul language. We delete any comment which does not contribute to the intellectual discussion about the blog article in question.

LaTeX mathematics works. This means that if you want to say $10 you have to say \$10.