Monday, October 15, 2018

The Justice Srikrishna Report and Digital Monopolies

by Jai Vipra.

Much has been said about the recommendations of the Justice Srikrishna Committee Report on Data Protection, particularly with respect to the broad exemptions made for the state on the non-consensual use of people's data. (The Quint, 2018) However, there is little analysis of how a largely consent-based framework affects people's control over data and over the effects arising from the use of their data.

In this article, I make the case that digital monopolies exist and are driven partly by exclusive control over data; that these monopolies can be broken up with free data portability; that the provisions in the Draft Data Protection Bill are strict on consent but lax on portability, and that by being so, they might end up entrenching existing digital monopolies.

The Committee Report takes the approach of fiduciary responsibility of data. In this fiduciary relationship, the generator of data is the 'data principal', and the user or holder of that data is a 'data fiduciary'. The data principal consents to transfer her data to the fiduciary. The data fiduciary then has the responsibility to use the data principal's data in her best interest and in a fair and reasonable manner. This approach was taken because the Committee considers this the best way to protect a person's privacy. (Sengupta, 2018) However, as I will discuss below, privacy is not the sole value to be protected in this case. I also discuss how an exclusive focus on privacy and consent might lead to outcomes inimical to the social good.

Digital monopolies and control over data

To begin with, making the choice to entrust your data to someone with fiduciary responsibility might guarantee privacy. But it does not automatically mean you control the data and the value resulting from the use of that data. One of the harms that can result from massive data collection is the risk to privacy. But another harm is the limiting of competition.

Consider why, while the rest of the market was shrinking, Google and Facebook together captured USD 32.7 billion growth in digital advertising spending (in the first half of 2016). (Holland, 2017) With the large amounts of data being collected by these companies, precision targeting in advertising wins over all other kinds of advertising. Precision targeting means that advertisements are targeted to the individual based on their preferences and activities, rather than to groups of individuals. Companies that are able to offer precision targeting are able to capture the market, while all other companies lose out (Matsakis, 2018).

Coupled with the existence of network economies, only a few large companies are able to offer precision targeting in this manner. The problem here is not that precision targeting is necessarily bad, but that data control with network effects leads to monopolisation of the market. (Hindman, 2009) Hindman's empirical work also shows how the digital economy is more concentrated than traditional media.

The effects of monopolisation are not limited to the advertising market. Some examples might help illustrate the kinds of harms digital monopolies lead to. An entertainment platform like Netflix that controls most of the market can easily refuse content creators a market for their work, or can charge them exorbitantly due to the lack of competition born from the control of viewer data. Proven examples also exist: Google provided Getty Images with a choice of either allowing users to download images directly from search results, or excluding Getty Images from search results, both unviable choices for Getty Images. Yelp faced allegations of extorting restaurant owners to purchase ads on the threat of bad reviews.

Due to monopolisation and particularly due to the emergence of Facebook, Amazon and Google, even though more content is being created than ever, less money is flowing to content creators as platforms soak up an ever-increasing share of the returns (Taplin, 2017). It is now mathematically impossible for a small company starting off in a garage to compete with Google. (Hindman, 2009) Network effects in the digital economy are driving predatory pricing and entrenched market power. (Parsheera et al., 2017) A digital monopoly can be as harmful as any other monopoly, and control of data (in addition to network effects) is an important way in which digital monopolies are maintained.

Data portability as a solution

What then are the solutions to monopolisation? One cannot change the fact that network effects exist in two-sided markets. But one can change the other part of the equation, that is, the control of data. If we want to reduce the monopolistic hold that it is possible to have over data, we have to make the data freely portable. This must be contingent on user consent to said portability, given the privacy concerns with data transfer. Thus, we have to create the ability for the user to transfer the data he generates to himself or to another entity, perhaps a competitor of the data fiduciary.

Argenton and Prufer (2012) have proposed something similar in the context of Google biasing its search results to favour its own subsidiaries, such as Google Maps. With a model that analyses indirect network externalities, they establish that search engine data control leads to monopolisation and reduces economic welfare. They propose that search engines should be required to share data on previous searches (with other search engines) to countervail this tendency.

One example of a recent data portability policy is the open banking regulation in the UK. The UK's Competition and Markets Authority found that big banks in the UK had an unfair advantage over other banks and fintech companies because they held valuable data on their customers, for example, transactions data. It also found that consumers would save money by using more suitable financial products if their data was freely portable. Under the Open Banking Regulations, the banks are required to share customer data, with customer consent, with any service provider through open Application Programming Interfaces (APIs). This enables any entity to easily build products that compete with bank products based on that data. It also works to break up the data-driven hegemony of big players in the market.

What might data portability lead to?

However, there are some difficulties here, particularly if we ask who data really belongs to. Consider an example: if I buy a book on Amazon, I create a piece of data. This is not sensitive data, but it is valuable data. Amazon uses this data and derives value from it in multiple ways: with targeted advertising, tailored product pages, by collating it with other pieces of data and observing profiles and trends, etc. To do all this, it processes the data and modifies it. The original data might belong to me, but who does the processed data belong to? What, indeed, is processing? Collection is not a costless activity, and so is collection processing? Even once processed, it is not that straightforward to imagine that the data belongs entirely to Amazon. With respect to personal data, Arghya Sengupta likens processed data to an envelope with a letter in it. The letter still belongs to the user, but the envelope and the letter cannot be separated, which makes ownership tricky (Sengupta, 2018).

If we decided ownership based on the work put into production, all data (and indeed all property) would be collectively owned. But under current economic and legal arrangements, ownership is decided based on contracts. If I lease my land to you under a year-long contract, legally the land does not automatically belong to you even if you improve the soil quality by cultivating on it. Given these economic and legal arrangements, if I give Amazon my data under a contract that includes portability, Amazon does not own that data even if it processes it.

The Open Banking regulations described above essentially make all contracts conform to this norm of portability. In this way they aim to create a market where the choice of sharing data remains with the user and not the business. In the cases of open banking and search engine data, lack of portability was shown to create monopolies. When this is shown, there is a case for government intervention through mandating contracts with portability in order to fix this market failure.

In our example, this would mean that I can decide to transfer the data I generate by buying a book from Amazon. I could transfer it to an Amazon competitor, who then would also be able to offer targeted advertising. Or I could transfer it to an app that gives me financial advice based on my spending patterns. Amazon would still retain my data and profile for as long as I want it to. It would simply port a copy of that data.

Amazon, given its current business model, loses out in this deal. The extinguishing of potential property rights by outlawing contracts that do not include portability will not be a costless or seamless move. Open banking regulations may make banking as it exists today unviable. In fact, they are widely expected to change the nature of the banking business, turning banking into either a platform or service for other businesses rather than a standalone activity. (Finastra, 2018)

Likewise, free data portability in the overall economy might mean that Google or Facebook are no longer able to provide their services for free. However, the 'free' nature of these services hides costs that already exist. The free service, such as email, is bundled with an ad, and given network effects, this bundling leads to monopolisation. The costs of this bundling are all the costs of digital monopolies listed earlier. Portability does not directly cause unbundling, but it reduces the advantage of the market leader in bundling, and thus makes providing free services in return for ads (and other uses of data) less viable.

The status quo is that platforms are free and make money from the control over data. This status quo includes monopolisation. Free data portability means choosing a model where platforms charge for use and do not make money from control over data. It also means that innovations that require big data can be made by smaller companies, public bodies, or your friendly neighbourhood programmer. Of course, the absence of monopolies is not welfare-enhancing in every context, and likely social costs and benefits of such an intervention need to be examined in much more detail.

The question of what kind of data should fall under free portability is not straightforward to answer. It is clear that some data would need to be portable - for example, ride history in Uber. But many other kinds of data are collected, such as how quickly you book a cab at certain times of day, when you book a shared ride versus an individual cab, how your phone battery level determines your willingness to pay a certain price, etc. Which of these data points is it reasonable to port?

In the UK open banking example, the Competition and Markets Authority (CMA) determined through a careful study the kinds of data that were driving monopoly and were thus important to port. (Competition and Markets Authority, 2016) Then, an Open Banking Implementation Entity (OBIE) was formed, governed by the CMA and funded by the banks mandated to share data. The OBIE determines API specifications and standards. A model on these lines, where the government and companies arrive at data to be ported through studies, might work. This data would differ from industry to industry.

Data portability in the Justice Srikrishna Report

The Draft Personal Data Protection Bill, released along with the Report, also has provisions on data portability. Data fiduciaries are obliged to provide to users, on request, data generated or provided by the user.

However, the fiduciary can refuse to share data on some grounds. These include that sharing would reveal a trade secret or would not be technically feasible, that processing of the data is necessary for the functions of the State, or that processing is in compliance with law. It can also charge a fee for providing the data in some cases. The user has legal recourse in the event that an exemption is used unreasonably. But that would likely be a long, costly process to be undertaken by each principal. Unlike the open banking regulations, in the Indian draft bill, data needs to be shared only with the user, and not with any entity with the consent of the user, creating an additional step for the user. This, along with the broad exemptions to sharing, severely restricts the mobility of data and hence consumer choice, and perpetuates monopolisation, as demonstrated above.

The draft Bill gives us extensive rights to say "Do not share my data". It does not give us nearly as many rights to say "Do share my data". More sharing rights would mean that Amazon would be mandated to automatically share my purchase history, with my consent, to anyone who asks, for free. It would also mean that Amazon would not have as many possible reasons to refuse to share this purchase history. There are both costs and benefits to this, but it is not a choice to be made without examining both.

In a way, we have already seen the effects of limiting data sharing (although not strictly free portability with consent) in favour of protection. After the Cambridge Analytica scandal, Facebook shut down third party access to users' friends' data in response to heavy criticism. This has affected researchers who relied on Facebook data, particularly on interactions with friends, for their research. (AEDT, 2018) The lack of an easy portability option has effectively made Facebook the sole owner of that data and will inhibit research.

The Draft Bill also puts in place collection limitations (only data that is necessary for the purposes of processing can be collected) and purpose limitations (personal data can only be processed for clear, specific and lawful purposes that are reasonably expected). Whether the collection and purpose limitations change the monopolistic nature of this market by themselves depends on (a) whether multiple uses of the same data have been driving monopolisation, or whether the sheer volume of data matters more, and (b) whether most people will withdraw consent to multiple uses of their data. It is possible that these questions have different answers in every case, and these effects need to be studied once these limitations are in place.

Consent in a concentrated market

Further, the onerous requirements of consent - different forms, layered notices, etc. - outlined in the Report are likely to lead to more monopolisation in the market. This is partly due to high compliance costs, which are easier for a large company to bear, and partly because restrictions on sharing data with third parties create an incentive to own the entire value chain. A social network will find it easier to make its own payment platform rather than transfer data to smaller payment platforms. When a business starts doing everything, it reduces choice for consumers as well as workers. The consent framework incentivises entities that already control data to use it for many other purposes - consent, and not competitive forces, being the only barrier.

A million different consent calibrations - I consent to see ads, but not to predictive text in my emails - do not change the business model of digital two-sided markets, which rely on freely generated data for value creation through targeting and prediction. There will exist creative ways of acquiring consent, and there will be a small subset of people who refuse consent - still changing very little about market structure. Besides, consent is not very meaningful in platform economies as they exist today. We consent to WhatsApp terms and conditions because all our friends are on WhatsApp - and opting out means missing out. In such conditions, consent can hardly mean that the larger effects of data use were chosen by the user.

Competition, interoperability and portability

Standards for interoperability function on the same lines of thinking. Telecom operators are mandated to provide interoperability across networks in order to ensure competition. For example, Airtel cannot refuse to connect calls to Vodafone subscribers. Data portability is somewhat like a way of providing interoperability so that markets are not captured. Data portability as such falls under the ambit of the Competition Act and is therefore a question separate from data protection. However, the implementation of data protection standards without consideration of competition issues might concentrate market power, and thus both these issues need to be considered together.

Conclusion

There is a growing body of literature on how consumers perform unpaid labour every time they use an AI product or service, as their use, through data generation, provides feedback to algorithms that make the product better. (Crawford and Joler, 2018; Hesmondhalgh, 2010; Arvidsson, 2008) In this context, continued user control of what is done with data merits consideration; control that might not be achieved through consent requirements. User control of data also means user control over the systemic effects of data use. Consent is important and necessary, but stringent consent provisions together with weak portability requirements in a monopolised market only serve to entrench existing monopolies. With this, all the good that can come out of data used in the public interest is also restricted.

Portability will not fix all the ill-effects of market concentration in the digital world, especially those of network effects due to the existence of platforms. But it will reduce one aspect driving market concentration, that is, data control.

While privacy is a valuable goal and stringent consent requirements do help achieve this goal, we must be careful not to conflate all issues related to technology in our times with the single issue of privacy. Privacy and security need to be balanced with opportunities for society to use its own data for its own benefit. The issue of portability needs to be examined in this context.

References

Adam Arvidsson, The Ethical Economy of Customer Coproduction, Journal of Macromarketing, 2008.

AEDT, Cambridge Analytica scandal: legitimate researchers using Facebook data could be collateral damage, The Conversation, 2018.

Arghya Sengupta, Why the Srikrishna Committee Rejected Ownership of Data in Favour of Fiduciary Duty, The Wire, 2018.

Cedric Argenton and Jens Prufer, Search Engine Competition With Network Externalities, Journal of Competition Law and Economics, 2012.

Committee of Experts under the Chairmanship of Justice B.N. Srikrishna, Personal Data Protection Bill, 2018.

Competition and Markets Authority, Retail Banking Market Investigation, 2016.

David Hesmondhalgh, User-generated content, free labour and the cultural industries, Ephemera, 2010.

Finastra, Bank as a Platform - The Essential Tools for Open Banking and PSD2, 2018.

Jonathan Taplin, Move Fast and Break Things: How Facebook, Google and Amazon Cornered Culture and Undermined Democracy, Little, Brown and Company, 2017.

Kate Crawford and Vladan Joler, Anatomy of an AI System, 2018.

Louise Matsakis, Facebook's targeted ads are more complex than it lets on, Wired, 2018.

Matthew Hindman, The Myth of Digital Democracy, Princeton University Press, 2009.

Smriti Parsheera, Ajay Shah and Avirup Bose, Competition Issues in India's Online Economy, NIPFP Working Paper, 2017.

The Quint, Experts React to Data Protection Bill: Key Concerns and Takeaways, 2018.

Travis Holland, How Facebook and Google Changed the Advertising Game, The Conversation, 2017.

 

The author is a researcher at the National Institute of Public Finance and Policy. The author thanks Anirudh Burman and Shivangi Tyagi for useful discussions. The two anonymous reviewers provided very helpful directions for thinking and insights that have been incorporated into this article.
