Tuesday, February 19, 2019

Disclosures in privacy policies: Does 'notice and consent' work?

by Rishab Bailey, Smriti Parsheera, Faiza Rahman and Renuka Sane.

In a recent paper, Disclosures in privacy policies: Does notice and consent work?, we evaluate the quality of the privacy policies of five popular online services in India -- Google, Flipkart, Paytm, WhatsApp and Uber. Our goal is to examine whether the present notice and consent regime is broken because of the way in which privacy policies are designed.

We analyse the identified privacy policies from the perspective of access (how easy they are to find and how easy they are to read) and on issues of substantive content (how well they conform to well recognised principles of a model data protection law). In doing so, we evaluate whether the policies have specific, unambiguous and clear provisions that lend themselves to easy comprehension. The versions of the privacy policies accessed for this study were dated as of March 2018, i.e. before the European General Data Protection Regulation (GDPR) became enforceable.

We also try to evaluate how much users typically understand of what they are signing up for, and whether this can tell us if consent is an effective tool to enable individual control over personal data in the online environment. We conduct surveys in five universities in and around New Delhi, randomly assigning one of the five privacy policies to each student in the classroom along with a questionnaire. The questions are classified into three categories -- 'easy', 'intermediate', and 'difficult'. The easy questions have a simple and direct answer in the policy. The intermediate questions require a closer reading of the policy, making it slightly harder to find the correct response. The difficult questions require careful reading and some inference. We evaluate the level of understanding of the policies based on the number of questions answered correctly.
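The survey design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not describe the actual randomisation mechanism or data format, so the function names, the seed, and the answer-key structure below are all hypothetical.

```python
import random
from collections import defaultdict

POLICIES = ["Google", "Flipkart", "Paytm", "WhatsApp", "Uber"]


def assign_policies(student_ids, seed=0):
    """Randomly assign one of the five policies to each student (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    return {sid: rng.choice(POLICIES) for sid in student_ids}


def score(responses, answer_key):
    """Number of questions answered correctly for one respondent."""
    return sum(1 for q, answer in responses.items() if answer_key.get(q) == answer)


def mean_score_by_policy(assignments, scores):
    """Average quiz score for each policy, given per-student scores."""
    by_policy = defaultdict(list)
    for sid, policy in assignments.items():
        by_policy[policy].append(scores[sid])
    return {policy: sum(vals) / len(vals) for policy, vals in by_policy.items()}
```

A simple independent draw per student, as here, does not guarantee equal group sizes; a real survey might instead shuffle a balanced list of policies.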

Setting the context: Why is this question important?

The 'notice and consent' framework has been the basis for much of the thinking in modern data protection and privacy laws. It relies on the ability of providers to collect and process personal data conditional on providing adequate information to, and obtaining the consent of, the data subject. Its intuitive appeal lies in the normative value of individual autonomy that is the cornerstone of modern liberal democracies. Seeking consent is said to ensure an individual's autonomy and control over her personal information, enabling 'privacy self-management' (Solove, 2013).

There is, however, a growing concern around the inability of this model to provide individuals with meaningful control over their data in light of evolving technologies and data practices (Mathan, 2017). Concerns in this regard arise due to numerous reasons, including the fact that most people do not read the policies, do not opt out or change the default privacy settings (CPRC, 2018; ISOC, 2012), are not able to understand the policies, face consent fatigue and are therefore unable to make rational choices about the costs and benefits of consenting to the collection, use, and disclosure of their personal data (McDonald, 2008; Solove, 2013). Many privacy harms flow from an aggregation of pieces of data over a period of time through interconnected databases of different entities, or from the use of complex machine learning algorithms to make automated decisions. It is, therefore, unrealistic to expect people to assess the impact of permitting the downstream use and transfer of their data (Solove, 2013). Moreover, privacy policies are often binary in nature where people either have to fully opt-in or completely opt-out of using the services (Cate, 2013).

A large body of literature has therefore evolved that demonstrates that consent is broken, and yet, accepts the necessity of finding ways to make the notice and consent regime work better. These points about the evolving nature of consent have also been acknowledged in policy and legal debates. For example, in Europe, the recently enforced GDPR has continued (and in fact attempted to strengthen) the consent model implemented under the Data Protection Directive of 1995, while also setting out several duties of data controllers. In August 2017, the Supreme Court of India recognised the fundamental right to privacy (Puttaswamy, 2017). Around the same time the Government of India constituted a committee under the chairpersonship of Justice B. N. Srikrishna (Srikrishna Committee) to draft a data protection law. The Srikrishna Committee's report and the draft Personal Data Protection Bill, 2018 submitted by them to the Government have affirmed the central role of an effective notice and consent regime, making consent one of the grounds for processing of data.

As per the Srikrishna Committee's recommendations, for consent to be valid, it should be 'free, informed, specific, clear and capable of being withdrawn'. In case of 'sensitive personal data', the draft Personal Data Protection Bill, 2018 proposes a higher standard of 'explicit consent' with additional requirements on what would amount to informed, clear and specific consent with respect to such data. Given the critical role of consent in the draft law, it becomes important to ask whether, and how, consent-based frameworks can be made to work better.

Results: Accessibility

We first analyse accessibility of the selected privacy policies (Google, Flipkart, Paytm, WhatsApp and Uber) based on a series of measures including how embedded a policy is within a particular website, the length of the privacy policies, and the languages they are made available in. We find that the policies can generally be accessed through 1-3 clicks (from the main web page). However, the links to the privacy policies are usually positioned at the bottom of the main web page, and in relatively small font size. This does not lend itself to easy discoverability, particularly as links to the privacy policies are usually not highlighted.

As far as length is concerned, the privacy policies of the Indian companies we studied are significantly shorter than those of the multinational companies ('MNCs'). This is largely because the MNCs' policies touch upon a greater number of issues and give more detailed explanations of rights and obligations. Some of this may be because the MNCs' policies follow obligations under data protection laws of foreign countries that contain more onerous requirements than India's Information Technology Act, 2000 (and the rules under it).

Interestingly, Google is the only company amongst those studied that provides a copy of its privacy policy in languages other than English. Even where the other websites are available in Indian languages (for instance, Uber's website can be accessed in Hindi), the privacy policy continues to be accessible only in English. This is clearly a problem in a country where English speakers make up only roughly 10-15 percent of the population.

Results: Readability

While measuring readability is not an exact science, tools such as the Flesch-Kincaid reading ease and grade level tests have been used for decades to analyse metrics such as word and sentence length and their impact on readability. It should be noted that these tests do not actually analyse the meaning of the words used, whether they could have multiple or ambiguous meanings, whether they are commonly used, etc. It is therefore possible for a completely incomprehensible text, made up of short sentences with short but rarely used, complex or ambiguous words, to have a high readability score. Having said this, the scores do provide a useful comparative metric to evaluate the readability of the privacy policies.

Applying the Flesch-Kincaid test to each of the privacy policies under study, we find that the policies are rated as either 'very difficult' (Uber, Google, Paytm) or 'difficult' (Flipkart, WhatsApp). The reading ease scores of the policies ranged from 16.44 (Uber) to 41.03 (Flipkart), where a higher score indicates better readability. To put these scores in context, Reader's Digest has a readability score of about 65; the Harry Potter books are in the 80s; and the Harvard Law Review is in the low 30s (Lively, 2015; Flesch, 1979).
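To see what the test actually measures, here is a minimal Python sketch of the standard Flesch reading ease formula: 206.835 - 1.015 x (words per sentence) - 84.6 x (syllables per word). It is not the tool used in the paper. The syllable counter is a crude vowel-group heuristic, the function names are illustrative, and the bands are the ones commonly attributed to Flesch, so scores computed this way will differ somewhat from those reported above.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption, not a dictionary lookup): count vowel groups."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # A trailing silent 'e' usually does not add a syllable ("use", "policies" aside).
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text):
    """Flesch reading ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def difficulty_band(score):
    """Map a reading ease score to the commonly cited Flesch bands."""
    bands = [
        (30, "very difficult"),
        (50, "difficult"),
        (60, "fairly difficult"),
        (70, "standard"),
        (80, "fairly easy"),
        (90, "easy"),
    ]
    for upper_bound, label in bands:
        if score < upper_bound:
            return label
    return "very easy"
```

Because the formula only looks at sentence length and syllable counts, long sentences full of multi-syllable legal terms push the score down quickly, which is consistent with the low scores of the policies studied.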

The results therefore indicate that all the privacy policies under study are complicated documents and require a firm grasp of English and reasonably advanced comprehension abilities to be understood. Given that the target audience for many of these online services ranges from adolescents upwards, it appears that the privacy policies will prima facie be too complicated for many users to comprehend.


Results: Visual presentation

Another way in which reading a privacy policy can be made easier, both in terms of readability and comprehension, is through the use of highlights, marginal notes and by properly segregating and identifying overarching topics. We find some evidence of this in the studied policies.

Uber's privacy policy is divided into multiple sections with each sub-heading in bold font. The policy also contains marginal notes that summarise each section, thereby making the policy easier to understand at a glance. Notably, Uber also provides an easy-to-read summary of their privacy policies in a separate "overview" page. Google's privacy policy also contains segregated sections, and a table of contents which permits easy access to different portions of the policy. Interestingly, the policy also frequently uses layered information or pop-ups where additional information is presented pertaining to certain terms and activities when a user moves the cursor over highlighted words. While WhatsApp also provides segregated sections, it does not generally provide additional information in a layered manner or highlight particularly important information (though, certain highlighted terms do allow click-throughs, for instance "Facebook family of companies" and "cookies").

The two Indian companies -- Flipkart and Paytm -- do not provide layered information or any further click-throughs in their privacy policies. Flipkart demarcates sections using a bold font (in the same font size as the rest of the document), while Paytm utilises a larger font size, in bold, for section headings. The effects of some of these presentation strategies, like click-throughs and pop-ups, are however not reflected in our survey results as the survey was conducted using printed copies of the privacy policies.

Results: Terminology

We focus next on the kind of terminology used in the privacy policies. Our focus here remains on the text of the policies, without getting into the manner in which the policies may be implemented in actual practice. We note that the use of legal and technical terminology in a privacy policy can lead to a decrease in comprehensibility for the user. Unless a word is specifically defined, a user may not be aware of its true import, particularly if it is technical in nature.

For instance, WhatsApp's privacy policy says:

we do not retain your messages in the ordinary course of providing our Services to you

The policy does not define what the phrase "ordinary course" implies or explain what the exceptions are. A user may, on a thorough reading of the policy, come to understand that an exception may apply to situations where, for instance, law enforcement is involved. However, there is no clarity on this. Similarly, the use of words and phrases such as "third party", "affiliate", "profiling", etc., may also lead to confusion in the minds of users given the absence of any specific definitions.

Connected to the problem of lack of adequate information within a privacy policy, is the issue of whether the information being provided is trustworthy and reliable. While it is outside the scope of the present paper to examine the issue of trust in online services, it must be kept in mind that online businesses frequently appear to treat user privacy rights with less than due respect (not least due to the lack of bargaining power and information asymmetry between the parties).

Results: Substantive analysis

For the substantive analysis of the policies, we analyse the policies based on how they conform to well recognised principles of a model data protection law -- i.e. whether they detail the methods and manner of collection of data; the permitted uses of data; information sharing practices with a third party, including with affiliated entities, and law enforcement; whether users are informed of data breaches; whether users are given rights pertaining to access, deletion and export of data; and whether users can seek clarifications or information about the uses of their data or the privacy policy itself. We also evaluate whether these policies have specific, unambiguous and clear provisions that lend themselves to easy comprehension.

The table below provides a snapshot of whether policies have specific provisions on the ten issues identified as a basis for analysis of policies (Y indicating that the issue is addressed in the policy, N indicating it is not, and NS indicating the issue is not specified.)

Our analysis indicates that many parts of the policies are poorly drafted, often containing language that seems intended to insulate the company from liability rather than genuinely informing the user. In several cases, the policies do not include rights that would be considered essential in a modern privacy framework (for instance clauses covering data breach notification, or data retention periods). Sometimes, the policies also seem to assume that the user has knowledge of legal terms and is up-to-date with statutory and other regulatory requirements in their jurisdiction (for instance, the policies studied frequently use terms such as 'to the extent permitted by law', 'as permitted by law', etc.).

Overall, we find that privacy policies are fairly widely drafted to permit service providers broad powers to collect and process information in pursuance of their business interests. Users currently have little to no leeway in amending the contracts entered into by them and must usually sign up for the entire contract if they wish to access the service (though certain services such as Google and WhatsApp do include some granularity in their privacy policies).

Results: Survey

Survey respondents do not obtain very high scores on the privacy policy quiz. The average score of the sample (155 students) is about 5.3 out of 10, i.e. on average respondents were able to correctly answer about 5 of the 10 questions. The policy-wise scores varied from 4.6 (WhatsApp) to 5.9 (Uber).

Respondents fared the worst on policies that had the most unspecified terms, and on policies that were long. They also seemed unable to understand terms such as 'third-party', 'affiliate' and 'business-partner', that are often used in the context of data sharing arrangements.

Not surprisingly, we find that a greater percentage of respondents got the easier questions (as classified by us) correct. For example, almost 76% of the respondents got the correct answer to Q1 on collection of data; about 68% got the correct answer to Q5 on data sharing with the Government, as this information was explicitly provided in most of the policies. The more difficult questions, classified based on factors such as the use of complex legal terms or ambiguity about specific provisions, saw poorer results.

We believe that the complexity of the language and inadequacy of specific details in the policies are reflected in the low understanding of respondents. What is interesting about the responses to the survey is that when provisions are clearly drafted, or when users can be expected to find the answers in the policy, they are more likely to evaluate the questions correctly. However, when terms whose meaning is not precisely defined are used (such as 'third-party' and 'affiliate', for example), respondents make more mistakes. This suggests that in an environment where respondents actually do read the policy, and when the policy is unambiguously drafted, respondents are able to make better sense of what is being offered to them. Better design and drafting of privacy policies is therefore a prerequisite for notice and consent to work better.


While surveys of a similar nature have been conducted in other jurisdictions, we are not aware of any similar study (to understand how users interact with privacy policies) involving Indian participants. The peculiarities of the Indian context throw up new challenges of diversity in language, literacy, modes of Internet access and other variations among the over 500 million Internet users in India. All of these factors will play a role in determining the appropriate design of disclosures and consent frameworks for Indian users.

Our study makes a modest start in that direction by asking how well educated, English-speaking users fare in terms of understanding privacy policies. Making the same privacy policies accessible to the larger set of Indian users, many of whom are first time adopters of technology, is undoubtedly going to be a much larger challenge. The study therefore raises further questions on what drives understanding of privacy policies: do factors such as age, education, intelligence quotient, comfort with English, urbanisation and familiarity with Internet-based services all play a role in how an individual evaluates what is on offer? It also raises questions on how privacy policies should be designed so that users are able to understand them better.

Ultimately, the goal of privacy policies should be to make it possible for individuals to evaluate trade-offs between privacy and service, and make choices that suit their preferences, which might themselves change over time. Finding ways to make the notice and consent framework more meaningful is an essential part of this process.


Solove, 2013: Daniel Solove, Privacy self-management and the consent dilemma, 126 Harvard Law Review 1880 (2013).

Mathan, 2017: Rahul Mathan, Beyond consent: A new paradigm for data protection, Takshashila Discussion Document 2017-03, 2017.

CPRC, 2018: Consumer Policy Research Centre, Australian consumers soft targets in big data economy, 2018.

Flesch, 1979: Rudolf Flesch, How to write plain English: A book for lawyers and consumers, 1979.

ISOC, 2012: Internet Society, Global Internet user survey, 2012.

McDonald, 2008: A McDonald and LF Cranor, The cost of reading privacy policies, I/S: A journal of law and policy for the information society, 4(3), 543-568, 2008.

Puttaswamy, 2017: Justice K.S. Puttaswamy v. Union of India, WP (Civil) No. 494 of 2012, Supreme Court of India.

Cate, 2013: F Cate and V Mayer-Schonberger, Notice and consent in a world of big data, International Data Privacy Law, 3, No. 2, 67-73, 2013.

Lively, 2015: Gerald Lively, Readability, Book Notes Plus, April, 2015.

The authors are researchers at National Institute of Public Finance and Policy. They would like to thank Omidyar Network for supporting this research.
