## Thursday, December 19, 2013

### From clubs to States: The future of self-regulating organisations

by Ajay Shah, Arjun Rajagopal, Shubho Roy.

This post is based on a talk at the 2013 National Convention of the Institute of Company Secretaries of India.

India's governance environment is undergoing rapid changes, and this will drastically re-shape the role of Company Secretaries. As a professional body, ICSI needs to understand and anticipate these changes, in order to ensure that its members are equipped to fulfill their critical role. While the ideas here pertain to ICSI, they also apply more generally to other self-regulating organisations.

### The citizen-government interface is changing

The development of the government-citizen interface of a country can be divided into two phases. In the first phase, the interface is characterised by poorly written regulations, wide variation in practice and very bad infrastructure. Each government office uses its own unique processes and practices, and requires physical filings on forms that are difficult to fill. Different branches of government collect the same information in different ways, and the same function and form vary widely across states.

In such contexts, professionals invest time in learning to "work" the system, and create a valuable niche for themselves as indispensable intermediaries between citizens and the state. Twenty years ago, filing a personal income tax return was challenging and often required professional help. This culture persists in many government offices: forms have "unique" requirements which only "experienced" persons know. This knowledge and experience comes from being a member of the professional organisation (the club), and the club prevents this knowledge from falling into the hands of non-members. As a result, citizens and businesses are "forced" to approach the club members to comply with laws.

The second phase of development occurs when the State-Citizen interface improves. This is occurring in India through computerisation and standardisation of processes and forms on the one hand, and increased empowerment of citizens on the other. The internet is changing the interface in two ways:

1. Many government services are moving on to the internet. While India has only 18% internet penetration, around 39% of railway tickets are sold online.
2. Even with poor systems, the internet is helping citizens deal with a poor state interface through HOWTO documents, computerised services, etc.

As a result, consumers of professional services begin to demand more of their service providers, and professionals are faced with an existential crisis.

### The profession's response

Professional organisations have two choices:

1. The knee-jerk reaction: defend their turf, fight to keep systems closed and inaccessible, try to increase the complexity of systems, and create and sustain non-transparent institutions.
2. The enlightened response: focus on long-term survival by aligning themselves with the interests of the consumers and industry they serve.

The knee-jerk reaction causes frustration among customers and clients. This leaves the profession vulnerable to being side-lined by cost-driven innovations in the economy. An example of this is the rise of Legal Process Outsourcing (LPO), which has allowed clients to access a broad range of legal services without hiring expensive lawyers. Worse, a profession can spiral into a vicious cycle of defensive and unethical behaviour, in which members put the interests of the profession above those of their clients and customers, and in which standards of competence and conduct begin to suffer. In extreme cases, persistent self-interested or indisciplined conduct can invite a devastating response from the political establishment, as occurred when the Government took over management of the Medical Council of India in 2010.

By contrast, an enlightened response to a changing environment would try to preempt such crises and focus on the long-term survival of the profession. In the long term, the profession will only survive if its interests are aligned with those of its clients, and if it provides a useful service to society at large. Strategically, it would make sense for the profession to focus on those roles in which it is truly irreplaceable, re-focussing its attention on its highest value services. And institutionally, the profession's governing body should move from being a "club" to being a "state".

### The way forward

The state model recognises that modern professional organisations are like regulators, in that they incorporate the broad functions of the modern state: legislative, executive and judicial. As such, they ought to be designed with the same internal safeguards and processes as the modern state. These include, most critically, defining and separating out these functions. A good SRO will carry out its legislative functions by making codes of conduct and defining conditions of entry into the profession. It will carry out its executive functions by holding exams to restrict entry into the profession to qualified individuals, and by investigating complaints against its members. And it will carry out the judicial function of disciplining its members.

The record on performance of these functions in India is mixed. We have sometimes been successful in defining codes of conduct and entry conditions, sometimes not. In terms of executive functions, Indian SROs have paid extreme attention to maintaining entry barriers, often at the cost of efforts to investigate complaints against the profession. In terms of the judicial function, Indian SROs are highly averse to disciplining their members in public, fearing that this will be seen as a sign of failure of the SRO. It is possible to bolster each of these functions, and ensuring the long-term survival of the profession requires that this be done.

In order to exercise its legislative function effectively, a modern SRO must:

• make detailed codes of conduct;
• make these codes of conduct available to the public;
• make these codes of conduct readable and comprehensible, through plain English guidance notes, FAQ pages, etc.

In order to exercise its executive functions effectively, a modern SRO must:

• ensure that the entrance exams it runs are performing their function correctly. This requires analysis of test results and periodic review of the design and content of the tests, to ensure that they are relevant and of an appropriate level of difficulty;
• engage in continuing professional education of its members, as opposed to voluntary and occasional seminars, and ensure that this continuing education reflects the rigour of the selection process for entry;
• ensure that complaints against the profession are taken seriously, and investigated, and that the investigations are time-bound, and that the complainants are informed about the status of the investigation.

Too often in India, the results of an entrance test are treated as a reflection only of the person taking the exam, and never of the person who organised it. We do not bother to ask whether the exam was fair. A recent data analysis of ICSE and ISC exams (high school exams) shows statistical evidence of poor exam design and manipulation of marks.

In a "club", standards of conduct are enforced by norms, not rules. Those norms are flexible, and defaulters are treated kindly. Clubs work well for small groups of professionals, who must rely on each other in an uncertain and unpredictable environment, which itself must be managed with a high degree of flexibility and discretion. A club is thus an appropriate model for an SRO in an early stage of its development. But as the environment changes, as processes become technologised and standardised, and customers become empowered, the club model will have to give way to a "state" model, which is less discretionary, less cosy, and less forgiving to defaulters.

Regarding the judicial function of a modern SRO, the organisation must:

1. have an impartial and effective judiciary;
2. have a fair system for addressing complaints;
3. have a detailed procedure for adjudication;
4. make rules allowing the complainant to participate;
5. ensure that adjudication proceedings are time bound.

Achieving this requires three ingredients. The first is a law, which will clearly set the bounds within which the SRO will operate. This law should be "anti-professional" in that it must be designed to protect the interests of society rather than the interests of the profession. Punishments should not only be meted out but publicly shown to have been meted out. In 2012, the New York City Disciplinary Committee publicly disciplined 65 lawyers (see pg. 32), including 13 disbarments. Many jurisdictions even publish practitioner-level information about disciplinary actions. We rarely see any comparable level of disciplinary action against professionals in India. The actions that do happen usually follow a big "scandal" like the Satyam failure, and soon fade from public memory. No comparable data is publicly available for Indian professional organisations. They seem to hide the data about disciplinary actions, and should therefore be presumed not to have carried out many.

India is going through an interesting time. The nature of the state is changing. RTI, Lokpal, e-governance, etc., are just examples of a move to a more perfect republic. SROs have a choice: get in the way or join the change.

## Wednesday, December 18, 2013

### The future of the Institute of Company Secretaries of India

On 7 November 2013, I gave a talk at the 41st National Convention of Company Secretaries, about the future evolution of ICSI from a club towards a State organ.

You may like to look at other content on the video channel of the Macro/Finance Group at NIPFP.

## Monday, December 16, 2013

### Small samples from big populations shouldn't bother us

by Rajeeva Karandikar, Director, Chennai Mathematical Institute.

In the last few weeks, there has been a lot of discussion about opinion polls. Some people have questioned if these have a scientific basis. Indeed, each time we disclose our findings based on an opinion poll, someone raises this question.

In this article, I offer a simple explanation of the scientific basis of an opinion poll. The key result is this: If the methodology is sound, an opinion poll based on a sample size of 25 thousand respondents in our country, where there are over 500 million voters, can yield surprisingly good projections of the vote shares of major parties.

Consider a lottery. Suppose you are told that a box contains lottery tickets and that each ticket has a number written on it: 1 or 1000. You can pay Rs 100, and then put your hand inside the box and draw one ticket from the box. The prize would be the amount written on the ticket (in Rupees). Most people would not agree to play unless they are told how many tickets in the box have the number 1 or 1000 written on them. However, if they are told that 99 percent of the tickets have the number 1000 on them, many may be willing to play. Indeed, even if the cost of playing the game was Rs 900, many would opt to play if 99 per cent of the tickets have 1000 written on them.

Suppose, instead, you are told that the host of the casino will put his hand in the box and draw a ticket. There are still 99 percent tickets with the number 1000 and only 1 percent with the number 1. And you have ascertained that all tickets are identical in all aspects other than the number written on them. Even then, you would be a bit apprehensive, as the host might have put the tickets with the number 1 at the bottom of the box, and given a chance the host can dig deep in and draw a ticket from the bottom. If you are allowed to shake the box and mix the tickets well, you would probably still play.

Now let us consider another scenario. A political party has two claimants for a Lok Sabha constituency, say Raghu and Prasad. Suppose the constituency has 5 lakh voters. Let us imagine that we have lottery tickets with the following characteristics: Each ticket has the names of 2501 voters from the constituency and also that the ticket is coloured Red if 1251 or more voters on that list prefer Raghu over Prasad and the ticket is coloured Blue if 1251 or more voters on that list prefer Prasad over Raghu. Suppose all such lists are written out on otherwise identical lottery tickets.

Let us assume that there is at least a 5 percent gap in the support level of the two candidates. It can then be shown that over 99 percent of the tickets will have the colour of the candidate with more support. This is just a question of counting and is purely arithmetical; no element of probability or statistics enters here. Thus it is a matter of fact and not of belief! Indeed, with a gap of exactly 5 percent, 99.3939507 percent of the tickets will have the colour of the candidate with more support.
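The counting argument can be checked by hand on a toy electorate. The sketch below is only an illustration under assumed numbers (12 voters, 7 of whom prefer the leading candidate, and lists of 5 names); the function names are made up for this example. It writes out every possible list, colours it, and compares the count with the closed-form count obtained by multiplying binomial coefficients.

```python
from itertools import combinations
from math import comb


def count_by_enumeration(population, supporters, sample_size):
    """Write out every possible list and count those the leader wins."""
    majority = sample_size // 2 + 1
    total = winning = 0
    # Assumption for the toy example: voters 0..supporters-1 back the
    # leading candidate, everyone else backs the rival.
    for ticket in combinations(range(population), sample_size):
        total += 1
        if sum(1 for voter in ticket if voter < supporters) >= majority:
            winning += 1
    return total, winning


def count_by_formula(population, supporters, sample_size):
    """Same counts, obtained from binomial coefficients instead of listing."""
    majority = sample_size // 2 + 1
    rivals = population - supporters
    total = comb(population, sample_size)
    winning = sum(comb(supporters, j) * comb(rivals, sample_size - j)
                  for j in range(majority, sample_size + 1))
    return total, winning


print(count_by_enumeration(12, 7, 5))  # (792, 546)
print(count_by_formula(12, 7, 5))      # (792, 546)
```

Both routes give 546 winning lists out of 792. For a real constituency the lists are far too many to write out, so only the formula route is practical.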

Return to our example of two candidates with a gap of 5 percent or more. If the party draws a ticket out of the box after mixing it well, it will end up knowing which candidate is more popular. Here the logic is that since 99 percent of the tickets have the colour of the more popular candidate, we can assume that the ticket drawn has the winner's colour. Once again the decision maker should ensure that the tickets have been mixed well.

Here are the percentages of tickets that will have the colour of the winner for different combinations of population sizes and sample sizes. In each case, we assume that there is a 5 percent gap in the vote shares of the two candidates.

| Sample size / Population size (total number of voters) | 500000 | 1000000 | 2500000 | 5000000 | 10000000 | 25000000 |
|---|---|---|---|---|---|---|
| 1001 | 94.35 | 94.34 | 94.34 | 94.33 | 94.33 | 94.33 |
| 1201 | 95.87 | 95.87 | 95.86 | 95.86 | 95.86 | 95.86 |
| 1401 | 96.96 | 96.96 | 96.95 | 96.95 | 96.95 | 96.95 |
| 1601 | 97.75 | 97.75 | 97.74 | 97.74 | 97.74 | 97.74 |
| 2001 | 98.75 | 98.75 | 98.74 | 98.74 | 98.74 | 98.74 |
| 2501 | 99.39 | 99.39 | 99.39 | 99.38 | 99.38 | 99.38 |
| 3201 | 99.77 | 99.77 | 99.77 | 99.77 | 99.77 | 99.77 |

The remarkable thing about this is that while accuracy increases as sample size increases, the population size (total number of voters) has only a negligible influence on the accuracy. This is somewhat counterintuitive but true. A sample of size 2501 will give practically the same accuracy whether the population size is 1 million or 25 million!
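A quick way to see this for oneself is to hold the sample size fixed and vary only the population. The sketch below is not the exact integer program from this article; it is a floating-point approximation that works with logarithms of binomial coefficients (via `math.lgamma`) so that it runs in a fraction of a second. The function names are made up for this illustration.

```python
import math


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def winner_majority_pct(population, sample, gap_pct):
    """Percentage of lists in which the leading candidate has a majority."""
    winners = (100 + gap_pct) * population // 200
    losers = population - winners
    majority = sample // 2 + 1
    log_total = log_comb(population, sample)
    # Add up the share of lists with j supporters of the winner,
    # for every j at or above the majority mark.
    share = sum(
        math.exp(log_comb(winners, j) + log_comb(losers, sample - j) - log_total)
        for j in range(majority, sample + 1)
    )
    return 100 * share


for population in (500_000, 1_000_000, 25_000_000):
    print(population, round(winner_majority_pct(population, 2501, 5), 2))
```

Up to floating-point rounding, the printed values should agree with the 2501 row of the table, and changing `population` by a factor of fifty should move the result only in the second decimal place.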

The following table gives the percentage of lists which have the colour of the winner when the gap between the winner and loser is only 2 percent. Here again we see that sample size determines the accuracy and population size has very little effect on it.

| Sample size / Population size (total number of voters) | 500000 | 1000000 | 2500000 | 5000000 | 10000000 | 25000000 |
|---|---|---|---|---|---|---|
| 1601 | 78.86 | 78.85 | 78.84 | 78.83 | 78.83 | 78.83 |
| 2501 | 84.20 | 84.17 | 84.16 | 84.15 | 84.15 | 84.15 |
| 3601 | 88.58 | 88.54 | 88.52 | 88.51 | 88.50 | 88.50 |
| 5001 | 92.24 | 92.19 | 92.16 | 92.15 | 92.15 | 92.14 |
| 8001 | 96.44 | 96.38 | 96.34 | 96.33 | 96.33 | 96.32 |
| 10001 | 97.83 | 97.78 | 97.75 | 97.74 | 97.73 | 97.73 |
| 15001 | 99.36 | 99.32 | 99.30 | 99.29 | 99.29 | 99.29 |

At the bottom of this article is a computer program written in Python which does these computations. You have to believe me or have a mathematical expert confirm the accuracy of the program and then run the same on a computer with Python installed (which is available freely at http://www.python.org/). You can change the population size, the sample size and the gap between the support levels to get the accuracy level of the corresponding sampling scheme.

The same situation applies when we conduct an opinion poll. We select a group of 2501 voters, and ascertain the opinion of this group, called a sample. It is the percentage of votes for a party in this chosen sample that we report as the estimated vote share of the party. The crucial thing is that our choice should be as if we had written all possible lists on lottery tickets, put them in a box, mixed them well and drawn one; the names on that ticket constitute the group. This is what is called random sampling. One can use random number generators to generate such a random sample from any list of voters.
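As a minimal sketch of what drawing one ticket from a well-mixed box looks like on a computer, the snippet below uses Python's standard `random.sample`, which picks a list of distinct items with every possible list equally likely. The electorate here is entirely hypothetical: 100,000 numbered voters, of whom the first 52,500 are assumed to prefer Raghu. The helper names and the seed are likewise only for illustration.

```python
import random


def draw_random_sample(voter_ids, sample_size, seed=None):
    """Pick sample_size distinct voters, every possible list equally likely."""
    rng = random.Random(seed)
    return rng.sample(voter_ids, sample_size)


def estimated_share(sample, prefers_raghu):
    """Fraction of the sampled voters who prefer Raghu."""
    return sum(1 for voter in sample if prefers_raghu(voter)) / len(sample)


# Hypothetical electorate, for illustration only.
POPULATION = 100_000
RAGHU_SUPPORTERS = 52_500  # a 5 percent gap: 52.5 versus 47.5

voters = range(POPULATION)
sample = draw_random_sample(voters, 2501, seed=2013)
share = estimated_share(sample, lambda voter: voter < RAGHU_SUPPORTERS)
print(f"Estimated share for Raghu: {100 * share:.1f} percent")
```

In a real poll the preference of each sampled voter is of course found by asking them; the lambda above merely stands in for that interview.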

Colloquially, most people think that random means arbitrary. This is far from true in the scientific setting. Random sampling refers to the methodology of choosing a sample. In this context, it means choosing one list out of all possible lists as if we are drawing a lottery ticket (in the scenario described above). What I have described is the simplest sampling scheme. There are variations which may be more appropriate in a given situation.

Suppose we have access to a list of all telephone numbers in use in a constituency. We can use a computer program to draw 2501 phone numbers at random from this list, call these numbers and ascertain the view of each owner. This lets us estimate the opinion only of the group of people who have phones. The richer, urban, educated class will be over-represented in this group, so our estimate could be biased. This methodology is used in the US and seems to have worked well over the last 50 years; it did not work in the '30s and '40s, even in the USA, when telephones were not yet ubiquitous across the country.
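To make the source of this bias concrete, here is a small arithmetic sketch. All the numbers in it are invented for illustration: an electorate split into two groups with different rates of phone ownership and different levels of support for a candidate. It compares the true support among all voters with the support among phone owners, which is what a perfectly executed phone poll would converge to.

```python
def overall_support(groups):
    """True support for the candidate among all voters."""
    return sum(g["share"] * g["support"] for g in groups)


def support_among_phone_owners(groups):
    """Support among voters who can be reached by phone."""
    reachable = sum(g["share"] * g["phone"] for g in groups)
    backing = sum(g["share"] * g["phone"] * g["support"] for g in groups)
    return backing / reachable


# Hypothetical figures, chosen only to illustrate the mechanism.
groups = [
    {"name": "urban", "share": 0.30, "phone": 0.80, "support": 0.60},
    {"name": "rural", "share": 0.70, "phone": 0.20, "support": 0.45},
]

true_support = overall_support(groups)
phone_support = support_among_phone_owners(groups)
print(f"All voters:   {100 * true_support:.1f} percent")
print(f"Phone owners: {100 * phone_support:.1f} percent")
```

With these made-up figures the candidate is just short of half among all voters but comfortably ahead among phone owners. No increase in the number of calls repairs this, because the error comes from who is on the list, not from how many are sampled.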

Thus the most important ingredient in the opinion poll is the methodology of sample selection. One must be sure of getting opinions from a representative sample. Unless the sampling is done properly, there is no statistical guarantee that the estimate would fall within 2.5 percent of the true vote share (with 99 percent probability for the sample size of 2501).
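The margin quoted above can be cross-checked against the usual normal approximation for a simple random sample. This is a rough check under that assumption (and the worst case of a 50 percent vote share), not the exact counting method used elsewhere in this article; `margin_of_error` is a name made up for this sketch.

```python
import math
from statistics import NormalDist


def margin_of_error(sample_size, confidence=0.99, share=0.5):
    """Half-width of the confidence interval for a vote share,
    using the normal approximation to a simple random sample."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(share * (1 - share) / sample_size)


moe = margin_of_error(2501)
print(f"Margin of error for 2501 respondents: {100 * moe:.2f} percentage points")
```

For 2501 respondents this comes to a little over two and a half percentage points, in line with the figure above. Note that the population size does not appear in the formula at all.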

Readers can experiment with the program and obtain accuracy of random sample based prediction for a given sample size, population size and gap in the support for the winning candidate and the losing candidate. The program prints the total number of lists, number of lists where winner has majority and then the last line is the accuracy (in percent).

Python code:

```python
g = 5           # Gap (in percent) in support between the winning and the losing candidate
psize = 500000  # Population size
ssize = 2501    # Sample size


def binomlist(N, R):
    '''Return [binom(N,0), ... , binom(N, R-1)]'''
    a = [1]
    for k in range(1, R):
        # binom(N,k) = binom(N,k-1)*(N-k+1)/k, and the division is always exact.
        assert (a[k-1]*(N-k+1)) % k == 0
        a.append((a[k-1]*(N-k+1))//k)
    return a


n = psize  # Population size
print('Population Size :')
print(n)

print('Gap in the level of support between the two candidates (in percent):')
print(g)

m = (100+g)*n//200  # Total number of supporters of the winning candidate
print('Total number of supporters of the winning candidate :')
print(m)

k = n-m  # Number of supporters of the losing candidate
print('Total number of supporters of the losing candidate :')
print(k)

r = ssize  # Sample size
print('Sample Size :')
print(r)

s = 1+(ssize)//2  # Majority mark in the sample
print('Majority mark in the sample :')
print(s)

t = r-s  # Largest number of losing-candidate supporters on a winning list
b = binomlist(n, r+1)
c = binomlist(m, r+1)
d = binomlist(k, t+1)
print('Total number of lists :')
print(b[r])

# A list with j supporters of the loser and r-j of the winner can be
# formed in c[r-j]*d[j] ways; the winner has a majority when j <= t.
z = sum([c[r-j]*d[j] for j in range(0, t+1)])
print('Total number of lists in which winning candidate has majority support:')
print(z)

y = (z/b[r])*100
print('Percentage of lists in which winning candidate has majority support:')
print(y)
```

Output:

Population Size: 500000
Gap in the level of support between the two candidates (in percent): 5
Total number of supporters of the winning candidate: 262500
Total number of supporters of the losing candidate: 237500
Sample Size: 2501
Majority mark in the sample: 1251
Total number of lists: 622316905814464800031244865646036080797226642877806798507697548117420428264404728870158307029244805751394862496575128049930960170259665272404859716770124601013025142186862666094410521008369094641692705248149062898253232678209487378887686383067216573252135009200999062341745504599166768778011226480152418623932267406113916934196903934352793844488464981646119176904859389163090224441868536787165403397209968239206327618954862034383804302545903749252962527618682876133626697493651254546313748796931601428198693048759066549213490950558384425624146689770247661799591300110216105756629109561342475645217384773134461962616048023025434101460681326703421554750070950247433238670457954001431767273840292819769336001680792975102918494450930670710836846850037300589465197102470349453760302798210297014729237401921022050257974754525310046675964137276366704652157298677542838333743853031453879480513594044534035943615253785584100336297592759324981920969822918008494705715182870632294314479591333857921380844903046669391236576151898220998741210792951319871782067670844772084231163615394229385685268596763091304660658888020812484626579395701826998156254539013863583183500227099956252888286037939161089044280086097342996992214375663368352402575340853934794911866650796551901034282378007388880069647008129404982361108221844780217804152601368666721643262313106508955212481217558591078669387795651303349133210949336010297304361840824850790295581705698191650535715427959917272173309661004145272213646869645299207261632384922938922283269480011172934681388580235169394579946645672618500063119337569473042855610862487882008035643750930037728487756818421972099821004785558638463385842819065990094755832220844879808183080400336715879849935159745586844750222779019700990535412235421341558425045160322248044451834513173801495899700322128045756280820982770814639578390779208988695975866205159959700085142481672318105553361583687605408585846406972408808590687599805463015449451733210695537213509728117024657
76038261514751366432380505653269990734628912787921196828014924849957148325947484479478464943528525829530723712207177801854498505379313242978072796608415660424672105172137755545024900415945428256536045336980540671661266557344947764836566529722714879021182976140139129856559145427658178495007317534394739394235188377026548923486253173751616379130541552155758837114472809783850427754469844587936072840642351778220558057232669828423498123063458914684776588125631122103174618980604765576707899260689467306579408356058711623351399260178055917659963408455273580719144814560484832919904878765961591225700327651593973860864438116094850680456864518725262740928051582341539254987525019787256071659676928373298021282046954594050755383326687971809253561980088484298073161856459452325694037739274837365216230582263924667803583485781780218253284218391730866178085262881994066033816393829013721311131748672183078728900933581558734405974104874756534969248232592313582502037142672654121741282578372979805465908127950187274075397490336844923270342615399463969649554126623283032616619860556580636328136753628805680852851407963270089714073626621120839871909711594994253666342272359471320655869954863460407565986906474615595552330627592635455243206575046823339900668550971352633374485688587126074795047908585598712155945668417607256395345722088439538463447076588501001219788542858153049781833657942552401464715635534332260515218505056897905685877043364499338188335401451485749380823361891367165284575198795795706325965459245970820182968416588569262205864560967474390631523041120889653260589129456152566220058515917669342363795423128034942492225269840456119934077650384674330202082758512409968193166923108572334911544560155774100118425500347543423269124631844558114435125175017555500011451376956685419999212743508413967187310121355328643814596714505035428570563667905329220023568761920372318672748066491538386362503932919067623642006192288629490332239924529275660392604135364518178709645251568947829790968485350757937039473
40389283522101910494638050145617026179254177965390561011260925798206928843488382153934845663897870627749026609678580191245018523063853982970192435294226289485527014293025903197204628258656389282775439726139413566985319236990953694925249648990538489051969364421937428531195086983884099986257644973808961649801024791703854144367171083520020925830342961834117282885721017254635203310832071408647527583301365881617256285164015301320159493592180544485851186222298692161164962078702570276492893803041687846978454217857289928379238171254954798995737571216271583497105230690842477755325039889793366068811462617534333829645910505210690781590780049363210615313037139026574111887962177382759926821051065765464756796180567987322777498849473411701570554075360917403197083579347794586790780146839117003223499844939831280549478436501575490526110349649291447552201378415406974788911816774562264881782345309629383455270942503809941075121101018468408847806222956745854538538187351421515200366709710959779056876519806507742794652504538425496705303276766207944259902286382599225569764267019508736806130354418202638327938179375653758519507203450103228207221801477615553668622606759020710505997896285626415201585732982305461887457510794355576931387932210734277304570045890545478910751187046157816458186878222310277821865349187185519227095181899716429495833497404577367816746942348562517731737091584817733673633283677799636160970790298704669421486304889011355485441305056753420409472568639036215572196367214109670658164999281587768171367606387864794970842653391889386004121133866671767850698833785676601603945235156218270585187953072321977066166039068313459803616688773468350811612539361783752166302665084544685621897478765112903675598155278553981302414217332857094875961334675726383815260294423420508632754305158825408909038659311303184167236287558241995104511767621467798389298221576809189038841045961864839470197063681198463440858856539688780859811683627641354639627842836243949605040310402667913018127259941000036101004
7778136678130879383747883747846895444550308537252212467158365879188315673992959209980186727731070450877815644643251764435971587530519135727567687253468334262068241333011975327403420996864840461485541038137569067936744784955690032673877577363436808601455485916733141230626610962300477240992891148835574452260329156066010179688692107195572679448377022660187493117482552354887255967959473828708949465982835085658015438685159219153348125861982966902706795929866032389611325518509983815234810190419913364567450264118122918625091636534133701922567413915709199174234722642774022748876761798838932368596471473383761819955993306073194192511980655731511111418734812039174839069481997922657860771600177158258301963426575629546453731005562602020307021534742971566271133060854173312518512084425581890020765636041299469386187750782069745075999996018174251440607754688000000
Total number of lists in which winning candidate has majority support: 618545358594745578559908021052377529970030792269152663982955667097865715708651269443909682106375184235420795271025700229991666889842742962358851391628903715148148451520106228051586201689124686906931882456829669547873578165111167797637331634027004769336309591648433054332018505788453700912887488540613900747474393888658459742021619302625993957317273531910680115900811527701968774079716364269517319940859072678047688529378250498038381004735211310630529262679441915229073426583175517649049603340723866315077544407461714335274671447331820122553053261994553191594977595628231526639682628657774571032717001285460285081858402326115236954059347289640351182379608680149733264479227061165443543360343476146783840221629745270379043211911813985345755698488826300557999107497971723045644917724231993367898516122160702032042755240942331124529146122891256623703860462577151562896166175745434513002145018412470967876749129796830700617136055434962432194272590750341713504393693205840867109809283793664897592032621003851244892350150290505513010747517814459927812579816544356672725276066976780670988317094439292401279291285473944611748711635096265645985563183249693481116711671201104245081556768427847757600594449812048341877398127538619699289712224207611318557886326049405322473744910231576836179943240640233256004954265624748955224818170882903065076020987615865564241831651927034069605500383397979678199735796030298243465427594361503333925969150150637509639404505113808557368792409492030567279578208066540335323826526979443351873515381935023027526774389797160693713838390233195814097155334312615193180569635673586150929165679186463448090171926892544418731946070327991765800055188477445133149446451324714075406625043643414255582883617170829391606554460290646703177322289635612054097373301149510132185908047684378090710920582354408174619364845738239226099542002586970637245728452225477803655191854761387230591499230146445558375464724303043277763791
90867288941326831870645002682813073558114144397020856651396098641428245880407763399111302626215417904725128805326719787159608089308759310539659291511216704403271866844555932756610742675279329626214802593821101127278170409131325586212124151710165142757100487590767687382044871782315436075121171542572626492638808471680968040729208050469140926077912584273576340641257014821813020214765524809778864954507078512918888714254859177939535150020680594439865146499021168025610541634139341501484522062829556587659324452928788309544618343094453013332661828069966660143355367154612846409938357974206448260728642168060123578255631198332463974385754771418670196985850132396338939942461308023817769221799318621949923468227274010642413121733050935121767411991316887764568127280843446615938837129446630717759517950376477713807259688797866576252414083146600686650880207899667891677604126827061960251420067738418355966327207257392917436336731865485642950327130932493289464011591491047704107981756427219291055264143312206202230577972263205514048071843127797269511291476129899775547198248411450106880937713920234740112362636474232781683958078134701589591040282984364866200009733100910271820163097453414872456237992771088179902024383111236689464059435281063899471133588050088649308537531041274287153585336369394790451199357846698579951685792888991089679835339060335104126017261132825267018640531309505427477537620870911733521553658152454976725660963872168787051805718895517444599140932756344873557326427855400701615942270826310406658514734551562588662900750800729828396653844953945482326221733257969297201087018127132328768798046176023879900010060373993216522794747523289640979321071773794218866605017475730371283611610761350911346861500008729919019301067072919496561285987332861930046572860663808696120651875913544097783371879746140834717719015688160225361094599128341000683910024247665018933294496488786624226050404864707017014357988489253639376601921455654391172067437596345359240709330773941058600329053762687941644065
92263314564413046914831786576041110369262430788468317481961867144206915271070713416267202734623737483409735033617092845341660024435520720081105473941952939712415412607105179213319989512872695021469075137396490272366725589092449101323260365729979268502500402371305705889342846486877931529517474948646855060131870247723916361434104464106299693268702487202781154807761800123461586454874626813025017119277861748813800496756430669824853518771742107636510634880610682153734694964384670044933035859936547538247024691861129287194480912196911101786459854582905143843785028822189902210018471204240475408680399634512091203049971000431302219468288674422170840679624251177737170722303371347461436894061560651893816240453971759957271017940428104385085142008851273447753698971870744559577239542462031144481706797360996777783470040929727741554946507411532805706444213367551980493867845721284687008010010945596489972907196518190561255594548849356018036605860674847631728011986599133197508241872669014729773670979608262256253270634189630346809369306264792640971274759121208393570325828919401494070911474556903716160833383246954483765463222844272182270272400362717099678882461910456897297593491193237803747312055676576576219594163267617151294590912924244844463855363653660959413112928215253941271400958586889242832044089566335186773388165223958094914268468348135324856279887375728791675758046862738999115585281355044891796932789291109793399041919601596037427305147317543899320632313725640318610936344828266858616631135448538493293180414433998877440300440240589694524572542270243173369012401702292465722732304766896181447721871407256818631335907998741194013039668179971554052123126844195344771852985989580398026271040138765802760665650005720512684985147825545935827916178380882436357130979409508889037924763044489057792310075658454554910800667795614663760417279874281949252150782689875879189200663660759663541970817071874396797683722133372008332713236680799139419740451560052886276238924890580858726815599432033824499872
4029406492065210164089814201328490050287743177666822378851971606713160006189128166640525745262825626356517895066557744597519977449314900187385212625277914375251622206262429685098796590400768356218087685302088179153086859311291523273637686891639986221244405181640484253991269897217169690609115907769063954029261659568602193333309944736964180426430695238571165245318661728551017255810873045429123026855411671804358355994634963743745030745365862888759330245538534977291471181991934485986474528502644452587454925992820425856177106615897073589580463926178212465336178760260176190531289730768305813149258314589213372104868616854925650428977268668149331635952396377069894057586620585416152265929820699770161781555434581354014880088969190425658318935898315061156729251402541740877913593377241246796287235439071594744201325364245711979195094423501588318728952447886937824706628041515791435885246207092104530611200000

Percentage of lists in which winning candidate has majority support: 99.39395070510206.

## Friday, December 13, 2013

### Pre-election polls: Do they work, do they influence voters, and should they be banned?

by Rajeeva Karandikar, Director, Chennai Mathematical Institute.

I have been a psephologist for about 16 years and have had a fair amount of success in predicting seats in an upcoming election. Here is a post-mortem of what I said and what happened in the current round of elections.

Since 2005, I have been working with CNN-IBN and the Centre for the Study of Developing Societies (CSDS). CNN-IBN engages CSDS for the survey. CSDS does a great job of running surveys 'by the book'. I use vote share data from the survey to come out with seat projections, which CNN-IBN carries on air.

### How did we fare in the recent state elections?

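Seat count predictions on air (CNN-IBN)

| State | BJP | Congress | AAP | Others |
|---|---|---|---|---|
| Madhya Pradesh | 136-146 | 67-77 | - | 13-21 |
| Rajasthan | 126-136 | 49-57 | - | 12-20 |
| Chhattisgarh | 45-55 | 32-40 | - | 7-13 |
| Delhi | 32-42 | 9-17 | 13-21 | 1-5 |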

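The outcome

| State | BJP | Congress | AAP | Others |
|---|---|---|---|---|
| Madhya Pradesh | 165 | 58 | - | 7 |
| Rajasthan | 162 | 21 | - | 16 |
| Chhattisgarh | 49 | 39 | - | 2 |
| Delhi | 31 | 8 | 28 | 3 |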

These results show the power and the limitations of opinion poll based projections. If one simply counts the number of cases out of 13 in which the actual result is within the projected interval, the score is just 4. However, one should see these as having correctly predicted clear and decisive victories for the BJP in Rajasthan and Madhya Pradesh. For Chhattisgarh, we predicted correctly that the BJP would win, with a much smaller gap than in Madhya Pradesh and Rajasthan. So I would count all three as good predictions, with Chhattisgarh being very good.
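The count of 4 out of 13 can be checked mechanically. The sketch below is only an illustration of that tally, not part of the projection method: it copies the ranges and outcomes from the two tables above, takes the first table row to be Madhya Pradesh (as the surrounding text implies), and uses variable and function names that are my own.

```python
# Projected seat ranges (low, high) and actual seats, copied from the tables above.
# The first table row is taken to be Madhya Pradesh, as the surrounding text implies.
projections = {
    "Madhya Pradesh": {"BJP": (136, 146), "Congress": (67, 77), "Others": (13, 21)},
    "Rajasthan": {"BJP": (126, 136), "Congress": (49, 57), "Others": (12, 20)},
    "Chhattisgarh": {"BJP": (45, 55), "Congress": (32, 40), "Others": (7, 13)},
    "Delhi": {"BJP": (32, 42), "Congress": (9, 17), "AAP": (13, 21), "Others": (1, 5)},
}
outcomes = {
    "Madhya Pradesh": {"BJP": 165, "Congress": 58, "Others": 7},
    "Rajasthan": {"BJP": 162, "Congress": 21, "Others": 16},
    "Chhattisgarh": {"BJP": 49, "Congress": 39, "Others": 2},
    "Delhi": {"BJP": 31, "Congress": 8, "AAP": 28, "Others": 3},
}


def hits(projections, outcomes):
    """Return the (state, party) pairs whose outcome lies inside the projected range."""
    return [
        (state, party)
        for state, parties in projections.items()
        for party, (low, high) in parties.items()
        if low <= outcomes[state][party] <= high
    ]


total_cases = sum(len(parties) for parties in projections.values())
within = hits(projections, outcomes)
print(f"{len(within)} of {total_cases} outcomes fall inside the projected range")
for state, party in within:
    print(f"  {state}: {party}")
```

The tally treats every party in every state as an equally weighted case, which is exactly why it understates how well the projections captured the direction of each result.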

As for Delhi, we underestimated AAP support and marginally overestimated BJP but we had the ordering right: Congress in third position with the possibility of touching a single digit, and BJP as the single largest party.

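Vote share estimates on air (CNN-IBN)

| State | BJP | Congress | AAP | Others |
|---|---|---|---|---|
| Madhya Pradesh | 41% | 35% | - | 24% |
| Rajasthan | 43% | 33% | - | 24% |
| Chhattisgarh | 42% | 38% | - | 20% |
| Delhi | 33% | 23% | 27% | 17% |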

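The outcome

| State | BJP | Congress | AAP | Others |
|---|---|---|---|---|
| Madhya Pradesh | 44.9% | 36.4% | - | 18.7% |
| Rajasthan | 45.1% | 33% | - | 21.9% |
| Chhattisgarh | 41% | 40.3% | - | 18.7% |
| Delhi | 34% | 24.5% | 29.5% | 12% |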

Here, the survey has worked well, and the errors are generally within the statistically acceptable range. The conversion from votes to seats is a non-trivial transformation, since one needs to estimate the distribution of votes across the state in addition to the overall percentage of votes in the state. This requires building an appropriate statistical model. I will explain my methodology for this stage in a future article.

One can see that in MP and Rajasthan, there was underestimation of BJP votes by a few percentage points. The error in vote to seat conversion went in the same direction, and as a result our prediction was much lower than the outcome for the seats obtained by BJP. In Chhattisgarh on the other hand, the survey estimated the gap between BJP and Congress as 4% while the actual gap turned out to be less than one percent. In this case, the error in vote to seat conversion and the error in the vote share estimate cancelled out, and we got a result that was bang on. Of the four, I was the least confident about Chhattisgarh (and I had said so on air) since the estimated gap in vote shares was small.
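The size of these errors can be read off the two vote share tables. As a rough illustration (the names below are my own, and only the BJP and Congress columns are used), the following sketch computes the error in the BJP estimate and the estimated versus actual BJP-Congress gap for each state. It says nothing about the vote to seat conversion itself.

```python
# Estimated and actual vote shares (percent), copied from the tables above.
# The first table row is taken to be Madhya Pradesh, as the surrounding text implies.
estimates = {
    "Madhya Pradesh": {"BJP": 41.0, "Congress": 35.0},
    "Rajasthan": {"BJP": 43.0, "Congress": 33.0},
    "Chhattisgarh": {"BJP": 42.0, "Congress": 38.0},
    "Delhi": {"BJP": 33.0, "Congress": 23.0},
}
actuals = {
    "Madhya Pradesh": {"BJP": 44.9, "Congress": 36.4},
    "Rajasthan": {"BJP": 45.1, "Congress": 33.0},
    "Chhattisgarh": {"BJP": 41.0, "Congress": 40.3},
    "Delhi": {"BJP": 34.0, "Congress": 24.5},
}


def gap(shares):
    """BJP lead over Congress, in percentage points."""
    return shares["BJP"] - shares["Congress"]


def summarise(estimates, actuals):
    """Per state: error in the BJP estimate, plus estimated and actual gaps."""
    summary = {}
    for state in estimates:
        summary[state] = {
            # Negative means the survey underestimated the BJP vote share.
            "bjp_error": round(estimates[state]["BJP"] - actuals[state]["BJP"], 1),
            "estimated_gap": round(gap(estimates[state]), 1),
            "actual_gap": round(gap(actuals[state]), 1),
        }
    return summary


summary = summarise(estimates, actuals)
for state, row in summary.items():
    print(state, row)
```

On these figures the BJP share is underestimated by 3.9 points in Madhya Pradesh and 2.1 points in Rajasthan, while the Chhattisgarh gap shrinks from an estimated 4 points to an actual 0.7.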

In my experience, the predictive power of any opinion poll that is done a while before voting is rather poor. For one, any such poll can only measure the mood of the state or nation at the time of the poll and cannot estimate the potential change that can happen close to voting day. Some psephologists claim to estimate this change by conducting polls at regular intervals and then extrapolating. However, this assumes there are linear time trends in vote share, which is unlikely.

The other problem is the selection process that determines who in the general population (that is sampled in a survey) shows up to vote. The propensity to vote is not uniform across the socio-economic strata of society. One can try to factor this in, but that can inflate the error.

Exit polls are designed to take care of both these issues. However, choosing respondents in a randomised manner as they exit the booth is rather difficult and our experience has shown that it does not produce a representative sample (as measured by the gap between the socio-economic profile of the sample and population). Hence, we prefer to do a post-election poll, where in the days following actual voting, we do a household survey with sound methodology. In the current round, various exit polls had got numbers close to the actual results for MP, Rajasthan and Chhattisgarh. But for Delhi there was wide variation. And in the past, there have been occasions when exit polls had given an incorrect picture while we got it right with our post-poll.

Going beyond forecasting, these polls are valuable in understanding what was happening on the ground. As an example, the CSDS poll in MP, Rajasthan and Chhattisgarh showed that the gap between BJP and Congress vote shares was higher in rural areas. This diverges from the common view that the BJP is an urban party. The CSDS website gives the breakup of voting intention by various socio-economic groups, and this is valuable knowledge.

### Regulation of opinion polls

Do opinion polls influence voter behaviour? In each of the surveys done by CSDS, one standard question is: 'Who did the respondent vote for in the last election?' This refers to the last Vidhan Sabha election if the poll is for the Vidhan Sabha, and to the last Lok Sabha election if the current one is for the Lok Sabha. Almost invariably, the recall for whoever won the last time is much higher than the actual votes, even when the winner of the previous election is set to lose the current one.

Thus in 2011, a much larger percentage of voters seemed to recall having voted for the Left Front in 2006, though in 2011 they were voting for the Trinamool Congress. I may add that our estimate of the Trinamool vote share and seats in 2011 was very accurate. The same was true in Tamil Nadu where, while voting out the DMK, a much larger number of respondents seemed to recall having voted for them.

We have observed this time and again across various states and consistently over the last 15 years. The only explanation that I could come up with is that there is a general tendency to go with the winner.

This raises the concern that a political party may run a media campaign claiming that it is ahead in the polls. This justifies regulation (though not a ban). I feel this regulation could be self-regulation by the media, e.g. through the Press Council. The regulation should require that each published poll reveals, in the public domain, the detailed methodology of sample selection, the sample size, the socio-economic profile of the sample, the dates when sampling was done, the names of the core team members who supervised the survey and the methodology used to convert vote estimates to seat projections. All agencies that release such information should be open to an audit by an expert group formed by an autonomous body such as the Press Council.

CSDS and I have been very open about our methods. The sampling methodology is on the CSDS website, the sample size is always given on air and the socio-economic profile of the sample is also given on the CSDS website. I have written about the vote to seat conversion in an academic article: Predicting the 1998 Indian parliamentary election, Karandikar, R. L.; Payne, C.; Yadav, Y., Electoral Studies; 21, 1; 69-89, and have been talking about it in seminars.

Fortunately, over the years, the visibility of any one opinion poll has declined, with so many agencies doing polls and making contradictory predictions. In the recent Delhi elections, the range of seats projected for the Aam Aadmi Party was from 6 to 31 out of 70 seats! Hence, the salience of this debate has probably declined greatly.

I do believe that there is a feedback loop: if all surveys point in a certain direction, at least some voters tend to get influenced. But this is no reason for a ban. After all, newspapers and TV channels also talk about their assessment of the political situation and if they all seem to point in a certain direction, this too has an effect on the electorate. In addition, under the Indian legal system, while the government can easily do censorship on television, this is harder with newspapers and more generally on the Internet. Hence, even if a ban were desirable, it is not feasible.

## Thursday, December 12, 2013

### The changing role of women in India

#### The three modernisations

The trajectory of a country is about three modernisations: social, political and economic. Social modernisation is about establishing freedom and rights of individuals. Political modernisation is about achieving democracy, where there is rule of law, where State power is dispersed and restricted, where elections generate contestability. Economic modernisation is about achieving a high growth modern market economy, about a government that gets away from expropriation and central planning to a government that is focused on solving market failures.

All three modernisations interact in complex ways and fuel each other. As an example, Milton Friedman's 'Capitalism and Freedom' hypothesis is the idea that political modernisation fuels economic modernisation and vice versa. This is a well established idea in the discourse. I find it also interesting to think about the other two legs of the stool: the interlinkages between social modernisation and the other two kinds of modernisation.

#### The role of women

When we think of social modernisation and economic modernisation, the big thing that leaps out is the role of women. A society that does not respect women is under-utilising half its labour force. We would expect to see a causal impact of greater equality of women upon growth.

We in India are sometimes complacent about the role of women in India. India is famous for having women in leadership roles. In a dinner meeting by Larry Summers, I once said that India was world #1 on one measure of the role of women: the fraction of the top 100 financial firms that are headed by women. I once met Andre Beteille, and asked him: "When compared with 1947, in what aspect have things in India worked out much differently from what you expected?" He said: "The role of women in the elite." He said that for upper class women in India today, it's better than even Japan, which is otherwise a very advanced country. The daughters of the elite in India have no glass ceiling, which is better than what we see in most places.

On a population scale, however, things are vastly worse. Paramita Ghosh reports, in the Hindustan Times, on a crime victimisation survey of women with scary results. The India Today survey shows us that 79.3% of men believe that marital rape is okay. We don't know how many men in India act on this belief, but the report 'Why do some men use violence against women and how can we prevent it?' by the United Nations shows us scary facts from some Asian countries that have men who think similarly to what the Indian data is showing. The Supreme Court ruling of yesterday is a reminder of the distance that we have to go on achieving social modernisation.

#### Things are changing dramatically with the young

With human capital measures like literacy or graduating high school, a person tends to achieve them when young. If a person has not become literate or graduated high school by age 20, things are unlikely to change later on. Hence, the analysis of the cross section in the population is tantamount to looking at the history: what we see for (say) 50 year olds today is a description of what things were like, 30 years ago, for 20-year olds. Age-specific rates are like rings of a tree.

Literacy of the cohort aged 22.5 (Time-series reconstructed from age-specific rates visible in the cross section)

The graph above shows the literacy of the cohort entering the labour force, which I approximate as being the cohort at age 22.5. The blue vertical line stands for today. This is constructed using the cross-section visible in March 2013 from CMIE Consumer Pyramids, a quarterly panel dataset with 150,000 households covering 700,000 individuals. With children, high literacy rates are found early on, and this yields projections for literacy of the age 22.5 cohort in the future.
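The reconstruction amounts to shifting each age group back (or forward) to the year in which it is aged 22.5. A minimal sketch of that mapping follows. It assumes attainment is fixed by the reference age and ignores mortality and migration differences across groups. The function names are my own, and the rates in the usage example are made-up placeholders, not CMIE figures.

```python
def cohort_year(survey_year, age, reference_age=22.5):
    """Calendar year in which a person aged `age` at the survey is `reference_age`."""
    return survey_year - (age - reference_age)


def reconstruct_series(rates_by_age, survey_year, reference_age=22.5):
    """Turn a cross-section of age-specific rates into a sorted (year, rate) series.

    Assumes the rate for each age group was already fixed when that group was
    at `reference_age` (or, for children, will not fall by then).
    """
    return sorted(
        (cohort_year(survey_year, age, reference_age), rate)
        for age, rate in rates_by_age.items()
    )


# Usage with placeholder numbers (NOT survey data), for a March 2013 cross-section.
survey_year = 2013.25
placeholder_rates = {12.5: 0.95, 22.5: 0.90, 42.5: 0.70}
series = reconstruct_series(placeholder_rates, survey_year)
for year, rate in series:
    print(f"{year:.2f}: {rate:.0%}")
```

Ages below 22.5 map to years after the survey date, which is how the cross-section yields a projection for future cohorts.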

We see that overall literacy of the cohort entering the workforce has gone up from roughly 70% in 1990, when India began opening the economy, to roughly 90% today and will go up to 100% in the coming 15 years. In addition, there was a big gender gap, which has been significantly reduced and will fully go away.

Let's turn to high school graduation.

High school graduates in the cohort aged 22.5 (Time-series reconstructed from age-specific rates in the cross section)

It seems shocking to think that in 1990, roughly 7% of the cohort starting off into the labour force, at age 22.5, had passed 12th standard. This has gone up dramatically to 20%. Sharp growth is visible into the future when today's 15 year olds become age 22.5, and there is no gender gap with today's 15 year olds.

The third thing that I want to show from household survey data is the ownership of mobile phones.

Age-specific rates of mobile phone ownership

All of us have been hearing about the miraculous growth of mobile phones in India for a while, and have become a bit inured to the story. While a lot has happened, however, a lot remains to be done. The black line shows that with males, roughly 75% of the young and 80% of the old have mobile phones. The work in progress lies in taking this up to 100% for everyone. What's striking is the women. The upper red line, for March 2013, shows that 40% of girls have mobile phones, and this decays to 20% at age 45. On a related note, Avjit Ghosh, writing in the Times of India, talks about a paper by Yvonne MacPherson and Sara Chamberlain which finds that only 9% of adult women in Bihar have ever sent an SMS. There is a high rate of change with mobile telephony, even in the short timespan between the first data from CMIE (June 2010), which is the lower red line, and the latest data (March 2013).

#### Speculation

I feel that in the early decades after independence, we had a progressive elite, which was able to bring up daughters well and we made amazing strides at the top. But social modernisation took place only in the elite. For the bulk of the population, attitudes and indoctrination and levels of violence remained neanderthal.

M. N. Srinivas has emphasised the extent to which the rest of society aspires to catch up with the lifestyle and the values of the elite. In the early years, there was little catch up on the treatment of women: the elite and the proletariat coexisted like oil and water. Perhaps budget constraints came in the way of translating aspirations. Maybe poor households shortchanged daughters on nutrition and education and mobile phones and such like, thus encouraging subservience in daughters. In my opinion, the economic growth of the last 20 years is creating a new wave of households within which daughters are growing up differently. Daughters who have high school education and a mobile phone are going to engage with the world differently. As an example, they are less likely to accept sexual harassment and sexual assault. We may now be at the early stages of something very big.

Economic modernisation has created this phase of social modernisation. The rise of capable women who will not be pushed around will, in turn, fuel economic growth because we are then getting a superior labour force. There is an enormous distance to cover. In my opinion, it will be a story spread over two generations (50 years) starting from 2000, through which we will end up with something satisfactory on the role of women. Economic growth will create opportunities for women and for sensibly bringing up daughters, and the rise of capable women will fuel economic growth.