
Friday, October 28, 2016

The Diwali effect in Delhi air quality

by Dhananjay Ghei, Arjun Gupta and Renuka Sane

As Diwali approaches, we have learned to worry about air quality. Over the last few years, several studies have noted the increase in pollution levels during the period of Diwali, owing to increased commercial activity and firework displays. However, as we showed in our previous article, PM2.5 levels in Delhi vary considerably by location, time of day and month:

  1. Time Effect: The effect of Diwali is not uniform throughout the day and is more prevalent at some times of day than at others. We also need to adjust for the confounding effect of time: pollution levels are high during the night and low during the day.
  2. Location Effect: Several areas of Delhi are severely polluted all the time, whereas others see large variations in their pollution levels. Together, these variations make it difficult to attribute the entire increase in PM2.5 to Diwali.
  3. Month Effect: The date of Diwali varies in the Gregorian calendar between 17 October and 15 November. Pollution levels in these weeks are already high compared to the annual average. This is a confounding effect.

It is possible that the bad air that we see in Delhi at the time of Diwali is just the bad air quality of winter, and is not caused by Diwali. In this article, we attempt to quantify the increase in PM2.5 levels during the Diwali period. Does Diwali have an impact upon air quality? If so, by how much?

Issues in research design

The opportunity to identify a Diwali effect comes from the fact that Diwali is a 'moving holiday' which falls on a different Gregorian date each year. If it fell on the same date every year, its effect would be strongly correlated with the seasonal change in weather and could not be separated from it.

Our ability to analyse these questions is greatly hampered by the lack of data. As of today, the data only runs from 1/2013 to 10/2016.

The air pollution caused by fireworks includes many contaminants. The data that we are studying covers only PM2.5.

Pollution levels on Diwali

The data used for the analysis comes from the US Consulate based in Chanakyapuri and the Central Pollution Control Board for 4 locations (R K Puram, Punjabi Bagh, Mandir Marg, Anand Vihar). The data consists of hourly PM 2.5 levels across the five locations from January 2013 to October 2016. We winsorise the data at 1% on both ends to remove the extreme tail values.
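As a rough illustration of the winsorising step, here is a minimal sketch in plain Python. The function names, the linear-interpolation percentile and the sample readings are all assumptions made for this example; the article does not say which implementation was used.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 100]."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def winsorise(values, lower_pct=1.0, upper_pct=99.0):
    """Clip values below the lower percentile and above the upper percentile."""
    ordered = sorted(values)
    lo = percentile(ordered, lower_pct)
    hi = percentile(ordered, upper_pct)
    return [min(max(v, lo), hi) for v in values]


# Hypothetical hourly PM2.5 readings (µg/m3) with one implausible spike.
readings = [float(v) for v in range(1, 101)] + [5000.0]
clean = winsorise(readings)
```

Winsorising keeps every observation but pulls the extreme tails in to the cut-off values, so the number of hourly readings is unchanged.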

The effect of Diwali on pollution levels

We first estimate the effect of Diwali on daily data using an event study. We aggregate the hourly concentration of PM2.5, at each location, to arrive at the daily numbers. The day of the Lakshmi Puja is taken as the event day. Therefore, we get 3 events for each location. Next, we calculate the percentage change in PM2.5 concentration levels by differencing the logarithm of PM2.5 values. These are then re-indexed to show the cumulative change over a 20 day window.
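The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code behind the figure: the data layout (a dictionary keyed by day and hour), the function names, the symmetric window of ten days on either side of the event, and the synthetic readings are all hypothetical.

```python
import math
from datetime import date, timedelta


def daily_means(hourly):
    """Aggregate {(day, hour): pm25} into {day: mean pm25}."""
    sums, counts = {}, {}
    for (day, _hour), value in hourly.items():
        sums[day] = sums.get(day, 0.0) + value
        counts[day] = counts.get(day, 0) + 1
    return {day: sums[day] / counts[day] for day in sums}


def cumulative_log_change(daily, event_day, half_width=10):
    """Cumulative sum of daily log differences, re-indexed to 0 at window start."""
    days = [event_day + timedelta(days=k) for k in range(-half_width, half_width + 1)]
    logs = [math.log(daily[d]) for d in days]
    # Summing successive log differences telescopes to log(x_t) - log(x_start).
    return [x - logs[0] for x in logs]


def average_paths(paths):
    """Average several event paths position by position."""
    return [sum(col) / len(col) for col in zip(*paths)]


# Synthetic example: PM2.5 doubles on the event day and stays there.
event = date(2015, 11, 11)
hourly = {}
for k in range(-10, 11):
    day = event + timedelta(days=k)
    level = 100.0 if k < 0 else 200.0
    hourly[(day, 0)] = level - 10.0
    hourly[(day, 12)] = level + 10.0

path = cumulative_log_change(daily_means(hourly), event)
```

Multiplying the path by 100 gives an approximate cumulative percentage change; with several events per location, `average_paths` gives the mean path across them.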

Event study showing the change in PM2.5 around Diwali date (in days)

The solid line represents the average cumulative percentage change in PM2.5 values during the window, whereas the dashed lines represent the confidence interval calculated using bootstrapped standard errors. We see that pollution levels start increasing one day before Diwali, and keep increasing until two days after Diwali. The increase is statistically significant during the two days after Diwali. This can be attributed to two things: Diwali celebrations begin only on the night of Diwali, which leads to a significant increase the next day, and Diwali is celebrated over an extended period of time.
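For readers unfamiliar with bootstrapped standard errors, a minimal sketch of the idea follows. It resamples the event-level cumulative changes with replacement and builds a normal-approximation interval around the mean. The function names, the number of resamples, the 1.96 multiplier and the sample values are assumptions for illustration; the article does not describe its exact bootstrap procedure.

```python
import random
import statistics


def bootstrap_se(values, n_boot=2000, seed=0):
    """Standard deviation of the mean across resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(values)
    boot_means = [statistics.mean(rng.choices(values, k=n)) for _ in range(n_boot)]
    return statistics.stdev(boot_means)


def normal_ci(values, z=1.96, n_boot=2000, seed=0):
    """Mean plus or minus z bootstrapped standard errors."""
    centre = statistics.mean(values)
    se = bootstrap_se(values, n_boot=n_boot, seed=seed)
    return centre - z * se, centre + z * se


# Hypothetical cumulative log changes at one event day, one value per event.
changes = [0.10, 0.25, 0.05, 0.40, 0.30, 0.15, 0.20, 0.35]
lo, hi = normal_ci(changes)
```

Applying this at each day of the window would give the band around the average path. With only a handful of events per location, such intervals are necessarily wide.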

We now come at the same set of questions using a regression.

Contribution of Diwali to PM2.5: Regression analysis

Since Diwali is celebrated over a number of days we also define the following models:

  1. Diwali=t: Diwali
  2. Diwali={t-1:t+1}: 3 Days (day before Diwali, Diwali, day after Diwali)
  3. Diwali={t-1:t+2}: 4 Days (preceding day to two days after Diwali)
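The three window definitions can be turned into a 0/1 Diwali indicator along these lines. This is only a sketch: the function and constant names are hypothetical, and the Diwali dates listed are filled in for illustration (they are not given in the article) and should be checked before use.

```python
from datetime import date, timedelta

# Assumed Lakshmi Puja dates for the sample period; verify before use.
DIWALI_DATES = [date(2013, 11, 3), date(2014, 10, 23), date(2015, 11, 11)]

# (days before, days after) for the three specifications listed above.
WINDOWS = {"t": (0, 0), "t-1:t+1": (1, 1), "t-1:t+2": (1, 2)}


def diwali_dummy(day, spec):
    """Return 1 if `day` falls inside the Diwali window for `spec`, else 0."""
    before, after = WINDOWS[spec]
    return int(any(
        d - timedelta(days=before) <= day <= d + timedelta(days=after)
        for d in DIWALI_DATES
    ))


# Number of flagged days in 2015 under the widest window.
year_2015 = [date(2015, 1, 1) + timedelta(days=k) for k in range(365)]
flagged = sum(diwali_dummy(d, "t-1:t+2") for d in year_2015)
```

Every hourly observation on a flagged day would carry the same indicator value.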

The model is as follows:

\[ \text{PM2.5}_{it} = \alpha + \beta_1 \, \text{Diwali}_{t} + \beta_2 \, (\text{Diwali}_{t} \times l_{i}) + m_t + h_t + l_i + \epsilon_{it} \]

where $i$ is location and $t$ is time. Here, PM2.5 is the hourly measured level of the pollutant. The first model takes Diwali to be only the date of Diwali, the second model defines the Diwali days as running from one day before to one day after, and the third model considers Diwali from the preceding day to two days after. In addition, we have month ($m_t$), location ($l_i$), and hour ($h_t$) fixed effects. The base for the location interaction term is Anand Vihar. Robust standard errors are used throughout.
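To see what $\beta_1$ and the location interaction coefficients measure, it may help to look at a stripped-down version of the model with only location effects, the Diwali dummy and their interaction. In that saturated case, OLS reproduces the cell means exactly, so the coefficients are simple differences of averages. The sketch below is an illustration only: it drops the month and hour fixed effects, and the function name and data are hypothetical.

```python
from statistics import mean

BASE = "Anand Vihar"  # base category for the interaction, as in the article


def diwali_effects(rows):
    """rows: iterable of (location, is_diwali, pm25).

    Returns (beta1, {location: beta2}) for the saturated model
    pm25 = alpha + l_i + beta1 * Diwali + beta2_i * Diwali * l_i.
    """
    cells = {}
    for location, is_diwali, pm25 in rows:
        cells.setdefault((location, bool(is_diwali)), []).append(pm25)
    # Diwali minus non-Diwali mean, separately for each location.
    gap = {
        loc: mean(cells[(loc, True)]) - mean(cells[(loc, False)])
        for loc in {loc for loc, _ in cells}
    }
    beta1 = gap[BASE]  # Diwali effect at the base location
    beta2 = {loc: g - beta1 for loc, g in gap.items() if loc != BASE}
    return beta1, beta2


# Hypothetical readings, not taken from the article's data.
rows = [
    ("Anand Vihar", 0, 300.0), ("Anand Vihar", 0, 320.0),
    ("Anand Vihar", 1, 400.0), ("Anand Vihar", 1, 420.0),
    ("Mandir Marg", 0, 150.0), ("Mandir Marg", 0, 170.0),
    ("Mandir Marg", 1, 200.0), ("Mandir Marg", 1, 220.0),
]
beta1, beta2 = diwali_effects(rows)
```

In this toy data the Diwali effect at Anand Vihar is 100, the interaction for Mandir Marg is -50, and the implied Diwali effect at Mandir Marg is their sum, 50. The full model adds month and hour fixed effects, so its coefficients are no longer plain differences of means, but the reading of the interaction term is the same.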

Dependent variable: Hourly PM2.5 Concentration

| | (1) | (2) | (3) |
|---|---|---|---|
| Diwali | t = -0.177 | t = 8.496*** | t = 13.181*** |
| Chanakyapuri*Diwali | t = 0.638 | t = -5.100*** | t = -6.692*** |
| Mandir Marg*Diwali | 73.078 (t = 2.606***) | -67.943 (t = -4.450***) | -66.844 (t = -4.979***) |
| Punjabi Bagh*Diwali | 65.630 (t = 2.374**) | -49.033 (t = -3.254***) | -52.254 (t = -3.945***) |
| R K Puram*Diwali | 63.348 (t = 2.291**) | -54.228 (t = -3.589***) | -67.094 (t = -5.055***) |
| Month FE | Yes | Yes | Yes |
| Location FE | Yes | Yes | Yes |
| Hour FE | Yes | Yes | Yes |
| Adjusted R² | 0.264 | 0.264 | 0.266 |
| F Statistic (df = 39; 118803) | 1,091.020*** | 1,094.274*** | 1,103.673*** |

The first model (Column 1) shows that the baseline Diwali effect (i.e. at Anand Vihar) is not statistically different from zero: PM2.5 there on Diwali is indistinguishable from non-Diwali days. For locations other than Chanakyapuri, there is a statistically significant differential effect relative to Anand Vihar. For instance, at Mandir Marg the Diwali effect is 73.07 µg/m3 higher than at Anand Vihar, so Diwali adds on average 69.35 (73.07 - 3.72) µg/m3 of PM2.5 at Mandir Marg.

When we consider the second (Column 2) and third (Column 3) specifications, there is a statistically significant effect at Anand Vihar. Average PM2.5 is 99 µg/m3 higher over the three-day Diwali window, and 135 µg/m3 higher over the four-day window. While this may not seem like much given the already degraded air quality during these months, Diwali pushes pollution to alarming levels (above 400 µg/m3, against a monthly average of around 340 in October and November), which can have severe impacts on health.

In these wider windows, the Diwali effect is lower at the other locations than at Anand Vihar. Thus, on the main day of Diwali, Anand Vihar is not too different from other days, while the other locations see a larger increase than Anand Vihar. However, once we take into account the one to two days after Diwali, Anand Vihar sees the largest Diwali effect, and the effect at the other locations is lower.
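As a rough worked example of how the baseline and interaction coefficients combine, the implied Diwali effect at location $i$ is $\beta_1 + \beta_{2,i}$. The figures below take the rounded baseline effects quoted in the text (99 and 135 µg/m3) as $\beta_1$, which is an assumption, so the results are approximate.

\[
\begin{aligned}
\text{Column 2:}\quad & \text{Mandir Marg} \approx 99 - 67.9 \approx 31, \quad \text{Punjabi Bagh} \approx 99 - 49.0 \approx 50, \quad \text{R K Puram} \approx 99 - 54.2 \approx 45 \\
\text{Column 3:}\quad & \text{Mandir Marg} \approx 135 - 66.8 \approx 68, \quad \text{Punjabi Bagh} \approx 135 - 52.3 \approx 83, \quad \text{R K Puram} \approx 135 - 67.1 \approx 68
\end{aligned}
\]

On this reading, the negative interaction terms mean a smaller Diwali increase than at Anand Vihar, not a decline: the implied effect remains positive at each of these three locations.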


Very little is known, at present, about air quality and Diwali. Using the admittedly weak data resources, we have begun analysing this question here.

To the extent that these results are persuasive, they could help individuals plan strategies to avoid being in Delhi on these days. There is also a case for a Pigouvian tax on fireworks, in order to overcome the externality.

Previous work on Diwali, which helps us see other dimensions of Diwali, includes: 'Seasonal adjustment with Indian data: how big are the gains and how to do it' by Rudrani Bhattacharya, Radhika Pandey, Ila Patnaik, Ajay Shah, and 'IEDs in Diwali' and 'Toxic chemicals in Holi' by Ajay Shah.


  1. 1) What is the alpha for these three models?

    It is sometimes possible that, due to a high or low alpha in models 2 and 3, the betas for location*Diwali are shifted to maintain the location*Diwali averages. Hence the relative increase (w.r.t. Anand Vihar) may appear small, but when added to alpha, the final output may increase.

    2) Use of shifting Diwali dates to mitigate confounding:

    The primary objective of the analysis is obviously not to predict PM2.5 after Diwali but to assess the relative change in PM2.5 due to Diwali. The regression analysis does not explicitly try to mitigate the confounding effect mentioned in the article, namely that PM2.5 will increase anyway due to falling temperatures after Diwali.

    An easy way to mitigate the confounding effect is to compare the volatility of the Diwali month with the volatility of the same calendar month in a year in which Diwali did not fall in that month. For example, Diwali 2014 was on October 22. In 2015, Diwali was on November 11. Hence, if Diwali has an impact upon PM2.5, volatilities in November 2014 will be lower than volatilities in November 2015. The same interpretation will hold for October, but with the comparison reversed.

  2. The regression framework, by its design, has the potential to produce high positive betas for the days after Diwali due to the following two modeling issues:

    1) Climate heterogeneity in non-Diwali days:
    The regression sample appears to draw non-Diwali days from the entire year, from summer to winter. On summer days, PM2.5 would surely not have an increasing trend similar to that of the winter months, and summer days would in any case make up a high proportion of the non-Diwali days in the regression model.

    2) Homogeneity of Diwali days:
    The Diwali days in the sample are necessarily taken from early winter, so PM2.5 is expected to increase after Diwali anyway. Hence high positive betas for Diwali are somewhat obvious.

    To remove the above bias, a narrower time frame for the sample could be helpful.

    In general terms, Diwali appears to be a negligible culprit anyway, given recent news that PM2.5 had peaked even before Diwali due to reasons like crop burning and cold.

