Does cigarette demand respond to price increases in Uganda? Price elasticity estimates using the Uganda National Panel Survey and Deaton’s method

Objective To provide the first published estimates of the price elasticity of demand for cigarettes in Uganda and thereby contribute to growing the evidence base of the likely impact of excise taxes on cigarette consumption and tax revenues in Sub-Saharan Africa.
Method We use a linear approximation of the Almost Ideal Demand System along with expenditure data from the Uganda National Panel Survey and exploit the fact that prices of cigarettes vary across geographical space in Uganda.
Results We find that cigarettes are price inelastic in Uganda with elasticity estimates ranging between −0.26 and −0.33. That is, we expect that cigarette demand will decline by between 2.6% and 3.3% every time cigarette prices rise by 10%. These elasticity estimates are in line with international evidence and are robust to outliers in the data.
Conclusion Our estimates of the price elasticity of demand for cigarettes suggest that the authorities in Uganda can reduce cigarette consumption and simultaneously increase tax revenues by increasing the excise taxes on cigarettes.

I have a few comments that may help improve the otherwise well-written manuscript. Most of them deal with the methods. The authors use a linear variation of Deaton's AIDS, which is fine. However, I believe there might be some confusion with some aspects of the method. First, on page 5 of the copy I got, they say: "The fact that price variation is largely induced by an external factor (transportation costs) allows one to estimate demand elasticities that are free from concerns of reverse causality or simultaneity bias." I believe this is partly true (and a bit misleading). As I understand it, Deaton's AIDS exploits price variation that is induced by external factors, but it also exploits variation that is caused by quality shading. Households from different villages face a different set of prices (because of external factors, such as transportation costs), but households from the same village may choose goods with different prices (proxied by unit values). I believe AIDS exploits both variations to identify demand. According to the authors' results, a good deal of such variation is within clusters. I believe the authors can do a better job of explaining this and how it is incorporated in the results.
On page 6, the authors say that with quality shading the price elasticity of demand will be overestimated. Is that so? Let's assume, for instance, that a change in the price of an expensive, widely smoked cigarette brand leads to down-trading to a cheaper brand while the quantity of cigarettes smoked remains the same. Basically, that would mean that the estimated elasticity is close to zero, even though there has been a change in behaviour. Is elasticity overestimated or underestimated in such a case? Maybe the authors can answer this question and use this example to make this point clearer.
Just after that claim, the authors say that unit values are not the same as prices because of measurement errors (by the way, equation (1) is not an equation: it's an identity). I do not think this is correct. Unit values are different from prices even if everything is recorded perfectly. The main difference between them is that households are price-takers but can choose unit values of the goods they choose to buy. Prices are exogenous to households' choices; unit values are endogenous. That's the main difference (Deaton specifically states this) and, unless I am wrong, the authors should change this. They say: "Households are unlikely to correctly recall the amount of money spent on cigarettes and/or the quantity consumed. In some cases, the survey enumerator might incorrectly capture this information." The same could be said about price recording. One could ask about quantities and prices and there would be measurement errors.
On page 8, they say: "Knowing the pattern of the quality effects (i.e. the magnitude of β), allows us to correct our final price elasticity estimates". Usually, parameters associated with "quality" are associated with the income elasticity of demand. How is β associated with price elasticity? It's not obvious at all from the equation of price elasticity (7). Please clarify this point.
In addition, why is there no estimation of the income elasticity of demand? I believe it would be very important to have this estimate or to justify why it is missing.
On page 8, further below, they claim: "Because of the assumption that prices are fixed within clusters and the fact that we do not have price data, prices are proxied by cluster fixed effects". Again, my understanding of Deaton's AIDS tells me this interpretation is wrong. Prices are proxied by unit values and household characteristics. Fixed effects capture tastes, preferences and idiosyncratic factors, as the authors claim right after that sentence. If fixed effects capture prices, tastes, preferences, etc., how is it possible to untangle the price from the other variables to estimate elasticity? I might be wrong, but anyway this needs to be explained better.
Another question is: given that they cannot estimate a "participation elasticity" by using households that choose to buy cigarettes and those who choose not to buy, is there any self-selection bias in the sample? Why not use, for instance, Heckman's method to correct for that? Please clarify why this is not done.
Table 1 could also report the number of clusters (or the proportion of them out of the total) that have households used in the final estimation. If there is a large proportion of them that are not used, the authors should discuss the implications of this. In addition, it would be nice to understand why the average number of households per cluster is so low. Usually clusters are a set of blocks or small neighbourhoods or even villages. Less than two households per cluster seems a very small amount. Is this correct? Is there any implication of this in the results (see comment just below)?
Table 5 shows the results with bootstrapped standard errors. How many times was the sample bootstrapped? Was the bootstrap performed on the whole sample, or did it take some clustering of households into account? Are there any implications of the latter?

GENERAL COMMENTS
This is a well-written paper and makes an important contribution to the economics of cigarette smoking in the context of Sub-Saharan Africa. It uses a household panel survey to estimate price elasticity of conditional cigarette demand in Uganda.
While the study makes the best use of available data in applying Deaton's model, authors need to make further clarifications to the methods of data analysis and recognize the limitations of the study more clearly.
1. Table 1 shows average household expenditure decreases in real terms. Is it consistent with overall economic growth in the country?
2. There are less than two households per cluster on average. What is the range of the number of households by cluster? I am guessing that some clusters may have a single household. It can cause a limitation in using spatial variation in price for the identification of the effect of price changes. Because in these cases the cluster unit value is essentially the unit value of a household, which can hardly represent a full cluster. In fact, the finding in ANOVA that about 30% residual variation in unit value is due to within cluster variation might be an underestimate because of the representation of some clusters with single households. For these clusters, 100% variation comes from between cluster variation and cluster fixed effects cannot be estimated in equation 3.

Overall, the manuscript is very well-written, clear and with well-defined objectives. The topic addressed in the article is relevant to Uganda but also to other developing countries, especially those of Sub-Saharan Africa, where very little research is conducted on tobacco control topics. As the authors point out, the main findings are in line with similar studies conducted for other countries.

1. I have a few comments that may help improve the otherwise well-written manuscript. Most of them deal with the methods. The authors use a linear variation of Deaton's AIDS, which is fine. However, I believe there might be some confusion with some aspects of the method. First, on page 5 of the copy I got, they say: "The fact that price variation is largely induced by an external factor (transportation costs) allows one to estimate demand elasticities that are free from concerns of reverse causality or simultaneity bias." I believe this is partly true (and a bit misleading). As I understand it, Deaton's AIDS exploits price variation that is induced by external factors, but it also exploits variation that is caused by quality shading. Households from different villages face a different set of prices (because of external factors, such as transportation costs), but households from the same village may choose goods with different prices (proxied by unit values). I believe AIDS exploits both variations to identify demand. According to the authors' results, a good deal of such variation is within clusters. I believe the authors can do a better job of explaining this and how it is incorporated in the results.
Response: We thank Dr. Paraje for this comment. Indeed, the main source of variation for Deaton's method is the between-village variation in unit values that is largely exogenous; Deaton refers to this as the main identifying assumption. On the other hand, Dr. Paraje is right that some portion of the variation in unit values is a result of endogenous decisions that households make about quality, which he refers to as quality shading. A final source of variation is measurement error, as households incorrectly recall expenditures and/or quantities. Our results incorporate all three sources of variation in estimating price elasticities. To make the point that Dr. Paraje is referring to more explicit, we have now revised the manuscript and added the following sentence in the Method section on page 8: "It is also important to note that some portion of the variation in unit values comes from these choices in quality that households are making. This variation is also exploited below in deriving our elasticities."
2. On page 6, the authors say that with quality shading the price elasticity of demand will be overestimated. Is that so? Let's assume, for instance, that a change in the price of an expensive, widely smoked cigarette brand leads to down-trading to a cheaper brand while the quantity of cigarettes smoked remains the same. Basically, that would mean that the estimated elasticity is close to zero, even though there has been a change in behaviour. Is elasticity overestimated or underestimated in such a case? Maybe the authors can answer this question and use this example to make this point clearer.
Response: We thank Dr. Paraje for this comment. To explain why the elasticity is overestimated with quality shading, we make reference to the standard formula for calculating elasticities. Recall that the formula for the price elasticity of demand, $\varepsilon_p$, is given as: $\varepsilon_p = \frac{\Delta Q / Q}{\Delta P / P}$, where $\Delta Q$ and $\Delta P$ are the changes in quantity and price respectively. With quality shading, the shifting to a cheaper brand means that the numerator in the formula ($\Delta Q / Q$) hardly changes given that $\Delta Q$ is small. On the other hand, quality shading also implies that the change in price is smaller than it would otherwise be if there was no quality shading. In other words, the consumer, by choosing the cheaper brand, has avoided a huge change in price and therefore $\Delta P$ is smaller than it would otherwise be. Given that the denominator in the elasticity formula is smaller than it would otherwise be, the calculated elasticity is therefore bigger (in absolute size) than it would otherwise be. In other words, the price elasticity is overestimated.
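The direction of the bias can be checked with a small numerical sketch. All figures below are hypothetical and chosen only for illustration (a 10% rise in the true price, a 3% fall in quantity, and a unit value that rises by only 6% because households shade quality down); none of them come from the manuscript or the survey.

```python
# Hypothetical numbers for illustration only; none come from the manuscript.
def pct_change(old, new):
    """Proportional change from old to new."""
    return (new - old) / old

# Assumed true market price: rises by 10%.
price_old, price_new = 100.0, 110.0
# Assumed quantity purchased: falls by 3%.
qty_old, qty_new = 100.0, 97.0
# Assumed observed unit value: rises by only 6% because of quality shading.
unit_value_old, unit_value_new = 100.0, 106.0

dq = pct_change(qty_old, qty_new)

# Elasticity computed with the true price change in the denominator.
true_elasticity = dq / pct_change(price_old, price_new)
# Elasticity computed with the (smaller) unit value change in the denominator.
naive_elasticity = dq / pct_change(unit_value_old, unit_value_new)

print(round(true_elasticity, 2))   # -0.3
print(round(naive_elasticity, 2))  # -0.5
```

Under these assumed numbers, dividing by the unit value change rather than the price change moves the estimate from −0.3 to −0.5, that is, it overstates the elasticity in absolute terms.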
Here is Deaton himself describing why quality shading results in an overestimated elasticity: "[Any presence of quality shading] will cause price responses to be (absolutely) overstated. The argument is straight forward and depends on higher prices causing consumers to shade down quality. Suppose that when there is a price increase, consumers adapt, not only by buying less, but also by buying lower-quality items, thus spreading the consequences of the price increase over more than one dimension. We want to measure the price elasticity, which in an idealized environment would be calculated as the percentage reduction in quantity divided by the percentage increase in price [the equation above]. However, we observe not the price but the unit value. With quality shading, the unit value increases along with the price, but not by the full amount. As a result, if we divide the percentage decrease in quantity by the percentage increase in unit value, we are dividing by something that is too small, and the result of the calculation will be too large."
3. Just after that claim, the authors say that unit values are not the same as prices because of measurement errors (by the way, equation (1) is not an equation: it's an identity). I do not think this is correct. Unit values are different from prices even if everything is recorded perfectly. The main difference between them is that households are price-takers but can choose unit values of the goods they choose to buy. Prices are exogenous to households' choices; unit values are endogenous. That's the main difference (Deaton specifically states this) and, unless I am wrong, the authors should change this. They say: "Households are unlikely to correctly recall the amount of money spent on cigarettes and/or the quantity consumed. In some cases, the survey enumerator might incorrectly capture this information." The same could be said about price recording. One could ask about quantities and prices and there would be measurement errors.
Response: Dr. Paraje is correct that equation (1), strictly speaking, should be referred to as an identity. We have therefore re-written it with the identity symbol (≡) to signify that it is an identity and not an equation.
On the second part of his comment, Deaton does specifically say that unit values are not the same thing as prices because of two reasons: quality shading (described above) and measurement error.
Here is Deaton writing in 1997: "Quality effects in unit values are not the only reason why they cannot be treated as prices, and may not even be the most important. Because unit values are derived from reported expenditures and quantities, measurement error in quantity [and expenditure] will be transmitted to measurement error in the unit value, inducing a spurious negative correlation" (Deaton, 1997, p. 292).
Further, here is Deaton writing in 1988: "In surveys where households report both expenditures and physical quantities, it is possible to divide one by the other to obtain unit values. These unit values, which depend on actual market prices, suggest that there is substantial spatial variation in prices in many developing countries, a finding that makes good sense in the presence of high transport costs. However, it is not possible to use unit values as direct substitutes for true market prices in the analysis of demand patterns. Consumers choose the quality of their purchases, and unit values reflect this choice. Moreover, quality choice may itself reflect the influence of prices as consumers respond to price changes by altering both quantity and quality. Measured unit values are also contaminated by errors of measurement in expenditures and in quantities and are likely to be spuriously negatively correlated with measured quantities" (Deaton, 1988, pp. 418-419).
4. On page 8, they say: "Knowing the pattern of the quality effects (i.e. the magnitude of β), allows us to correct our final price elasticity estimates". Usually, parameters associated with "quality" are associated with the income elasticity of demand. How is β associated with price elasticity? It's not obvious at all from the equation of price elasticity (7). Please clarify this point.
Response: The parameter that allows one to assess whether quality effects are present in the data, at least within the context of Deaton's method, is the coefficient on the logarithm of expenditure in a regression of the logarithm of unit values on the logarithm of household expenditure. In other words, it is the coefficient on household expenditure in the so-called unit value regression. This is what Deaton refers to as the expenditure elasticity of quality. In our case, this expenditure elasticity of quality is captured by β. Knowing the magnitude of β allows us to avoid the pitfall of an overestimated price elasticity, as per the discussion of quality shading in point 2 above. In other words, it allows us to correct for a possible overestimate of the price elasticity of demand. This is what links β to the estimate of the price elasticity of demand as per equations (7) to (10) in the manuscript. We refer the reviewer to Deaton's discussion of this very point, from pages 288 to 292 and again from pages 294 to 299.
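A minimal sketch of such a unit value regression follows. It is not the authors' code: the data are simulated, the true value of β (0.10), the number of clusters and households, and all variable names are assumptions made for illustration, and household characteristics are omitted. The cluster fixed effects are swept out by subtracting cluster means, so β is estimated from within-cluster variation only.

```python
import random
from collections import defaultdict

random.seed(0)
TRUE_BETA = 0.10  # hypothetical expenditure elasticity of quality

# Simulate (cluster, ln_expenditure, ln_unit_value) for 200 clusters of 3 households.
rows = []
for cluster in range(200):
    cluster_price = random.gauss(0.0, 0.3)  # common to every household in the cluster
    for _ in range(3):
        ln_x = random.gauss(0.0, 1.0)
        ln_v = cluster_price + TRUE_BETA * ln_x + random.gauss(0.0, 0.1)
        rows.append((cluster, ln_x, ln_v))

# Accumulate cluster sums and counts, used to sweep out the cluster fixed effects.
sums = defaultdict(lambda: [0.0, 0.0, 0])
for c, ln_x, ln_v in rows:
    sums[c][0] += ln_x
    sums[c][1] += ln_v
    sums[c][2] += 1

# Within-cluster OLS slope of ln unit value on ln expenditure.
sxy = sxx = 0.0
for c, ln_x, ln_v in rows:
    mean_x = sums[c][0] / sums[c][2]
    mean_v = sums[c][1] / sums[c][2]
    sxy += (ln_x - mean_x) * (ln_v - mean_v)
    sxx += (ln_x - mean_x) ** 2

beta_hat = sxy / sxx
print(f"estimated beta: {beta_hat:.3f}")
```

Because the cluster-level price term is removed by the demeaning, the slope picks up only how unit values move with household expenditure inside a cluster, which is the quality effect described above. A cluster with a single household contributes nothing to this estimate, since its demeaned values are zero.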
5. In addition, why is there no estimation of the income elasticity of demand? I believe it would be very important to have this estimate or to justify why it is missing.
Response: Dr. Paraje is indeed correct in his observation that the income elasticity of demand is conspicuous by its absence. We have since added a discussion of the income elasticity of demand (strictly known as the expenditure elasticity of demand in this case because we are using expenditure data and not income).
That discussion now appears in the Method section (as equation 3.11) and in the Results section as Table 6.
6. On page 8, further below, they claim: "Because of the assumption that prices are fixed within clusters and the fact that we do not have price data, prices are proxied by cluster fixed effects". Again, my understanding of Deaton's AIDS tells me this interpretation is wrong. Prices are proxied by unit values and household characteristics. Fixed effects capture tastes, preferences and idiosyncratic factors, as the authors claim right after that sentence. If fixed effects capture prices, tastes, preferences, etc., how is it possible to untangle the price from the other variables to estimate elasticity? I might be wrong, but anyway this needs to be explained better.
Response: Dr. Paraje is correct that our description of why cluster fixed effects are included is not correct. We have since deleted the following sentence from the manuscript: "Because of the assumption that prices are fixed within clusters and the fact that we do not have price data, prices are proxied by cluster fixed effects".
The only sentences that pertain to the cluster fixed effects now read: "The cluster fixed effects allow us to control for cluster-level tastes and preferences. Similar tastes and preferences are to be expected for narrowly constructed clusters such as villages."
7. Another question is: given that they cannot estimate a "participation elasticity" by using households that choose to buy cigarettes and those who choose not to buy, is there any self-selection bias in the sample? Why not use, for instance, Heckman's method to correct for that? Please clarify why this is not done.
Response: There might be an aspect of selection bias in the sample. Sadly, the dataset that we use for our analysis does not contain any information about households that do not purchase cigarettes. We do not know anything about non-purchasing households (nothing about their expenditure, household characteristics, etc.). In other words, the non-purchasing households do not exist in our dataset and therefore we cannot use Heckman's method. The best we can do is estimate conditional elasticities, a point we make in the manuscript and whose limitations we explicitly acknowledge.
8. Table 1 could also report the number of clusters (or the proportion of them out of the total) that have households used in the final estimation. If there is a large proportion of them that are not used, the authors should discuss the implications of this. In addition, it would be nice to understand why the average number of households per cluster is so low. Usually clusters are a set of blocks or small neighbourhoods or even villages. Less than two households per cluster seems a very small amount. Is this correct? Is there any implication of this in the results (see comment just below)?
Response: This is a well-noted comment from Dr. Paraje. We have now included a new summary statistic in Table 1 that reports how many clusters there were in total in the survey and how many we actually use for the analysis. The former is referred to in the table as "Total number of clusters" and the latter as "number of effective clusters". For 2005, the total number of clusters was 322 and the number of clusters actually used in the analysis (effective clusters) is 178. For 2009, the number of clusters was 318 and the number used in the analysis is 121. Ideally we would want to use all the clusters in the analysis, given that the Deaton estimator relies on the number of clusters for its consistency properties. However, we cannot use clusters that do not have even a single household purchasing cigarettes, and therefore these clusters have to be removed from the analysis. Given that cigarettes are not a commonly consumed commodity in Uganda, there is going to be a preponderance of clusters without a single household purchasing cigarettes. The major limitation coming out of this is that our elasticity estimates might be influenced by clusters having extreme values (i.e. outliers). We address this possibility in the Robustness section of the paper, where we remove from our analysis any unit values that are extreme using the approach proposed by Guindon et al (2011). The results are not driven by outliers and are therefore not sensitive to the small size of the sample.
Coming to the second issue of very low cluster sizes (i.e. number of households per cluster), this is not surprising, as households that consume cigarettes are in any case going to be very few in low- and middle-income countries, and Uganda is no exception. However, the issue of small cluster sizes does not appear to be fatal when using Deaton's method. Deaton argues that the consistency properties of the price elasticity estimator depend on the number of clusters and not on the number of households per cluster. In other words, his estimates converge on the true population parameter when the number of clusters increases and not when the number of households per cluster increases. This is a consequence of the fact that his method relies on between-cluster variation and not on within-cluster variation. So the number of clusters is much more important than the number of households per cluster (cluster size). Deaton shows, via a series of Monte Carlo experiments, that his proposed estimator performs very well even with very few observations per cluster. In his own estimates of price elasticities for various commodities in Côte d'Ivoire, his average cluster size is 1.9, implying that there are some clusters with only one household. He does not flag this as an issue given the results of the Monte Carlo exercises discussed above.
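For reference, the cluster counts quoted in this response translate into the following shares of clusters retained. This is simple arithmetic on the four figures given above; the variable names are illustrative.

```python
# (total clusters, effective clusters) by survey year, as quoted in the response.
clusters = {2005: (322, 178), 2009: (318, 121)}

# Share of all survey clusters that enter the estimation.
shares = {year: used / total for year, (total, used) in clusters.items()}

for year, share in shares.items():
    print(year, f"{share:.1%}")
# 2005 55.3%
# 2009 38.1%
```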
This is a well-written paper and makes an important contribution to the economics of cigarette smoking in the context of Sub-Saharan Africa. It uses a household panel survey to estimate price elasticity of conditional cigarette demand in Uganda.
While the study makes the best use of available data in applying Deaton's model, authors need to make further clarifications to the methods of data analysis and recognize the limitations of the study more clearly.
1. Table 1 shows average household expenditure decreases in real terms. Is it consistent with overall economic growth in the country?
Response: The Ugandan economy did grow between 2005 and 2009. For example, per capita GDP growth averaged 5% between these two periods. On the other hand, inflation was relatively high. Between 2005 and 2009, the average inflation rate was 9%. This means the average rate of inflation was greater than the average rate of growth in incomes. This explains the decline in real household incomes/expenditure between the two periods.
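The arithmetic behind this comparison can be sketched as follows, under one explicit assumption that the response does not state: that the 5% growth figure is a nominal rate, so that it can be deflated by the 9% inflation rate. If the 5% figure is already a real growth rate, this calculation does not apply.

```python
# Assumption (not stated in the response): the 5% growth rate is nominal.
nominal_growth = 0.05
inflation = 0.09

# Real growth implied by deflating nominal growth by inflation.
real_growth = (1 + nominal_growth) / (1 + inflation) - 1

print(f"{real_growth:.1%}")  # -3.7%
```

Under that assumption, average real incomes would fall by roughly 3.7% per year, which is the direction of change reported in Table 1.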
2. There are less than two households per cluster on average. What is the range of the number of households by cluster? I am guessing that some clusters may have a single household. It can cause a limitation in using spatial variation in price for the identification of the effect of price changes. Because in these cases the cluster unit value is essentially the unit value of a household, which can hardly represent a full cluster. In fact, the finding in ANOVA that about 30% residual variation in unit value is due to within cluster variation might be an underestimate because of the representation of some clusters with single households. For these clusters, 100% variation comes from between cluster variation and cluster fixed effects cannot be estimated in equation 3.
Response: This is a very well-noted issue from Dr. Nargis. The number of households per cluster ranges from 1 to a maximum of 6. This is not surprising, as households that consume cigarettes are in any case going to be very few in low- and middle-income countries, and Uganda is no exception. However, the issue of small cluster sizes does not appear to be fatal when using Deaton's method. Deaton argues that the consistency properties of the price elasticity estimator depend on the number of clusters and not on the number of households per cluster. In other words, his estimates converge on the true population parameter when the number of clusters increases and not when the number of households per cluster increases. This is a consequence of the fact that his method relies on between-cluster variation and not on within-cluster variation. So the number of clusters is much more important than the number of households per cluster (cluster size). Deaton shows, via a series of Monte Carlo experiments, that his proposed estimator performs very well even with very few observations per cluster. In his own estimates of price elasticities for various commodities in Côte d'Ivoire, his average cluster size is 1.9, implying that there are some clusters with only one household. He does not flag this as an issue given the results of the Monte Carlo exercises discussed above.
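The reviewer's concern about single-household clusters can be illustrated with a small variance decomposition. This is a sketch on made-up numbers, not the ANOVA on residual unit values reported in the manuscript; the function name and the data are hypothetical. A cluster with one household has no within-cluster variation by construction, so adding such clusters can only add to the between-cluster part.

```python
def within_share(clusters):
    """Share of the total sum of squares that lies within clusters."""
    values = [v for members in clusters.values() for v in members]
    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    within_ss = 0.0
    for members in clusters.values():
        mean = sum(members) / len(members)
        # A single-household cluster contributes exactly zero here.
        within_ss += sum((v - mean) ** 2 for v in members)
    return within_ss / total_ss

# Hypothetical log unit values: two clusters with two households each...
multi = {"A": [1.0, 3.0], "B": [5.0, 7.0]}
# ...and the same data with two single-household clusters added.
with_singletons = {**multi, "C": [0.0], "D": [8.0]}

share_multi = within_share(multi)
share_all = within_share(with_singletons)
print(round(share_multi, 3), round(share_all, 3))
```

In this example the within-cluster share falls from 0.20 to about 0.08 once the two single-household clusters are added, which is the sense in which a measured within-cluster share may understate the variation that would be seen if more households per cluster were observed.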