Asia-Pacific Economic Statistics Week
Seminar Component
Bangkok, 2 – 4 May 2016

Name of author: Abhiman Das (Professor, IIM Ahmedabad); Pulak Ghosh (Professor, IIM Bangalore); Anirban Sanyal (Research Officer, Reserve Bank of India)
Organization: Indian Institute of Management (IIM) – Ahmedabad & Bangalore; Reserve Bank of India
Contact address: Department of Statistics and Information Management, Modelling and Forecasting Division, C-8, 6th Floor, Bandra Kurla Complex, Bandra (E), Mumbai - 400051. Phone: 022-26578700 Ext-7503. Mobile: 08879239691
Contact phone: (+91) 9820414476, (+91) 8879239691, (+91) 9742065806
Email: [email protected], [email protected], [email protected]
Title of Paper: Monitoring consumer price trend using daily price data of online grocery stores in India

Abstract
Central banks and other policymakers monitor trends in consumer prices using different price indices. These indices are compiled by official agencies and are usually published at monthly frequency. However, the time lag in publishing the official data often creates a bottleneck for policymaking, as information on the price situation remains unavailable at the time of framing policy decisions. In order to bridge this information gap, this paper proposes a methodology for tracking price momentum on an almost real time basis using price data from online grocery stores. Online grocery stores have been operationally successful in different advanced and emerging economies, including India, for a considerable period of time and have a large consumer base. Given the wide reach of such online groceries and their product coverage, we explore the information content of their daily pricing data in the Indian context. We find concordance in price momentum between the price quotations of e-retailers and the officially published data. We argue that such high frequency data can be effectively used to develop composite price indices which track price momentum in real time and provide valuable inputs to policymakers.
Contents
I. Introduction
II. Literature Review
III. Approach and data used
1. Step 1: Data cleansing and transformation
2. Step 2: Mapping online grocery commodities to CPI item basket
3. Step 3: Construction of Index
4. Step 4: Comparative analysis and deep-dive
IV. Empirical Findings
1. Comparative Analysis
2. Deep-dive analysis
V. Concluding remarks
VI. References
I. Introduction
Price stability is one of the major mandates of central banks across countries. As price pressure builds up, inflation picks up, causing discomfort for policymakers. Monitoring price conditions therefore remains one of the major concerns of any central bank. Official agencies generally release different price indices (retail and wholesale), which are helpful for monitoring prices across the economy. However, such official data often come with a lag, which creates a hurdle for effective policymaking. Central banks therefore often track different high frequency indicators to monitor price momentum and to detect any build-up of inflationary pressure at the earliest. In this context, the usefulness of online grocery data has been analyzed at length in the literature, where such real time price movements have been acknowledged as an important source of information on price conditions in terms of timeliness and accuracy (Cavallo and Rigobon (2011), Cavallo (2012 & 2013) etc.). Cavallo and Rigobon initiated the project titled ‘MIT Billion Prices Project’ (BPP) with the objective of using real time price movements across a large set of commodities to develop comparable price indices. In this process, BPP covered over five million commodities sold across 300 online retail chains in more than 70 different countries¹. The price indices estimated by BPP across different countries differed from official estimates in coverage, as online marketed commodities were found to be concentrated in some specific categories, among other limitations. In spite of these limitations, the estimated price indices were found to track official price momentum to a significant extent, which opened up new opportunities for real time price monitoring, particularly in cases where high frequency price data remain unavailable.
The phenomenal growth of online grocery stores across countries points towards a shift in consumer preference towards online ordering. As per a recent survey report by Nielsen (April 2015), more than 80% of 30000 respondents across countries find it convenient to order from online grocery stores. Technological advancement, together with the ease with which customers can access a wide variety of products, has been one of the major drivers of this behavioral change among consumers. The Asia-Pacific region has been a leading region in adopting such online initiatives in the grocery market. As per the report, the expansion of mobile data usage and broadband connectivity drives consumer preference in this region. The growth of online groceries has been phenomenal in the food and beverage segments, as customers across Asia-Pacific countries prefer ordering groceries online instead of visiting retail stores. The online grocery market in India started picking up in 2011, but the market size remains small compared to others².

In India, the retail sector has expanded significantly in recent times. India’s retail sector was ranked fifth in 2012 as per the Global Retail Development Index (GRDI) of A.T. Kearney. Online purchases have also become part of this success story, with online retail sales of food items increasing by 80% during January to March of 2013³. The popularity of online grocery stores in India started picking up in 2011, when two of the major online stores, namely Bigbasket and Zopnow, started their operations. At present, 6 major online players dominate the online grocery market in India. According to Mr. Hari Menon, Co-founder of Bigbasket, changes in consumer behaviour are helping these online retailers to expand their business operations. Although these online grocery stores have expanded their operations significantly, their spatial coverage is largely confined to major metro cities. Also, the commodities offered through these online grocery portals are limited to food and beverage items, compared to the usual CPI item basket.

India adopted an inflation targeting monetary policy framework in 2015, and consumer price inflation was selected as the official inflation target. Consumer price inflation is derived from the CPI Combined index, released at monthly frequency by the Central Statistical Office with a lag of 11-12 days. Food and fuel items, with a weight of around 58%, remain among the major drivers of consumer inflation. The easing of fuel inflation has been offset by food inflation in recent times. While fuel inflation moves in tandem with global crude prices, food inflation is driven primarily by supply shocks. In such a situation, real time tracking of food prices can act as an early warning system for detecting any unfavorable movement in one of the major drivers of consumer inflation.

With this background, this paper extends the real time price monitoring approach, primarily for food items, to the Indian context using online grocery data obtained from one of the largest online grocery stores in India. The paper contributes to the growing literature on Big Data analysis by incorporating real time tracking of food prices using online grocery data in the Indian context. Such exercises, though carried out across countries, have not been undertaken to a great extent in emerging market economies, and this is the first of its kind in the Indian context. The paper also addresses one of the major policy imperatives, namely real time tracking of food inflation, which may be used to detect any inflationary pressure on consumer prices arising from food items. The rest of the paper is structured as follows: Section 2 covers the literature review, followed by the approach and data used in Section 3. The empirical findings are presented in Section 4 and the conclusion follows in Section 5.

¹ Source: Wikipedia (https://en.wikipedia.org/wiki/MIT_Billion_Prices_project)
² “The retailer” – EY (http://www.ey.com/Publication/vwLUAssets/EY-the-retailer-july-september2015/$FILE/EY-the-retailer-july-september-2015.pdf)
³ Source: https://www.kpmg.com/IN/en/IssuesAndInsights/ArticlesPublications/Documents/BBG-Retail.pdf
II. Literature Review
The use of internet search data for tracking economic activity has been taken up across countries in recent times. Data obtained by monitoring internet activity have been used in several countries, such as the US, the UK and Germany, to produce high frequency series that capture macroeconomic activity and give accurate estimates of economic parameters in real time. Google being the largest search engine, the majority of such studies have used Google search data, which are officially released on the Google Trends platform at weekly frequency. Varian and Choi (2009) were the first to use Google Trends to estimate several macroeconomic indicators for the US economy. Google Trends was used to estimate automobile sales and initial unemployment claims in the US, along with tourism in Hong Kong and consumer confidence in Australia. Theoretically, this work is similar to the doctoral thesis of Brian D. Humphrey (2010), written under faculty advisor James W. Roberts, which uses simple OLS techniques to show that indicators developed from Google search engine data can successfully predict local and national home sales in the United States. McLaren and Shanbhogue (2011) surveyed indicators for the UK housing and labour markets using data from Google Insights for selected keywords. They used simple autoregressive models with lags of one and two months to relate internet search data to published data. The models exhibited a good fit, and data on keywords such as “estate agents”, “RICS” and “HBF” were able to predict movements in housing prices. Applications of Google Trends are not restricted to advanced economies: Yan Carrière-Swallow and Felipe Labbé (2010) of the Central Bank of Chile published research on the relevance of internet data in emerging countries. Their paper nowcasts Chilean automotive sales using keyword-based search data from Google Trends.
Using internet data to estimate economic activity rests on a strong underlying assumption of high internet penetration. In recent times, the World Bank has identified Big Data as a potential source of information for areas such as early warning systems, nowcasting of macroeconomic aspects and forecasting of weather patterns. The usefulness of daily price data has been widely acknowledged for timely information on price conditions at the micro level. In his article titled “A Way, Day by Day, of Gauging Prices”, Justin Lahart (2010) observed that daily price data contain information on the development of price conditions in the retail market, and applauded the initiative of collecting daily price data from online grocery markets. Cavallo and Rigobon (2011) initiated the Billion Prices Project (BPP) at MIT, under which scraped daily price information was collected for around 5 million commodities from 300 major online grocery markets across 70 countries. Though BPP started as an academic exercise, its objective was to use high frequency price information across major commodities to develop a supplementary price monitoring mechanism for policymakers. Michael Bordo, an economist at Rutgers University who reviewed the methodology of BPP, described the initiative as a “brilliant way of measuring the deep fundamentals of inflation”. Later, Cavallo (2012) argued that a price index constructed from daily data is likely to provide an alternative measure of inflation to official data, particularly for countries like Argentina, where official estimates have been subject to criticism and hyper-inflationary episodes have been observed frequently in the recent past. On a similar footing, Varian (2010) constructed a price index, called the Google Price Index (GPI), using Google price data, and observed that the GPI tracks the development of the pricing situation even better than the official consumer price index.

Despite the acknowledged benefits of daily price quotes, one of the major criticisms of this approach has been the lack of coverage in terms of spatial distribution and products covered. This point holds particularly for emerging market economies (EME), where the coverage of online groceries has been found to be limited. However, the recent growth in sales of these online grocery companies points towards an increasing shift in preference towards online ordering of food items, and this evolution in consumer preferences calls for a relook into the feasibility of using daily price data in EME.
III. Approach and data used
The paper uses daily price data from one of the major online grocery stores in India, Bigbasket.com, where daily prices of around 2200 commodities were monitored over a 5 month horizon. The item level daily price quotations include the following details (Table 1).

Table 1: Data structure – daily price data
Item Code          Unique identifier of each commodity, maintained by the online grocery
Item Description   Description of the item
City               City name
Sale Price         Sale price quotation of the commodity
MRP                Maximum retail price
The business operations of the aforesaid online grocery store were found to be restricted to 5 major cities, namely Hyderabad, Bengaluru, Chennai, Mumbai and Pune. The products offered through this portal fall mostly under the food and beverage categories of the existing CPI item basket (base 2012=100). The approach adopted in this paper broadly follows the steps used for constructing the official consumer price index by India’s official agency, the Central Statistical Office (CSO). The CSO recently started publishing the new CPI with base 2012=100. Apart from revamping the consumption basket and the corresponding weights based on the 68th round of the household consumer expenditure survey of the NSSO, the recent base revision also adopted a new methodology under which the index value is computed as the geometric average of commodity prices, following Jevons’s approach. The empirical analysis carried out in this paper adopts Jevons’s approach to estimate the representative price across commodities and subsequently rolls up the price across cities and commodity groups using a weighted average. A step-wise illustration of the approach is given below.
1. Step 1: Data cleansing and transformation
The daily price data were cleansed prior to processing. Items which do not have any reported price were excluded from the analysis. The sale price generally includes discounts offered by the company for marketing purposes, and hence MRP has been used as the representative price quotation across commodities. However, as information on MRP is not available for Bengaluru, the sale price has been used as a proxy for MRP for Bengaluru only. The entire exercise was also carried out excluding Bengaluru to check robustness.
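The cleansing rules of Step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`item_code`, `city`, `sale_price`, `mrp`) loosely follow Table 1, the sample records are invented, and the treatment of a missing MRP outside Bengaluru (the record is dropped) is an assumption, since the paper does not spell out that case.

```python
# Sketch of the cleansing rules in Step 1. Field names and sample values are
# assumptions for illustration; they are not taken from the paper's dataset.

def representative_price(record):
    """Return the price quotation to use, or None if the record must be dropped."""
    mrp = record.get("mrp")
    sale_price = record.get("sale_price")
    if mrp is not None:
        return mrp  # MRP is preferred because it excludes marketing discounts
    if record.get("city") == "Bengaluru" and sale_price is not None:
        return sale_price  # MRP is not reported for Bengaluru: sale price as proxy
    return None  # no usable price reported (assumed rule for other cities)


def cleanse(records, exclude_cities=()):
    """Drop records without a usable price; optionally exclude whole cities."""
    cleaned = []
    for record in records:
        if record["city"] in exclude_cities:
            continue
        price = representative_price(record)
        if price is not None:
            cleaned.append({**record, "price": price})
    return cleaned


sample = [
    {"item_code": "A1", "city": "Mumbai", "sale_price": 90.0, "mrp": 100.0},
    {"item_code": "A1", "city": "Bengaluru", "sale_price": 95.0, "mrp": None},
    {"item_code": "B2", "city": "Pune", "sale_price": None, "mrp": None},
]

all_cities = cleanse(sample)
robustness = cleanse(sample, exclude_cities=("Bengaluru",))  # robustness run
print(len(all_cities), len(robustness))
```

Running the same function with and without `exclude_cities` mirrors the robustness check described above, in which the exercise is repeated without Bengaluru.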
2. Step 2: Mapping online grocery commodities to CPI item basket
A comparison of the unique commodity lists across cities indicates that the maximum number of commodities is offered in Bengaluru, whereas the minimum number is offered in Pune (Chart 1).
Chart 1: Unique commodities offered in different cities
[Bar chart of the number of unique commodities offered in Bangalore, Hyderabad, Mumbai, Pune and Chennai; bar labels: 876, 765, 702, 614, 525]
Next, the unique list of commodities available from the daily price data was mapped against the CPI consumption basket in order to perform the aggregation according to the CPI item basket. The mapping also helps to assess the extent to which the online item basket covers the CPI basket, and thereby helps to explain any diverging momentum observed in the estimated price index. Here, the items which constitute the broad item groups of CPI, and for which index data are available, are mapped against the Bigbasket items for comparative analysis. The mapping revealed that the majority of items covered by BigBasket fall under the Food and Beverage item group of the CPI consumption basket. Also, the large proportion of items mapped against broad categories like ‘Other fresh Fruit’, ‘Other vegetables’, ‘Palak and other leafy vegetables’ etc. points towards the lack of a precise mapping between the two item baskets (Table 2).
Table 2: Summary of Mapping with CPI item basket
CPI Items                                      No. of items mapped
Palak and other leafy vegetables               260
other fresh fruits                             257
other vegetables                               254
gourd, pumpkin                                 161
mango                                          138
beans, barbati                                 94
apple                                          92
banana (no.)                                   72
onion                                          72
processed food                                 70
brinjal                                        64
tomato                                         55
potato                                         51
flower (fresh): all purposes                   46
grapes                                         41
green chillies                                 40
carrot                                         37
lemon (no.)                                    32
papaya                                         32
cabbage                                        29
orange, mausami (no.)                          27
ginger (gm)                                    26
fruit juice and shake (litre)                  24
garlic (gm)                                    24
lady's finger                                  24
pears/nashpati                                 21
watermelon                                     21
coconut (no.)                                  20
dhania (gm)                                    17
peas                                           15
guava                                          14
others: birds, crab, oyster, tortoise, etc.    14
groundnut                                      11
cauliflower                                    9
leechi                                         9
berries                                        8
pan: leaf (no.)                                8
jackfruit                                      7
dates                                          6
parwal/patal, kundru                           6
raisin, kishmish, monacca, etc.                5
tamarind (gm)                                  5
turmeric (gm)                                  5
other nuts                                     4
black pepper (gm)                              2
sugar - other sources                          2
Total                                          2231
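A mapping of this kind, and a coverage summary in the spirit of Table 2, could be sketched as follows. The keyword rules below are hypothetical, since the paper does not publish its mapping table, and the item descriptions are illustrative strings. Simple substring matching is used only to keep the sketch short; it would mis-assign descriptions such as "pineapple" to "apple", so a real mapping would need manual review.

```python
from collections import Counter

# Hypothetical keyword rules (assumption): each keyword found in an item
# description assigns the item to one CPI item. Order matters: first match wins.
KEYWORD_TO_CPI_ITEM = [
    ("apple", "apple"),
    ("onion", "onion"),
    ("potato", "potato"),
]
UNMAPPED = "unmapped"


def map_to_cpi_item(description):
    """Return the CPI item for an online item description, or UNMAPPED."""
    text = description.lower()
    for keyword, cpi_item in KEYWORD_TO_CPI_ITEM:
        if keyword in text:
            return cpi_item
    return UNMAPPED


def coverage_summary(descriptions):
    """Count how many online items fall under each CPI item."""
    return Counter(map_to_cpi_item(d) for d in descriptions)


# Illustrative descriptions; only the apple varieties resemble names in the paper.
descriptions = [
    "Fresho Apple - Fuji",
    "Fresho Apple - Royal Gala",
    "Fresho Onion",
    "Fresho Tomato - Local",
]

summary = coverage_summary(descriptions)
for cpi_item, count in summary.most_common():
    print(cpi_item, count)
```

Keeping an explicit "unmapped" bucket makes the coverage gap visible, which is the information the mapping step is meant to provide.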
3. Step 3: Construction of Index
Since the benchmark indicator for the analysis is CPI item level data, which are available at monthly frequency, the daily prices were converted to monthly frequency using a simple average over days. The next step is to derive the retail price index at commodity level. For that, the retail prices (MRP) of granular commodities are aggregated using the geometric average to derive the representative price relative, in line with the practice followed by official agencies in India. The derived price relative represents the average change in price of the item (belonging to the CPI item basket), and the use of the geometric mean smoothens the price fluctuations across commodities within a CPI item to a certain extent.

$$CPI_{it}^{city} = \left( \prod_{j=1}^{n_i} \frac{P_{ijt}^{city}}{P_{ij0}^{city}} \right)^{1/n_i} \times 100$$

where $CPI_{it}^{city}$ is the price relative of the i-th CPI commodity at time t for a particular city, $P_{ijt}^{city}$ and $P_{ij0}^{city}$ are the prices of the j-th commodity falling under the i-th CPI item at time t and in the base period respectively, and $n_i$ is the number of commodities mapped to the i-th CPI item. Here the price relative is calculated with respect to April 2015, and hence the price relative for Apr-15 is 100. The price relative in subsequent months thus represents the relative change in prices of the i-th commodity of the CPI basket compared to its Apr-15 price. The representative price of each CPI item is then aggregated across cities using the relative weight of the urban consumption share of the respective states to which the cities belong. The consumption share of each state was estimated by CSO at the time of preparing the new CPI index. Using the consumption shares, the aggregate price relative of each CPI commodity is derived as follows

$$CPI_{it}^{aggr} = \frac{1}{\sum_{c=1}^{C} w_{ic}} \times \sum_{c=1}^{C} \left( w_{ic} \times CPI_{it}^{c} \right)$$

where $CPI_{it}^{c}$ is the price relative $CPI_{it}^{city}$ for city c and $w_{ic}$ is the consumption share of the i-th commodity in city c, c = 1, 2, ..., C.
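The three operations in Step 3 (monthly averaging of daily quotes, the geometric-mean price relative, and the weighted aggregation across cities) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names, the date format, the commodity names, the prices and the city weights are all assumptions, and the actual CSO consumption shares are not reproduced here.

```python
import math
from collections import defaultdict


def monthly_average(daily_quotes):
    """Simple average of daily prices per month; dates are 'YYYY-MM-DD' strings."""
    buckets = defaultdict(list)
    for date, price in daily_quotes:
        buckets[date[:7]].append(price)
    return {month: sum(prices) / len(prices) for month, prices in buckets.items()}


def jevons_relative(prices_t, prices_0):
    """Geometric mean of price ratios (period t over base period), times 100.

    Only commodities quoted in both periods are used (an assumption).
    """
    common = sorted(set(prices_t) & set(prices_0))
    if not common:
        raise ValueError("no commodity is quoted in both periods")
    log_sum = sum(math.log(prices_t[j] / prices_0[j]) for j in common)
    return math.exp(log_sum / len(common)) * 100.0


def aggregate_across_cities(relatives, weights):
    """Weighted average of city-level price relatives, normalised by total weight."""
    total_weight = sum(weights[city] for city in relatives)
    return sum(weights[city] * relatives[city] for city in relatives) / total_weight


# Hypothetical monthly prices of two apple varieties in one city.
base_prices = {"fuji": 100.0, "gala": 200.0}      # base period (Apr-15)
current_prices = {"fuji": 121.0, "gala": 200.0}   # a later month
city_relative = jevons_relative(current_prices, base_prices)

# Hypothetical city-level relatives and consumption-share weights.
relatives = {"Mumbai": 110.0, "Chennai": 100.0}
weights = {"Mumbai": 3.0, "Chennai": 1.0}
aggregate = aggregate_across_cities(relatives, weights)
print(round(city_relative, 2), aggregate)
```

Working in logarithms avoids overflow when many price ratios are multiplied, and it gives the same result as taking the n-th root of the product of ratios.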
4. Step 4: Comparative analysis and deep-dive
The index thus obtained provides information on the month on month variation in commodity prices. Part of this variation may be attributable to seasonal factors. With this background, the CPI item level indices are rebased to Apr-15=100 for comparative purposes. Assuming that the seasonal pattern of price fluctuation is the same across the two sets of indices, the comparative analysis has been carried out using visual inspection. If the derived series is found to track the official price data, the daily price data are likely to provide insight into the official data at higher frequency. On the other hand, any divergent momentum between the two sets of indices is analyzed further to identify the reason for the divergence, which may be attributed to lack of mapping, coverage, and price inelasticity in the retail and online markets.
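The rebasing to Apr-15=100 can be sketched as below. The paper compares the two series by visual inspection; the direction-agreement share computed here is a simple stand-in added only for illustration and is not the authors' method. All index values are hypothetical.

```python
def rebase(series, base_period):
    """Rescale an index so that its value in base_period equals 100."""
    base_value = series[base_period]
    return {period: value / base_value * 100.0 for period, value in series.items()}


def month_on_month_change(series, periods):
    """Percentage change between consecutive periods, in the given order."""
    return [
        (series[later] / series[earlier] - 1.0) * 100.0
        for earlier, later in zip(periods, periods[1:])
    ]


def direction_agreement(changes_a, changes_b):
    """Share of periods in which both series rise, or both do not rise."""
    same = sum(1 for a, b in zip(changes_a, changes_b) if (a > 0) == (b > 0))
    return same / len(changes_a)


periods = ["Apr-15", "May-15", "Jun-15"]

# Hypothetical official CPI item index (base 2012=100) and derived online index.
official = {"Apr-15": 125.0, "May-15": 130.0, "Jun-15": 127.5}
derived = {"Apr-15": 100.0, "May-15": 103.0, "Jun-15": 104.0}

official_rebased = rebase(official, "Apr-15")
agreement = direction_agreement(
    month_on_month_change(official_rebased, periods),
    month_on_month_change(derived, periods),
)
print(official_rebased, agreement)
```

In this made-up example the two series move together in May and diverge in June, the kind of item-level divergence that the deep-dive step is meant to examine further.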
IV. Empirical Findings
1. Comparative Analysis
The price indices derived at commodity level present a mixed picture. Within vegetables, the price momentum moves in line with official data for items like ‘Potato’, ‘Onion’, ‘Tomato’ and ‘Palak’ (Chart 2-2(a)), whereas divergent momentum is observed for ‘Cabbage’ and ‘Brinjal’ (Chart 2(b)).
Chart 2: Price momentum at item level (vegetables)
Chart 2(a): Price momentum at item level (vegetables)
Chart 2(b): Price momentum at item level (vegetables)
On the other hand, the price momentum observed from the derived index is found to differ from the official estimates for most fruit items (Chart 3-3a).
Chart 3: Price momentum – Fruits
Chart 3a: Price momentum – Fruits
[Line chart, panel ‘Papaya’; series: CPI (Derived) and Composite; horizontal axis: Apr-15 to Aug-15]
2. Deep-dive analysis
The divergence in price momentum has been analyzed further in terms of spatial disparity in price conditions and mapping issues. For instance, the city-wise price momentum of ‘Apple’ indicates wide divergence among the 4 cities under consideration (Chart 4).
Chart 4: City-wise price divergence
Chart 4: City-wise price divergence (Contd.)
Such wide divergence in the price pattern of items contributes to the divergence of the derived index from official data, as the official agencies use state level price quotations for developing the price index. Apart from spatial coverage, the mapping of items also influences the divergence in momentum. The varieties of apple obtained from the daily data are generally of premium quality (primarily imported), and thereby their price behaviour is likely to diverge from the official data (Table 3).
Table 3: Mapping of commodities from daily price with CPI basket
CPI      BigBasket
Apple    Fresho Apple - Fuji
         Fresho Organically Grown - Apple
         Fresho Apple - Royal Gala
         Fresho Apple - Queen
         Fresho Apple - Washington
         Fresho Apple - Shimla
         Fresho Apple - Green
         Fresho Apple - Golden Delicious
         Fresho Apple - Kinnaur
         Fresho Apple - Granny Smith
         Fresho Apple Fuji Premium
         Fresho Apple - Indian
V. Concluding remarks
The rapid expansion of online grocery stores and the shift in consumer preferences in recent times point towards using high frequency price data to develop price indices that can be compared against official statistics. Cavallo and Rigobon (2011) started the initiative of analysing price momentum using a large set of commodities from major online groceries across 70 countries. However, similar exercises are hardly found for emerging market economies. This paper assesses the information content of daily price data against the official estimates in the Indian context, using online data from one of the largest online grocery stores in India. The paper maps the item level price quotations against the CPI consumption basket and derives retail price indices at item level following the same methodology as the official data. The empirical findings suggest that the daily price data are able to track the price momentum of the official data for a number of commodities. However, divergence in price momentum is also observed for certain commodities. Further detailed analysis reveals that the divergence can be attributed to the regional coverage and the items covered by these online services. Such divergence between online data and official estimates was also observed by Cavallo (2012) when he compared the official price indices of Argentina with daily online prices. As these online groceries operate only in metro cities, the prices they quote are often representative of premium quality items. As the price experience differs among different regions of the country, the aggregate behaviour of macro-level price momentum can often be misleading. The spatial dimension of the daily price data provides a detailed view of the price conditions prevailing in different parts of the country and thereby enables policymakers to take appropriate policy actions on a real time basis.
The paper contributes to the literature on Big Data analytics by suggesting an alternative data source which is available in a timely manner and provides valuable insights about price directions. This analysis, which is an extension of the Billion Prices Project (BPP), is the first of its kind in the context of emerging market economies and is expected to provide important insight into using alternative sources of information for deriving price indices. The paper uses 5 months of daily price data from one of the major online groceries and compares the price momentum with official data. Future scope includes extending the timeline of the analysis and increasing the coverage of the item basket by incorporating a larger information set from other online groceries.
VI. References
[1] Brian D. Humphrey, "Forecasting Existing Home Sales using Google Search Engine Queries", Ph.D Thesis, Duke University, 2010
[2] Cavallo, A. and Roberto Rigobon, ‘The Distribution of the Size of Price Changes’, MIT Press, 2011
[3] Cavallo, A., ‘Scraped Data and Sticky Prices’, MIT and NBER, 2012
[4] Cavallo, A., ‘Online and official price indexes: Measuring Argentina’s inflation’, Journal of Monetary Economics (2012), http://dx.doi.org/10.1016/j.jmoneco.2012.10.002
[5] Cavallo, A., Guillermo Cruces and Ricardo Perez-Truglia, ‘Inflation Expectations, Learning and Supermarket Prices’, MIT Press, May 2015
[6] Community Whitepaper, "Challenges and Opportunities with Big Data", A community white paper developed by leading researchers across the United States, 2011
[7] Choi, Hyunyoung and Hal Varian, "Predicting the present with Google trends" (December 2011)
[8] Hal R. Varian, "Big data: new tricks for econometrics", 2011
[9] Y. Fondeur and F. Karame, "Can Google data help predict French youth unemployment?" (2013)
[10] Nyman et al., "Big data and economic forecasting: a top-down approach using directed algorithmic text analysis", Centre For The Study Of Decision-Making Uncertainty, Faculty Of Brain Sciences, University College, London
[11] Nick McLaren (of the Bank’s Conjunctural Assessment and Projections Division) and Rachana Shanbhogue (of the Bank’s Structural Economic Analysis Division), "Using internet data as economic indicators"
[12] Nikolaos Askitas and Klaus F. Zimmermann, "Google econometrics and unemployment forecasting", 2013
[13] Yan Carriere-Swallow and Felipe Labbe, "Nowcasting With Google Trends In An Emerging Market", 2012