
The workshop was hosted at Bharat Mandapam in New Delhi, India / Photo: MoSPI
Official statistics have traditionally relied on standardized methods like censuses, surveys, and administrative records. However, the rapid digital transformation of economies, businesses, and societies has led to an explosion of new data sources. By integrating these with traditional data, and enabling innovative uses, we can begin to close persistent data gaps and generate real-time, granular insights. When combined with frontier technologies such as data science and AI, these innovations present a powerful opportunity to modernize the production of official statistics. Achieving this, however, requires balancing tradition with innovation and addressing challenges around methodological rigor, data privacy, analytical capabilities, and data governance.
These opportunities and challenges formed the core of the discussions during the National Workshop on Using Alternate Data Sources and Frontier Technologies for Policy Making, held on June 5-6, in Delhi, India. The workshop, co-hosted by the Ministry of Statistics and Programme Implementation (MoSPI) and NITI Aayog, with the World Bank as a knowledge partner, brought together over 400 experts, including statisticians, data scientists, regulators, researchers, and policy professionals, from central and state governments, academia, think tanks, the private sector, and international organizations.
"The workshop is timely and relevant in today’s data-driven governance landscape, where the future lies in the intelligent integration of traditional and alternate data to generate holistic insights. As we embrace technologies like AI, we must do so with clarity of purpose and thoughtful balance," said Dr. V. Anantha Nageswaran, Chief Economic Advisor to the Government of India.
The central question posed to participants was: How can traditional official statistics benefit from alternative data sources and modern data science techniques?
The workshop featured four parallel technical tracks, discussed in the sections below. Each focused on distinct data sources and technologies with the potential to improve the productivity, quality, and analytical value of official statistics in support of evidence-based policymaking.
Mobile positioning data (MPD): a new frontier for smarter policy
One of the most groundbreaking topics discussed was mobile positioning data, which is location data from mobile phone networks that shows how people move in real time. First used widely during the COVID-19 pandemic to track mobility and support public health responses, MPD now has much broader applications. It offers valuable insights for disaster response, urban planning, transport infrastructure, migration, and tourism. For example, it can help monitor foot traffic at tourist destinations, identify travel routes, and assess how long people stay in different places, all in near real time. When combined with immigration, spending, and survey data, it creates a comprehensive tourism analytics toolkit to support policy and boost the hospitality and retail sectors.
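As a rough illustration of the foot-traffic and length-of-stay indicators described above, the following Python sketch counts unique visitors and average stay per zone. It is a minimal sketch under loud assumptions: the record layout, the zone names, and the `zone_stats` function are hypothetical, the data is invented, and real MPD pipelines would first need anonymization, noise filtering, and calibration against population benchmarks.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical, already-anonymized records: (device_id, zone, ISO timestamp).
PINGS = [
    ("a", "fort", "2025-06-05T09:00"),
    ("a", "fort", "2025-06-05T11:30"),
    ("a", "market", "2025-06-05T12:00"),
    ("a", "market", "2025-06-05T13:00"),
    ("b", "fort", "2025-06-05T10:00"),
    ("b", "fort", "2025-06-05T10:45"),
]


def zone_stats(pings):
    """Return {zone: (unique_visitors, mean_stay_hours)}.

    Assumption: a device's stay in a zone is approximated by the time
    between its first and last ping observed in that zone.
    """
    first_last = {}
    for device, zone, stamp in pings:
        t = datetime.fromisoformat(stamp)
        key = (device, zone)
        lo, hi = first_last.get(key, (t, t))
        first_last[key] = (min(lo, t), max(hi, t))

    stays = defaultdict(list)
    for (_device, zone), (lo, hi) in first_last.items():
        stays[zone].append((hi - lo).total_seconds() / 3600)

    return {zone: (len(hours), sum(hours) / len(hours)) for zone, hours in stays.items()}
```

Calling `zone_stats(PINGS)` on the invented sample reports two visitors to the "fort" zone and one to the "market" zone, each with an average stay in hours.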
The global use of MPD in official statistics is growing. Following Eurostat’s feasibility study in 2014, countries like Estonia, Finland, Spain, and Indonesia have adopted MPD to enhance tourism data. In 2024, Estonia’s central bank released the first national methodology for producing tourism statistics using MPD.[i]
Figure 1: Turning “Messy” Mobile Positioning Data into Tourism Statistics: an example from Dubai
In India, with over 1.2 billion mobile subscribers, there is vast potential to leverage this data. The workshop brought together statisticians, telecom regulators, policymakers, and major network operators such as Airtel and Jio to discuss collaboration, data privacy, analytical capacity, and how to ensure reliable estimates, with support from the World Bank’s Global Data Facility initiative to harness mobile data for policy.[ii]
Scanner data: revolutionizing retail and price analysis
Another important workshop topic was retail scanner data, collected from barcode scanners at retail points of sale. The discussions explored how this data can modernize the way we understand consumer behavior and compile price indices. This data reveals what people are buying, where, in what quantities, and at what prices. It enables more frequent and detailed updates to Consumer Price Indices (CPI), which are vital for economic planning, wage setting, and monetary policy. As e-commerce grows, new techniques like web scraping and APIs offer even more digital price data, especially for goods like airfare, streaming services, or e-books that are now almost exclusively purchased online.
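To make the link between scanner records and a price index concrete, the sketch below derives unit values (revenue divided by units sold) per barcode and combines them into a Jevons index, the unweighted geometric mean of price relatives for items sold in both periods. This is an illustration only, not the method MoSPI or any retailer uses: the record layout, barcodes, and figures are invented, and statistical offices working with scanner data typically also apply expenditure weights and more elaborate formulas.

```python
import math
from collections import defaultdict

# Hypothetical scanner records: (period, barcode, units_sold, revenue).
SALES = [
    ("2025-01", "rice-1kg", 10, 100.0),
    ("2025-01", "tea-250g", 10, 50.0),
    ("2025-02", "rice-1kg", 6, 72.6),
    ("2025-02", "rice-1kg", 4, 48.4),
    ("2025-02", "tea-250g", 10, 50.0),
    ("2025-02", "oil-1l", 5, 75.0),  # new item, no base-period price
]


def unit_values(sales, period):
    """Average price actually paid per barcode: total revenue / total units."""
    units, revenue = defaultdict(float), defaultdict(float)
    for p, barcode, qty, rev in sales:
        if p == period:
            units[barcode] += qty
            revenue[barcode] += rev
    return {b: revenue[b] / units[b] for b in units if units[b] > 0}


def jevons_index(sales, base, current):
    """Geometric mean of price relatives over matched items (base = 100)."""
    p0, p1 = unit_values(sales, base), unit_values(sales, current)
    matched = sorted(set(p0) & set(p1))  # items sold in both periods
    if not matched:
        raise ValueError("no items sold in both periods")
    mean_log = sum(math.log(p1[b] / p0[b]) for b in matched) / len(matched)
    return 100 * math.exp(mean_log)
```

In the invented sample, rice rises from 10.0 to about 12.1 per unit while tea is unchanged, so `jevons_index(SALES, "2025-01", "2025-02")` comes out at roughly 110. The new item is ignored because it has no base-period price, which is one reason production systems need explicit rules for product entry and exit.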
In India, a large proportion of retail sales still takes place in small shops where transactions are recorded manually (if at all) and the final purchase price is determined through bargaining. However, the country’s organized retail and e-commerce sectors are expanding rapidly. Projections suggest they will account for around 20% of all retail sales in 2025, rising to 34-38% by 2030.
The workshop enabled MoSPI and state statisticians to engage with major retailers such as Reliance and V-Mart, as well as regulators, to plan a pilot on using scanner data and other alternate sources for price statistics.
Geospatial applications: mapping the future
Geospatial applications were another highlight of the workshop. These include satellite imagery, GIS mapping, and drone-based data collection (including underwater drones), which are transforming the way we understand land use, population shifts, and environmental change. The workshop brought together statisticians from MoSPI, satellite imagery providers from the Indian Space Research Organisation (ISRO), geospatial experts from academic and research institutions in India and abroad, state statistical organizations and remote sensing agencies, and GIS technologists from the private sector.
Experts from ESRI, the WorldPop Lab at the University of Southampton, and other organizations demonstrated how continuous capture of high-resolution imagery can help detect changes in land use and urban development, and how it can be used to update sampling frames and estimate population density and distribution more accurately.
Another innovative example came from the World Bank’s Blue Economy Pathways study, which piloted a National Ocean Accounts Framework in Tamil Nadu. Using state-of-the-art geospatial technologies and remote sensing data, the study mapped critical ecosystems like coral reefs, mangroves, and seagrass beds, helping to monitor their health, track changes over time, and detect areas of degradation or recovery. This work is now feeding into MoSPI’s efforts to develop Ocean Ecosystem Accounts nationally. Finally, novel geospatial analytical techniques can integrate various geo-referenced data sources and indicators to identify growth hubs, pockets of poverty, and zones of vulnerability that require special attention from policymakers.
Data science and artificial intelligence: the game changer
No discussion about modern data sources would be complete without data science and artificial intelligence. These technologies are changing how we collect, process, and analyze information. From machine learning models that detect anomalies to AI tools that automate classification, their applications for statistics are vast. Small and large language models (SLMs and LLMs) are also reshaping how users interact with data. Imagine querying multiple trusted data sources in natural language, asking questions and getting charts, summaries, or maps in return. StatGPT is one of several efforts to provide a platform for statistical organizations to query, transform, analyze, visualize, and interpret statistical data. The World Bank’s Data for AI and AI for Data initiative is also providing innovative ways to make data understandable and AI-ready.
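The anomaly detection mentioned above can be illustrated with a very simple stand-in. The sketch below is not a machine learning model: it is a plain statistical rule that flags values whose modified z-score, based on the median and the median absolute deviation (MAD), exceeds a threshold. The function name, the sample price quotes, and the choice of 3.5 as the cut-off are assumptions made for illustration.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold.

    Modified z-score = 0.6745 * |x - median| / MAD. Using the median and
    MAD keeps a few extreme values from distorting the scale.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against; nothing is flagged
    return [i for i, v in enumerate(values) if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical price quotes for one item collected from seven outlets.
quotes = [102, 98, 101, 99, 100, 250, 97]
suspect = flag_outliers(quotes)  # indices of quotes to send for manual review
```

On the invented quotes, only the value 250 is flagged. Real statistical pipelines would typically combine rules of this kind with learned models and human review.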
Workshop panels explored how to harness AI while addressing challenges such as privacy risks, ethical use, institutional readiness, and capacity gaps. The consensus: AI must be seen as a force multiplier for statisticians. But leveraging its full potential will require targeted investment in technology, upskilling, and stronger partnerships with centers of excellence.
What’s next: future-proofing official statistics through collective action
The workshop concluded with draft action plans for the four priority areas: mobile positioning data, scanner data, geospatial applications, and data science and AI. What distinguishes these plans from previous efforts is the central role of collaboration, not only among government agencies but also with academia, international organizations, think tanks, and, critically, the private sector. Private industry involvement brings cutting-edge expertise and access to innovative data sources, marking a shift from siloed initiatives to a more integrated, cooperative approach, in line with recommendations from the World Bank’s World Development Report 2021: Data for Better Lives.
Next steps include implementing proof-of-concept pilots with the aim of scaling to production. Embracing these innovations can transform official statistics into a dynamic and forward-looking tool that empowers policymakers with the insights needed to navigate complexity and drive smarter, faster decisions.
Comments provided by Federico Polidoro, Siim Esko (Positium), Anuja Shukla, Ayago Esmubancha Wambile, Simonti Chakraborty