Image © Andrea Leopardi via Unsplash

The Data Dump Dilemma: Why Public Digital Platforms Risk Failing the People They Serve

By Saadia Azim, Guest Author, Chief Operating Officer, Bangla Sahayata Kendra (BSK)

 

When Typos Erase Your Existence 

In Assam, India, a man named Ibrahim Ali was denied Indian citizenship because his father Nurul Islam’s name was recorded as “Late Nurul” in a digitized document based on a 1965 voter roll. A woman from Baksa district was declared a foreigner because she couldn’t recall her father’s electoral details from the 1960s. 

These aren’t bureaucratic quirks. They’re life-altering judgments. Over 85,000 such judicial cases are pending, many hinging on minute spelling errors, unclear legacy documents, and procedural mismatches. 

This represents digital exclusion at its most dangerous, where identity is reduced to data points and any inconsistency can transform citizens into stateless people. In these cases, data is no longer a bridge. It becomes a barrier.  

Governments across the Global South are racing to digitize services. The promise is enticing: efficiency, transparency, inclusiveness. Yet the underlying verification infrastructure remains broken, unclear, and disconnected from lived realities. Building more sophisticated digital infrastructure on top of fundamentally flawed data systems risks scaling exclusion rather than solving it. Data-driven governance, without adequate safeguards, can strip people of their identity and rights. 

From my experience operating one of India’s largest public service delivery platforms, which coordinates over two million data points daily across 3,561 digital kiosks, I have witnessed first-hand that more data does not mean better governance. It often means more confusion and more friction.  

Digital Exclusion in Public Services 

Digital exclusion extends beyond citizenship verification into everyday digital services. Consider the seemingly simple task of paying an electricity bill online. People must provide not just their consumer number, but also their full name, complete address, mobile number, and sometimes even parental details, all to pay a utility bill. The process typically requires One Time Password (OTP)-based validation, exposing mobile numbers, consumer details, and other personal information during each transaction. For other services, like banking and health, the data demands are even more exhaustive.  

For those with limited digital literacy, particularly in rural areas where roughly 60% of people do not actively use the Internet despite having access, this creates a cycle of dependency. They must rely on others to navigate online services, sharing sensitive details that must be entered with perfect accuracy to match across multiple verification points. A single typo means the transaction fails, bills go unpaid, and essential services may be disconnected. 
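To illustrate how a verification step could tolerate small typing errors instead of failing outright, here is a minimal sketch. It is not how any existing portal works; the function names, the three outcomes, and the threshold of two edits are all assumptions made for illustration. The idea is that a near miss goes to a human for review rather than producing a silent rejection.

```python
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accent marks and collapse whitespace before comparing."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_marks.lower().split())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the fewest single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            substitution = previous[j - 1] + cost
            current.append(min(deletion, insertion, substitution))
        previous = current
    return previous[-1]


def compare_names(entered: str, on_record: str, review_threshold: int = 2) -> str:
    """Return 'match', 'review' (near miss, route to a person) or 'mismatch'.

    review_threshold is a hypothetical policy setting, not a recommended value.
    """
    distance = edit_distance(normalize(entered), normalize(on_record))
    if distance == 0:
        return "match"
    if distance <= review_threshold:
        return "review"
    return "mismatch"
```

Under these assumptions, differences in capitalization or spacing still count as a match, a single dropped letter is flagged for review, and only a clearly different name is treated as a mismatch.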

What emerges is a pattern of excessive data collection for even the most basic interactions with the state. This not only burdens people but also heightens the risk of data leaks and misuse.  

Data as Burden, Not Bridge 

The current digital public infrastructure (DPI) models often treat data as an endpoint, a proof of process or compliance, rather than a governance tool. Once collected, data tends to sit in isolated silos, with little standardization across departments. This fragmentation defeats the very premise of digital interoperability. 

While India’s DPI processes millions of transactions daily through platforms like UPI, the underlying architecture struggles with interoperability, forcing people to repeatedly submit the same documents across different portals. This systemic congestion isn’t just a technical inconvenience; it’s a governance failure where the very infrastructure meant to democratize access instead creates new barriers through performance delays and system instability. 

The operators, too, are left juggling fragmented dashboards without access to the logic behind the system. A pension application flagged for “document mismatch” provides no indication of which document failed and at what point. A school admission form returns an “error in data validation” without specifying the field. This lack of clarity creates unnecessary obstacles and erodes trust at both ends. 
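By way of contrast, the sketch below shows what a more specific response could look like: each failure names the field, the problem, and a next step. The field names, the `FieldError` structure, and the rules themselves are hypothetical and are not drawn from any real pension or admission system.

```python
from dataclasses import dataclass


@dataclass
class FieldError:
    field: str    # which input failed
    problem: str  # what was wrong with it
    remedy: str   # what the operator or applicant can do next


# Hypothetical list of fields the form must contain.
REQUIRED_FIELDS = ("applicant_name", "date_of_birth", "bank_account")

# Hypothetical fields that are cross-checked against the uploaded ID document.
CROSS_CHECKED_FIELDS = ("applicant_name", "date_of_birth")


def validate_application(form: dict, id_document: dict) -> list:
    """Return one FieldError per failing field instead of a generic message."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not str(form.get(field, "")).strip():
            errors.append(FieldError(field, "value is missing",
                                     f"Enter a value for '{field}'."))
    for field in CROSS_CHECKED_FIELDS:
        entered = str(form.get(field, "")).strip()
        recorded = str(id_document.get(field, "")).strip()
        if entered and recorded and entered != recorded:
            errors.append(FieldError(
                field,
                f"form says {entered!r} but the ID document says {recorded!r}",
                "Correct the form or upload an updated document.",
            ))
    return errors
```

An operator who receives a list like this can see at a glance that, say, the bank account is blank and the name on the form differs from the name on the document, rather than being told only that there was a “document mismatch.”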

This is also the essence of the data dump dilemma. A data dump occurs when platforms demand exhaustive personal details upfront, treating people as data sources rather than service recipients. Unlike targeted data collection that serves a clear purpose, these digital platforms operate on a “collect everything, sort later” principle. The result is bloated databases filled with redundant information that creates friction for users while offering little additional value to governance.  
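The difference between targeted collection and “collect everything” can be sketched in a few lines. The service names and the fields each one needs are assumptions chosen for illustration, not a description of any actual portal; the point is only that a platform can declare what a service requires and decline to store the rest.

```python
# Hypothetical mapping from each service to the only fields it needs.
REQUIRED_BY_SERVICE = {
    "electricity_bill_payment": {"consumer_number", "amount"},
    "pension_application": {"applicant_name", "date_of_birth", "bank_account"},
}


def minimize(service: str, submitted: dict) -> tuple:
    """Keep only the fields the service needs; report what was left out.

    Returns (kept, dropped): kept is a dict of required fields that were
    submitted, dropped is a sorted list of field names that were not needed.
    """
    allowed = REQUIRED_BY_SERVICE[service]
    kept = {key: value for key, value in submitted.items() if key in allowed}
    dropped = sorted(key for key in submitted if key not in allowed)
    return kept, dropped
```

In this sketch, a bill payment submitted with a name, address, and parental details would retain only the consumer number and the amount, and the remaining fields would never reach the database.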

DigiLocker, India’s flagship document storage platform serving over 43 crore (430 million) users, regularly experiences server overloads during peak periods, such as when exam results are published, forcing people to wait and retry multiple times. UMANG, the unified government services app, frequently displays “session token expired” and “something went wrong” errors due to heavy traffic at department servers.  

Even when people correct their data, updating names, addresses, or Aadhaar numbers, the system may still reject them. Why? Because backend systems often default to older entries held in fragmented caches or unsynchronized databases. This ‘memory persistence’ means the system trusts its own flawed recollections more than the person standing in front of it.   
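One common way to avoid this kind of stale read is to discard the cached copy whenever a record is corrected. The sketch below assumes a simple setup with a primary store and a read cache in front of it; the class and method names are hypothetical, and real government backends are likely to be far more distributed than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str
    version: int  # incremented on every correction


class RecordStore:
    """A primary store with a read cache that is invalidated on every write."""

    def __init__(self):
        self._primary = {}  # source of truth
        self._cache = {}    # faster copy that can go stale if never cleared

    def read(self, person_id: str) -> Record:
        # Serve from the cache when possible, otherwise load from the primary.
        if person_id not in self._cache:
            self._cache[person_id] = self._primary[person_id]
        return self._cache[person_id]

    def correct(self, person_id: str, new_name: str) -> None:
        # Write the correction to the primary store first...
        old = self._primary.get(person_id)
        version = old.version + 1 if old else 1
        self._primary[person_id] = Record(new_name, version)
        # ...then drop the cached copy so the next read sees the correction.
        self._cache.pop(person_id, None)
```

Without the final line of `correct`, a read that happened before the correction would keep returning the old name, which is the failure described above.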

When public utilities like digital portals don’t forget when they should, or refuse to remember when they must, they cease to serve and begin to punish. 

Low Digital Literacy + High System Complexity = Digital Disenfranchisement 

In rural India, digital literacy is not a given. Most people interact with digital systems through intermediaries such as Bangla Sahayata Kendras (public service kiosks). This makes the quality of interface, language accessibility, and procedural clarity critically important. Yet many platforms are designed with the assumption of a digitally fluent user. English-heavy forms, unclear error messages, and lack of visual aids further alienate users.

Operators may also struggle with rigid formats, high rejection rates due to system errors, and minimal training. For example, uploading a land record might require image compression that is neither explained nor facilitated by the platform. If the server fails, the entire day’s applications may be lost. 

This creates stress points where people feel humiliated and operators helpless. The digital promise becomes a real-world disappointment. Assisted digital access, community engagement, and continuous operator feedback are strengths that need to be scaled alongside platforms, not left behind in the rush to expand.

Moreover, the architecture of most government platforms is designed for upward reporting, not downward communication. Dashboards are built to monitor volume, not value. Success is measured in metrics like “total applications processed” or “portal uptime,” not in terms of user satisfaction, equity of access, or resolution rates. 

This top-down design language becomes exclusionary. It privileges data extraction over civic engagement, and visibility over usability. 

Reimagining Data Architecture: From Extraction to Empowerment 

To truly serve people, DPIs must take a human-centred approach. This requires: 

  1. Interoperable and adaptive platforms: Systems should be designed to talk to each other. People should not have to submit the same documents repeatedly. Inter-departmental data integration can significantly reduce redundancy.
  2. Localized and multilingual interfaces: Platforms must speak the language of their users. This includes regional languages, voice interfaces, and visual storytelling for non-literate users.
  3. Transparent feedback loops: When a service fails or data mismatches occur, the system must explain why and offer clear remedial steps. Feedback should be treated as valuable data, not noise.
  4. Operator-centric tools: The real-time experience of operators must inform platform updates. These frontline workers are the human face of DPIs and their input is invaluable.
  5. Inclusive design labs: Regular usability testing with diverse groups of people should be institutionalized. If the design fails the most vulnerable, it fails everyone. 

The Global South: A Cautionary Tale or a Leadership Opportunity? 

As digital platforms proliferate in governance systems, the risks of exclusion, inefficiency, and unclear data practices are not unique to India but indicative of a broader structural challenge across the Global South. 

Governance is not code. It must account for history, diversity, language, and power. In contexts where literacy, infrastructure, and trust deficits are high, DPIs must serve not only as service delivery tools but also as instruments of inclusion. This means investing in community consultations before rollout, translating digital rights into regional languages, and building grievance redressal systems that are accessible both online and offline. 

From Dashboards to Communities 

The future of digital public infrastructure does not lie in grander dashboards or larger datasets. It lies in smaller questions: Can a widow in remote Purulia, in India’s West Bengal, access her pension without help? Can a migrant worker in Bihar get a caste certificate in one attempt instead of five? Can an operator in a remote part of Uttar Pradesh resolve a system error without having to call multiple helplines?  

If the Internet is the utility, then people are its rightful consumers. And any utility that forgets its user is bound to fail. It is time we moved beyond the data dump and toward digital dignity. 

Disclaimer: Viewpoints expressed in this post are those of the author and may or may not reflect official Internet Society positions.