With the recent release of the Kris Gopalakrishnan Committee’s Non-personal Data (NPD) governance framework, the debate on who should derive value from NPD has escalated. Startups, in particular, will be significantly impacted by the committee’s proposals, since complying with the data sharing obligations might require them to forgo the considerable investments they have put into building their own datasets and, by extension, risk losing their competitive edge.
To help startups recognise and identify the impact of the proposed NPD regulations, Ikigai Law organised an interactive virtual roundtable session, ‘Unscramble: Implications of the non-personal data framework on startups’, on September 2, 2020.
The discussion saw wide participation from the startup community with representatives from healthtech, fintech and deeptech sectors. Led by Aman Taneja, Sreenidhi Srinivasan, and Nehaa Chaudhari of Ikigai Law, the session covered key issues under the proposed framework such as its impact on startup innovation, definitional challenges, overlaps with the personal data protection framework, the pricing of data, and compliance challenges.
Impact On Innovation: Villainising Success?
Srinivasan sought insights on the impact of data sharing obligations on startups relying on data for a competitive advantage. While the committee focusses on increasing innovation, if data sharing is mandated, the incentive to innovate and experiment with data will be diminished, given that startups would have to share that data with their competitors.
Manuj Garg, cofounder of healthtech startup myUpchar, emphasised that companies invest time, effort and resources in creating datasets. Forced data sharing will wipe out these investments and subject startups to financial losses. “We would not have been able to survive if the NPD framework was already in place,” Garg added.
Voicing his concerns about the compulsory sharing proposals, Cred legal and policy head Hardeep Singh said that the proposals could allow larger companies to access data held by smaller entities cheaply. Mandatory sharing requirements would also negate any ‘first mover’ advantage that companies may have accrued.
Stressing that the collection of data by itself is not ‘wrong’, Singh suggested that the law differentiate between the use and abuse of large datasets. Companies abusing their market position may warrant scrutiny, albeit under the competition law framework; companies using data to offer better services should not be subject to data sharing obligations. As Singh elaborated, “The data sharing proposals seemingly villainise success”.
What Exactly Is Non-Personal Data?
The framework proposes three types of non-personal data, with considerable potential for overlap: public NPD, community NPD, and private NPD.
Ashutosh Chadha, vice president and head of public policy and government affairs for South Asia at Mastercard, explained, “If a municipal corporation is repairing sewers in a colony and collects data for this purpose, is it public NPD, by virtue of being collected by a public body, or is it community NPD, because it relates to the colony of people where the sewer belongs?”
In many cases, it is hard to determine the community to which the data belongs. For example, while aggregated cab traffic data relating to a colony of professionals such as doctors, engineers and lawyers could be entrusted to the colony’s resident welfare association, the sub-community of the professionals may also stake a claim to it. It is unclear who is accountable for this data, and how such conflicts would be resolved.
The framework also does not address the potential for conflict within a community itself. Neharika Srivastava, director, legal, at insurance giant Aon, asked if the broad scope of the term ‘community’ could possibly include foreign communities. Given that many datasets relating to a group of customers, including foreign customers, can form part of a ‘community’, it is possible for the framework to extend overseas as well.
Interface With PDP: Personal Data Standards For NPD?
The NPD framework will require companies to obtain user consent before anonymising data and using it. In addition to potentially leading to consent fatigue among users, this can also create practical challenges for startups, especially for third-party data processors and companies with substantial customer churn.
Vinaya Sathyanarayana, founder and CEO, Sthana.ai, explained that many users may sign up to a platform, but the number of active users eventually drops off. In such a scenario, how does a company obtain the consent of users who have left?
According to Sathyanarayana, platforms would effectively be disallowed from anonymising such data, and other ‘historical’ data.
Moreover, Parag Agarwal, head of partnerships at Doxper, questioned the need for companies to invest resources in obtaining user consent when the framework offers no incentives for data sharing. According to him, it would be simpler for companies to claim that their datasets do not have the necessary consents, and so be excluded from any mandatory sharing requirements.
Also among the major concerns is the inclusion of inferred data as part of ‘private NPD’. Aon’s Srivastava pointed out that inferred data also falls under the definition of personal data under the PDP Bill.
Wriju Ray, CBO, IDfy, said, “Inferences can be derived from personal data without looking into its ‘personal’ nature”. Inferences are already subject to portability, erasure, and correction obligations under the forthcoming personal data protection law. Making inferred data available to competitors under the NPD framework makes any investments into deriving inferences redundant.
Doxper’s Agarwal further observed that the classification of NPD as sensitive is oxymoronic, given that anonymisation is meant to be irreversible. NPD datasets are aggregated from several unidentifiable individuals with the application of adequate privacy and security controls in most cases. Considering that NPD by its very nature does not relate to any individual, it should not be subject to the same safeguards as personal data.
Putting A Price On Data: One Man’s Trash Is Another Man’s Treasure
Determining the value of data can be tricky. Datasets that are useful for one organisation may hold little to no value for another. According to Rishabh Ladha, cofounder and CBO of Squadvoice, data has little inherent value in isolation, and only gains value depending on its use. So the approach of arriving at a singular value of a dataset, depending on ‘value-add’, is bound to create complications such as overvaluation or undervaluation. Determining the quality of data can also present several challenges. For instance, data points that carry an element of bias may be less valuable, regardless of any ‘value-add’.
It is also unclear how the law will determine a fair market value for data. As Ladha highlighted, “This may actually benefit larger companies who can push up the market value of data and price-out smaller companies.” Unlike the bigger companies, startups with less funding may be unable to purchase such datasets, leaving them at a disadvantage.
Ladha also questioned if anonymisation is as simple as the committee assumes it to be. Anonymising data by itself is a value addition. The requirement to share ‘raw’ anonymised data ignores the time and effort required to anonymise data. As Agarwal later observed, the cost of collecting consent should be added to the price of the dataset. Free sharing of such data will impose further losses onto startups.
Compliance And Business Challenges: A Case Of Overregulation?
Over and above existing sectoral laws, the NPD framework proposes several compliance requirements for ‘data businesses’. Doxper’s Agarwal voiced concerns over the ‘data volume’ threshold proposed. According to him, there is a lot of uncertainty about how this threshold will be defined, and it may become a moving goalpost. Because of this, it is possible that startups could also qualify as a ‘data business’.
Richa Mukherjee, public policy and corporate affairs, PayU, pointed out, “Given that payment companies are already subject to a host of stringent RBI regulations, the framework can have the unintended impact of overregulating the movement of data.”
There is a strong likelihood of overlaps with sectoral regulations, which can create excessive compliance burdens that stunt innovation and growth in several sectors. According to Panduranga Acharya, Swiggy’s legal and regulatory director, developing regulations for NPD while India’s personal data law is not settled will only create more confusion for startups. Further, Arjun Alexander, AVP of neobanking company Open, stated that the rights, duties and obligations of actors under the framework are not clearly defined, which can result in an unclear and burdensome compliance regime.
While talking about compliance from a technical standpoint, Venkata Pingali, cofounder and CEO, Scribble Data, argued that the Indian workforce may not have the capacity to actually comply with the proposed requirements, as “the data collection and classification processes of most organisations are very messy”.
The absence of data standardisation processes and tools, especially in the context of metadata, will prevent meaningful compliance. Data sharing in the absence of well-accepted standards will create more uncertainty and ‘misunderstanding’ of data, which will in turn make the implementation of the proposed framework more chaotic, said Pingali.
How Do Startups Engage With The NPD Framework?
The NPD committee is accepting comments until September 13, 2020. Considering the potential impact of the NPD framework, data-reliant startups should consider responding to the committee. The window for submitting comments presents an opportunity for startups to ensure that their concerns are duly represented before the committee.