MeitY Releases Draft Guidelines on Data Anonymisation for E-Governance

September 5, 2022

The Ministry of Electronics & Information Technology (MeitY) has released draft guidelines on data anonymisation for e-governance projects, which are open for public consultation until 21 September 2022.

The guidelines suggest various techniques and SOPs that e-governance projects can adopt to anonymise the data they gather (and then harness it for other projects). They also aim to support the implementation of data anonymisation provisions in policies and laws enacted by the government.

The draft report was commissioned by the Ministry of Electronics & Information Technology (MeitY) and prepared by the Standardization Testing Quality Certification (STQC) Directorate and Centre for Development of Advanced Computing (C-DAC). A full list of policymakers involved in framing the guidelines is available in Annexure 2 of the uploaded PDF.

How to participate?

Email your feedback to Shubhanshu Gupta, Principal Technical Officer at C-DAC: shubhanshug[at]cdac[dot]in. Remember to copy the following email address on your submission: headits[at]stqc[dot]gov[dot]in.

Why anonymise e-governance-related data?

The draft guidelines are clear in their belief that data can play a role in empowering both e-governance and the nation. Emphasising that ‘data-tech’ is now considered a good and central to international dialogue and collaborations, the draft goes on to add that government entities are the ‘most extensive’ data fiduciaries in India. This makes them responsible for protecting the privacy of the reams of citizen data collected through large-scale, interconnected e-governance projects. Data anonymisation (removing identifiable attributes from a data value to protect a person’s identity) is one such method, according to the guidelines. It protects citizens while also allowing anonymised data to be used for limited purposes in other e-governance projects.

Does anonymisation work?

The draft guidelines add that data anonymisation is not a silver bullet for ensuring privacy, arguing that it should be one component of a larger privacy-by-design approach to an e-governance project’s operations. However, the draft offers a muted response to arguments that data can never truly be anonymised (and privacy risks mitigated), musing that time and technological advancements will offer more robust solutions to these issues.

When was this released?

In July 2022, according to the PDF uploaded online. However, brief news reports on the draft guidelines—and a similar set of recommendations for mobile security—prominently appeared around August 30th, 2022. As a result, some commentators have recently questioned MeitY’s muted marketing of the guidelines and the consultation period.

One step further for non-personal data policy?

The draft guidelines appear to be the next step in India’s policy tryst with harnessing the ‘non-personal data’ (data not related to an individual, or anonymised personal data) of citizens. Since 2020, India has been flirting with the idea of managing and sharing non-personal data to tap its ‘social and public value’ and improve digital innovation and entrepreneurship. These plans then found their way into India’s now-withdrawn proposed data protection laws, although there are now rumours that non-personal data may be regulated separately from personal data.

Then came the National Data Governance Framework Policy released by MeitY in May 2022, which seeks to build ‘a vast repository of anonymised, non-personal data obtained from government ministries, departments and organisations, alongside anonymised data voluntarily disclosed by private entities’.

While the government purportedly seeks to improve the quality of business and governance through such non-personal data policy frameworks, they could harm hard-earned competitive edges in India’s tech sector, while still raising privacy concerns. Also, while the 2022 draft guidelines want to help the government build ‘vibrant, diverse and large base of datasets for research and innovation whilst maintaining informational privacy’, whether the State is competent to do so, or whether datasets filled with questionable data can indeed improve governance, largely remains to be seen.

Who are the stakeholders involved in the anonymisation of data?

Professional users: People who use the anonymised data captured or processed by an e-governance organisation (the application that captures this data is called the ‘owner application’). These can include citizens, call centres, third-party service providers, researchers, and data analysers. It can also include departments or applications that have to access the data produced by the source application.

Processors: Teams involved in processing the data captured by the owner application. They convert raw data into anonymised data. They include development and testing teams, production support, and system administrators.

What is the recommended SOP for anonymising data?

The 15-step SOP for organisations undertaking data anonymisation suggests:

Step 1: Determine which datasets require anonymisation. Consider data collected from all possible sources.

Step 2: Devise a release model or policy on how the anonymised data will be released and to whom. Decide whether this dataset will be publicly available, or shared with controlled groups.

Step 3: Identify the teams required within the organisation to perform anonymisation. Identify their roles and responsibilities.

Step 4: Determine which data directly identifies an individual (direct identifiers like phone numbers, and interestingly, Aadhaar) and which data indirectly does so (quasi-identifiers like sexual orientation or religious belief). This will help decide which data should be anonymised and the techniques to do so.

Step 5: First mask—or anonymise—direct identifiers. This keeps the dataset free from re-identification risks.

Step 6: Conduct threat modelling for quasi-identifiers to identify what information could be revealed as a result of them.

Step 7: Determine the re-identification risk threshold based on the anonymisation techniques deployed.

Step 8: Determine the anonymisation techniques for quasi-identifiers, and document the process.

Step 9: Import sample data from the original database and document the same.

Step 10: Based on steps 6-9, perform a trial anonymisation and assess whether the results meet risk-limitation expectations. Review and correct errors, and ensure that risk is below the re-identification threshold.

Step 11: Now, anonymise all quasi-identifiers across the dataset.

Step 12: Stop to evaluate the actual identification risks for the anonymised data again.

Step 13: Compare this risk with the threshold laid out by policymakers. If the result falls short of that requirement, evaluate and repeat the testing.

Step 14: Determine access controls for sharing anonymised data. Data owners should ensure that the parties they share the information with use it for a limited purpose and that it is not misused. Organisations receiving data should confirm that they will not attempt to re-identify it.

Step 15: Document the anonymisation procedure. This will help auditors identify potential flaws in anonymisation too.
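
To make the core of this SOP more concrete, here is a minimal Python sketch of steps 4 to 13 on a toy dataset. It is an illustration under stated assumptions, not the method prescribed by the draft: the records, field names, and the threshold of 0.5 are hypothetical, and the risk measure (one divided by the size of the smallest group of records sharing the same quasi-identifier values, in the spirit of k-anonymity) is one common choice that the summary above does not attribute to the guidelines.

```python
from collections import Counter

# Hypothetical sample records; field names and values are illustrative only.
RECORDS = [
    {"phone": "9000000001", "age": 23, "pincode": "110001", "scheme": "A"},
    {"phone": "9000000002", "age": 27, "pincode": "110002", "scheme": "B"},
    {"phone": "9000000003", "age": 34, "pincode": "110045", "scheme": "A"},
    {"phone": "9000000004", "age": 38, "pincode": "110046", "scheme": "C"},
    {"phone": "9000000005", "age": 36, "pincode": "560001", "scheme": "B"},
]

DIRECT_IDENTIFIERS = {"phone"}          # Step 4: fields that identify a person directly
QUASI_IDENTIFIERS = ("age", "pincode")  # Step 4: fields that identify in combination
RISK_THRESHOLD = 0.5                    # Step 7: assumed value, not taken from the draft


def mask_direct(record):
    """Step 5: drop direct identifiers from a record."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def generalise(record):
    """Steps 8 and 11: coarsen quasi-identifiers (age band, pincode prefix)."""
    out = dict(record)
    low = (record["age"] // 10) * 10
    out["age"] = f"{low}-{low + 9}"
    out["pincode"] = record["pincode"][:3] + "***"
    return out


def group_sizes(records):
    """Count how many records share each combination of quasi-identifiers."""
    return Counter(tuple(r[q] for q in QUASI_IDENTIFIERS) for r in records)


def reidentification_risk(records):
    """Step 12: worst-case risk, taken here as 1 / size of the smallest group."""
    return 1 / min(group_sizes(records).values())


def suppress_small_groups(records, min_size):
    """Drop records whose quasi-identifier group has fewer than min_size members."""
    sizes = group_sizes(records)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in QUASI_IDENTIFIERS)] >= min_size
    ]


anonymised = [generalise(mask_direct(r)) for r in RECORDS]
risk = reidentification_risk(anonymised)

# Step 13: if the risk is above the threshold, adjust and re-test.
released = anonymised
if risk > RISK_THRESHOLD:
    released = suppress_small_groups(anonymised, min_size=2)

print(risk, reidentification_risk(released), len(released))
```

With this sample data, the first pass leaves one record alone in its group, so the worst-case risk is 1.0. Suppressing that record brings the risk down to 0.5, which meets the assumed threshold, at the cost of releasing four records instead of five. That trade-off between privacy and data utility is the kind of decision steps 10 to 13 ask organisations to test and document.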

The Draft also recommends conducting a risk assessment post-release of the anonymised data. It broadly recommends putting systems in place to report data privacy incidents to concerned stakeholders within specified timeframes. To minimise these events, the Draft emphasises training e-governance officials in data anonymisation techniques across the life cycle of data processing (collection, processing/usage, archival, deletion/destruction).