India's unique identification project (also known as the Aadhar project) completed the collection of demographic and biometric data for more than 500 million Indians earlier this week, making it the largest project of its kind in the world today.
The project's rollout has drawn criticism on privacy, security, and other grounds. The latest developments have raised concerns about how it captures, stores, and manages data, and especially about the role of an American start-up, MongoDB.
MongoDB is a NoSQL database start-up that raised money last year from In-Q-Tel, an independent, non-profit investment firm backed by the CIA and other U.S. intelligence agencies.
In the past few days, several Indian media reports have quoted political parties and activists in the country who suspect that private data from the Aadhar project has been misappropriated. The accusations are directed at Nandan Nilekani, the Infosys co-founder who heads the project.
Some of these reports also name MongoDB as a target of the criticism.
Governments around the world are increasingly wary of eavesdropping by the U.S. National Security Agency (NSA), and anything linked to U.S. government intelligence attracts attention. On top of that, with India's general election due next year, political debate in the country has reached an unprecedented intensity.
The accusations could hardly have come at a worse time for this ambitious identification project: Aadhar is awaiting the passage this year of a parliamentary bill that would give it full legal recognition.
I visited the Aadhar project office in Bangalore. According to the staff who briefed me, despite accusations that a large contract includes provisions for sharing data with MongoDB, Aadhar only uses MongoDB's open-source code, and MongoDB does not touch sensitive data. The visit was also a chance to learn how the world's largest biometric database works and how it deals with security and privacy concerns.
In addition, the Unique Identification Authority of India (UIDAI) has denied accusations of sharing Indian citizens' data with any U.S. agency.
What does Aadhar mean for India?
First, some context: what does this project mean for a country like India? More than 500 million people in the country have no formal identification (ID) or similar credentials, which leads to many other problems: they struggle to receive government subsidies, open bank accounts, apply for loans, or obtain driving licences. The Aadhar database, which is currently enrolling Indian nationals at a rate of 1 million a day, is expected to cover about 1.2 billion people by the end of next year, and will be the largest biometric database on Earth.
The biggest advantage of the 12-digit Aadhar number is that it lets the government link poor people to bank accounts and pay cash benefits and other subsidies directly by bank transfer. Currently, nearly 40 million bank accounts in India have been matched to Aadhar data.
A report by CLSA, a market research firm, shows that more than 40% of the government's $250 billion in subsidies and other entitlements aimed at the country's poor will be wasted over the next few years. The Aadhar program can cut out intermediaries and transfer cash directly to the people who need government subsidies, curbing corruption in the process.
But think tanks and activists, including the Bangalore-based Centre for Internet and Society, have always been skeptical about the project's handling of privacy and have even questioned how effective the whole project can be.
Inside the world's largest biometric database
I had been trying to meet Aadhar project officials to understand the security issues, the project's current progress, and their response to the criticism over its use of MongoDB technology.
On Friday, Aadhar finally agreed to meet me at its headquarters in the southern suburbs of Bangalore, where Intel's and Cisco's headquarters are also located. From the outside, the Aadhar Technology Center, which stores all of the data collected nationwide (currently 5 petabytes), looks nothing like a government building; it could easily be mistaken for one of the nearby Intel or Cisco office buildings.
Inside, I came to a central room with more than 10 TV screens, where a few engineers in their twenties sat intently at their keyboards, monitoring the transfer of data packets into storage. The scene resembled an advanced control center. The screens they were watching showed the records of these data packets (about 5 MB each), which originate at 30,000 enrollment centers nationwide and pass through at least three verification steps. The verification process runs a de-duplication check on each file to ensure that the same person is never issued two Aadhar numbers.
In other words, every new data file has to be checked for duplicates against all existing files, of which there are now more than 500 million.
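To make the de-duplication step concrete, here is a minimal Python sketch. It is not Aadhar's actual matching system, whose internals are not described in this article. It assumes, purely for illustration, that each enrollment carries a fixed-length biometric template stored as a bit pattern, and that two templates belong to the same person when they differ in only a few bits. The names `Enrollment`, `MATCH_THRESHOLD`, `find_duplicate`, and `enroll` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: templates that differ in at most this many bits are
# treated as the same person. The real threshold is unknown.
MATCH_THRESHOLD = 6


@dataclass(frozen=True)
class Enrollment:
    packet_id: str
    template: int  # hypothetical biometric template, stored as a bit pattern


def hamming_distance(a: int, b: int) -> int:
    """Count the bits in which two templates differ."""
    return bin(a ^ b).count("1")


def find_duplicate(new: Enrollment, existing: List[Enrollment]) -> Optional[Enrollment]:
    """Compare the new record against every existing record (a linear scan)."""
    for record in existing:
        if hamming_distance(new.template, record.template) <= MATCH_THRESHOLD:
            return record
    return None


def enroll(new: Enrollment, existing: List[Enrollment]) -> bool:
    """Add the record only if no near-identical template is already stored."""
    if find_duplicate(new, existing) is not None:
        return False  # same person already enrolled: no second number issued
    existing.append(new)
    return True
```

The linear scan mirrors the "check against all existing files" idea literally. With more than 500 million records, a production system would presumably need indexing and parallel matching rather than a plain loop, but those details are beyond what the officials described.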
Former Intel engineer Srikanth Nadhamuni, who helped design the Aadhar technology platform in September 2010, currently runs Khosla Labs in Bangalore. He told me that these packets are protected with 2048-bit encryption and that they self-destruct as soon as anyone attempts unauthorized access.
Criticism of MongoDB
So why did Aadhar work with MongoDB in the first place? Will this partnership continue?
Sudhir Narayana, Assistant Director-General of the Aadhar Technology Center, said MongoDB was only one of several products initially selected for data retrieval, alongside MySQL, Hadoop, and HBase. Unlike MySQL, which he said stores only demographic data, MongoDB can also store images.
But Aadhar later shifted most of its database work to MySQL, because it found that MongoDB could not handle such large volumes of data, amounting to millions of data packets.
The team is already using database sharding: storing data packets across different machines so that the system does not crash as the volume of data grows.
This approach helps Aadhar reduce its dependency on MongoDB and use MySQL to store most of the data instead.
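The sharding idea can be illustrated with a short Python sketch. It is an assumption-laden toy, not Aadhar's configuration: it routes each data packet to one of several in-memory "machines" by hashing a packet identifier. The `ShardedStore` class, its methods, and the choice of SHA-256 for routing are all hypothetical and chosen only to show the principle.

```python
import hashlib
from typing import Dict, List, Optional


class ShardedStore:
    """Toy hash-based sharding: each shard stands in for a separate machine."""

    def __init__(self, shard_count: int) -> None:
        if shard_count < 1:
            raise ValueError("shard_count must be at least 1")
        self.shards: List[Dict[str, bytes]] = [{} for _ in range(shard_count)]

    def shard_for(self, packet_id: str) -> int:
        """Map a packet id to a shard index using a stable hash."""
        digest = hashlib.sha256(packet_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, packet_id: str, payload: bytes) -> None:
        """Store the payload on the shard that owns this packet id."""
        self.shards[self.shard_for(packet_id)][packet_id] = payload

    def get(self, packet_id: str) -> Optional[bytes]:
        """Look up the payload on the owning shard only."""
        return self.shards[self.shard_for(packet_id)].get(packet_id)
```

Because the same identifier always hashes to the same shard, a lookup touches only one machine, and no single machine has to hold the full data set. A stable hash is used instead of Python's built-in `hash()`, which varies between runs for strings.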
Ashok Dalwai, Deputy Director-General of the Aadhar Technology Center, told me that MongoDB has no access to any biometric data.
"We think that using open-source technology lets us avoid relying too much on a single supplier, but that doesn't mean we're compromising security in any way," Ashok Dalwai said.
When we interviewed MongoDB, a spokesman referred us to the statement on the In-Q-Tel investment published on the company's website.
More importantly, India's identification authority (UIDAI) had been using MongoDB's open-source technology long before the start-up received its investment from In-Q-Tel. Crunchbase data show that MongoDB raised a total of $7.7 million from Red Hat, Intel Capital, and In-Q-Tel in 2012 alone.
What is the future of Aadhar?
Controversy aside, Aadhar aims to finish enrolling more than 1.2 billion Indian nationals in 2014, with a total database measured in petabytes. The project is currently enrolling 1 million people a day, and from next year the rate is expected to rise to 2 million a day to bring the remaining 700 million into the database.
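The arithmetic behind that timeline can be checked with a few lines of Python. The figures are the ones quoted above (700 million people remaining, 1 million or 2 million enrollments a day). The function name `days_to_enroll` is hypothetical, and the sketch assumes a constant daily rate with no interruptions.

```python
def days_to_enroll(remaining_people: int, people_per_day: int) -> int:
    """Days needed at a constant daily rate, rounded up to a whole day."""
    return -(-remaining_people // people_per_day)  # integer ceiling division


REMAINING = 700_000_000  # people still to be enrolled, per the article

print(days_to_enroll(REMAINING, 1_000_000))  # 700 days at the current rate
print(days_to_enroll(REMAINING, 2_000_000))  # 350 days at the planned rate
```

At the current rate the remaining enrollments would take nearly two years, while the planned 2 million a day brings that down to just under a year, which is consistent with the 2014 target.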
(Responsible editor: The good of the Legacy)