The leak could be one of the largest on record, cybersecurity experts say, highlighting the risks of collecting and storing large amounts of sensitive personal data online, especially in a country where authorities have broad and unchecked access to such data.
The vast amount of Chinese personal data had been publicly accessible since at least April 2021 through what appeared to be an unsecured backdoor link — a shortcut web address that offers unrestricted access to anyone with knowledge of it — according to LeakIX, a site that detects and indexes exposed databases online.
Access to the database, which did not require a password, was shut down after an anonymous user advertised the more than 23 terabytes (TB) of data for sale for 10 bitcoin (about $200,000) in a post on a hacker forum last Thursday.
The user claimed that the database was collected by the Shanghai Police Department and contained sensitive information about one billion Chinese citizens, including their names, addresses, mobile numbers, national ID numbers, ages and places of birth, as well as billions of records of phone calls to police to report civil disputes and crimes.
A sample of 750,000 data entries from the database’s three main indexes was included in the seller’s post. CNN verified the authenticity of more than two dozen entries from the seller-provided sample, but was unable to access the original database.
The Shanghai government and police have not responded to CNN’s repeated written requests for comment.
The seller also claimed that the unsecured database was hosted by Alibaba Cloud, a subsidiary of Chinese e-commerce giant Alibaba. In a statement to CNN, Alibaba said it was investigating the incident and would share further updates.
But experts CNN spoke to said it was the owner of the data that was at fault, not the company hosting the data.
“As it stands, I think this would be the biggest leak of public information to date, certainly in terms of the magnitude of the impact in China. We’re talking about the bulk of the population here,” said Troy Hunt, a Microsoft regional director based in Australia.
China is home to 1.4 billion people, meaning the data breach could potentially affect more than 70% of the population.
“It’s kind of a case where the genie can’t go back in the bottle. Once the data is available in the form it appears to be now, there’s no going back,” Hunt said.
It’s unclear how many people accessed or downloaded the database during the 14 months or longer that it was available to the public online. Two Western cybersecurity experts who spoke to CNN were both aware of the database’s existence before it was brought to the public eye last week, suggesting it could be easily discovered by those who knew where to look.
Vinny Troia, a cybersecurity researcher and founder of dark web intelligence firm Shadowbyte, said he first discovered the database “around January” while searching for open databases online.
“The site I found it on is public, anyone (could) access it, all you need to do is create an account,” Troia said. “Since it opened in April 2021, any number of people could have downloaded the data,” he added.
Troia said he downloaded one of the database’s main indexes, which appears to contain information on nearly 970 million Chinese citizens. But it was difficult to judge whether the open access was a mistake by the database owners, or whether it was an intentional shortcut meant to be shared with a small number of people, he said.
“They either forgot it, or they purposely left it open because it’s easier for them to access,” he said, referring to the authorities responsible for the database. “I don’t know why they would do that. It sounds very careless.”
Unsecured personal data — exposed through leaks, breaches or some form of incompetence — is an increasingly common problem facing businesses and governments around the world, and cybersecurity experts say it’s not uncommon to find databases that are accessible to the public.
But the latest data breach is of particular concern, cybersecurity researchers say, not only because of its potentially unprecedented volume, but also because of the sensitive nature of the information contained within.
A CNN analysis of the database sample found police case records spanning nearly two decades, from 2001 to 2019. While the majority of entries concern civil disputes, there are also records of criminal cases ranging from fraud to rape.
In one case, a Shanghai resident was summoned by police in 2018 for using a virtual private network (VPN) to circumvent China’s firewall and access Twitter.
In another record, a mother called the police in 2010, accusing her father-in-law of raping her 3-year-old daughter.
“There could be domestic violence, child abuse, anything in there. I find that much more concerning,” said Hunt, the Microsoft regional director.
“Could this lead to extortion? We often see extortion of individuals after data breaches, examples where hackers can even try to bribe individuals.”
Bob Diachenko, a security researcher from Ukraine, first came across the database in April. In mid-June, his company discovered that the database had been attacked by an unknown malicious actor, who copied and destroyed the data, leaving a ransom note demanding 10 bitcoin for its recovery, Diachenko said.
It’s not clear if this was the work of the same person who advertised the sale of the database information last week.
By July 1, the ransom note was gone, according to Diachenko, but only 7 gigabytes (GB) of data was available — instead of the 23 TB originally advertised.
Diachenko said this suggested the ransom demand had been resolved, but that the owners had continued to use the exposed database for storage until it was shut down over the weekend.
“Maybe there was a junior developer who noticed and tried to delete the notes before senior management noticed,” he said.
This story was updated on Wednesday with additional developments.
CNN’s Philip Wang contributed to the report.