DeepSeek’s Popular AI App Is Explicitly Sending US Data to China
Amid ongoing fears over TikTok, Chinese generative AI platform DeepSeek says it’s sending heaps of US user data straight to its home country, potentially setting the stage for greater scrutiny.
The United States’ recent regulatory action against the Chinese-owned social video platform TikTok prompted mass migration to another Chinese app, the social platform “Rednote.” Now, a generative artificial intelligence platform from the Chinese developer DeepSeek is exploding in popularity, posing a potential threat to US AI dominance and offering the latest evidence that moratoriums like the TikTok ban will not stop Americans from using Chinese-owned digital services.
DeepSeek, an AI research lab created by a prominent Chinese hedge fund, recently gained popularity after releasing its latest open source generative AI model that easily competes with top US platforms like those developed by OpenAI. However, to help avoid US sanctions on hardware and software, DeepSeek created some clever workarounds when building its models. On Monday, DeepSeek’s creators limited new sign-ups after claiming the app had been overrun with a “large-scale malicious attack.”
While DeepSeek has several AI models, some of which can be downloaded and run locally on your laptop, the majority of people will likely access the service through its iOS or Android apps or its web chat interface. Like with other generative AI models, you can ask it questions and get answers; it can search the web; or it can alternatively use a reasoning model to elaborate on answers.
DeepSeek, which does not appear to have established a communications department or press contact yet, did not return a request for comment from WIRED about its user data protections and the extent to which it prioritizes data privacy initiatives.
As people clamor to test out the AI platform, though, the demand brings into focus how the Chinese startup collects user data and sends it home. Users have already reported several examples of DeepSeek censoring content that is critical of China or its policies. The AI setup appears to collect a lot of information, including all your chat messages, and send it back to China. In many ways, it’s likely sending more data back to China than TikTok has in recent years, since the social media company moved to US cloud hosting to try to deflect US security concerns.
“It shouldn’t take a panic over Chinese AI to remind people that most companies in the business set the terms for how they use your private data,” says John Scott-Railton, a senior researcher at the University of Toronto’s Citizen Lab. “And that when you use their services, you’re doing work for them, not the other way around.”
What DeepSeek Collects About You
To be clear, DeepSeek is sending your data to China. The English-language DeepSeek privacy policy, which lays out how the company handles user data, is unequivocal: “We store the information we collect in secure servers located in the People’s Republic of China.”
In other words, all the conversations and questions you send to DeepSeek, along with the answers that it generates, are being sent to China, or can be. DeepSeek’s privacy policies also outline the information it collects about you, which falls into three sweeping categories: information that you share with DeepSeek, information that it automatically collects, and information that it can get from other sources.
The first of these areas includes “user input,” a broad category likely to cover your chats with DeepSeek via its app or website. “We may collect your text or audio input, prompt, uploaded files, feedback, chat history, or other content that you provide to our model and Services,” the privacy policy states. Within DeepSeek’s settings, it is possible to delete your chat history. On mobile, go to the left-hand navigation bar, tap your account name at the bottom of the menu to open settings, and then click “Delete all chats.”
This collection is similar to that of other generative AI platforms that take in user prompts to answer questions. OpenAI’s ChatGPT, for example, has been criticized for its data collection, although the company has increased the ways data can be deleted over time. Regardless of these types of protections, privacy advocates emphasize that you should not disclose any sensitive or personal information to AI chatbots.
“I would not input personal or private data in any such an AI assistant,” says Lukasz Olejnik, independent researcher and consultant, affiliated with King’s College London Institute for AI. Olejnik notes, though, that if you install models like DeepSeek’s locally and run them on your computer, you can interact with them privately without your data going to the company that made them. Additionally, AI search company Perplexity says it has added DeepSeek to its platforms but claims it is hosting the model in US and EU data centers.
Other personal information that goes to DeepSeek includes data that you use to set up your account, including your email address, phone number, date of birth, username, and more. Likewise, if you get in touch with the company, you’ll be sharing information with it.
Bart Willemsen, a VP analyst focusing on international privacy at Gartner, says that, generally, the construction and operations of generative AI models are not transparent to consumers and other groups. People don’t know exactly how they work or the exact data they have been built upon. For individuals, DeepSeek is largely free, although it has costs for developers using its APIs. “So what do we pay with? What do we usually pay with: data, knowledge, content, information,” Willemsen says.
As with all digital platforms—from websites to apps—there can also be a large amount of data that is collected automatically and silently when you use the services. DeepSeek says it will collect information about what device you are using, your operating system, IP address, and information such as crash reports. It can also record your “keystroke patterns or rhythms,” a type of data more widely collected in software built for character-based languages. Additionally, if you purchase DeepSeek’s premium services, the platform will collect that information. It also uses cookies and other tracking technology to “measure and analyze how you use our services.”
A WIRED review of the DeepSeek website’s underlying activity shows the company also appears to send data to Baidu Tongji, Chinese tech giant Baidu’s popular web analytics tool, as well as Volces, a Chinese cloud infrastructure firm. In a social media post, Sean O’Brien, founder of Yale Law School’s Privacy Lab, said that DeepSeek is also sending “basic” network data and “device profile” to TikTok owner ByteDance “and its intermediaries.”
The final category of information DeepSeek reserves the right to collect is data from other sources. If you create a DeepSeek account using Google or Apple sign-on, for instance, it will receive some information from those companies. Advertisers also share information with DeepSeek, its policies say, and this can include “mobile identifiers for advertising, hashed email addresses and phone numbers, and cookie identifiers, which we use to help match you and your actions outside of the service.”
How DeepSeek Uses Information
Huge volumes of data may flow to China from DeepSeek’s international user base, but the company still has power over how it uses the information. DeepSeek’s privacy policy says the company will use data in many typical ways, including keeping its service running, enforcing its terms and conditions, and making improvements.
Crucially, though, the company’s privacy policy suggests that it may harness user prompts in developing new models. The company will “review, improve, and develop the service, including by monitoring interactions and usage across your devices, analyzing how people are using it, and by training and improving our technology,” its policies say.
DeepSeek’s privacy policy also says the company will use information to “comply with [its] legal obligations,” a blanket clause many companies include in their policies. The policy says data can be accessed by its “corporate group,” and it will share information with law enforcement agencies, public authorities, and more when it is required to do so.
While all companies have legal obligations, those based in China do have notable responsibilities. Over the past decade, Chinese officials have passed a series of cybersecurity and privacy laws meant to allow state officials to demand data from tech companies. One 2017 law, for instance, says that organizations and citizens should “cooperate with national intelligence efforts.”
These laws, alongside growing trade tensions between the US and China and other geopolitical factors, fueled security fears about TikTok. The app could harvest huge amounts of data and send it back to China, those in favor of the TikTok ban argued, and the app could also be used to push Chinese propaganda. (TikTok has denied sending US user data to China’s government.) Meanwhile, several DeepSeek users have already pointed out that the platform does not provide answers for questions about the 1989 Tiananmen Square massacre, and it answers some questions in ways that sound like propaganda.
Willemsen says that, compared to users on a social media platform like TikTok, people messaging with a generative AI system are more actively engaged and the content can feel more personal. In short, any influence could be larger. “Risks of subliminal content alteration, conversation direction steering, in active engagement ought by that logic to lead to more concern, not less,” he says, “especially given how the inner workings of the model are widely unknown, its thresholds, borders, controls, censorship rules, and intent/personae largely left unscrutinized, and it being already so popular in its infancy stage.”
Olejnik, of King’s College London, says that while the TikTok ban was a specific situation, US lawmakers or those in other countries could act again on a similar premise. “We can’t rule out that 2025 will bring an expansion: direct action against AI firms,” Olejnik says. “Of course, data collection may again be named as the reason.”
Updated 5:27 pm EST, January 27, 2025: Added additional details about the DeepSeek website’s activity.
Updated 10:05 am EST, January 29, 2025: Added additional details about DeepSeek’s network activity.