Report Reveals DeepSeek’s Chat Histories and Internal Data Were Publicly Exposed
A cloud security firm discovered a publicly accessible, fully controllable database belonging to DeepSeek, the Chinese company that has recently made waves in the AI world. Within minutes of examining DeepSeek’s security, Wiz researchers found a critical vulnerability that exposed sensitive information.
An analytical ClickHouse database tied to DeepSeek was "completely open and unauthenticated," containing over 1 million instances of "chat history, backend data, and sensitive information, including log streams, API secrets, and operational details." The database also had an open web interface that allowed for full database control and privilege escalation. Internal API endpoints and keys were available through the interface and common URL parameters.
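ClickHouse’s HTTP interface accepts SQL as a URL query parameter, which is why an unauthenticated instance is trivially readable with a plain GET request. The sketch below is illustrative only, assuming ClickHouse’s default HTTP port; the host name and helper function are hypothetical, not Wiz’s actual tooling:

```python
import urllib.parse

def probe_url(host: str, port: int = 8123, query: str = "SHOW TABLES") -> str:
    """Build the URL that runs `query` against a ClickHouse HTTP endpoint.

    ClickHouse serves HTTP queries on port 8123 by default. If a server
    answers this GET request with table names and no credentials, the
    database is effectively open to anyone on the internet.
    """
    return f"http://{host}:{port}/?query={urllib.parse.quote(query)}"
```

For example, `probe_url("db.example.com")` yields `http://db.example.com:8123/?query=SHOW%20TABLES` — exactly the kind of zero-effort request that distinguishes "misconfigured" from "exploited."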
According to Wiz’s blog post, "much of the attention around AI security is focused on futuristic threats, but the real dangers often come from basic risks—like accidental external exposure of databases." As organizations rush to adopt AI tools and services from a growing number of startups and providers, it’s essential to remember that by doing so, we’re entrusting these companies with sensitive data. The rapid pace of adoption often leads to overlooking security, but protecting customer data must remain the top priority.
Ars has contacted DeepSeek for comment and will update this post with any response. Wiz noted that DeepSeek never responded directly to its findings, but after Wiz contacted every DeepSeek email address and LinkedIn profile it could find on Wednesday, the company secured the exposed databases within half an hour.
"The fact that mistakes happen is correct, but this is a dramatic mistake, because the effort level is very low and the access level that we got is very high," said Ami Luttwak, CTO of Wiz. "I would say that it means that the service is not mature to be used with any sensitive data at all."
DeepSeek’s R1 model, a freely available simulated reasoning model that DeepSeek and some testers believe matches OpenAI’s o1 in performance, has sparked a wave of volatility in the tech and AI markets. DeepSeek purportedly runs at a fraction of the cost of o1, at least on DeepSeek’s servers. The drastically reduced power reportedly needed to train and run R1 also rocked power company stock prices.
Ars’ Kyle Orland found R1 impressive, given its seemingly sudden arrival and smaller scale, but noted some deficiencies in comparison with OpenAI models. OpenAI told the Financial Times that it believed DeepSeek had used OpenAI outputs to train its R1 model, in a practice known as distillation. Such training violates OpenAI’s terms of service, and the firm told Ars that it would work with the US government to protect its model.
In examining DeepSeek’s systems, Wiz researchers found numerous structural similarities to OpenAI’s API, seemingly designed so that customers could transition easily from OpenAI to DeepSeek.
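The similarity Wiz describes mirrors a common industry pattern: providers imitate the OpenAI chat-completions request shape so that switching is little more than a base-URL change. A minimal sketch of that generic request shape — the endpoint path, header names, and helper are assumptions about the OpenAI-style convention, not DeepSeek’s documented internals:

```python
import json

def chat_request(base_url: str, api_key: str, model: str, prompt: str) -> dict:
    """Assemble an OpenAI-style chat-completions request.

    Providers that copy this API shape differ only in `base_url` (and the
    key), which is what makes customer migration nearly frictionless.
    """
    return {
        "url": f"{base_url}/v1/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }
```

That frictionless migration is a double-edged sword: the same convenience that attracts customers also means API keys and conversation data flow to whichever provider is cheapest that week.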