A new report from Concentric AI highlights how Generative AI (GenAI) is exposing sensitive data at scale, creating serious problems for security teams. The report points out that unstructured data, duplicate files, and risky sharing practices all contribute to this exposure.
GenAI tools, such as Microsoft Copilot, are adding complexity to the data security landscape. In the first half of 2025, Copilot accessed nearly three million sensitive data records per organization. Furthermore, there were over 3,000 user interactions per organization with Copilot, increasing the chances of sensitive data being modified or shared without proper controls. The report warns that shadow GenAI use, where employees rely on unsanctioned tools, adds further risk, as organizations may not even know where their data is going.
Excessive sharing remains a core problem: an average of three million sensitive data records were shared externally, accounting for more than half of all shared files. Financial services firms had the highest rate of external sharing involving sensitive data, at 73 percent. 'Anyone' links, which grant unrestricted access without sign-in, pose a particular risk; in healthcare, a large share of files shared this way contained sensitive data.
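The kind of audit this implies is straightforward to sketch. The following is a minimal, hypothetical illustration (the record fields and link-type labels are assumptions, not taken from the report) of measuring how much 'Anyone'-link sharing exposes sensitive data:

```python
from dataclasses import dataclass

@dataclass
class ShareRecord:
    path: str
    link_type: str      # hypothetical labels: "anyone", "org", "specific"
    is_sensitive: bool

def anyone_link_sensitive_ratio(records: list[ShareRecord]) -> float:
    """Fraction of 'Anyone'-link shares that involve sensitive data."""
    anyone = [r for r in records if r.link_type == "anyone"]
    if not anyone:
        return 0.0
    return sum(r.is_sensitive for r in anyone) / len(anyone)

shares = [
    ShareRecord("/hr/payroll.xlsx", "anyone", True),
    ShareRecord("/mkt/logo.png", "anyone", False),
    ShareRecord("/fin/q3.pdf", "specific", True),
]
print(anyone_link_sensitive_ratio(shares))  # 0.5
```

In practice such records would come from a collaboration platform's sharing API rather than a hand-built list, but the metric is the same.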
Data sprawl drives inefficiency and risk: organizations hold an average of 10 million duplicate data records and 7 million stale records. GenAI data leakage occurs when generative AI systems infer and expose sensitive information through seemingly benign queries. Primary leakage channels include prompt oversharing, vector-store poisoning, model hallucination, and integration drift.
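Of those channels, prompt oversharing is the easiest to screen for. Below is a minimal sketch, not anything from the report, that flags obvious sensitive patterns in a prompt before it is sent to a GenAI tool; the pattern set and names are illustrative assumptions:

```python
import re

# Hypothetical pre-flight check: flag prompts containing obvious
# sensitive strings (card numbers, SSN-like IDs, email addresses).
SENSITIVE_PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def flag_sensitive(prompt: str) -> list[str]:
    """Return the names of sensitive patterns found in a prompt."""
    return [name for name, pat in SENSITIVE_PATTERNS.items()
            if pat.search(prompt)]

print(flag_sensitive("Summarize account 4111 1111 1111 1111 for jane@corp.com"))
# ['credit_card', 'email']
```

Real DSPM products use semantic classification rather than regexes alone, but even this crude gate shows where a prompt-oversharing control would sit in the pipeline.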
The consequences of GenAI data leakage are significant, including financial losses, legal penalties, and reputational damage. A 2024 IBM report found that AI-related data breaches cost organizations an average of $5.2 million, 28% more than conventional breaches. A 2024 Kaspersky study revealed that 67% of employees share internal company data with GenAI tools without authorization. Harmonic Security's analysis found that 22% of files and 4.37% of prompts shared with GenAI tools contained sensitive data.
Concentric AI offers its Semantic Intelligence platform to address these risks. The platform discovers, classifies, and protects sensitive data, including Personally Identifiable Information (PII), Payment Card Information (PCI), and Intellectual Property (IP). It leverages advanced AI and natural language processing to autonomously discover and classify sensitive data across various repositories. The platform also provides data security governance, access management, and data loss prevention. Concentric AI has incorporated new compliance capabilities into its Semantic Intelligence DSPM solution, helping organizations identify and remediate data risks against compliance frameworks such as HIPAA, NIST, and GDPR.
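Mapping discovered data categories onto compliance frameworks is conceptually a lookup. The sketch below is purely illustrative (the category-to-framework mapping is an assumption, not Concentric AI's actual logic), using the frameworks the article names:

```python
# Hypothetical mapping from detected data categories to implicated
# compliance frameworks; real DSPM mappings are far more granular.
FRAMEWORK_MAP = {
    "PII": ["GDPR", "NIST"],
    "PHI": ["HIPAA"],
    "PCI": ["PCI DSS", "NIST"],
}

def frameworks_at_risk(categories: set[str]) -> set[str]:
    """Union of frameworks implicated by the detected data categories."""
    out: set[str] = set()
    for c in categories:
        out.update(FRAMEWORK_MAP.get(c, []))
    return out

print(sorted(frameworks_at_risk({"PII", "PCI"})))  # ['GDPR', 'NIST', 'PCI DSS']
```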
