CRS Report--Generative Artificial Intelligence and Data Privacy: A Primer

"Critics contend that such models rely on privacy-invasive methods for mass data collection, typically without the consent or compensation of the original user, creator, or owner. Additionally, some models may be trained on sensitive data and reveal personal information to users. In a company blog post, Google AI researchers noted, “Because these datasets can be large (hundreds of gigabytes) and pull from a range of sources, they can sometimes contain sensitive data, including personally identifiable information (PII)—names, phone numbers, addresses, etc., even if trained on public data.” Academic and industry research has found that some existing LLMs may reveal sensitive data or personal information from their training datasets."