As generative artificial intelligence advances, data privacy concerns are growing. AI models such as ChatGPT process vast amounts of information instantaneously, raising questions about how much personal data they retain and how easily that data could be exposed.
Dr. Niloofar Mireshghallah, a postdoctoral scholar in computer science at the University of Washington, spoke at the University of Wisconsin-Madison on Monday about growing concerns over data integrity and privacy in generative artificial intelligence.
“Privacy isn’t just an incentive for companies to look good anymore,” Mireshghallah said. “They actually need to build better models. And the consequences of having a data leak is much higher than ever because the data is more intimate than it ever was.”
AI privacy risks and model leakage
Large language models (LLMs) such as ChatGPT can surface personal information absorbed from the vast amounts of training data scraped from billions of online sources, according to a 2023 study. Researchers at Google found that, using targeted keyword prompts, users could extract contact information such as names, phone numbers and email addresses from these models.
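The basic idea can be sketched in a few lines of code. The example below is an illustration of the concept rather than the researchers’ actual method, using the small open-source GPT-2 model through the Hugging Face transformers library as a harmless stand-in and simple text patterns to flag anything resembling contact information.

```python
# Illustration of the extraction idea only, not the study's method:
# prompt a model with a leading phrase and scan the continuation for
# anything that looks like contact information.
import re
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "You can reach our support team at"
output = generator(prompt, max_new_tokens=30, do_sample=True)[0]["generated_text"]

# Very loose patterns for emails and phone numbers, for illustration only.
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", output)
phones = re.findall(r"\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}", output)
print(output)
print("possible emails:", emails, "possible phone numbers:", phones)
```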
“The control we have over what we produce is very little,” Mireshghallah said.
Data scientists use membership inference testing to determine what information a model has memorized from its training data. This data, sourced from articles, books and other written media, teaches the model how to understand and generate language by revealing the relationships between words.
A simple way to test whether a data point was used in a model’s training is to compute a “loss score,” which measures how confidently the model predicts each next word of the text, Mireshghallah said. A lower loss score suggests the model may have memorized the data rather than generating content based on learned patterns.
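In code, the idea looks roughly like the sketch below, which again uses GPT-2 through the Hugging Face transformers library as a stand-in for the commercial models discussed in the talk.

```python
# A rough sketch of the loss-score test: score how "unsurprised" a model is by a text.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def loss_score(text: str) -> float:
    """Average next-word prediction loss; unusually low values hint at memorization."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, labels=inputs["input_ids"])
    return outputs.loss.item()

# A candidate passage with a much lower loss than comparable unseen text
# may have appeared in the training data.
print(loss_score("Call me Ishmael. Some years ago, never mind how long precisely"))
```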
To further assess memorization, researchers use the “area under the curve” (AUC) metric, which measures how reliably an attack can tell training data apart from unseen data. A higher AUC score suggests the model has memorized its training data, making it easier for attackers to identify that data and increasing the risk of leakage.
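The AUC calculation itself can be illustrated with a standard statistics library; the loss values below are made up purely for illustration.

```python
# Given loss scores for texts known to be in the training set ("members") and
# texts that are not ("non-members"), measure how cleanly the loss separates them.
from sklearn.metrics import roc_auc_score

member_losses = [1.2, 0.9, 1.5, 1.1]      # texts the model was trained on (illustrative)
nonmember_losses = [2.8, 3.1, 2.5, 3.4]   # texts it has never seen (illustrative)

# Label members 1 and non-members 0; negate the loss so lower loss means a higher score.
labels = [1] * len(member_losses) + [0] * len(nonmember_losses)
scores = [-l for l in member_losses + nonmember_losses]

print(roc_auc_score(labels, scores))  # values near 1.0 signal heavy memorization
```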
LLMs exhibit recency bias, meaning they are more likely to memorize data seen later in their training, according to Mireshghallah. This is because once a model understands a language, it can soak up new information more quickly.
“It’s harder for me to memorize a poem in Chinese than it is for me to memorize it in English or Farsi because I already know the language. And I think it’s the same thing with models,” Mireshghallah said. “If you have a model that has an idea of the language, it will soak up new data much faster.”
Models struggle with human-like reasoning in increasingly complex scenarios
A 2024 study by Mireshghallah and her colleagues tested whether LLMs were able to mimic human reasoning when deciding whether to disclose secrets. They presented models with questions of increasing complexity, resembling real-world scenarios.
They found that models like ChatGPT and GPT-4 were more likely to disclose private information in scenarios where humans would know to keep a secret, revealing a lack of human-like reasoning that puts privacy at risk.
One failed scenario involved a surprise birthday party, in which the AI was asked to generate to-do lists for the fictional people involved. The model inadvertently tipped off the guest of honor, telling them to “remember to attend.”
Mireshghallah also cited a Secret Santa scenario in which an OpenAI model assigned each person a secret gift to buy, only to then generate an email telling everyone what gift they were buying for whom while simultaneously reminding them to keep the gifts secret for the surprise.
“The model clearly understands that this is a surprise, but it doesn’t understand what a surprise is. That’s why it revealed everything. The model as its reasoning keeps telling itself, ‘Okay, I should keep this secret,’ but in an autoregressive way, it commits to revealing it,” said Mireshghallah.
Autoregressive models generate text one word at a time, conditioning each new word on everything produced so far. Once such a model begins writing a secret out, it commits to revealing it, a gap in AI’s ability to apply knowledge in a human-like way.
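The sketch below illustrates that one-word-at-a-time loop with GPT-2; it is a simplification of how production models generate text.

```python
# Simplified autoregressive generation: pick the most likely next token given
# everything written so far, append it, and repeat. Earlier choices are never revisited.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def generate(prompt: str, max_new_tokens: int = 20) -> str:
    ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = model(ids).logits
        next_id = logits[0, -1].argmax().reshape(1, 1)  # greedy choice of the next token
        ids = torch.cat([ids, next_id], dim=1)          # commit to it and continue
    return tokenizer.decode(ids[0])

print(generate("To keep the surprise party a secret, remember to"))
```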
What the future holds
At the end of her talk, Mireshghallah cited her most recent research on measuring linguistic creativity in AI models. The main challenge, she said, is making AI “more novel.”
Using a creativity index to compare human-written and LLM-generated text, the researchers found that LLMs patched together data from their training sets rather than producing truly original content, a hallmark of human creativity.
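The study’s own metric is more sophisticated, but the underlying intuition can be illustrated with a toy overlap measure: how much of a text can be stitched together from word sequences that already appear somewhere else.

```python
# A toy stand-in for a creativity index (not the study's exact metric): the share of
# a text's five-word phrases that appear verbatim in a reference corpus.
# Higher coverage means less novel text.
def ngram_coverage(text: str, corpus: str, n: int = 5) -> float:
    words = text.lower().split()
    corpus_words = corpus.lower().split()
    corpus_ngrams = {tuple(corpus_words[i:i + n]) for i in range(len(corpus_words) - n + 1)}
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    hits = sum(1 for g in ngrams if g in corpus_ngrams)
    return hits / len(ngrams)

reference = "the quick brown fox jumps over the lazy dog near the quiet river bank"
generated = "the quick brown fox jumps over the lazy dog and runs away"
print(ngram_coverage(generated, reference))  # fraction of phrases copied verbatim
```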
“Humans are novel. If I ask you to imitate an existing piece of art, you would never imitate it exactly as it is,” Mireshghallah said. “And that’s because of the randomness we’ve seen in our backgrounds and where we come from.”
By introducing randomness into AI model training, she argued, LLMs could become more creative and more human-like, and in turn, more adept at safeguarding information.
Mireshghallah wrapped up her talk by calling for collaboration between users and policymakers to improve the security of LLMs, for example by adding random pieces of data to training sets to create more diverse outputs and by improving human-computer interaction so models can better understand the kinds of privacy people expect.