As generative artificial intelligence advances, data privacy concerns are growing. AI models such as ChatGPT process vast amounts of information instantaneously, raising questions about how much personal data they retain and how easily that data could be exposed.
Dr. Niloofar Mireshghallah, a postdoctoral scholar in computer science at the University of Washington, spoke at the University of Wisconsin-Madison on Monday about growing concerns over data integrity and privacy in generative artificial intelligence.
“Privacy isn’t just an incentive for companies to look good anymore,” Mireshghallah said. “They actually need to build better models. And the consequences of having a data leak is much higher than ever because the data is more intimate than it ever was.”
AI privacy risks and model leakage
Large language models (LLMs) such as ChatGPT can surface personal information absorbed from the vast amounts of training data scraped from billions of online sources, according to a 2023 study. Researchers at Google found that, using targeted keyword prompts, users could extract contact information such as names, phone numbers and email addresses from these models.
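The basic idea can be sketched in a few lines of code. The example below is an illustration of the concept rather than the researchers’ actual method, using the small open-source GPT-2 model through the Hugging Face transformers library as a harmless stand-in and simple text patterns to flag anything resembling contact information.

```python
# Illustration of the extraction idea only, not the study's method:
# prompt a model with a leading phrase and scan the continuation for
# anything that looks like contact information.
import re
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "You can reach our support team at"
output = generator(prompt, max_new_tokens=30, do_sample=True)[0]["generated_text"]

# Very loose patterns for emails and phone numbers, for illustration only.
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", output)
phones = re.findall(r"\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}", output)
print(output)
print("possible emails:", emails, "possible phone numbers:", phones)
```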
“The control we have over what we produce is very little,” Mireshghallah said.
Data scientists use membership inference testing to determine what information a model has memorized from its training data. This data, sourced from articles, books and other written media, teaches the model how to understand and generate language by revealing the relationships between words.
A simple way to test whether a data point was used in a model’s training is to compute a “loss score,” which measures how confidently the model predicts each next word of the text, Mireshghallah said. A lower loss score suggests the model may have memorized the data rather than generating content based on learned patterns.
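In code, the idea looks roughly like the sketch below, which again uses GPT-2 through the Hugging Face transformers library as a stand-in for the commercial models discussed in the talk.

```python
# A rough sketch of the loss-score test: score how "unsurprised" a model is by a text.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def loss_score(text: str) -> float:
    """Average next-word prediction loss; unusually low values hint at memorization."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, labels=inputs["input_ids"])
    return outputs.loss.item()

# A candidate passage with a much lower loss than comparable unseen text
# may have appeared in the training data.
print(loss_score("Call me Ishmael. Some years ago, never mind how long precisely"))
```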
To further assess memorization, researchers use the “area under the curve” (AUC) metric, which measures how reliably an attack can tell training data apart from unseen data. A higher AUC score suggests the model has memorized its training data, making it easier for attackers to identify that data and increasing the risk of leakage.
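The AUC calculation itself can be illustrated with a standard statistics library; the loss values below are made up purely for illustration.

```python
# Given loss scores for texts known to be in the training set ("members") and
# texts that are not ("non-members"), measure how cleanly the loss separates them.
from sklearn.metrics import roc_auc_score

member_losses = [1.2, 0.9, 1.5, 1.1]      # texts the model was trained on (illustrative)
nonmember_losses = [2.8, 3.1, 2.5, 3.4]   # texts it has never seen (illustrative)

# Label members 1 and non-members 0; negate the loss so lower loss means a higher score.
labels = [1] * len(member_losses) + [0] * len(nonmember_losses)
scores = [-l for l in member_losses + nonmember_losses]

print(roc_auc_score(labels, scores))  # values near 1.0 signal heavy memorization
```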
LLMs exhibit recency bias, meaning they are more likely to memorize data seen later in their training, according to Mireshghallah. This is because once a model understands a language, it can soak up new information more quickly.
“It’s harder for me to memorize a poem in Chinese than it is for me to memorize it in English or Farsi because I already know the language. And I think it’s the same thing with models,” Mireshghallah said. “If you have a model that has an idea of the language, it will soak up new data much faster.”
Models struggle with human-like reasoning in increasingly complex scenarios
A 2024 study by Mireshghallah and her colleagues tested whether LLMs were able to mimic human reasoning when deciding whether to disclose secrets. They presented models with questions of increasing complexity, resembling real-world scenarios.
They found that models like ChatGPT and GPT-4 were more likely to disclose private information in scenarios where humans would know to keep a secret, revealing a lack of human-like reasoning that puts privacy at risk.
One failed scenario involved a surprise birthday party, in which the AI was asked to generate to-do lists for the fictional people involved. The model inadvertently tipped off the guest of honor, telling them to “remember to attend.”
Mireshghallah also cited a Secret Santa scenario in which an OpenAI model assigned each person a secret gift to buy, only to then generate an email telling everyone what gift they were buying for whom while simultaneously reminding them to keep the gifts secret for the surprise.
“The model clearly understands that this is a surprise, but it doesn’t understand what a surprise is. That’s why it revealed everything. The model as its reasoning keeps telling itself, ‘Okay, I should keep this secret,’ but in an autoregressive way, it commits to revealing it,” said Mireshghallah.
Autoregressive models generate text one word at a time, conditioning each new word on everything produced so far. Once such a model begins writing a secret out, it commits to revealing it, a gap in AI’s ability to apply knowledge in a human-like way.
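The sketch below illustrates that one-word-at-a-time loop with GPT-2; it is a simplification of how production models generate text.

```python
# Simplified autoregressive generation: pick the most likely next token given
# everything written so far, append it, and repeat. Earlier choices are never revisited.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def generate(prompt: str, max_new_tokens: int = 20) -> str:
    ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = model(ids).logits
        next_id = logits[0, -1].argmax().reshape(1, 1)  # greedy choice of the next token
        ids = torch.cat([ids, next_id], dim=1)          # commit to it and continue
    return tokenizer.decode(ids[0])

print(generate("To keep the surprise party a secret, remember to"))
```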
What the future holds
At the end of her talk, Mireshghallah cited her most recent research on measuring linguistic creativity in AI models. The main challenge, she said, is making AI “more novel.”
Using a creativity index to compare human-written and LLM-generated text, the researchers found that LLMs patched together data from their training sets rather than producing truly original content, a hallmark of human creativity.
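The study’s own metric is more sophisticated, but the underlying intuition can be illustrated with a toy overlap measure: how much of a text can be stitched together from word sequences that already appear somewhere else.

```python
# A toy stand-in for a creativity index (not the study's exact metric): the share of
# a text's five-word phrases that appear verbatim in a reference corpus.
# Higher coverage means less novel text.
def ngram_coverage(text: str, corpus: str, n: int = 5) -> float:
    words = text.lower().split()
    corpus_words = corpus.lower().split()
    corpus_ngrams = {tuple(corpus_words[i:i + n]) for i in range(len(corpus_words) - n + 1)}
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    hits = sum(1 for g in ngrams if g in corpus_ngrams)
    return hits / len(ngrams)

reference = "the quick brown fox jumps over the lazy dog near the quiet river bank"
generated = "the quick brown fox jumps over the lazy dog and runs away"
print(ngram_coverage(generated, reference))  # fraction of phrases copied verbatim
```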
“Humans are novel. If I ask you to imitate an existing piece of art, you would never imitate it exactly as it is,” Mireshghallah said. “And that’s because of the randomness we’ve seen in our backgrounds and where we come from.”
By introducing randomness into AI model training, she argued, LLMs could become more creative and more human-like, and in turn, more adept at safeguarding information.
Mireshghallah wrapped up her talk by calling for collaboration between users and policymakers to improve the security of LLMs, for example by adding random pieces of data to training sets to create more diverse outputs and by improving human-computer interaction so models can better understand the kinds of privacy people expect.