The integration of Large Language Models (LLM) with other recovery-based applications (so-called application-embedded LLMs) can introduce new attack vectors; adversaries can now attempt to indirectly inject LLMs with prompts placed in publicly accessible sources. Credit: arXiv (2023). DOI: 10.48550/arxiv.2302.12173
ChatGPT’s explosive growth has been impressive. Just two months after its introduction last fall, 100 million users had taken advantage of the AI chatbot’s ability to engage in pranks, discuss politics, generate compelling essays and write poetry.
“In the 20 years since the Internet space, we cannot recall a faster ramp-up in a consumer Internet application,” analysts at investment bank UBS declared earlier this year.
That’s good news for programmers, hobbyists, commercial interests, consumers, and members of the general public, all of whom can reap immeasurable benefits from enhanced transactions powered by AI brainpower.
But the bad news is that whenever there is a breakthrough in technology, scammers are not far behind.
A new study, published on the preprint server arXivdiscovered that AI chatbots can be easily hijacked and used to retrieve sensitive user information.
Researchers at Saarland University’s CISPA Helmholtz Center for Information Security reported last month that hackers can employ a procedure called indirect flash injection to surreptitiously insert malicious components into a user-chatbot exchange.
Chatbots use Large Language Model (LLM) algorithms to detect, summarize, translate, and predict text sequences based on massive data sets. LLMs are popular in part because they use natural language prompts. But that feature, warns Saarland researcher Kai Greshake, “could also make them susceptible to specific adversarial indications.”
Greshake explained that it could work like this: A hacker slips a message into a zero-point, ie invisible, source on a web page that the chatbot will likely use to answer a user’s question. Once that “poisoned” page is retrieved in a conversation with the user, the flag is silently activated without the need for the user to enter any further information.
Greshake said that Bing Chat was able to obtain personal financial details from a user by engaging in an interaction that led the bot to access a page with a hidden notice. The chatbot posed as a Microsoft Surface Laptop seller offering discounted models. The bot was then able to obtain email IDs and financial information from the unsuspecting user.
University researchers also discovered that the Bing Chatbot can view content in a browser’s open tab pages, broadening the scope of its potential for malicious activity.
The Saarland University article is appropriately titled “More Than You Asked For.”
Greshake warned that the growing popularity of LLMs ensures more trouble is on the way.
In response to a discussion of his team’s report on the Hacker News Forum, Greshake said: “Even if you can mitigate this specific injection, this is a much bigger problem. It goes back to the injection itself: what is instruction and what is code”. ? If you want to extract useful information from a text in an intelligent and useful way, you will have to process it.”
Greshake and his team said that given the potential for rapidly spreading scams, there is an urgent need for “further investigation” of such vulnerabilities.
For now, chatbot users are advised to exercise the same caution they would for any online transaction involving personal information and financial transactions.
More information:
Kai Greshake et al, More Than You Asked For: A Comprehensive Analysis of New Rapid Injection Threats for Large Language Models Embedded in Applications, arXiv (2023). DOI: 10.48550/arxiv.2302.12173
arXiv
© 2023 Science X Network
Citation: ‘Indirect Prompt Upend’ Attacks Could Upend Chatbots (March 9, 2023) Accessed March 10, 2023 at https://techxplore.com/news/2023-03-indirect-prompt-upend-chatbots. html
This document is subject to copyright. Apart from any fair dealing for private study or research purposes, no part may be reproduced without written permission. The content is provided for informational purposes only.