Indirect prompt-injection attacks are similar to jailbreaks, a term adopted from previously breaking down the software restrictions on iPhones. Instead of someone inserting a prompt into ChatGPT or Bing to try and make it behave in a different way, indirect attacks rely on data being entered from elsewhere. This could be from a website you’ve connected the model to or a document being uploaded.
“Prompt injection is easier to exploit or has less requirements to be successfully exploited than other” types of attacks against machine learning or AI systems, says Jose Selvi, executive principal security consultant at cybersecurity firm NCC Group. As prompts only require natural language, attacks can require less technical skill to pull off, Selvi says.
….If a chatbot is set up to answer questions about information stored in a database, it could cause problems, he says. “Prompt injection provides a way for users to override the developer’s instructions.” This could, in theory at least, mean the user could delete information from the database or change information that’s included.
Read more:
Burgess, M. (2023, May 25). The Security Hole at the Heart of ChatGPT and Bing. Wired. https://www.wired.com/story/chatgpt-prompt-injection-attack-security/