Appendix D. Context Manipulation Attacks

As discussed in Chapter 7, “Weaponizing Social Intelligence,” the following simple example of Python code, leveraging the OpenAI API with the GPT-3.5-turbo model, shows how system instructions can be used to inform the AI assistant of how it is intended to operate and how an initial question can be used to guide the conversation in a specific direction:

import openai

openai.api_key = ''  # Add OpenAI API key here

# System instructions that establish the assistant's operating context
role = "You are an AI assistant."

# Initial assistant message used to guide the conversation
initial_msg = "What would you like to chat about?"

convo = [{"role": "system", "content": role},
         {"role": "assistant", "content": initial_msg}]

# Conversation start
print(f'BOT: {initial_msg}')
user_reply = input('\n\nYOU: ')
convo.append({"role": "user", "content": user_reply})

while True:
    # Send the full conversation history so the model retains context
    r = openai.ChatCompletion.create(model="gpt-3.5-turbo",
                                     messages=convo)
    bot_reply = r['choices'][0]['message']['content']
    convo.append({"role": "assistant", "content": bot_reply})
    print(f'\n\nBOT: {bot_reply}')
    user_reply = input('\n\nYOU: ')
    convo.append({"role": "user", "content": user_reply})

This example establishes a highly generic context, with very few restrictions in place. Interactions with the system illustrate how the established context informs its answers to the questions presented to it. When asked what it is, the bot informs the user that it is “an artificial intelligence designed to assist and communicate with users through text-based conversations.”

BOT: What would you like to chat about?
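To see how strongly the system message shapes such answers, consider a variant of the same script with a more restrictive context. The following is a minimal sketch that assumes the same legacy openai Python library (v0.x) used above; the company name and restrictive prompt text are hypothetical and serve only to illustrate how a modified context constrains the assistant's responses:

import openai

openai.api_key = ''  # Add OpenAI API key here

# Hypothetical, more restrictive system context (for illustration only)
role = ("You are a customer support assistant for Acme Corp. "
        "Only answer questions about Acme products. "
        "Politely decline to discuss any other topic.")

initial_msg = "How can I help you with your Acme product today?"

convo = [{"role": "system", "content": role},
         {"role": "assistant", "content": initial_msg}]

print(f'BOT: {initial_msg}')
while True:
    user_reply = input('\n\nYOU: ')
    convo.append({"role": "user", "content": user_reply})
    r = openai.ChatCompletion.create(model="gpt-3.5-turbo",
                                     messages=convo)
    bot_reply = r['choices'][0]['message']['content']
    convo.append({"role": "assistant", "content": bot_reply})
    print(f'\n\nBOT: {bot_reply}')

Asked the same “What are you?” question, a context like this will typically yield an answer scoped to the support role rather than the generic self-description above, underscoring that the system message, not the model itself, defines how the assistant presents itself and what it is willing to discuss.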
