About two weeks ago, we launched our research project and text-based AI (sandbox) escape game Doublespeak.chat. We give the OpenAI’s Large Language Model (LLM, A.K.A. ChatGPT) a secret to keep: its name. The player’s goal is to extract that secret name. We believe we'll never win the cat-and-mouse game, but we can all have fun trying!
We had a simple idea: we prime an LLM (Large Language Model), in this case ChatGPT, with a secret and a scenario in a pre-prompt hidden from the player. The player's goal is to discover the secret either by playing along or by hacking the conversation to guide the LLM's behavior outside the anticipated parameters.