AI prompting is mostly used for chatbots, but with careful prompt programming it can also drive interactive applications. When ChatGPT is given the context as structured data (for example, JSON), it can resolve entities that the user identified only ambiguously, and hand your program structured data in return.

To illustrate the concept, I have written a small “interactive fiction” game, in which the user is presented with text describing a fantasy environment that can be navigated and interacted with by issuing commands such as “go west” or “eat the apple”.

Describing objects with basic definitions

The developer only needs to provide basic definitions of the objects in the game, and can rely on prompt programming both to render them into a coherent description and to parse complex requests from the user.

This is how an object and a room are described in the game data:

const RustySword: IFObject = addObject(
{ name: "Rusty Sword",
  description: "An ancient looking sword",
  extra: ["seen better days", "unsharpened"] })
const Entrance: IFRoom = addRoom(
{ name: "The Entrance",
  description: "the outside of a large construction, in front of a stone gate",
  extra: ["wild", "immerse in the forest", "ancient"],
  e: "Atrium",
  objects: [RustySword] })
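
The article does not show the definitions of IFObject, IFRoom, addObject or addRoom. As a minimal sketch of what they might look like, assuming a simple in-memory registry keyed by name (the `Direction` type and the two registry maps are hypothetical, introduced only for illustration):

```typescript
// Hypothetical shapes: the real definitions are not shown in this article.
type Direction = "n" | "e" | "s" | "w" | "nw" | "ne" | "sw" | "se";

interface IFObject {
  name: string;
  description: string;
  extra?: string[];
}

// A room has the same basic fields, plus optional exits that map a
// direction to the name of the destination room, and a list of objects.
type IFRoom = {
  name: string;
  description: string;
  extra?: string[];
  objects?: IFObject[];
} & Partial<Record<Direction, string>>;

// Assumed in-memory registries, keyed by name.
const objectRegistry = new Map<string, IFObject>();
const roomRegistry = new Map<string, IFRoom>();

// Register an object and return it, so it can be referenced from rooms.
function addObject(obj: IFObject): IFObject {
  objectRegistry.set(obj.name, obj);
  return obj;
}

// Register a room and return it.
function addRoom(room: IFRoom): IFRoom {
  roomRegistry.set(room.name, room);
  return room;
}
```

Returning the registered value is what lets a declaration such as RustySword be reused directly in the room's objects list.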

The code is pretty self-explanatory, and the prompt we give ChatGPT to tell it how to render the room makes the meaning of each field explicit:

You are an interactive fiction game.
The user will provide you with json data about the current room the character
is in:
* Generic description (in the field 'description').
* Additional attributes of the room, usually a short list of adjectives,
  in the field 'extra'.
* Exits, in the fields 'n' for north, 'e' for east,
  's' for south, 'w' for west, 'nw' for north-west, 'ne' for north-east,
  'sw' for south-west and 'se' for south-east; the value will be a short
  description of where directions lead to.
* A list of objects contained in the room, each of which will have the
  attributes 'name', 'description' and 'extra'; in the room description,
  you only have to include the name and optionally render the 'extra'
  attributes.
You can pick and choose which of the elements in the 'extra' field for
rooms and objects you want to render each time.
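
To make the flow concrete, here is a rough sketch of how the prompt and the room data could be paired into a request. It assumes the role/content message shape used by chat-style completion APIs; `ChatMessage` and `buildRoomMessages` are hypothetical names, and the actual wiring in the game may differ:

```typescript
// A chat message: the prompt goes in as "system", the game data as "user".
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Pair the rendering prompt with the room data serialised as JSON.
// The room is passed as a plain object and sent as-is.
function buildRoomMessages(systemPrompt: string, room: object): ChatMessage[] {
  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: JSON.stringify(room) },
  ];
}
```

The resulting array is what would be sent to the model; its text reply is the room description shown to the player.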

And the result looks like:

You are standing outside a large, wild construction immersed in the forest. The edifice, worn by the ticking of the ages, rests before you. Its imposing stone gate stands robust, tallying its grandeur. This is the mythical entrance, challenging and intriguing.

To the east, a short journey will lead you to the Atrium. Your gaze moves to the ground and you spot something unusual. A Rusty Sword lies abandoned. It is an ancient looking sword, unsharpened and clearly having seen better days. Could it be of any use? Well, that is for you to discern.

As a side note, I love the way ChatGPT decided to render the information I provided in the extra fields. You can control how verbose or stylish the rendering is by adding requirements for more flourished or more succinct output to the prompt.

Parsing user commands

You can leverage the power of ChatGPT to break down user commands into more structured requests that can drive the logic of your program.

This prompt categorises the actions that the user wants to perform and recognises the entities on which they are performed, returning a structured object that can be analysed programmatically:

You are an interactive fiction game. The user will provide you with a
json object, with the input stored in the field 'command'.
Reply with a json object containing:
* The 'action' the user wants to perform. It can be:
  * 'move' if the user types a directional command such as north, n,
    southeast, sw etc.
  * 'examine' if the user wants more information about a certain object,
    the room they are in or themselves.
  * 'use' if the user wants to use a specific object.
  * 'take' if the user wants to pick up an object from the room.
  * 'drop' if the user wants to drop an object in the player's inventory.
  * 'unknown' in other cases.
* The optional 'object' on which the action is applied.
  - If the action refers to the player themselves, set this field to "player".
  - If the action refers to the current location, set this field to "room".
  - If the action is "move", set this field to the direction the user wants
    to go (one or two letters, e.g. 'n' for north, 'sw' for south-west etc.)
  - If the action is applied on an object, set this field to the named object.
* The optional 'target', which is the additional ultimate target acted upon,
  if any.
* An optional 'error' field containing a concise explanation of why
  the action is misunderstood (e.g. a misspelling), or cannot be
  completed (e.g. the named object is not part of the given context).
  This includes actions with ambiguous or non-existent objects or targets.
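
Since the reply is produced by a language model, it is prudent to validate it before acting on it. The following is a minimal sketch, assuming the reply arrives as a raw JSON string; `ParsedCommand` and `parseCommandReply` are hypothetical names, and the fallback messages are my own:

```typescript
// The action categories listed in the prompt above.
type Action = "move" | "examine" | "use" | "take" | "drop" | "unknown";

interface ParsedCommand {
  action: Action;
  object?: string;
  target?: string;
  error?: string;
}

const ACTIONS: readonly string[] = ["move", "examine", "use", "take", "drop", "unknown"];

// Turn the model's raw reply into a typed command, degrading to
// 'unknown' whenever the reply does not have the expected shape.
function parseCommandReply(reply: string): ParsedCommand {
  let data: unknown;
  try {
    data = JSON.parse(reply);
  } catch {
    return { action: "unknown", error: "The reply was not valid JSON." };
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return { action: "unknown", error: "The reply was not a JSON object." };
  }
  const fields = data as Record<string, unknown>;
  const action =
    typeof fields.action === "string" && ACTIONS.includes(fields.action)
      ? (fields.action as Action)
      : "unknown";
  const result: ParsedCommand = { action };
  // Copy the optional fields only when they are strings.
  for (const key of ["object", "target", "error"] as const) {
    const value = fields[key];
    if (typeof value === "string") {
      result[key] = value;
    }
  }
  return result;
}
```

Anything outside the expected categories falls back to 'unknown', so the rest of the program only ever has to handle the six actions named in the prompt.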

The elegance of this method is that it lets us break down the user's actions into categories, and it gives us a coherent message that we can return to the user directly when the category is not recognized. For example:

Your command ‘blurb the fuzz and get on with it now!’ is not recognized. Please try to use different verbs or check your spelling.
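
As a rough sketch of how such a structured reply could drive the program logic, the dispatcher below hands the model's own error message straight back to the player and otherwise routes the command by category. The `Handler` type, the `dispatch` function and the fallback sentence are hypothetical, not taken from the game's source:

```typescript
type Action = "move" | "examine" | "use" | "take" | "drop" | "unknown";

interface ParsedCommand {
  action: Action;
  object?: string;
  target?: string;
  error?: string;
}

// A handler receives the parsed command and returns the text to display.
type Handler = (cmd: ParsedCommand) => string;

function dispatch(
  cmd: ParsedCommand,
  handlers: Partial<Record<Action, Handler>>
): string {
  // The 'error' field is already phrased for the player: show it as-is.
  if (cmd.error) {
    return cmd.error;
  }
  const handler = handlers[cmd.action];
  if (!handler) {
    // Assumed generic fallback for categories without a handler.
    return "Sorry, I don't know how to do that.";
  }
  return handler(cmd);
}
```

With this shape, supporting a new category means adding it to the prompt and registering one more handler.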

Disambiguating target items

Once the command is broken down, we can try to match the objects the user referenced against what we know about the world.

We can leverage the power of ChatGPT to identify even vaguely described objects. So, let’s say that the user enters the command:

> tell me more about that sparkly thing

Now, the player happens to have this object in their inventory:

const Ring: IFObject = addObject({
  name: "Golden Ring",
  description: "an ornate ring",
  extra: ["made of pure gold", "precious", "encrusted with diamonds"] })

We could infer that, being made of gold and encrusted with diamonds, this may be the sparkly thing the player is referring to. We can program the prompt to return exactly this object:

You are an interactive fiction game, and I need you to disambiguate the
object the player is referring to. The user data is a json object with one
field called "referred" which is the object named by the user, and a
structure "player" that contains:
- an inventory, with the set of objects the player is carrying,
- a room with a field called 'objects' that contains the list of objects
  that are in the room.
Every object has a field 'name' and possibly a list of 'extra',
containing additional information about that object.
I want you to return a json object with a single 'object' field; if you
can disambiguate the input, its value must be the exact 'name' field
of the object the user refers to; otherwise, return an empty json object.
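
The user data described by this prompt could be assembled along these lines. This is a sketch under assumptions: `GameObject`, `Candidate` and `buildDisambiguationRequest` are hypothetical names, and sending only 'name' and 'extra' (the two fields the prompt mentions) is my choice rather than something the article states:

```typescript
interface GameObject {
  name: string;
  description: string;
  extra?: string[];
}

// What the model sees for each object: its name and, if present, its extras.
interface Candidate {
  name: string;
  extra?: string[];
}

interface DisambiguationRequest {
  referred: string;
  player: {
    inventory: Candidate[];
    room: { objects: Candidate[] };
  };
}

function toCandidate(obj: GameObject): Candidate {
  return obj.extra ? { name: obj.name, extra: obj.extra } : { name: obj.name };
}

// Build the JSON structure the disambiguation prompt expects.
function buildDisambiguationRequest(
  referred: string,
  inventory: GameObject[],
  roomObjects: GameObject[]
): DisambiguationRequest {
  return {
    referred,
    player: {
      inventory: inventory.map(toCandidate),
      room: { objects: roomObjects.map(toCandidate) },
    },
  };
}
```

Serialising the result with JSON.stringify gives the user message to send alongside the prompt.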

The trace of the operations fired when we parse this command is:

[Log] {"action": "examine", "object": "sparkly thing"} (bundle.js, line 193)
[Log] Disambiguating "sparkly thing" (bundle.js, line 341)
[Log] chatbot returned {"object":"Golden Ring"} (bundle.js, line 348)
[Log] Disambiguated "sparkly thing" as {"name":"Golden Ring","description":"an ornate ring","extra":["made of pure gold","precious","encrusted with diamonds"]} (bundle.js, line 354)

On the first line we can see that ChatGPT was able to break down the sentence and correctly categorize the request “tell me more about…” as an “examine”. Then, by providing ChatGPT with the player object, which has the golden ring in its inventory field, we get the correct object identifier. We can then search for that identifier in the game database, retrieve the correct object, and finally pass it to the text generator:

The player is currently in possession of a Golden Ring which is visually compelling. At first glance, it takes the form of an ornate ring, exquisitely fashioned and unerringly eye-catching. Further examination of this artifact reveals that it has been sculpted from pure gold, denoting its intrinsic value. The ring is identified to be precious, an aspect that intensifies its overall allure. Adding to its splendid aesthetic are diamonds tucked into its body, their brilliant shimmer contributing to the ring’s overall majestic demeanor. The Golden Ring appears to hold immense worth, both in value and beauty.

This prompt can also detect missing objects; the sequence after the command:

> eat the apple

is as follows:

[Log] {"action": "use", "object": "apple"} (bundle.js, line 193)
[Log] Disambiguating "apple" (bundle.js, line 341)
[Log] chatbot returned {} (bundle.js, line 348)
[Log] Can't disambiguate "apple" ({}): What do you mean with apple? (bundle.js, line 357)
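
Both traces follow the same resolution step: take the name the model returned and look it up among the known objects. A minimal sketch of that step, assuming a linear search over the candidate objects (`resolveObject` is a hypothetical name, and the real lookup in the game may work differently):

```typescript
interface GameObject {
  name: string;
  description: string;
  extra?: string[];
}

// Map the model's reply back to a game object. Returns undefined when the
// reply is empty ({}), malformed, or names an object we do not know about.
function resolveObject(
  reply: string,
  candidates: GameObject[]
): GameObject | undefined {
  let data: unknown;
  try {
    data = JSON.parse(reply);
  } catch {
    return undefined;
  }
  if (typeof data !== "object" || data === null) {
    return undefined;
  }
  const name = (data as Record<string, unknown>).object;
  if (typeof name !== "string") {
    return undefined;
  }
  // The prompt asks for the exact 'name' field, so match it exactly.
  return candidates.find((candidate) => candidate.name === name);
}
```

Matching the name against our own data, rather than trusting it blindly, also covers the case where the model returns a name that is not in the given context.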

Note how ChatGPT was able to capture the relationships between objects and to generate descriptions for the room contents and, especially, for the items the user is carrying in their inventory (it even noticed that the ring was a ‘single item’ in the player’s inventory), all without explicit prompting.

Prompt Optimisation

The prompts presented here are relatively heavyweight; I purposely made them verbose so that readers of this article get a bit more context about what is going on behind the scenes.

As OpenAI API usage is billed per token (input + output), for real-world applications you’ll want to tune the prompts and trim them to the bare minimum.
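
The JSON data sent with each request counts as input too. One complementary trick, not taken from the game's source, is to serialise it without indentation and to drop fields that carry no information. A small sketch, where `compactJson` is a hypothetical helper:

```typescript
// Serialise game data compactly: no indentation, and no properties whose
// value is null or an empty array. The root value is always kept.
// Caveat: an omitted entry inside an array is emitted as null, which is
// how JSON.stringify treats array elements.
function compactJson(value: object): string {
  return JSON.stringify(value, (key, v) => {
    if (key === "") {
      return v; // the root value itself
    }
    if (v === null) {
      return undefined; // omit null properties
    }
    if (Array.isArray(v) && v.length === 0) {
      return undefined; // omit empty lists
    }
    return v;
  });
}
```

Comparing the length of this output with that of an indented JSON.stringify(value, null, 2) gives a quick, if rough, idea of the saving before counting actual tokens.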

A technique I found effective for testing, debugging and optimising the prompts is to paste them into the free ChatGPT web interface and try them against raw JSON input to see how ChatGPT responds.

The project

At the moment, this is just a demo I wrote to show the potential of programmatic AI prompting, but as I have been an interactive fiction writer in the past, I plan to extend this project into the base for an AI-based interactive fiction engine. If you are interested, you can follow the project on GitHub: https://github.com/jonnymind/AIF#readme

Conclusions

Programmatic AI prompting is still in its early days, but its potential is already visible. With the growing number of AI APIs on offer, we can expect a concrete reduction in the time-to-market of highly professional and versatile applications: consider that this little game took only about half a day of coding.

Original Article on Medium