DnD-like game using GOAP

Emmanuele Villa

Emmanuele Villa

Developed for the AI4vg exam @ UniMi: a turn-based tactical rpg played on a grid map with a GOAP AI.

This post is about a university project that I developed for the “Artificial intelligence for videogames” exam and it’ll be a tl;dr of the full relation that you can find along with the code in my github repository.

The map

Before thinking about which data structure will represents our map in the code, we’ll ”develop” a
simple map editor using the best map editor of all time: Google sheet.
Google sheet isn’t only super fast and simple to use, but it can also help us visualize the map using the conditional formatting on the cells.
After creating the map on Google sheet, we’ll download it in a csv format (but saving the file in .txt because Unity3d’s TextAsset inspector field only accept *.txt files) and we’ll have it ready for
parsing.
In the next image, you can see an example of a map created in this way and fed to the Core library:

Each cell contains a string with two chars, the first one representing the terrain type (G=grass, R=river, S=stone) and the second one representing the cell’s height.

By using conditional format-
ting with colors, we can clearly visualize both values at glance

The map in the game will be drawn using a free set of isometric tiles on the Unity Asset Store, named 2D Pixel Art – Isometric Blocks.

There are a lot of rules and unit tests related to the movement and the validity of actions that may or may not be executed by a particular creature based on their position and status, which I will not dwell into here, but you can find every detail in the full pdf.

There are a lot of design patterns and algorithms used, such as BFS search for calculating the creature movement, a (sort of) chain of responsibility to build the list of available actions and so on.

GOAP

Goal Oriented Action Planning refers to an AI algorithm that aims to optimize the execution of multiple available actions in an order and in a combination such that it optimize the achievement of a goal.

For this project I’ve drafted a list of 4 goals:

  • Reduce enemy health points: this is the main priority of every D&D battle, the sooner the enemy dies the best it is. Reality is a bit more complicated than that, because a human player in a real d&d game may want to save their best skills for another battle. In this algorithm we assume that this is the last battle of the day.
  • Increase ally hit points: if a creature is not able to harm another or if an ally it’s very injured the second goal comes into play to heal it. This goal has also an opposite face as “don’t reduce ally hit points” that comes into play for avoiding attack of opportunities and such
  • Buff an ally: some classes have spells that buffs or bonus action abilities that buffs an ally. This goal comes into play in that occasions
  • Ranged/Melee position: this is a watershed goal: if I have more course of actions that leads to the same GOAP score, I can then chose one or the other based on the character position. A ranged character wants to be as far as possible from the enemies, while a melee creature wants to get near the enemy

Every one of these goals have a dynamic insistence based on the creature knowledge. For example the “Increase ally hit points” insistence increases inversely to allies’ total hit points

The algorithm

When a creature takes its turn, this is what should happen theoretically:

  • I calculate the list of available action
  • For each action, I execute it in a new copy of the game state
  • In each copy, I then calculate the list again and execute the second action in a second-level copy of the game state
  • I go on until I’ve exhausted all the actions in all the new game states
  • I then calculate the goals fulfillment in each of them and I choose the one to execute

This is the GOAP algorithm in a nutshell, but unfortunately there’s a major problem in this approach: performance.

Let’s take a level 5 monk as an example: they can move up to 9 cells, attack 3 times, use up to 4 skills using ki points, including the possibility of attacking an additional time. In D&D 5e you can split the movement as you wish, and you can see that 8 actions split amongst 81 reachable cells creates a ridiculous amount of combinations in the order of the millions.

Code optimization

Since the main algorithm is a search-based algorithm, the best way to optimize it is by branch pruning, aka avoid visiting a branch that I can already know it won’t achieve my goals better than others.

The first way to do this is by applying priorities to the actions. After I build the available action list I only execute the ones with the higher priority:

We define the following priorities:
Cast a spell: priority 5
Attack an enemy: priority 4
Use a special ability: Priority 3
Move: Priority 2
Fallback actions: Priority 1

To apply this check we need to swap from a BFS to a DFS, so we’ll have the result of the high-priority actions sooner. In C# this is very easy, it only means we replace the Queue with a Stack.

Another big optimization is allow the creatures to only move at the beginning and at the end of their turn. In years of playing I never saw a creature moving in the middle of a sequence of attacks if the opponent is still alive, and in case of death of an enemy we’ll simply re-run the algorithm, so this won’t be a problem.

A human player can easily think in advance: if I have an ability that grants me an additional attack, I will only use it if I want to attack afterwards, or I will not use it at all if I can see that there are no targets available for the attack.
Unfortunately, our search algorithm can’t think like this, so we currently end up with actions that
waste resource without any gain.


We could avoid this problem with 3 solutions:
Since we are using GOAP, we could add a special don’t waste resources goal that returns an infinite negative result, to discard the action sequence. While this works, we are just discarding the leaves, so it doesn’t do much
To prune more branches, we can modify the ”GetAvailableAction()” method to only return the actions that are valuable now, and not after others. For example, if there are available attacks to do, we won’t tell the AI that it can increase the number of available attacks until it uses all of them. While we are removing ”ownership” of the validation from the GOAP, there’s a lot of CPU time saved this way
Another approach could be to batch some actions together. For example, we won’t return to the AI the action ”get another attack”, but instead we’ll return ”get another attack + attack”, forcing that branch search. While this solution is cleaner and faster, it doesn’t take into account all possible edge cases and requires more effort in the implementation.

In the projects I’ve implemented the second solution, and the number of branches visited dropped
by a lot. Taking the monk as an example (which is the creature with the most combinations of abilities) after applying solution 2 the algorithm passed from evaluating 61226 possible course of actions to only 6270, with the same output.

Share:

Facebook
Twitter
Pinterest
LinkedIn

Leave a Reply

Your email address will not be published. Required fields are marked *

On Key

Related Posts

The Renshuu Widget for Android

Are you looking for a way to keep your Renshuu study schedules front and center without constantly opening the app? The Renshuu Widget for Android might be just what you need!