In case you have any remarks or questions on these tutorials they are always welcome, preferably via the slack channel wiki-content-feedback. (You can notify the education committee specifically by adding @educo to your message.) You may also send us an email at education@serpentineai.nl.
Welcome to the final lesson of the series! We will use the code of the previous lessons and apply it to a different game: Heist. Although this game is similar, it has some important differences, so in this lesson we will need to make a few changes to our code.
Heist is another procgen environment in which the player must find its way through a maze. However, in this environment the player must also collect keys and unlock doors. The final goal is to collect the diamond. The inputs are the same as in maze, but now you can also move in the diagonal directions. However, we won't use those extra directions.
The maze covers at least 3x3 blocks and at most the full map of 13x13 blocks (like in the image below).
For all important differences between maze and heist, check out the differences tab.
In any given turn, an agent can choose from one of these five actions:
Action | info | int |
---|---|---|
Stop | This action is a pass. | 0 |
Left | Move left on the maze. | 1 |
Up | Move up on the maze. | 3 |
Down | Move down on the maze. | 5 |
Right | Move right on the maze. | 7 |
It's recommended that you play the game yourself a few times to experience how it works. Look for the differences between this game and maze.
To play, just run the following line in your command prompt:
python -m procgen.interactive --env-name heist
Listed here are a few important differences in heist compared to maze. Think about how you would adapt our code to make it work for heist.
The first step is to create a new agent. Create a new Python file next to your `agent.py` called `agent_heist.py`. In this file you should create a class `AgentHeist` that inherits from our `Agent`. It should also override two functions from `Agent`. In case you are not sure how inheritance works, check out the agents tutorial from the machine learning track.
The class structure will be as follows:

    ...
    from core.agent import Agent


    class AgentHeist(Agent):
        def __init__(self, venv: procgen.ProcgenEnv):
            super().__init__(venv)

        def compute_action(self, observation: np.ndarray) -> np.ndarray:
            """ Calculate the best action for the agent. """
            maze, scale = self.extract_maze(observation, grid_size=(13, 13))  # smaller grid
            ...  # copy code

        def extract_sprites(self, image: np.ndarray, scale: float) -> Dict[str, Tuple[int, int]]:
            """ Extracts the templates position from the input image, and returns their center location. """
            ...  # copy code

You should copy the code of `compute_action` and `extract_sprites`, because we want to override this code in this lesson.
This will be the first thing we override. The grid size (i.e. the number of blocks) in heist is different from that in maze, so we need to update the `grid_size` parameter in `compute_action` accordingly. What is the grid size of this game?
We need to switch from the game maze to the game heist. So where you create your maze procgen environment, you should now pass the parameter `env_name='heist'`. Also, instead of creating `Agent`, you must now create `AgentHeist`.
There is also an alternative way to do this if you want (this implementation makes it a little easier to switch between environments):

    ...
    from core.agent_heist import AgentHeist

    if __name__ == '__main__':
        env_name = 'heist'
        venv = procgen.ProcgenEnv(num_envs=1, env_name=env_name, render_mode='rgb_array', use_backgrounds=False)

        # Pick the agent that belongs to the chosen environment.
        if env_name == 'heist':
            agent = AgentHeist(venv)
            run_environment(venv, agent, render=True)
        if env_name == 'maze':
            agent = Agent(venv)
            run_environment(venv, agent, render=True)

If you run the code, what do you see? You should see the heist environment, but with the old agent trying to solve the game. The agent will fail every time, so let's make some improvements!
In this game we have a different player and goal, but we also have keys and locks between the player and goal. Here is a link to download the new templates, which you can add to the `template` folder: templates-heist.zip
Keys | Locks | Player |
---|---|---|
[key template images] | [lock template images] | [player template image] |
We load these templates in the same way we did for maze:

    self.templates = dict(
        gem=self.load_template("gem.png"),
        key_1=self.load_template("key1.png"),
        key_2=self.load_template("key2.png"),
        key_3=self.load_template("key3.png"),
        lock_1=self.load_template("lock1.png"),
        lock_2=self.load_template("lock2.png"),
        lock_3=self.load_template("lock3.png"),
        player=self.load_template("player-heist.png")
    )

However, note that in heist the player has 8 different rotations, so we still need to add 7 rotations.
Try to use the `rotate` function from `scipy.ndimage` to add the different rotations to the templates. The function is used like this: `rotate(input, angle)`.

    self.templates = dict(
        gem=self.load_template("gem.png"),
        key_1=self.load_template("key1.png"),
        key_2=self.load_template("key2.png"),
        key_3=self.load_template("key3.png"),
        lock_1=self.load_template("lock1.png"),
        lock_2=self.load_template("lock2.png"),
        lock_3=self.load_template("lock3.png"),
        # One player template per 45 degree step.
        player_0=rotate(self.load_template("player-heist.png"), angle=0),
        player_45=rotate(self.load_template("player-heist.png"), angle=45),
        player_90=rotate(self.load_template("player-heist.png"), angle=90),
        player_135=rotate(self.load_template("player-heist.png"), angle=135),
        player_180=rotate(self.load_template("player-heist.png"), angle=180),
        player_225=rotate(self.load_template("player-heist.png"), angle=225),
        player_270=rotate(self.load_template("player-heist.png"), angle=270),
        player_315=rotate(self.load_template("player-heist.png"), angle=315),
    )

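To get a feel for what `rotate` returns before applying it to the real templates, here is a minimal sketch on a made-up array. The 10x10 test image is an assumption for illustration only; the actual template files will have their own size.

```python
import numpy as np
from scipy.ndimage import rotate

# Stand-in for a loaded template: a 10x10 RGB image (height, width, channels).
template = np.zeros((10, 10, 3), dtype=np.uint8)
template[2:8, 4:6] = 255  # a white vertical bar

# Angles are given in degrees. By default the rotation happens in the plane of
# the first two axes (rows and columns), so the colour channels are left alone.
rotated_90 = rotate(template, angle=90)
rotated_45 = rotate(template, angle=45)                      # default reshape=True: output grows to fit the corners
rotated_45_same = rotate(template, angle=45, reshape=False)  # output keeps the input shape

print(rotated_90.shape, rotated_45.shape, rotated_45_same.shape)
```

Note that the diagonal rotations produce a larger array with interpolated pixels, which may influence the matching score of those templates.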
With the help of some dictionary comprehensions we can write the template dictionary in a lot fewer lines.

    self.templates = dict(
        gem=self.load_template("gem.png"),
        **{f"key_{idx}": self.load_template(f"key{idx}.png") for idx in range(1, 4)},
        **{f"lock_{idx}": self.load_template(f"lock{idx}.png") for idx in range(1, 4)},
        # range(0, 360, 45) yields the 8 angles 0, 45, ..., 315.
        **{f"player_{angle}": rotate(self.load_template("player-heist.png"), angle=angle)
           for angle in range(0, 360, 45)}
    )

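To check which template names the comprehensions generate, the following sketch replaces `self.load_template` with a hypothetical stand-in that simply returns the file name, so it runs without the image files.

```python
def load_template(filename: str) -> str:
    """Hypothetical stand-in for self.load_template: returns the file name instead of an image."""
    return filename


templates = dict(
    gem=load_template("gem.png"),
    **{f"key_{idx}": load_template(f"key{idx}.png") for idx in range(1, 4)},
    **{f"lock_{idx}": load_template(f"lock{idx}.png") for idx in range(1, 4)},
    **{f"player_{angle}": load_template("player-heist.png") for angle in range(0, 360, 45)},
)

print(list(templates))
print(len(templates))
```

Stopping the range at 360 gives exactly eight player angles. An upper bound of 361 would add a ninth entry, `player_360`, which is the same orientation as `player_0`.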
There are a few lines we have to change in `extract_sprites`:

- Remove `N_objects` from the call to `matchTemplates`. This makes it possible to match as many templates as possible, as long as the `score_threshold` is reached.
- Previously we used `template.split()`, but now we don't need to do that here anymore. Previously we were only interested in 1 matching player object, but now there are also multiple key templates that are not the same, since `key_1 != key_2`.
- The `score_threshold` needs to be optimized. Try to find a good value for the `score_threshold`, a float between 0 and 1.

The function then looks like this:

    def extract_sprites(self, image: np.ndarray, scale: float) -> Dict[str, Tuple[int, int]]:
        # Replace the ? with your own threshold (a float between 0 and 1).
        hits = matchTemplates(self.templates.items(), image, maxOverlap=0., score_threshold=?)
        image[:] = drawBoxesOnRGB(image, hits, showLabel=True)

        centers = {}
        for template, (x, y, w, h), score in hits.to_numpy():
            # Convert the center of the bounding box from pixels to grid blocks.
            centers[template] = int((x + w / 2) / scale), int((y + h / 2) / scale)
        return centers

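The last step of `extract_sprites` turns a bounding box in pixels into a block position on the grid. A minimal sketch of that arithmetic follows, assuming that `scale` is the number of pixels per block. The helper name `box_center_in_blocks` and the numbers (a 512 pixel wide image split into 13 blocks) are made up for illustration.

```python
from typing import Tuple


def box_center_in_blocks(box: Tuple[int, int, int, int], scale: float) -> Tuple[int, int]:
    """Convert a bounding box (x, y, width, height) in pixels to the grid block that holds its center."""
    x, y, w, h = box
    return int((x + w / 2) / scale), int((y + h / 2) / scale)


# Assumed numbers for illustration: a 512 pixel wide image divided into 13 blocks.
scale = 512 / 13

# Box center is at pixel (99, 219), which falls in block column 2, row 5.
print(box_center_in_blocks((80, 200, 38, 38), scale))
```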
There is a part of the code in the `compute_action` function that we need to change. It is the following:

    if len(sprites) == 2:
        path = self.extract_path(maze, sprites, draw_on_image=True)
        action = self.extract_action(path)
        return np.array([action])

There are a few lines we have to change and add. Every tab holds one such change and, inside it, a (hidden) code block with the solution.
Instead of checking `len(sprites) == 2`, we need to do something else to check whether the player and goal are found.

    if 'gem' in sprites and any(key.startswith('player') for key in sprites):

In the above solution we check whether the `gem` and a template whose name starts with `player` are found. The `key.startswith('player')` test makes sure that we do not take the angle into account.
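The check can be tried out on hand-written dictionaries. This is only a sketch: the function name `ready_to_plan` and the positions are hypothetical, and the keys follow the template names used in this lesson.

```python
from typing import Dict, Tuple


def ready_to_plan(sprites: Dict[str, Tuple[int, int]]) -> bool:
    """True when the gem and at least one rotated player template were matched."""
    return 'gem' in sprites and any(name.startswith('player') for name in sprites)


print(ready_to_plan({'gem': (11, 11), 'player_90': (1, 1), 'key_1': (5, 3)}))
print(ready_to_plan({'gem': (11, 11), 'key_1': (5, 3)}))
```

The first call finds both the gem and a player, regardless of the player's angle. The second call has no player entry, so no path would be planned for that frame.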
We want to determine which of the rotated templates of the player is the found template. How do you find this template?

    sprites['player'] = sprites[list(key for key in sprites if key.startswith('player'))[0]]

The above code works because `maxOverlap=0.` in `extract_sprites` suppresses overlapping matches, so only one rotated player template is kept at the player's position. With a very high score threshold it is very unlikely that more than one player angle is matched at the same time.
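An alternative way to write the same lookup uses `next` with a default, which avoids building a list and does not raise an error when no player was matched. This is a sketch with a hypothetical helper name, not the code used in the rest of the lesson.

```python
from typing import Dict, Optional, Tuple


def find_player(sprites: Dict[str, Tuple[int, int]]) -> Optional[Tuple[int, int]]:
    """Return the position of the first matched player template, or None if there is none."""
    return next((pos for name, pos in sprites.items() if name.startswith('player')), None)


print(find_player({'gem': (11, 11), 'player_135': (4, 2)}))
print(find_player({'gem': (11, 11)}))
```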
We also want to determine which of the other templates is our goal. Think about the order in which the keys, locks and the gem need to be collected.

    sprites['goal'] = sprites[(sorted(key for key in sprites if key.startswith('key')) + ['gem'])[0]]

The above uses the following properties:

- `key_1` comes before `key_2`, etc.

Combined, the block becomes:

    if 'gem' in sprites and any(key.startswith('player') for key in sprites):
        sprites['goal'] = sprites[(sorted(key for key in sprites if key.startswith('key')) + ['gem'])[0]]
        sprites['player'] = sprites[list(key for key in sprites if key.startswith('player'))[0]]
        path = self.extract_path(maze, sprites, draw_on_image=True)
        action = self.extract_action(path)
        return np.array([action])

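To see how the goal selection behaves over the course of a level, the sketch below runs the same expression on three hand-written situations. It assumes that a key is no longer matched once it has been collected. The function name `choose_goal` and the positions are hypothetical.

```python
from typing import Dict, Tuple


def choose_goal(sprites: Dict[str, Tuple[int, int]]) -> str:
    """Name of the next target: the lowest-numbered matched key, or the gem if no key is matched."""
    candidates = sorted(name for name in sprites if name.startswith('key')) + ['gem']
    return candidates[0]


gem_and_player = {'gem': (11, 11), 'player_0': (1, 1)}

print(choose_goal({**gem_and_player, 'key_2': (7, 3), 'key_1': (3, 9)}))  # both keys visible
print(choose_goal({**gem_and_player, 'key_2': (7, 3)}))                   # key_1 already collected
print(choose_goal(gem_and_player))                                        # no keys left
```

Because `gem` is appended after the sorted key names, it is only selected when no key is matched anymore.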
Sometimes the player gets stuck at a corner. There are multiple solutions to this. One option is to use diagonal movement; another is to take a random action when the observation has not changed for a few steps. You are allowed to be creative.
We came up with this code to make it work:

    class AgentHeist(Agent):
        def __init__(self, venv: procgen.ProcgenEnv):
            ...
            self.previous_sprites = dict()
            self.previous_sprites_same_count = 0

        def compute_action(self, observation: np.ndarray) -> np.ndarray:
            ...
            # Move around annoying corners.
            same_sprites = tuple(sprites.items()) == tuple(self.previous_sprites.items())
            if same_sprites and self.previous_sprites_same_count > 3:
                # Nothing moved for a while: take a random action instead.
                self.previous_sprites_same_count -= 3
                return np.random.randint(0, self.action_space, 1)
            self.previous_sprites = sprites.copy()
            self.previous_sprites_same_count += int(same_sprites)
            ...

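The same idea can also be isolated in a small helper class, which makes it easier to test on its own. The sketch below is a variation, not the code used in this lesson: the class name `StuckDetector` and the `patience` parameter are hypothetical, and the counter here resets as soon as anything moves.

```python
from typing import Dict, Optional, Tuple

Sprites = Dict[str, Tuple[int, int]]


class StuckDetector:
    """Reports when the same sprite positions were seen more than `patience` times in a row."""

    def __init__(self, patience: int = 3):
        self.patience = patience
        self.previous: Optional[Sprites] = None
        self.same_count = 0

    def is_stuck(self, sprites: Sprites) -> bool:
        same = sprites == self.previous
        self.previous = dict(sprites)
        # Count consecutive identical frames; any change resets the counter.
        self.same_count = self.same_count + 1 if same else 0
        if self.same_count > self.patience:
            self.same_count = 0  # start counting again after reporting
            return True
        return False


detector = StuckDetector(patience=3)
frame = {'player_90': (1, 1), 'gem': (11, 11)}
print([detector.is_stuck(frame) for _ in range(6)])
```

Inside `compute_action`, a `True` result would be the moment to return a random action instead of following the path.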
The complete `agent_heist.py` then looks like this:

    from typing import Tuple, Dict

    import numpy as np
    import procgen
    from MTM import matchTemplates, drawBoxesOnRGB
    from scipy.ndimage import rotate

    from core.agent import Agent


    class AgentHeist(Agent):
        def __init__(self, venv: procgen.ProcgenEnv):
            super().__init__(venv)
            self.templates = dict(
                gem=self.load_template("gem.png"),
                **{f"key_{idx}": self.load_template(f"key{idx}.png") for idx in range(1, 4)},
                **{f"lock_{idx}": self.load_template(f"lock{idx}.png") for idx in range(1, 4)},
                # range(0, 360, 45) yields the 8 angles 0, 45, ..., 315.
                **{f"player_{angle}": rotate(self.load_template("player-heist.png"), angle=angle)
                   for angle in range(0, 360, 45)}
            )
            self.previous_sprites = dict()
            self.previous_sprites_same_count = 0

        def compute_action(self, observation: np.ndarray) -> np.ndarray:
            """ Calculate the best action for the agent. """
            maze, scale = self.extract_maze(observation, grid_size=(13, 13))
            sprites = self.extract_sprites(observation, scale)

            # Move around annoying corners.
            same_sprites = tuple(sprites.items()) == tuple(self.previous_sprites.items())
            if same_sprites and self.previous_sprites_same_count > 3:
                self.previous_sprites_same_count -= 3
                return np.random.randint(0, self.action_space, 1)
            self.previous_sprites = sprites.copy()
            self.previous_sprites_same_count += int(same_sprites)

            # Determine the action to move
            if 'gem' in sprites and any(key.startswith('player') for key in sprites):
                sprites['player'] = sprites[list(key for key in sprites if key.startswith('player'))[0]]
                sprites['goal'] = sprites[(sorted(key for key in sprites if key.startswith('key')) + ['gem'])[0]]
                path = self.extract_path(maze, sprites, draw_on_image=True)
                action = self.extract_action(path)
                return np.array([action])

            # Fall back to a random action when the player or gem was not found.
            return np.random.randint(0, self.action_space, 1)

        def extract_sprites(self, image: np.ndarray, scale: float) -> Dict[str, Tuple[int, int]]:
            """ Extracts the templates position from the input image, and returns their center location. """
            hits = matchTemplates(self.templates.items(), image, maxOverlap=0., score_threshold=0.6)
            image[:] = drawBoxesOnRGB(image, hits, showLabel=True)

            centers = {}
            for template, (x, y, w, h), score in hits.to_numpy():
                centers[template] = int((x + w / 2) / scale), int((y + h / 2) / scale)
            return centers

If you have followed the tutorial up to this point, you should see results similar to those on the left side (this includes the moving-around-corners code, as can be seen at around the 8 second mark). The right side shows how the pathfinding is used to create a path from the player to the chosen goal.
In this tutorial we have shown you an adaptation of our maze agent to demonstrate how you can quickly reuse one agent in another game.
Note that, more formally, you would create a class `Agent` with descendants `AgentHeist`, `AgentMaze`, `AgentChaser`, etc., where the `Agent` class by itself is not usable (i.e. abstract). This is in contrast to our `Agent` class, which is already adapted for use in the maze game.
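A minimal sketch of such an abstract base class, using Python's `abc` module, could look as follows. The names `BaseAgent`, `RandomAgent` and `n_actions` are hypothetical and are not part of the code from this series.

```python
from abc import ABC, abstractmethod

import numpy as np


class BaseAgent(ABC):
    """Hypothetical abstract base: it defines the interface but cannot be instantiated."""

    def __init__(self, n_actions: int):
        self.n_actions = n_actions

    @abstractmethod
    def compute_action(self, observation: np.ndarray) -> np.ndarray:
        """Return one action per environment."""


class RandomAgent(BaseAgent):
    """Smallest possible concrete agent: ignores the observation and acts randomly."""

    def compute_action(self, observation: np.ndarray) -> np.ndarray:
        return np.random.randint(0, self.n_actions, 1)


# The number of actions and the observation size are arbitrary values for this demo.
agent = RandomAgent(n_actions=5)
print(agent.compute_action(np.zeros((64, 64, 3), dtype=np.uint8)))
```

Trying to create `BaseAgent` directly raises a `TypeError`, which makes it explicit that every game needs its own subclass.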
That was all for now. We hope you learned a lot and also had some fun on the way! You now know how to solve a complex environment with computer vision only!