If you have any remarks or questions, they are always welcome at the education committee, via the Slack channel ec-helpme or at our e-mail address education@serpentineai.nl.
In this lesson we are going to talk about the following things:
The GitHub links for this lesson are: Browse, Zip, Diff.
The links are currently only available for members.
In the last lesson we got our agent to move in squares. That's a great start, but it will not win us a game of Pommerman. For that we want to, among other things, move around to all tiles we can reach. To accomplish this, we need to know more about our location and the board layout. Luckily this information is stored in `obs`, and we can extract it. `obs` is a dictionary: a collection of `key:value` pairs that keeps its insertion order since Python 3.7 (older versions give no such guarantee).
Key | Type | Info |
---|---|---|
'board' | np.array(8, 8) | The game board as a two dimensional array |
'position' | [int, int] | The agent's (row, col) position in the grid, i.e. (y, x). On the 8x8 board the values are 0 up to and including 7. |
'ammo' | int | The agent's current ammo (number of bombs) |
'blast_strength' | int | The range of the bomb fire (in x and y) |
'can_kick' | int (0 or 1) | Whether the agent can kick or not. |
'teammate' | int | Which agent is the teammate; if there is no teammate the value is -1 |
'enemies' | [int, int, int] | Which agents are the enemies; if there is a teammate the value is -1 |
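
As a minimal sketch of how such a dictionary is read, the snippet below builds a hand-made stand-in for the observation and looks up two keys. The name `example_obs` and all values are invented for illustration; the real observation is supplied by the game and holds more entries.

```python
import numpy as np

# Hypothetical stand-in for the observation; the real one comes from the game.
example_obs = {
    'board': np.zeros((8, 8), dtype=int),  # empty 8x8 board, all passages
    'position': (3, 5),                    # (row, col) of our agent
    'ammo': 1,
}

# Values are looked up by key with square brackets.
position = example_obs['position']
board = example_obs['board']

print(position)     # (3, 5)
print(board.shape)  # (8, 8)
```
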
We will use the position and board a lot when deciding which action to take each turn, so we want to make separate variables for them. But where should we make them? We want to use them each time we decide what action to add to our queue. Because we only want to add new actions to our queue if it's empty, the best place to put the assignments is right after we check whether anything is left in our queue. Let's call the variables `my_location` and `board` respectively. Recall that you can access a dictionary with `dictionary[key]`.
def act(self, obs, action_space):
    if not self.queue:
        my_location = obs['position']
        board = obs['board']
Note that we didn't prefix these variables with `self`. This means they can only be referenced inside the function and will be deleted when the function returns. This fits their purpose, as the information they store is irrelevant after the function returns. (`my_location` as well as `board` will be different after the turn ends, as your agent moves as a result of the `act` function.)
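
The difference between a local variable and one stored on `self` can be illustrated with a small class that is unrelated to the agent. The class `Counter` and its method are made up for this sketch.

```python
class Counter:
    def __init__(self):
        self.total = 0          # attribute: kept between calls

    def add(self, amount):
        doubled = amount * 2    # local variable: gone when add() returns
        self.total += doubled
        return self.total


counter = Counter()
counter.add(1)
print(counter.total)                # 2
print(hasattr(counter, 'doubled'))  # False
```
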
Before we move on to using the variables `board` and `my_location`, let's look at what information they hold. As can be seen in the table above, the position consists of two integers, namely the y and x coordinates (row and column) of our agent. This data is stored as a tuple, an ordered, immutable (not changeable) collection. We can easily extract this data with `row, col = my_location`. This gives two variables we can easily work with. This is called tuple unpacking, a common Python idiom.
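
A short sketch of tuple unpacking and immutability, using a made-up position:

```python
# Hypothetical position, as it might appear in the observation.
my_location = (3, 5)

# Tuple unpacking: the first element goes to row, the second to col.
row, col = my_location
print(row, col)  # 3 5

# Tuples are immutable: assigning to an element raises a TypeError.
try:
    my_location[0] = 4
except TypeError as error:
    print("Cannot change a tuple:", error)
```
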
The board is a `np.array(8, 8)`. What does this mean? We'll start off with `np`, which is an abbreviation for numpy, a module for working with multidimensional array objects, such as 2D and 3D grids. This is a handy module that we will use a lot in the future, so instead of importing specific functions we will import the whole thing. By convention we import it under the abbreviation `np` at the top of our file.
import numpy as np
In our code this is also the actual first line, as it is convention to put imports at the top of the file. For now we only import it; we will use it later on.
`np.array` means the data type array is defined in the library numpy. Finally, `(8, 8)` are the dimensions of the array, meaning that the row index as well as the column index go from 0 to 7, with `(0, 0)` being the top left corner. Every position in this 2D array contains a number representing the state of the tile, with the indices as y and x coordinates. See the table below for the meaning of each number.
number | meaning |
---|---|
0 | Passage |
1 | Rigid wall |
2 | Wooden wall |
3 | Bomb |
4 | Flames |
5 | Fog |
6, 7, 8 | Specific power-ups |
9 to 13 | Specific agents |
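
To get a feel for such an array, here is a minimal sketch that builds a made-up 8x8 board and reads a few tiles. The wall positions are invented; the real board comes from the observation.

```python
import numpy as np

# Hypothetical 8x8 board filled with passages (0).
board = np.zeros((8, 8), dtype=int)
board[0, 0] = 1  # rigid wall in the top left corner
board[2, 3] = 2  # wooden wall at row 2, column 3

print(board.shape)  # (8, 8)
print(board[0, 0])  # 1
print(board[2, 3])  # 2
print(board[3, 2])  # 0, row and column are not interchangeable
```
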
Now that we understand `board` and `my_location`, we need to be able to check whether we can move to each of the neighbouring tiles. We'll make a separate function for this, so the `act` function doesn't become one big function that has to do everything. This is a design principle that is used very often and is referred to as the single-responsibility principle (SRP). First we will take a look at `check_left`.
Let's start by defining a function to check whether our agent can move to the left. Because this check is functionality of `MyAgent`, we can put the function in this class, beneath the `act` function. We'll call the function `check_left` and take `board` and `my_location` as parameters. Remember that we define the function within a class, so the first parameter should be `self`. See the tab [skeleton] for the skeleton code.
The syntax for accessing a numpy array such as the board is `array[row, col]`. To check what's on the tile left of our agent we thus use `board[row, col - 1]`. We don't want to return what this value is though, but only whether it is passable (`True`) or not (`False`). Recall from the table above that the value for a passage is `0`.
def check_left(self, board, my_location):
    pass
We use the pass statement here so the code will still work. Without it the code will not run and give an error (try and see for yourself).
Side note: when we create a function inside a class, it is referred to as a method. For more information check difference between method and function.
`row, col = my_location` is an easy way to extract the data of `my_location`.
Because we want to distinguish between two cases, an `if` statement is a perfect fit. Also remember that we need to use a double equals sign (`==`) because we are comparing instead of assigning values.
def check_left(self, board, my_location):
    row, col = my_location
    if board[row, col - 1] == 0:
        return True
    return False
Although this code probably gives the expected result if you test it once, it is not optimal:
- If our agent is in the leftmost column (column `0`), it would try to access the tile in column `-1`.
- It is not obvious why `board[row, col - 1]` should be a `0`. And if the Pommerman game were to change to another way of numbering the tile states, you would have to change every number in your code.

In this section we will handle the two pitfalls that were indicated in the full code tab.
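
To make the first pitfall concrete: for a numpy array, index `-1` does not raise an error but counts from the end, so it refers to the last column. A minimal sketch with a made-up board:

```python
import numpy as np

board = np.ones((8, 8), dtype=int)  # made-up board: every tile is a rigid wall (1)
board[4, 7] = 0                     # a single passage in the last column

row, col = 4, 0                     # agent in the leftmost column
# col - 1 is -1, which numpy reads as "last column", so this looks at board[4, 7]
tile_left = board[row, col - 1]
print(tile_left)       # 0
print(tile_left == 0)  # True, although there is no tile to the left
```

The unclamped check would therefore report a free tile on the left whenever the tile at the far right of the same row happens to be a passage.
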
To fix the first problem we could use another if statement, but this would mean another line of code. So we'll use a way that is a bit less obvious, by replacing `col - 1` with `max(col - 1, 0)`. Now we pick `col - 1` as long as it is 0 or larger, but if it becomes negative we take the value `0` instead. In other words, if we are in the leftmost column we check the tile our agent is on instead of column `-1`. Although this is not the tile we want to check, its state will never be `0`/Passage, as our agent is on that tile.
def check_left(self, board, my_location):
    row, col = my_location
    if board[row, max(col - 1, 0)] == 0:
        return True
    return False
The solution for the second problem is to have a variable with the value `0` that we use instead of a literal `0`. This makes the code easier to read, and if the value were to change, we'd only need to change it once. Luckily for us, these values already have a name. They are in the `constants.py` file of pommerman, so let's import that into our code. We already import `Action` at the top of our code, and the class we need now is called `Item`. An import can be extended using a comma:
from pommerman.constants import Action, Item
To get the value of passage, we'll now need to type `Item.Passage.value`. Although this is longer than typing `0`, it is a better habit. It makes your code clearer for yourself and others and makes it much easier to adapt to changes made elsewhere. The final function now looks like this:
def check_left(self, board, my_location):
    row, col = my_location
    if board[row, max(col - 1, 0)] == Item.Passage.value:
        return True
    return False
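
To see why `.value` is needed, here is a small stand-in enum. It is not the pommerman class itself, just a sketch that assumes `Item` behaves like a standard Python `Enum` with the numbers from the table above. The name `TileItem` and the member names `Rigid` and `Wood` are made up for this example.

```python
from enum import Enum


class TileItem(Enum):
    """Stand-in for pommerman's Item, limited to three entries."""
    Passage = 0
    Rigid = 1
    Wood = 2


tile = 0  # a number as it would be read from the board

# An Enum member is not equal to its number; its .value is.
print(TileItem.Passage == tile)        # False
print(TileItem.Passage.value == tile)  # True
```
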
The function we made enables us to work on pathfinding in the next lesson, but for now we'll integrate this method (function in a class) into our current pattern of walking in a square. Try to append `Action.Left` to the queue only if `check_left` returns `True`. Remember to call it as `self.check_left(...)`, as `check_left` is defined in the same class as `act`.
Checking one side isn't enough for pathfinding though, so we should also code `check_right()`, `check_up()` and `check_down()`. This is a bit tedious, but for now it is too hard to make one function for all 4 directions. Luckily copy-pasting is a thing, but be careful: you will need to exchange `max()` for `min()` in 2 cases, with the highest valid index (7 on our 8x8 board) taking the place of `0`.
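
As a sketch of how one of these could look, here is a possible `check_right`. It is written as a plain function without `self` so that it runs on its own, and the constant `PASSAGE` stands in for `Item.Passage.value` so that pommerman is not needed. Both choices are assumptions made for this example, not the lesson's reference solution.

```python
import numpy as np

PASSAGE = 0  # stand-in for Item.Passage.value


def check_right(board, my_location):
    """ Return True if the tile to the right of our position is a passage. """
    row, col = my_location
    last_col = board.shape[1] - 1  # highest valid column index
    # min() keeps the index on the board when we are in the rightmost column
    return bool(board[row, min(col + 1, last_col)] == PASSAGE)


board = np.ones((8, 8), dtype=int)  # made-up board full of rigid walls
board[2, 4] = PASSAGE

print(check_right(board, (2, 3)))  # True: tile (2, 4) is a passage
print(check_right(board, (2, 7)))  # False: clamped to our own tile (2, 7)
```

Reading the limit from `board.shape` instead of hard-coding 7 keeps the check valid if the board size ever differs.
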
def act(self, obs, action_space):
    # Main event that is being called on every turn.
    if not self.queue:
        my_location = obs['position']
        board = obs['board']
        if self.check_left(board, my_location):
            self.queue.append(Action.Left)
        if self.check_right(board, my_location):
            self.queue.append(Action.Right)
        if self.check_up(board, my_location):
            self.queue.append(Action.Up)
        if self.check_down(board, my_location):
            self.queue.append(Action.Down)
        # If we cannot move in any direction, send a pass
        if not self.queue:
            self.queue.append(Action.Stop)
    return self.queue.pop(0)
Now that we've got all the functionality for this lesson implemented, let's talk about how we can clear things up even more. Functions are a crucial part of our code, so it's good to know what they do. The name `check_left()` gives some clue as to what it does, but it certainly does not tell everything, such as what exactly is checked and what is returned. So it's good to give more information than just the name of a function. We'll do this on the first line of the function, like so:
def function_name(parameter_1, parameter_2):
    """ Describe functionality """
    actual code
Add such a line for all your check functions.
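
A small sketch of what a docstring gives you in practice. The function `is_passage` is made up for this example; the point is that Python stores the first-line string on the function itself.

```python
def is_passage(tile_value):
    """ Return True if the given tile value is a passage (0). """
    return tile_value == 0


# The docstring is stored on the function; help(is_passage) shows it as well.
print(is_passage.__doc__)
print(is_passage(0))  # True
```
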
As we also saw in this lesson, it is very important to know the data type of something (e.g. np.array, tuple or int). This is not only important for knowing what syntax to use in a function, but also for knowing what you need to pass as arguments when you call the function. Some of this information you'll forget over time, so it is good to write it down as well. Using comments for this is possible, but not ideal, as it can get messy really quickly. Luckily Python has a clearer syntax for this, called 'type hinting':
def function_name(self, parameter_1: data_type, parameter_2: data_type) -> data_type:
This way you can remember which data type each parameter of your function is. Add these to all the check functions as well.
We went with the following docstring, but any similar line will probably work. The most important thing is to mention what we check and what we return.
""" Checks if the field on the left is passable; if it is, we return True. """
To get the full docstring template in PyCharm, just type """ (3 double quotation marks) and hit Enter. You then automatically also get @param and @return statements to explain your input parameters and elaborate on what exactly your function outputs.
For now, we're only using a partial docstring to keep it simple. Choosing clear variable names already helps a lot in explaining your code to others.
def check_left(self, board: np.ndarray, my_location: tuple) -> bool:
    """ If the field on the left of our position is a Passage, return True. """
    # np.ndarray is the array type; np.array is the function that creates one
    row, col = my_location
    if board[row, max(col - 1, 0)] == Item.Passage.value:
        return True
    return False
Please note that these types are no guarantee in Python: type hints are not enforced at run time. They describe the type you are expecting, and PyCharm will try to warn you whenever you pass something different, but the code will still try to run if you give it other data types.
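
A minimal sketch of this behaviour, using a made-up function `double`:

```python
def double(number: int) -> int:
    """ Return the given number multiplied by two. """
    return number * 2


print(double(4))     # 8
# The hint says int, but Python does not stop us from passing a string.
print(double("ab"))  # abab
```
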
Now that we have finished implementing this, we are going to commit it to the version control system. In the terminal (which you can find within PyCharm on the lower left) we type the following commands:
git add *
git commit -m "finished Pommerman lesson 2"
git push
The last line (`git push`) is only for users who have created a repository on GitHub. If you now check your own repository on GitHub.com, you will see that the file `serpentine/my_agent.py` has been updated with the above code and that the commit message has been added.
In this lesson we have taken a look at the position and board information that is stored in the observation. We can now check if we can actually move the way that we want and otherwise we do not move. In the final part we added documentation and type hinting to help understand the code more and make it easier for other people to check your code.
If you are unsure if your code is correct, and you are a member, you can use the GitHub links to check.