In case you have any remarks or questions they are always welcome at either the education commissie, via the slack channel ec-helpme, or at our e-mail address education@serpentineai.nl.
In this lesson we are going to talk about the following things:
The GitHub links for this lesson are: Browse, Zip, Diff.
The links are currently only available for members.
Over the last few lessons we have kept adding more and more code to our main program my_agent.py
. If we would keep working into this file it would become larger and larger and more intertwined with different functionalities. Right now we have behaviour logic (such as moving) combined with pathfinding (searching) in the same agent class. By only leaving the logic in this class, we can quickly see the general proccess of our Agent, without having to worry about how it works.
Similar to how we moved Directions
to a new file, we are now going to move Pathfinding
(or BFS) to a new file. The main reason of making this change, is because we want to have different pathfinding based on our needs. Think about the following extra scenarios:
If we want the pathfinding to be more general, we have to provided the option of picking the objects it can pass to, and determine how we want the pathfinding results back. This will create a split in the moving logic (MyAgent
) and pathfinding / search algorithms (Pathfinding
).
We are not directly implementing the extra scenarios, but instead are going to prepare for the extra options and for this we are going to make a Pathfinder
class. Now we have already used classes three times, the first time is MyAgent
and the other two are Direction
and Directions
.
In order to start, create a new file pathfinder.py
in serpentine
and think about all the methods that you can move to the new class Pathfinder
. Only prepare the methods, there is no need yet to fully code the methods, since we will update the arguments and want to improve them.
Tip: If you want to define a function/method/class without any body code, you can use the pass
or ...
(Ellipsis) variable, to indicate that you will provide some code later. For example:
class Pathfinder:
pass
or
class Pathfinder: ...
The following methods have been excluded:
can_place_bomb
: is not performing any search for a path, but uses information to perform a check.create_danger_map
: is building a map based on data, no path finding is performed here.move_to_save_location
: debatable, but excluded because there is no searching of a path, only verification of found paths (positions).import numpy as np
from directions import Direction
class Pathfinder:
def __init__(self): ...
def in_bounds(self, location: tuple) -> bool: ...
def check_direction(self, board: np.array, location: tuple, direction: Direction) -> bool: ...
def reverse_path(self, came_from: dict, came_from_direction: dict, goal_location: tuple) -> list: ...
def create_path(self, board: np.array, my_location: tuple, goal_location: tuple) -> list: ...
def find_reachable_safe_locations(self, board: np.ndarray, danger_map: np.ndarray, location: tuple) -> list: ...
def explodable_neighbors(self, board: np.ndarray, location: tuple) -> int: ...
Looking at the methods that we have to implement, we can split them up into three groups:
in_bounds
, check_direction
create_path
, find_reachable_safe_locations
, explodable_neighbors
,reverse_path
The first clean up is going to create a bfs search, which allows us to input a list of Item
values that are safe to move on (e.g.: [Item.Passage, Item.Kick]
). For this you have to implement all of the above stages. An example usage would be:
pathfinder = Pathfinder(width=8, height=8)
came_from, came_from_direction = pathfinder.bfs(board, my_location, goal_location, [Item.Passage])
path = pathfinder.reverse_path(came_from, came_from_direction, goal_location)
Try and simplify the code as much as possible such that it looks similar to the BFS pseudocode in lesson 3. For this you will need an extra method (try and figure out which one).
Full code
solution, in case you want to see if you can find it out yourself.The new method is neighbors
, this will return all the valid neighbors. This means that we no longer have to check the nodes in bfs
. This is moving the code from one location to another location, but we will see that we can reuse this method for more things.
When using type hinting, you can create your own types, by defining them at the beginning of the module. For this the new type Point
is made, which is a Tuple[int, int]
, or a location as we used before.
Also notice that we no longer have a find_reachable_safe_locations
in our interface, because we can use a different trick. We will get back to that in the Applying Pathfinder
section.
import numpy as np
from typing import List, Dict, Tuple
from collections import deque
from pommerman.constants import Item
from directions import Direction, Directions
Point = Tuple[int, int]
class Pathfinder:
def __init__(self, width, height):
self.width = width
self.height = height
def in_bounds(self, row: int, col: int) -> bool:
""" Return True if the position is positive and inside the bounds. """
def passable(self, board: np.array, location: tuple, passable: List[Item]) -> bool:
""" Return True if the board position is passable. """
def neighbors(self, board: np.array, row: int, col: int, passable: List[Item]) -> List[Tuple[Point, Direction]:
""" Return all valid neighbours of a selected point. """
def bfs(self, board: np.array, start: Point, end: Point, passable: List[Item]) \
-> Tuple[Dict[Point, Point], Dict[Point, Direction]]:
""" Perform a bfs to locate the end point. """
def reverse_path(self, came_from: Dict[Point, Point], came_from_direction: Dict[Point, Direction], end: Point) \
-> List[Direction]:
""" Return a list of directions to take to reach the end point (empty if unreachable). """
import numpy as np
from typing import List, Dict, Tuple
from collections import deque
from pommerman.constants import Item
from directions import Direction, Directions
Point = Tuple[int, int]
class Pathfinder:
def __init__(self, width, height):
self.width = width
self.height = height
def in_bounds(self, x: int, y: int) -> bool:
""" Return True if the position is positive and inside the bounds. """
return 0 <= row < self.height and 0 <= col < self.width
There are two ways of getting Item
values, either using the name
or the value. For using the value we use round brackes Item(value)
(Item(0) ==
Item.Passage), while when using the name we use square ones
Item[name] (
Item['Passage'] == Item.Passage
, case sensetive!). This simplifies the check for the Item, since there is no longer the need for Item.Passage.value
.
def passable(self, board: np.array, location: tuple, passable: List[Item]) -> bool:
""" Return True if the board position is passable. """
return Item(board[location]) in passable
The reason for neighbors.reverse
is that the diagonal path is now preferred against going all the way up and right. This is more of a visial upgrade than required.
def neighbors(self, board: np.array, row: int, col: int, passable: List[Item]) -> List[Tuple[Point, Direction]]:
""" Return all valid neighbours of a selected point (row, col). """
neighbors = [(tuple(np.array((row, col)) + direction.array), direction) for direction in Directions.NEIGHBORS]
if (row + col) % 2 == 0: neighbors.reverse()
result = []
for (row, col), direction in neighbors:
if self.in_bounds(row, col) and self.passable(board, (row, col), passable):
result.append(((row, col), direction))
return result
In the earlier code we used list
and did pop(0)
to get the left element. This is afctually quit expensive to do, because removing the first element of a list requires all other elements to move one index to the left. A deque is a linked list, which is a data structure that is good for adding and removing data at either end of the list (start and beginning). Again not required, but a nice thing to know.
The second thing that might be a bit strange is {start: None}
and {start: Directions.ZERO}
. These are dictionaries that are using a variable as key name. Therefore you cannot use dict(start=None)
, since start
will then be seen as a str
(string), instead of a variable name.
def bfs(self, board: np.array, start: Point, end: Point, passable: List[Item]) \
-> Tuple[Dict[Point, Point], Dict[Point, Direction]]:
""" Perform a bfs to locate the end point. """
frontier = deque()
frontier.appendleft(start)
came_from = {start: None}
came_from_direction = {start: Directions.ZERO}
while len(frontier):
x, y = frontier.popleft()
if (x, y) == end: break
for next_node, direction in self.neighbors(board, x, y, passable):
if next_node not in came_from:
frontier.append(next_node)
came_from[next_node] = (x, y)
came_from_direction[next_node] = direction
return came_from, came_from_direction
By reusing the end
point (goal_location
), we can remove the variable current
from our old code. When you revisit a function/method always check if you can simplify it afterwards. The basic building order is:
def reverse_path(self, came_from: Dict[Point, Point], came_from_direction: Dict[Point, Direction], end: Point) \
-> List[Direction]:
""" Return a list of directions to take to reach the end point (empty if unreachable). """
parent = came_from.get(end, None)
path = []
while parent is not None:
path.append(came_from_direction[end])
end, parent = parent, came_from.get(parent, None)
return list(reversed(path))
In our solutions a few special tricks were used, this summarizes all of the above tricks:
We can access Enum
items using ()
(for value) and []
(for name), this allows for code such as:
from pommerman.constants import Item
Item.Passage == Item(0) # True
Item.Passage == Item(1) # False
Item.Passage == Item['Passage'] # True
Item.Passage == Item['passage'] # KeyError (case sensitive)
Item(0) == Item['Passage'] # True
When we are always going over the Directions.NEIGHBORS
in the same order, there will be some preference for exploring the first ocurring direction (LEFT
or W(est)
). This will lead to paths going around the outside. Now for us human we usually prefer to take a more direct path, and therefore we have to reverse the neighbors directions after every other step. Visually it will look like this:
X X X X X X 0 0
0 0 0 X 0 X X 0
0 0 0 X 0 0 X X
0 0 0 X 0 0 0 X
The left value performs the pathfinding without any changes, and the right on with adding a neighbors.reverse
part after every other step. There is a quick trick of getting the right pattern, by reversing
the values where the sum of the x
and y
index are even. Since odd and even will always switch between steps, if x+y
is even, any step will change that number to odd, and vice versa. In code it will be similar to:
neighbors = Directions.NEIGHBORS # W E N S
if (x + y) % 2 == 0: neighbors.reverse() # S N E W
This is a data structure that is good for adding and removing items at the beginning or ending of a list, it takes more time to index an item. This means that using indices (list[3]
) is more expensive. Generally there is always a trade off between different data structures.
from collections import deque
linked_list = deque() # []
linked_list.append(5) # [5]
linked_list.append(6) # [5, 6]
linked_list.appendleft(2) # [2, 5, 6]
linked_list.popleft() # [5, 6]
linked_List.pop() # [5]
When using variables to initialize a dictionary key, we always have to use {key: value}
format. When using dict(key=value)
, the key
will always be represented as a str
(string).
start = (5, 6)
start_as_variable = {start: None} # equivalent { (5, 6): None }
start_as_string = dict(start=None) # equivalent { 'start': None }
The last thing we have to do is to include the explodable_neighbors
again. The interface we used is:
def explodable_neighbors(self, board: np.ndarray, x: int, y: int, items: List[Item]) -> int:
Now we can see the power of using the new neigbors
method.
def explodable_neighbors(self, board: np.ndarray, x: int, y: int, items: List[Item]) -> int:
""" Return the count of neighbors that are in `items`. """
return len(self.neighbors(board, x, y, items))
Alternatively this can be written as
def explodable_neighbors(self, board: np.ndarray, x: int, y: int, items: List[Item]) -> int:
""" Return the count of neighbors that are in `items`. """
return sum(1 for _ in self.neighbors(board, x, y, items))
See the extra
tab for an explanation about the above solution.
You might wonder why to use the sum
function there. For the functionality it doesn't really matter that much, but it shows you a different way of iterating
over items. Remeber the list comprehension
from before? The sum
solution is a generator comprehension
. Let's give an example of a generator we have already seen before:
for x in range(10):
print(x)
So what does a generator
mean? It means that the values 0
to 9
will be produced 1 by one instead of all at once. A generator
knows how to produce the values, rather than contains all the values. In other words range
is a generator
that:
0
, then 1
, 2
, ...[0, 1, 2, ..., 9]
and then return the first item of the list, second item, etc...For a full overview check out any of the following links:
Before updating the my_agent.py
to use the new PathFinder
class, it is handy to commit the current state of the project.
git add .
git commit -m "created new Pathfinder class"
git push
Now if shit hit the fence
, you can roll back all changes to the previous commit, which is a lot easier than trying to undo untill it works again (been there done that).
The big red button of undo everything and just put me back on where I was after the last commit in git is: press here
(or):
git restore .
This is also shown when you type git status
in the terminal
.
As the section suggest, go and implement your new awesome Pathfinder
class. For mental support there is this great summerizing meme about why we need to refactor from time to time.
The following sections have to be updated in MyAgent
:
Section | To do |
---|---|
__init__ |
Create a variable that hold and initializes the Pathfinder class. |
act |
Update the interface, for reverse_path . |
can_place_bomb |
Update the interface for the checks. |
move_to_safe_location |
Use the new pathfinder interface (make the correct calls) |
find_reachable_safe_locations |
Merges the dangermap with the original map and can reuse the old bfs. |
in_bounds check_direction reverse_path create_path explodable_neighbors |
Remove (that should be easy |
A useful comment has been added to include the explicit assumption that is made for the explodable
items. When you add comments in-line, make sure that they tell you something that cannot be directly seen from the code (such as out of the box assumptions).
def __init__(self, character=characters.Bomber):
super().__init__(character)
self.pathfinder = Pathfinder(width=8, height=8)
self.queue = []
# We assume we are always Agent0 for the explodable items!
self.explodable = [Item.Agent1, Item.Agent2, Item.Agent3, Item.Wood]
self.passable = [Item.Passage, Item.Kick, Item.ExtraBomb, Item.IncrRange]
Update
for direction in self.create_path(board, my_location, goal_location):
self.queue.append(direction.action)
to
came_from, came_from_direction = self.pathfinder.bfs(board, my_location, goal_location, self.passable)
for direction in self.pathfinder.reverse_path(came_from, came_from_direction, goal_location):
self.queue.append(direction.action)
def can_place_bomb(self, board: np.ndarray, bomb_life: np.ndarray, ammo: int, my_location: tuple) -> bool:
""" Checks if you can place a bomb, if there is no bomb already placed and you have enough ammo return True. """
row, col = my_location
return bomb_life[my_location] == 0 and ammo > 0 \
and self.pathfinder.explodable_neighbors(board, row, col, self.explodable)
This code only checks if the end position is on a dangerous place, in the next lesson we will upgrade this to take into account the full path.
def find_reachable_safe_locations(self, board: np.ndarray, danger_map: np.ndarray, location: tuple) -> list:
came_from, came_from_direction = self.pathfinder.bfs(
board=board, start=location, end=(-1, -1), passable=self.passable)
safe_positions = [position for position in came_from if danger_map[position] != 1]
return safe_positions
Update
for safe_location in fully_safe_locations:
if self.explodable_neighbors(obs['board'], safe_location):
return safe_location
to
for row, col in fully_safe_locations:
if self.pathfinder.explodable_neighbors(obs['board'], row, col, self.explodable):
return (row, col)
Just another meme to help you on the way
When everything is checked it is time to merge the new branch lesson-7
, to the old master branch.
lesson-7
branch.git checkout master
(or main
).git merge lesson-7
, this will merge the branch lesson-7
onto the current branch (master
or main
).If all went well you are done (Yeah)! If not we probably created some conflicts, which have to be resolved before merging. This usually happens when you changed things in both branches (lesson-7
and master
/ main
). To read more about merging and how to resolve possible conflicts, checkout the docs from git about merging, or contact the education committee via the Slack channel ec-helpme, or via email education@serpentineai.nl.
As final part, you can create a new branch for the next lesson using git checkout -b lesson-8
.
This lesson splits a lot of our old code into a new file. This makes it more clear what the main program is doing and let all the other files figure out how to do it. Breaking things up into smaller pieces, which each have a part of the total responsibility is part of the Object Oriented Programming style or OOP in short.
As with everything you have to find the right balance, between splitting things up into smaller classes, methods or functions, and maintaining easily readable programs. In the next lesson we are going to suggest some upgrades such as looking at actively picking up upgrades and improving path picking.