In all programming languages you are usually free to write however you want, which is great for putting an idea into practice and testing it out. However, it also allows for messy code; in a large or complicated project this does not scale well, resulting in bottlenecks that can shut down your entire project. One of the essential skills for mastering programming in complex systems, such as artificial intelligence algorithms, is the proper use of classes.
Here we would like to teach you our approach for implementing autonomous agents. For this we use abstract classes. The theory is relatively similar in most programming languages, but the syntax might differ greatly.
We are going to give you an abstract class (a blueprint) that you need to fill out. Then we are going to take your implementation and test it in an already existing framework.
Since we do not want to change the framework, your implementation has to match our description exactly.
Click on the next tab to see the test framework; the explanation section contains a blueprint of the BaseAgent.
import gym

# Here we need to import your agent
agent = BaseAgent()

# The gym environment
env = gym.make('CartPole-v0')

# The training loop
for _ in range(10_000):
    state = env.reset()
    done = False
    score = 0

    while not done:
        # Here your agent needs to return an action
        action = agent.action(state)
        state, reward, done, info = env.step(action)
        score += reward

    # This is where you can train your agent given the score
    agent.train(score)

# Now we are testing your loading and saving
agent.save()

# We remove the object just to be sure
del agent

# We recreate your agent and load it
agent = BaseAgent()
agent.load()

# This is the testing loop, try and get a high score!
score = 0
for _ in range(100):
    state = env.reset()
    done = False
    while not done:
        action = agent.action(state)
        state, reward, done, info = env.step(action)
        score += reward

print("Final score:", score / 100)
Unfortunately this is not yet implemented. If you have some knowledge of Java and would like to implement this, please contact the education commission or the board.
At Serpentine we use the OpenAI Gym environment a lot, so we are going to work through a small example using the default agent layout.
import abc
from abc import ABC


class AbstractAgent(ABC):
    """ Abstract classes inherit from the ABC base class. """

    def __init__(self, *args, **kwargs):
        """ Initialization method for the agent. """
        pass

    def __str__(self):
        """ Internal python method that converts classes to strings. """
        pass

    def __repr__(self):
        """ Echoed when it is evaluated as an object. """
        pass

    @abc.abstractmethod
    def train(self, score):
        """ Abstract method that trains the agent. """
        pass

    @abc.abstractmethod
    def action(self, obs):
        """ Abstract method that returns an action. The return type is not specified yet. """
        pass

    @abc.abstractmethod
    def load(self):
        """ Abstract method that loads an internal state. """
        pass

    @abc.abstractmethod
    def save(self):
        """ Abstract method that saves an internal state. """
        pass
Trying to instantiate the abstract class directly will raise an error, as the small sketch below shows. To actually use it we have to inherit from it, in other words create a new class that uses the abstract class as a blueprint. Note that we raise a NotImplementedError to indicate that we are calling a method that is not yet ready.
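For illustration, here is a minimal sketch of that error (the exact message depends on your Python version):

try:
    agent = AbstractAgent()
except TypeError as err:
    # Something like: "Can't instantiate abstract class AbstractAgent with abstract methods ..."
    print(err)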
class BaseAgent(AbstractAgent):
    def __init__(self, *args, **kwargs):
        pass

    def __str__(self):
        return self.__repr__()

    def __repr__(self):
        return f"<class {self.__class__.__name__}>"

    def train(self, score):
        raise NotImplementedError

    def action(self, obs):
        raise NotImplementedError

    def load(self):
        raise NotImplementedError

    def save(self):
        raise NotImplementedError
Unfortunately this is not yet implemented. If you have some knowledge of Java and would like to implement this, please contact the education commission or the board.
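A quick check of the skeleton's behaviour (a hypothetical snippet, not part of the framework): because BaseAgent overrides every abstract method, it can be instantiated, but calling any of the unimplemented methods raises the NotImplementedError mentioned above.

agent = BaseAgent()
print(agent)            # <class BaseAgent>
try:
    agent.action(None)  # the skeleton method has no real implementation yet
except NotImplementedError:
    print("action() still needs an implementation")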
Create a (self-learning) agent that is able to play CartPole; bonus points if you can beat it. You should use the skeleton code provided in Problem and Explanation. If you want to have your agent displayed at the bottom of this page, you can send it to education@serpentineai.nl. The high score gets bragging rights.
Created by: Thymen Rijpkema
Edited by: Dik van Genuchten
Implementations by: Thymen
For the best training practice, first try it yourself before checking out the submitted solutions.
By Thymen: Python (score=96)
This implements a parameter-based agent that guesses random parameters until it finds a good set; the score is capped at 200. Since these parameters only work for some of the starting positions, it is not an implementation that fully beats CartPole.
import numpy as np


class BaseAgent(AbstractAgent):
    def __init__(self, *args, **kwargs):
        self.param = np.random.rand(4)
        self.best_score = 0
        self.best_param = np.random.rand(4)

    def __str__(self):
        return f"BaseAgent (param={self.param}, best_param={self.best_param}, best_score={self.best_score})"

    def __repr__(self):
        return f"<{self.__class__.__name__} (param={self.param})>"

    def train(self, score):
        # Keep the parameters if they improved the score, otherwise try a new random set
        if score > self.best_score:
            self.best_score = score
            self.best_param = self.param
        else:
            self.param = np.random.rand(4)

    def action(self, obs):
        # Push the cart left (0) or right (1) based on a weighted sum of the observation
        return 0 if np.sum(np.multiply(self.param, obs)) < 0 else 1

    def load(self):
        with open("best_param.npy", "rb") as file:
            self.param = np.load(file)

    def save(self):
        with open("best_param.npy", "wb") as file:
            np.save(file, self.best_param)
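As a quick sanity check (a hypothetical snippet, not part of the submitted solution), you can verify the save/load round-trip of this agent:

agent = BaseAgent()
agent.best_param = agent.param   # pretend the current parameters are the best found
agent.save()                     # writes best_param.npy to the working directory
restored = BaseAgent()
restored.load()                  # reads the saved parameters back into restored.param
assert np.allclose(agent.best_param, restored.param)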