Multi-Brain HyperNEAT Experiments in the Team Patrol domain
Multi-Brain HyperNEAT is a software framework that extends HyperNEAT
to evolve agents that possess multiple brains. Different brains can
be used in different situations, making it easy to evolve multimodal
behavior.
The team patrol domain is divided into two
tasks: advance, in which the three
robots spread out to the three
segments of the plus sign, and
return, in which the robots must
return to their original starting
positions.
The videos on this page can be viewed in a playlist here.
Team Patrol: One Module (1M)
Each robot in the team has a separate network controller,
but all are generated by the same one-module CPPN using
Multi-Agent HyperNEAT. An input signal in each controller network
indicates when the robots should return from their patrol.
All robots manage to reach their patrol way points, but one
of them gets stuck when the return signal activates.
Team Patrol: Situational Policy Geometry (SPG)
Use of situational policy geometry means that each of the three robots
can have two separate network brains. Each is created using a different
situation input in the CPPN genome. As a result, the controller networks
do not need a return signal input. Instead, at the point when the return
signal would be issued, a separate network is used, which allows all
robots to return home successfully.
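A minimal sketch of the idea, not the framework's actual code: the same CPPN is queried once per situation, with the situation value fed in as an extra input, and the resulting networks are swapped at the moment the return signal would have fired. The function `toy_cppn`, the substrate coordinates, and the encoding 0.0 = advance, 1.0 = return are all hypothetical stand-ins; the per-robot inputs used by Multi-Agent HyperNEAT are left out for brevity.

```python
import math

def toy_cppn(x1, y1, x2, y2, situation):
    """Hypothetical stand-in for an evolved CPPN.

    Maps source/target substrate coordinates plus a situation input
    to a single connection weight.
    """
    return math.sin(x1 + x2 + situation) * math.cos(y1 - y2)

def build_brain(cppn, sources, targets, situation):
    """Query the CPPN for every source-target pair under one fixed situation."""
    return {
        (s, t): cppn(s[0], s[1], t[0], t[1], situation)
        for s in sources
        for t in targets
    }

# Assumed substrate layout: three input nodes, two output nodes.
sources = [(-1.0, -1.0), (0.0, -1.0), (1.0, -1.0)]
targets = [(-1.0, 1.0), (1.0, 1.0)]

# Assumed situation encoding: 0.0 = advance, 1.0 = return.
brains = {
    "advance": build_brain(toy_cppn, sources, targets, 0.0),
    "return": build_brain(toy_cppn, sources, targets, 1.0),
}

def active_brain(brains, return_time_reached):
    """Swap whole networks instead of feeding a return signal into one network."""
    return brains["return" if return_time_reached else "advance"]
```

Because the switch happens outside the networks, neither brain needs a return-signal input of its own.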
Team Patrol: Two Modules with preference neurons (2M)
Each robot has two separate network controllers,
and decides which one to use at each moment based on
outputs of preference neurons within each controller.
One robot uses its green network (going to top), another
uses its red network (going right), and only one makes
use of both networks (going left). The robot on the
right gets stuck while returning.
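The arbitration step can be sketched as follows, under the assumption that the module whose preference neuron has the highest activation controls the robot on that time step. The function name `select_module` and the data layout are hypothetical, chosen only for illustration.

```python
def select_module(module_outputs):
    """Pick the module with the highest preference-neuron output.

    module_outputs: list of (policy_outputs, preference) pairs, one per module.
    Returns (module_index, policy_outputs). Ties go to the lowest index,
    which is an arbitrary choice made for this sketch.
    """
    best = max(range(len(module_outputs)), key=lambda i: module_outputs[i][1])
    return best, module_outputs[best][0]

# Example: two modules, each proposing motor outputs plus a preference value.
step = [([0.1, 0.9], 0.2), ([0.5, 0.5], 0.7)]
index, motors = select_module(step)
```

Recording `index` at every time step yields the kind of colored module timeline described for the videos.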
Team Patrol: Three Modules with preference neurons (3M)
Each of the three robots has three separate network controllers.
All use the blue and green modules, though one uses its blue module
so seldom and briefly that the color does not show up clearly on
the timeline. One robot also makes brief use of its red module.
Two of the three robots manage to return home once the return
signal activates.
Team Patrol: Module Mutation Duplicate (MM(D))
This application of module mutation duplicate results in each robot
having four network controllers, though each robot generally only
uses two modules. One robot uses the green module almost exclusively,
but thrashes back and forth between green and blue when turning the
corner on the way home. The other two robots use the red module to
stay in the dead ends while waiting for the return signal, but use
their primary module most of the time otherwise (green in one case,
and blue in another). Thrashing module use also occurs with these
robots, one of which gets stuck alternating back and forth between
red and blue modules before completely returning home.
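One way to put numbers on this kind of thrashing, assuming a per-step record of which module was active is available, is to count module switches and usage shares along the timeline. The helper `usage_summary` and the example timeline are hypothetical and are not taken from the experiments.

```python
from collections import Counter

def usage_summary(timeline):
    """Summarize a per-step module timeline.

    Returns (shares, switches): the fraction of steps spent in each module,
    and the number of steps on which the active module changed.
    """
    counts = Counter(timeline)
    total = len(timeline)
    shares = {module: count / total for module, count in counts.items()}
    switches = sum(1 for a, b in zip(timeline, timeline[1:]) if a != b)
    return shares, switches

# Made-up timeline: steady green use, then thrashing between green and blue.
timeline = ["green"] * 6 + ["blue", "green", "blue", "green"]
shares, switches = usage_summary(timeline)
```

A high switch count combined with a small usage share for the secondary module is one possible signature of thrashing, as opposed to distinct, sustained use of two modules.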
Team Patrol: Module Mutation Previous (MM(P))
Each robot of this team produced by module mutation previous
has many unused controller networks. In fact, there are even
"networks" that lack all links, or that do not connect many
of the inputs to the outputs of the robots, making them unusable.
However, there are viable controllers, and each team member
makes use of two. One robot makes distinct use of blue and
beige controllers, while the other two stick primarily to one
controller and use a second only while thrashing.
The robot that favors the blue module gets stuck and
fails to return home.
Team Patrol: Module Mutation Random (MM(R))
Each robot has five controllers produced by module mutation random,
though each only uses two modules in order to perfectly patrol the
maze and return home. All robots make exclusive use of a red module
in order to reach their individual way points. When the return signal
activates, all robots briefly make use of a green module to turn away
from the dead end. One of these robots continues to use the green module
to get it all the way back home, but the others only make thrashing
use of the green module after responding to the return signal.
Team Patrol: Multitask (MT)
These robots advance into the maze and return home
synchronously, each with the use of two separate
controllers produced by their multitask CPPN. Each
robot has one controller for advancing to its way point
that was produced by one set of CPPN outputs, while the
controllers used for returning home were produced by a
second set of CPPN outputs. The result is the perfect
behavior in the movie above.
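The multitask arrangement can be sketched in the same spirit: a single CPPN query returns one weight per task, and each task's controller is assembled from its own output. The function `toy_multitask_cppn`, the substrate coordinates, and the task names are hypothetical stand-ins rather than the framework's actual interface.

```python
import math

def toy_multitask_cppn(x1, y1, x2, y2):
    """Hypothetical stand-in for a multitask CPPN.

    Returns one connection weight per task: (advance_weight, return_weight).
    """
    base = x1 * x2 + y1 * y2
    return (math.tanh(base), math.tanh(-base + 0.5))

def build_task_brains(cppn, sources, targets, task_names):
    """Build one weight table per task from the CPPN's per-task outputs."""
    brains = {name: {} for name in task_names}
    for s in sources:
        for t in targets:
            outputs = cppn(s[0], s[1], t[0], t[1])
            for name, weight in zip(task_names, outputs):
                brains[name][(s, t)] = weight
    return brains

task_brains = build_task_brains(
    toy_multitask_cppn,
    sources=[(0.0, -1.0)],
    targets=[(0.0, 1.0)],
    task_names=["advance", "return"],
)
```

Unlike a situation input, which changes what the CPPN is asked, separate output sets let the two controllers share the CPPN's hidden structure while differing only in the final output nodes.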