MM-NEAT Applied to Partially Observable Ms. Pac-Man
This page presents research on a partially observable variant of Ms. Pac-Man done by undergraduate student Will Price as part of Southwestern University's summer research program.
Our entry into the Ms. Pac-Man Vs. Ghost Team Competition at the 2018 Computational Intelligence and Games conference won first place in the Ms. Pac-Man track of the competition under the name Squillyprice01.
This research extends Dr. Schrum's earlier work in the standard, fully observable version of Ms. Pac-Man and uses the MM-NEAT software package (an extension of NEAT).
Although the code for evolving these agents is in the MM-NEAT repository,
the source code for the competition entry is at this link.
The key enhancement needed in the partially observable domain is a set of models of the pills and ghosts. These models make predictions about unseen state, and the predictions are fed into the sensors of the neural network controllers.
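To make the idea of a ghost model concrete, here is a minimal sketch of one way such a belief could be tracked. It is not the code from the competition entry: the function `update_belief`, the graph representation of the maze, and the assumption that an unseen ghost moves to a uniformly random neighboring node are all hypothetical simplifications chosen for illustration.

```python
def update_belief(belief, neighbors, visible, observed_at=None):
    """Advance a ghost-location belief by one time step.

    belief:      dict mapping maze node -> probability the ghost is there
    neighbors:   dict mapping maze node -> list of adjacent nodes
    visible:     set of nodes Ms. Pac-Man can currently see
    observed_at: node where the ghost is seen this step, or None

    Assumption (not from the original page): an unseen ghost moves to a
    uniformly random neighboring node each step.
    """
    if observed_at is not None:
        # A direct sighting removes all uncertainty.
        return {observed_at: 1.0}

    # Spread each node's probability evenly over its neighbors.
    spread = {}
    for node, p in belief.items():
        options = neighbors[node]
        for nxt in options:
            spread[nxt] = spread.get(nxt, 0.0) + p / len(options)

    # The ghost was not seen, so it cannot be in any visible node.
    unseen = {n: p for n, p in spread.items() if n not in visible}
    total = sum(unseen.values())
    if total == 0.0:
        return {}  # the model has lost track of the ghost
    return {n: p / total for n, p in unseen.items()}


# A four-node corridor: 0 - 1 - 2 - 3
corridor = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
belief = update_belief({1: 1.0}, corridor, visible={0})
```

In this sketch the resulting probabilities could be exposed to the network as sensor values, for example as the believed probability of a ghost in each direction. How the actual entry encodes its predictions is not described here.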
The videos on this page can be viewed in a playlist here.
Three Module Multitask Beats All Levels
The left side shows the maze as experienced by Ms. Pac-Man, who cannot see ghosts in the gray areas (partial observability). However, the agent maintains a model of where it believes ghosts could be, indicated by red and blue squares. The agent is controlled by an evolved neural network with three output modules, each of which draws its movement path in a different color: one for when it senses no ghosts (green), one for when it senses any threats (red), and one for when it senses only edible ghosts (blue). The evaluation ends once every level has been beaten once. The final score of this agent is 16,120 points.
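The multitask scheme described above can be sketched as follows. This is an illustrative sketch only: the names `choose_module` and `select_action`, the flat output layout with four direction preferences per module, and the boolean flags are assumptions, not the MM-NEAT API.

```python
DIRECTIONS = ("up", "right", "down", "left")


def choose_module(threat_sensed, edible_sensed):
    """Pick the output module from a hand-designed task division.

    0: no ghosts sensed (green path)
    1: any threat ghost sensed (red path)
    2: only edible ghosts sensed (blue path)
    """
    if threat_sensed:
        return 1
    if edible_sensed:
        return 2
    return 0


def select_action(outputs, threat_sensed, edible_sensed):
    """Read the direction preferences from the active module only.

    outputs is assumed to hold len(DIRECTIONS) values per module,
    laid out module by module.
    """
    module = choose_module(threat_sensed, edible_sensed)
    start = module * len(DIRECTIONS)
    prefs = outputs[start:start + len(DIRECTIONS)]
    best = max(range(len(DIRECTIONS)), key=prefs.__getitem__)
    return module, DIRECTIONS[best]


# Hypothetical network outputs: three modules, four values each.
outputs = [0.1, 0.9, 0.0, 0.0,   # module 0
           0.0, 0.0, 0.8, 0.1,   # module 1
           0.7, 0.0, 0.0, 0.2]   # module 2
print(select_action(outputs, threat_sensed=True, edible_sensed=True))
```

The point of the design is that the module choice is fixed by the designer rather than learned, so evolution only has to find a good policy for each situation.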
One Module Network Reaches Second Level
This network uses only one output module and only reaches the second level (final score 7,750). This is a standard neural network that behaves relatively poorly. Some one-module networks perform reasonably well, though generally not as well as the three-module multitask network.
Two Module Network Reaches Fourth Level
This network performs much better than the one-module network above, reaching the fourth level with a score of 13,640. However, although the network has two modules that it can switch between using preference neurons, it has evolved to use only one module all of the time (this is why there is always a green trail behind Ms. Pac-Man). Strangely, preference neuron networks focus on only one module in the partially observable domain, in part because the sensors that are used (split sensors) make it easy to treat threat and edible ghosts differently without using different modules.
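Preference-neuron arbitration can be sketched as follows: each module has its own policy outputs plus one preference neuron, and the module whose preference neuron has the highest activation controls the agent on that time step. The function name `select_by_preference` and the output layout (policy values followed by the preference value, module by module) are assumptions made for this sketch.

```python
def select_by_preference(outputs, num_modules, policy_size=4):
    """Return (chosen module index, that module's policy outputs).

    Each module is assumed to occupy policy_size + 1 consecutive
    outputs, with its preference neuron last. Ties go to the
    lowest-numbered module.
    """
    width = policy_size + 1
    if len(outputs) != num_modules * width:
        raise ValueError("unexpected number of network outputs")
    modules = [outputs[i * width:(i + 1) * width] for i in range(num_modules)]
    chosen = max(range(num_modules), key=lambda i: modules[i][-1])
    return chosen, modules[chosen][:policy_size]


# Hypothetical outputs for a two-module network.
outputs = [0.2, 0.1, 0.0, 0.3, 0.9,    # module 0, preference 0.9
           0.5, 0.4, 0.1, 0.0, -0.2]   # module 1, preference -0.2
chosen, policy = select_by_preference(outputs, num_modules=2)
```

A network that has collapsed onto a single module, as in the video, corresponds to one preference neuron producing the highest value on every time step.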
Three Module Network Reaches Second Level
This network has three preference modules but, like the two-module network above, uses only one of them. Unlike that network, however, it performs very poorly, only reaching the second level with a score of 1,161. Some three-module networks perform much better than this, though all generally use only one module.
Module Mutation Duplicate Network Reaches Fourth Level
This network is another example of a preference neuron network doing well by using only one module. It reaches the fourth level with a score of 15,670. Module Mutation Duplicate is able to create and use new modules based on previous modules, but evolution simply focuses on one module instead.
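The idea of creating a new module based on a previous one can be illustrated with a heavily simplified sketch. It does not reproduce MM-NEAT's genome representation: here a module is just a dictionary of weight rows, the function `duplicate_module` is hypothetical, and giving the new module's preference neuron fresh random weights is an assumption made so that the copy could come to be chosen in different situations than its source.

```python
import copy
import random


def duplicate_module(modules, source_index, rng):
    """Return a new module list with a duplicate of one module appended.

    Each module is a dict with:
      "policy":     list of weight rows, one per policy output neuron
      "preference": weight row for the module's preference neuron

    The policy weights are copied, so the new module initially behaves
    like its source. The preference weights are re-randomized
    (an assumption of this sketch).
    """
    source = modules[source_index]
    new_module = {
        "policy": copy.deepcopy(source["policy"]),
        "preference": [rng.uniform(-1.0, 1.0) for _ in source["preference"]],
    }
    return modules + [new_module]


# One module with two policy neurons and three inputs (made-up weights).
original = [{"policy": [[0.5, -0.2, 0.1], [0.0, 0.3, -0.7]],
             "preference": [0.4, 0.4, 0.4]}]
mutated = duplicate_module(original, source_index=0, rng=random.Random(0))
```

Because the duplicate starts with the same policy as its source, adding it does not immediately change behavior, and evolution is free to either specialize it or, as seen in the video, ignore it.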