Videos have surfaced in the last week showing an AI's attempts to learn how to play Mario. His name is Rupert, and he just passed level 2.
In these videos, Rupert performs various actions: running, jumping, attacking enemies, falling off cliffs, and dying repeatedly. After each death, Rupert persists and tries again, largely repeating the moves that led to his previous loss. However, close observation reveals that Rupert is evolving and improving his in-game performance: he is learning.
Rupert works as a set of machine learning algorithms that improve by learning from their own mistakes, with a clear goal: successfully complete the level. The AI knows which keys it can press and can see the game screen.
Unlike a human Mario player, the AI has no prior knowledge of the need to avoid Koopas or of other game strategies. Rupert relies solely on positive and negative feedback. Essentially, Rupert experiments with random actions, keeping track of what works and what doesn't, and tweaking his strategy over time.
Rupert's approach resembles an evolutionary process, built on the concepts of "species" and "generations". The AI tests each "species" of strategy over two to six trials. After 50 to 100 "species", the AI combines the acquired knowledge into a new "generation".
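The species-and-generation structure can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Rupert's actual code: each species is reduced to a single number, `run_trial` is a hypothetical stand-in for playing the level, and the counts are simply picked from the ranges mentioned above.

```python
import random

TRIALS_PER_SPECIES = 4       # assumed; the article says two to six
SPECIES_PER_GENERATION = 50  # assumed; the article says 50 to 100


def run_trial(strategy, rng):
    """Hypothetical stand-in for one attempt at the level.

    Returns a made-up score: the strategy's value plus some noise.
    """
    return strategy + rng.uniform(-1.0, 1.0)


def run_generation(strategies, rng):
    """Test every species over several trials and return average scores."""
    results = []
    for strategy in strategies:
        scores = [run_trial(strategy, rng) for _ in range(TRIALS_PER_SPECIES)]
        results.append((strategy, sum(scores) / len(scores)))
    return results


rng = random.Random(0)  # fixed seed so the sketch is repeatable
species = [rng.uniform(0.0, 10.0) for _ in range(SPECIES_PER_GENERATION)]
results = run_generation(species, rng)
best_strategy, best_score = max(results, key=lambda pair: pair[1])
print(f"tested {len(results)} species, best average score {best_score:.2f}")
```

Averaging over several trials is one plausible reason to test each species more than once: a single lucky or unlucky run says little about how good a strategy really is.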
With every attempt, the AI earns a "fitness" score that increases based on how far Mario moves to the right and how fast he does it. The generations that show the greatest fitness are selected to "procreate" future generations. This means that the AI builds its performance on successful behaviors and patterns while starting from a renewed foundation. As a result, its decision-making becomes more complex and sophisticated over time.
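As a rough sketch of how a fitness score and selection could fit together, consider the Python below. Everything in it is an assumption made for illustration: the scoring formula in `fitness`, the list-of-numbers "genome", and the mutation-only breeding in `next_generation` (no crossover) are not taken from Rupert's actual implementation.

```python
import random


def fitness(distance_right, frames_used):
    """Assumed scoring rule: reward distance, penalize time taken."""
    return distance_right - 0.5 * frames_used


def next_generation(population, scores, rng, keep=2, mutation=0.1):
    """Keep the fittest genomes and refill with mutated copies of them."""
    ranked = sorted(zip(scores, population), key=lambda p: p[0], reverse=True)
    parents = [genome for _, genome in ranked[:keep]]
    children = list(parents)  # the best performers survive unchanged
    while len(children) < len(population):
        parent = rng.choice(parents)
        children.append([gene + rng.gauss(0.0, mutation) for gene in parent])
    return children


rng = random.Random(1)
population = [[rng.random() for _ in range(3)] for _ in range(6)]
# Made-up run results: (distance moved right, frames used) per genome.
runs = [(120, 100), (300, 200), (80, 40), (300, 150), (10, 5), (200, 300)]
scores = [fitness(d, f) for d, f in runs]
new_population = next_generation(population, scores, rng)
```

Here the two genomes that travelled 300 units score highest, and the faster of the two ranks first, which mirrors the idea that both distance and speed count.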
Although progress is gradual, Rupert managed to pass the first stage in 57 generations, which drew enthusiastic comments from viewers who celebrated his success.
However, Rupert might face challenges later in the game. The current reward system relies on Mario moving horizontally across the screen, but in certain Super Mario levels the player must move up, not to the right, to reach the goal. Rupert's performance in these situations remains to be seen.
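A small worked example shows why a purely horizontal reward could struggle on a vertical level. The function and the coordinates below are made up for illustration and are not taken from Rupert's code.

```python
def horizontal_fitness(start, end):
    """Assumed reward: only rightward progress counts."""
    return end[0] - start[0]


# Positions are (x, y) pairs with y growing upward; the numbers are made up.
flat_run = horizontal_fitness((0, 0), (500, 0))  # runs right along the ground
climb = horizontal_fitness((0, 0), (10, 500))    # climbs almost straight up

print(flat_run, climb)  # prints "500 10"
```

Under this assumed reward, the climb that would actually finish a vertical level earns almost nothing, so the selection process would have little reason to favor it.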