AI and machine learning research is not confined to academia; video games have also become a benchmark for progress in artificial intelligence. The OpenAI team recently published a machine learning study in which an AI learned to play Minecraft largely on its own and managed to craft a diamond pickaxe, a task that takes a skilled human player about 20 minutes.
The research, called “Video PreTraining” (VPT), was published in a paper by nine engineers on the OpenAI team, including Bowen Baker. The team used VPT to train a neural network on a large volume of unlabeled “Minecraft” gameplay videos from the Internet, together with a small amount of labeled data, and then fine-tuned the model's behavior. The goal was for the AI to learn to craft a diamond pickaxe on its own.
Briefly, VPT first trains the model through behavioral cloning. An inverse dynamics model (IDM) trained on about 2,000 hours of labeled video is used to label about 70,000 hours of online video, and the model then learns to imitate the labeled actions; reinforcement learning comes in only at the later fine-tuning stage. In this way the model learns how to collect wood, turn logs into planks, and then make a crafting table. It also picks up actions that human players often perform in the game, including swimming, hunting animals, and even pillar jumping (stacking blocks vertically beneath the character).
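The labeling-then-imitation pipeline can be illustrated with a toy sketch. This is not OpenAI's code: the real system uses neural networks on raw video frames, while here frames are plain strings and both models are lookup tables. The function names (`train_idm`, `pseudo_label`, `behavior_clone`) and the sample data are hypothetical, chosen only to show how data flows from a small labeled set to a large unlabeled one.

```python
from collections import Counter, defaultdict

def train_idm(labeled_clips):
    """Toy inverse dynamics model: for each (frame_before, frame_after)
    transition, remember the action most often seen with it."""
    counts = defaultdict(Counter)
    for before, after, action in labeled_clips:
        counts[(before, after)][action] += 1
    return {t: c.most_common(1)[0][0] for t, c in counts.items()}

def pseudo_label(idm, video):
    """Attach an inferred action to each consecutive frame pair of an
    unlabeled video; transitions the IDM has never seen are skipped."""
    pairs = []
    for before, after in zip(video, video[1:]):
        action = idm.get((before, after))
        if action is not None:
            pairs.append((before, action))
    return pairs

def behavior_clone(pairs):
    """Toy behavioral cloning: for each observation, pick the action
    that appears most often in the pseudo-labeled data."""
    counts = defaultdict(Counter)
    for obs, action in pairs:
        counts[obs][action] += 1
    return {obs: c.most_common(1)[0][0] for obs, c in counts.items()}

# Small labeled set (stands in for the ~2,000 hours of labeled video).
labeled_clips = [
    ("tree", "log", "attack"),
    ("log", "planks", "craft_planks"),
    ("planks", "table", "craft_table"),
]
# Larger unlabeled set (stands in for the ~70,000 hours of online video).
unlabeled_videos = [
    ["tree", "log", "planks", "table"],
    ["tree", "log", "planks"],
]

idm = train_idm(labeled_clips)
pseudo_pairs = [p for v in unlabeled_videos for p in pseudo_label(idm, v)]
policy = behavior_clone(pseudo_pairs)
```

The point of the design is that the expensive, hand-labeled data is needed only to train the labeling model; the policy itself learns from the much larger pool of video that the labeling model annotates.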
According to OpenAI, crafting a diamond pickaxe on a randomly generated Minecraft map takes about 20 minutes even for a skilled veteran player, so the team carried out additional fine-tuning of the VPT model for this task.
In terms of process, the “Minecraft” player character must first collect wood to make a crafting table and sticks, then upgrade to stone tools to mine iron and build a furnace, and finally craft an iron pickaxe to mine diamonds and make the diamond pickaxe, a sequence that requires at least 24,000 actions.
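The crafting sequence above is a dependency chain, and a short sketch makes the ordering explicit. The recipe table below is a simplified assumption for illustration (it ignores item quantities and furnace fuel, for example) and is not taken from the paper. The action count is checked under the assumption that the agent issues 20 actions per second, which is what makes 20 minutes correspond to 24,000 actions.

```python
from graphlib import TopologicalSorter

# Simplified prerequisites: each item maps to what must exist first.
# Illustrative only; quantities and fuel are left out.
REQUIRES = {
    "wood": set(),
    "planks": {"wood"},
    "sticks": {"planks"},
    "crafting_table": {"planks"},
    "wooden_pickaxe": {"crafting_table", "planks", "sticks"},
    "cobblestone": {"wooden_pickaxe"},
    "stone_pickaxe": {"crafting_table", "cobblestone", "sticks"},
    "furnace": {"crafting_table", "cobblestone"},
    "iron_ore": {"stone_pickaxe"},
    "iron_ingot": {"iron_ore", "furnace"},
    "iron_pickaxe": {"crafting_table", "iron_ingot", "sticks"},
    "diamond": {"iron_pickaxe"},
    "diamond_pickaxe": {"crafting_table", "diamond", "sticks"},
}

# One valid order in which the items can be obtained
# (prerequisites always come before the items that need them).
order = list(TopologicalSorter(REQUIRES).static_order())

# Action budget: 20 minutes at an assumed 20 actions per second.
MINUTES = 20
ACTIONS_PER_SECOND = 20  # assumption, see lead-in
action_budget = MINUTES * 60 * ACTIONS_PER_SECOND

for step, item in enumerate(order, start=1):
    print(step, item)
print("action budget:", action_budget)
```

Several orderings are valid (sticks and the crafting table can be made in either order, for instance), but every one of them starts with wood and ends with the diamond pickaxe.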
In the end, after training, the VPT model learned to craft a diamond pickaxe efficiently, and its performance at collecting items matched or even exceeded that of many players.
The research team said it chose Minecraft for machine learning because it is one of the most actively played games in the world, with a large amount of freely available video data, and because it is an open-world sandbox crafting game. Playing it resembles real-world tasks such as operating a computer.
With these results, OpenAI believes that VPT can open up more possibilities in various fields, including learning to operate a keyboard and mouse. The team also believes that VPT has promising applications in other areas of computing.
In addition, they are also partnering with the MineRL 2021 Diamond Competition, encouraging participants to use VPT to solve tasks in the “Minecraft” game.