2024-09-28
Is the game "Black Myth: Wukong" really just about one monster?
I'll admit it: when a friend hit me with that question, for a moment I didn't know what to say.
It took me less than half a day to get from realizing Yang Jian had to be dealt with to the death of the Tiger Vanguard. But if we want to clear "Black Myth" outright, can we count on AI?
Roll to dodge, keep your distance, and read the monster's every move.
When the moment came, the Destined One suddenly swung a heavy blow with his staff.
With the power of AI, a smooth combo landed and the boss went down without a chance to fight back; who knows how many gamers would weep at the sight.
The Alibaba research team recently proposed the VARP agent framework, and this AI-driven player is their creation.
You could say it isn't a cheat, yet it works better than one.
The GPTs that faced the Great Sage really were no worse than human players.
When AI faces the Great Sage, things are actually not that complicated.
Traditionally, game AI relies on a game's APIs to obtain environment information and executable actions. The problem is that not every game offers an open API, and even when one exists, it is often incomplete and falls short of real needs.
Moreover, traditional methods always seem to fall short of fully reproducing the real gaming experience of a human player.
Against this backdrop, the Alibaba research team proposed a new agent framework: VARP (Vision Action Role-Playing).
After receiving input game screenshots, the VARP agent framework runs inference with a set of VLMs and ultimately generates Python code that controls the game character: a combination of atomic commands such as light attack, dodge, heavy attack, health recovery, and so on.
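To make that concrete, here is a minimal, hypothetical sketch of what such generated control code could look like. The press() helper, the key bindings, and the function names are invented for illustration; they are not the paper's actual API.

```python
import time

def press(key: str, duration: float = 0.05):
    """Stub for a low-level input helper; a real agent would send key/mouse events."""
    print(f"press {key} for {duration:.2f}s")
    time.sleep(duration)

# Atomic commands the generated code can combine.
def light_attack():   press("mouse_left")
def heavy_attack():   press("mouse_right", 1.2)   # hold to charge
def dodge():          press("space")
def restore_health(): press("r")                  # drink the gourd

def generated_combo():
    """One example of a generated action: dodge an opening swing, then counter."""
    dodge()
    time.sleep(0.3)
    light_attack()
    light_attack()
    heavy_attack()

generated_combo()
```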
VARP contains three knowledge bases (a situation library, an action library, and a human guidance library) and two systems (an action planning system and a human-guided trajectory system).
Simply put, the action planning system acts like a librarian, responsible for finding the most appropriate material in the situation library and the updatable action library.
Based on the input game screenshots, the system selects or generates actions that fit the current situation, and those actions and situations are then stored in, or used to update, the two libraries.
The human-guided trajectory system uses datasets of human gameplay to improve VARP's performance on complex tasks, such as pathfinding and difficult fights.
In the action library, "def new_func_a()" denotes a new action generated by the action planning system, "def new_func_h()" denotes a new action generated by the human-guided trajectory system, and "def pre_func()" denotes a predefined action.
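Here is a rough sketch (our illustration, not the paper's code) of how those three kinds of entries might coexist in an action library; only the naming convention pre_func / new_func_a / new_func_h comes from the description above.

```python
def pre_func_dodge():
    """Predefined action shipped with the agent."""
    print("dodge the incoming swing")

def new_func_a_kite_the_boss():
    """Action generated by the action planning system from game screenshots."""
    print("keep distance, wait for the lunge, then counter")

def new_func_h_walk_to_boss():
    """Action derived by the human-guided trajectory system from human play data."""
    print("follow the recorded human path to the boss arena")

ACTION_LIBRARY = {
    "predefined":   [pre_func_dodge],
    "planned":      [new_func_a_kite_the_boss],
    "human_guided": [new_func_h_walk_to_boss],
}

# The planning system would pick the entry that best matches the current
# situation and append any newly generated function back into the library.
for source, actions in ACTION_LIBRARY.items():
    for act in actions:
        print(source, "->", act.__name__)
```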
For "Black Myth: Wukong", the research team set up 12 tasks, 75% of which involve combat, and ran benchmarks with VLMs including GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro.
The results show that VARP achieves a win rate of up to 90% on basic missions and easy-to-medium difficulty fights. On difficult tasks, however, its performance drops sharply, and its overall level still falls short of human players.
In addition, when the VARP agent makes decisions in the game, it cannot analyze every game frame (i.e., every rendered screen) in real time, because it is limited by the inference speed of the vision language models (VLMs).
In other words, it cannot react almost instantly to everything happening on screen the way a human player can. Instead, it processes the game footage only every few seconds, selecting a handful of important frames (keyframes) for analysis and decision-making.
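A simplified sketch of such a keyframe-driven decision loop is below: because the VLM call is slow, the agent samples one screenshot every few seconds instead of reacting to every frame. All helpers here are placeholders, not the actual VARP implementation.

```python
import time

def capture_keyframe():
    """Placeholder: grab the current game screenshot."""
    return "screenshot"

def vlm_decide(keyframe):
    """Placeholder for a slow VLM inference call that returns an action name."""
    time.sleep(1.0)                      # inference latency dominates the loop
    return "dodge_then_counter"

def agent_loop(steps: int = 3, interval: float = 3.0):
    for _ in range(steps):
        start = time.time()
        action = vlm_decide(capture_keyframe())   # frames in between are simply skipped
        print("execute:", action)
        time.sleep(max(0.0, interval - (time.time() - start)))

agent_loop()
```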
when "black myth: wukong" was launched, it was complained about the lack of a map and the existence of a large number of "air walls". therefore, without human assistance, the ai would be like a headless fly unable to find the boss.
generative ai has ignited the fire of world change. before it entered the public consciousness, ordinary people’s more intuitive bond with ai may have mostly originated from games.
in the history of video games, ai is far more important than we think
Many people might not have guessed that one of the first games to board the AI train was the classic arcade game "Pac-Man."
To win, the player must eat every dot in the maze, while the seemingly silly colorful ghosts each follow a different pursuit algorithm, chasing the player along different paths and in different ways.
Each ghost's algorithmic behavior is extremely simple and has no learning ability; once the player figures out the rules, the game's difficulty plummets.
"metal gear solid" launched in 1987 marked another important milestone in game ai.
the ai characters in the game began to exhibit more complex behavior patterns, and for the first time a hostile response mechanism to players was introduced. if the player is spotted by the enemy, the enemy will trigger the alarm system, call for reinforcements, change patrol routes, and even set traps.
later, if the development process of ai and games is briefly listed in a series of landmark events, it is roughly as follows:
In 1997, IBM's Deep Blue defeated the human world champion at chess, a major breakthrough for AI in board games.
In 2004, "Half-Life 2" was released; its AI characters could make more complex decisions and interactions, improving the game's immersion.
In 2011, IBM's Watson defeated the human champions on the quiz show "Jeopardy!", demonstrating AI's progress in natural language processing and knowledge reasoning.
In 2016, AlphaGo defeated Lee Sedol at Go, a major breakthrough for AI in complex strategy games.
In 2018, "Red Dead Redemption 2" was released; the interaction between its AI characters and the environment improved dramatically, delivering a highly realistic gaming experience.
In 2020, NVIDIA launched DLSS, which uses AI to accelerate graphics rendering and improve both game performance and image quality.
Looking at today's gaming landscape, games are still fundamentally about companionship, and AI acts like an amplifier, magnifying that companionship many times over.
At this year's CES, NVIDIA used its Avatar Cloud Engine (ACE) to bring game NPCs to "life," and the demo made waves across the industry.
In the demo, called Kairos, players could chat with Jin, the owner of a ramen shop. Although Jin is just an NPC, with the help of generative AI he answers questions like a real person.
The combination of AI and games has always been a love-hate affair.
Take competitive games. In the past, the approach was simply to crank the difficulty up or down; now it is to imitate human play so the experience feels more realistic.
Supporters argue that when human-like AI steps in as an opponent or teammate where real players are scarce, it can actually preserve the game's competitive feel.
But that is also the downside: retention improves, yet under the system's control players can't escape the whirlpool of being played by the AI.
Trash talk in the early game, rambling in the mid game, silence in the late game.
When we stay up all night just to win one more match, it's hard to tell whether we are playing the game or the game is playing us.
Especially when you realize your teammates may be AI, the powerlessness feels like punching cotton: soft, with nothing to push against.
Jensen Huang the prophet: will future games be generated by AI?
Even if you're a coding novice, you can get AI to play games for you.
A few years ago this would have been a pipe dream, but the arrival of generative AI has made it all possible.
On a small scale, that means building a custom GPT to run a storytelling game; on a larger scale, AI-assisted mini-program games, whose interactivity is nothing special but whose visuals are genuinely polished.
Going a step further, even AAA-level titles may one day be generated directly by AI rather than rendered.
Last year, NVIDIA founder Jensen Huang predicted that every pixel in future games will be generated, not rendered. When he said it, plenty of people were skeptical.
Typically, building the environment for a small game takes about a week, and longer for a studio project, depending on the complexity of the design.
Last month, Google DeepMind announced its first "AI game engine," GameNGen.
It can simulate the classic shooter "DOOM" in real time at over 20 frames per second on a single TPU chip.
It works by using a diffusion model to predict each frame in real time, meaning every moment of the game is generated on the fly from the interplay between the player's actions and the environment.
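In the spirit of that description, here is a conceptual sketch of an autoregressive "neural game engine" loop: each new frame is sampled from a generative model conditioned on recent frames plus the player's latest input. The model and helpers are placeholders, not GameNGen's actual code.

```python
def read_player_action():
    return "strafe_left"                          # placeholder input polling

def sample_next_frame(past_frames, action):
    """Placeholder for the diffusion model's denoising/sampling step."""
    return f"frame after '{action}' given {len(past_frames)} context frames"

def run_engine(num_frames: int = 5, context: int = 4):
    frames = ["initial frame"]
    for _ in range(num_frames):
        action = read_player_action()
        # The newly generated frame immediately becomes context for the next step.
        frames.append(sample_next_frame(frames[-context:], action))
    return frames

for f in run_engine():
    print(f)
```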
At the time, NVIDIA senior scientist Jim Fan couldn't help marveling that "DOOM," a game hackers have run on just about everything, was now being realized inside a pure diffusion model, with every pixel generated.
Since then, more results in the same vein have kept appearing.
Not long ago, Tencent made its own big move, releasing a large model aimed specifically at AAA open-world games: GameGen-O.
GameGen-O can simulate characters, dynamic environments, and complex actions from a range of AAA games, such as "The Witcher 3," "Cyberpunk 2077," "Assassin's Creed," and "Black Myth: Wukong," and the quality of the generated game scenes is remarkably high.
To build the dataset, Tencent spared no expense, collecting more than 32,000 game videos ranging from a few minutes to several hours each, then selecting 15,000 usable ones through manual annotation.
The curated videos are cut into clips via scene detection and then rigorously sorted and filtered based on aesthetics, optical-flow analysis, and semantic content.
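A rough sketch of that kind of curation pipeline is below: cut raw videos into clips by scene detection, then keep only the clips that pass aesthetic, motion, and semantic checks. The scoring functions and thresholds are invented for illustration; they are not Tencent's actual pipeline.

```python
def detect_scenes(video):
    """Placeholder: split a raw video into scene-level clips."""
    return [f"{video}_clip{i}" for i in range(3)]

def aesthetic_score(clip):    return 0.8   # placeholder model output
def optical_flow_score(clip): return 0.6   # placeholder motion estimate
def is_gameplay(clip):        return True  # placeholder semantic check

def curate(videos, min_aesthetic=0.5, min_flow=0.3):
    kept = []
    for video in videos:
        for clip in detect_scenes(video):
            if (aesthetic_score(clip) >= min_aesthetic
                    and optical_flow_score(clip) >= min_flow
                    and is_gameplay(clip)):
                kept.append(clip)
    return kept

print(curate(["witcher3_raw", "cyberpunk_raw"]))
```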
Electronic Arts, the American game developer, also recently showed the industry its vision of AI in game development through a video titled "From Imagination to Creation."
In the video, players create game scenes, characters, and other content by giving AI tools simple instructions.
CEO Andrew Wilson said that generative AI could improve more than half of the company's development processes and, within three to five years, is expected to help design and create larger, more immersive game worlds.
AI can not only make the development of existing games more efficient, but also create entirely new gaming experiences.
You might say that no matter how advanced the technology behind a game, in the end fun is king.
But with GTA 6 delayed again and again and still nowhere in sight, we might well get the urge to roll up our sleeves and do it ourselves.
After all, it would feel pretty good to build a "Vice City" of your own.