Neural networks have spatial awareness too: a model learns to build maps in Minecraft, with results published in a Nature journal

2024-07-23

Machine Heart Report

Synced Editorial Department

This is the first time that a neural network has been shown to create its own maps.

Imagine that you are in a strange town. Even if the surroundings are unfamiliar at first, you can explore and eventually draw a map of the environment in your brain, which contains the location of buildings, streets, signs, etc. in relation to each other. This ability to build spatial maps in the brain is the basis for higher-level human cognition: for example, it is theorized that language is encoded by map-like structures in the brain.

However, even the most advanced artificial intelligence and neural networks cannot construct such a map out of thin air.

“There’s a sense that even the most advanced AI models aren’t really intelligent,” said Matt Thomson, assistant professor of computational biology and a researcher at Heritage Medical Research Institute. “They can’t solve problems the way we can; they can’t prove unproven mathematical results; they can’t generate new ideas.”

“We think it’s because they can’t navigate in conceptual space; solving complex problems is like moving in conceptual space, like navigation. AI does it more like rote learning — you give it an input, it gives you a response. But it can’t synthesize different ideas.”

A new paper from Thomson’s lab, published July 18 in the journal Nature Machine Intelligence, found that neural networks can build spatial maps using an algorithm called predictive coding.



  • Paper: https://www.nature.com/articles/s42256-024-00863-1
  • Code: https://github.com/jgornet/predictive-coding-recovers-maps

In work led by graduate student James Gornet, the two researchers built an environment in the game Minecraft that incorporates complex elements such as trees, rivers, and caves. They recorded videos of a player randomly traversing the area and used the videos to train a neural network equipped with a predictive coding algorithm.

They found that the neural network was able to learn how objects in a Minecraft world are organized relative to each other and to "predict" the environment it would encounter as it moved through space.



The combination of the predictive coding algorithm and the Minecraft game successfully "taught" the neural network how to create spatial maps and then use these maps to predict subsequent frames of the video. The result was a mean squared error of only 0.094% between the predicted images and the actual images.

More importantly, the team “opened up” the neural network (the equivalent of examining its internal structure) and found that representations of various objects were stored spatially relative to each other. In other words, they saw the map of the Minecraft environment stored in the neural network.

Neural networks can navigate maps provided to them by human designers, such as self-driving cars using GPS, but this is the first time that a neural network has been shown to create its own maps. This ability to store and organize information spatially will ultimately help neural networks become more “intelligent,” enabling them to solve truly complex problems like humans can.

The project demonstrates genuine spatial awareness in AI, something not yet seen in technologies such as OpenAI’s Sora, which still produces some strange glitches.

James Gornet is a student at Caltech in the Computation and Neural Systems (CNS) department, which covers neuroscience, machine learning, mathematics, statistics, and biology.

“The CNS program really provides a place for James to do unique work that wouldn’t be possible elsewhere,” Thomson says. “We are taking a biologically inspired machine learning approach that allows us to reverse engineer the properties of the brain in artificial neural networks, and we hope to learn about the brain in reverse. At Caltech, we have a community that is very receptive to this type of work.”

Neural network performing predictive coding

Motivated by the implicit spatial representation in the predictive coding inference problem, the authors developed a computational implementation of a predictive coding agent and studied the spatial representations the agent learned while exploring a virtual environment.

They first created an environment using the Malmo platform for Minecraft. The physical environment measures 40 × 65 grid units and includes three features of the visual scene: a cave provides a global visual landmark, a forest makes many visual scenes look alike, and a river with a bridge constrains how the agent can traverse the environment (Figure 1a).



The agent follows paths computed by A* search, which finds the shortest route between randomly sampled locations, and receives visual images along each path.
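The path-planning step can be illustrated with a small sketch. The code below is not the authors' implementation: it is a generic A* search on a 4-connected grid, and the function name `astar`, the toy 5 × 5 layout, and the Manhattan-distance heuristic are all assumptions made for illustration.

```python
import heapq


def astar(grid, start, goal):
    """Return the shortest 4-connected path from start to goal, or None.

    grid[r][c] is True where the agent may walk (hypothetical layout).
    """
    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start)]  # entries are (g + h, g, cell)
    came_from = {start: None}
    best_g = {start: 0}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale queue entry, a cheaper route was found already
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc]:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    came_from[(nr, nc)] = cell
                    heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
    return None


# Toy 5 x 5 map: row 2 is a "river" that can only be crossed at column 3.
grid = [[True] * 5 for _ in range(5)]
for c in range(5):
    grid[2][c] = (c == 3)

path = astar(grid, (0, 0), (4, 0))
```

In this toy map the only way across the river row is the single bridge cell, so every route between the two banks is forced through it, which mirrors how the river and bridge constrain the agent's movement in the environment described above.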

To perform predictive coding, the authors built an encoder-decoder convolutional neural network with a ResNet-18 architecture for the encoder and a ResNet-18 architecture with transposed convolutions for the decoder (Figure 1b). The encoder-decoder uses a U-Net architecture to pass the encoded latent units to the decoder. Multi-head attention processes a sequence of encoded latent units to encode the past history of visual observations. The multi-head attention has h = 8 heads. For an encoded latent unit of dimension D = C × H × W, with height H, width W, and channels C, the dimension of a single head is d = (C × H × W)/h.
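The head-dimension formula is simple arithmetic, sketched below. The helper `head_dim` and the latent shape used in the example are hypothetical, since the article does not state the values of C, H, and W; only h = 8 comes from the text.

```python
def head_dim(channels, height, width, num_heads=8):
    """Per-head dimension d = (C * H * W) / h for a flattened latent unit."""
    total = channels * height * width
    if total % num_heads != 0:
        raise ValueError("C * H * W must be divisible by the number of heads")
    return total // num_heads


# Hypothetical latent shape chosen only for illustration.
d = head_dim(channels=128, height=4, width=4)  # 128 * 4 * 4 = 2048, and 2048 / 8 = 256
```

The divisibility check reflects a general constraint of multi-head attention: the flattened latent has to split evenly across the heads.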



The predictive encoder approximates predictive coding by minimizing the mean squared error between the actual observations and the predicted observations. It was trained for 200 epochs on 82,630 samples using gradient descent with Nesterov momentum, a weight decay of 5 × 10^(-6), and a learning rate of 10^(-1) adjusted by a OneCycle learning rate schedule. The optimized predictive encoder achieved a mean squared error of 0.094 between the predicted and actual images, with good visual fidelity (Figure 1c).
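As a rough illustration of this training objective, the sketch below minimizes a mean squared error with a Nesterov-momentum update on a toy one-parameter model. It is not the authors' training code: the helper names, the toy data, and the momentum value of 0.9 are assumptions, the update follows one common formulation of Nesterov momentum, and the OneCycle schedule is left out so the learning rate stays fixed at 0.1.

```python
def mse(pred, target):
    """Mean squared error between two equally sized lists of pixel values."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def nesterov_step(w, v, grad, lr=0.1, momentum=0.9, weight_decay=5e-6):
    """One update with Nesterov momentum and L2 weight decay.

    lr and weight_decay match the article; momentum=0.9 is an assumption.
    """
    g = grad + weight_decay * w       # add the weight-decay term to the gradient
    v = momentum * v + g              # update the velocity buffer
    w = w - lr * (g + momentum * v)   # look-ahead (Nesterov) parameter update
    return w, v


# Toy stand-in for "predict the next frame": pred = w * x, true scale is 0.8.
x = [0.0, 0.25, 0.5, 0.75, 1.0]
y = [0.8 * xi for xi in x]

w, v = 0.0, 0.0
initial_loss = mse([w * xi for xi in x], y)
for _ in range(200):
    # Analytic gradient of the MSE with respect to w.
    grad = sum(2 * (w * xi - yi) * xi for xi, yi in zip(x, y)) / len(x)
    w, v = nesterov_step(w, v, grad)
final_loss = mse([w * xi for xi in x], y)
```

In the real model the single weight is replaced by the parameters of the encoder-decoder network and the lists by image tensors, but the loss being minimized has the same form.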



See the original paper for more details.

https://techxplore.com/news/2024-07-neural-network-minecraft.html

https://www.tomshardware.com/tech-industry/artificial-intelligence/neural-network-learns-to-make-maps-with-minecraft-code-available-on-github