
Google's AI weather forecast is published in Nature: simulating 22 days of weather in 30 seconds, with efficiency increased by 100,000 times!

2024-07-23



New Intelligence Report

Editor: Editorial Department

【New Wisdom Introduction】Google has proposed NeuralGCM, a new ML-based general circulation model of the atmosphere. Compared with traditional physics-based models, it cuts computational cost by several orders of magnitude, up to 100,000 times, which is equivalent to 25 years of progress in high-performance computing. For 2-15 day weather forecasts, it is more accurate than state-of-the-art (SOTA) physical models.

Early this morning, Google CEO Pichai posted a message on X, announcing that NeuralGCM has made a major breakthrough in the field of climate modeling!

"NeuralGCM combines physics-based modeling with artificial intelligence to simulate the atmosphere 100,000 times more efficiently than other models, giving scientists a new tool to predict climate change."


The research was also published in Nature. Most of the team comes from Google Research and DeepMind, joined by scientists from MIT, Harvard and ECMWF.


Paper address: https://www.nature.com/articles/s41586-024-07744-y

The model developed by Google, called NeuralGCM, can simulate the Earth's atmosphere quickly, efficiently and accurately.

Its significance lies in helping scientists make accurate predictions about the Earth's climate at a time when the Earth is warming at an unprecedented rate.

Which regions will face prolonged droughts as global temperatures rise? Where will coastal flooding become more frequent due to large tropical storms? How will wildfire seasons change as temperatures rise?

Faced with these urgent questions, traditional physics-based general circulation models (GCMs) fall short: they lack sufficient stability in long-term weather and climate simulations.

NeuralGCM combines machine learning with traditional physical modeling, which greatly improves the accuracy and efficiency of simulation.

The 2-15 day weather forecasts generated by this approach are more accurate than current state-of-the-art physical models and reproduce temperatures over the past 40 years more accurately than conventional atmospheric models.

It marks an important step forward in the development of more powerful and accessible climate models.


NeuralGCM simulates the specific humidity pattern from December 26, 2019 to January 8, 2020

NeuralGCM revolutionizes climate modeling

Although traditional climate models have improved over the past few decades, they often contain errors and biases, both because scientists' understanding of how Earth's climate works is incomplete and because of how the models are built.

These models divide the globe into cubes with sides of 50-100 kilometers, stacked from the Earth's surface up into the atmosphere, and then predict how the weather in each cube changes over a period of time.

The model then calculates the movement of air and moisture based on the generally accepted laws of physics, which is the basis of weather forecasting.

But the problem is that the scale of 50-100 kilometers is too large.

Many important climate processes, including clouds and precipitation, vary on scales (millimeters to kilometers) far smaller than the cube size used by current models.

Moreover, scientists' physical understanding of some processes, such as cloud formation, is incomplete.

Therefore, these traditional models not only rely on first principles, but also use simplified models to generate approximations called "parameterizations" to simulate small-scale and poorly understood processes.

These simplifying approximations inevitably reduce the accuracy of physics-based climate models.

So, how does NeuralGCM solve this problem?

Like traditional models, NeuralGCM still divides Earth's atmosphere into cubes and calculates the physics of large-scale processes such as the movement of air and moisture.

The difference is that NeuralGCM no longer relies on "parameterized" approximations developed by scientists to simulate small-scale weather changes, but instead uses neural networks to learn the physics of these events from existing weather data.

A key innovation of NeuralGCM is that the numerical solver for large-scale processes was rewritten from scratch using JAX.

This enables the researchers to use gradient-based optimization to tune the “online” behavior of the coupled system over multiple time steps.

In contrast, previous attempts to use machine learning to enhance climate models have struggled with numerical stability because they used “offline” training, ignoring important feedbacks between small- and large-scale processes that accumulate over time.
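The idea of "online" training can be sketched with a toy scalar system. This is a plain-Python stand-in, not NeuralGCM code: the "physics" is a decay term with a deliberately wrong rate, the "learned" part is a single hypothetical parameter `theta`, and the gradient is estimated by central finite differences where JAX would use automatic differentiation. The point is only that the loss is measured after rolling the coupled system forward many steps, so errors that feed back over time are visible to the optimizer.

```python
# Toy illustration of "online" training: the loss is measured after rolling
# the coupled (physics + learned correction) system forward several steps.
# All names and numbers are hypothetical and greatly simplified.

DT = 0.1            # time step
TRUE_RATE = 1.0     # decay rate of the "real" system
MODEL_RATE = 0.6    # imperfect decay rate used by the physics solver


def step(x, theta):
    """One coupled step: physics tendency plus learned correction theta * x."""
    physics = -MODEL_RATE * x
    learned = theta * x
    return x + DT * (physics + learned)


def true_trajectory(x0, n_steps):
    """Reference data produced with the true decay rate."""
    xs, x = [], x0
    for _ in range(n_steps):
        x = x + DT * (-TRUE_RATE * x)
        xs.append(x)
    return xs


def rollout_loss(theta, x0, targets):
    """Mean squared error accumulated over a multi-step rollout."""
    x, total = x0, 0.0
    for target in targets:
        x = step(x, theta)          # errors feed back into later steps
        total += (x - target) ** 2
    return total / len(targets)


def train(x0=1.0, n_steps=20, lr=0.5, iters=500, eps=1e-6):
    """Gradient descent on the rollout loss with respect to theta."""
    targets = true_trajectory(x0, n_steps)
    theta = 0.0
    for _ in range(iters):
        # Central finite difference stands in for automatic differentiation.
        grad = (rollout_loss(theta + eps, x0, targets)
                - rollout_loss(theta - eps, x0, targets)) / (2 * eps)
        theta -= lr * grad
    return theta


theta = train()
print(f"learned correction: {theta:.3f} (ideal: {MODEL_RATE - TRUE_RATE:.3f})")
```

An "offline" variant would instead fit `theta` to single-step targets in isolation, without ever running the coupled rollout during training.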

Another benefit of writing the entire model in JAX is that it can run efficiently on TPUs and GPUs, whereas traditional climate models mostly run on CPUs.


NeuralGCM combines a traditional fluid dynamics solver with a neural network for small-scale physics. These components are combined through a differential equation solver to advance the system in time.
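The structure described here, a physics solver and a learned component whose contributions are combined and stepped forward in time, might look roughly like the following on a toy one-dimensional ring. This is a hedged sketch rather than the actual model: the upwind advection scheme, the tiny `tanh` stand-in for the neural network, the forward Euler step, and all function names are assumptions made for illustration. The source only says that a differential equation solver is used.

```python
import math


def dynamics_tendency(u, speed=1.0, dx=1.0):
    """Resolved large-scale physics: first-order upwind advection on a ring."""
    n = len(u)
    # u[i - 1] wraps around at i == 0, giving periodic boundaries.
    return [-speed * (u[i] - u[i - 1]) / dx for i in range(n)]


def learned_tendency(u, weights):
    """Stand-in for the neural network: a tiny per-cell function of the state."""
    w, b = weights
    return [math.tanh(w * x + b) for x in u]


def hybrid_step(u, weights, dt=0.1):
    """Advance one time step with forward Euler on the summed tendencies."""
    dyn = dynamics_tendency(u)
    nn = learned_tendency(u, weights)
    return [x + dt * (d + c) for x, d, c in zip(u, dyn, nn)]


state = [0.0, 1.0, 0.0, 0.0]
weights = (0.0, 0.0)      # zero weights: the learned term vanishes
for _ in range(3):
    state = hybrid_step(state, weights)
print(state)
```

With zero weights the learned term is exactly zero, so the run reduces to pure advection and the total amount of the advected quantity stays constant. Nonzero weights let the learned term add or remove it cell by cell.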

The Google team used ECMWF weather data from 1979 to 2019 to train a series of NeuralGCM models at 0.7°, 1.4°, and 2.8° resolutions.
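To relate these angular resolutions to the 50-100 kilometer cubes mentioned earlier, a rough conversion to distance at the equator can help. The sketch below assumes a spherical Earth with a mean radius of 6,371 km and treats the resolution as a simple longitude spacing. The model's actual grid is not described in this article, so the numbers are only indicative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximate, assumed here)


def grid_spacing_km(resolution_deg):
    """Approximate east-west grid spacing at the equator for a resolution in degrees."""
    return math.radians(resolution_deg) * EARTH_RADIUS_KM


for res in (0.7, 1.4, 2.8):
    print(f"{res} deg ~ {grid_spacing_km(res):.0f} km")
```

Under these assumptions, one degree is about 111 km, so the three resolutions span roughly 80 to 310 km at the equator.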

Although NeuralGCM was trained on weather forecasting, the team designed it as a general-purpose atmospheric model.

Accurate weather forecasts and climate predictions

Recent Earth-atmosphere machine learning (ML) models, including Google DeepMind’s GraphCast, have demonstrated revolutionary accuracy in weather forecasting.

To date, research on ML forecasting has focused primarily on short-term predictions, far short of the years to decades required for climate prediction.

Because multi-decade climate projections are difficult to verify reliably, the Google team evaluated NeuralGCM both on climate-scale predictions and as a weather model, using the established WeatherBench 2 benchmark.

The NeuralGCM deterministic model at 0.7° resolution is comparable to the current most advanced models in weather forecast accuracy for lead times of up to 5 days.

However, deterministic models lack the quantitative uncertainty required to produce useful forecasts over long lead times.

Forecast ensembles are generated from slightly different starting conditions to produce a range of equally likely weather conditions. The probabilistic weather forecasts produced by these ensembles are generally more accurate than deterministic forecasts.
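The ensemble idea can be illustrated with a toy chaotic system. The sketch below uses the logistic map as a stand-in for the atmosphere, which is purely an assumption for illustration, and perturbs the starting value slightly for each member. The function names, the noise level and the member count are all hypothetical.

```python
import random
import statistics


def logistic_step(x, r=3.9):
    """One step of the logistic map, chaotic for r = 3.9."""
    return r * x * (1.0 - x)


def forecast(x0, n_steps):
    """Deterministic forecast: iterate the toy model from one starting value."""
    x = x0
    for _ in range(n_steps):
        x = logistic_step(x)
    return x


def ensemble_forecast(x0, n_members=50, n_steps=30, noise=1e-4, seed=0):
    """Run the model from slightly perturbed starting conditions."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        # Clamp to [0, 1], the valid range of the logistic map.
        perturbed = min(max(x0 + rng.gauss(0.0, noise), 0.0), 1.0)
        members.append(forecast(perturbed, n_steps))
    return members


members = ensemble_forecast(0.3)
print("ensemble mean:  ", statistics.mean(members))
print("ensemble spread:", statistics.stdev(members))
```

The spread across members gives a rough measure of forecast uncertainty: it is tiny after one step and grows as small initial differences amplify, which is the information a single deterministic run cannot provide.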

The NeuralGCM ensemble model at 1.4° resolution outperforms the previous SOTA in terms of forecast accuracy from 5 to 15 days.

This is possible because NeuralGCM can generate ensemble weather forecasts comparable to those of ECMWF-ENS, ECMWF's physics-based SOTA ensemble model.

NeuralGCM is the first published ML model to do this.

In the 2 to 15 day forecast, the NeuralGCM ensemble forecast is more accurate than ECMWF-ENS 95% of the time.

NeuralGCM also outperforms state-of-the-art atmospheric models in terms of climate timescale predictions.

Because NeuralGCM only simulates the atmospheric component of Earth's climate, the Google team compared its performance with physically based atmospheric models.

When predicting temperatures between 1980 and 2020, the average error of NeuralGCM's 2.8° deterministic model is one-third that of the Atmospheric Model Intercomparison Project (AMIP) models: 0.25 vs. 0.75 degrees Celsius.


Comparison of the performance of NeuralGCM and AMIP in predicting the global mean temperature at 1000 hPa from 1980 to 2020

Because traditional atmospheric models have trouble simulating some aspects of Earth's atmosphere, climate scientists sometimes use higher-resolution models, such as X-SHiELD, which are more accurate but computationally expensive.

Compared to X-SHiELD, NeuralGCM’s 1.4° deterministic model reduced errors by 15-50% when predicting 2020 humidity and temperature data provided by the National Oceanic and Atmospheric Administration (NOAA).

During the 2020 climate simulation, NeuralGCM also predicted tropical cyclone patterns that matched the number and intensity of storms observed over the same regions that year.

NeuralGCM is the first machine learning-based model capable of generating such patterns.


NeuralGCM predicts tropical cyclone tracks around the globe for 2020 (the predicted number and intensity of storms match the actual number and intensity of cyclones recorded in the ECMWF Reanalysis v5 (ERA5) dataset)

Open, fast and efficient

NeuralGCM is several orders of magnitude faster and cheaper than traditional physics-based climate models.

Its 1.4° model is more than 3,500 times faster than X-SHiELD: simulating one year of the atmosphere would take X-SHiELD 20 days, but NeuralGCM only 8 minutes.

Moreover, scientists need only a computer with a single TPU (Tensor Processing Unit) to run NeuralGCM, whereas running X-SHiELD requires access to a supercomputer with 13,000 CPUs (Central Processing Units).

Overall, the computational cost of climate simulations using NeuralGCM is 100,000 times lower than that of X-SHiELD, equivalent to 25 years of progress in high-performance computing.
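One way to see how a 100,000-fold saving could correspond to roughly 25 years of progress is to assume that computing cost-efficiency doubles at a fixed pace. The 1.5-year doubling time used below is an assumption chosen for illustration, not a figure given in the article.

```python
import math


def years_of_progress(cost_ratio, doubling_time_years=1.5):
    """Years needed to gain `cost_ratio` if cost-efficiency doubles at a fixed pace."""
    doublings = math.log2(cost_ratio)   # number of doublings in the ratio
    return doublings * doubling_time_years


years = years_of_progress(100_000)
print(f"{years:.1f} years")
```

Under that assumption, 100,000 is about 16.6 doublings, which works out to just under 25 years.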

NeuralGCM simulates the atmosphere faster than state-of-the-art physical models while producing forecasts with the same accuracy

In this chart, NeuralGCM is compared with two physical models, NCAR CAM6 and NOAA X-SHiELD, on the number of simulated atmospheric days each can generate within 30 seconds of computation time.

The three models run at different resolutions, with X-SHiELD having the highest resolution (0.03°), NCAR CAM6 having a resolution of 1.0°, and NeuralGCM having the lowest resolution (1.4°).

It is worth mentioning that although NeuralGCM operates at a low resolution, its accuracy is comparable to that of high-resolution models.

So, at comparable accuracy, NeuralGCM can generate 22.8 days of atmospheric simulation in 30 seconds, while X-SHiELD, a high-resolution physical model that must run on a supercomputer, generates only about 9 minutes of simulation in the same time!
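These figures are consistent with the per-year runtimes quoted earlier (8 minutes for NeuralGCM, 20 days for X-SHiELD), as a short calculation shows. The sketch assumes a 365-day year and that throughput scales linearly with compute time. The variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365


def simulated_days(budget_s, wallclock_s_per_sim_year):
    """Simulated days produced within a compute-time budget given in seconds."""
    return DAYS_PER_YEAR * budget_s / wallclock_s_per_sim_year


neuralgcm_s = 8 * 60                    # 8 minutes per simulated year
xshield_s = 20 * SECONDS_PER_DAY        # 20 days per simulated year

ngcm_days = simulated_days(30, neuralgcm_s)
xshield_minutes = simulated_days(30, xshield_s) * 24 * 60
speedup = xshield_s / neuralgcm_s

print(f"NeuralGCM: {ngcm_days:.1f} simulated days in 30 s")
print(f"X-SHiELD:  {xshield_minutes:.1f} simulated minutes in 30 s")
print(f"speedup:   {speedup:.0f}x")
```

The ratio of the two runtimes is 3,600, in line with the "more than 3,500 times faster" claim.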

This also eliminates the advantage of NCAR CAM6, which was previously favored by researchers due to its low computing cost.

The Google team has made the source code and model weights of NeuralGCM publicly available on GitHub for non-commercial use. They hope that other researchers can easily add new components to test hypotheses and improve model capabilities.

Additionally, because NeuralGCM can be run on a laptop rather than requiring a supercomputer, more climate researchers will be able to use this state-of-the-art model in their work.

Conclusion and Future Directions

NeuralGCM currently models only Earth’s atmosphere, but the Google team hopes to eventually incorporate other aspects of Earth’s climate system, such as the oceans and the carbon cycle, into the model.

This would allow NeuralGCM to make predictions on longer time scales, going beyond weather over days and weeks to forecasts on climate time scales.

All in all, NeuralGCM proposes a new way to build climate models that may be faster, less computationally expensive, and more accurate than existing models.

References:

https://www.nature.com/articles/s41586-024-07744-y

https://research.google/blog/fast-accurate-climate-modeling-with-neuralgcm/

https://x.com/sundarpichai/status/1815512751793721649