MSWEP V3: A new hourly, global precipitation dataset built with machine learning and gauge corrections
Researchers present MSWEP V3, a new global dataset of precipitation (rain and snow) that provides hourly estimates on a 0.1° grid from 1979 to the present. The product is updated in near real time with a latency of about two hours. The team describes MSWEP V3 as the first fully global, historical precipitation dataset built with machine learning at its core and then adjusted with gauge observations.
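To give a sense of scale, the short sketch below works out how many cells a global 0.1° grid contains and maps a latitude/longitude pair to a cell. The grid spacing comes from the description above. The grid origin (first row at 90°N, first column at 180°W) and the function name `cell_index` are assumptions made for illustration, not details taken from the dataset's documentation.

```python
import math

RES = 0.1                  # grid spacing in degrees, as stated in the article
N_LAT = round(180 / RES)   # 1800 rows from pole to pole
N_LON = round(360 / RES)   # 3600 columns around the globe

def cell_index(lat, lon):
    """Return (row, col) of the 0.1-degree cell containing (lat, lon).

    Assumption: row 0 starts at 90N and col 0 starts at 180W.
    The real files may use a different orientation.
    """
    row = min(int(math.floor((90.0 - lat) / RES)), N_LAT - 1)  # clamp the south pole
    col = int(math.floor((lon + 180.0) / RES)) % N_LON          # wrap at the date line
    return row, col

cells_per_field = N_LAT * N_LON  # 6,480,000 cells in every hourly field
```

Under these assumptions, each hourly field holds 6.48 million values, which is why a record starting in 1979 is large.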
The dataset is produced in two stages. First, machine-learning model stacks combine inputs from satellite products, reanalysis products (weather model outputs), air temperature fields, and fixed geographic information to produce baseline hourly precipitation fields. These models were trained on hourly and daily observations from 15,959 precipitation gauges around the world. Second, the baseline fields are corrected using a much larger set of gauge observations: daily corrections use 57,666 stations and monthly corrections use 86,000 stations.
To check how well the uncorrected machine-learning baseline performs, the authors evaluated it against 19 other (quasi-)global gridded precipitation products using an independent set of 15,958 gauges that were not used in the first training step. The MSWEP V3 baseline reached a median daily Kling-Gupta Efficiency (KGE) of 0.69. KGE is a commonly used score that combines correlation, bias, and variability into one number, with 1 indicating perfect agreement with observations. The baseline's score was higher than the uncorrected values reported for several other products: ERA5 (0.61), IMERG-L V7 (0.46), GSMaP V8 (0.38), and CHIRP (0.31).
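For readers unfamiliar with KGE, here is a minimal sketch of how the score can be computed for one gauge and then summarized as a median across gauges. It uses the original 2009 formulation (correlation, mean ratio, and standard-deviation ratio). The article does not say which KGE variant the authors used, so treat this choice, and the function names `kge` and `median_kge`, as assumptions.

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta Efficiency of a simulated series against observations.

    Assumption: original 2009 form with three components:
    r (linear correlation), beta (ratio of means), alpha (ratio of std devs).
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]
    beta = sim.mean() / obs.mean()
    alpha = sim.std() / obs.std()
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (beta - 1.0) ** 2 + (alpha - 1.0) ** 2)

def median_kge(sim_by_gauge, obs_by_gauge):
    """Median of per-gauge KGE scores, one score per (sim, obs) pair."""
    scores = [kge(s, o) for s, o in zip(sim_by_gauge, obs_by_gauge)]
    return float(np.median(scores))
```

A perfect estimate scores 1. An estimate that is perfectly correlated with the gauge but doubles every value scores 1 − √2 ≈ −0.41, because both the mean ratio and the variability ratio equal 2. This shows that KGE penalizes bias even when timing is right.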
The authors also tested the effect of the daily gauge correction using leave-one-out cross-validation, in which each gauge is withheld in turn and the corrected estimate at its location is compared with the withheld record. They found that the correction increased the median daily correlation by 0.09. The paper notes that the gain from gauge correction is limited because the baseline machine-learning estimates were already strong. The combination of the fast, high-resolution baseline and broad gauge corrections is intended to give both timely and reliable precipitation estimates for monitoring and decision support.
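The leave-one-out test can be illustrated with a small sketch. The article does not describe how MSWEP V3 interpolates gauge corrections, so the code below swaps in a simple stand-in: inverse-distance weighting of gauge-minus-baseline residuals, with plain Euclidean distance in degrees. The function names, the `power` parameter, and the interpolation method are all hypothetical. Only the withholding logic reflects the procedure named in the text.

```python
import numpy as np

def loo_corrected(coords, baseline, gauge, power=2.0):
    """Correct each station's baseline series using only the OTHER stations.

    coords:   (n, 2) station positions (lat, lon in degrees)
    baseline: (n, t) uncorrected estimates at the stations
    gauge:    (n, t) gauge observations
    Stand-in method (assumption): inverse-distance weighted residuals.
    """
    coords = np.asarray(coords, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    gauge = np.asarray(gauge, dtype=float)
    n = coords.shape[0]
    residual = gauge - baseline
    corrected = np.empty_like(baseline)
    for i in range(n):
        others = np.arange(n) != i                     # withhold station i
        dist = np.linalg.norm(coords[others] - coords[i], axis=1)
        weights = 1.0 / np.maximum(dist, 1e-6) ** power
        weights /= weights.sum()
        # Precipitation cannot be negative, so clip after correcting.
        corrected[i] = np.maximum(baseline[i] + weights @ residual[others], 0.0)
    return corrected

def median_correlation(est, obs):
    """Median across stations of the per-station Pearson correlation."""
    return float(np.median([np.corrcoef(e, o)[0, 1] for e, o in zip(est, obs)]))
```

Comparing `median_correlation(loo_corrected(...), gauge)` with `median_correlation(baseline, gauge)` gives the kind of before-and-after difference the authors report. Because each station is scored only on a correction built without its own data, the result estimates how well the correction works at locations with no gauge.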