The M5 Competition
The M5 Competition ran from 2 March to 30 June 2020. It differed from the previous four competitions in six important ways, some of which were suggested by the discussants of the M4 Competition.
- It used hierarchical sales data, generously made available by Walmart, starting at the item level and aggregating to the level of departments, product categories, and stores in three geographical areas of the US: California, Texas, and Wisconsin.
- Besides the time series data, it also included explanatory variables such as price, promotions, day of the week, and special events (e.g. Super Bowl, Valentine’s Day, and Orthodox Easter) that affect sales and can be used to improve forecasting accuracy.
- The distribution of uncertainty was assessed by asking participants to provide information on four indicative prediction intervals and the median.
- The majority of the 42,840 time series display intermittency (sporadic sales including zeros).
- Instead of a single competition to estimate both the point forecasts and the uncertainty distribution, there were two parallel tracks using the same dataset: the first required 28-day-ahead point forecasts, and the second required 28-day-ahead probabilistic forecasts for the median and four prediction intervals (50%, 67%, 95%, and 99%).
- For the first time, it focused on series that display intermittency, i.e., sporadic demand including zeros.
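Three of the ideas in the list above (hierarchical aggregation, intermittency, and prediction intervals around a median) can be illustrated with a small Python sketch. The data, key layout, and function names below are hypothetical and invented for illustration; they are not taken from the M5 dataset or its tooling. The conversion from prediction intervals to quantile levels assumes that the intervals are central (equal-tailed), which the text above does not state explicitly.

```python
# Hypothetical toy data: (state, store, item) -> daily unit sales.
# Keys and numbers are invented for illustration only.
daily_sales = {
    ("CA", "CA_1", "item_A"): [0, 0, 3, 0, 1, 0, 0],
    ("CA", "CA_1", "item_B"): [2, 0, 0, 0, 0, 4, 0],
    ("TX", "TX_1", "item_A"): [0, 1, 0, 0, 0, 0, 2],
}


def aggregate(series_by_key, depth):
    """Sum item-level series up the hierarchy.

    depth=1 groups by state, depth=2 by (state, store).
    """
    totals = {}
    for key, values in series_by_key.items():
        group = key[:depth]
        if group in totals:
            totals[group] = [a + b for a, b in zip(totals[group], values)]
        else:
            totals[group] = list(values)
    return totals


def zero_share(values):
    """Fraction of periods with zero sales, a simple intermittency indicator."""
    return sum(1 for v in values if v == 0) / len(values)


def quantile_levels(coverages):
    """Quantile levels for the median plus central intervals.

    Assumption: a central interval with coverage c spans the
    (1 - c) / 2 and (1 + c) / 2 quantiles.
    """
    levels = {0.5}
    for c in coverages:
        levels.add(round((1 - c) / 2, 4))
        levels.add(round((1 + c) / 2, 4))
    return sorted(levels)


by_state = aggregate(daily_sales, 1)
item_zero_share = zero_share(daily_sales[("CA", "CA_1", "item_A")])
state_zero_share = zero_share(by_state[("CA",)])
levels = quantile_levels([0.50, 0.67, 0.95, 0.99])
```

In this toy example the item-level series has a larger share of zero-sales days than the state-level aggregate, which is the usual pattern: intermittency is most pronounced at the bottom of the hierarchy. Under the central-interval assumption, the median and four intervals correspond to nine quantile levels.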
Aim
The aim of the M5 Competition was similar to that of the previous four: to identify the most appropriate method(s) for different types of situations requiring predictions and uncertainty estimates. Its ultimate purpose was to advance the theory of forecasting and improve its utilization by businesses and non-profit organizations. Its other goal was to compare the accuracy/uncertainty of machine learning (ML) and deep learning (DL) methods vis-à-vis those of standard statistical ones, and to assess possible improvements against the extra complexity and higher costs of using the various methods.
Expectations & Methods Content
Given the success of the previous four M Competitions, the considerable number of participants they attracted, and the significant contributions they made, which fundamentally changed the field of forecasting, higher achievements were expected from the M5 Competition, which was aimed at the fast-growing data science community with easy access to the M5 dataset. The M5 was run on the Kaggle platform and attracted close to 6,000 participants.