The M5 Competition

The M5 Competition, the latest of the M Competitions, ran from 2 March to 30 June 2020. It differed from the previous four in six important ways, some of which were suggested by the discussants of the M4 Competition.

  • It used hierarchical sales data, generously made available by Walmart, starting at the item level and aggregating to that of departments, product categories and stores in three geographical areas of the US: California, Texas, and Wisconsin.

  • Besides the time series data, it also included explanatory variables such as price, promotions, day of the week, and special events (e.g. Super Bowl, Valentine’s Day, and Orthodox Easter) that affect sales and can be used to improve forecasting accuracy.

  • The distribution of uncertainty was assessed by asking participants to provide information on four indicative prediction intervals and the median.

  • The majority of the 42,840 time series display intermittency (sporadic sales including zeros).

  • Instead of a single competition to estimate both the point forecasts and the uncertainty distribution, there were two parallel tracks using the same dataset: the first required 28-days-ahead point forecasts, and the second required 28-days-ahead probabilistic forecasts for the median and four prediction intervals (50%, 67%, 95%, and 99%).

  • For the first time, it focused on series that display intermittency, i.e., sporadic demand including zeros.
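To make the points above concrete, the following is a minimal Python sketch, not the organizers' code. It shows how the median and the four prediction intervals could translate into quantile levels (assuming the intervals are central, i.e. symmetric around the median), how item-level sales might be summed up a hierarchy, and one simple way to measure intermittency. The function names, the key layout (state, store, item), and the toy sales figures are all hypothetical.

```python
# Coverage of the four prediction intervals named in the text.
COVERAGES = [0.50, 0.67, 0.95, 0.99]


def quantile_levels(coverages):
    """Median plus the lower and upper end of each interval.

    Assumption: each interval is central, so a coverage c leaves
    (1 - c) / 2 of the probability in each tail.
    """
    levels = {0.5}
    for c in coverages:
        tail = (1.0 - c) / 2.0
        levels.add(round(tail, 4))
        levels.add(round(1.0 - tail, 4))
    return sorted(levels)


def aggregate(sales, depth):
    """Sum daily sales over all series sharing the first `depth` key parts."""
    totals = {}
    for key, values in sales.items():
        parent = key[:depth]
        if parent in totals:
            totals[parent] = [a + b for a, b in zip(totals[parent], values)]
        else:
            totals[parent] = list(values)
    return totals


def zero_share(values):
    """Fraction of days with no sales: a crude indicator of intermittency."""
    return sum(1 for v in values if v == 0) / len(values)


# Toy item-level data, invented for illustration: (state, store, item) -> daily units.
sales = {
    ("CA", "CA_1", "item_a"): [0, 0, 2, 0, 1, 0, 0],
    ("CA", "CA_1", "item_b"): [1, 0, 0, 0, 0, 3, 0],
    ("TX", "TX_1", "item_a"): [0, 0, 0, 1, 0, 0, 0],
}

levels = quantile_levels(COVERAGES)       # nine quantile levels in total
store_totals = aggregate(sales, depth=2)  # one series per (state, store)
state_totals = aggregate(sales, depth=1)  # one series per state

for key, values in sales.items():
    print(key, f"zero share = {zero_share(values):.2f}")
```

In this toy data the zero share falls as series are summed upward, which is why intermittency is mainly an issue at the lower levels of a hierarchy.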


Aim

The aim of the M5 Competition is similar to that of the previous four: to identify the most appropriate method(s) for different types of situations that require predictions and uncertainty estimates. Its ultimate purpose is to advance the theory of forecasting and improve its use by businesses and non-profit organizations. A further goal is to compare the accuracy and uncertainty estimates of machine learning (ML) and deep learning (DL) methods with those of standard statistical ones, and to weigh any improvements against the extra complexity and higher costs of the various methods.

Expectations & Methods Content

Given the success of the previous four M Competitions, the considerable number of participants they attracted, and their significant contributions, which fundamentally changed the field of forecasting, similar or even greater achievements are expected from the M5 Competition. It is aimed at the fast-growing data science community, which will have easy access to the M5 dataset. It will be run on the Kaggle platform, and the number of participants is expected to be in the several thousands.