In contrast to the previous M competition, M5 involves the sales of various products organized in a hierarchical fashion. From a business perspective, this means that a method performs well only if it provides accurate forecasts across all hierarchical levels, especially for series with high aggregate sales (measured in US dollars). In other words, we expect the best-performing forecasting methods to achieve lower forecasting errors for the series that are of more value to the company. To that end, the forecasting errors computed for each participating method will be weighted across the M5 series based on the aggregate sales that each series represents, i.e. a proxy of their actual value for the company in monetary terms.
Assume that two products of the same department, A and B, are sold in a store in WI. Product A, of price $1, displays 10 sales in the testing period, while product B, of price $2, displays 6 sales. The aggregate sales of product A will be $1*10=$10, while the aggregate sales of product B will be $2*6=$12, so the aggregate sales of the two products together will be $10+$12=$22. Assume also that a forecasting method was used to forecast the sales of product A, product B, and their aggregate sales, displaying errors EA, EB, and E, respectively. If the M5 dataset involved just those three series, the final score of the method would be

Score = EA * 10/(10+12+22) + EB * 12/(10+12+22) + E * 22/(10+12+22),

where each error is weighted by the aggregate sales of its series divided by the aggregate sales summed over all three series.
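The arithmetic of this three-series example can be sketched in Python. This is a minimal illustration, not the organizers' evaluation code: the function names and the error values are hypothetical, and the sketch assumes that the final score is the sales-weighted sum of the per-series errors, with each weight equal to the dollar sales of a series divided by the dollar sales summed over all three series.

```python
def sales_weights(dollar_sales):
    """Return each series' share of the dollar sales summed over all series."""
    total = sum(dollar_sales.values())
    return {name: value / total for name, value in dollar_sales.items()}


def weighted_score(errors, weights):
    """Weighted sum of the per-series errors."""
    return sum(weights[name] * errors[name] for name in errors)


# Dollar sales from the example: price times units sold.
dollar_sales = {"A": 1 * 10, "B": 2 * 6}
# The aggregate series sells what A and B sell together ($22).
dollar_sales["A+B"] = dollar_sales["A"] + dollar_sales["B"]

weights = sales_weights(dollar_sales)

# Hypothetical error values (EA, EB, E), chosen only for illustration.
errors = {"A": 0.8, "B": 1.1, "A+B": 0.6}
score = weighted_score(errors, weights)

print(weights)
print(score)
```

Under these assumptions the aggregate series receives a weight of 22/44 = 0.5, and products A and B share the remaining 0.5 as 10/44 and 12/44.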
This weighting scheme can be extended to cover more stores, geographical regions, product categories, and product departments, as previously described. Note that, under this scheme, all hierarchical levels are equally weighted. This is because the total sales of a product, measured across all three states, are equal to the sum of the sales of this product when measured across all ten stores, or similarly, because the total sales of a product category of a store are equal to the sum of the sales of the departments that this category consists of, as well as to the sum of the sales of the products of the corresponding departments.
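The equal weighting of levels can be checked numerically on a small hierarchy. The sketch below is illustrative only: the store identifiers, dollar amounts, and the five levels are made up and are not the actual M5 data or its full set of levels. It assumes that each series is weighted by its dollar sales divided by the dollar sales summed over all series at all levels.

```python
from collections import defaultdict

# Hypothetical bottom-level dollar sales keyed by (state, store, product).
bottom = {
    ("WI", "WI_1", "A"): 10.0,
    ("WI", "WI_1", "B"): 12.0,
    ("WI", "WI_2", "A"): 7.0,
    ("CA", "CA_1", "A"): 20.0,
    ("CA", "CA_1", "B"): 5.0,
}

# Each level keeps a subset of the key positions; () is the grand total.
levels = {
    "total": (),
    "state": (0,),
    "store": (0, 1),
    "product": (2,),
    "product_by_store": (0, 1, 2),
}


def aggregate(bottom_sales, positions):
    """Sum bottom-level dollar sales into the series of one hierarchical level."""
    series = defaultdict(float)
    for key, value in bottom_sales.items():
        series[tuple(key[i] for i in positions)] += value
    return dict(series)


level_series = {name: aggregate(bottom, pos) for name, pos in levels.items()}

# Dollar sales summed over every series of every level.
grand_total = sum(sum(series.values()) for series in level_series.values())

# Weight of each series, then the total weight carried by each level.
weights = {
    name: {key: value / grand_total for key, value in series.items()}
    for name, series in level_series.items()
}
level_share = {name: sum(w.values()) for name, w in weights.items()}

print(level_share)
```

Because every level adds up to the same dollar total, each of the five levels in this sketch carries one fifth of the overall weight, regardless of how many series it contains.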