**DISCLAIMER:** The aim of this glossary is to provide persons considering
whether to compete for the European Statistics Awards with an overview of
some of its essential elements to allow them to assess whether it would be
in their interest to form a team to submit entries. It does not constitute
authoritative information on the rules, terms and conditions for the
competitions of the European Statistics Awards Programme. Those rules,
terms and conditions will be made available to prospective teams once
a competition opens for registration to allow them to take an informed
decision.

All entries are assigned an accuracy score. The team whose entry has the best accuracy score will win the 1^{st} accuracy award (the 2^{nd} and 3^{rd} accuracy awards will be granted similarly).

For each entry, the accuracy score is calculated as the sum of the five best country scores of that entry.
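As a minimal sketch, assuming that lower country scores are better (a country score is the product of the volatility index and the MSRE, both of which reward small values), the accuracy score of an entry could be computed as follows; the function name is hypothetical:

```python
def accuracy_score(country_scores):
    """Sum of the five best (i.e. lowest) country scores of an entry."""
    if len(country_scores) < 5:
        raise ValueError("an entry must cover at least 5 countries")
    # Lower country scores are better, so take the five smallest
    return sum(sorted(country_scores)[:5])

# Example: an entry covering 7 countries
scores = [0.12, 0.05, 0.30, 0.08, 0.21, 0.02, 0.17]
print(accuracy_score(scores))  # sum of the five smallest values
```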

The panel will check the documentation of each entry to assess the completeness of its individual key elements, including:

- procedure description
- input data
- code (including inline comments)

and other documentation.

The country score of an entry is based on

- the volatility index for the country multiplied by
- the MSRE for that country across all monthly submissions of that entry.

An entry obtains a better (lower) country score if it achieves a small MSRE for the countries that are more volatile (and therefore have a lower volatility index).

In case more than 6 monthly submissions are made for a country, the minimum MSRE across all combinations of exactly 6 monthly submissions is used.
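The two rules above can be sketched as follows; this is a minimal illustration, not the official evaluation code, and the function names are assumptions:

```python
from itertools import combinations

def msre(nowcasts, releases):
    """Mean square relative error between nowcasts and first official releases."""
    return sum(((y - r) / r) ** 2 for y, r in zip(nowcasts, releases)) / len(nowcasts)

def country_score(volatility_index, nowcasts, releases):
    """Volatility index multiplied by the MSRE over exactly 6 submissions.

    With more than 6 monthly submissions, the minimum MSRE over all
    combinations of exactly 6 submissions is used.
    """
    pairs = list(zip(nowcasts, releases))
    if len(pairs) < 6:
        raise ValueError("at least 6 monthly submissions are required")
    best_msre = min(
        msre([y for y, _ in chosen], [r for _, r in chosen])
        for chosen in combinations(pairs, 6)
    )
    return volatility_index * best_msre
```

For example, with 7 submissions of which one is far off, the combination that excludes the outlier yields the minimum MSRE.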

For entries competing exclusively for the Accuracy Awards, it suffices to submit a short description. For an entry to also be in the running for the Reproducibility Award, complete documentation of the entry is required at the end of the competition.

An entry consists of submissions of point estimates (i.e. nowcasted values) for at least 5 countries using a given approach.

Each team may use up to 5 different approaches for each time series, and thus submit 5 different nowcasts for that time series, organised in entries. Teams will have to take care to keep their entries separate, since each entry is evaluated in isolation.

Teams are free to change the model used for an entry up to five times over the course of the competition. For a model to be eligible for evaluation, at least 6 separate monthly submissions must be made with that model. Furthermore, to score high on the integrity criterion, the same model should be used consistently across all submissions of an entry.

An adaptive approach based on the adjustment of (hyper)parameters taking contextual information into account is permissible. Compliance with the integrity criterion (and the extent to which model parameter changes can be linked to contextual information) will be assessed by the panel when evaluating entries for this criterion.

Teams that are interested both in competing for the Reproducibility Awards and in trying out very different models should therefore foresee separate entries for different models.

For each entry, the panel will assess the model interpretability, i.e. the extent to which a human could understand and articulate the relationship between the model’s predictors and its outcome.

The competition leaderboard will keep track of the performance of the different entries of the teams over the course of the competition. For each time series, the leaderboard will be updated once the official release is available for all countries (thus with a considerable lag in relation to the reference period).

Mean square relative error (MSRE): the average of the squared relative differences between the point estimates (i.e. the nowcasted values) Y_{i} and the first official releases R_{i}. The average runs over the different nowcasted values provided in an entry for a given country and for a given approach chosen by the team. Better nowcasting approaches are those whose MSRE is closer to zero.

$$MSRE_{C} = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{Y_{i} - R_{i}}{R_{i}}\right)^{2}$$

where C is the country in the provided entry and n is the number of different estimation periods.

In case more than 6 monthly submissions are made, the minimum MSRE obtained across all combinations of exactly 6 monthly submissions is used.

In case of revision of the official release, only the first official release for a given period is used.

A nowcast is a point estimate of the value of a time series, provided before the end of the reference period that the value refers to.

The panel will assess any external data used for each entry with respect to their openness, availability, coverage (for instance geographical and time coverage) and consistency.

The general task of the panel when assessing this criterion is to assess whether there would be any major obstacles to scaling up an entry (typically made for a few countries and a certain time period) to all (or most) countries of the European Statistical System.

For each entry, the panel will compare the model used to the state of the art, i.e. those pre-existing approaches that are closest to the model applied for the entry, and the extent to which the entry represents an improvement over these pre-existing approaches.

The panel will assign higher marks to entries that bring novelty with respect to existing approaches, and lower marks to entries that rely on ‘baseline state-of-the-art’ models.

For each European Statistics Awards competition, a panel of independent expert evaluators has been appointed. The panel is responsible for rating each entry with regard to its Accuracy Score and Reproducibility Score.

On the basis of the scores, the European Commission (Eurostat) decides on the awarding of the Reproducibility Awards and Accuracy Awards.

Eurostat attaches great importance to approaches with the potential to be scaled up for use in European statistics production. Therefore, entries for which teams submit additional documentation will be in the running for the Reproducibility Award.

The entry with the highest Reproducibility Score will receive the Reproducibility Award.

However, for each time series, only those entries are considered that:

- are in the best quartile with regard to their Accuracy Score (since there is expected to be little interest in reproducing underperforming models);
- have a Reproducibility Score above the minimum cut-off – both overall and for each individual reproducibility criterion.

If there are no such entries (best quartile) for a given time series, then no reproducibility prize is awarded.

The reproducibility score of an entry will be evaluated by a panel of expert evaluators, who will rate the entry according to its completeness, integrity, openness, originality, interpretability, and simplicity.

For each entry, the panel will evaluate the model used with respect to its predictive validity – but also the parsimony of the parameterisation, providing a higher score where the parameterisation can be linked to assumptions made. This criterion tends to penalise overfitted models that contain more parameters than can be justified by the data.

For each entry, at least six (6) monthly submissions of nowcasts over eight (8) months are required. Each submission consists of a single point estimate for the time series concerned – to be submitted via the online system ahead of the deadline set in the rules.

A volatility index for a given country and a given time series is a normalised measure of the volatility score. It is used to take into account the difficulty of nowcasting a time series when comparing different entries covering different countries.

For each time series, the volatility index for the countries is obtained by sorting the volatility scores and normalising them with a scaling function that maps the scores onto a fixed interval. The function used to scale is

$$I_{C} = X_{max} - \frac{f(v_{C}) - f(V_{min})}{f(V_{max}) - f(V_{min})}\left(X_{max} - X_{min}\right)$$

where v_{C} is the country’s volatility score, V_{min} and V_{max} are the minimum and maximum country volatility scores, respectively, f is the scaling function, and X_{min} and X_{max} are the minimum and maximum values onto which we map the country volatility scores. For all competitions we use X_{min} = 0.5 and X_{max} = 2, and the natural logarithm is used as the scaling function f.

Note that the scaling function returns smaller weights for countries with higher volatility scores; this is to ensure (1) some level of normalisation of countries with high and low volatility rates, and (2) that teams take countries with higher volatility scores into consideration when preparing their entries.
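Read this way, the index computation can be sketched in a few lines; this is an illustrative reading of the scaling rule described above, not the official formula, and the function name is an assumption:

```python
import math

def volatility_index(v, all_scores, x_min=0.5, x_max=2.0):
    """Map a country's volatility score onto [x_min, x_max].

    Illustrative reading: log-scale the scores, then linearly map them
    so that the least volatile country receives x_max and the most
    volatile receives x_min (smaller weights for higher volatility).
    """
    f = math.log  # natural logarithm as the scaling function
    f_min, f_max = f(min(all_scores)), f(max(all_scores))
    return x_max - (f(v) - f_min) / (f_max - f_min) * (x_max - x_min)

scores = [1.2, 3.5, 8.0, 20.0]
print(volatility_index(1.2, scores))   # 2.0  (least volatile, largest weight)
print(volatility_index(20.0, scores))  # 0.5  (most volatile, smallest weight)
```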

For each country and each time series, the volatility index will be fixed for the entire duration of the nowcasting round and will be published once the nowcasting awards round opens for registrations.

The volatility score for a given country and a given time series is a measure of the variability of the time series calculated using historical data.

For each time series, a country’s volatility score v_{C}
is computed based on the historical variability of that time series.
Officially available data from the Eurostat data portal were downloaded
and used to calculate the volatility scores:

- For tourism, the number of nights spent at tourist accommodation establishments was used (eurobase code: TOUR_OCC_NIM).
- For production volume in industry, the PVI dataset was used (eurobase code: STS_INPR_M).
- For producer prices in industry (domestic market), the PPI dataset was used (eurobase code: STS_INPPD_M).

To compute a country’s volatility score for a time series, we base the scores on a GARCH(1,1) model applied to the data from the period 2010-2019. The model takes the seasonality of the time series into account when adjusting the calculated score (so that time series with strong seasonal fluctuation but few fluctuations within seasons are assigned a high volatility score).
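For intuition, the conditional-variance recursion of a GARCH(1,1) model can be sketched as follows; this is a minimal illustration with fixed, hypothetical parameters, not the procedure actually used for the official scores (which also involves parameter estimation and seasonal adjustment):

```python
def garch_variances(returns, omega=0.1, alpha=0.1, beta=0.8):
    """Conditional variances from the GARCH(1,1) recursion:

        sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1]

    omega, alpha and beta are fixed here for illustration; in practice
    they are estimated from the historical data.
    """
    mean = sum(returns) / len(returns)
    # Initialise with the sample variance of the series
    sigma2 = [sum((r - mean) ** 2 for r in returns) / len(returns)]
    for t in range(1, len(returns)):
        sigma2.append(omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2

# A calm series yields smaller conditional variances than a volatile one
calm = [0.1, -0.2, 0.1, 0.0, -0.1, 0.2]
wild = [2.0, -3.0, 2.5, -1.5, 3.0, -2.0]
print(sum(garch_variances(calm)) < sum(garch_variances(wild)))  # True
```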