Authors: Lubos Hanus and Jozef Barunik
Abstract: We propose a distributional deep learning approach to probabilistic forecasting of economic time series. Being able to learn complex patterns from large amounts of data, deep learning methods are useful for decision making that depends on the uncertainty of a possibly large number of economic outcomes. Such predictions are also informative to decision makers whose loss depends asymmetrically on outcomes of possibly non-Gaussian and non-linear variables. We show the usefulness of the approach on three distinct problems. First, we use deep learning to construct data-driven macroeconomic fan charts that reflect the information contained in a large number of variables. Second, we obtain uncertainty forecasts of irregular traffic data. Third, we illustrate gains in the prediction of stock return distributions, which are heavy-tailed and suffer from a low signal-to-noise ratio.
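A minimal sketch of the distributional idea behind fan charts: training a forecaster under the pinball (quantile) loss recovers conditional quantiles, and several quantile levels stacked together form the fan. The sample below is hypothetical, standing in for a learned predictive distribution; this is an illustration, not the authors' deep learning implementation.

```python
import random

def fit_quantile(sample, tau, lr=0.02, steps=1500):
    """Subgradient descent on the mean pinball (quantile) loss; the
    minimiser is (approximately) the empirical tau-quantile of the sample."""
    q = sum(sample) / len(sample)
    for _ in range(steps):
        # Subgradient of the pinball loss with respect to the forecast q.
        grad = sum(-tau if y >= q else (1 - tau) for y in sample) / len(sample)
        q -= lr * grad
    return q

random.seed(0)
# Hypothetical forecast-error sample standing in for a predictive distribution.
sample = [random.gauss(0.0, 1.0) for _ in range(400)]
# Fan-chart bands: several quantile levels of the predictive distribution.
bands = {tau: fit_quantile(sample, tau) for tau in (0.1, 0.5, 0.9)}
```

In the paper's setting the quantile forecasts would come from a neural network conditioned on many predictors; only the loss function is the same here.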
Jozef Baruník Associate Professor, Academy of Sciences and Charles University
Jozef Baruník is an Associate Professor at the Institute of Economic Studies, Charles University in Prague. He also serves as head of the Econometrics department at the Czech Academy of Sciences. In his research, he develops mathematical models for understanding financial problems (such as measuring and managing financial risk), develops statistical methods and analyzes financial data. He is especially interested in asset pricing, high-frequency data, financial econometrics, machine learning, high-dimensional financial data sets (big data), and frequency domain econometrics (cyclical properties and behavior of economic variables).
Authors: Christian Brownlees and Jordi Llorens-Terrazas
Abstract: Empirical risk minimization is a standard principle for choosing algorithms in learning theory. In this paper we study the properties of empirical risk minimization for time series. The analysis is carried out in a general framework that covers different types of forecasting applications encountered in the literature. We are concerned with 1-step-ahead prediction of a univariate time series generated by a parameter-driven process. A class of recursive algorithms is available to forecast the time series. The algorithms are recursive in the sense that the forecast produced in a given period is a function of the lagged values of the forecast and of the time series. The relationship between the generating mechanism of the time series and the class of algorithms is unspecified. Our main result establishes that the algorithm chosen by empirical risk minimization achieves asymptotically the optimal predictive performance that is attainable within the class of algorithms.
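An illustrative sketch (not the paper's general framework): empirical risk minimisation over a small class of recursive forecasters, here exponential smoothing, whose forecast in each period is a function of its own lag and the lagged value of the series. The series and grid below are hypothetical.

```python
import random

def empirical_risk(series, alpha):
    """Mean squared 1-step-ahead error of the recursive forecaster
    f_t = alpha * y_{t-1} + (1 - alpha) * f_{t-1} (exponential smoothing)."""
    f, risk = series[0], 0.0
    for t in range(1, len(series)):
        f = alpha * series[t - 1] + (1 - alpha) * f
        risk += (series[t] - f) ** 2
    return risk / (len(series) - 1)

random.seed(1)
# Persistent toy series; its generating mechanism is not in the class.
y = [0.0]
for _ in range(999):
    y.append(0.7 * y[-1] + random.gauss(0, 1))

# Empirical risk minimisation: pick the algorithm in the class with the
# smallest in-sample 1-step-ahead risk.
grid = [i / 20 for i in range(1, 21)]
alpha_hat = min(grid, key=lambda a: empirical_risk(y, a))
```

The point mirrors the abstract: the data-generating process need not belong to the class; ERM still selects the best-performing algorithm within it.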
Zuzana Prášková Professor, Charles University
Authors: Frantisek Cech and Michal Zitek
Abstract: We argue that marine fuel consumers and producers can reduce the uncertainty of their portfolios under the environmental regulations aimed at air pollution reduction. Our results show that uncertainty can be reduced by up to 72%. We also identify Gasoil futures as the universal hedging instrument to manage uncertainty.
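A standard way to quantify such an uncertainty reduction is the minimum-variance hedge: the hedge ratio is the regression slope of spot on futures returns, and the variance reduction equals the squared return correlation. The simulated returns below are hypothetical, not the paper's marine-fuel data.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Sample covariance (cov(x, x) gives the sample variance)."""
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)

def min_variance_hedge(spot, fut):
    """Hedge ratio h = Cov(s, f) / Var(f) and resulting variance reduction."""
    h = cov(spot, fut) / cov(fut, fut)
    hedged = [s - h * f for s, f in zip(spot, fut)]
    reduction = 1 - cov(hedged, hedged) / cov(spot, spot)
    return h, reduction

random.seed(7)
# Hypothetical correlated daily returns of a fuel (spot) and a futures contract.
fut = [random.gauss(0, 0.02) for _ in range(500)]
spot = [0.8 * f + random.gauss(0, 0.01) for f in fut]
h, reduction = min_variance_hedge(spot, fut)
```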
Authors: Jens Perch Nielsen
Jens Perch Nielsen Professor, Bayes Business School
An actuary from Copenhagen and a statistician from UC Berkeley. He worked as an appointed actuary early in his career and led various product development departments before specialising in research and development. He became research director of RSA, with responsibilities in life as well as non-life insurance, in 1999. From 2006 until 2012 he worked as an entrepreneur, and he is still co-owner and board member of Copenhagen-based ScienceFirst, London-based Operational Science and Cyprus-based Emergent. He is co-author of more than 100 scientific papers in reviewed journals of actuarial science, economics, econometrics and statistics, as well as a book on quantitative operational risk modelling, and serves as associate editor of a number of journals.
Authors: Joakim Olsen, Arild Brandrud Næss, Pierre Lison
Abstract: This paper explores how to automatically measure the quality of human-generated summaries, based on a Norwegian corpus of real estate condition reports and their corresponding summaries. The proposed approach proceeds in two steps. First, the real estate reports and their associated summaries are automatically labelled using a set of heuristic rules gathered from human experts and aggregated using weak supervision. The aggregated labels are then employed to learn a neural model that takes a document and its summary as inputs and outputs a score reflecting the predicted quality of the summary. The neural model maps the document and its summary to a shared summary content space and computes the cosine similarity between the two document embeddings to predict the final summary quality score. The best performance is achieved by a CNN-based model with an accuracy (measured against the aggregated labels obtained via weak supervision) of 89.5%, compared to 72.6% for the best unsupervised model. Manual inspection of examples indicates that the weak supervision labels do capture important indicators of summary quality, but the correlation of those labels with human judgements remains to be validated. Our models of summary quality predict that approximately 30% of the real estate reports in the corpus have a summary of poor quality.
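A sketch of the scoring step described above: embed the document and its summary in a shared space and score the summary by cosine similarity. Toy bag-of-words vectors and invented English snippets stand in here for the paper's CNN encoder and Norwegian reports.

```python
import math
from collections import Counter

def embed(text, vocab):
    """Toy bag-of-words embedding (stand-in for a learned encoder)."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(u, v):
    """Cosine similarity between two vectors; 0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical report and two candidate summaries.
doc = "roof damage moisture in basement roof needs repair"
good = "roof damage and basement moisture"
bad = "sunny garden view"
vocab = sorted(set((doc + " " + good + " " + bad).lower().split()))
score_good = cosine(embed(doc, vocab), embed(good, vocab))
score_bad = cosine(embed(doc, vocab), embed(bad, vocab))
```

The content-preserving summary scores high; the off-topic one scores low, which is the signal the neural quality model is trained to produce.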
Authors: Sebastiano Vitali
Abstract: This work aims to study the impact of the SARS-CoV-2 pandemic on the global financial markets. In particular, this impact is analysed through changes in the shape of the implied volatility smile of the options written on several equity indexes and on several stocks. The implied volatility function is estimated using the market-based information of liquid options and applying a semi-parametric smoothing technique that exploits a kernel function and no-arbitrage conditions. This approach is applied to an extensive set of data to study the evolution of the implied volatility functions through the months of the pandemic. We show, in several cases, a sudden and massive change in the shape of the implied volatility functions.
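A minimal illustration of kernel smoothing of an implied volatility smile: a Nadaraya-Watson estimate with a Gaussian kernel over strike. The quotes are toy numbers, and the paper's estimator additionally imposes no-arbitrage conditions that this sketch omits.

```python
import math

def nw_smile(strikes, ivs, k, bandwidth):
    """Nadaraya-Watson estimate of implied volatility at strike k:
    a Gaussian-kernel weighted average of the observed quotes."""
    w = [math.exp(-0.5 * ((k - s) / bandwidth) ** 2) for s in strikes]
    return sum(wi * v for wi, v in zip(w, ivs)) / sum(w)

# Hypothetical smile-shaped quotes: higher vols in the wings than at-the-money.
strikes = [80, 90, 100, 110, 120]
ivs = [0.32, 0.26, 0.22, 0.24, 0.29]
smile_atm = nw_smile(strikes, ivs, 100, 10)   # at-the-money level
smile_wing = nw_smile(strikes, ivs, 80, 10)   # left wing level
```

Changes in the smile's shape during the pandemic would show up as changes in the gap between wing and at-the-money levels of such a fitted curve.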
Authors: Kainat Khowaja
Abstract: In this paper, we show that the generalised random forest estimate, as a function of x, converges uniformly to the true function. Athey et al. (2020) show in their paper on Generalised Random Forests (GRF) that an estimate obtained from a GRF at a given point is asymptotically normal. Given a data set at hand, we find a critical value for uniform confidence bands (UCB) using the multiplier bootstrap, since it is well known that the standard approach via extreme value theory has very slow asymptotics. We also demonstrate the construction of the UCB on the same example given in the study of Athey et al. (2020).
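A sketch of the multiplier-bootstrap critical value: the supremum of the estimation process is re-simulated by multiplying each observation's centred contribution with i.i.d. Gaussian weights. A toy mean-vector over a small grid stands in here for the GRF estimate over x values.

```python
import math
import random

random.seed(2)
n, d = 200, 5
# Hypothetical data: n observations of a d-dimensional "estimate over a grid".
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
mean = [sum(row[j] for row in X) / n for j in range(d)]

def sup_multiplier_stat():
    """One bootstrap draw of the sup of the multiplier process."""
    xi = [random.gauss(0, 1) for _ in range(n)]
    proc = [sum(xi[i] * (X[i][j] - mean[j]) for i in range(n)) / math.sqrt(n)
            for j in range(d)]
    return max(abs(p) for p in proc)

draws = sorted(sup_multiplier_stat() for _ in range(500))
crit = draws[int(0.95 * 500)]  # bootstrap 95% critical value
# Uniform confidence band over the grid.
band = [(m - crit / math.sqrt(n), m + crit / math.sqrt(n)) for m in mean]
```

The bootstrap quantile replaces the slowly converging extreme-value limit mentioned in the abstract.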
Authors: Julian Winkel
Abstract: In the Information Age, the power of data and statistics has never been more prevalent. Academics, architects, medical doctors, journalists, lawyers, programmers and many other professionals nowadays require an accurate application of statistical methods. Instead, many disciplines are subject to a crisis of integrity, which shows in the improper use of statistical models, p-hacking, HARKing, or failure to replicate results. We propose the use of a peer-to-peer education network, Quantinar, to spread statistical knowledge embedded with code in the form of Quantlets.
Authors: Petra Laketa, Stanislav Nagy and Dusan Pokorny
Abstract: Let $R^d$ be the $d$-dimensional Euclidean space and $\mu$ a finite Borel measure on $R^d$. The halfspace depth of a given point $x$ with respect to $\mu$ is defined as the infimum of the $\mu$-masses of all closed halfspaces that contain $x$. As such, it measures the centrality of $x$ with respect to $\mu$ and is used as a multivariate quantile. The notion of halfspace depth has also found applications in machine learning. The existing literature on this interesting topic usually imposes restrictive assumptions on the measure $\mu$. We consider halfspace depth in a general setting, for all finite Borel measures, with the intention of collecting partial results from the literature and giving more general theoretical results. We especially focus on 1) when and how it is possible to reconstruct the underlying measure from its halfspace depth function, and 2) extending the so-called ray basis theorem, which gives an interesting characterization of the point with maximal halfspace depth, called the halfspace median.
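A numerical sketch of the definition above, for the empirical measure of a point cloud: the depth of $x$ is approximated by minimising, over many random directions $u$, the fraction of points in the closed halfspace $\{y : \langle u, y - x\rangle \geq 0\}$. This direction search is a crude approximation of the infimum, not the paper's theoretical machinery.

```python
import random

def halfspace_depth(x, points, n_dir=500, rng=random.Random(3)):
    """Approximate halfspace depth of x w.r.t. the empirical measure of
    points, by minimising the halfspace mass over random directions."""
    depth = 1.0
    for _ in range(n_dir):
        u = [rng.gauss(0, 1) for _ in range(len(x))]
        mass = sum(
            1 for p in points
            if sum(ui * (pi - xi) for ui, pi, xi in zip(u, p, x)) >= 0
        ) / len(points)
        depth = min(depth, mass)
    return depth

data_rng = random.Random(4)
# Hypothetical bivariate sample centred at the origin.
cloud = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(300)]
center_depth = halfspace_depth((0.0, 0.0), cloud)   # near the median: high
outlier_depth = halfspace_depth((5.0, 5.0), cloud)  # far from the mass: low
```

Points near the halfspace median get depth close to 1/2, while outlying points get depth close to 0, which is the sense in which depth acts as a multivariate quantile.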
Petra Laketa PhD student, Charles University
Authors: Daniel Mittendorf
Abstract: We propose a procedure to model the time-varying relationship between a univariate response and a high-dimensional predictor set derived from textual data. In such settings, standard shrinkage approaches such as lasso can encounter computational and numerical issues. While text-specific regression approaches such as multinomial inverse regression and hurdle regression have been developed, these methods allow for neither time-varying parameters nor heteroskedasticity. Our proposed method first performs several random projections of the predictor matrix to a much lower-dimensional linear subspace. For each of these compressed predictor sets, a time-varying parameter state space model is estimated using fast Kalman filter recursions, allowing for heteroskedasticity in the response. These models' predictions are then averaged dynamically using Bayesian posterior model probabilities. The resulting procedure remains stable with hundreds of thousands of predictors and is trivially parallelisable. Intuitive variable importance measures can be computed naturally with little additional computational cost.
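A sketch of the compression step: project a high-dimensional predictor matrix onto a random low-dimensional subspace and fit a small regression on the compressed predictors. Everything here is hypothetical; a single projection and a static OLS fit stand in for the paper's averaged ensemble of projections and time-varying-parameter Kalman filter.

```python
import random

rng = random.Random(5)
n, p, k = 300, 1000, 2  # many "text-count" predictors, tiny compressed dim

# Hypothetical sparse predictors; only column 0 carries signal.
X = [[rng.random() if rng.random() < 0.05 else 0.0 for _ in range(p)]
     for _ in range(n)]
y = [2.0 * row[0] + rng.gauss(0, 0.1) for row in X]

# Gaussian random projection Phi of shape (p, k): compressed Z = X @ Phi.
Phi = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(p)]
Z = [[sum(row[j] * Phi[j][m] for j in range(p)) for m in range(k)]
     for row in X]

# OLS on the two compressed predictors via the 2x2 normal equations.
a11 = sum(z[0] * z[0] for z in Z)
a12 = sum(z[0] * z[1] for z in Z)
a22 = sum(z[1] * z[1] for z in Z)
b1 = sum(z[0] * yi for z, yi in zip(Z, y))
b2 = sum(z[1] * yi for z, yi in zip(Z, y))
det = a11 * a22 - a12 * a12
beta = ((b1 * a22 - b2 * a12) / det, (b2 * a11 - b1 * a12) / det)
sse = sum((yi - beta[0] * z[0] - beta[1] * z[1]) ** 2 for z, yi in zip(Z, y))
```

In the full procedure, many such compressed fits would be estimated in parallel and their predictions combined via posterior model probabilities.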
Daniel Mittendorf PhD student, Adam Smith Business School, University of Glasgow
Authors: Karel Kozmik
Karel Kozmik PhD student, Charles University
Authors: Petr Vejmelka
Petr Vejmelka PhD student, Charles University