Extreme returns and tail modelling of the CSI 300 index for the Chinese equity market

November 23, 2023 (updated January 27, 2024) by Shengyu ZHENG

In this article, Shengyu ZHENG (ESSEC Business School, Grande Ecole Program – Master in Management, 2020-2024) describes the statistical behavior of extreme returns of the CSI 300 index for the Chinese equity market and explains how extreme value theory can be used to model the tails of its distribution.

The CSI 300 index for the Chinese equity market

The CSI 300 Index, or China Securities Index 300, is a comprehensive stock market benchmark that tracks the performance of the top 300 A-share stocks listed on the Shanghai and Shenzhen stock exchanges. Introduced in 2005, the index is designed to represent a broad and diverse spectrum of China’s leading companies across various sectors, including finance, technology, consumer goods, and manufacturing. The CSI 300 is a crucial indicator of the overall health and direction of the Chinese stock market, reflecting the dynamic growth and evolution of China’s economy.

The CSI 300 employs a free-float market capitalization-weighted methodology. This means that the index’s composition and movements are influenced by the market value of the freely tradable shares, providing a more accurate representation of the companies’ actual impact on the market. As China continues to play a significant role in the global economy, the CSI 300 has become a key reference point for investors seeking exposure to the Chinese market and monitoring economic trends in the dynamic economy. With its emphasis on the country’s most influential and traded stocks, the CSI 300 serves as an essential tool for both domestic and international investors navigating the complexities of the Chinese financial landscape.
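To make the free-float weighting concrete, here is a minimal sketch in Python. The tickers, prices, share counts and free-float factors are hypothetical and chosen only for illustration; the actual CSI 300 methodology includes further rules (such as banded free-float factors) that are not reproduced here.

```python
# Hypothetical constituents: (price, shares outstanding, free-float factor).
# These figures are illustrative assumptions, not real CSI 300 data.
constituents = {
    "AAA": (50.0, 1_000_000, 0.80),
    "BBB": (20.0, 5_000_000, 0.50),
    "CCC": (100.0, 300_000, 1.00),
}


def free_float_weights(data):
    """Weight of each stock = its free-float market cap / total free-float cap."""
    caps = {
        name: price * shares * free_float
        for name, (price, shares, free_float) in data.items()
    }
    total = sum(caps.values())
    return {name: cap / total for name, cap in caps.items()}


weights = free_float_weights(constituents)
for name, weight in weights.items():
    print(f"{name}: {weight:.2%}")
```

In this toy example, BBB has the largest total market capitalization, but only half of its shares are freely tradable, which reduces its weight accordingly.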

In this article, we focus on the CSI 300 index over the period from March 11th, 2021, to April 1st, 2023. Figure 1 below shows the daily evolution of the index level over this period.

Figure 1. Evolution of the CSI 300 index.

Source: computation by the author (data: Yahoo! Finance website).

Figure 2 below gives the evolution of the daily logarithmic returns of the CSI 300 index from March 11th, 2021, to April 1st, 2023. We observe volatility clustering: periods of large price fluctuations in both directions (up and down movements) alternate with calmer periods. This alternation of periods of low and high volatility is well captured by ARCH models.

Figure 2. Evolution of the CSI 300 index logarithmic returns.

Source: computation by the author (data: Yahoo! Finance website).
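As a reminder of how the series in Figure 2 is obtained, the logarithmic return on day t is ln(P_t / P_{t-1}). The short Python sketch below illustrates the computation; the price list is hypothetical and does not come from the author's data.

```python
import math


def log_returns(prices):
    """Daily logarithmic returns ln(P_t / P_{t-1}) from a list of index levels."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


# Hypothetical index levels, for illustration only.
prices = [4000.0, 4040.0, 3999.6, 4039.596]
returns = log_returns(prices)
print(returns)
```

A convenient property of logarithmic returns is that they add up over time: the sum of the daily values equals the logarithmic return over the whole period.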

Summary statistics for the CSI 300 index

Table 1 below presents the summary statistics estimated for the CSI 300 index:

Table 1. Summary statistics for the CSI 300 index.
Source: computation by the author (data: Yahoo! Finance website).

The mean, the standard deviation (or variance), the skewness, and the kurtosis are based on the first, second, third and fourth moments of the statistical distribution of returns, respectively. Table 1 indicates that over this period the CSI 300 index follows a downward trend, with relatively large daily dispersion, negative skewness and excess kurtosis.
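The four statistics can be computed directly from the sample moments. The sketch below is one possible implementation in Python; it uses population (biased) moment estimators, which is an assumption, since the estimator used for Table 1 is not specified in the article.

```python
import math


def summary_statistics(returns):
    """Mean, standard deviation, skewness and excess kurtosis of a return series.

    Population (1/n) moments are used; this choice is an assumption.
    """
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n  # variance
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    std = math.sqrt(m2)
    return {
        "mean": mean,
        "std": std,
        "skewness": m3 / std ** 3,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,  # 0 for a normal distribution
    }


# Hypothetical daily returns, for illustration only.
stats = summary_statistics([-0.021, 0.004, 0.010, -0.003, 0.007, -0.035, 0.012])
print(stats)
```

A negative skewness signals that large losses are more pronounced than large gains, and a positive excess kurtosis signals tails that are heavier than those of the normal distribution.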

Tables 2 and 3 below present the top 10 negative daily returns and top 10 positive daily returns for the index over the period from March 11th, 2021, to April 1st, 2023.

Table 2. Top 10 negative daily returns for the CSI 300 index.
Source: computation by the author (data: Yahoo! Finance website).

Table 3. Top 10 positive daily returns for the CSI 300 index.
Source: computation by the author (data: Yahoo! Finance website).
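Tables such as Tables 2 and 3 can be produced by sorting the dated returns. Here is a minimal Python sketch; the function name and the sample data are hypothetical.

```python
def extreme_returns(dated_returns, k=10):
    """Return the k most negative and the k most positive (date, return) pairs."""
    ordered = sorted(dated_returns, key=lambda pair: pair[1])
    worst = ordered[:k]        # most negative first
    best = ordered[::-1][:k]   # most positive first
    return worst, best


# Hypothetical (date, daily return) pairs, for illustration only.
sample = [("d1", 0.01), ("d2", -0.03), ("d3", 0.02), ("d4", -0.01)]
worst, best = extreme_returns(sample, k=2)
print(worst)
print(best)
```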

Modelling of the tails

Here the tail modelling is based on the Peak-over-Threshold (POT) approach, in which the excesses over a high threshold are modelled with a Generalized Pareto Distribution (GPD). Let us recall the theoretical background of this approach.

The POT approach considers all observations above a designated high threshold u. The excesses over the threshold can be fitted with a generalized Pareto distribution:

Illustration of the POT approach
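For reference, the distribution function of the GPD in its usual textbook parameterization is recalled below, with shape parameter ξ and scale parameter σ, for an excess y = x − u over the threshold. It is assumed here that this matches the parameterization used in the illustration above.

```latex
G_{\xi,\sigma}(y) =
\begin{cases}
1 - \left(1 + \dfrac{\xi\, y}{\sigma}\right)^{-1/\xi}, & \xi \neq 0, \\[6pt]
1 - \exp\!\left(-\dfrac{y}{\sigma}\right), & \xi = 0,
\end{cases}
\qquad y = x - u \ge 0,\quad \sigma > 0 .
```

The formula holds where 1 + ξy/σ > 0. A positive ξ corresponds to a heavy (power-law) tail, ξ = 0 to an exponential tail, and a negative ξ to a tail with a finite upper end point.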

An important issue for the POT-GPD approach is the selection of the threshold. An optimal threshold level can be derived by calibrating the tradeoff between bias (threshold too low) and inefficiency (threshold too high, leaving too few observations). Several approaches exist to address this problem, including a Monte Carlo simulation method inspired by the work of Jansen and de Vries (1991). In this article, to fit the GPD, we use the 2.5% quantile for the modelling of the negative tail and the 97.5% quantile for that of the positive tail.
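The fixed-quantile threshold rule described above can be sketched as follows in Python. The quantile is computed by linear interpolation between order statistics, which is an assumption (other conventions exist), and the function names are hypothetical. For the negative tail, the sign of the returns is flipped so that losses can be treated as a right tail: the 97.5% quantile of the losses corresponds to the 2.5% quantile of the returns.

```python
def empirical_quantile(data, q):
    """Empirical quantile with linear interpolation between order statistics."""
    ordered = sorted(data)
    position = q * (len(ordered) - 1)
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (position - low) * (ordered[high] - ordered[low])


def tail_excesses(values, q=0.975):
    """Threshold u at the q-quantile and the excesses (x - u) for all x > u."""
    u = empirical_quantile(values, q)
    return u, [x - u for x in values if x > u]


def negative_tail_excesses(returns, q=0.975):
    """Same as tail_excesses, applied to losses (-returns) for the left tail."""
    losses = [-r for r in returns]
    return tail_excesses(losses, q)


# Hypothetical daily returns, for illustration only.
returns = [0.001 * k for k in range(-100, 101)]
u_pos, excesses_pos = tail_excesses(returns)
u_neg, excesses_neg = negative_tail_excesses(returns)
print(u_pos, len(excesses_pos), u_neg, len(excesses_neg))
```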

Based on the POT-GPD approach with a fixed threshold selection, we arrive at the following modelling results for the GPD for negative extreme returns (Table 4) and positive extreme returns (Table 5) for the CSI 300 index:

Table 4. Estimate of the parameters of the GPD for negative daily returns for the CSI 300 index.
Source: computation by the author (data: Yahoo! Finance website).

Table 5. Estimate of the parameters of the GPD for positive daily returns for the CSI 300 index.
Source: computation by the author (data: Yahoo! Finance website).
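To illustrate how GPD parameters can be estimated from a set of excesses, here is a minimal Python sketch based on the method of moments. This is a deliberately simple stand-in: the article does not state which estimator produced Tables 4 and 5 (maximum likelihood is a common choice), and the method of moments is only valid when the shape parameter ξ is below 1/2. It relies on the GPD mean σ/(1−ξ) and variance σ²/((1−ξ)²(1−2ξ)).

```python
from statistics import mean, variance


def fit_gpd_moments(excesses):
    """Method-of-moments estimates (xi, sigma) of a GPD fitted to excesses.

    Assumes xi < 1/2 so that the variance of the GPD exists.
    """
    m = mean(excesses)
    s2 = variance(excesses)  # sample variance (n - 1 denominator)
    ratio = m * m / s2
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * m * (1.0 + ratio)
    return xi, sigma


# Hypothetical excesses over the threshold, for illustration only.
excesses = [0.002, 0.005, 0.001, 0.012, 0.004, 0.030, 0.007, 0.003]
xi_hat, sigma_hat = fit_gpd_moments(excesses)
print(xi_hat, sigma_hat)
```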

Figure 3 represents the historical distribution of negative return exceedances and the estimated GPD for the left tail.

Figure 3. GPD for the left tail of the CSI 300 index returns.

Source: computation by the author (data: Yahoo! Finance website).

Figure 4 represents the historical distribution of positive return exceedances and the estimated GPD for the right tail.

Figure 4. GPD for the right tail of the CSI 300 index returns.

Source: computation by the author (data: Yahoo! Finance website).

Applications in risk management

Extreme Value Theory (EVT) is a statistical approach used to analyze the tails of a distribution, focusing on extreme events or rare occurrences. EVT can be applied to various risk management techniques, including Value at Risk (VaR), Expected Shortfall (ES), and stress testing, to provide a more comprehensive understanding of extreme risks in financial markets.
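As an illustration of how a fitted GPD feeds into VaR and ES, the Python sketch below applies the standard POT tail estimator to losses (expressed as positive numbers): VaR_q = u + (σ/ξ)[((n/N_u)(1−q))^(−ξ) − 1] and ES_q = VaR_q/(1−ξ) + (σ − ξu)/(1−ξ), where n is the sample size and N_u the number of observations above the threshold u. All parameter values are hypothetical and are not taken from the tables of this article.

```python
import math


def gpd_var(q, u, xi, sigma, n, n_u):
    """Value at Risk at confidence level q from a GPD fitted above threshold u."""
    tail_ratio = (n / n_u) * (1.0 - q)
    if abs(xi) < 1e-12:  # exponential-tail limit as xi tends to 0
        return u - sigma * math.log(tail_ratio)
    return u + sigma / xi * (tail_ratio ** (-xi) - 1.0)


def gpd_es(q, u, xi, sigma, n, n_u):
    """Expected Shortfall at level q; requires xi < 1 for a finite mean."""
    var_q = gpd_var(q, u, xi, sigma, n, n_u)
    return var_q / (1.0 - xi) + (sigma - xi * u) / (1.0 - xi)


# Hypothetical inputs: threshold, shape, scale, sample size, number of exceedances.
u, xi, sigma, n, n_u = 0.02, 0.2, 0.01, 1000, 25
var_99 = gpd_var(0.99, u, xi, sigma, n, n_u)
es_99 = gpd_es(0.99, u, xi, sigma, n, n_u)
print(var_99, es_99)
```

By construction, the ES at a given level is larger than the VaR at the same level, since it averages the losses beyond the VaR.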

Why should I be interested in this post?

Extreme Value Theory is a useful tool to model the tails of the return distribution of a financial instrument. In the ever-evolving landscape of financial markets, a good grasp of EVT gives an edge to students who aspire to become investment or risk managers. It not only provides a deeper insight into the dynamics of equity markets but also equips them with a practical skill set essential for risk analysis. By exploring how EVT refines risk measures like Value at Risk (VaR) and Expected Shortfall (ES) and its role in stress testing, students gain a valuable perspective on how financial institutions navigate extreme events. In a world where financial crises and market volatility are recurrent, this post opens the door to a powerful analytical framework that contributes to informed decisions and financial stability.

Download R file to model extreme behavior of the index

You can find below an R file (file with txt format) to study extreme returns and model the distribution tails for the CSI 300 index.

Useful resources

Academic resources

Embrechts P., C. Klüppelberg and T. Mikosch (1997) Modelling Extremal Events for Insurance and Finance Springer-Verlag.

Embrechts P., R. Frey and A.J. McNeil (2022) Quantitative Risk Management Princeton University Press.

Gumbel, E. J. (1958) Statistics of extremes New York: Columbia University Press.

Longin F. (2016) Extreme events in finance: a handbook of extreme value theory and its applications Wiley Editions.

Other resources

Extreme Events in Finance

Chan S. Statistical tools for extreme value analysis

Rieder H. E. (2014) Extreme Value Theory: A primer (slides).

About the author

The article was written in November 2023 by Shengyu ZHENG (ESSEC Business School, Grande Ecole Program – Master in Management, 2020-2024).

Extreme returns and tail modelling of the Nikkei 225 index for the Japanese equity market

November 23, 2023 by Shengyu ZHENG

In this article, Shengyu ZHENG (ESSEC Business School, Grande Ecole Program – Master in Management, 2020-2024) describes the statistical behavior of extreme returns of the Nikkei 225 index for the Japanese equity market and explains how extreme value theory can be used to model the tails of its distribution.

The Nikkei 225 index for the Japanese equity market

The Nikkei 225, often simply referred to as the Nikkei, is a stock market index representing the performance of 225 major companies listed on the Tokyo Stock Exchange (TSE). Originating in 1950, this index has become a symbol of Japan’s economic prowess and serves as a crucial benchmark in the Asian financial markets. Comprising companies across diverse sectors such as technology, automotive, finance, and manufacturing, the Nikkei 225 offers a comprehensive snapshot of the Japanese economic landscape, reflecting the nation’s technological innovation, industrial strength, and global economic influence.

Utilizing a price-weighted methodology, the Nikkei 225 calculates its value based on stock prices rather than market capitalization, distinguishing it from many other indices. This approach means that higher-priced stocks have a more significant impact on the index’s movements. Investors and financial analysts worldwide closely monitor the Nikkei 225 for insights into Japan’s economic trends, market sentiment, and investment opportunities. As a vital indicator of the direction of the Japanese stock market, the Nikkei 225 continues to be a key reference point for making informed investment decisions and navigating the complexities of the global financial landscape.
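The effect of price weighting can be seen in a small Python sketch. The tickers, prices and divisor below are hypothetical, and the calculation is a simplification: the actual Nikkei 225 applies price adjustments to constituents and revises its divisor after events such as stock splits and constituent changes.

```python
# Hypothetical share prices and divisor, for illustration only.
prices = {"AAA": 5000.0, "BBB": 1000.0, "CCC": 250.0}
divisor = 3.0


def price_weighted_level(prices, divisor):
    """Index level = sum of constituent prices / divisor."""
    return sum(prices.values()) / divisor


def price_weights(prices):
    """Implicit weight of each stock = its price / sum of all prices."""
    total = sum(prices.values())
    return {name: price / total for name, price in prices.items()}


level = price_weighted_level(prices, divisor)
weights = price_weights(prices)
print(level, weights)
```

In this toy example, the stock with the highest share price accounts for 80% of the index, regardless of the size of the company.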

In this article, we focus on the Nikkei 225 index over the period from April 1st, 2015, to April 1st, 2023. Figure 1 below shows the daily evolution of the index level over this period.

Figure 1. Evolution of the Nikkei 225 index.

Source: computation by the author (data: Yahoo! Finance website).

Figure 2 below gives the evolution of the daily logarithmic returns of the Nikkei 225 index from April 1, 2015 to April 1, 2023. We observe volatility clustering: periods of large price fluctuations in both directions (up and down movements) alternate with calmer periods. This alternation of periods of low and high volatility is well captured by ARCH models.

Figure 2. Evolution of the Nikkei 225 index logarithmic returns.

Source: computation by the author (data: Yahoo! Finance website).
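To see how an ARCH model generates this alternation of calm and turbulent periods, here is a minimal Python simulation of an ARCH(1) process, in which the conditional variance of today's return is ω + α r²(t−1). The parameter values are hypothetical and are not estimated from the Nikkei 225 data.

```python
import math
import random


def simulate_arch1(n, omega=1e-5, alpha=0.5, seed=0):
    """Simulate n returns from an ARCH(1) process with Gaussian innovations.

    Conditional variance: sigma_t^2 = omega + alpha * r_{t-1}^2.
    """
    rng = random.Random(seed)
    returns = []
    previous = 0.0
    for _ in range(n):
        conditional_variance = omega + alpha * previous ** 2
        previous = math.sqrt(conditional_variance) * rng.gauss(0.0, 1.0)
        returns.append(previous)
    return returns


simulated = simulate_arch1(1000)
print(max(simulated), min(simulated))
```

A large return (of either sign) raises the variance of the next return, so large moves tend to follow large moves, which is the clustering pattern visible in Figure 2.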

Summary statistics for the Nikkei 225 index

Table 1 below presents the summary statistics estimated for the Nikkei 225 index:

Table 1. Summary statistics for the Nikkei 225 index.
Source: computation by the author (data: Yahoo! Finance website).

The mean, the standard deviation (or variance), the skewness, and the kurtosis are based on the first, second, third and fourth moments of the statistical distribution of returns, respectively. Table 1 indicates that over this period the Nikkei 225 index follows a slight upward trend, with relatively large daily dispersion, negative skewness and excess kurtosis.

Tables 2 and 3 below present the top 10 negative daily returns and top 10 positive daily returns for the index over the period from April 1, 2015 to April 1, 2023.

Table 2. Top 10 negative daily returns for the Nikkei 225 index.
Source: computation by the author (data: Yahoo! Finance website).

Table 3. Top 10 positive daily returns for the Nikkei 225 index.
Source: computation by the author (data: Yahoo! Finance website).

Modelling of the tails

Here the tail modelling is based on the Peak-over-Threshold (POT) approach, in which the excesses over a high threshold are modelled with a Generalized Pareto Distribution (GPD). Let us recall the theoretical background of this approach.

The POT approach considers all observations above a designated high threshold u. The excesses over the threshold can be fitted with a generalized Pareto distribution:

Illustration of the POT approach

Table 4. Estimate of the parameters of the GPD for negative daily returns for the Nikkei 225 index.
Source: computation by the author (data: Yahoo! Finance website).

Table 5. Estimate of the parameters of the GPD for positive daily returns for the Nikkei 225 index.
Source: computation by the author (data: Yahoo! Finance website).

Figure 3. GPD for the left tail of the Nikkei 225 index returns.

Source: computation by the author (data: Yahoo! Finance website).

Figure 4. GPD for the right tail of the Nikkei 225 index returns.

Source: computation by the author (data: Yahoo! Finance website).

Applications in risk management

Why should I be interested in this post?

Download R file to model extreme behavior of the index

You can find below an R file (file with txt format) to study extreme returns and model the distribution tails for the Nikkei 225 index.

Useful resources

Academic resources

Embrechts P., C. Klüppelberg and T. Mikosch (1997) Modelling Extremal Events for Insurance and Finance Springer-Verlag.

Embrechts P., R. Frey and A.J. McNeil (2022) Quantitative Risk Management Princeton University Press.

Gumbel, E. J. (1958) Statistics of extremes New York: Columbia University Press.

Longin F. (2016) Extreme events in finance: a handbook of extreme value theory and its applications Wiley Editions.

Other resources

Extreme Events in Finance

Chan S. Statistical tools for extreme value analysis

Rieder H. E. (2014) Extreme Value Theory: A primer (slides).

About the author

The article was written in November 2023 by Shengyu ZHENG (ESSEC Business School, Grande Ecole Program – Master in Management, 2020-2024).

Extreme returns and tail modelling of the FTSE 100 index for the UK equity market

November 23, 2023 by Shengyu ZHENG

In this article, Shengyu ZHENG (ESSEC Business School, Grande Ecole Program – Master in Management, 2020-2024) describes the statistical behavior of extreme returns of the FTSE 100 index for the UK equity market and explains how extreme value theory can be used to model the tails of its distribution.

The FTSE 100 index for the UK equity market

The FTSE 100 index, an acronym for the Financial Times Stock Exchange 100 Index, stands as a cornerstone of the UK financial landscape. Comprising the largest and most robust companies listed on the London Stock Exchange (LSE), this index is a barometer for the overall health and trajectory of the British stock market. Spanning diverse sectors such as finance, energy, healthcare, and consumer goods, the FTSE 100 encapsulates the economic pulse of the nation. The 100 companies in the index are chosen based on their market capitalization, with larger entities carrying more weight in the index’s calculation, making it a valuable tool for investors seeking a comprehensive snapshot of the UK’s economic performance.

Investors and analysts globally turn to the FTSE 100 for insights into market trends and economic stability in the UK. The index’s movements provide a useful reference point for decision-making, enabling investors to gauge the relative strength and weaknesses of different industries and the economy at large. Moreover, the FTSE 100 serves as a powerful benchmark for numerous financial instruments, including mutual funds, exchange-traded funds (ETFs), and other investment products. As a result, the index plays a pivotal role in shaping investment strategies and fostering a deeper understanding of the intricate dynamics that drive the British financial markets.

In this article, we focus on the FTSE 100 index over the period from April 1st, 2015, to April 1st, 2023. Figure 1 below shows the daily evolution of the index level over this period.

Figure 1. Evolution of the FTSE 100 index.

Source: computation by the author (data: Yahoo! Finance website).

Figure 2 below gives the evolution of the daily logarithmic returns of the FTSE 100 index from April 1, 2015 to April 1, 2023. We observe volatility clustering: periods of large price fluctuations in both directions (up and down movements) alternate with calmer periods. This alternation of periods of low and high volatility is well captured by ARCH models.

Figure 2. Evolution of the FTSE 100 index returns.

Source: computation by the author (data: Yahoo! Finance website).

Summary statistics for the FTSE 100 index

Table 1 below presents the summary statistics estimated for the FTSE 100 index:

Table 1. Summary statistics for the FTSE 100 index returns.
Source: computation by the author (data: Yahoo! Finance website).

The mean, the standard deviation (or variance), the skewness, and the kurtosis are based on the first, second, third and fourth moments of the statistical distribution of returns, respectively. Table 1 indicates that over this period the FTSE 100 index follows a slight upward trend, with relatively large daily dispersion, negative skewness and excess kurtosis.

Tables 2 and 3 below present the top 10 negative daily returns and top 10 positive daily returns for the index over the period from April 1, 2015 to April 1, 2023.

Table 2. Top 10 negative daily returns for the FTSE 100 index.
Source: computation by the author (data: Yahoo! Finance website).

Table 3. Top 10 positive daily returns for the FTSE 100 index.
Source: computation by the author (data: Yahoo! Finance website).

Modelling of the tails

The Peak-over-Threshold (POT) approach considers all observations above a designated high threshold u. The excesses over the threshold can be fitted with a generalized Pareto distribution (GPD):

Illustration of the POT approach

Table 4. Estimate of the parameters of the GPD for negative daily returns for the FTSE 100 index.

Source: computation by the author (data: Yahoo! Finance website).

Table 5. Estimate of the parameters of the GPD for positive daily returns for the FTSE 100 index.

Source: computation by the author (data: Yahoo! Finance website).

Figure 3. GPD for the left tail of the FTSE 100 index returns.

Source: computation by the author (data: Yahoo! Finance website).

Figure 4. GPD for the right tail of the FTSE 100 index returns.

Source: computation by the author (data: Yahoo! Finance website).

Applications in risk management

Why should I be interested in this post?

Download R file to model extreme behavior of the index

You can find below an R file (file with txt format) to study extreme returns and model the distribution tails for the FTSE 100 index.

Useful resources

Academic resources

Embrechts P., C. Klüppelberg and T. Mikosch (1997) Modelling Extremal Events for Insurance and Finance Springer-Verlag.

Embrechts P., R. Frey and A.J. McNeil (2022) Quantitative Risk Management Princeton University Press.

Gumbel, E. J. (1958) Statistics of extremes New York: Columbia University Press.

Longin F. (2016) Extreme events in finance: a handbook of extreme value theory and its applications Wiley Editions.

Other resources

Extreme Events in Finance

Chan S. Statistical tools for extreme value analysis

Rieder H. E. (2014) Extreme Value Theory: A primer (slides).

About the author

The article was written in November 2023 by Shengyu ZHENG (ESSEC Business School, Grande Ecole Program – Master in Management, 2020-2024).

Copula

November 23, 2023 (updated February 17, 2026) by Shengyu ZHENG

In this article, Shengyu ZHENG (ESSEC Business School, Grande Ecole Program – Master in Management, 2020-2024) presents copulas, a statistical tool commonly used to model the dependence between random variables.

Linear correlation

In a world full of risks of various kinds, looking at each risk in isolation does not suffice, since the interactions between risks can increase or reduce the aggregate risk. Linear correlation, one of the simplest ways to describe the dependence between random variables, is commonly used in statistical modelling for this purpose.

Definition of linear correlation

To put it concisely, the linear correlation coefficient, denoted by ‘ρ(X,Y)’, takes values within the range of -1 to 1 and represents the linear correlation of two random variables X and Y. A positive ‘ρ(X,Y)’ indicates a positive linear relationship, signifying that as one variable increases, the other tends to increase as well. Conversely, a negative ‘ρ(X,Y)’ denotes a negative linear relationship, signifying that as one variable increases, the other tends to decrease. A correlation coefficient near zero implies a lack of linear relation.

Limitation of linear correlation

Linear correlation is a simple measure with the advantage of being easy to apply, but it fails to capture the intricacy of the dependence structure between random variables. It has three main limitations.

ρ(X,Y) only gives a scalar summary of linear dependence, and it requires that both var(X) and var(Y) exist and are finite;
If X and Y are stochastically independent, then ρ(X,Y) = 0. However, the converse does not hold in general (an exception is the case where (X,Y) is a Gaussian random vector);
Linear correlation is not invariant under strictly increasing transformations. If T is such a (nonlinear) transformation, then in general ρ(T(X),T(Y)) ≠ ρ(X,Y).

Therefore, knowing the marginal distributions of two random variables and their linear correlation does not suffice to determine their joint distribution.
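The second and third limitations can be checked numerically on small deterministic examples. The Python sketch below is only an illustration with hypothetical data: the first example shows two variables that are perfectly dependent (Y = X²) yet uncorrelated, and the second shows that applying the strictly increasing transformation T = exp to both variables changes the correlation coefficient.

```python
import math


def pearson(xs, ys):
    """Sample linear (Pearson) correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Dependent but uncorrelated: Y is a deterministic function of X.
x_sym = [-2.0, -1.0, 0.0, 1.0, 2.0]
rho_dependent = pearson(x_sym, [x ** 2 for x in x_sym])

# Not invariant under a strictly increasing transformation (here T = exp).
x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 4.0, 9.0]
rho_raw = pearson(x, y)
rho_transformed = pearson([math.exp(a) for a in x], [math.exp(b) for b in y])

print(rho_dependent, rho_raw, rho_transformed)
```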

Copula

A copula is a mathematical function that describes the dependence structure between multiple random variables, irrespective of their marginal distributions. It describes the interdependency that transcends linear relationships. Copulas are employed to model the joint distribution of variables by separating the marginal distributions from the dependence structure, allowing for a more flexible and comprehensive analysis of multivariate data. Essentially, copulas serve as a bridge between the individual distributions of variables and their joint distribution, enabling the characterization of their interdependence.

Definition of copula

A copula, denoted typically as C∶[0,1]^d→[0,1] , is a multivariate distribution function whose marginals are uniformly distributed on the unit interval. The parameter d is the number of variables. For a set of random variables U₁, …, U_d with cumulative distribution functions F₁, …, F_d, the copula function C satisfies:

C(F₁(u₁),…,F_d(u_d)) = ℙ(U₁≤u₁,…,U_d≤u_d)

Fréchet-Hoeffding bounds

The Fréchet–Hoeffding theorem states that copulas follow the bounds:

max{1 − d + ∑_{i=1}^{d} u_i, 0} ≤ C(u) ≤ min{u₁, …, u_d}

In a bivariate case (dimension equals 2), the Fréchet–Hoeffding bounds are

max{u+v-1,0} ≤ C(u,v) ≤ min{u,v}

In the bivariate case, the upper bound corresponds to comonotonicity (perfect positive dependence) and the lower bound corresponds to countermonotonicity (perfect negative dependence).
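As a quick numerical check of these bounds, the Python sketch below evaluates the lower bound, the upper bound and the independence copula C(u) = u₁ × … × u_d on a grid of points. The independence copula is used here only as a convenient example of a copula; the function names are hypothetical.

```python
import itertools
import math


def lower_bound(u):
    """Fréchet-Hoeffding lower bound: max(1 - d + sum(u), 0)."""
    return max(1.0 - len(u) + sum(u), 0.0)


def upper_bound(u):
    """Fréchet-Hoeffding upper bound: min(u_1, ..., u_d)."""
    return min(u)


def independence_copula(u):
    """Copula of independent variables: the product of the arguments."""
    return math.prod(u)


grid = [k / 10.0 for k in range(11)]
bounds_hold = all(
    lower_bound(u) - 1e-12 <= independence_copula(u) <= upper_bound(u) + 1e-12
    for d in (2, 3)
    for u in itertools.product(grid, repeat=d)
)
print(bounds_hold)
```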

Sklar’s theorem

Sklar’s theorem states that every multivariate cumulative distribution function of a random vector X can be expressed in terms of its marginals and a copula. The copula is unique if the marginal distributions are continuous. The theorem states also that the converse is true.

Sklar’s theorem shows how a unique copula C fully describes the dependence of X. The theorem provides a way to decompose a multivariate joint distribution function into its marginal distributions and a copula function.
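In symbols, for a random vector X = (X₁, …, X_d) with joint distribution function F and marginal distribution functions F₁, …, F_d, the decomposition in Sklar's theorem is usually written as follows (the notation F and x_i is introduced here for illustration):

```latex
F(x_1, \dots, x_d) = C\bigl(F_1(x_1), \dots, F_d(x_d)\bigr),
\qquad
C(u_1, \dots, u_d) = F\bigl(F_1^{-1}(u_1), \dots, F_d^{-1}(u_d)\bigr).
```

The second expression, which recovers the copula from the joint distribution, holds when the marginals are continuous, with F_i⁻¹ denoting the quantile function of X_i.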

Examples of copulas

Many types of dependence structures exist, and new copulas are being introduced by researchers. There are three standard classes of copulas that are commonly in use among practitioners: elliptical or normal copulas, Archimedean copulas, and extreme value copulas.

Elliptical or normal copulas

The Gaussian copula and the Student-t copula belong to this category. The Gaussian copula played a notable role in the 2008 financial crisis, particularly in the context of mortgage-backed securities and collateralized debt obligations (CDOs): models relying on the normality assumption underestimated systemic risk and failed to account for extreme risks in times of crisis.

Here is an example of a simulated normal copula with the parameter being 0.8.

Figure 1. Simulation of normal copula.

Source: computation by the author.
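A simulation of this kind can be sketched in a few lines of Python. This is not the author's R code: it is a minimal illustration that draws correlated standard normal pairs and maps each coordinate to the unit interval with the standard normal distribution function, assuming that the parameter 0.8 is the correlation of the underlying normal vector.

```python
import math
import random
from statistics import NormalDist


def simulate_gaussian_copula(n, rho, seed=0):
    """Draw n points (u, v) from a bivariate Gaussian copula with correlation rho."""
    rng = random.Random(seed)
    phi = NormalDist().cdf  # standard normal distribution function
    sample = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second normal variable with correlation rho to z1.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        sample.append((phi(z1), phi(z2)))
    return sample


points = simulate_gaussian_copula(1000, rho=0.8)
print(points[:3])
```

Each coordinate of the simulated points is uniformly distributed on [0, 1]; only the dependence between the two coordinates is inherited from the normal vector.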

Archimedean copulas

Archimedean copulas are a class of copulas with a particular mathematical structure: each of them is built from a single function, known as its Archimedean generator.

Here is an example of a simulated Clayton copula, a member of the Archimedean class, with the parameter being 3.

Figure 2. Simulation of Clayton copula.

Source: computation by the author.
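A bivariate Clayton copula, C(u,v) = (u^(−θ) + v^(−θ) − 1)^(−1/θ) with θ > 0, can be simulated with the conditional inversion method: draw u and an independent uniform w, then solve ∂C/∂u (u, v) = w for v. The Python sketch below illustrates this approach; it is not the author's R code, and the function name is hypothetical.

```python
import random


def simulate_clayton(n, theta, seed=0):
    """Draw n points (u, v) from a bivariate Clayton copula (theta > 0)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
        w = 1.0 - rng.random()
        # Inverse of the conditional distribution of V given U = u.
        v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        sample.append((u, v))
    return sample


points = simulate_clayton(1000, theta=3.0)
print(points[:3])
```

In a scatter plot of such a sample, the points concentrate near the lower-left corner, which reflects the lower tail dependence of the Clayton copula.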

Extreme value copulas

Extreme value copulas could overlap with the two other classes. They are a specialized class of copulas designed to model the tail dependence structure of multivariate extreme events. These copulas are particularly useful in situations where the focus is on capturing dependencies in the extreme upper or lower tails of the distribution.

Here is an example of a simulated Tawn copula, a member of the extreme value class, with the parameter being 0.8.

Figure 3. Simulation of Tawn copula.
Source: computation by the author.

Download R file to simulate copulas

You can find below an R file (file with txt format) to simulate the 3 copulas mentioned above.

Why should I be interested in this post?

Copulas are pivotal in risk management, offering a sophisticated approach to model the dependence among various risk factors. They play a crucial role in portfolio risk assessment, providing insights into how different assets behave together and enhancing the robustness of risk measures, especially in capturing tail dependencies. Copulas are also valuable in credit risk management, aiding in the assessment of joint default probabilities and contributing to an understanding of credit risks associated with diverse financial instruments. Their applications extend to insurance, operational risk management, and stress testing scenarios, providing a toolset for comprehensive risk evaluation and informed decision-making in dynamic financial environments.

Useful resources

Course notes from Quantitative Risk Management of Prof. Marie Kratz, ESSEC Business School.

About the author

The article was written in November 2023 by Shengyu ZHENG (ESSEC Business School, Grande Ecole Program – Master in Management, 2020-2024).

Extreme returns and tail modelling of the S&P 500 index for the US equity market

October 20, 2023 by Shengyu ZHENG

In this article, Shengyu ZHENG (ESSEC Business School, Grande Ecole Program – Master in Management, 2020-2024) describes the statistical behavior of extreme returns of the S&P 500 index for the US equity market and explains how extreme value theory can be used to model the tails of its distribution.

The S&P 500 index for the US equity market

The S&P 500, or the Standard & Poor’s 500, is a renowned stock market index encompassing 500 of the largest publicly traded companies in the United States. These companies are selected based on factors like market capitalization and sector representation, making the index a diversified and reliable reflection of the U.S. stock market. It is a market capitalization-weighted index, in which companies with a larger market capitalization have a greater influence on its performance. The S&P 500 is widely used as a benchmark to assess the health and trends of the U.S. economy and as a performance reference for individual stocks and investment products, including exchange-traded funds (ETFs) and index funds. Its historical significance, economic indicator status, and global impact contribute to its status as a critical barometer of market conditions and overall economic health.

Characterized by its diversification and broad sector representation, the S&P 500 remains an essential tool for investors, policymakers, and economists to analyze market dynamics. This index’s performance, affected by economic data, geopolitical events, corporate earnings, and market sentiment, can provide valuable insights into the state of the U.S. stock market and the broader economy. Its rebalancing ensures that it remains current and representative of the ever-evolving landscape of American corporations. Overall, the S&P 500 plays a central role in shaping investment decisions and assessing the performance of the U.S. economy.

In this article, we focus on the S&P 500 index over the period from April 1st, 2015, to April 1st, 2023. Figure 1 below shows the daily evolution of the index level over this period. We can observe an overall increase, with marked drops during the Covid crisis (2020) and the Russian invasion of Ukraine (2022).

Figure 1. Evolution of the S&P 500 index.

Source: computation by the author (data: Yahoo! Finance website).

Figure 2 below gives the evolution of the daily logarithmic returns of the S&P 500 index from April 1, 2015 to April 1, 2023. We observe volatility clustering: periods of large price fluctuations in both directions (up and down movements) alternate with calmer periods. This alternation of periods of low and high volatility is well captured by ARCH models.

Figure 2. Evolution of the S&P 500 index logarithmic returns.

Source: computation by the author (data: Yahoo! Finance website).

Summary statistics for the S&P 500 index

Table 1 below presents the summary statistics estimated for the S&P 500 index:

Table 1. Summary statistics for the S&P 500 index.
Source: computation by the author (data: Yahoo! Finance website).

The mean, the standard deviation (or variance), the skewness, and the kurtosis are based on the first, second, third and fourth moments of the statistical distribution of returns, respectively. Table 1 indicates that over this period the S&P 500 index follows a slight upward trend, with relatively large daily dispersion, negative skewness and excess kurtosis.

Tables 2 and 3 below present the top 10 negative daily returns and top 10 positive daily returns for the S&P 500 index over the period from April 1, 2015 to April 1, 2023.

Table 2. Top 10 negative daily returns for the S&P 500 index.
Source: computation by the author (data: Yahoo! Finance website).

Table 3. Top 10 positive daily returns for the S&P 500 index.
Source: computation by the author (data: Yahoo! Finance website).

Modelling of the tails

The Peak-over-Threshold (POT) approach considers all observations above a designated high threshold u. The excesses over the threshold can be fitted with a generalized Pareto distribution (GPD):

[Formula image: distribution function of the GPD]

Table 4. Estimate of the parameters of the GPD for negative daily returns for the S&P 500 index.

Source: computation by the author (data: Yahoo! Finance website).

Table 5. Estimate of the parameters of the GPD for positive daily returns for the S&P 500 index.

Source: computation by the author (data: Yahoo! Finance website).

Figure 3. GPD for the left tail of the S&P 500 index returns.

Source: computation by the author (data: Yahoo! Finance website).

Figure 4. GPD for the right tail of the S&P 500 index returns.

Source: computation by the author (data: Yahoo! Finance website).

Applications in risk management

Why should I be interested in this post?

Download R file to model extreme behavior of the index

You can find below an R file (file with txt format) to study extreme returns and model the distribution tails for the S&P 500 index.

Useful resources

Academic resources

Embrechts P., C. Klüppelberg and T. Mikosch (1997) Modelling Extremal Events for Insurance and Finance Springer-Verlag.

Embrechts P., R. Frey, McNeil A.J. (2022) Quantitative Risk Management Princeton University Press.

Gumbel, E. J. (1958) Statistics of extremes New York: Columbia University Press.

Longin F. (2016) Extreme events in finance: a handbook of extreme value theory and its applications Wiley Editions.

Other resources

Extreme Events in Finance

Chan S. Statistical tools for extreme value analysis

Rieder H. E. (2014) Extreme Value Theory: A primer (slides).

About the author

The article was written in October 2023 by Shengyu ZHENG (ESSEC Business School, Grande Ecole Program – Master in Management, 2020-2024).

Extreme Value Theory: the Block-Maxima approach and the Peak-Over-Threshold approach

October 9, 2022 (updated December 28, 2025) by Shengyu ZHENG

In this article, Shengyu ZHENG (ESSEC Business School, Grande Ecole Program – Master in Management, 2020-2023) presents the extreme value theory (EVT) and two commonly used modelling approaches: block-maxima (BM) and peak-over-threshold (PoT).

Introduction

There are generally two approaches to identifying and modelling the extrema of a random process: the block-maxima approach, in which the extrema follow a generalized extreme value distribution (BM-GEV), and the peak-over-threshold approach, which fits the extrema with a generalized Pareto distribution (POT-GPD):

BM-GEV: The BM approach divides the observation period into non-overlapping, consecutive intervals of equal length and collects the maximum of each interval (Gumbel, 1958). The maxima from these blocks (intervals) can be fitted with a generalized extreme value (GEV) distribution.
POT-GPD: The POT approach selects the observations that exceed a certain high threshold. A generalized Pareto distribution (GPD) is usually used to approximate the observations selected with the POT approach (Pickands III, 1975).

Figure 1. Illustration of the Block-Maxima approach
Source: computation by the author.

Figure 2. Illustration of the Peak-Over-Threshold approach

Source: computation by the author.

BM-GEV

Block-Maxima

Let’s take a step back and have a look again at the Central Limit Theorem (CLT):

[Formula image: statement of the central limit theorem]

The CLT states that the distribution of sample means approaches a normal distribution as the sample size gets larger. Similarly, extreme value theory (EVT) studies the limiting behavior of the extrema of samples.
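For reference, the classical statement of the CLT can be written as below. This is the standard textbook form, given as an assumption about what the original formula showed. The symbols are introduced here for illustration: X_1, ..., X_n are independent and identically distributed with mean \mu and finite variance \sigma^2, and \bar{X}_n is their sample mean.

```latex
\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{\;d\;} \mathcal{N}(0, 1)
\quad \text{as } n \to \infty,
\qquad \text{where } \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i .
```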

The block maximum is defined as follows:

[Formula image: definition of the block maximum]
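In standard notation (an assumption, since the original formula is not reproduced in the text), the maximum of a block of n observations and the limit result that plays the role of the CLT for extrema can be sketched as follows. The normalizing sequences a_n and b_n are symbols introduced here for illustration.

```latex
M_n = \max(X_1, X_2, \dots, X_n),
\qquad
\lim_{n \to \infty} P\!\left(\frac{M_n - b_n}{a_n} \le x\right) = G(x),
\quad a_n > 0,\ b_n \in \mathbb{R}.
```

If a non-degenerate limit G exists, it belongs to the GEV family. Block minima are handled with the same result by using the identity \min(X_1, \dots, X_n) = -\max(-X_1, \dots, -X_n).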

Generalized extreme value distribution (GEV)

[Formula image: distribution function of the GEV]
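A common parametrization of the GEV distribution function is sketched below as a reference. The symbols \xi (shape), \mu (location) and \sigma > 0 (scale) are the usual ones, but the original post may use different notation.

```latex
G_{\xi,\mu,\sigma}(x) =
\begin{cases}
\exp\left\{-\left[1 + \xi\,\dfrac{x-\mu}{\sigma}\right]^{-1/\xi}\right\},
  & \xi \neq 0,\quad 1 + \xi\,\dfrac{x-\mu}{\sigma} > 0,\\[2ex]
\exp\left\{-\exp\left(-\dfrac{x-\mu}{\sigma}\right)\right\},
  & \xi = 0.
\end{cases}
```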

The GEV distribution has three subtypes corresponding to different tail features [von Mises (1936); Hosking et al. (1985)]:

[Formula image: the three subtypes of the GEV distribution]
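In the usual convention, the three subtypes are distinguished by the sign of the shape parameter. The summary below uses \xi, \mu and \sigma for the shape, location and scale parameters, which is an assumption about notation.

```latex
\begin{array}{ll}
\xi > 0: & \text{Fréchet type, heavy (power-law) tail},\\
\xi = 0: & \text{Gumbel type, light (exponentially decaying) tail},\\
\xi < 0: & \text{Weibull type, bounded tail with finite end point } \mu - \sigma/\xi .
\end{array}
```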

POT-GPD

The block maxima approach is criticized for its inefficient and wasteful use of data, and it has been largely superseded in practice by the peak-over-threshold (POT) approach. The POT approach makes use of all data entries above a designated high threshold u. The threshold exceedances can be fitted with a generalized Pareto distribution (GPD):

[Formula image: distribution function of the GPD]
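A common way to write the GPD for the excess y = x - u over the threshold u is sketched below. The symbols \xi (shape) and \beta > 0 (scale) are assumptions about notation, since the original formula is not reproduced in the text.

```latex
G_{\xi,\beta}(y) =
\begin{cases}
1 - \left(1 + \dfrac{\xi\,y}{\beta}\right)^{-1/\xi}, & \xi \neq 0,\\[2ex]
1 - \exp\left(-\dfrac{y}{\beta}\right), & \xi = 0,
\end{cases}
\qquad y \ge 0 \ \ (\text{and } y \le -\beta/\xi \text{ when } \xi < 0).
```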

Illustration of Block Maxima and Peak-Over-Threshold approaches of the Extreme Value Theory with R

We now illustrate the two approaches of extreme value theory (EVT): the block maxima with the generalized extreme value distribution (BM-GEV) approach and the peak-over-threshold with the generalized Pareto distribution (POT-GPD) approach. The illustration is implemented in R with the daily return data of the S&P 500 index from January 01, 1970, to August 31, 2022.

Packages and Libraries

packages and libraries

Data loading, processing and preliminary inspection

We load the S&P 500 daily closing prices from January 01, 1970, to August 31, 2022 and transform the daily prices into daily logarithmic returns (multiplied by 100). Month and year information is also extracted for later use.

data loading
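The transformation from prices to returns can be sketched as follows. This is a Python illustration rather than the R code of the post; the function name `log_returns_pct` and the three prices are hypothetical.

```python
import math
from datetime import date

def log_returns_pct(dates, prices):
    """Daily logarithmic returns in percent: 100 * ln(P_t / P_{t-1}).

    Returns a list of (date, return) pairs with one element fewer than
    the price series, since the first price has no previous price.
    """
    out = []
    for i in range(1, len(prices)):
        r = 100.0 * math.log(prices[i] / prices[i - 1])
        out.append((dates[i], r))
    return out

# Hypothetical closing prices on three consecutive trading days.
dates = [date(2022, 8, 29), date(2022, 8, 30), date(2022, 8, 31)]
prices = [100.0, 101.0, 99.0]
rets = log_returns_pct(dates, prices)
for d, r in rets:
    # Year and month are kept so that returns can later be grouped by block.
    print(d.year, d.month, round(r, 4))
```

Keeping the date next to each return makes the month and year available for the block definition used later.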

We check the preliminary statistics of the daily logarithmic return series.

descriptive stats data

We can get the following basic statistics for the (logarithmic) daily returns of the S&P 500 index over the period from January 01, 1970, to August 31, 2022.

Table 1. Basic statistics of the daily return of the S&P 500 index.

Source: computation by the author.

In terms of daily returns, we can observe that the distribution is negatively skewed, which means that the negative tail is longer. The kurtosis is far higher than that of a normal distribution, which means that extreme outcomes are more frequent than under a normal distribution. The minimum daily return is, in absolute value, more than twice the maximum daily return, which could be interpreted as a more prominent downside risk.

Block maxima – Generalized extreme value distribution (BM-GEV)

We define each month as a block and take the extremum of each block (here the minimum daily return, since we focus on the downside) to study the behavior of the block extrema. We can also have a look at the descriptive statistics for the monthly downside extrema variable.

block maxima
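The block construction can be sketched in a few lines of Python (an illustration, not the R code of the post). The function name `monthly_minima` and the return values are hypothetical.

```python
from collections import defaultdict
from datetime import date

def monthly_minima(dated_returns):
    """Group (date, return) pairs by calendar month and keep the minimum."""
    blocks = defaultdict(list)
    for d, r in dated_returns:
        blocks[(d.year, d.month)].append(r)
    # One block per (year, month), in chronological order.
    return {month: min(values) for month, values in sorted(blocks.items())}

# Hypothetical daily log returns (in %) over two months.
data = [
    (date(2022, 7, 1), 1.05), (date(2022, 7, 14), -0.30), (date(2022, 7, 26), -1.16),
    (date(2022, 8, 5), -0.16), (date(2022, 8, 22), -2.14), (date(2022, 8, 26), -3.37),
]
minima = monthly_minima(data)
for month, value in minima.items():
    print(month, value)
```

Since GEV fitting routines are usually written for maxima, one option is to fit the negated minima (the monthly maximum losses) and then translate the results back to returns.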

With these commands, we obtain the following basic statistics for the monthly minima variable:

Table 2. Basic statistics of the monthly minimal daily return of the S&P 500 index.

Source: computation by the author.

With the block extrema in hand, we can use the fevd() function from the extRemes package to fit a GEV distribution. We obtain the following parameter estimates, with standard errors presented in brackets.

GEV

Table 3 gives the parameter estimation results of the generalized extreme value (GEV) distribution for the monthly minimal daily returns of the S&P 500 index. The three parameters of the GEV distribution are the shape parameter, the location parameter and the scale parameter. For the period from January 01, 1970, to August 31, 2022, the estimation is based on 632 observations of monthly minimal daily returns.

Table 3. Parameter estimation results of the GEV for the monthly minimal daily return of the S&P 500 index.

Source: computation by the author.

With the “plot” command, we are able to obtain the following diagrams.

The top two diagrams compare, respectively, empirical quantiles with model quantiles, and quantiles from model simulation with empirical quantiles. A good fit yields points close to a straight one-to-one line; in this case, the empirical quantiles fall within the 95% confidence bands.
The bottom left diagram compares the density of the empirical data with that of the fitted GEV distribution.
The bottom right diagram is a return level plot with 95% pointwise normal approximation confidence intervals. It plots the theoretical quantiles as a function of the return period, with a logarithmic scale for the x-axis. For example, the 50-year return level is the level expected to be exceeded on average once every 50 years.

gev plots
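To make the notion of return level concrete, the quantile of a fitted GEV can be computed directly from its three parameters. The sketch below is in Python, with hypothetical parameter values (they are not the estimates of Table 3) and an illustrative function name `gev_return_level`. It assumes the GEV was fitted to monthly maxima of losses, so a return period of T years corresponds to 12T blocks; how a plotting routine expresses return periods depends on its configuration.

```python
import math

def gev_return_level(mu, sigma, xi, return_period_years, blocks_per_year=12):
    """Level exceeded on average once every `return_period_years` years.

    mu, sigma, xi are the location, scale and shape of a GEV fitted to
    block maxima; p is the exceedance probability per block.
    """
    p = 1.0 / (return_period_years * blocks_per_year)
    y = -math.log(1.0 - p)
    if abs(xi) < 1e-12:
        # Gumbel limit of the quantile formula.
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))

# Hypothetical parameters for monthly maximum daily losses (in %).
mu, sigma, xi = 1.5, 0.8, 0.2
for years in (1, 10, 50):
    level = gev_return_level(mu, sigma, xi, years)
    print(f"{years}-year return level: {level:.2f}%")
```

With a positive shape parameter, the return level grows faster than logarithmically with the return period, which is why heavy tails matter for long horizons.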

Peak over threshold – Generalized Pareto distribution (POT-GPD)

For the POT approach, threshold selection is central and involves a delicate trade-off between variance and bias: too high a threshold reduces the number of exceedances and so increases the variance of the estimates, while too low a threshold introduces bias because the GPD fits poorly (Rieder, 2014). The selection process could be elaborated in a separate post; here we use the threshold of 0.010 (1.0 in our data, since the logarithmic returns are multiplied by 100) proposed by Beirlant et al. (2004) for downside extreme movements of stock indexes.

POT

With the following commands, we fit the threshold exceedances to a generalized Pareto distribution and obtain the following parameter estimation results.

Table 4 gives the parameter estimation results of the GPD for the daily return of the S&P 500 index with a threshold of -1%. In addition to the threshold, the two parameters of the GPD are the shape parameter and the scale parameter. For the period from January 01, 1970, to August 31, 2022, the estimation is based on 1,669 daily return exceedances (12.66% of the total number of daily returns).

Table 4. Parameter estimation results of the generalized Pareto distribution (GPD) for the daily return negative exceedances of the S&P 500 index.

Source: computation by the author.
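The two steps of the POT approach, selecting the exceedances and estimating the GPD parameters, can be sketched as follows in Python. Note the assumptions: the function names and the data are hypothetical, and the estimator is the method of moments, swapped in here because it has a closed form. It is only valid when the shape parameter is below 1/2 and is not the maximum likelihood fit that R fitting functions typically perform.

```python
def loss_exceedances(returns_pct, threshold=1.0):
    """Excesses of daily losses over the threshold (both in %).

    A return of -3.2% with a threshold of 1.0 gives an excess of 2.2.
    """
    return [-r - threshold for r in returns_pct if -r > threshold]

def gpd_method_of_moments(excesses):
    """Method-of-moments estimates (shape xi, scale beta) of a GPD.

    Uses mean = beta / (1 - xi) and
    variance = beta^2 / ((1 - xi)^2 * (1 - 2 * xi)), valid for xi < 1/2.
    """
    n = len(excesses)
    mean = sum(excesses) / n
    var = sum((y - mean) ** 2 for y in excesses) / (n - 1)
    ratio = mean * mean / var
    xi = 0.5 * (1.0 - ratio)
    beta = 0.5 * mean * (1.0 + ratio)
    return xi, beta

# Hypothetical daily log returns (in %).
returns = [0.3, -1.4, -0.2, -2.9, 1.1, -1.0, -5.0, -1.8]
excesses = loss_exceedances(returns, threshold=1.0)
share = len(excesses) / len(returns)
xi_hat, beta_hat = gpd_method_of_moments(excesses)
print(f"{len(excesses)} exceedances ({share:.1%} of observations)")
print(f"shape={xi_hat:.3f} scale={beta_hat:.3f}")
```

The share of exceedances printed here plays the same role as the 12.66% reported above: it shows how much of the sample the chosen threshold keeps for the tail fit.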

Download R file to understand the BM-GEV and POT-GPD approaches

You can find below an R file (file with txt format) to understand the BM-GEV and POT-GPD approaches.

Why should I be interested in this post?

Financial crises arise alongside disruptive events such as pandemics, wars, or major market failures. The 2007-2008 financial crisis has been a recent and pertinent opportunity for market participants and academia to reflect on the causes of the crisis. This hindsight can help strengthen market resilience in the face of such events in the future and avoid the dire consequences previously witnessed. The Gaussian copula, a statistical tool used to manage the risk of the collateralized debt obligations (CDOs) that triggered the flare-up of the crisis, has been heavily criticized for its fundamental flaw of overlooking the occurrence and the magnitude of extreme events. To understand and cope with extreme events effectively, extreme value theory (EVT), whose foundations date back to the early 20th century, has regained popularity and importance, especially amid financial turmoil. Capital requirements for financial institutions, such as the Basel guidelines for banks and the Solvency II Directive for insurers, have their theoretical base in EVT. Knowledge of EVT is therefore indispensable for a better understanding of the many forms of risk that we face.

Resources

Academic research (articles)

Aboura S. (2009) The extreme downside risk of the S&P 500 stock index. Journal of Financial Transformation, 2009, 26 (26), pp.104-107.

Gnedenko, B. (1943). Sur la distribution limite du terme maximum d’une série aléatoire. Annals of mathematics, 423–453.

Hosking, J. R. M., Wallis, J. R., & Wood, E. F. (1985) “Estimation of the generalized extreme-value distribution by the method of probability-weighted moments” Technometrics, 27(3), 251–261.

Longin F. (1996) The asymptotic distribution of extreme stock market returns Journal of Business, 63, 383-408.

Longin F. (2000) From VaR to stress testing : the extreme value approach Journal of Banking and Finance, 24, 1097-1130.

Longin F. and B. Solnik (2001) Extreme correlation of international equity markets Journal of Finance, 56, 651-678.

Mises, R. v. (1936). La distribution de la plus grande de n valeurs. Rev. math. Union interbalcanique, 1, 141–160.

Pickands III, J. (1975). Statistical Inference Using Extreme Order Statistics. The Annals of Statistics, 3(1), 119–131.

Academic research (books)

Embrechts P., C. Klüppelberg and T. Mikosch (1997) Modelling Extremal Events for Insurance and Finance, Springer-Verlag.

Embrechts P., R. Frey, McNeil A. J. (2022) Quantitative Risk Management, Princeton University Press.

Gumbel, E. J. (1958) Statistics of extremes. New York: Columbia University Press.

Longin F. (2016) Extreme events in finance: a handbook of extreme value theory and its applications Wiley Editions.

Other materials

Extreme Events in Finance

Rieder H. E. (2014) Extreme Value Theory: A primer (slides).

About the author

The article was written in October 2022 by Shengyu ZHENG (ESSEC Business School, Grande Ecole Program – Master in Management, 2020-2023).
