T-II-3

From ESPCI Wiki

Extreme Statistics

Generically, finding the distribution of the maximum of a set of random variables is a non-trivial problem, which appears in many contexts, ranging from the maximal height of water in a river to fluctuations in stock markets. We consider $N$ independent random variables $x_1, \dots, x_N$ drawn from the same distribution $p(x)$. We denote

$$x_{\max} = \max(x_1, \dots, x_N)$$

It is useful to use the following notations for the cumulative distributions:

$$P^<(x) = \int_{-\infty}^{x} p(x')\,dx', \qquad P^>(x) = \int_{x}^{\infty} p(x')\,dx' = 1 - P^<(x)$$

Let us denote by $q_N(x)$ the distribution of $x_{\max}$ and by $Q_N(x)$ its cumulative distribution.

  • Write $Q_N(x)$ in terms of $P^<(x)$. (Hint: start by writing this relation for $N = 2$.)

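This is the fundamental relation of extreme statistics. We analyze its consequences in the large $N$ limit where, analogously to the central limit theorem, extreme statistics display universal features.

  • In particular, show that in the large $N$ limit the cumulative distribution of the maximum, $Q_N(x)$, can be written in terms of the tail probability $P^>(x)$ as

$$Q_N(x) \simeq \exp\left(-N P^>(x)\right)$$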
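The relation between the cumulative distribution of the maximum and that of a single variable can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses variables uniform on [0, 1] (so that the single-variable cumulative is simply x), it takes the product form of the cumulative of the maximum as the candidate relation, and all function names are hypothetical.

```python
import math
import random


def empirical_max_cdf(n_vars, x, n_trials=20000, seed=0):
    """Fraction of trials in which the maximum of n_vars uniform(0,1) draws is <= x."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        if max(rng.random() for _ in range(n_vars)) <= x:
            hits += 1
    return hits / n_trials


def product_formula(n_vars, x):
    """Candidate relation: the maximum is <= x only if all N variables are <= x.

    For uniform(0,1) variables the single-variable cumulative is x itself.
    """
    return x ** n_vars


def large_n_approx(n_vars, x):
    """Large-N form exp(-N * P_greater(x)), with P_greater(x) = 1 - x here."""
    return math.exp(-n_vars * (1.0 - x))


if __name__ == "__main__":
    for n in (2, 5, 20):
        print(n, empirical_max_cdf(n, 0.9), product_formula(n, 0.9))
    print(product_formula(100, 0.99), large_n_approx(100, 0.99))
```

The exponential form is only accurate where the tail probability is small, which is the region that matters for the maximum when N is large.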

In the present exercise, we first study the case of the exponential distribution. In a second step we generalize our results to a larger class of distributions.


Exponential distribution

The exponential distribution is one of the fundamental continuous distributions, and for this reason alone it is worthy of study. Among many other places, it appears in the Poisson process. The distribution reads:

$$p(x) = \lambda e^{-\lambda x}$$

where both $x$ and $\lambda$ are positive numbers.

Preliminaries: the central limit

  • Compute the mean value and the variance of this distribution.
  • Consider $X_N = \sum_{i=1}^{N} x_i$, the sum of $N$ independent, exponentially distributed random variables. How is $X_N$ distributed?


We write $X_N$ in a more convenient way:

$$X_N = a_N + b_N z$$

where $a_N$ is the location of the distribution and $b_N$ is the width of the distribution of $X_N$. Both numbers depend on $N$. Finally, $z$ is a random number whose distribution, $\pi(z)$, becomes independent of $N$ in the large $N$ limit. In other words, the distribution of $X_N$ is significantly different from zero when the value of $X_N$ is around $a_N$, in a region of size $b_N$.


  • From the central limit theorem, what is the natural choice for $a_N$ and $b_N$? Write the distribution $\pi(z)$.
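The rescaling of the sum can be explored numerically. The following is a minimal sketch, assuming the rate parametrization of the exponential distribution (density λ exp(-λx), mean 1/λ, variance 1/λ²) and assuming the central-limit choice of location N/λ and width √N/λ. The function name is hypothetical.

```python
import math
import random
import statistics


def rescaled_sums(n_vars, lam=1.0, n_trials=5000, seed=1):
    """Draw sums of n_vars exponential(lam) variables and rescale them.

    Assumed choice: location a_N = N / lam, width b_N = sqrt(N) / lam.
    """
    rng = random.Random(seed)
    a_n = n_vars / lam
    b_n = math.sqrt(n_vars) / lam
    zs = []
    for _ in range(n_trials):
        total = sum(rng.expovariate(lam) for _ in range(n_vars))
        zs.append((total - a_n) / b_n)
    return zs


if __name__ == "__main__":
    zs = rescaled_sums(200)
    # With this choice the rescaled variable should have mean close to 0
    # and variance close to 1 when N is large.
    print(statistics.mean(zs), statistics.pvariance(zs))
```

A histogram of the returned values can then be compared with the standard Gaussian density.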

The Maxima

Consider now the case of the maximum,

$$x_{\max} = a_N + b_N z$$

  • Write $P^<(x)$ and $P^>(x)$. (Remember that $x$ is a positive number.)
  • Write $Q_N(x)$ and $q_N(x)$.
  • Plot $q_N(x)$ for different values of $N$.
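For the plotting step, a minimal sketch is given below. It assumes the rate parametrization of the exponential distribution (density λ exp(-λx)), for which the cumulative of the maximum is (1 - exp(-λx))^N and its derivative is the density used here. The code only tabulates the density on a grid with the standard library. The resulting lists can be passed to any plotting tool. All function names are hypothetical.

```python
import math


def q_max(x, n_vars, lam=1.0):
    """Density of the maximum of n_vars i.i.d. exponential(lam) variables."""
    if x < 0:
        return 0.0
    tail = math.exp(-lam * x)
    return n_vars * lam * tail * (1.0 - tail) ** (n_vars - 1)


def tabulate(n_vars, lam=1.0, x_upper=15.0, n_points=1501):
    """Return grid points and density values on [0, x_upper]."""
    step = x_upper / (n_points - 1)
    xs = [i * step for i in range(n_points)]
    return xs, [q_max(x, n_vars, lam) for x in xs]


def peak_position(n_vars, lam=1.0):
    """Grid point where the tabulated density is largest."""
    xs, qs = tabulate(n_vars, lam)
    return xs[qs.index(max(qs))]


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The peak moves to the right like log(N) / lam while the shape stays the same.
        print(n, peak_position(n), math.log(n))
```

Plotting the curves for several values of N shows a peak that shifts logarithmically with N without changing its width.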


We now want to give a natural definition for the numbers $a_N$ and $b_N$.

Consider a threshold $x$. If you draw $N$ independent exponential variables, how many variables (on average) will be greater than $x$? Repeat the same exercise with $x = a_N$ such that

$$N P^>(a_N) = 1$$

  • Justify that $a_N$ can be estimated from $P^>(a_N) = 1/N$.
  • Compute $a_N$ for the exponential distribution and justify that $b_N = 1/\lambda$.
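The counting argument can be illustrated with a short simulation. The sketch below assumes the rate parametrization (density λ exp(-λx)) and assumes that the location is the threshold exceeded with probability 1/N, which for the exponential distribution gives log(N)/λ. It counts how many of the N variables exceed that threshold, averaged over many draws. The function name is hypothetical.

```python
import math
import random


def mean_exceedances(n_vars, lam=1.0, n_trials=2000, seed=2):
    """Average number of variables, out of n_vars, larger than log(N) / lam."""
    rng = random.Random(seed)
    a_n = math.log(n_vars) / lam  # assumed location: tail probability equal to 1/N
    total = 0
    for _ in range(n_trials):
        total += sum(1 for _ in range(n_vars) if rng.expovariate(lam) > a_n)
    return total / n_trials


if __name__ == "__main__":
    for n in (10, 100, 500):
        # On average about one variable lies above the threshold.
        print(n, mean_exceedances(n))
```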

In the large $N$ limit, the distribution of the rescaled variable $z$ becomes independent of $N$.

  • Show that in this limit its cumulative takes the form

$$\exp\left(-e^{-z}\right)$$

This is the cumulative distribution of the famous Gumbel distribution.
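The convergence to the Gumbel form can be checked by simulation. The following sketch assumes the rate parametrization of the exponential distribution and the rescaling with location log(N)/λ and width 1/λ. It compares the empirical cumulative of the rescaled maxima with exp(-exp(-z)). Function names are hypothetical.

```python
import math
import random


def rescaled_maxima(n_vars, lam=1.0, n_trials=5000, seed=3):
    """Maxima of n_vars exponential(lam) variables, shifted and scaled.

    Assumed rescaling: z = (x_max - log(N) / lam) * lam.
    """
    rng = random.Random(seed)
    a_n = math.log(n_vars) / lam
    b_n = 1.0 / lam
    return [
        (max(rng.expovariate(lam) for _ in range(n_vars)) - a_n) / b_n
        for _ in range(n_trials)
    ]


def empirical_cdf(samples, z):
    """Fraction of samples that are <= z."""
    return sum(1 for s in samples if s <= z) / len(samples)


def gumbel_cdf(z):
    """Cumulative distribution of the standard Gumbel law."""
    return math.exp(-math.exp(-z))


if __name__ == "__main__":
    zs = rescaled_maxima(200)
    for z in (-1.0, 0.0, 1.0, 2.0):
        print(z, empirical_cdf(zs, z), gumbel_cdf(z))
```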

Let us remark that the precise definition of $a_N$ and $b_N$ fixes the mean and the variance of the rescaled distribution. In contrast to the central limit case, the mean will be different from zero and the variance different from one.

  • Compute the mean, the variance and the asymptotic behavior of the Gumbel distribution. Draw the distribution. Explain why $z = 0$ is a special point.
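The analytical results for the moments can be cross-checked by numerical integration. The sketch below assumes the standard Gumbel density exp(-z - exp(-z)), obtained by differentiating the cumulative exp(-exp(-z)), and integrates it with a midpoint rule on a truncated interval. The known values are the Euler-Mascheroni constant (about 0.5772) for the mean and π²/6 for the variance. Function names and integration bounds are illustrative choices.

```python
import math


def gumbel_pdf(z):
    """Density of the standard Gumbel law."""
    return math.exp(-z - math.exp(-z))


def moments(z_min=-8.0, z_max=40.0, n_steps=48000):
    """Normalization, mean and variance from a midpoint-rule integration."""
    h = (z_max - z_min) / n_steps
    norm = 0.0
    mean = 0.0
    second = 0.0
    for i in range(n_steps):
        z = z_min + (i + 0.5) * h  # midpoint of the i-th cell
        w = gumbel_pdf(z) * h
        norm += w
        mean += z * w
        second += z * z * w
    return norm, mean, second - mean ** 2


if __name__ == "__main__":
    norm, mean, var = moments()
    print(norm, mean, var, math.pi ** 2 / 6)
```

The density decays as exp(-z) on the right but as a double exponential on the left, which is why the left integration bound can be much closer to the origin than the right one.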

Generic case: Universality of the Gumbel distribution

The Gumbel distribution is the limit distribution of the maxima for a large class of distributions. In this sense, the Gumbel distribution plays the same role for extreme statistics as the Gaussian distribution does for the central limit theorem.

By contrast, the behavior of $a_N$ and $b_N$ as a function of $N$ strongly depends on the particular distribution $p(x)$. We discuss here a family of distributions characterized by a fast decay for large $x$.

The key point is to be able to determine the location $a_N$ such that $P^>(a_N) = 1/N$.

  • For shows

Otherwise $a_N$ should be determined asymptotically for large $N$.

  • Show that
  • Show that in general and compute as a function of for large .
  • Show that the maximum takes the form $x_{\max} = a_N + b_N z$, with $z$ Gumbel distributed.

  • Identify and discuss its behavior as a function of

If the distribution is defined on the entire real axis and is characterized by the same fast decay on the negative side, it is easy to generalize this result to the distribution of the minima.

  • Write the Gumbel distribution for the minima

Minimum of exponential random numbers

The Gumbel distribution is not the only distribution for the extremes. Consider the simple case of the minima of the exponential distribution.

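  • Show analytically that the minimum of $N$ exponential random numbers with parameters $\lambda_1, \dots, \lambda_N$ is again an exponential random number, with parameter $\lambda_1 + \dots + \lambda_N$ (that is, $N\lambda$ when all parameters are equal to $\lambda$).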


Program this in Python, produce a histogram and compare with the exact result.
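One possible way to set this up is sketched below, using only the standard library. It assumes identical parameters for all variables, in the rate parametrization (density λ exp(-λx)), so that the exact result to compare with is an exponential density with rate Nλ. The histogram is built by hand and printed as text, and the function names are hypothetical.

```python
import math
import random


def sample_minima(n_vars, lam=1.0, n_trials=20000, seed=4):
    """Minimum of n_vars i.i.d. exponential(lam) variables, repeated n_trials times."""
    rng = random.Random(seed)
    return [min(rng.expovariate(lam) for _ in range(n_vars)) for _ in range(n_trials)]


def histogram_density(samples, bin_width, n_bins):
    """Normalized histogram on [0, n_bins * bin_width); samples beyond are ignored."""
    counts = [0] * n_bins
    for s in samples:
        k = int(s / bin_width)
        if k < n_bins:
            counts[k] += 1
    return [c / (len(samples) * bin_width) for c in counts]


def exact_density(x, n_vars, lam=1.0):
    """Exponential density with rate N * lam (assumes identical parameters)."""
    rate = n_vars * lam
    return rate * math.exp(-rate * x)


if __name__ == "__main__":
    n, width = 10, 0.02
    hist = histogram_density(sample_minima(n), width, 20)
    for k, value in enumerate(hist):
        center = (k + 0.5) * width
        print(f"{center:.3f}  {value:.3f}  {exact_density(center, n):.3f}")
```

The printed columns are the bin center, the histogram value and the exact density at the bin center. The same lists can be passed to a plotting library to draw the histogram.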

  • Look up on the web the possible extreme value distributions for independent and identically distributed variables.