# LPTMS Statistical Physics Seminar: Misha Tamm (Tallinn Univ.)

## Type-token distributions beyond the Zipf law: a simple model with choice

### Misha Tamm (Tallinn University, Estonia)

Zipf's law is one of the very few regularities believed to permeate the quantitative social sciences. The law describes the distribution of tokens (objects) among a given set of types (classes): if types are ordered by decreasing popularity (the number of tokens they contain), then the size of a type is proportional to its rank raised to some negative power. This law is well established in wealth distribution (Pareto law), city science (Zipf's law for city sizes), linguistics (distribution of word frequencies), etc. The same law is believed to hold for the distribution of demand for cultural objects such as movies, books, singles, etc. In this talk I will start by showing (based on film industry data) that this is not always the case, and that a different universal behavior often seems to be observed in the data: in the cases I will show, the size (popularity) decays exponentially with rank. This motivates us to ask which microscopic mechanisms could lead to such behavior.
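For concreteness, the two rank-size forms contrasted above can be written as follows. The symbols $s(r)$ (size of the type with rank $r$), $\alpha$ and $r_0$ are notation assumed here for illustration; they do not come from the abstract.

$$
s(r) \propto r^{-\alpha}, \quad \alpha > 0 \qquad \text{(Zipf, power law)}
$$

$$
s(r) \propto e^{-r/r_0}, \quad r_0 > 0 \qquad \text{(exponential decay)}
$$

The two are easy to tell apart on a plot: for the power law, $\log s$ is linear in $\log r$, while for the exponential form, $\log s$ is linear in $r$ itself.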

To address this question I will discuss the following simple statistical model. Consider a set of M initially empty classes (types) and sequentially add N tokens to them according to the following procedure: first, choose a subset (batch) of A types at random; then determine which class in this subset currently has the largest number of tokens (if two or more leaders are tied, choose one of them at random) and add one token to this class. I will show that this model is exactly solvable and that, in the limit of a large number of classes M and a large number of tokens per class N/M, the size distribution of classes converges to a simple scaling form. Moreover, if the batch size A is also large, this form converges to the observed exponential rank-size distribution.
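The procedure described above is simple enough to simulate directly. The following is a minimal sketch, not the speaker's code: the function name `simulate_batch_choice` and its parameter names are hypothetical, and the batch is assumed to be drawn uniformly at random without replacement, which is one reading of "at random".

```python
import random


def simulate_batch_choice(num_types, num_tokens, batch_size, seed=None):
    """Simulate the batch-choice model and return type sizes sorted by rank.

    num_types  -- M, the number of initially empty types (classes)
    num_tokens -- N, the number of tokens added one by one
    batch_size -- A, the number of types drawn for each token
    Assumption: each batch is a uniform sample without replacement.
    """
    rng = random.Random(seed)
    sizes = [0] * num_types
    for _ in range(num_tokens):
        # Draw a batch of A distinct types.
        batch = rng.sample(range(num_types), batch_size)
        # Find the current leader(s) within the batch.
        top = max(sizes[i] for i in batch)
        leaders = [i for i in batch if sizes[i] == top]
        # Break ties uniformly at random, then add one token to the winner.
        sizes[rng.choice(leaders)] += 1
    # Rank-size representation: largest type first.
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    ranked = simulate_batch_choice(num_types=200, num_tokens=20000,
                                   batch_size=5, seed=0)
    print(ranked[:10])
```

Plotting the logarithm of the returned sizes against rank is one way to look for the exponential decay mentioned in the abstract, keeping in mind that the stated result concerns the limit of large M, N/M and A.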

Finally, I will discuss generalizations of the model in which (a) the batch is chosen according to some preferential choice rule and (b) instead of adding a token to the most popular class in the batch, it is given to the least popular one.
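Generalization (b) changes a single step of the procedure, which a short sketch can make explicit. As before, this is an illustration under stated assumptions: the name `simulate_least_popular` is hypothetical, batches are assumed uniform without replacement, and ties among the least popular types are assumed to be broken uniformly at random. Generalization (a) is not sketched because the abstract does not specify the preferential rule.

```python
import random


def simulate_least_popular(num_types, num_tokens, batch_size, seed=None):
    """Variant (b): each token goes to the least popular type in the batch.

    Assumptions: uniform batches without replacement, uniform tie-breaking.
    Returns the type sizes sorted from largest to smallest.
    """
    rng = random.Random(seed)
    sizes = [0] * num_types
    for _ in range(num_tokens):
        batch = rng.sample(range(num_types), batch_size)
        # The only change from the original rule: take the minimum, not the maximum.
        bottom = min(sizes[i] for i in batch)
        laggards = [i for i in batch if sizes[i] == bottom]
        sizes[rng.choice(laggards)] += 1
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    print(simulate_least_popular(num_types=200, num_tokens=20000,
                                 batch_size=5, seed=0)[:10])
```

A quick sanity check on the rule: when the batch contains every type, tokens always go to a currently smallest type, so the sizes stay within one token of each other.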

Date/Time : 25/05/2023 - 11:00