"Abstract: We formalize and study a generalized form of the Pareto principle or '20/80–rule' as a property of bounded cumulative processes. Modeling such processes by non-negative gain densities, we first show that any such process satisfies a generalized Pareto principle of the form 'fraction p of inputs yields fraction 1 − p of outputs'. To obtain a non-trivial and unique characterization, we define the generalized Pareto principle via the decreasing rearrangement of the gain density function. Within this framework, we analyze both constructed gain densities that exemplify the framework and its imposed restrictions, as well as distribution families commonly encountered in datasets, including power-law, exponential, and normal distributions. Finally, we predict commonly encountered ranges for the generalized Pareto principle and discuss the implications of elevating a structural property into a prescriptive role."
Abstract: "Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the detection and characterization of power laws is complicated by the large fluctuations that occur in the tail of the distribution -- the part of the distribution representing large but rare events -- and by the difficulty of identifying the range over which power-law behavior holds. Commonly used methods for analyzing power-law data, such as least-squares fitting, can produce substantially inaccurate estimates of parameters for power-law distributions, and even in cases where such methods return accurate answers they are still unsatisfactory because they give no indication of whether the data obey a power law at all. Here we present a principled statistical framework for discerning and quantifying power-law behavior in empirical data. Our approach combines maximum-likelihood fitting methods with goodness-of-fit tests based on the Kolmogorov-Smirnov statistic and likelihood ratios. We evaluate the effectiveness of the approach with tests on synthetic data and give critical comparisons to previous approaches. We also apply the proposed methods to twenty-four real-world data sets from a range of different disciplines, each of which has been conjectured to follow a power-law distribution. In some cases we find these conjectures to be consistent with the data while in others the power law is ruled out."
Referenced in the paper: "Inequality Measures: The Kolkata index in comparison with other measures" - https://arxiv.org/abs/2005.08762
Abstract: "We provide a survey of the Kolkata index of social inequality, focusing in particular on income inequality. Based on the observation that inequality functions (such as the Lorenz function), giving the measures of income or wealth against that of the population, to be generally nonlinear, we show that the fixed point (like Kolkata index k) of such a nonlinear function (or related, like the complementary Lorenz function) offer better measure of inequality than the average quantities (like Gini index). Indeed the Kolkata index can be viewed as a generalized Hirsch index for a normalized inequality function and gives the fraction k of the total wealth possessed by the rich (1-k) fraction of the population. We analyze the structures of the inequality indices for both continuous and discrete income distributions. We also compare the Kolkata index to some other measures like the Gini coefficient and the Pietra index. Lastly, we provide some empirical studies which illustrate the differences between the Kolkata index and the Gini coefficient."
bikenaga•1h ago
Referenced in the paper: "Power-law distributions in empirical data" - https://arxiv.org/abs/0706.1062
Abstract: "Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the detection and characterization of power laws is complicated by the large fluctuations that occur in the tail of the distribution -- the part of the distribution representing large but rare events -- and by the difficulty of identifying the range over which power-law behavior holds. Commonly used methods for analyzing power-law data, such as least-squares fitting, can produce substantially inaccurate estimates of parameters for power-law distributions, and even in cases where such methods return accurate answers they are still unsatisfactory because they give no indication of whether the data obey a power law at all. Here we present a principled statistical framework for discerning and quantifying power-law behavior in empirical data. Our approach combines maximum-likelihood fitting methods with goodness-of-fit tests based on the Kolmogorov-Smirnov statistic and likelihood ratios. We evaluate the effectiveness of the approach with tests on synthetic data and give critical comparisons to previous approaches. We also apply the proposed methods to twenty-four real-world data sets from a range of different disciplines, each of which has been conjectured to follow a power-law distribution. In some cases we find these conjectures to be consistent with the data while in others the power law is ruled out."
Referenced in the paper: "Inequality Measures: The Kolkata index in comparison with other measures" - https://arxiv.org/abs/2005.08762
Abstract: "We provide a survey of the Kolkata index of social inequality, focusing in particular on income inequality. Based on the observation that inequality functions (such as the Lorenz function), giving the measures of income or wealth against that of the population, to be generally nonlinear, we show that the fixed point (like Kolkata index k) of such a nonlinear function (or related, like the complementary Lorenz function) offer better measure of inequality than the average quantities (like Gini index). Indeed the Kolkata index can be viewed as a generalized Hirsch index for a normalized inequality function and gives the fraction k of the total wealth possessed by the rich (1-k) fraction of the population. We analyze the structures of the inequality indices for both continuous and discrete income distributions. We also compare the Kolkata index to some other measures like the Gini coefficient and the Pietra index. Lastly, we provide some empirical studies which illustrate the differences between the Kolkata index and the Gini coefficient."
[edited: added two references]