### Discussion Paper

## Abstract

We propose a new method for estimating the power-law exponent of a firm size variable, such as annual sales. Our focus is on how to empirically identify the range in which a firm size variable follows a power-law distribution. As is well known, a firm size variable follows a power-law distribution only beyond some threshold. On the other hand, in almost all empirical exercises, the right tail of the distribution deviates from a power law because of finite-size effects. We modify the method proposed by Malevergne et al. (2011) so that we can identify both the lower and the upper thresholds, and then estimate the power-law exponent using only the observations in the range defined by the two thresholds. We apply this new method to various firm size variables, including annual sales, the number of workers, and tangible fixed assets, for firms in more than thirty countries.
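
To make the idea of "estimating the exponent only inside a range" concrete, here is a minimal sketch. It is not the authors' procedure: it assumes the two thresholds are already known and uses a textbook maximum-likelihood estimator for a doubly truncated Pareto law as a stand-in. The function names, the bisection solver, and the synthetic data are all assumptions made for illustration. The exponent `alpha` is the exponent of the cumulative distribution.

```python
import math
import random


def truncated_pareto_mle(sizes, lower, upper, iterations=200):
    """MLE of the exponent alpha using only sizes inside [lower, upper].

    Illustrative assumption: inside the range, the survival function is
    proportional to x**(-alpha). Returns (alpha_hat, observations_used).
    """
    logs = [math.log(x / lower) for x in sizes if lower <= x <= upper]
    if not logs:
        raise ValueError("no observations inside the range")
    width = math.log(upper / lower)
    mean_log = sum(logs) / len(logs)
    if not 0.0 < mean_log < width / 2.0:
        raise ValueError("no positive exponent fits this range")

    def expected_mean(alpha):
        # Mean of log(x / lower) under a Pareto law truncated at `upper`.
        z = alpha * width
        if z > 700.0:  # exp(z) would overflow; the truncation term is negligible
            return 1.0 / alpha
        return 1.0 / alpha - width / math.expm1(z)

    # expected_mean is decreasing in alpha, so bracket the root and bisect.
    lo, hi = 1e-8, 1.0
    while expected_mean(hi) > mean_log:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if expected_mean(mid) > mean_log:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), len(logs)


def sample_truncated_pareto(n, alpha, lower, upper, rng):
    """Draw n synthetic sizes from a truncated Pareto law by CDF inversion."""
    mass = 1.0 - (lower / upper) ** alpha
    return [lower * (1.0 - rng.random() * mass) ** (-1.0 / alpha) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    data = sample_truncated_pareto(5000, 1.2, 10.0, 1000.0, rng)
    alpha_hat, n_used = truncated_pareto_mle(data, 10.0, 1000.0)
    print(f"alpha_hat = {alpha_hat:.3f} from {n_used} observations")
```

The sketch takes the thresholds as inputs. Identifying them from the data is the contribution of the paper and is not reproduced here.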

Paper submitted to the special issue

New Approaches in Quantitative Modeling of Financial Markets

## Comments and Questions

Nice paper. Two small comments:

1) I would like to ask whether this methodology could be improved by adopting the rank − 1/2 correction of Gabaix-Ibragimov (2007).

2) Larger plots in Figure 1 would make the different years easier to distinguish.

References:

Gabaix-Ibragimov (2007): http://www.nber.org/papers/t0342.pdf
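
For readers unfamiliar with the rank − 1/2 approach mentioned in comment 1), the following is a minimal sketch of it: regress log(rank − 1/2) on log(size) by ordinary least squares and read the exponent off the slope, with the asymptotic standard error sqrt(2/n) times the estimate given in that paper. The function name and the synthetic Pareto sample are assumptions for illustration. The sketch treats the whole sample as lying in the power-law region, which is exactly the step the discussion paper does not take for granted.

```python
import math
import random


def rank_minus_half_exponent(sizes):
    """OLS fit of log(rank - 1/2) on log(size).

    Returns (exponent, asymptotic_std_error). Every observation is assumed
    to lie in the power-law region.
    """
    ordered = sorted(sizes, reverse=True)  # rank 1 is the largest firm
    n = len(ordered)
    xs = [math.log(s) for s in ordered]
    ys = [math.log(rank - 0.5) for rank in range(1, n + 1)]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    exponent = -sxy / sxx  # the regression slope is minus the exponent
    std_error = exponent * math.sqrt(2.0 / n)
    return exponent, std_error


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic Pareto sample with exponent 1 and minimum size 1 (illustrative).
    sample = [(1.0 - rng.random()) ** (-1.0) for _ in range(2000)]
    b_hat, se = rank_minus_half_exponent(sample)
    print(f"exponent = {b_hat:.3f} (s.e. {se:.3f})")
```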

Thank you very much for your comment.

I am Shouji Fujimoto, one of the authors of this paper.

I have read Gabaix-Ibragimov (2007), the paper suggested by you.

I find that it contains information that is interesting and relevant to me.

1)

I think there is an important difference between our paper and Gabaix-Ibragimov (2007) in terms of purpose.

In Gabaix-Ibragimov (2007), the power-law region (i.e. the region that follows a power law distribution) is treated as given.

They do not discuss how to identify it.

In our case, however, we pay particular attention to how to empirically identify the power-law region.

Also, I do not think our results would change substantially even if we applied their method to our dataset.

The problem they address is more serious when the number of observations is small, which is not the case for us.

2)

Thank you very much for your advice.

I will leave this to the editors, because I have already submitted the paper.

See attached file

I want to thank the referee for the careful reading and instructive comments and suggestions. My responses are as follows (I will send the revised version to the editor.):

1. In my opinion, the behavior of the test statistics (or p-values) should be shown in the paper. I propose that the authors include a figure plotting the p-values for all k-th, 2k-th, 3k-th test statistics. This would give more insight into how the method works on the tested data.

Answer: Thank you for the good proposal. I have included figures showing the behavior of the p-value.

2-4, 6, 8, 10

Answer: Thank you very much for the comments. I have made the changes.

5. p.6, line 11: I would suggest using two different line styles for the upper and lower thresholds in Fig. 2. The authors currently describe "two vertical dotted lines", but I can also see a dashed line, so the figure is not clear. This also applies to the figure caption.

Answer: Thank you for the good advice. I have replaced them with two different line styles. The solid line is used for the lower threshold. The dashed line is used for the upper threshold.

7. p.8. Eq. 17: Where did the constant from Eq. 16 disappear?

Answer: Thank you for the comment. I have inserted a description of the constant.

9. p.10, line 11: use a different word instead of "thing".

Answer: Thank you for the advice. I have used "results" instead of "thing".

See attached file

I want to thank the referee for the careful reading and instructive comments and suggestions. My responses are as follows (I will send the revised version to the editor.):

1) Since the authors' methodology deals with the problem of estimating an interval over which to perform a straight-line fit, they apply a lower and an upper cutoff to the empirical data. The upper cutoff eliminates the most extreme events from the data, which are surely the noisiest but may also be the most interesting. A brief discussion of how this affects the quality of the fit would be welcome.

Answer: Thank you very much for the suggestion. I agree with this opinion. I have added a discussion in a footnote in the conclusion section.

2) There is previous work on the use of the Anderson-Darling statistic (a statistic that is sensitive to departures from the hypothesized probability distribution in the tails) that could be a useful reference. These works discuss procedures for estimating an optimal cutoff point for fitting power-law distributions and for fitting a power-law distribution to left-censored samples. Both works are relevant to the methodology presented in the manuscript.

Answer: Thank you very much for the advice. The purpose of those works is almost the same as that of our study. I have added them to the references and cited them in a footnote in the introduction section.
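
As a hedged illustration of the Anderson-Darling statistic raised in comment 2), the sketch below computes the standard A² statistic of a sample against a fully specified Pareto law above a given lower threshold. It is not taken from the manuscript or from the works the referee has in mind. The function names, the threshold `x_min`, and the synthetic data are assumptions for illustration. Critical values differ when the exponent is estimated from the same data, which the sketch does not address.

```python
import math
import random


def pareto_cdf(x, x_min, alpha):
    """CDF of a Pareto law with lower threshold x_min and exponent alpha."""
    return 1.0 - (x_min / x) ** alpha


def anderson_darling(sample, cdf):
    """A^2 statistic of `sample` against a fully specified continuous `cdf`."""
    eps = 1e-12  # keep CDF values strictly inside (0, 1) before taking logs
    u = sorted(min(max(cdf(x), eps), 1.0 - eps) for x in sample)
    n = len(u)
    total = sum(
        (2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic sizes above x_min = 1 with exponent 1.5 (illustrative).
    sizes = [(1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(2000)]
    a2_true = anderson_darling(sizes, lambda x: pareto_cdf(x, 1.0, 1.5))
    a2_wrong = anderson_darling(sizes, lambda x: pareto_cdf(x, 1.0, 3.0))
    print(f"A^2 with matching exponent:   {a2_true:.2f}")
    print(f"A^2 with mismatched exponent: {a2_wrong:.2f}")
```

Because the statistic weights both tails heavily, a mismatched exponent should produce a much larger value than the matching one.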