Discussion Paper
No. 2016-48 | December 05, 2016
Stan J. Liebowitz
A Replication of Four Quasi-Experiments and Three Facts from ‘The Effect of File Sharing on Record Sales: An Empirical Analysis’ (Journal of Political Economy, 2007)
(Published in Replication Study)


The influential piracy paper by Professors Oberholzer-Gee and Strumpf (OS), although mainly based on proprietary data, contained an “important complement” to the main results, consisting of four “quasi-experiments” using publicly available data. This replication examines all of these quasi-experiments, first narrowly, using identical data and statistical methods, and then more broadly, by extending or augmenting the data or methods. This study concludes that none of the four quasi-experiments provides evidence in support of OS' hypothesis that file-sharing has not harmed record sales.

Note of the Editor: The findings should be viewed as tentative until the paper has completed the review process and been published as an article.

Data Set

JEL Classification: Z1, O3, L8


Cite As

[Please cite the corresponding journal article] Stan J. Liebowitz (2016). A Replication of Four Quasi-Experiments and Three Facts from ‘The Effect of File Sharing on Record Sales: An Empirical Analysis’ (Journal of Political Economy, 2007). Economics Discussion Papers, No 2016-48, Kiel Institute for the World Economy. http://www.economics-ejournal.org/economics/discussionpapers/2016-48

Comments and Questions

Anonymous - Reader Comment
January 24, 2017 - 09:09
The analysis of Oberholzer-Gee and Strumpf (OS) was always a clear outlier in the literature on the effect of downloading on legitimate sales of pre-recorded music. It is, as far as I am aware, the only study out of a population of perhaps 10 to 15 major papers, all published in peer-reviewed journals, that does not find a significant negative impact of downloading on record sales. Prof. Liebowitz’s very careful and meticulous review of the OS paper seems to have pinpointed a host of failures in the data used and in the analysis carried out. I find Prof. Liebowitz’s arguments to be well constructed and persuasive, and in the end entirely in accordance both with logic and with the message that the rest of the published literature on this topic has repeatedly insisted upon.

Anonymous - Referee Report 1
January 24, 2017 - 09:37
see attached file

Stan Liebowitz - Response to Referee 1
March 14, 2017 - 21:00
1. The reviewer is correct to wonder whether changes in the number of pirates are a good proxy for summer changes in the number of files being pirated, particularly when OS claim that students lose high-speed Internet connections in the summer, with the implication that they might download less in the summer even if they still download. But I have at least two responses. First, OS used the number of pirates in their quasi-experiment. Since I am performing an identical replication, I used the identical data used by Oberholzer-Gee and Strumpf. Second, most college students in the U.S. did not have high-speed Internet at their residences during the school year, since most college students do not live in dormitories. Thus, they would not lose high-speed access during the summer. Approximately half of college students go to community colleges, which do not have dorms. Of the other half, statistics indicate that maybe 20% live in dorms, with most others living off campus or with their parents. Finally, it is not clear that there is a better available measure than the number of pirates. We have to work with what is available.
2. I believe that the difference between my East/West Coast results and the results of OS is greater than the reviewer realizes. While it is true that the market shares of music sales on the East and West Coasts do not change very much in absolute amounts over the four- or five-year periods, the absolute change is largely irrelevant to the quasi-experiment, making the OS reliance on the absolute change inexplicable. The East share declined further relative to the West share each and every year, in a manner consistent with the ‘piracy harms sales’ hypothesis. Although the referee wonders whether OS ran any sort of regression, OS provide no indication that they did. But if a regression is run, the results are significant. Regressing the ratio of East Coast to West Coast market shares on a time trend yields statistically significant support for the piracy hypothesis (t=6.8, using robust standard errors), even though there are only 6 observations. I will put this in the new version of the paper, along with a clearer explanation.
3. The reviewer is correct that unemployment might not be conceptually superior to other variables related to consumer income. But it is the only one of these variables that I am aware of that is available monthly, making its format consistent with the other data, and there is little reason to believe that other possible measures would be more useful or provide different results. I also agree that the results for individual years (as opposed to groups of years) would be useful, and that is why I included them in the appendix, but perhaps they should be moved to the main text. And I will try to better explain that the magnitudes of sales declines were relatively small in the first few years after Napster.
4. The reviewer is not convinced that my replication of “facts” is worthwhile. He does bring up an interesting point about the fact that is most difficult to pin down, viz. that some “major” markets had presumably experienced sales increases in the years after Napster. I hadn’t contemplated that OS might have been thinking about U.S. MSAs instead of countries. But just as was the case for countries, only 7 of the top 100 MSAs (Nielsen DMAs) failed to experience a sales decline (by 2003), and the largest of those was number 15 in size. Also, I did examine quantities and revenues separately and the results do not differ. So while we cannot be completely sure that OS are incorrect on this point, we can be pretty confident. The other two OS “facts,” however, are clearly incorrect. I think these facts are important because I believe they shed some light on whether OS were careful and unbiased in their analysis. I believe the answer to that question is “no.”
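The trend regression described in point 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `trend_regression` is hypothetical, the HC1 correction is one common choice of "robust" standard error (the response does not say which variant was used), and the six share values are made-up placeholders, not the figures from the paper's Appendix.

```python
import numpy as np

def trend_regression(ratio):
    """OLS of a series on a linear time trend (0, 1, 2, ...).

    Returns (slope, robust_se, t_stat), where robust_se is the HC1
    heteroskedasticity-robust standard error of the slope. HC1 is an
    assumption; the source only says "(robust)".
    """
    y = np.asarray(ratio, dtype=float)
    n = y.size
    if n < 3:
        raise ValueError("need at least 3 observations")
    t = np.arange(n, dtype=float)
    X = np.column_stack([np.ones(n), t])          # intercept + trend
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y                      # OLS coefficients
    resid = y - X @ beta
    # Sandwich estimator: (X'X)^-1 X' diag(e^2) X (X'X)^-1, scaled by n/(n-k)
    meat = X.T @ (X * (resid ** 2)[:, None])
    cov = xtx_inv @ meat @ xtx_inv * n / (n - X.shape[1])
    se = np.sqrt(np.diag(cov))
    return beta[1], se[1], beta[1] / se[1]

# PLACEHOLDER annual market shares (percent), invented for illustration only.
east = [52.0, 51.8, 51.5, 51.3, 51.0, 50.8]
west = [16.6, 16.9, 17.2, 17.4, 17.8, 18.0]
ratio = [e / w for e, w in zip(east, west)]

slope, se, t_stat = trend_regression(ratio)
print(f"slope={slope:.4f}  robust se={se:.4f}  t={t_stat:.2f}")
```

A negative slope means the East share is falling relative to the West share over time, and the size of the t-statistic indicates whether that trend can be distinguished from zero. With only six observations, any such statistic should be read with caution.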

Anonymous - Invited reader report
February 03, 2017 - 11:51
see attached file

Anonymous - Referee Report 2
February 22, 2017 - 14:21
1. Section 2: Explain how shares are computed. Is it WEST/(EAST+WEST)? What about the other regions?
2. Section 2, analysis of Table 2: I do not agree with the comment below Table 2 that shows that the difference between East and West sales was smaller in the P2P era (-7.77%) than before (-9.75%). The comparisons should be done before 1999 and after 2000. The current comparisons mix pre- and post-P2P periods.
3. Table 3 and Footnote 12: all comparisons are made to the year 2000, which is a worldwide peak in the number of sales. Please compare the sales decline to the first year of the dataset, 2002.
4. Comment on the last column of Table 3: why include the unemployment rate (measured in numbers, not in percentages, I assume; this should be clarified) and not income or other measures of growth? It is hard to believe that the initial coefficient on piracy (-0.43) is multiplied by 5 by adding a single variable.
5. Tables 4 and 5: Again, do not compare to year 2000, but to year 2002.
6. Section 6 includes too many guesses and too few facts to be called fact checking. The section should discuss the growth of digital sales.
7. The tone of the article is not neutral. If Mr Liebowitz has been hired by record companies, this should be stated on the first page of the document, as well as any other potential conflicts of interest.

Stan Liebowitz - Response to Referee 2
March 14, 2017 - 20:56
Response to referee report 2 (these section numbers match those of the referee):
1. The referee wants to know whether geographic market shares were calculated using only the East and West, or whether all four time zones were included. In the replication, I refer the reader to the Appendix for details of the calculation. The Appendix shows that the market shares for the East vary from 50.73 to 52.10, depending on year, and the shares for the West vary from 16.59 to 18.09. Since these do not sum to 100, I thought it was apparent, as is the case, that the middle time zones (Mountain and Central) are included in these calculations of market shares. I will try to make this point clearer in the next version of the paper.
2. The referee seems to misunderstand Table 2, which does compare pre- and post-P2P periods. The point of the quasi-experiment was to compare changes in the market shares of the East and West coasts after P2P became common. Every row in Table 2 makes such a comparison. The first years are either 1998, 1999 or 2000, as potential years prior to P2P. The latter years are either 2002 or 2003, as years after P2P. OS used years 1998-2002. In every one of these periods, including the one used by OS, the East fell relative to the West, although OS never report this fact. [As I noted to the other reviewer, the trend is statistically significant.] I will try to make this discussion clearer in the next version of the paper.
3. In 2000, Napster grew from virtually zero into an important phenomenon that could start to affect the sales of records in the U.S. Therefore, it is reasonable to compare sales in later years to the pre-P2P years of 2000 (or 1999). The analysis I perform is completely standard, applying the regression coefficient on P2P to the average number of pirates in the years for which we have complete data: 2003, 2004, and 2005. The referee does not explain why he/she would like the comparison to be made to 2002, which is two years after P2P began and for which we do not have complete data with which to gauge the impact of piracy, since the piracy data start in August of 2002.
4. The data used by OS in this quasi-experiment (and in my replication) are monthly. Unemployment is measured monthly, whereas GDP and income are measured quarterly, which is a key advantage of using unemployment. The reviewer is incorrect in believing that the unemployment rate is not a percentage, as should be clear from the term “rate.” Finally, the reviewer might find it hard to believe that the inclusion of the unemployment rate has such a large impact, but I will include the raw data for anyone to double check, and this reviewer is free to do so.
5. OS use the period 1999-2005 in this quasi-experiment and I am trying to replicate their results. There is no logical reason that I can see to begin the analysis in the year 2002, nor does the reviewer provide such a reason. Further, if one wishes to examine the relationship between file sharing intensity and lost sales, it makes sense to use all the data available since file sharing began, which was not 2002.
6. OS state three “facts” in support of their thesis, which I dispute in my replication. Because OS did not provide any support for their “facts,” it is difficult to check them directly. Nevertheless, their claim that the U.S. 2005 decline in record sales relative to 2004 was due to one firm is clearly incorrect, as is their claim that record sales rose in four of the five largest international markets. I am not aware of any dispute on these points, and clearly the referee does not provide any contrary evidence. The only uncertainty, due to the imprecise nature of the word “major,” is the OS claim that some “major” markets had flat or rising sales, but I believe that my evidence makes clear that this claim is incorrect also, under reasonable meanings of the term “major.”
7. It is ironic that the author of this review accuses my replication of having a biased tone. Surely, the tone of this referee report is far more hostile than the tone of my replication. I will also point out that if my replication were prejudiced, as the referee implies, it should have been easy for the referee to find errors or misrepresentations in my analysis, which, based on this report, the referee was unable to do.
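The share calculation discussed in point 1 can be illustrated with a short sketch. It assumes, as the response states, that each region's share is taken over all four time zones rather than over East plus West alone. The function name `market_shares` and the sales figures are hypothetical placeholders, not data from the paper.

```python
def market_shares(sales_by_zone):
    """Return each zone's share of total sales, in percent."""
    total = sum(sales_by_zone.values())
    if total <= 0:
        raise ValueError("total sales must be positive")
    return {zone: 100.0 * s / total for zone, s in sales_by_zone.items()}

# PLACEHOLDER unit sales by time zone, invented for illustration only.
sales = {"East": 510.0, "Central": 250.0, "Mountain": 70.0, "West": 170.0}

shares = market_shares(sales)
east_west_ratio = shares["East"] / shares["West"]

# East and West shares alone do not sum to 100 because the Central and
# Mountain zones are part of the denominator.
print(shares, round(east_west_ratio, 3))
```

Computed this way, the East/West ratio is the same whether or not the middle zones are in the denominator, but the individual share levels are not, which is why the reported East and West shares sum to well under 100.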

Robert W. Reed - Decision letter
May 03, 2017 - 15:04
see attached file