No. 2017-55 | August 21, 2017
A replication of willingness-to-pay estimates in ‘An adding up test on contingent valuations of river and lake quality’ (Land Economics, 2015)


Desvousges, Mathews and Train (2015, http://le.uwpress.org/content/91/3/556.refs) find that their contingent valuation method (CVM) survey data does not pass the adding up test using a nonparametric estimate of mean willingness-to-pay. Their data suffers from non-monotonicity, flat bid curve and fat tails problems, each of which can cause willingness-to-pay estimates to be sensitive to the approach chosen to measure the central tendency. Using additional parametric approaches that are standard in the literature, I find that willingness to pay for the whole is not statistically different from the sum of the parts in two of three additional estimates. In additional robustness checks, all six of the additional tests find that the WTP estimates do not reject the adding up hypothesis. The negative result in Desvousges, Mathews and Train (2015) is not robust to these alternative approaches to willingness-to-pay estimation.
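The comparison of the whole with the sum of the parts can be illustrated with a minimal sketch. It assumes that each WTP estimate comes from an independent split sample, so that variances add, and that a normal approximation is adequate; the function name and all numbers are hypothetical and are not taken from the paper or from DMT.

```python
import math

def adding_up_test(wtp_whole, se_whole, wtp_parts, se_parts):
    """Two-sided z-test of H0: WTP(whole) = sum of WTP(parts).

    Assumes independent split samples (variances add) and a normal
    approximation; both are illustrative assumptions.
    """
    diff = wtp_whole - sum(wtp_parts)
    se_diff = math.sqrt(se_whole ** 2 + sum(se ** 2 for se in se_parts))
    z = diff / se_diff
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, se_diff, p_value

# Hypothetical estimates: one "whole" sample and three "part" samples.
diff, se_diff, p_value = adding_up_test(100.0, 10.0, [40.0, 35.0, 30.0], [6.0, 6.0, 6.0])
print(f"difference = {diff:.2f}, se = {se_diff:.2f}, p = {p_value:.3f}")
```

Under these made-up inputs the difference is small relative to its standard error, so the adding up hypothesis would not be rejected; with a different estimator of the central tendency the same data could give a different answer.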


To read the original authors’ response to this replication study, see “Original authors’ feedback” in the Comments and Questions section below.

John C. Whitehead (2017). A replication of willingness-to-pay estimates in ‘An adding up test on contingent valuations of river and lake quality’ (Land Economics, 2015). Economics Discussion Papers, No 2017-55, Kiel Institute for the World Economy. http://www.economics-ejournal.org/economics/discussionpapers/2017-55

William Desvousges, Kristy Mathews, and Kenneth Train - Original authors' feedback
August 21, 2017 - 09:17

Whitehead estimates distributions of WTP that contain a large share of negative values, which implies that many people are willing to pay to prevent the environmental improvement, as well as having infinite mean WTPs, which implies that some people are willing to pay more than any stated amount for the improvement. These unrealistic estimates do not provide information about adding-up. Rather, Whitehead’s analysis highlights the general problem in CV studies of inadequate response to the cost prompts.

Please see the detailed reply in the attached file.

Anonymous - Referee Report 1
September 18, 2017 - 10:05

This is a nicely done and very thorough replication of a controversial paper which shows that its results are not robust to commonly used alternative econometric specifications. Most of the issues that need to be corrected are in the beginning or end of the paper and are mostly to help a reader less familiar with the original papers and topics at issue.

1. The major and easily fixable problem with the paper is that the first paragraph is extremely terse and is not adequate for a reader not closely familiar with the Desvousges, Mathews and Train (2012) paper and Chapman et al. (2016). A little more detail on what Desvousges et al. did would fix this. The argument in Chapman et al. (2016) that the comparison being made is not correct could be put in the current footnote 1, which could be expanded to provide some intuition.

2. p. 2, end of second paragraph, not sure that the first part of the last sentence on “fat tails” is the correct description. The main issue is that because the high amounts necessary to drive the percent WTP to zero are not plausible, a good contingent valuation (CV) does not ask them. The problem can be fixed by adding “asked” in front of “does not cause” in the previous sentence and changing the last sentence to “As such, the WTP estimate will always be sensitive to the assumptions of the estimation approach used.”

3. p. 2, third paragraph, the ABERS estimator is a “special case” of the “more general” Turnbull (see Turnbull, 1976). Drop the sentence beginning “Both nonparametric …”. There is only one nonparametric approach (ML subject to a weak monotonicity assumption, with variants of the Turnbull depending on the pattern of censoring and truncation, and with ABERS being the original paper focused on only one of these patterns) and, as is shown in Table 1 of this paper, rarely does the Turnbull approach “obscure” data quality issues. Throughout the rest of the paragraph/paper, resolve the ABERS/Turnbull confusion. (Note there is a problem in the formula in the Haab and McConnell book in that it does not correctly perform the pooled-adjacent lower bound for some patterns of monotonicity violations.)

4. p. 2, change last sentence to “… studies on conducting sensitivity analysis for WTP estimation approaches.”

5. p. 3, top of page, need a couple more sentences for most readers that explain the nature of the data in Table 1.

6. p. 3, paragraph beginning “Even when …” change “not different” to “not statistically different”. This paragraph overall is confusing because the situation being illustrated is not the case where strong monotonicity is met. Further, what the reader needs to know here is that, although economic theory suggests (weak) monotonicity should hold, sampling variability can cause such a constraint to be violated in empirical samples (and, as this paper notes but only at the end, particularly with sample sizes for individual bid points as small as those in DMT).

7. p. 4, bottom of the page (and footnote 2). It is not clear why DMT use a bootstrap approach when the parametric standard errors are well-defined, and it is hard to see why there should be any substantial deviation from the assumptions needed to estimate the probability of a yes response at each bid point. DMT do not describe how they perform the bootstrap calculation of the differences, but this is a non-standard bootstrap if it correctly takes into account the stratification caused by random assignment to bid points.

8. p. 7, first sentence of the conclusion. Would change the first sentence to: “While it is not clear that the adding-up test DMT advocate should hold (Chapman et al., 2016; Whitehead, 2016) in this case, this replication of DMT shows that it cannot be rejected under two of three alternative and commonly used parametric econometric specifications.”

9. p. 8, would drop the sentence: “Many of the problems …”, as it speculates on the size of DMT’s research budget, which was likely large enough to support much larger sample sizes.
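To make comments 3, 6 and 7 concrete, the following is a minimal sketch of a Turnbull-type lower-bound mean WTP with pooling of adjacent bid cells that violate weak monotonicity, together with a bootstrap that resamples within each bid cell. It is not DMT's or Whitehead's code: the function names and the data are hypothetical, the pooled cell is assumed to keep its lowest bid (a conservative convention), and the within-cell resampling is implemented as a binomial draw on the cell's yes count.

```python
import random

def pool_adjacent_violators(bids, n_yes, n_total):
    """Pool neighbouring bid cells until the share of 'no' responses is
    non-decreasing in the bid (weak monotonicity)."""
    blocks = []  # each block: [lowest bid in block, 'no' count, total count]
    for bid, yes, total in sorted(zip(bids, n_yes, n_total)):
        blocks.append([bid, total - yes, total])
        # Merge backwards while the newest block has a strictly lower
        # 'no' share than the block before it (compared without division).
        while (len(blocks) > 1
               and blocks[-1][1] * blocks[-2][2] < blocks[-2][1] * blocks[-1][2]):
            _, no_hi, total_hi = blocks.pop()
            blocks[-1][1] += no_hi
            blocks[-1][2] += total_hi
    return blocks

def turnbull_lower_bound(bids, n_yes, n_total):
    """Lower-bound mean WTP: sum of (bid step) x (share saying yes)."""
    mean, prev_bid = 0.0, 0.0
    for bid, no, total in pool_adjacent_violators(bids, n_yes, n_total):
        mean += (bid - prev_bid) * (1.0 - no / total)
        prev_bid = bid
    return mean

def stratified_bootstrap_se(bids, n_yes, n_total, reps=200, seed=0):
    """Bootstrap standard error, resampling respondents within each bid cell."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        # Drawing 'total' respondents with replacement from a cell is the
        # same as a binomial draw with success probability yes / total.
        resampled_yes = [sum(rng.random() < yes / total for _ in range(total))
                         for yes, total in zip(n_yes, n_total)]
        draws.append(turnbull_lower_bound(bids, resampled_yes, n_total))
    centre = sum(draws) / reps
    return (sum((d - centre) ** 2 for d in draws) / (reps - 1)) ** 0.5

# Hypothetical data: the yes share rises from the 10 bid to the 20 bid,
# which violates monotonicity, so those two cells are pooled.
bids, n_yes, n_total = [5, 10, 20, 40], [80, 50, 60, 20], [100, 100, 100, 100]
print(pool_adjacent_violators(bids, n_yes, n_total))
print(turnbull_lower_bound(bids, n_yes, n_total))
print(stratified_bootstrap_se(bids, n_yes, n_total))
```

With cells this small, resampled data sets will often show monotonicity violations of their own, which is why the pooling step is repeated inside every bootstrap replication rather than applied once to the original sample.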