### Discussion Paper

No. 2019-25 | March 15, 2019

Beyond quantified ignorance: rebuilding rationality without the bias bias

(Published in Special Issue
Bio-psycho-social foundations of macroeconomics)

## Abstract

If we reassess the rationality question under the assumption that the uncertainty of the natural world is largely unquantifiable, where do we end up? In this article the author argues that we arrive at a statistical, normative, and cognitive theory of ecological rationality. The main casualty of this rebuilding process is optimality. Once we view optimality as a formal implication of quantified uncertainty rather than an ecologically meaningful objective, the rationality question shifts from being axiomatic/probabilistic in nature to being algorithmic/predictive in nature. These distinct views on rationality mirror fundamental and long-standing divisions in statistics.

## Comments and Questions

Is the contribution of the paper potentially significant?

In my opinion, absolutely. The paper articulates a subtle and elegant observation about the nature of statistical modelling (and in fact all modelling): that error can be broken down into variance and bias. The point, as I see it, is that a complex model may have high error under conditions of uncertainty while misleadingly conveying low bias, which can be mistaken for low error. In situations of real, radical uncertainty about what the "correct" solution is, the "bias bias" the author speaks of is the tendency to prefer models with low bias over those with low variance, even though both components can contribute equally to error. Note that simple models are (by definition) easier to understand, and that is (in my opinion) important to the author's point. The bias bias amounts to implicitly preferring more complex models because they appear more mathematically consistent, even though their complexity masks the variability in the answers they might provide. Moreover, that variability could relate to "bias" in another sense of the word.

A somewhat unfortunate aspect of the bias/variance breakdown is that “bias,” in the casual meaning of that word (e.g., prejudice), may be worse and more hidden in complex models, precisely because of what this paper indicates. That is to say, in models like deep learning networks, biases (intentional or otherwise) in the selection of features, processing steps, training data, etc., can be hidden in model complexity. Low bias (in the sense used in the paper) only acts to further hide these effects in complex models, by conveying a sense of mathematical sophistication and reliability. A simpler model might be equally wrong, but the biases in the casual sense of the word will be less hidden there.

However, that is one amongst a broad variety of problems with language in the field of statistics, and the author need not address that in this paper. The paper is clear and internally consistent on this point, so that is not a complaint.

A valuable aspect of the paper is its offering of four common and understandable examples in which a complex model misleads and a simpler model is more appropriate under conditions of high uncertainty. In these examples, the author shows that, on technical grounds, the simpler models perform better.

Is the analysis correct?

Yes, and given the gravity and importance of the issue, and the clarity of the paper’s articulation of that issue, I most certainly feel it should be published and widely read.

I have a question I'd like to ask the author, though it is off point for the paper per se. That question is whether the author feels there is an intrinsic value to simpler models because of their higher variance, in that such models may allow greater human adaptivity, for psychological reasons. That is to say, when one contemplates such models as part of human reasoning, perhaps higher variance allows for psychologically easier switching of models when new data fail to agree with existing ones. Lower-bias models may provide a sense of confidence that is more difficult to shake as data fail to conform with current assumptions.

I’ve approached that question as a matter of psychology, but it would also be useful to consider it as a technical question about algorithmic modelling. Do simpler (and otherwise higher-bias) models prove more adaptable (in terms of computational effort) when the data stream comes from a non-stationary source?

As I said, those are questions I’d like to discuss with the author, but I don’t feel they need addressing in this paper, which is quite informative and self-contained.