A Note on Observational Equivalence of Micro Assumptions on Macro Level

We set up a simplistic agent-based model where agents learn with reinforcement observing an incomplete set of variables. The model is employed to generate an artificial data-set that is used to estimate standard macro econometric models. We show that the results are qualitatively indistinguishable (in terms of the signs and significance of the coefficients and impulse-responses) from the results obtained with a data-set that emerges in a genuinely rational system.


Introduction
The development of microfounded models has become a common topic in the economic literature. Notably, these models usually only serve as theoretical foundations and play little role in empirical analysis or forecasting. As pointed out by Wren-Lewis (2018), the researchers typically choose what developments they want to explain and develop the model accordingly.
The task is claimed to be successfully accomplished if a researcher is able to produce a model that replicates some empirical stylized facts. These empirical descriptors are usually presented in terms of simple measures (i.e. regression coefficients or impulse-response functions from a vector autoregression model) estimated using aggregate data. Following the famous papers by Lucas (1972) and Sargent (1973), the rational expectations approach became the central methodology for providing the microfoundations of economic models. The reasons for such dominance are not entirely clear. Although the rationality of expectations is a convenient assumption, whether it actually holds in practice is an open question. As outlined in the seminal literature on bounded rationality (see e.g. Simon 1955) and more recently discussed by Gigerenzer and Gaissmaier (2011), Dosi et al. (2017), Caverzasi and Russo (2018) and Haldane and Turrell (2018), faced with uncertainty, it is often rational for agents to rely on simpler decision rules. These are often called "rules of thumb" or heuristics. Use of such rules is sometimes thought to be arbitrary, sub-optimal or irrational. Yet in a world of uncertainty, that is far from clear. Heuristics may be the most robust means of making decisions in a world of uncertainty.
The ability of rational expectations models to reproduce some of the stylized facts observed in empirical aggregate data is far from unique. It is well known that seemingly simple systems may produce intricate and often efficient developments of the modelled variables. See for example seminal work by Arthur (1994), as well as Gode and Sunder (1993) and Shaikh (2016).
As regards examples with a closer relation to macroeconomics, Andolfatto et al. (2008) and Ilek (2017) use Monte Carlo experiments to show that a model in which the expectations are not rational may generate artificial datasets where rationality would not be rejected by the textbook tests.
We contribute to this discussion by providing an example of how the aggregate indicators from a world populated by boundedly rational agents may be indistinguishable, on the basis of conventional econometric models, from those generated by rational behaviour. Unlike Andolfatto et al. (2008), Ilek (2017) or Dosi et al. (2017), we use a radically different set-up to model boundedly rational agents. For this purpose, in the spirit of Schuster (2012), we set up an agent-based model populated by agents who learn with reinforcement. In such a model (where the agents simply make one binary choice) the concept of expected value does not even exist. We generate several sets of artificial observations using alternative specifications of the learning algorithm and investigate the properties of the standard econometric models estimated using these datasets. Arguably, this approach is a good illustration of how undemanding the requirements are for the "correct" correlations between macro variables to emerge.
The rest of this paper is structured as follows. Section 2 outlines the modelling set-up. Section 3 presents the results of the experiments. Section 4 concludes.

The model
Our example is the market for a good where in each period the producers need to decide whether to participate based on their costs and the expected price. There are $n$ agents. Each of them may produce $q_n$ goods. The agents only have a binary choice: produce $q_n$ or zero. The $q_n$ are agent-specific parameters determined as $q_n = s_n Q_{\max}$, where $Q_{\max}$ represents the maximum output and the shares $s_n$ are determined randomly: they are first drawn from the uniform distribution on $(0,1)$ and then normalized (each draw is divided by the sum of all draws) so that the shares sum to unity.
The agents incur costs $c_{n,t}$: $c_{n,t} = \lambda_n C_t + \varepsilon_{n,t}$, where $C_t$ is the trend component and $\varepsilon_{n,t}$ are random agent-specific innovations. The trend $C_t$ follows an exogenous autoregressive process. The price $P_t$ of the good is determined by the demand for the good, $D_t$, and by $Q_t$, the sum of the $q_n$ of the agents that decided to participate in the market. Demand $D_t$ also follows an exogenous autoregressive process. An agent's profit $w_{n,t}$ is determined by $w_{n,t} = q_n (P_t - c_{n,t})$. Note that $q_n = 0$ when an agent decides not to participate in the market. The values of the parameters are presented in Annex 1.
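The bookkeeping of a single period can be sketched as follows. This is only an illustration under stated assumptions: the price rule P_t = D_t / Q_t and the AR(1) form of the exogenous processes are assumed here (the exact equations and the Annex 1 parameter values are not reproduced in this text), and all function and argument names are hypothetical.

```python
import random


def draw_shares(n_agents, rng):
    """Draw raw shares from U(0,1) and normalise them so they sum to one."""
    raw = [rng.random() for _ in range(n_agents)]
    total = sum(raw)
    return [r / total for r in raw]


def ar1_step(previous, intercept, slope, shock_sd, rng):
    """ASSUMED AR(1) law of motion: x_t = intercept + slope * x_{t-1} + shock."""
    return intercept + slope * previous + rng.gauss(0.0, shock_sd)


def agent_cost(lam, trend_cost, innovation):
    """Agent-specific cost: c_{n,t} = lambda_n * C_t + eps_{n,t}."""
    return lam * trend_cost + innovation


def market_outcome(participates, q, costs, demand):
    """Return (output, price, profits) for given participation choices.

    ASSUMPTION: the price rule P_t = D_t / Q_t is a guess at the
    price equation, not a formula confirmed by the text.
    """
    output = sum(q_n for q_n, joins in zip(q, participates) if joins)
    if output == 0:
        # Nobody produces: no price is formed and all profits are zero.
        return 0.0, None, [0.0] * len(q)
    price = demand / output
    # Profit w_{n,t} = q_n * (P_t - c_{n,t}); non-participants produce zero.
    profits = [
        q_n * (price - c) if joins else 0.0
        for q_n, c, joins in zip(q, costs, participates)
    ]
    return output, price, profits
```

Separating the exogenous draws from `market_outcome` keeps the decision rule (rational, learning or random) as the only part that differs across the systems compared later.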
In the next subsections we describe the alternative algorithms the agents use to decide on their market participation.

Rational agents
The first type of agent knows all the data generating processes, the distributions of parameters and the past values of global variables. These agents also assume that all other agents know the same. We label this type of agent as rational.
At the beginning of the period all agents learn the realisation of their costs ($c_{n,t}$). They are also able to calculate the expected values of demand and trend costs using the known past values of these variables and the parameters $\alpha_0$, $\alpha_1$, $\beta_0$ and $\beta_1$. Next the agents calculate the expected output and price.

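Economics: The Open-Access, Open-Assessment E-Journal 13 (2020-3), www.economics-ejournal.org
Let $D_t^e$, $C_t^e$, $Q_t^e$ and $P_t^e$ denote the expected values of demand, trend costs, output and price. The stylised demand curve may be expressed as
$$P_t^e = D_t^e / Q_t^e. \qquad (1)$$
Note that the individual agents' costs are uniformly distributed from $\lambda_{\min} C_t^e$ to $\lambda_{\max} C_t^e$. Accordingly, if $P_t^e < \lambda_{\min} C_t^e$, zero goods will be produced. If $P_t^e > \lambda_{\max} C_t^e$, the maximum ($Q_{\max}$) goods will be produced. In other cases the share of goods supplied (out of $Q_{\max}$) will approximately be proportional to the ratio of the current margin ($P_t^e - \lambda_{\min} C_t^e$) to the maximum margin ($\lambda_{\max} C_t^e - \lambda_{\min} C_t^e$). Therefore the stylised supply curve may be expressed as
$$Q_t^e = Q_{\max} \frac{P_t^e - \lambda_{\min} C_t^e}{\lambda_{\max} C_t^e - \lambda_{\min} C_t^e}. \qquad (2)$$
The system of (1) and (2) may be solved for given $D_t^e$ and $C_t^e$ (see Figure 1 for a visualization), and $P_t^e$ is calculated. Agents with $c_{n,t} < P_t^e$ will participate in the market.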
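The rational agents' calculation can be sketched numerically. The sketch assumes that the stylised demand curve is price = demand / output and that the stylised supply curve is linear in the margin between the price and the lowest cost, capped at the maximum output; both forms are a reading of the description above rather than confirmed formulas, and the function names are hypothetical. Substituting one curve into the other gives a quadratic in output whose positive root is the interior solution.

```python
import math


def rational_equilibrium(exp_demand, exp_cost, lam_min, lam_max, q_max):
    """Solve the ASSUMED system
        P = D / Q
        Q = q_max * (P - lam_min * C) / ((lam_max - lam_min) * C),  0 <= Q <= q_max
    for expected output Q and expected price P.

    Requires exp_demand > 0, exp_cost > 0 and lam_max > lam_min.
    """
    a = (lam_max - lam_min) * exp_cost
    b = q_max * lam_min * exp_cost
    # Substituting P = D / Q into the supply curve gives
    # a * Q**2 + b * Q - q_max * D = 0; take the positive root.
    q = (-b + math.sqrt(b * b + 4.0 * a * q_max * exp_demand)) / (2.0 * a)
    # Corner case: the price exceeds the highest cost, so everyone produces.
    q = min(q, q_max)
    return q, exp_demand / q


def participates(cost, expected_price):
    """An agent joins the market when its cost is below the expected price."""
    return cost < expected_price
```

With positive demand the demand curve is unbounded as output approaches zero, so under these assumptions only the upper corner (all agents producing) needs explicit handling.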

Learning agents
An alternative paradigm is the learning procedure where the agents do not know the underlying data generating processes. Instead, the strategies that lead to losses tend to be abandoned, while strategies that lead to profit tend to be preferred. In this paper we employ the reinforcement learning approach outlined in Sutton and Barto (1998) and implemented in an agent-based framework by Schuster (2012). For illustrative purposes, we have intentionally selected a concept that is simplistic and conceptually very different from the standard rationality assumption. This algorithm proposes a simple generic decision model for boundedly rational, adaptive artificial agents. It assumes that the agents start with very limited information about the world, and possess no causal model of how their actions affect themselves or their environment. The formal description of the algorithm follows.
The agents perceive the environment as being described by a collection of $k$ attributes $\{att_1, \dots, att_k\}$. Each attribute is represented by an observed variable and can take seven discrete values from extremely low to extremely high.1 Accordingly, each situation may be classified by the agents as being in one of $s = 7^k$ possible states.
In this paper we employ two types of learning agents.
The first type of agent uses three ($k = 3$) variables as state descriptors: current agent-specific costs in relation to the past price ($c_{n,t} - P_{t-1}$), past trend costs ($C_{t-1}$) and past demand ($D_{t-1}$). We label this type of agent as learning agents with full information.
The second type of agent uses only the current agent-specific costs in relation to the past price ($c_{n,t} - P_{t-1}$) as the state descriptor. We label this type of agent as learning agents with limited information.
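The classification of a situation into one of the 7^k states can be sketched as below. The threshold values used in any call are placeholders (the calibrated ranges are in Annex 1 and are not reproduced here), and the function names are hypothetical. Each attribute is mapped to a level from 0 (extremely low) to 6 (extremely high), and the levels are combined as digits of a base-7 number.

```python
from bisect import bisect_right

N_LEVELS = 7  # seven discrete values per attribute


def classify(value, thresholds):
    """Map a value to a level 0..6 given six ascending thresholds."""
    if len(thresholds) != N_LEVELS - 1:
        raise ValueError("six thresholds are needed for seven levels")
    return bisect_right(thresholds, value)


def state_index(values, thresholds_per_attribute):
    """Combine the levels of k attributes into one index in 0..7**k - 1."""
    index = 0
    for value, thresholds in zip(values, thresholds_per_attribute):
        index = index * N_LEVELS + classify(value, thresholds)
    return index
```

A full-information agent would pass three values (k = 3, giving 343 states), while a limited-information agent would pass only the cost-to-past-price descriptor (k = 1, giving 7 states).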
After the state is classified the agents choose whether to participate in the market. Each agent will produce $q_n$ goods with a probability that depends on $f_{s,n,t}$, the attractiveness (fitness) of participation in the currently observed state. This parameter is initially set to zero; in the subsequent periods, after the price, output and profits ($w_{n,t}$) are determined, the agents that participated in the market update this value. The parameters of the learning algorithm are reported in Annex 1.
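A minimal sketch of the choice and update steps follows. The functional forms are assumptions, since the exact formulas are not reproduced in this text: the participation probability is taken to be logistic in the fitness (so that a fitness of zero gives a probability of 0.5), and the update is taken to be a discounted sum of past profits. The names `sensitivity` and `discount` are hypothetical labels for the two learning parameters.

```python
import math
import random


def participation_probability(fitness, sensitivity):
    """ASSUMED logistic choice rule, written to avoid overflow."""
    x = sensitivity * fitness
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def updated_fitness(fitness, profit, discount):
    """ASSUMED update rule: discounted old fitness plus the realised profit."""
    return discount * fitness + profit


def decide(fitness_table, state, sensitivity, rng):
    """Draw the binary participation choice for the observed state."""
    fitness = fitness_table.get(state, 0.0)  # fitness starts at zero
    return rng.random() < participation_probability(fitness, sensitivity)


def learn(fitness_table, state, profit, discount):
    """Update the fitness of the state in which the agent participated."""
    old = fitness_table.get(state, 0.0)
    fitness_table[state] = updated_fitness(old, profit, discount)
```

Under these assumptions losses push the fitness of a state down and make participation in that state less likely, which is the qualitative mechanism described above; no expected value is computed at any point.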

The experiments
We generate 5 artificial datasets using the models populated by the following types of agents:
• Rational agents.
• Learning agents with full information.
• Learning agents with limited information.
• Mixed strategies. The model is simultaneously populated (in equal proportions) by the three types of agents mentioned above.2
• Random strategies. The agents choose to participate in the market with the invariant probability of 0.5. We report the results for this system for illustrative purposes to demonstrate the role of the decision making algorithms.
We conduct 500 independent model runs, each producing 5000 observations. We discard the first 2000 observations and only use the remaining 3000, by which point the learning agents' systems have already arrived at the steady state.
_________________________
1 The numeric ranges are presented in Annex 1. The ranges were calibrated to ensure that observations were roughly equally distributed across possible states.
2 The "rational" agents still assume that all other agents are rational. In this respect the agents are only pseudorational.
The descriptive statistics for the obtained artificial series of aggregate output, prices and average profit per agent are reported in Table 1. The results show that the market participation rate in the learning agents' system is somewhat lower than in the rational world (although it increases if different types of agents coexist in one system). This is reflected in lower output, higher prices and lower profitability (although rational agents do not significantly outperform the learning agents). The profits are also somewhat more volatile in the learning agents' systems.
Obviously, these findings are model-dependent and as such serve little purpose other than to demonstrate that there are noticeable differences between the dynamics emerging in the different systems. But are these differences sufficient to distinguish between rational and boundedly rational worlds by estimating the standard macro econometric models? We examine this issue in the next sub-sections by using the artificial datasets to estimate such models.3
Table 1 (profit per agent row). Rational: 6.9 (0.1). Learning, full information: 6.1 (1.2). Learning, limited information: 6.5 (0.7). Mixed: aggregate 5.7 (0.6), rational 6 (0.7), learning (both types) 5.5 (0.7). Random: -8.4 (3.2).

GMM regressions
We start by regressing the aggregate output variable ($Q_t$) on the trend costs ($C_t$) and demand ($D_t$) variables.4 The estimation is conducted via the Generalized Method of Moments (GMM), and the lags of the dependent and the explanatory variables are used as the instruments. The obtained coefficients are conventionally interpreted in the literature as the representation of agents' reactions to the fluctuations of the expected values of the variables in question (see, e.g. Gali and Gertler 1999).
The results for the datasets generated via alternative models are presented in Table 2. The coefficients in all systems have the expected sign.5 Interestingly, even the model estimated for the limited information agents indicates that output 'reacts' to fluctuations of demand, although we know that formally this is not the case. This is not surprising since in the model (as well as, arguably, in reality) information about the price level indirectly provides information about the level of demand. The models' fit ($R^2$) is not informative for distinguishing between rational and boundedly rational agents. Notably, the autocorrelation of the residuals is low in the system of the learning agents, although such a result is conventionally regarded as an indicator of rationality (Rich 1989).
_________________________
3 Note that since there is no observable measure of the forecasted developments in this model, the conventional rationality test used by e.g. Ilek (2017) is not applicable. Instead we conduct other experiments that can be implicitly interpreted as evidence of rationality.
4 For estimation of the models presented in Sections 3.1 and 3.2 we pool the datasets across all 500 model runs and use the logs of the variables. There is a direct mechanical link between the two endogenous variables (output and price), so there is no additional information contained in the models estimated for prices as the dependent variable. Therefore we only report the model estimates for aggregate output.
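An instrumental-variable regression of this kind can be sketched in plain Python. The sketch uses two-stage least squares, which is the GMM estimator for one particular choice of weighting matrix; it is a stand-in for illustration and not necessarily the estimator configuration used for Table 2. All function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients of y on the columns of rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def two_stage_least_squares(x_rows, z_rows, y):
    """2SLS: project each regressor on the instruments, then run OLS of y
    on the projected regressors."""
    k = len(x_rows[0])
    fitted_cols = []
    for j in range(k):
        coef = ols(z_rows, [row[j] for row in x_rows])
        fitted_cols.append(
            [sum(c * z for c, z in zip(coef, z_row)) for z_row in z_rows]
        )
    fitted_rows = [list(values) for values in zip(*fitted_cols)]
    return ols(fitted_rows, y)
```

In the setting above, each row of `x_rows` would hold a constant, the log of trend costs and the log of demand at time t, and each row of `z_rows` would hold a constant and the lagged values of output, trend costs and demand.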

Impulse-response analysis
We proceed by estimating the conventional vector autoregression (VAR) models,7 in which $Y_t$ is a vector of time series comprising output ($Q_t$), trend costs ($C_t$) and demand ($D_t$) variables; $B(L)$ is a matrix polynomial in the lag operator $L$; $u_t$ is a vector of residuals; and $e_t$ is a vector of independent structural innovations. The lag length is 3. The identification scheme (matrix $A$) of independent innovations is structured as follows: $C_t$ and $D_t$ cannot be affected by any contemporaneous innovations, but the residuals of $C_t$ and $D_t$ may contemporaneously affect $Q_t$. The impulse-response functions estimated for alternative datasets are reported in Figures 2 and 3. The results show that in all cases output is 'affected' by the innovations in the exogenous variables. Although the magnitude of the responses is somewhat different across the datasets, the general pattern is very similar.8
In summary, our experiments show that the appropriate (i.e. corresponding to rational behaviour) correlations between the endogenous and exogenous variables are very likely to emerge even when the developments of exogenous variables are not known to the agents. Even though the agents do not know the underlying data generating process, they may efficiently adapt through reinforcement learning. Also note that even when the agents do not directly observe the developments of the exogenous variables, the information about these developments is contained in the observed endogenous variable (price). This information proves to be sufficient for the emergence of the corresponding correlations. Formal statistical tests and sensitivity analysis for these results are presented in Annex 2.
_________________________
5 All reported coefficients are statistically significant at the 1% level.
6 Note that in this model the agents are not fully rational. They only know the distributions of $\lambda_n$ and $q_n$ (but not the agent-specific values and their interplay) and do not attempt to correct for that.
7 Note that VAR models are convenient data descriptors and may be used to compare the output of various classes of models (see e.g. Minford et al. 2016).
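The identification restrictions described in this section can be written out explicitly. The block below is one way to express them, assuming the ordering $(C_t, D_t, Q_t)$ and the normalisation $A u_t = e_t$; the exact form of the VAR equation is not reproduced in this text, and the coefficient names $a_{QC}$ and $a_{QD}$ are hypothetical.

```latex
\[
\begin{aligned}
u_t^{C} &= e_t^{C},\\
u_t^{D} &= e_t^{D},\\
u_t^{Q} &= a_{QC}\,u_t^{C} + a_{QD}\,u_t^{D} + e_t^{Q},
\end{aligned}
\qquad\text{i.e.}\qquad
A\,u_t = e_t,\quad
A = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ -a_{QC} & -a_{QD} & 1 \end{pmatrix},
\quad u_t = \left(u_t^{C},\, u_t^{D},\, u_t^{Q}\right)'.
\]
```

Under this reading the residuals of trend costs and demand are themselves structural innovations, and only the output equation has free contemporaneous coefficients.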

VAR-based forecasts
We also report the RMSEs of the VAR-based one-step-ahead forecasts of output as a ratio to the RMSEs of the forecasts based only on previous output developments (i.e. the AR model). In all systems, information on aggregate demand and trend costs improves the forecasts' accuracy to roughly the same extent.
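The reported statistic can be computed as in the following sketch, which takes the realised series and the two sets of one-step-ahead forecasts as given; the function names are hypothetical. A ratio below one means that adding demand and trend costs improves on the forecast based on past output alone.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error of a forecast series."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_ratio(actual, var_forecast, ar_forecast):
    """RMSE of the VAR-based forecast relative to the AR-based forecast."""
    return rmse(actual, var_forecast) / rmse(actual, ar_forecast)
```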

Conclusions
Developing microfounded models that are based on the rational expectations hypothesis and presenting these models as a theoretical foundation for empirically estimated macroeconometric models has become a common practice in the economic literature. Yet, the fact that the empirically established sets of correlations between macro variables are in line with those derived theoretically neither is validated nor validates the exact micro assumptions employed in the theoretical model. We provided an example to illustrate this point. We demonstrated that a simplistic learning algorithm employing a minimal set of observed indicators is sufficient to produce a set of correlations between macro variables that is indistinguishable (via standard macro econometric models) from the set of correlations that emerges in a genuinely rational system.

Annex 2
We conduct formal statistical tests to assess the chances of being able to distinguish the results of econometric modelling obtained for alternative systems via the following algorithm. We generate the artificial datasets (containing 500 observations) using the system with rational agents and one of the alternative systems with learning agents and estimate two sets of the econometric models described earlier. Our goal is to compare the coefficients in the GMM regressions and the VAR-based responses of output to cost and demand shocks (at different horizons). Using bootstrapping we generate a collection of 1000 of these estimates (i.e. the values of regression coefficients and impulse responses) for each system. Next, we calculate the collection of pairwise discrepancies between the alternative estimates. If zero is not within the 5th-95th percentile band of this distribution, we conclude that the hypothesis of the equality of the estimates is rejected. We repeat this exercise 100 times and report the share of rejections (rejection rates).
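The decision rule of this test can be sketched as follows, taking the two collections of bootstrap estimates as given. Two details are assumptions of the sketch: "pairwise discrepancies" is read as the differences over all pairs of estimates, and the percentiles are computed with the default method of `statistics.quantiles`. The function names are hypothetical.

```python
from statistics import quantiles


def equality_rejected(estimates_a, estimates_b):
    """Reject equality when zero lies outside the 5th-95th percentile band
    of the pairwise differences between the two bootstrap collections."""
    diffs = [a - b for a in estimates_a for b in estimates_b]
    cuts = quantiles(diffs, n=20)  # 19 cut points: 5%, 10%, ..., 95%
    lower, upper = cuts[0], cuts[-1]
    return not (lower <= 0.0 <= upper)


def rejection_rate(outcomes):
    """Share of repetitions in which equality was rejected."""
    return sum(1 for rejected in outcomes if rejected) / len(outcomes)
```

With 1000 estimates per system the all-pairs reading produces one million differences per comparison, which remains cheap; pairing the estimates by bootstrap index would be an alternative reading of the same description.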
The results are reported in Tables 6 and 7. Only the contemporaneous VAR-based responses of output to cost shocks are clearly different in the rational and full information learning agents' systems. The rejection rates for the other indicators are relatively low, indicating a low probability of being able to see the difference between the alternative systems in terms of the results obtained from the econometric models.
We proceed by examining the sensitivity of the results to the models' parameters. We consider two aspects of the models' parametrization. The first is the degree of agents' heterogeneity, represented by the variance of the agent-specific cost factors ($\lambda_n$) and of the agent-specific cost innovations ($\varepsilon_{n,t}$). The second is the learning algorithm, represented by the sensitivity to changes in the fitness function and the time discount in the fitness function ($\mu$). The rest of the parameters simply govern the law of motion of the exogenous variables. We have calculated the rejection rates under alternative parameters and did not find the results to be sensitive to the parametrization. As an example we report the rejection rates for the VAR-based response of output to a demand shock (the horizon is set to 2) for the case of learning agents with limited information (Tables 8-11).