@ZacGross Models can't do that. Caus/corr problem needs knowledge of data generation. Model=assumed knowledge. Value is in predictive power
— Cameron Murray (@Rumplestatskin) October 12, 2013
It is a very common attitude in economics. Models, their solutions, and any data correlations consistent with those solutions are believed to constitute evidence that the assumptions embedded in the model accurately capture the causal relations of some real-life phenomenon.
But of course that's not the case. The key value of a scientific model lies not only in its ability to predict outcomes in new situations, but also in its capacity to generate new questions and directions for research. The model is not the answer; it's a tool for discovery.
I have been reading Australian sociologist Duncan Watts’ book Everything is Obvious, which reminded me of the importance of evidence and the limitations of the model-building and correlation approach that almost defines economics.
Watts, a physicist turned sociologist whose work on networks is revolutionising the discipline, is completely frank about the near impossibility of determining causality in the one-shot experiment that is real life. In the section ‘Whoever tells the best story wins’, he concludes that
Part of the problem is also that social scientists, like everyone else, participate in social life and so feel as if they can understand why people do what they do simply by thinking about it. It is not surprising, therefore, that many social scientific explanations suffer from the same weaknesses—ex post facto assertions of rationality, representative individuals, special people, and correlation substituting for causation—that pervade our commonsense explanations as well.
No matter how much your model appeals to your intuitive reasoning, or how well it fits the data, it cannot be shown to be of scientific value unless it offers useful predictions. For the economists out there, just consider that models of constrained optimisation are simply a bunch of simultaneous equations, which read equally well in reverse (as do correlations). Moreover, micro-models of this persuasion almost always overlook methods of aggregation, leaving us to guess what sort of aggregate patterns should appear in the data.
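To make the reversibility point concrete, here's a quick simulation sketch of my own (the numbers are made up): even when the data are generated with x causing y, a regression of x on y fits exactly as well as a regression of y on x, because correlation carries no direction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical world where x causes y (coefficient and noise invented).
x = rng.normal(size=10_000)
y = 2.0 * x + rng.normal(size=10_000)

# The sample correlation is symmetric: corr(x, y) == corr(y, x).
r = np.corrcoef(x, y)[0, 1]

# R-squared of the y-on-x and x-on-y regressions are identical (both r**2),
# so goodness of fit alone cannot tell us which variable does the causing.
print(f"corr(x, y) = {r:.3f}")
print(f"R^2 either way = {r**2:.3f}")
```

The fit is agnostic about direction; only our assumptions about the data-generating process pick one.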
A discussion on the use of economic models would be incomplete without referring to Milton Friedman's view that the realism of a model's assumptions is unrelated to its usefulness.
Consider the problem of predicting the shots made by an expert billiard player. It seems not at all unreasonable that excellent predictions would be yielded by the hypothesis that the billiard player made his shots as if he knew the complicated mathematical formulas that would give the optimum directions of travel, could estimate accurately by eye the angles, etc., describing the location of the balls, could make lightning calculations from the formulas, and could then make the balls travel in the direction indicated by the formulas. Our confidence in this hypothesis is not based on the belief that billiard players, even expert ones, can or do go through the process described; it derives rather from the belief that, unless in some way or other they were capable of reaching essentially the same result, they would not in fact be expert billiard players.
My reading of this passage is that models should be judged on their predictive power rather than their assumptions. Yet it also implies that where more plausible assumptions yield similar predictions, they arguably produce a more plausible model.
If I were to propose a model of expert billiard play I wouldn't start with the laws of physics, but rather with a model of learning by trial and error. This simple model not only has more plausible assumptions, but also predicts that 'expertness' in billiards correlates with practice. It is also a general model, applicable to games such as lawn bowls, where Friedman's calculating-man model would require significant modifications to account for the weighted bowls. Friedman's model is merely an assumption about the data-generating process. It translates to "if I know the data-generating process from the point when a ball is struck, I can use that knowledge to make a useful model that includes a prior point in time".
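For what it's worth, here is a minimal sketch of what such a trial-and-error model might look like, with entirely hypothetical parameters: the player keeps aim adjustments that land closer to the target, and the model's testable prediction is simply that error shrinks with the number of shots practised.

```python
import numpy as np

rng = np.random.default_rng(1)

def practice(n_shots: int, target: float = 30.0) -> float:
    """Learn an aim angle by trial and error; return the final error (degrees)."""
    aim = rng.uniform(0.0, 90.0)               # start with a naive guess
    step = 5.0                                 # size of each corrective nudge (invented)
    for _ in range(n_shots):
        trial = aim + rng.normal(scale=step)   # perturb the current aim
        if abs(trial - target) < abs(aim - target):
            aim = trial                        # keep nudges that land closer
            step *= 0.95                       # refine as skill improves
    return abs(aim - target)

# The model's prediction: error falls (expertness rises) with practice.
for n in (10, 100, 1000):
    print(n, round(practice(n), 3))
```

Notice the sketch says nothing about physics; the correlation between practice and accuracy falls out of the learning assumption alone.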
To reiterate, data can't verify, support or prove (or even contradict) the causal assumptions in a model unless we have controlled part, or all, of the data-generating process, whether through a controlled experiment, a natural experiment, a field experiment, or otherwise.
Meanwhile, we have a whole field of econometrics that attempts to match models to data, refining the art of assumption-hiding and promoting the illusion of causality testing. For example, Angrist and Pischke's book Mostly Harmless Econometrics: An Empiricist's Companion is very loose with notions of causality. They say:
Two things distinguish the discipline of econometrics from the older sister field of statistics. One is the lack of shyness about causality. Causal inference has always been the name of the game in applied econometrics. Statistician Paul Holland (1986) cautions that there can be “no causation without manipulation,” a maxim that would seem to rule out causal inference from nonexperimental data. Less thoughtful observers fall back on the truism that “correlation is not causality.” Like most people who work with data for a living, we believe that correlation can sometimes provide pretty good evidence of a causal relation, even when the variable of interest is not being manipulated by the researcher or experimenter.
They go on in the quoted chapter to discuss how instrumental variables methods address part of the causality problem. But recall the requirements of a useful instrument:
a variable (the instrument, which we’ll call Zi), that is correlated with the causal variable of interest Si, but uncorrelated with any other determinants of the dependent variable.
If you think about this a little, you will realise we have simply introduced a second layer of model assumptions about the true data-generating process. You may believe there is a valid reason to do this, but again, the model can't say whether that reason is sound. You are simply swapping one assumption about the nature of the world for an alternative, and perhaps more plausible, assumption.
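A toy simulation of my own (all parameters invented) makes the point: a simple IV estimator recovers the true effect only when the exclusion restriction actually holds, and nothing in the observed data flags the invalid instrument.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

u = rng.normal(size=n)                   # unobserved confounder
z_good = rng.normal(size=n)              # valid instrument: independent of u
z_bad = rng.normal(size=n) + 0.5 * u     # invalid: secretly correlated with u

s = z_good + z_bad + u + rng.normal(size=n)   # causal variable of interest
y = 1.0 * s + u + rng.normal(size=n)          # true causal effect of s is 1.0

def iv_estimate(z, s, y):
    """Simple (Wald-style) IV estimator: cov(z, y) / cov(z, s)."""
    return np.cov(z, y)[0, 1] / np.cov(z, s)[0, 1]

print("valid instrument:  ", round(iv_estimate(z_good, s, y), 3))  # ~1.0
print("invalid instrument:", round(iv_estimate(z_bad, s, y), 3))   # biased upward
```

Both instruments look equally respectable in the data; only the assumed story about how the world generated them separates the right answer from the wrong one.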
What is more interesting is that the founders of the instrumental variables method were challenged in the 1920s by the problem of causal inference in a model of markets with supply and demand curves. Since price is the simultaneous solution to supply and demand in the model, there was no way to disentangle the relative movements of the two curves. Such problems persist to this day when applying demand/supply models to market analysis.
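To see the identification problem numerically, here is an illustrative sketch (my own construction, with made-up curves): both curves shift randomly, we observe only the equilibrium prices and quantities, and a naive regression of quantity on price recovers neither slope.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Hypothetical demand: q = 10 - 1.0*p + e_d;  supply: q = 2 + 0.5*p + e_s.
e_d = rng.normal(size=n)   # random demand shifts
e_s = rng.normal(size=n)   # random supply shifts

# Market clearing: equate the two curves and solve for the equilibrium price.
p = (10 - 2 + e_d - e_s) / (1.0 + 0.5)
q = 2 + 0.5 * p + e_s

# Naive OLS slope of quantity on price: cov(q, p) / var(p).
c = np.cov(q, p)
print(round(c[0, 1] / c[1, 1], 3))   # neither -1.0 (demand) nor +0.5 (supply)
```

The regression dutifully fits a line through the cloud of equilibrium points, but that line belongs to neither curve.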
Models aren't quite the scientific tools economists often believe them to be. At best they offer plausible stories about a particular phenomenon and provide some predictive power. The religious attachment of the economics discipline to its core models is at times quite astounding.
It is genuinely challenging for social scientists to make gains in knowledge under the uncontrollable conditions of real life, and I can only hope that the future of research involves far more experimentation, either in the lab or in the field. In the meantime I hope the profession can be far more honest about the limits to knowledge, more humble in its policy recommendations, and more open to competing views of the world whose claims often stand on equal scientific footing.